MaNIS Georeferencing Discussion Archive

 

Following are extracts of the Georeferencing Listserv discussions accumulated during the MaNIS georeferencing project. Missing postings were not relevant to georeferencing in perpetuity. Messages have been edited to protect the guilty by masking names of individuals with XXXXXX.

 

>>> Posting number 1, dated 17 Jul 1999 14:12:50

 

-----------------------------------------------------------------------------

 

>>> Posting number 2, dated 17 Jul 1999 14:15:23

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 3, dated 17 Jul 1999 14:16:03

 

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 4, dated 17 Jul 1999 14:19:25

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 5, dated 17 Jul 1999 14:19:59

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 6, dated 17 Jul 1999 14:26:41

 

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 7, dated 17 Jul 1999 14:22:50

 

 

-----------------------------------------------------------------------------

 

>>> Posting number 8, dated 17 Jul 1999 14:23:12

 

-----------------------------------------------------------------------------

 

>>> Posting number 9, dated 19 Jul 1999 09:29:01

-----------------------------------------------------------------------------

 

>>> Posting number 10, dated 23 Jul 1999 16:35:41

 

>>> Posting number 11, dated 3 Sep 1999 16:17:55

 

>>> Posting number 12, dated 17 Sep 1999 15:19:38

 

>>> Posting number 13, dated 17 Sep 1999 13:13:14

 

>>> Posting number 14, dated 17 Sep 1999 14:57:30

 

>>> Posting number 15, dated 20 Sep 1999 09:04:17

 

>>> Posting number 16, dated 24 Sep 1999 17:01:21

 

>>> Posting number 17, dated 28 Sep 1999 12:50:27

 

>>> Posting number 18, dated 15 Oct 1999 19:37:37

 

>>> Posting number 19, dated 17 Oct 1999 16:37:27

 

>>> Posting number 20, dated 18 Oct 1999 16:50:30

 

>>> Posting number 21, dated 19 Oct 1999 11:15:26

 

>>> Posting number 22, dated 19 Oct 1999 16:35:19

 

>>> Posting number 23, dated 20 Oct 1999 15:51:18

 

>>> Posting number 24, dated 20 Oct 1999 11:34:55

 

>>> Posting number 25, dated 20 Oct 1999 16:00:18

 

>>> Posting number 26, dated 10 Nov 1999 10:52:01

 

>>> Posting number 27, dated 10 Nov 1999 13:54:04

 

>>> Posting number 28, dated 17 Nov 1999 15:12:19

 

>>> Posting number 29, dated 18 Nov 1999 12:38:15

 

>>> Posting number 30, dated 18 Nov 1999 10:08:56

 

>>> Posting number 31, dated 18 Nov 1999 13:22:25

 

>>> Posting number 32, dated 19 Nov 1999 14:35:52

 

>>> Posting number 33, dated 3 Dec 1999 10:21:24

 

>>> Posting number 34, dated 3 Jan 2000 11:48:10

 

>>> Posting number 35, dated 3 Jan 2000 16:24:25

 

>>> Posting number 36, dated 18 May 2000 16:51:23

 

>>> Posting number 37, dated 18 May 2000 19:49:29

 

>>> Posting number 38, dated 23 May 2000 18:41:45

 

>>> Posting number 39, dated 24 May 2000 09:38:19

 

-----------------------------------------------------------------------------

 

>>> Posting number 40, dated 24 May 2000 12:15:39

 

>>> Posting number 41, dated 12 Jun 2000 15:45:50

 

>>> Posting number 42, dated 13 Jun 2000 09:31:26

 

>>> Posting number 43, dated 13 Jun 2000 09:59:02

 

>>> Posting number 44, dated 13 Jun 2000 09:17:08

 

>>> Posting number 45, dated 13 Jun 2000 07:49:43

 

>>> Posting number 46, dated 13 Jun 2000 09:04:22

 

>>> Posting number 47, dated 13 Jun 2000 08:54:22

 

>>> Posting number 48, dated 13 Jun 2000 11:11:31

 

>>> Posting number 49, dated 13 Jun 2000 13:23:46

 

>>> Posting number 50, dated 30 Jun 2000 16:25:38

 

>>> Posting number 51, dated 30 Jun 2000 17:14:31

 

>>> Posting number 52, dated 30 Jun 2000 23:29:35

 

>>> Posting number 53, dated 1 Jul 2000 07:35:15

 

>>> Posting number 54, dated 4 Jul 2000 11:04:23

 

>>> Posting number 55, dated 4 Jul 2000 10:07:33

 

>>> Posting number 56, dated 6 Jul 2000 00:00:00

 

>>> Posting number 57, dated 5 Jul 2000 19:40:11

 

>>> Posting number 58, dated 5 Aug 2000 09:24:55

 

>>> Posting number 59, dated 5 Aug 2000 12:31:07

 

>>> Posting number 60, dated 7 Aug 2000 13:45:33

 

>>> Posting number 61, dated 15 Aug 2000 21:54:23

 

>>> Posting number 62, dated 23 Aug 2000 16:24:48

 

>>> Posting number 63, dated 30 Aug 2000 11:20:17

 

>>> Posting number 64, dated 22 Sep 2000 09:36:34

 

>>> Posting number 65, dated 29 Sep 2000 08:51:23

 

>>> Posting number 66, dated 2 Oct 2000 10:35:12

 

>>> Posting number 67, dated 5 Oct 2000 09:40:24

 

>>> Posting number 68, dated 17 Oct 2000 18:13:33

 

>>> Posting number 69, dated 1 Nov 2000 07:48:24

 

>>> Posting number 70, dated 1 Nov 2000 08:06:24

 

>>> Posting number 71, dated 28 Nov 2000 18:26:18

 

>>> Posting number 72, dated 29 Nov 2000 21:09:35

 

>>> Posting number 73, dated 30 Nov 2000 08:31:10

 

>>> Posting number 74, dated 30 Nov 2000 11:33:07

 

>>> Posting number 75, dated 14 Dec 2000 20:41:28

 

>>> Posting number 76, dated 15 Dec 2000 07:59:04

 

>>> Posting number 77, dated 26 Apr 2001 09:00:01

 

>>> Posting number 78, dated 16 May 2001 18:29:45

 

>>> Posting number 79, dated 16 May 2001 17:36:59

 

>>> Posting number 80, dated 18 May 2001 08:29:49

 

>>> Posting number 81, dated 24 May 2001 10:19:20

 

>>> Posting number 82, dated 25 May 2001 09:43:37

 

>>> Posting number 83, dated 11 Jun 2001 12:01:03

 

>>> Posting number 84, dated 11 Jun 2001 15:02:51

 

>>> Posting number 85, dated 11 Jun 2001 15:44:56

 

>>> Posting number 86, dated 29 Jun 2001 21:12:37

 

>>> Posting number 87, dated 4 Jul 2001 14:24:24

Date:         Wed, 4 Jul 2001 14:24:24 -0700

Reply-To:     "Mammalogy Z39.50 Network (Private)" <MAMMAL-Z-NET@USOBI.ORG>

Sender:       "Mammalogy Z39.50 Network (Private)" <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: ROM higher geography

In-Reply-To:  <sb433743.076@romfs7.rom.on.ca>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

I'm posting the following exchange to the list because there is information

contained herein that is relevant to everyone. The basic concepts of data

cleanliness, the gazetteer, and data updates are addressed in brief.

 

 

>Once I began working on the Bukedi inconsistency (2nd in your list) I saw

>that your methodology is missing many more errors/inconsistencies that

>exist in County and Province data.

 

Understood.  My analysis reveals only the duplicates of

ORCT+ORCRY+ORPR+ORCY

 

I understand that there may be many other errors and inconsistencies in the

original data, but that is not a concern for the gazetteer.  In fact, the

duplicates I pointed out aren't a problem either. I just wanted to alert

you to them since they came out in my analysis.

 

>   The errors and inconsistencies are a direct reflection of the state of

> documentation on field catalogues or specimen cards, depending on the

> source of the automated record.  We did not have the resources at the

> time of automation (nor do we now for that matter) to resolve what is a

> "Province" term and what is a "County" term for all

> countries.  Additionally, we are looking at historical data that may no

> longer be reflected in the current political reality of our little world

> (e.g., USSR, Northwest Territories).  I have cleaned up data fields that

> are used routinely to manage the collection and retrieve data.  Continent

> and Country should be clean.  The Province field should be clean for

> Canada (I haven't had the time to tackle NWT yet), USA, and Mexico.  I

> just finished cleaning up the Province field for Guyana as well.  The

> County field should be clean for Ontario.  I now periodically print out

> frequency listings for Country etc. for these priority sections of the db

> (and collection) in an effort to maintain the consistency of our

> data.  For all other geographic locations, Province and County are not

> used for managing the collection, so the data clean up or enhancement has

> been a low priority.  This is an ongoing situation that I have discussed

> with Judith with regard to the MaNIS Project.  My understanding is that

> funding for documentational and staffing resources will be part of this

> "mission".  I am afraid your listing of 13 inconsistencies barely

> scratches the surface of the data cleaning that is required and even more

> importantly, misses all kinds of erroneous or missing data.  I currently

> do not have the maps, atlases, or gazetteers nor the staff/time to

> undertake this project which from a collections' perspective is of low

> priority.  To do a proper job I cannot resolve all of the problems that

> you have identified without undertaking a full review of the entire

> country's data.

 

There is no requirement for any standard of cleanliness. It is my hope that

errors and inconsistencies will be noted during georeferencing and

forwarded to the attention of the institutions as a part of that

process.  The tools are meant to identify the inconsistencies, not to

remedy them. What the institutions do with these notes is entirely up to them.

 

>I am not sure what you are currently attempting to do with the data so we

>may need to further discuss our respective needs to ensure that we are not

>working at cross purposes.  If work is to be globally undertaken, I would

>like our data to be the db of record - making long lists of changes for

>you to then repeat is a waste of effort and time; you will see the work

>generated by having two dbs of record by the simple changes that I have

>made this afternoon.  Also, errors in interpretation or typos that are

>bound to occur should be avoided.  Finally, the data you have is already

>out of date, since changes are made by me on a daily basis as errors etc.

>are encountered during the normal activities of managing the collection,

>fulfilling data requests, etc.

 

The institutional databases will always be the database of record.  The

data I have from all of the institutions is just a snapshot, to be used for

georeferencing. I will not ask for these data again during the project, nor

will I make changes to the data I have received.  When we have a network,

the gazetteer will be created and updated automatically whenever data

change and the snapshot will be obsolete.  I've only created the snapshot

so that we have combined data to work with. When people begin to do

georeferencing using the gazetteer they will not change the data - they

will only make commentaries.  Even the latitude and longitude are

commentaries in a sense. It is up to each institution to accept or reject

the commentaries and make changes based on them in its database.

 

 

>Regards,

 

 

> 

> >>> John Wieczorek <tuco@socrates.Berkeley.EDU> 07/02/01 08:50PM >>>

>Attached is a tab-delimited file with the first row containing column

>headings. The contents of the file are combinations of higher geographic

>fields for which you have more than one interpretation in your

>database.  The first field (highergeog) is a concatenation of the fields of

>higher geography that reveal duplication. The second field (geogid) is an

>identifier unique to the ROM higher geography data with one row for every

>unique combination of ORCT, ORCRY, ORPR, and ORCY.  As you can see by the

>rows in the table, there are 13 places for which there are inconsistent

>placements of county vs. province, for example.  It is not critical for my

>purposes to have these resolved, but since I noticed them I thought I might

>as well tell you.  If you do make changes to these combinations, let me

>know which are correct and I'll do so on this end as well.
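The kind of check described in the quoted message above could be sketched as follows. This is only a guess at the approach: it assumes each row is a dictionary keyed by ORCT, ORCRY, ORPR, and ORCY, and that the "highergeog" concatenation ignores empty fields, so that the same names placed in different fields produce the same key. The function name and row layout are hypothetical.

```python
from collections import defaultdict

# Field names as given in the message; their exact meaning is assumed.
FIELDS = ("ORCT", "ORCRY", "ORPR", "ORCY")

def inconsistent_placements(rows):
    """Return concatenated keys that map to more than one field placement."""
    groups = defaultdict(set)
    for row in rows:
        combo = tuple((row.get(field) or "").strip() for field in FIELDS)
        # Assumed key: the non-empty terms joined in order, ignoring
        # which field each term came from.
        key = "+".join(term.lower() for term in combo if term)
        groups[key].add(combo)
    return {key: sorted(combos)
            for key, combos in groups.items() if len(combos) > 1}
```

A name entered once as a province and once as a county, under the same continent and country, would come back as one key with two placements.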

 

>>> Posting number 88, dated 10 Jul 2001 12:01:24

Date:         Tue, 10 Jul 2001 12:01:24 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      cave localities

Mime-version: 1.0

Content-type: text/plain; charset="US-ASCII"

Content-transfer-encoding: 7bit

 

I've noticed that the USGS GNIS web site does not give information on cave

sites.  (It does give locations of variants such as Boulder Cave

Campground.)  Is this a protocol we wish to follow?  Are there other web

sites that do list cave localities?  What do you think?

 

Cheers,

 

>>> Posting number 89, dated 10 Jul 2001 13:40:25

Date:         Tue, 10 Jul 2001 13:40:25 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Filtering data

In-Reply-To:  <sb4b0d4a.070@romfs7.rom.on.ca>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

This message is in reply to a comment about

records for captive animals.

 

>I would recommend that you do not use any captive records for a

>gazetteer.  Does that make sense?

 

In a restricted view of the utility of a gazetteer it does make sense to

exclude them. However, it is actually easier to include them, yet have them

flagged. This has the benefit that one can filter on the captive attribute.

This could be useful if you wanted to do a quick query of only captive

animals as well as for a query in which you want to leave them out.  The

philosophy in general will be to have a home for all data that anyone deems

useful, yet to allow each institution to decide which data it will provide

through the filters implemented during migration.

 

A filter might do any one of the following:

1) exclude attributes altogether (e.g., not show a "CaptiveFlag" field)

2) exclude records based on the value of an attribute (e.g., not show

records of endangered species)

3) exclude certain values of an attribute (e.g., not show localities for

endangered species)

4) substitute a surrogate value for an attribute of a certain value (e.g.,

instead of showing locality with lat-long, show only county-level and

higher geography for endangered species)
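The four kinds of filter listed above could be sketched roughly as follows. This is an illustration only: the record layout (a plain dictionary), the field names other than "CaptiveFlag", and the function names are assumptions, not part of any actual migration script.

```python
def drop_attribute(record, field):
    """Filter 1: exclude an attribute altogether."""
    return {key: value for key, value in record.items() if key != field}

def exclude_record(record, field, value):
    """Filter 2: exclude the whole record (return None) when field == value."""
    return None if record.get(field) == value else record

def hide_value(record, target, when_field, when_value):
    """Filter 3: blank out one attribute's value for matching records."""
    out = dict(record)
    if out.get(when_field) == when_value:
        out[target] = None
    return out

def substitute_value(record, target, surrogate, when_field, when_value):
    """Filter 4: show a surrogate value instead of the real one."""
    out = dict(record)
    if out.get(when_field) == when_value:
        out[target] = surrogate
    return out
```

Each function returns a new record and leaves the input untouched, which fits the idea that the institutional database remains the database of record and the filters only shape what is provided.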

 

These are just a few examples of what might be done at one institution, and

may vary between institutions.  I encourage the participants to discuss

these issues, and to begin to make institutional decisions about filtering

rules when it comes time to set up the migration.  The rules must be

clearly defined before I begin to write the creation scripts - I can't

afford to stay at any given institution (except maybe Hawaii, heh heh),

while the rules are being hashed out.

 

>>> Posting number 90, dated 8 Aug 2001 13:10:05

 

>>> Posting number 91, dated 14 Sep 2001 08:48:17

 

>>> Posting number 92, dated 23 Sep 2001 17:24:24

 

>>> Posting number 93, dated 24 Sep 2001 20:07:31

Date:         Mon, 24 Sep 2001 20:07:31 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Guidelines

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

Now that we are officially up and running I would like to provide the first

of two documents on the MaNIS collaborative georeferencing effort.  This

first document is meant to open for discussion the issues associated with

turning specific locality descriptions into well-documented latitudes and

longitudes.  The document does not explain what tools to use, or how to use

any of them - that will be in a forthcoming document. Instead, this

document focuses on the "theoretical aspects" of the task, our methods and

assumptions, upon which it would be helpful for us all to agree.  To that

end, please read the Georeferencing Guidelines page, accessible from the

Documents page on the MaNIS website (see below).  Comment by sending

messages to MAMMAL-Z-NET@USOBI.ORG. Let's try to get through this

discussion by 6 Oct.

 

http://dlp.cs.berkeley.edu/manis/Documents.html

 

Anticipating your enthusiastic participation,

 

John Wieczorek

 

>>> Posting number 94, dated 25 Sep 2001 18:30:16

Date:         Tue, 25 Sep 2001 18:30:16 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing text, for reference

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

It was pointed out to me that it might be prudent to have a text-only copy

of the document, with line numbers, to which everyone can refer in

discussions.  I am including the full text of the GeorefGuide.html file

below for that purpose.  The page itself can be found at the following URL:

 

http://dlp.cs.berkeley.edu/manis/GeorefGuide.html

 

 

   1 MaNIS

   2 The Mammal Networked Information System

   3

   4 John Wieczorek

   5 24 September 2001

   6 _________________________________________________

   7

   8 Georeferencing Guidelines

   9

  10 This document contains information about assigning geographic

  11 coordinates and maximum errors for those coordinates to specific

  12 locality descriptions. This document does not attempt to

  13 describe the tools and methods for finding named places on maps

  14 or gazetteers. The process of assigning coordinates and errors,

  15 called georeferencing, can be rather complicated. The complexity

  16 of the process can be greatly reduced and the consistency of the

  17 results can be greatly increased by establishing simple

  18 guidelines that cover most commonly encountered locality

  19 descriptions. The guidelines for assigning coordinates for named

  20 places are presented with examples in the section Determining

  21 Latitude & Longitude.

  22

  23 There are several fundamental sources of error for specific

  24 locality descriptions, and these vary in magnitude. It is

  25 essential during georeferencing to determine and record the

  26 greatest source of error among all possible sources. There are

  27 numerous ways in which the maximum error of a geographic

  28 coordinate might be expressed, but the most convenient is as a

  29 distance, because its size and shape are constant over any

  30 geodetic surface model. The sources of error and their

  31 magnitudes are discussed primarily in the section Determining

  32 Error.

  33

  34 An Appendix containing a description of the data that should be

  35 captured for each georeferenced locality, a glossary, and

  36 references are appended for the convenience of the reader.

  37

  38 Determining Latitude & Longitude

  39

  40 Geographic coordinates can be expressed in a number of different

  41 coordinate systems (e.g. decimal degrees, degrees minutes

  42 seconds, degrees decimal minutes, UTM, etc.). Conversions can be

  43 made readily between coordinate systems, but decimal degrees

  44 provide the most convenient coordinates to use for

  45 georeferencing for no more profound a reason than that a

  46 specific locality can be described with only two attributes:

  47 decimal latitude and decimal longitude.

  48

  49 Named Places

  50

  51 The simplest of specific locality descriptions consist of only a

  52 named place. Use the geographic center of a named place for the

  53 latitude and longitude, and use the distance from that point to

  54 the furthest point within that named place for the maximum error

  55 distance. If the geographic center of the named place is not

  56 within the confines of the shape of the named place, use the

  57 point nearest to the geographic center that lies within the

  58 shape.

  59

  60 Example: "Bakersfield"
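As a rough sketch of the named-place rule above: the center is taken as the midpoint of the latitude and longitude extremes (the definition given in the Glossary), and the maximum error as the distance from that center to the furthest point of the outline. The outline format (a list of latitude, longitude pairs), the spherical Earth radius, and the function names are assumptions. The sketch does not handle a center that falls outside the shape, nor outlines that cross the 180 degree meridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def center_and_max_error(outline):
    """Return ((lat, lon), max_error_km) for a list of (lat, lon) points."""
    lats = [lat for lat, _ in outline]
    lons = [lon for _, lon in outline]
    # Geographic center: means of the latitude and longitude extremes.
    center = ((min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2)
    # Maximum error: distance from the center to the furthest outline point.
    max_error = max(haversine_km(center[0], center[1], lat, lon)
                    for lat, lon in outline)
    return center, max_error
```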

  61

  62 Township Range Section (TRS) descriptions are essentially no

  63 different from those of any other named place. It is necessary to

  64 understand how TRS descriptions work and how they describe a

  65 place. See the References section, below, for links to TRS

  66 information.

  67

  68 Example: "E of Bakersfield, T29S R29E Sec. 34 NE 1/4"

  69

  70 Offsets

  71

  72 Offsets generally consist of combinations of distances and

  73 directions from a named place. Use the geographic center of the

  74 named place in the direction of the offset as a starting point.

  75 Unless there is contrary information in the locality

  76 description, measure the distance in the offset direction to

  77 find the spot for the geographic coordinates. Offsets that do

  78 not explicitly say that they were measured by air or by some

  79 contour (e.g., by road, river, valley, etc.) should be

  80 determined as if by air in a straight line.

  81

  82 Example: "10 mi E (by air) Bakersfield"

  83

  84 Example: "10 mi E of Bakersfield"
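For a straight-line ("by air") offset, the coordinates can be found by moving the stated distance along the stated direction from the starting point. The sketch below assumes a spherical Earth with a radius of 3958.8 miles and treats each compass direction as an exact bearing; the function name and the bearing table are illustrative, not part of the guidelines.

```python
import math

EARTH_RADIUS_MI = 3958.8  # assumed mean radius of a spherical Earth, in miles

# Compass directions as bearings in degrees clockwise from north.
BEARINGS = {"N": 0.0, "NE": 45.0, "E": 90.0, "SE": 135.0,
            "S": 180.0, "SW": 225.0, "W": 270.0, "NW": 315.0}

def offset_point(lat, lon, distance_mi, direction):
    """Return (lat, lon) reached by going distance_mi in the given direction."""
    bearing = math.radians(BEARINGS[direction])
    delta = distance_mi / EARTH_RADIUS_MI  # angular distance in radians
    phi1 = math.radians(lat)
    lam1 = math.radians(lon)
    phi2 = math.asin(math.sin(phi1) * math.cos(delta)
                     + math.cos(phi1) * math.sin(delta) * math.cos(bearing))
    lam2 = lam1 + math.atan2(
        math.sin(bearing) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2))
    return math.degrees(phi2), math.degrees(lam2)
```

For "10 mi E of Bakersfield" one would call offset_point with the geographic center of Bakersfield (roughly 35.37, -119.02, an approximate value used here only for illustration), a distance of 10, and the direction "E".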

  85

  86 However, if there is no mention of the mode of measurement in

  87 the locality description, but the measurement includes fractions

  88 (e.g., 10.2 miles) and there is a road in the vicinity, use road

  89 miles. Offsets that were described in the specific locality as

  90 being measured by road should be determined using the contours

  91 of the road rather than using a straight line. The methods for

  92 determining the maximum error distances for these types of

  93 specific locality descriptions are given in the Determining

  94 Error section, below.

  95

  96 Example: "10.2 mi E of Bakersfield"

  97

  98 Example: "13 mi E (by road) Bakersfield"

  99

100 Vagueness

101

102 At times, specific locality descriptions are fraught with

103 vagueness. It is not the purpose here to belittle localities of

104 this type; in fact, an honest admission of the unknown is

105 preferable to masking it with unwarranted precision.

106

107 The most important type of vagueness in a specific locality

108 description is one in which the locality is in question. No such

109 locality should be georeferenced.

110

111 Example: "Bakersfield?"

112

113 Many locality descriptions imply an offset from a named place

114 without definitive directions or distances. Use the geographic

115 center of the named place for the geographic coordinates. For

116 the maximum error distance, use the greatest distance that is

117 not likely to be considered in the area of another named place.

118 Clearly there is a measure of subjectivity involved here. Let

119 common sense prevail and document the assumptions made.

120

121 Example: "near Bakersfield"

122

123 Sometimes offset information is vague either in its direction or

124 in its distance. If the direction information is vague, record

125 the geographic coordinates of the center of the named place and

126 add the offset distance to the greatest extent of the named

127 place to get the maximum error distance.

128

129 Example: "5 mi from Bakersfield"

130

131 Uncertainty in the offset distance is a fact of the business.

132 Almost no localities are recorded with error estimates,

133 therefore every offset distance is inherently uncertain. The

134 addition of a modifier in the locality description, while an

135 honest observation, should not change the determination of the

136 geographic coordinates or of the maximum error.

137

138 Example: "about 3 mi E of Bakersfield"

139

140 The worst of situations arises when a specific locality

141 description is internally inconsistent. There are numerous

142 possible causes for inconsistencies. It is the task of those

143 georeferencing to determine the part of the description most

144 likely to be in error, ignore it for the purpose of the

145 determination, and document the decision to do so. The most

146 common source of inconsistency in a locality description comes

147 from trying to match elevation information with the rest of the

148 description. If there is no reasonable way to reconcile the

149 discrepancy, ignore the elevation.

150

151 Example: "10 mi W of Bakersfield, 6000 ft"

152

153 Determining Error

154

155 The process of georeferencing includes an assessment of the

156 possible sources of error in a geographic coordinate

157 determination. Errors may arise due to the extent of a locality,

158 due to unspecified precision in original measurements (distance

159 precision and directional precision), or due to not knowing the

160 datum under which coordinates were determined. It is essential

161 to determine which of these yields the greatest error and record

162 that value as the maximum error distance. Potential error

163 sources and guidelines for determining the magnitude of each for

164 a given specific locality are given in the paragraphs below.

165

166 Error due to the shape of a locality

167

168 Named places are not single points; they have extents. If a

169 locality description is no more specific than to describe a

170 named place or an offset from a named place, then the size of

171 the named place is a source of error. The treatment of error due

172 to the extent of a locality is described under the examples of

173 determining latitude and longitude, above.

174

175 Error due to an unknown datum

176

177 Seldom have geographic coordinates been recorded for a locality

178 in a natural history collection in which the underlying datum of

179 the coordinate system was given. Even now, when GPS coordinates

180 are being taken as definitive evidence of a location, the

181 geodetic datum is being ignored. Without recording the datum

182 with the coordinates, potential accuracy is being lost. Figure 1

183 shows the magnitude of error (in meters) over North America

184 based on not knowing the datum from which the coordinates were

185 taken.

186

187 [datumerror.jpg]

188

189 Figure 1. Map of North America showing the magnitude of

190 potential error from not knowing whether coordinates were taken

191 from NAD27, NAD83, or WGS84.

192

193 This map can be used as a rough guide for determining the

194 magnitude of error due to not knowing the datum from which the

195 geographic coordinates were recorded.

196

197 Precision

198

199 Precision is difficult to gauge from specific locality

200 descriptions; it may be reflected in the locality description,

201 but it is seldom, if ever, explicitly recorded. Furthermore, a

202 database record may not reflect, or may reflect incorrectly, the

203 precision inherent in the original measurement, especially if

204 the locality description has undergone interpretation from the

205 verbatim original description. Precision issues arise from both

206 distance measurements and directions in a locality description.

207 Potential errors from each of these sources are discussed in the

208 paragraphs below.

209

210 Error associated with distance precision

211

212 Distance may be recorded in a specific locality description with

213 or without significant digits, and those digits may or may not

214 be warranted. A conservative way to ensure that distance

215 precision is not inflated is to treat distance measurements as

216 integers with fractional remainders. Thus 10.25 becomes 10 1/4,

217 10.5 becomes 10 1/2, etc. Calculate the error for these distances

218 based on the fractional part of the distance, using 1 divided by

219 the denominator of the fraction.

220

221 Example: "10.5 mi N of Bakersfield" Fraction is 1/2, error should

222 be 0.5 mi.

223

224 Example: "10.6 mi N of Bakersfield" Fraction is 6/10, error

225 should be 0.1 mi.

226

227 Example: "10.75 mi N of Bakersfield" Fraction is 3/4, error should

228 be 0.25 mi.

229

230 If the distance is an integer, use an error of one unit.

231

232 Example: "10 mi N of Bakersfield" Error should be 1 mi.
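One way to reproduce the distance precision examples above is sketched here. The choice of which denominators to try (1, 2, and 4 first, then powers of ten) is an assumption made so that the results match the examples; the guidelines themselves do not spell out the procedure, and the function name is hypothetical. The distance is passed as text so that "10.6" is read exactly.

```python
from decimal import Decimal
from fractions import Fraction

# Denominators tried first; an assumption chosen to match the examples.
SIMPLE_DENOMINATORS = (1, 2, 4)

def distance_precision_error(distance_text):
    """Return the error, in the distance's own units, as an exact Fraction."""
    value = Fraction(Decimal(distance_text))
    remainder = value % 1  # fractional part of the distance
    for denominator in SIMPLE_DENOMINATORS:
        if (remainder * denominator).denominator == 1:
            return Fraction(1, denominator)
    # Otherwise fall back to tenths, hundredths, and so on.
    denominator = 10
    while (remainder * denominator).denominator != 1:
        denominator *= 10
    return Fraction(1, denominator)
```

With these assumptions, "10.5" gives 1/2, "10.6" gives 1/10, "10.75" gives 1/4, and "10" gives 1, in agreement with the examples.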

233

234 Error associated with directional precision

235

236 Direction is almost always expressed in specific locality

237 descriptions using cardinal and intercardinal directions rather

238 than degree headings. A conservative interpretation of these

239 directions allows for an error of 22.5 degrees to either side of

240 the recorded direction. Thus, ENE can be any direction between E

241 and NE, while NE can be any direction between ENE and NNE.

242

243 [directionerror.jpg]

244

245 The error distance resulting from imprecision in direction

246 increases with increasing offset distance. In fact the error

247 distance due to directional imprecision is 0.4142 times the

248 offset. Note, however, that when a locality description uses two

249 offsets based on cardinal directions (e.g., 1 mi N and 3 mi E of

250 Bakersfield), the distances and directions are likely to have

251 been measured on a map. In this case, directional imprecision

252 should be ignored.
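The factor 0.4142 quoted above agrees with the tangent of 22.5 degrees, so the error from directional imprecision can be sketched as below. The function name is an assumption; the result is in the same units as the offset.

```python
import math

HALF_SECTOR_DEGREES = 22.5  # allowed error to either side of the direction

def directional_error(offset_distance):
    """Error distance due to directional imprecision for a given offset."""
    return offset_distance * math.tan(math.radians(HALF_SECTOR_DEGREES))
```

For an offset of 10 mi this comes to a little over 4.1 mi, which is larger than the 1 mi distance precision error for the same locality.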

253

254 Appendix

255

256 Geographic Coordinate Data

257

258 Following are the essential attributes to be captured for each

259 locality while georeferencing.

260

261 Decimal_Latitude - the latitude coordinate (in decimal degrees) at

262 the center of a circle encompassing the whole of a specific

263 locality. Convention holds that decimal latitudes north of the

264 equator are positive numbers less than or equal to 90, while

265 those south are negative numbers greater than or equal to -90.

266 Example: -42.51 degrees (which is the same as 42d 30' 36" S).

267

268 Decimal_Longitude - the longitude coordinate (in decimal degrees)

269 at the center of a circle encompassing the whole of a specific

270 locality. Decimal longitudes west of the Greenwich Meridian are

271 considered negative and must be greater than or equal to -180,

272 while eastern longitudes are positive and less than or equal to

273 180. Example: -122.49 degrees (which is the same as 122d 29' 24"

274 W).
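The conversion used in the two examples above, from degrees minutes seconds to signed decimal degrees, can be sketched as follows. The function names are assumptions; the range check uses the usual convention of latitudes from -90 to 90 and longitudes from -180 to 180.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds plus N/S/E/W to decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    magnitude = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by convention.
    return -magnitude if hemisphere in ("S", "W") else magnitude

def in_valid_range(decimal_latitude, decimal_longitude):
    """Check that the coordinates fall within the conventional limits."""
    return (-90 <= decimal_latitude <= 90
            and -180 <= decimal_longitude <= 180)
```

Here 42d 30' 36" S converts to -42.51 and 122d 29' 24" W converts to -122.49, matching the examples.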

275

276 Maximum_Error_Distance - the upper limit of the distance from the

277 given latitude and longitude within which the described locality

278 must lie.

279

280 Maximum_Error_Units - the units of length in which the maximum

281 error is recorded (e.g., mi, km, m, and ft). Express maximum

282 error distance in the same units as the distance measurement in

283 the specific locality description.

284

285 Datum - the geometric description of a geodetic surface model

286 (e.g., NAD27, NAD83, WGS84). Datums are often recorded on maps

287 and in gazetteers, and can be specifically set for most GPS

288 devices. Use "not recorded" when the datum is not known.

289

290 Original_Coord_System - the coordinate system in which the raw

291 data are being entered. For the purpose of collaborative

292 georeferencing this value will be "decimal degrees." However,

293 existing geographic coordinates may be entered in degrees

294 minutes seconds, degrees decimal minutes, or UTM coordinates.

295

296 Reference - the reference source (e.g., map, gazetteer, or

297 software) used to determine the coordinates. Such information

298 should provide enough detail so that anyone can locate the

299 actual reference that was used (e.g., name, edition or version,

300 year). Lat_Long_Determined_By - the person or organization by

301 which the determination was made.

302

303 Lat_Long_Determined_Date - the date on which the determination was

304 made.

305

306 Remarks - comments on methods and assumptions used in determining

307 coordinates or errors when those methods or assumptions differ

308 from or expand upon the accepted guidelines.

309

310 Glossary

311

312 Datum - A geodetic datum describes the size, shape, origin, and

313 orientation of a coordinate system for mapping the surface of

314 the earth.

315

316 Decimal degrees - degrees expressed as a single real number (e.g.,

317 -22.343456) rather than as a composite of degrees, minutes,

318 seconds, and direction (e.g., 7d 54' 18.32" E).

319

320 Geodetic surface model - a geometric description of the surface of

321 the earth.

322

323 Geographic coordinates - latitude and longitude, measured in any

324 of various coordinate systems.

325

326 Geographic center - To find the geographic center of a shape,

327 first find the extremes of both latitude and longitude within

328 the shape and then take their respective means.

329

330 UTM - Universal Transverse Mercator. A grid coordinate system

331 specifying a datum, zone, and offsets from the equator and from

332 the meridian of the zone. See the References section, below, for

333 more information.

334

335 References

336

337 Township, Range, Section Information:

338

339 http://www.esg.montana.edu/gl/trs-data.html

340

341 Datum Information:

342

343 http://www.colorado.edu/geography/gcraft/notes/datum/datum_f.html

344 http://164.214.2.59/GandG/tm83581/tr83581a.htm

345 http://biology.usgs.gov/geotech/documents/datum.html

346

347 UTM Information:

348

349 http://www.nps.gov/prwi/readutm.htm

350 http://www.dmap.co.uk/ll2tm.htm

351

352 Note

353

354 Specific locality descriptions are inexact and seldom give

355 estimates of error. An ideal description of a specific locality

356 has no error. One way to achieve this ideal is to describe the

357 locality by a shape within which the exact locality must

358 certainly lie. The capture of shape data is certainly possible

359 with current GIS technology, and is even demonstrably more

360 efficient than the methods described above. However, there are

361 technical challenges yet to be met in order to make the capture

362 of shape data feasible in a collaborative Internet-based

363 georeferencing environment.

364

365 An alternative to using a shape to describe a locality is to use

366 a definitive point of arbitrarily high precision with an

367 attendant maximum error. This method, described in the foregoing

368 document, is a conservative expression of the locality which

369 satisfies the requirement that the exact locality must lie

370 within the space described.

371

372

373 _________________________________________________

374

375 Rev. 24 September 2001, JRW

376

377 University of California, Berkeley, CA 94720, Copyright 2001,

378 The Regents of the University of California.

 

>>> Posting number 95, dated 27 Sep 2001 10:45:45

Date:         Thu, 27 Sep 2001 10:45:45 -1000

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Georeferencing document

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

John,

 

I went through your document this morning and find most of it clear and in

agreement with my own practices of georeferencing.  I have some

observations and questions as follows:

 

A.

140 The worst of situations arises when a specific locality

141 description is internally inconsistent. There are numerous

142 possible causes for inconsistencies. It is the task of those

143 georeferencing to determine the part of the description most

144 likely to be in error, ignore it for the purpose of the

145 determination, and document the decision to do so. The most

146 common source of inconsistency in a locality description comes

147 from trying to match elevation information with the rest of the

148 description. If there is no reasonable way to reconcile the

149 discrepancy, ignore the elevation.

150

151 Example: "10 mi W of Bakersfield, 6000 ft"

 

I have recently been through a georeferencing exercise in the herp

collection for which obtaining coordinates that agreed with the elevations

was critical.  It was only through trying to match the description of the

location (distance and direction from X village) with the elevation given,

and finding that the given elevation at the described site was impossible,

that I uncovered major problems in the locality data provided for a large

number of herps on one particular collecting trip.  In this case I was able

to contact the collector to ask about the inconsistencies and he determined

that his original distances were totally off because he was using miles on

a metric map.  In this case the elevations were the correct piece of

information.  I therefore caution against ignoring elevations out of hand.

 

B.

Section on Determining Latitude and Longitude does not include an example

for when coordinates are provided.  For the sake of completeness, should

such an example be included, or, since they are being provided and not

determined, should this be taken up in another section?  For example, when

coordinates are provided in degrees, minutes and seconds, do we translate

into decimals?  how many decimal places do we go for minutes?  for

seconds?  Does it matter who provided the

coordinates?  collector?  previous museum person?  someone else?  Under

what circumstances, if any, should we recalculate coordinates when they are

provided by some previous source?

 

 

C.

210 Error associated with distance precision

211

212 Distance may be recorded in a specific locality description with

213 or without significant digits, and those digits may or may not

214 be warranted. A conservative way to insure that distance

215 precision is not inflated is to treat distance measurements as

216 integers with fractional remainders. Thus 10.25 becomes 10 1/4,

217 10.5 becomes 10 1/2, etc. Calculate the error for these distances

218 based on the fractional part of the distance, using 1 divided by

219 the denominator of the fraction.

 

Lines 217-219.  Does this mean to "replace" the numerator  with 1, and

divide by the denominator?

 

221 Example: "10.5 mi N of Bakersfield" Fraction is 1/2, error should

222 be 0.5 mi.

 

numerator is 1 to begin with, so doesn't answer the question.

 

224 Example: "10.6 mi N of Bakersfield" Fraction is 1/10, error

225 should be 0.1 mi.

 

Isn't the fraction of .6,  6/10?   Did you replace the 6 with a 1 in order

to calculate the error?

 

227 Example: "10.75 mi N of Bakersfield" Fraction is 3/4, error should

228 be 0.25 mi.

 

Fraction this time is given as 3/4, not 1/4, but you could only get an

error of 0.25 by replacing the 3 with a 1 before dividing by 4.

 

As you can see, the examples are confusing.

 

 

All in all, it's a sound document.  Thanks much.

 

 

>>> Posting number 96, dated 27 Sep 2001 20:34:47

Date:         Thu, 27 Sep 2001 20:34:47 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         Gordon Jarrell <fnghj@AURORA.UAF.EDU>

Subject:      Re: Georeferencing document

In-Reply-To:  <5.0.2.1.1.20010927104434.00a2f7e0@mail.bishopmuseum.org>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Some good points.  I've inserted my comments.

 

On Thu, 27 Sep 2001, XXXXXXX wrote:

 

> A.

> 140 The worst of situations arises when a specific locality

> 141 description is internally inconsistent. There are numerous

> 142 possible causes for inconsistencies. It is the task of those

> 143 georeferencing to determine the part of the description most

> 144 likely to be in error, ignore it for the purpose of the

> 145 determination, and document the decision to do so. The most

> 146 common source of inconsistency in a locality description comes

> 147 from trying to match elevation information with the rest of the

> 148 description. If there is no reasonable way to reconcile the

> 149 discrepancy, ignore the elevation.

> 150

> 151 Example: "10 mi W of Bakersfield, 6000 ft"

> 

> I have recently been through a georeferencing exercise in the herp

> collection for which obtaining coordinates that agreed with the elevations

> was critical.  It was only through trying to match the description of the

> location (distance and direction from X village) with the elevation given,

> and finding that the given elevation at the described site was impossible,

> that I uncovered major problems in the locality data provided for a large

> number of herps on one particular collecting trip.  In this case I was able

> to contact the collector to ask about the inconsistencies and he determined

> that his original distances were totally off because he was using miles on

> a metric map.  In this case the elevations were the correct piece of

> information.  I therefore caution against ignoring elevations out of hand.

> 

 

The key words here are, "IF there is no way to reconcile the

discrepancy..."  A possible resolution of the discrepancy might be to

treat it as "specific locality unknown."  This might best be left to the

discretion of the individual collections.  We have to judge individually

how bad our bad data are, i.e., whether or not we can reconcile them.

 

> B.

> Section on Determining Latitude and Longitude does not include an example

> for when coordinates are provided.  For the sake of completeness, should

> such an example be included, or, since they are being provided and not

> determined, should this be taken up in another section?  For example, when

> coordinates are provided in degrees, minutes and seconds, do we translate

> into decimals?  how many decimal places do we go for minutes?  for

> seconds?  Does it matter who provided the

> coordinates?  collector?  previous museum person?  someone else?  Under

> what circumstances, if any, should we recalculate coordinates when they are

> provided by some previous source?

> 

 

(I know John's answer to some of this one.)  The coordinates define an

infinitely small point, no matter what the format.  Precision is measured

with max_error, not the number of significant figures.

 

Nevertheless, we will have coordinates in which precision was implied by

the recorded format.  We have to convert this implied imprecision into a

measure of max_error.  At UAM we are using 2 km, a little over a nautical

mile, for coordinates that were recorded to the nearest whole minute.

 

There are other examples, similar to the problems with distance precision:

        64D 28' 30" N -  What they meant to say, in terms of significant

figures, was probably 64D 28.5' N.  I suppose in this example we would use

max_error= 1 km

 

We probably do need to develop a standard here.  And yes, I'll bet we want

to be able to keep track of various determinations, re-determinations, who

did it, when, and how.

 

 

> C.

> 210 Error associated with distance precision

> 211

> 212 Distance may be recorded in a specific locality description with

> 213 or without significant digits, and those digits may or may not

> 214 be warranted. A conservative way to insure that distance

> 215 precision is not inflated is to treat distance measurements as

> 216 integers with fractional remainders. Thus 10.25 becomes 10 1/4,

> 217 10.5 becomes 10 1/2, etc. Calculate the error for these distances

> 218 based on the fractional part of the distance, using 1 divided by

> 219 the denominator of the fraction.

> 

> Lines 217-219.  Does this mean to "replace" the numerator  with 1, and

> divide by the denominator?

> 

> 221 Example: "10.5 mi N of Bakersfield" Fraction is 1/2, error should

> 222 be 0.5 mi.

> 

> numerator is 1 to begin with, so doesn't answer the question.

> 

> 224 Example: "10.6 mi N of Bakersfield" Fraction is 1/10, error

> 225 should be 0.1 mi.

> 

> Isn't the fraction of .6,  6/10?   Did you replace the 6 with a 1 in order

> to calculate the error?

> 

> 227 Example: "10.75 mi N of Bakersfield" Fraction is 3/4, error should

> 228 be 0.25 mi.

> 

> Fraction this time is given as 3/4, not 1/4, but you could only get an

> error of 0.25 by replacing the 3 with a 1 before dividing by 4.

> 

> As you can see, the examples are confusing.

> 

> 

 

Looks like a typo in line 224.

 

I suggest replacing the sentence beginning in line 217 with:

 

The error is the resolution implied by the denominator.  It can be

calculated as a distance by dividing one unit of distance by the

denominator.

 

Is that better?  Or worse?
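
The rule as reworded above can be sketched in a few lines of Python. This is only one possible reading: the function name, the list of candidate denominators (2, 4, 10, 100), and the choice to give a whole-number distance an error of one unit are assumptions made for illustration and are not taken from the Guidelines.

```python
from fractions import Fraction

# Candidate denominators, tried from coarsest to finest. This list is an
# assumption for illustration; the Guidelines do not enumerate one.
CANDIDATE_DENOMINATORS = (2, 4, 10, 100)


def distance_precision_error(distance_text):
    """Return the error implied by the fractional part of a distance.

    The distance is passed as text (e.g. "10.75") so that no binary
    floating-point rounding creeps in.
    """
    distance = Fraction(distance_text)
    fractional_part = distance - (distance.numerator // distance.denominator)
    if fractional_part == 0:
        # Assumption: a whole-number distance gets an error of one unit.
        return Fraction(1)
    for denominator in CANDIDATE_DENOMINATORS:
        # The coarsest denominator that expresses the fraction exactly wins.
        if (fractional_part * denominator).denominator == 1:
            return Fraction(1, denominator)
    raise ValueError("distance is finer than the candidate denominators")


for text in ("10.5", "10.6", "10.75"):
    print(text, "mi -> error", float(distance_precision_error(text)), "mi")
```

Under these assumptions the three Bakersfield examples come out as 0.5, 0.1 and 0.25 mi: the numerator of the fraction plays no part, only the denominator does.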

 

 

>>> Posting number 97, dated 28 Sep 2001 12:53:09

Date:         Fri, 28 Sep 2001 12:53:09 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Georeferencing guidelines

Mime-version: 1.0

Content-type: multipart/alternative;

              boundary="MS_Mac_OE_3084526390_196216_MIME_Part"

 

 

John et al.,

 

The georeferencing guidelines look great to me.  The only (minor) quibble I

have would be with the second item under the subheading "Offsets" (lines

86-89).  Here, you suggest that a locality that contains distance fractions

(such as "10.2 mi E Bakersfield") should be assumed to be road miles rather

than air miles. I see

it the other way around. Most field workers I know are careful to state "by

road" if their mileage was actually measured along a road.  Otherwise, the

mileage is assumed to be taken directly from a map (i.e., air miles).  I

don't see that the inclusion of fractions in the mileage should

automatically signal that the mileage was read from an odometer...it's easy

to get that level of precision using the distance scale printed on the map.

 

Let's see what the others think.  Well done.

 

 

>>> Posting number 98, dated 28 Sep 2001 11:33:22

Date:         Fri, 28 Sep 2001 11:33:22 -0700

Reply-To:     Peter Rauch <peterr@socrates.Berkeley.EDU>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: Georeferencing guidelines

In-Reply-To:  <OF482A362E.E38FA255-ON86256AD5.00621E6D@lsu.edu>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

On Fri, 28 Sep 2001, XXXXXXXX wrote:

 

> The georeferencing guidelines look great to me.  The only

> (minor) quibble I have would be with the second item under

> the subheading "Offsets" (lines 86-89). Here, you suggest

> that a locality that contains distance fractions (such as

> "10.2 mi E Bakersfield") should be assumed to be road miles

> rather than air miles. I see it the other way around. Most

> field workers I know are careful to state "by road" if their

> mileage was actually measured along a road.

 

On insect labels ;>)  "by road" is just that much more text to

cram onto tiny labels. Maybe things are different with

vertebrate folks, especially for those who keep detailed field

notebooks. I think lots of folks keep careful track of their

odometers, and record road/track miles quite often. I suspect

that *either* assumption is likely to be wrong too often (i.e.,

when no explicit indication is given of which type of

measurement is done). Perhaps the classification should be

"Basis of measure not indicated" and let the "buyer beware"?

(I.e., the geographic analyst can then choose how she wishes to

interpret the distances --perhaps choosing to measure both ways

if a locality seems out of place under one or the other

measurement scheme.)

 

 

 

>  Otherwise, the

> mileage is assumed to be taken directly from a map (i.e.,

> air miles).  I don't see that the inclusion of fractions in

> the mileage should automatically signal that the mileage was

> read from an odometer...it's easy to get that level of

> precision using the distance scale printed on the map.

 

>>> Posting number 99, dated 30 Sep 2001 13:35:49

Date:         Sun, 30 Sep 2001 13:35:49 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      FW: Locality comment

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

 

John et al.:

With regard to assigning coordinates to localities, there is a convention

that has been used here at KU for at least 50 years that will help with

localities that are given with reference to towns in the US.  When the town

(e.g. Lawrence) was a county seat, distances were measured from the

courthouse.  Frequently this was near the center of town, but it reduces the

error in estimating the distance from town because we don't need to worry

about the distance being measured from the city limits.  If the locality is

3.5 mi NW of Lawrence, we still have the uncertainty associated with the angular

component.  If the town is not a county seat, the Post Office is frequently

specified as the point of reference.  We think this system was exported to

several other collections that are part of MaNIS. In general, your

suggestions look quite reasonable (and conservative).

 

 

>>> Posting number 100, dated 12 Oct 2001 16:22:06

Date:         Fri, 12 Oct 2001 16:22:06 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Commentary synopsis

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Hi folks,

 

I've been ruminating over the responses to the Georeferencing Guidelines

document, which was posted on the MaNIS website on 24 Sep 2001. That

document has generated interest in a wider community, including the

Alexandria Digital Library Project, so I feel it worthwhile to spend a

little extra effort to fill in some omissions.  Below I will address the

points brought up in discussion and try to provide satisfactory solutions.

I would like to know if there are any objections to these solutions.  My

next step will be to incorporate this information into the Guidelines

document and then announce the existence of that document to NHCOLL.

 

XXXXXXXX mentioned a convention to use the courthouse for a

point of reference for a county seat and to use a post office as a point of

reference for other towns.  Since the Board on Geographic Names GNIS data

often follows this convention as well, I see no conflict. Of course, this

convention applies only to the US, and only to those towns where there is a

single identifiable post office or a courthouse.  For all other

determinations the current geographic center of the town, or the

coordinates given in a gazetteer, should be used. In either case it is best

to note something akin to "measured from the post office" or "measured from

the geographic center of Bakersfield" in the determination remarks.
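
For towns with no single courthouse or post office to measure from, the "geographic center" defined in the Guidelines glossary is the mean of the extreme latitudes paired with the mean of the extreme longitudes of the shape. A minimal Python sketch follows. The function name and the outline coordinates are invented for illustration, and the simple mean assumes the shape does not straddle the 180-degree meridian.

```python
def geographic_center(outline):
    """Return (lat, lon) midway between the extremes of an outline.

    `outline` is an iterable of (latitude, longitude) pairs in decimal
    degrees tracing the shape, e.g. a town boundary read off a map.
    """
    points = list(outline)
    latitudes = [lat for lat, _ in points]
    longitudes = [lon for _, lon in points]
    center_lat = (min(latitudes) + max(latitudes)) / 2
    center_lon = (min(longitudes) + max(longitudes)) / 2
    return center_lat, center_lon


# Hypothetical town outline (not real survey data).
town = [(35.30, -119.10), (35.42, -119.05), (35.40, -118.90), (35.33, -118.95)]
print(geographic_center(town))
```

Note that only the four extremes matter; interior vertices of the outline do not move the result.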

 

XXXXXXXX brought up the topic of elevations as a critical part of the

determination criteria. I agree with her assessment and I propose that we

follow XXXXXXXX's advice, namely, that localities for which there are

internal inconsistencies should be deferred to the parent institution for

further investigation.  I have designed the collaborative gazetteer to

allow annotations to both localities and higher geography. Through the

annotations, georeferencers can note inconsistencies for follow-up work.

Collaborators will be able to check the gazetteer for annotations that

apply to the data from their institution.

 

XXXXX also noted that there was no example of how to deal with existing

geographic coordinates. My original thought was that we should count these

localities as finished.   Yet, there is merit in revisiting existing data,

both for validation and for edification, especially since none of the

existing coordinates have associated error. Nevertheless, we must remain

cognizant of our budgetary constraints. We were given funds to georeference

localities for which we didn't already have coordinates. All that aside,

XXXXX's point is well-taken. I will provide guidelines for existing

geographic coordinates in the forthcoming revised Georeferencing Guideline

document.

 

XXXXX asked whether we should translate coordinates from other coordinate

systems into decimal degrees for data entry. The gazetteer currently

accommodates the following coordinate systems:

decimal degrees

degrees, decimal minutes

degrees, minutes, decimal seconds

UTM

 

But that doesn't answer the question. I will endeavor to create an

interface in which the user will select the original coordinate system and

provide the data in that system. Behind the scenes the data will be stored

in that system AND will be translated to decimal degrees. There will be

decimal degrees and the original coordinates for every determination.
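
The translation to decimal degrees described above is simple arithmetic, sketched here in Python. The function name and its argument layout are assumptions for illustration and do not describe the actual gazetteer interface; UTM is left out because it needs a datum and zone.

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees, minutes, seconds and a hemisphere to decimal degrees.

    Southern latitudes and western longitudes come back negative, as in
    the Guidelines. Degrees decimal minutes can be passed by leaving
    seconds at 0.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in [0, 60)")
    value = degrees + minutes / 60 + seconds / 3600
    limit = 90 if hemisphere in ("N", "S") else 180
    if not 0 <= value <= limit:
        raise ValueError("coordinate out of range")
    return -value if hemisphere in ("S", "W") else value


# The example from the Guidelines: 122d 29' 24" W
print(round(dms_to_decimal(122, 29, 24, "W"), 6))  # prints -122.49
```

Keeping the original values alongside the converted ones, as proposed, means the conversion can always be redone if a mistake is found later.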

 

XXXXX's next topic was with respect to the precision stored in the

coordinate fields. There is no reason to truncate the values of coordinates

to conform to a predefined level of precision.  For reasons described under

the section on Precision in the Georeferencing Guidelines document, it is

inappropriate to try to store precision information in the coordinate data.

Since the values of the coordinates do not make a statement about the

precision of the determination, keeping as many digits as your source

provides is the preferred method. Discarding digits may have an effect on

accuracy, so it is not recommended.  Just for edification, a decimal degree

that records five digits to the right of the decimal can distinguish

between two places on the earth roughly one meter apart. Similarly, if you

want to maintain accuracy down to one meter, degrees and decimal minutes

should be recorded with 4 decimal places in the decimal minutes, and

degrees minutes seconds should be recorded with 2 decimal places in the

decimal seconds. Conversely, degrees minutes seconds measured to whole

seconds can introduce inaccuracies of up to 31 meters. Those measured to

whole minutes can introduce inaccuracies of up to 1.85 km. I'll make a

chart of this information for the document revision.
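
The figures quoted above can be roughly reproduced with a short Python sketch. It assumes a spherical earth with 111.32 km per degree of latitude; that constant is an approximation chosen for illustration, not a value from the Guidelines.

```python
KM_PER_DEGREE = 111.32  # assumed mean length of one degree of latitude

# Smallest step each recording format can express, in degrees.
steps_in_degrees = {
    "decimal degrees, 5 places": 1e-5,
    "degrees decimal minutes, 4 places": 1e-4 / 60,
    "degrees minutes seconds, 2 places": 1e-2 / 3600,
    "whole seconds": 1 / 3600,
    "whole minutes": 1 / 60,
}

for label, step in steps_in_degrees.items():
    metres = step * KM_PER_DEGREE * 1000
    print(f"{label:36s} {metres:10.2f} m")
```

With this constant, whole seconds come out near 31 m and whole minutes near 1.85 km, in line with the numbers in the message.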

 

XXXXX's final question has to do with recording the information about who

determined the coordinates.  This should certainly be among the best

practices within museums.  At the MVZ these data are recorded by making a

reference to the actual person who made the determination.  Since the data

are internal to the museum we can tell whether that person was also the

collector or another person on staff. Another possibility is to record the

role of the person who made the determination (e.g., 'collector',

'curatorial assistant', 'Joe's specific locality munger', etc.). Or, if you

only care whether the collector was the one to provide the coordinates, you

could include a DeterminedByCollector field. For MaNIS I intend to use the

name of the person who determines the coordinates, this name being

determined from a login to the online georeferencing interface.

 

A point of clarification is in order. When determinations are made, I

intend to treat them as opinions. They will not be stored directly with the

locality record, rather, they will refer to it.  This allows any number of

lat/long opinions to be registered. The individual institutions will be

able to decide which one (if there are multiple opinions) will be the

"accepted" determination when they put the data back in their databases.

All of the coordinates that were provided in the data sent to me have been

turned into opinions and are already in the gazetteer.
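
The "opinions" arrangement described above might be modelled along the following lines. This is a hypothetical in-memory sketch in Python: the class names, field names and sample values are all invented for illustration and do not describe the real gazetteer schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Determination:
    """One lat/long opinion. It points at a locality, it does not live in it."""
    locality_id: int
    latitude: float             # decimal degrees
    longitude: float            # decimal degrees
    max_error_distance: float
    max_error_units: str        # e.g. "km", "mi"
    determined_by: str
    remarks: str = ""


@dataclass
class OpinionStore:
    """Hypothetical stand-in for the gazetteer's opinion tables."""
    opinions: List[Determination] = field(default_factory=list)
    accepted: Dict[int, Determination] = field(default_factory=dict)

    def register(self, opinion: Determination) -> None:
        self.opinions.append(opinion)

    def opinions_for(self, locality_id: int) -> List[Determination]:
        return [o for o in self.opinions if o.locality_id == locality_id]

    def accept(self, opinion: Determination) -> None:
        # The owning institution picks one opinion per locality.
        if opinion not in self.opinions:
            raise ValueError("cannot accept an unregistered opinion")
        self.accepted[opinion.locality_id] = opinion


store = OpinionStore()
first = Determination(42, 35.37, -119.02, 2.0, "km", "collector")
second = Determination(42, 35.38, -119.20, 1.0, "mi", "curatorial assistant",
                       remarks="measured from the post office")
store.register(first)
store.register(second)
store.accept(second)
print(len(store.opinions_for(42)), store.accepted[42].determined_by)
```

Because each opinion refers to the locality by identifier, any number of determinations can coexist, and choosing the accepted one never overwrites the others.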

 

XXXXXX made the following observation:

"There are other examples, similar to the problems with distance precision:

         64D 28' 30" N -  What they meant to say, in terms of significant

figures, was probably 64D 28.5' N.  I suppose in this example we would use

max_error= 1 km"

 

I agree with XXXXXX's assessment of significance, however, the

determination of error is more complicated.  Not all degrees are created

equal. Contrary to popular opinion, the distance between 64 degrees N and

65 degrees N is not the same as the distance between 10 degrees N and 11

degrees N. This is due to the oblateness (flattening from a perfect sphere)

of the earth. This may be a minor point, but longitudinal degrees vary

greatly, being roughly 110 km at the equator and 0 km at the poles. My

point is that I need to provide an interface in which one can enter

coordinates and the digits of precision and get back an error distance

based on those criteria.
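
A rough idea of such a calculation is sketched below in Python. It assumes a spherical earth with 111.32 km per degree, so it captures the shrinking of longitudinal degrees toward the poles but ignores the oblateness mentioned above. The function name and the choice to report the diagonal of the latitude/longitude cell are illustrative assumptions, not the planned interface.

```python
import math

KM_PER_DEGREE = 111.32  # assumed length of one degree on a spherical earth


def coordinate_precision_error_km(latitude, precision_degrees):
    """Distance spanned by one step of recorded precision at a latitude.

    The step is applied to both latitude and longitude, and the result is
    the diagonal of the resulting cell, so it is a conservative figure.
    """
    north_south = precision_degrees * KM_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude))
    return math.hypot(north_south, east_west)


# 64D 28' 30" N recorded to the nearest half minute (0.5' = 1/120 degree).
latitude = 64 + 28.5 / 60
print(round(coordinate_precision_error_km(latitude, 0.5 / 60), 2), "km")
# The same recorded precision at the equator spans a longer distance.
print(round(coordinate_precision_error_km(0.0, 0.5 / 60), 2), "km")
```

Under these assumptions the half-minute example at 64 degrees N lands close to the 1 km suggested in the quoted message, while the same precision at the equator gives a noticeably larger figure.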

 

I will amend my wording and typos with respect to using fractions in the

distance precision error section.

 

XXXXXXXXX brought up a reasonable alternative view of how offsets should

be handled. The judgement of whether measurements are "by road" or "by air"

can be a tricky one.  I want to propose a solution and see if I can get a

consensus.

 

Specific localities that actually say what the measurement method is (e.g.,

"2.8 mi (by road) E of Marysville") should use that method for determining

coordinates and errors. No special remark is necessary in these cases.

 

Specific localities that have two orthogonal measurements in them (e.g.,

"2.5 mi E and 1.5 mi N of Bakersfield") are always assumed to be "by

air."  No special remark is necessary in these cases either. Furthermore,

no error due to direction imprecision should be used.

 

So much for the easy stuff.

 

Specific localities that have one linear offset measurement from a named

place, but that do not specify how that measurement was taken (e.g., "10.2

mi E of Yuma") are open for a case-by-case judgment. I propose that the

judgement itself always be documented in the remarks for the determination

(e.g., "Assumed 'by air' - no roads E out of Yuma", or "Assumed 'by road'

on Hwy. 80"). If there is no clear best choice, then use the midpoint

between the two possibilities as the geographic coordinate and assign an

error large enough to encompass the coordinates and errors of both methods.

In this case I would remark something like "Error encompasses both distance

by air and distance by road (Hwy. 80)". This is a conservative solution,

but it is relatively simple to do and to remember.  This method is also

never "wrong," if by "wrong" we mean that the actual place is certainly

within our error distance from the given coordinates.
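
The midpoint-and-encompassing-error idea can be sketched in Python as follows. The sketch treats positions as flat (x, y) offsets from the named place, which is an assumption that is reasonable only over short distances. The function name and the sample offsets are invented for illustration.

```python
import math


def encompassing_determination(point_a, error_a, point_b, error_b):
    """Midpoint of two candidate points and a radius covering both.

    Points are (x, y) offsets in the same planar units (e.g. miles east
    and north of the named place); errors are radii in those units.
    """
    (xa, ya), (xb, yb) = point_a, point_b
    midpoint = ((xa + xb) / 2, (ya + yb) / 2)
    half_separation = math.hypot(xb - xa, yb - ya) / 2
    # Reach the far edge of whichever error circle is larger.
    radius = half_separation + max(error_a, error_b)
    return midpoint, radius


# "10.2 mi E of Yuma": a by-air candidate and a made-up by-road candidate.
by_air = (10.2, 0.0)
by_road = (9.1, 0.8)
midpoint, radius = encompassing_determination(by_air, 0.1, by_road, 0.1)
print(midpoint, round(radius, 2))
```

The resulting circle is larger than either candidate's own error circle, which is the price of not having to choose between the two interpretations.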

 

XXXXXXXXX brought up a question about what units should be used

for maximum error distance. I have set up the gazetteer so that the units

are entered (chosen actually) from a list of possible values (m, km, ft,

yds, mi). The distance and units should be chosen to make sense in the

context of the locality description. My conservative stance on translation

and recalculation issues is to "never adulterate data that can be

adulterated later." If you decide to put these data back into your

databases (and I certainly hope that you will), you can decide at that time

whether to normalize to a single unit of measure.

 

XXXXXXX also brought up an essential issue of whether errors propagate and

should therefore be summed rather than simply choosing the greatest single

source of error.  The answer is not a simple one, so bear with me.

 

XXXXXXX's specific example, "3 km N + 2 km W Bakersfield" is an instance

of a type of locality description for which I did not provide an example. A

proper description of the error for this example would be a bounding box

centered on the point 3 km N and 2 km W of Bakersfield. Each side of the

box would be 2 km in length (1 km error in any direction). Since we're

using a point and radius to characterize the error, we need a circle that

will circumscribe the above-mentioned bounding box. To do this, the radius

has to be the distance from the center coordinate to a corner. This could

either be calculated by the geometry of the bounding box (in the above

example it would be the 1 km distance to a side times the square root of 2)

or measured on a map.
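
The geometry is just the Pythagorean theorem applied to the two half-sides of the box. A minimal Python sketch, with a function name chosen for illustration:

```python
import math


def circumscribed_radius(east_west_error, north_south_error):
    """Radius of the circle through the corners of an error bounding box.

    Each argument is the error in one direction from the center, i.e.
    half the length of the corresponding side of the box.
    """
    return math.hypot(east_west_error, north_south_error)


# "3 km N + 2 km W Bakersfield": 1 km of error in each direction.
print(round(circumscribed_radius(1.0, 1.0), 3), "km")  # prints 1.414 km
```

For a square box the result is the half-side times the square root of 2; for unequal errors in the two directions the same function still gives the corner distance.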

 

There remains the more general question of whether errors propagate. They

do, and they are non-linear, so to sum them is a mistake. The paragraph

above shows how a sum is not a satisfactory method of accommodating

multiple sources of error. As more sources of error come to bear, the

propagation gets even more "interesting." I'll spare you the details here,

but I'll make a point of explaining these sources and how they should be

dealt with in the Guidelines revision.

 

In addition to the issues brought up so far in discussion, I have a few to

add independently. First, I got the calculation for directional error

wrong. I'll update that in the revision. Second, it is probably obvious,

but I still need to state that the directional error can be ignored either

when the distance is measured "by road" or when the description gives two

orthogonal offsets (e.g., "2 mi E and 4 mi N"). Third, there is another

source of error inherent in reading maps. This error is based on the scale

and it reflects inherent errors in the maps themselves. I will quantify

these errors in the revision.

 

Aside from the revised georeferencing document, I'm currently working on

interfaces to do the georeferencing online. I'll send out a how-to guide

when the interface is ready to use.  It is too soon to know when that will be.

 

So that everyone knows, my field season is about to begin. Eileen and I are

scheduled to leave for Argentina on 3 Nov and to return around New Year's day.

 

That's it for my update. Feel free to discourse on my proposed amendments

and thanks to everyone for the comments thus far.

 

John

 

>>> Posting number 101, dated 16 Oct 2001 12:43:55

 

>>> Posting number 102, dated 18 Oct 2001 19:30:33

Date:         Thu, 18 Oct 2001 19:30:33 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Guideline Document Updated

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

It took almost two weeks, but the eagerly-awaited revision to the

Georeferencing Guidelines Document is finally complete. I have replaced the

original document, so the following URL now points to the revision:

 

http://dlp.cs.berkeley.edu/manis/GeorefGuide.html

 

I'm not including the line-numbered text of the document here since we are

presumably past the heated debates.  Nevertheless, commentary is

always  welcome.

 

When you read the revised document you are likely to be struck by the

complexities of determining error properly. Don't despair. My next task is

to create an error calculator. The idea is to have a web page on which you

can enter the relevant parameters and get a maximum error distance. This

tool will be a supplement to the georeferencing tool itself, the

development of which is underway.

 

John

 

>>> Posting number 103, dated 19 Oct 2001 12:29:38

 

>>> Posting number 104, dated 4 Nov 2001 21:44:44

Date:         Sun, 4 Nov 2001 21:44:44 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Barbara R. Stein" <bstein@OZ.NET>

Subject:      MaNIS--ready, set, georeference!

MIME-Version: 1.0

Content-Type: multipart/alternative;

              boundary="------------24FB9C29A003860042ABE8C3"

 

Dear All,

 

This is the moment I know you have all been waiting for!  You will

notice a new Gazetteer link at the bottom of the MaNIS home page

(http://dlp.cs.berkeley.edu/manis).  This is your gateway to hours of

georeferencing fun.  But before starting to work, please read this

message in its entirety, print it out and post it next to the computer

that will be used for georeferencing.  You’ll see why you need to print

it when you get near the bottom.

 

To begin, please review the updated Georeferencing Guidelines.

 

Next, you will want to read the Georeferencing Steps document.  A hot

link to it appears at the top of the gazetteer page.

 

You will also want to read the text below the query screen on the

gazetteer main page.

 

After reading all of the above, you will query the gazetteer for a

locality of interest.  The "Search" button returns a list of all higher

geographies containing the term entered and indicates how many unique

localities are contained in the result set.  The list will not tell you

how many of those localities are already georeferenced.  You will see

those data once you download the localities.

 

You may choose to “View” the queried localities either before or after

downloading BUT this function will not aid you in assigning lat/long

coordinates.  Only those localities for which coordinates have already

been assigned get plotted using the GIS viewer (this is the same tool we

showed you at the ASM meeting, courtesy of the Berkeley Digital Library

Project).

 

Where the GIS viewer is most helpful is in pointing out erroneous

coordinates (e.g., if you view the georeferenced localities from

Algeria, 3 specimens appear in the Atlantic Ocean).  By clicking on that

point on the map, you can see the locality record(s) for that point and

correct it/them or, if the locality is not yours, you can contact the

appropriate institution.  The viewer also allows you to see how much

work you have accomplished!
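
The kind of error the viewer makes visible can also be caught with a simple range check. The following is a minimal Python sketch, not part of the MaNIS tools: the bounding box for Algeria is a rough approximation supplied for illustration, and the record layout and the function name outside_box are hypothetical.

```python
# Rough bounding box for Algeria in decimal degrees (an approximation
# chosen for this sketch, not a value taken from the MaNIS gazetteer).
ALGERIA_BOX = {"min_lat": 18.5, "max_lat": 37.5, "min_lon": -9.0, "max_lon": 12.5}


def outside_box(records, box):
    """Return the records whose coordinates fall outside the box."""
    suspects = []
    for rec in records:
        lat, lon = rec["lat"], rec["lon"]
        inside = (box["min_lat"] <= lat <= box["max_lat"]
                  and box["min_lon"] <= lon <= box["max_lon"])
        if not inside:
            suspects.append(rec)
    return suspects


records = [
    {"id": 1, "lat": 36.75, "lon": 3.06},    # near Algiers
    {"id": 2, "lat": 36.75, "lon": -30.6},   # plots in the Atlantic Ocean
]
suspects = outside_box(records, ALGERIA_BOX)
```

A box test only flags points that leave the country's overall extent; a wrong point that still falls inside the box would need the map view or a real boundary check.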

 

Notes about the viewer:  This is a java applet and takes time to load.

Do not attempt to use it on older machines with inadequate memory.

Also, not all map layers exist for all parts of the world (e.g., you

will only get USGS 7.5-minute topo maps for the U.S.).  How far you can zoom

and the level of resolution you see will depend on the map layers

available.

 

Additional notes:  1) This gazetteer is a static snapshot of your data

compiled for the sole purpose of georeferencing unique localities.

Corrections to specific localities should be made directly in

institutional databases.  They will not be made in the gazetteer so

don't spend time fixing them in the downloaded files.  2) Below the

georeferencing steps you will see the complete list of fields that will

appear in your downloaded files.  Those that are in bold are fields you

will fill.  Those not in bold are needed by John to reassociate the data

in the gazetteer with the data in your institutional databases.  DO NOT

alter the values in these fields!

 

For security purposes, we are not posting instructions on how to upload

georeferenced localities on the web site.  Below is the complete text

for Step Eight of the Georeferencing Steps document.  These instructions

are also being archived on the listserv should you forget to print out

this message.  Follow the instructions below for uploading completed

files:

 

Step Eight - Upload Finished Localities

    Upload the finished file of georeferenced localities by anonymous

FTP to galaxy.cs.berkeley.edu in the directory incoming/mvz/manis. Use

your favorite FTP client to connect to galaxy.cs.berkeley.edu. Log in as

anonymous, providing your email address as a password. Set the file type

to text. Change to the incoming/mvz/manis directory on galaxy. Transfer

your file.
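
For those who would rather script the transfer than use an FTP client, Step Eight can be sketched with Python's standard ftplib module. This is an illustration only: the host and directory are the ones named in the step, while the function name upload_localities, its parameters, and the example file name are assumptions made here.

```python
from ftplib import FTP
import os

HOST = "galaxy.cs.berkeley.edu"      # host named in Step Eight
REMOTE_DIR = "incoming/mvz/manis"    # directory named in Step Eight


def upload_localities(local_path, email, ftp=None):
    """Send one finished locality file by anonymous FTP in text mode.

    Pass an FTP object that has not logged in yet as `ftp` to reuse a
    connection; otherwise a connection to HOST is opened and closed here.
    """
    own_connection = ftp is None
    if own_connection:
        ftp = FTP(HOST)
    ftp.login("anonymous", email)          # email address as the password
    ftp.cwd(REMOTE_DIR)
    remote_name = os.path.basename(local_path)
    with open(local_path, "rb") as handle:
        # storlines sends the file line by line, i.e. as a text transfer
        ftp.storlines("STOR " + remote_name, handle)
    if own_connection:
        ftp.quit()
    return remote_name
```

A call such as upload_localities("ca_localities.txt", "you@example.org") would perform the login, directory change, and text-mode transfer described above; the file name and address are placeholders.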

 

Notice that the MVZ has already laid claim to all California localities

(see MaNIS Georef. Checklist in Step 2).  Try as you might, we will not

relinquish this claim!  It is therefore incumbent upon each of you to

lay claim to an equally prestigious set of localities.

 

Those of you paying attention will realize that John is now in Argentina

for two months.  He hoped to have the Error Calculator completed before

leaving.  He did not.  However, once completed, you will simply enter

your lat/long coordinates and it will do all the work of calculating the

error in those values for you-- so it is worth the wait.  Go ahead and

start georeferencing now.  You will soon be able to go back and fill in

the errors needed as he will post the calculator from the field.

 

I wish I had more to report on the status of your subcontracts, but I do

not.  Some of you will be able to begin work regardless.  The

bureaucracy has a timeline of its own. We simply have to proceed as best

we can in the meantime.

 

Please continue to address any questions or comments to the list.

Ready, set, georeference!

 

Best,

Barbara

 

 

>>> Posting number 105, dated 6 Nov 2001 09:51:19

 

>>> Posting number 106, dated 6 Nov 2001 09:00:24

 

>>> Posting number 107, dated 6 Nov 2001 12:24:23

 

>>> Posting number 108, dated 6 Nov 2001 14:29:22

 

>>> Posting number 109, dated 6 Nov 2001 16:52:12

 

>>> Posting number 110, dated 6 Nov 2001 16:06:24

Date:         Tue, 6 Nov 2001 16:06:24 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Patricia W. Freeman" <pfreeman1@UNL.EDU>

Subject:      Re: MaNIS--ready, set, georeference!

Comments: cc: hgenoways1@unl.edu

In-Reply-To:  <4.2.2.20011106122240.00abdfb8@packrat.musm.ttu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Dear members of MaNIS-

 

I am actually out of your official MaNIS loop, but I have already

georeferenced Nebraska for mammals, birds, herps, and fish (over 60,000

specimens) and will probably do South Dakota as well.  I salvaged 8,000

herps and about 1,500 mammals from USD about two years ago.

 

All four vertebrate groups are on our web page and searchable to county.

Although we have already georeferenced all four collections, the complete

localities will not be put on the webpage until next semester (I hope).  My

computer expert, using the Texas Tech georeferencing idea, modified and

wrote a conversion program changing all our geographic localities to

georeferenced localities.

 

 We now have a large NT server that has the USGS maps and gazetteers on it.

Since Hugh Genoways is rewriting the Mammals of Nebraska and has already

started gathering specimens for that purpose, all mammals and mammal data

used for that study will be automatically georeferenced and those data will

accompany the loaned materials on return to their home institution.  I

expect that he has or will contact most of you who have Nebraska material.

 

Regards-

Trish Freeman

 

PS. Can any of you direct me to FISHNET or BIRDNET if there are such

things?  I am already involved with HERPNET, although I do not know what is

happening with it.  Maybe someday we will have VERTNET.

 

 

 

 

 

 

 

Patricia W. Freeman

Professor/ Curator of Zoology

University of Nebraska State Museum

Lincoln NE 68588-0514

402-472-6606

402-472-8949 (fax)

Natural history museums archive biological diversity.

http://www-museum.unl.edu/research/zoology/zoology.html

 

>>> Posting number 111, dated 7 Nov 2001 09:09:31

 

>>> Posting number 112, dated 7 Nov 2001 08:32:12

 

>>> Posting number 113, dated 8 Nov 2001 14:03:13

 

>>> Posting number 114, dated 8 Nov 2001 14:39:28

Date:         Thu, 8 Nov 2001 14:39:28 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: MaNIS--ready, set, georeference!

In-Reply-To:  <3BE6274C.F9AC2E10@oz.net>

Mime-version: 1.0

Content-type: multipart/alternative;

              boundary="MS_Mac_OE_3088075168_258732_MIME_Part"

 


 

--MS_Mac_OE_3088075168_258732_MIME_Part

Content-type: text/plain; charset="US-ASCII"

Content-transfer-encoding: 7bit

 

Dear all,

 

1.  I have Internet Explorer 5 for Macintosh on a G4.  I haven't been able

to download records from the Manis website.

 

2.  Our grant submission allotted funds to each institution based on their

records to be geo-referenced.  Does committing to a state/province or region

change all of this?

 

3.  The process has changed considerably between when our records were

downloaded for John and the ASM meeting.  I thought that our  records were

being submitted so that John would have a snapshot of what the different

databases looked like in order to design the  Manis database.  I had planned

to clear up any inconsistencies, spelling errors, etc in our localities

before we geo-referenced and downloaded to the Manis database.  This seems

to make sense, since many errors in locality records can be cleared up only

with the use of in-house resources such as field notes and catalogs.  Now we

are committing to a region and giving our best opinion on perceived errors

(to be noted in the Locality Annotation) to other institutions (and

ourselves!) for them to rectify (or not) at their leisure.  Since I  haven't

been able to download records,  I don't know how much this new scheme will

save time overall or be more time consuming!

 

4.  There are many localities that are designated unique that simply differ

in syntax, spelling, etc.  They are not necessarily next to each other.

Would editing our own version of the database first for these errors and

then downloading them into the Manis database work?

 

Cheers,

 

XXXXXXXXX

 


--MS_Mac_OE_3088075168_258732_MIME_Part--

 

>>> Posting number 115, dated 8 Nov 2001 21:20:18

Date:         Thu, 8 Nov 2001 21:20:18 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      permutations on "unique" localities in the gazetteer

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Dear All:  I was wondering about many of the same points that XXXX

XXXXXXXXX mentioned in his email of 8 Nov.  Especially after perusing the

gazetteer and seeing many permutations on "unique" localities.  E.g.,

localities like Seattle, 20 mi N, 20 mi N of Seattle, Seattle, 20 mi north,

and north of Seattle 20 miles, have to be allowed because of institutional

style or preference.  However, an entry such as Seatle, 20 mi N could be

corrected.  Each is a unique record to the computer and will receive the

same lat/long by georeferencers?   Once georeferenced, the permutations can

be identified, but if  localities are entered differently, how much

efficiency is gained by having one institution georeference all records for

a region vs having each georeference their own records?   In addition when

a typo like Seatle is corrected, it no longer is unique but of the same set

as the correct spelling.  The typos will be deleted from the static

gazetteer after determining that they were corrected in the institutional

database (see comment from Barbara below)?   It is unclear to me how

corrections in institutional databases will be mirrored in the static

gazetteer.
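
To make the point about permutations concrete, here is a minimal Python sketch of how differently worded versions of one locality could be flagged for a human to review. It is not how the gazetteer works; the synonym list, the function names, and the idea of an order-independent key are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Spelled-out words mapped to one abbreviation (an illustrative list only).
SYNONYMS = {"north": "n", "south": "s", "east": "e", "west": "w",
            "mile": "mi", "miles": "mi"}
FILLER = {"of"}


def locality_key(text):
    """Reduce a locality string to an order-independent key."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    tokens = [SYNONYMS.get(tok, tok) for tok in tokens if tok not in FILLER]
    return tuple(sorted(tokens))


def group_permutations(localities):
    """Return groups of two or more strings that share the same key."""
    groups = defaultdict(list)
    for loc in localities:
        groups[locality_key(loc)].append(loc)
    return [members for members in groups.values() if len(members) > 1]


localities = [
    "Seattle, 20 mi N",
    "20 mi N of Seattle",
    "Seattle, 20 mi north",
    "north of Seattle 20 miles",
    "Seatle, 20 mi N",          # typo: does not share a key with the others
]
groups = group_permutations(localities)
```

Because the key ignores word order, it can also lump together localities that are genuinely different (for example, two offsets with their distances swapped), and it does not catch misspellings such as Seatle. It is suited only to suggesting candidates, not to merging records automatically.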

 

Although the idea of compiling a static gazetteer of unique localities

seemed like a good idea at the beginning, it does not seem doable at this

point.  I would prefer to go back to the original plan of each institution

dealing with their own records and offering assistance to others as needed.

Once georeferencing is started  and we get $ for the servers, the

gazetteer could be produced dynamically, or at least by frequent uploads -

rather than statically - and can be consulted, updated, corrected, winnowed

as needed.

 

From 4 Nov email of Barbara:

...

Additional notes:  1) This gazetteer is a static snapshot of your data

compiled for the sole purpose of georeferencing unique localities.

Corrections to specific localities should be made directly in

institutional databases.  They will not be made in the gazetteer so

don't spend time fixing them in the downloaded files.

...

 

 

 

 

 

>>> Posting number 116, dated 9 Nov 2001 08:57:34

Date:         Fri, 9 Nov 2001 08:57:34 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Barbara R. Stein" <bstein@OZ.NET>

Subject:      Re: MaNIS--ready, set, georeference!

MIME-Version: 1.0

Content-Type: text/plain; charset=iso-8859-1

Content-Transfer-Encoding: quoted-printable

 

> 1.  I have Internet Explorer 5 for Macintosh on a G4.  I haven't been able

> to download records from the Manis website.

 

XXXX et al.,

 

We are checking this out and, with luck, will have a fix today.  In the

meantime, you can download from a Mac using Netscape.

 

> 2.  Our grant submission allotted funds to each institution based on their

> records to be geo-referenced.  Does committing to a state/province or region

> change all of this?

 

No, it does not.  It was presumed that, in most instances, the majority of

localities for a given state, and the geographic expertise and resources to

untangle geographic problems, would reside with the institution in that

state.  Therefore, it made sense that we should work cooperatively to

georeference.  Each institution naturally will have many other specimens

collected outside that state.  Each can choose to do only its own

localities, thereby encouraging duplicate effort, or we can attempt a more

altruistic approach and gain economies of scale.  If, after georeferencing

all of California, the MVZ looks at its remaining collections and sees that

it has a tremendous amount of material from Brazil, Peru and Argentina, and

recognizes that it also has more geographic expertise in these regions than

any of the other institutions (and presumably more maps, gazetteers, etc.),

then we are going to offer to do all localities from those countries for the

sake of efficiency and making the money go as far as possible.  In return,

we know we will benefit from the Bishop Museum doing our PNG material, of

which we have a fair number of specimens.  We could do it, yes.  But they

can probably do it more quickly and easily.  This approach also allows those

with an interest in a particular region of the world to get a good handle on

what exists in our joint collections and, I suspect, reach some very

interesting summaries about those regions and the state of our knowledge of

their mammalian fauna.

 

> 3.  The process has changed considerably between when our records were

> downloaded for John and the ASM meeting.

 

No, it has not.  All of this was discussed online during the proposal

preparation process beginning more than a year ago.

 

> I thought that our records were being submitted so that John would have a

> snapshot of what the different databases looked like in order to design the

> Manis database.

 

That is also true.  There were always two objectives in giving John your

data.

 

> I had planned to clear up any inconsistencies, spelling errors, etc in our

> localities before we geo-referenced and downloaded to the Manis database.

 

The time to have cleared up those problems was before the data were sent to

John.  Since this approach was outlined in the first proposal submission

over a year ago, it should not have come as a surprise.  The money we

receive from NSF was never intended to pay institutions to clean up their

locality records.  It is to georeference those records.

 

> This seems to make sense, since many errors in locality records can be

> cleared up only with the use of in-house resources such as field notes and

> catalogs.  Now we are committing to a region and giving our best opinion on

> perceived errors (to be noted in the Locality Annotation) to other

> institutions (and ourselves!) for them to rectify (or not) at their leisure.

 

Since you haven't started to georeference, you will have to take my word

that your fears are probably worse than reality.  Truly erroneous localities

become obvious quite quickly and if they are not your own, simply email a

query to the institution to which that locality belongs.

 

Multiple versions of the same locality also jump out quickly.  The advantage

of using a single individual to georeference a region is that s/he quickly

becomes familiar with the localities in that place.  My own personal

suggestion is that each PI sit down with the data and try this process him-

or herself before hiring a student to really get going on it.  It will give

you confidence and a much better feel for how it all works.  And, if you

love maps like I do, it can actually be quite a seductive exercise.  Your

problem will be to keep working and not to get distracted by the geography

and all the places you would like to collect, have collected, etc.  Perhaps

the most difficult aspect is recognizing place names that are no longer in

use.  Again, review the georeferencing guidelines, which remind you not to

dwell on any single seemingly intractable locality.

 

> 4.  There are many localities that are designated unique that simply differ

> in syntax, spelling, etc.  They are not necessarily next to each other.

> Would editing our own version of the database first for these errors and

> then downloading them into the Manis database work?

 

I don't believe so.  As mentioned above, each institution has known about

this approach for more than a year and could have, in that time, chosen to

direct part of its routine curatorial effort to cleaning up localities in

its db.  The final distributed db will have whatever corrected specific

localities get made during the georeferencing process.  We were not given

money to clean up our localities.  We received this money to georeference.

You are under no obligation to correct localities for other institutions.

You are merely being asked to georeference them.  Even if related localities

do not fall out in line with one another in your downloaded files, if one

individual works on all the localities for a given region, s/he will not

have trouble recalling that a lat/long for a similar place was assigned just

two days ago and one can scroll up the list to find it.

 

 

I am sure John will want to add his own comments to what I have written.  He

generally has access to email about once a week.  In the meantime, I will

let you know as soon as we solve the download problem.  That does not have

to wait for him.

 

Best, Barbara

 

>>> Posting number 117, dated 9 Nov 2001 09:28:19

Date:         Fri, 9 Nov 2001 09:28:19 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Barbara R. Stein" <bstein@OZ.NET>

Subject:      Re: permutations on "unique" localities in the gazetteer

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii

Content-Transfer-Encoding: 7bit

 

> Each is a unique record to the computer and will receive the

> same lat/long by georeferencers?

 

Yes.

 

> Once georeferenced, the permutations can

> be identified, but if  localities are entered differently, how much

> efficiency is gained by having one institution georeference all records for

> a region vs having each georeference their own records?

 

Please refer to my reply to XXXXXX.'s previous message on this issue.  Having a

fair amount of experience doing georeferencing, the MVZ and other instigators

of this proposal believe strongly that much efficiency can be gained by a

cooperative approach.  Proof of our commitment is that the MVZ has agreed to do

all California localities for this project even though we have completed

georeferencing our own localities for many counties in the state more than a

year ago.  We believe we can just do it more efficiently and more painlessly

than any of you folks can.  Even LACM didn't fight us on this point.  I can

change the oil in my car but...

 

> In addition when

> a typo like Seatle is corrected, it no longer is unique but of the same set

> as the correct spelling.  The typos will be deleted from the static

> gazetteer after determining that they were corrected in the institutional

> database (see comment from Barbara below)?

 

No, the typos will not be deleted from the static gazetteer.  The static

gazetteer exists simply as a way to unite all localities from our respective

dbs for georeferencing and then return the georeferenced locs to their

respective dbs.

 

> It is unclear to me how

> corrections in institutional databases will be mirrored in the static

> gazetteer.

 

I repeat-- corrections in institutional dbs will not be mirrored in the static

gazetteer.  Rather, your efforts will be mirrored in the final product--a

geographic dictionary coupled with the distributed db network and GIS viewer.

Please review our NSF proposal.

 

> Although the idea of compiling a static gazetteer of unique localities

> seemed like a good idea at the beginning, it does not seem doable at this

> point.

 

It has been done, for the purpose it was designed to carry out.

 

> I would prefer to go back to the original plan of each institution

> dealing with their own records and offering assistance to others as needed.

 

That is not what was agreed to or specified in the proposal.

 

> Once georeferencing is started  and we get $ for the servers, the

> gazetteer could be produced dynamically, or at least by frequent uploads -

> rather than statically - and can be consulted, updated, corrected, winnowed

> as needed.

 

And it will be.  You are exactly right.

 

Best,

Barbara

 

>>> Posting number 118, dated 9 Nov 2001 14:20:26

 

 

>>> Posting number 119, dated 9 Nov 2001 14:57:01

Date:         Fri, 9 Nov 2001 14:57:01 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Static Gazetteer

MIME-version: 1.0

Content-type: multipart/alternative;

              boundary="Boundary_(ID_WoSRKrESJwWTVCyCL0UPxw)"

 

--Boundary_(ID_WoSRKrESJwWTVCyCL0UPxw)

Content-type: text/plain; format=flowed; charset=us-ascii

Content-transfer-encoding: 7BIT

 

Dear All,

 

To add to my last message, I don't think the static gazetteer was a

surprise, rather the timing of it was.  When I sent the TTU site data to

John early in the summer, I told him that we are in the middle of verifying

and correcting our database.  (We have been working on checking and

correcting our database for nearly three years; I happily report that we

are all but done now.)  At the time, I told John that the corrected data

were NOT what was being sent to him.  He implied that this was okay and

that the static gazetteer would be created at a later time.  However, I may

have misunderstood him.  Now, it seems that several of us have data that we

are not comfortable with in the already compiled gazetteer.

 

I did understand that the NSF money was meant to cover database

corrections, but I thought we'd begin georeferencing only after the data

had been corrected.  I think we're all looking for ways to simplify the

process and having the idiosyncrasies of years of data entry already fixed

would greatly facilitate the process.  Is there some way to address this

problem (uncorrected data in the gazetteer)?  Or do we push ahead with the

gazetteer as it is?  In my mind, going ahead with it as it is will create

some additional work for those doing the georeferencing (because of the

duplications), but it will create a great deal of additional work  for each

institution as errors are corrected.  In our case at TTU, we will have to

go through the gazetteer (once we get the georeferenced records back),

compare all those records to the file we just spent three years updating

and update the whole thing all over again.  Remember that not all of the

corrections will be simple typos or punctuation problems.  We're correcting

incorrect data as well (e.g., wrong county names entered).  If we could

have the opportunity to update the gazetteer with corrected data before the

process is too far along, it would help considerably.

 

 

>>> Posting number 120, dated 9 Nov 2001 15:09:14

 

>>> Posting number 121, dated 9 Nov 2001 15:59:31

Date:         Fri, 9 Nov 2001 15:59:31 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Correction

MIME-version: 1.0

Content-type: multipart/alternative;

              boundary="Boundary_(ID_spZx6thUFA8HhEMCCdkxcQ)"

 

--Boundary_(ID_spZx6thUFA8HhEMCCdkxcQ)

Content-type: text/plain; format=flowed; charset=us-ascii

Content-transfer-encoding: 7BIT

 

Correction to my last note:  I did understand that NSF money was NOT to be

used to make corrections to the databases.

 

Sorry for the slip.

 

XXXX.

 

 

 

 

>>> Posting number 122, dated 9 Nov 2001 15:13:09

Date:         Fri, 9 Nov 2001 15:13:09 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Barbara R. Stein" <bstein@OZ.NET>

Subject:      Re: Static Gazetteer

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii

Content-Transfer-Encoding: 7bit

 

> ... We're correcting incorrect data as well (e.g., wrong county names entered).  If we could have the

opportunity to update the gazetteer with corrected data before the process is too far along, it would

help considerably.

 

XXXXXX,

 

I am very sympathetic to the argument you put forth and am quite sure I would be operating out of my

league if I were to speak for John on this issue.  However, I would like to offer several thoughts--

 

First, an encouraging thought, with the caveat that John will surely correct me if I am wrong--  The

locality ID field in your downloaded files (the one you have been warned not to alter!) will be used to

reassociate the georeferenced data with the records in your dbs--regardless of the content of those

records.  So do not despair if you have corrected some of your localities since you sent John the data.

This was to be hoped for and should not present a problem.  If records did have erroneous data (like a

wrong county), these will likely be difficult to georeference on the first pass and may be skipped, but

they should be easy to deal with by the home institutions once all the data are returned and we each

look for remaining unreferenced localities in our own dbs.
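
The reassociation described above amounts to a join on the locality ID that ignores the locality text. A minimal Python sketch follows; the field names (locality_id, lat, lon), the function name reassociate, and the sample values are hypothetical and do not come from the actual download format.

```python
def reassociate(institution_rows, georeferenced_rows, id_field="locality_id"):
    """Copy coordinates onto institutional rows that share a locality ID."""
    coords = {row[id_field]: row for row in georeferenced_rows}
    updated = []
    for row in institution_rows:
        merged = dict(row)                 # leave the caller's rows untouched
        match = coords.get(row[id_field])
        if match is not None:
            merged["lat"] = match["lat"]
            merged["lon"] = match["lon"]
        updated.append(merged)
    return updated


institution = [
    # Spelling corrected in the home database after the snapshot was taken.
    {"locality_id": 101, "locality": "Seattle, 20 mi N"},
    {"locality_id": 102, "locality": "Tacoma"},
]
gazetteer = [
    # The snapshot still has the old spelling; coordinates are illustrative.
    {"locality_id": 101, "locality": "Seatle, 20 mi N", "lat": 47.9, "lon": -122.33},
]
result = reassociate(institution, gazetteer)
```

Because the match is on the ID alone, the corrected spelling in the institutional row is kept and only the coordinates are copied over. Rows with no match, such as ID 102 here, simply remain unreferenced.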

 

Second, we have committed to quite a large project over the course of three years and it is imperative

that we start working ASAP.  It is simply not possible to delay georeferencing while each collection

takes time to verify and correct its locality data.  Have the majority of collections made substantive

changes/corrections to their locality data since those data were sent to John?  I don't know, but I

suspect the majority has not, even though we are all continually cleaning up our data on a daily basis.

So how long do we wait?  Despite the fact that you have not received your money, we are already two

months into this project.  We need to begin work.  It could also be argued that we should delay because

of all the new specimens that have been entered into our dbs since the data were sent to John....  At

some point we must draw the line.

 

What I ask is that each institution lay claim to a set of localities, that they download those data, and

then spend a bit of time examining what's really there.  Begin georeferencing.  Become familiar with the

process we've outlined.  It may be slow going initially, but as with all new techniques, it will become

quicker and easier with practice.

 

I sincerely regret any misunderstandings that may have occurred.  It is important to keep communicating

and I thank you for your contributions.

 

Best, Barbara

 

>>> Posting number 123, dated 9 Nov 2001 16:10:16

Date:         Fri, 9 Nov 2001 16:10:16 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Barbara R. Stein" <bstein@OZ.NET>

Subject:      alternative download method

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii

Content-Transfer-Encoding: 7bit

 

XXXX et al.,

 

Beneath the "Download" button there is now an alternative option for

those who may have experienced problems.  Click on the link that says

"Alternate download method is here."  A text file with the data should

display in the browser window. Go to the "File" menu and select "Save

As..." to save the file on your computer.  Then open Excel and import

the file.

 

Best,

Barbara

 

>>> Posting number 124, dated 15 Nov 2001 08:18:59

Date:         Thu, 15 Nov 2001 08:18:59 -0800

Reply-To:     bstein@oz.net

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         Barbara Stein <bstein@OZ.NET>

Subject:      downloading problems solved

 

Dear All,

 

I believe that the problems some individuals were having with downloading

locality data are now solved.

 

For those using IE on a Mac, an alternative download button has been added with

instructions.  Click to download after viewing the list of specific localities

that result from your search and you will see the alternative option beneath

the original download button.

 

There is also no longer a problem with downloading large numbers of records

(e.g., >8500) so I hope you will feel emboldened.

 

Remember, the downloaded files need to be imported into your spreadsheet of

choice before you will see the headers and the data lined up in a way that

makes sense to you.  Do not attempt to simply work with the downloaded files as

is.

 

Lastly, the subcontract budgets have been set up and are in the hands of

Berkeley's SPO.  It is up to that office to notify your SPOs that the money is

available.  It is out of the MVZ's control at this point.

 

Best,

Barbara

 

>>> Posting number 125, dated 15 Nov 2001 11:09:32

 

>>> Posting number 126, dated 16 Nov 2001 07:38:49

Date:         Fri, 16 Nov 2001 07:38:49 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Collaborative Georeferencing Theory II

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Dear all,

In this message I am responding to the discussion begun by XXXXXXXXX

on 8 Nov and continued by XXXXXXXXX. I will refer to both of their

messages herein. I realize that Barbara has already answered these points

while I was out contracting chilblains in the Patagonian wind, but it may

be a comfort to some to see the extent to which we are in agreement

without having had the benefit of communicating.

 

XXXXXXXX said...

[

2.  Our grant submission allotted funds to each institution based on their

records to be geo-referenced.  Does committing to a state/province or

region change all of this?

]

-------------------

No. Funding was based on the number (and difficulty) of the localities in

your collection that need to be georeferenced. In theory, if everyone does

the amount of georeferencing for which they were funded at the speeds we

deduced from experience, then all of the localities without coordinates

will be georeferenced under the funding we were given. In order to take

advantage of the pooling of like localities (i.e., those in the same area

on the map regardless of their source institution) we need to have people

commit to geographic areas that best suit them. Suitability includes not

only geographic areas of interest and of expertise, but also of scope. For

example, if I am institution X, given funding for 10 weeks of

georeferencing, then committing to a geographic area that will take 20

weeks to georeference may be good citizenship, but it is not good

finance. Basically, spend as many weeks on georeferencing as you are

listed for in the NSF Project Description. Details on georeferencing rates

(i.e., localities per hour for different classes of geography) were given

in the Project Implementation section of the NSF Project Description. If

you need to estimate what you are committing to in terms of time, read

that section. It will probably be worthwhile for everyone to monitor

his/her georeferencing rates. If your rates are significantly different

from those projected, send a message to the list. If you are going a lot

faster, we want to know how you're doing it. If you're going a lot slower,

maybe we can help increase your efficiency.
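
The arithmetic behind matching a commitment to funded weeks can be sketched in a few lines of Python. This is only an illustration: the rates below are invented placeholders, not the figures from the NSF Project Description, and the function names are hypothetical.

```python
# Hypothetical georeferencing rates in localities per hour, by class of geography.
# These numbers are placeholders, not the rates from the NSF Project Description.
ASSUMED_RATES_PER_HOUR = {"US": 20.0, "non-US": 8.0}


def estimated_weeks(locality_counts, rates_per_hour, hours_per_week=40.0):
    """Estimate the weeks needed to georeference the given numbers of localities."""
    total_hours = sum(
        count / rates_per_hour[geo_class]
        for geo_class, count in locality_counts.items()
    )
    return total_hours / hours_per_week


def observed_rate(localities_done, hours_spent):
    """Localities per hour actually achieved, for comparison with the projections."""
    return localities_done / hours_spent


# A candidate commitment (invented counts): 4000 US and 1600 non-US localities.
commitment = {"US": 4000, "non-US": 1600}
weeks_needed = estimated_weeks(commitment, ASSUMED_RATES_PER_HOUR)
print(f"Estimated effort: {weeks_needed:.1f} weeks")
```

Comparing `weeks_needed` with the weeks listed for an institution shows whether a proposed region is within scope, and `observed_rate` gives the number to report to the list if it differs much from the projections.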

-------------------

XXXXXXXX said...

[

3.  The process has changed considerably between when our records were

downloaded for John [W.] and the ASM meeting.  I thought that our records

were being submitted so that John [W.] would have a snapshot of what the

different databases looked like in order to design the Manis database.

]

-------------------

The last point is true, but it is not the only reason I gathered the

data. Following is an excerpt from the original message from Barbara Stein

asking that data be sent to John W.:

 

"NOTE:  The data you send him will not be distributed in any way, shape,

or form; he will do nothing more than examine it and compare the structure

and general content of the files and then use this data to make the

initial global locality file that will be available for general

reference.  This is extra work that is being done on MVZ's nickel, but

something we feel will keep this project on track and give you the most

bang for your buck."

 

At that point in time we already knew we would use a combined locality

gazetteer; it just wasn't clearly stated yet how we would use

it. By the time of the ASM meeting I had almost finished the gazetteer and

its purpose was more definitively stated. Following is a quote from the

ASM 2001 meeting notes:

 

"While John [W.] begins work on developing the network, participants will

begin georeferencing. This is why John [W.] asked for your data. From those

data he will create a combined snapshot of unique localities, which will be

used for georeferencing."

-------------------

XXXXXXX said...

[

I had planned to clear up any inconsistencies, spelling errors, etc in our

localities before we geo-referenced and downloaded to the Manis

database.  This seems to make sense, since many errors in locality records

can be cleared up only with the use of in-house resources such as field

notes and catalogs.  Now we are committing to a region and giving our best

opinion on perceived errors (to be noted in the Locality Annotation) to

other institutions (and ourselves!) for them to rectify (or not) at their

leisure.  Since I haven't been able to download records,  I don't know how

much this new scheme will save time overall or be more time consuming!

]

and XXXXXXX said...

[

Dear All:  I was wondering about many of the same points that XXXX

XXXXXX mentioned in his email of 8 Nov.  Especially after perusing the

gazetteer and seeing many permutations on "unique" localities.  E.g.,

localities like Seattle, 20 mi N, 20 mi N of Seattle, Seattle, 20 mi

north, and north of Seattle 20 miles, have to be allowed because of

institutional style or preference.  However, an entry such as Seatle, 20

mi N could be corrected.  Each is a unique record to the computer and will

receive the same lat/long by georeferencers?   Once georeferenced, the

permutations can be identified, but if  localities are entered

differently, how much efficiency is gained by having one institution

georeference all records for a region vs having each georeference their

own records?

]

-------------------

First, it would be nice if we each had clean and consistent data in our

databases. We don't. We vary greatly in how close we are to achieving that

aim, not only in terms of the raw amount of cleaning to do, but especially in

how long it would take each of us to do it. For this reason we cannot wait

for localities to be cleaned up before we start georeferencing.

 

Second, NSF provided funds to georeference localities, not to clean up

existing data. Nor did our methods and time estimates in the NSF proposal

depend on "clean" localities. I agree that it would be more efficient to

georeference ALREADY clean localities, but it is faster to georeference

them as they are than it is to clean them up and then georeference them.

 

Third, in answer to XXXX's last question, the methods presented in our

proposal have been tested and shown to be much more efficient than the

alternative of having each institution georeference only its own

localities. Forgive my digression into a lengthy answer, but this is an

extremely important matter.

 

The concept of uniqueness is, as XXXX points out, defined by the

computer's ability to distinguish one locality from another. Thus, "20 mi

N of Seattle" is a different record from "Seattle, 20 mi N." Furthermore,

there might be two localities "20 mi N of Seattle", one for UWBM and one

for PSM. There are several reasons for keeping these separate, the most

obvious and important of which is to be able to identify from which

institution a locality description came. So, with the MaNIS gazetteer I've

basically given everyone a list of their unique localities, but you could

each have done that yourselves. The real purpose behind the gazetteer is

to combine localities for all institutions by geographic regions. By far

the most time-consuming aspect of georeferencing is finding places on a

map. Thus, it behooves you to assemble localities that are likely to be in

roughly the same place and then find them on a map all at once. Once you

are on the right map you can get coordinates for all of the localities in

that area. So, suppose I have downloaded localities for which the county

is "Kern." At the top of my list of localities for Kern County is one from

UWBM that says "Bakersfield, 10 mi E; Rattlesnake Grade." I see that the

named place is Bakersfield, so I filter my Kern County records to show me

only those which contain the word "Bakersfield." It turns out that in Kern

County there are 117 localities from 10 institutions that mention

"Bakersfield." I get out my map of the Bakersfield area and start looking

for "Rattlesnake Grade." I can't find it on my map right away so I'm going

to skip this locality for the moment. The next twelve localities on my

list are from six different institutions, but they all have some variation

on "3 mi E of Bakersfield." I find this location on my map once, get the

coordinates and copy them to all twelve localities that match this

place. The next locality on my list is from MVZ and it says "Bakersfield,

6 mi N, 9 mi E; Rancheria Road (Rattlesnake Grade)." Oh, so that's where

Rattlesnake Grade is - on Rancheria Road. Now I can go figure out that

first locality, which I skipped at first.
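
The Bakersfield workflow above (filter by named place, find a spot once, copy the coordinates to every matching description) can be sketched in Python. This is a rough illustration under stated assumptions, not the tool used for MaNIS: the field names, the sample rows, and the coordinate values are all invented, and the pattern only recognizes simple single-offset descriptions.

```python
import re

# Matches one offset such as "3 mi E", "3 miles east", or "3 mi. E".
_OFFSET = re.compile(
    r"(\d+(?:\.\d+)?)\s*mi(?:les?)?\.?\s*(north|south|east|west|[NSEW])\b",
    re.IGNORECASE,
)


def filter_by_place(records, place):
    """Keep records whose locality text mentions the named place."""
    needle = place.lower()
    return [rec for rec in records if needle in rec["locality"].lower()]


def offset_key(locality):
    """Return (distance, direction) for a simple single-offset locality, else None.

    Descriptions with several offsets, or with an extra feature after ';',
    are deliberately left for a person to work out on the map.
    """
    if ";" in locality:
        return None
    offsets = _OFFSET.findall(locality)
    if len(offsets) != 1:
        return None
    distance, direction = offsets[0]
    return (float(distance), direction[0].upper())


def copy_coordinates(records, place, measured):
    """Copy each measured (lat, lon) to every record with the same offset key."""
    updated = 0
    for rec in filter_by_place(records, place):
        key = offset_key(rec["locality"])
        if key in measured:
            rec["lat"], rec["lon"] = measured[key]
            updated += 1
    return updated


# Invented sample rows in the spirit of the Kern County example.
records = [
    {"institution": "UWBM", "locality": "Bakersfield, 10 mi E; Rattlesnake Grade"},
    {"institution": "MVZ", "locality": "3 mi E of Bakersfield"},
    {"institution": "PSM", "locality": "Bakersfield, 3 miles east"},
    {"institution": "MVZ",
     "locality": "Bakersfield, 6 mi N, 9 mi E; Rancheria Road (Rattlesnake Grade)"},
]

# Placeholder values only, not measured coordinates.
measured = {(3.0, "E"): (35.0, -119.0)}
updated = copy_coordinates(records, "Bakersfield", measured)
print(f"{updated} localities received coordinates")
```

The two Rattlesnake Grade descriptions are skipped on purpose: as in the example, it takes a person reading both to realize that one explains the other.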

 

So, to answer XXXX's last question again, there are multiple ways in which

the combined localities aid in the overall efficiency of the

georeferencing process. From the illustrative example above, only the MVZ

had to possess the Kern County map; nobody had to go out and buy one. Only

one person had to find Bakersfield on a map, rather than one person from

each of the ten institutions that had localities from that area. It was

possible to find Rattlesnake Grade for all localities that mentioned it,

not just for the one that also happened to locate it on Rancheria Road. It

might not otherwise have been possible to georeference this locality or

maybe the error would have been much greater than it needed to be. The

single locality 3 mi E of Bakersfield could be found and measured once and

the results copied to all twelve localities that were really the same

place. While the foregoing is all well and good in theory, empirical

testing at the MVZ backs it up with hard numbers. Georeferencing rates

doubled when localities from three collections were combined versus when

they were done separately. Further increasing the number of collections

will result in even greater efficiency.

 

Now let me go back and address part of XXXX's comment that I have

neglected thus far.

 

XXXXXXXX said...

[

"Now we are committing to a region and giving our best opinion on

perceived errors (to be noted in the Locality Annotation) to other

institutions (and ourselves!) for them to rectify (or not) at their

leisure."

]

-------------------

I'm not sure what XXXX's point is here, but I'll try to explain the

Locality Annotation again. Locality Annotation is one of the fields in the

downloaded locality data. This field is provided as a courtesy to alert

the institution that provided a locality that there is something

inconsistent about it. It's not meant to be filled with opinions on

perceived errors, it is meant to note definitive inconsistencies. For

example, if I get a locality in the downloaded file for Inyo County that

says "Bakersfield", then there is a problem with the locality. It's not an

opinion, and it isn't a perceived error; it is simply true that

Bakersfield is not in Inyo County. It's up to me as the georeferencer to

decide whether this is enough of a problem to not georeference the

locality. In this particular case I could either choose to georeference

the locality, because I know that Bakersfield is in Kern County, or I

could choose not to georeference it simply because I'm doing Inyo County

and Bakersfield is out of my "jurisdiction." If I took the latter

option, it wouldn't be because I'm a stickler for boundaries; it's just that

I'd have to go get another map and that would waste time. It might be

better to leave some inconsistent localities until later. Nevertheless,

since I've spent the energy to figure out that there is a problem with the

locality, I might as well extend the courtesy of noting what the problem

is. It'll save time for someone else later on. It is this philosophy that

led me to include the NoGeorefBecause field in the download as well. If

I'm able to determine that a locality cannot be georeferenced, I might as

well say so, and why, so that the next person who sees that this locality

doesn't have coordinates will not bother to try to determine them.
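
The Inyo County example can be expressed as a small consistency check. The sketch below is hypothetical: the place-to-county lookup is an invented stand-in for whatever authority file a georeferencer has at hand, and only the LocalityAnnotation field name comes from the download described above.

```python
# Hypothetical lookup of named places to their counties (assumed for illustration).
PLACE_TO_COUNTY = {"bakersfield": "Kern"}


def annotate_county_mismatch(record, place_to_county=PLACE_TO_COUNTY):
    """Note a definitive inconsistency between a named place and the stated county.

    Returns True and fills record["LocalityAnnotation"] when a mismatch is found.
    """
    text = record["locality"].lower()
    for place, county in place_to_county.items():
        if place in text and county.lower() != record["county"].lower():
            record["LocalityAnnotation"] = (
                f"{place.title()} is in {county} County, "
                f"not {record['county']} County"
            )
            return True
    return False


record = {"county": "Inyo", "locality": "Bakersfield"}
flagged = annotate_county_mismatch(record)
print(record.get("LocalityAnnotation"))
```

The check records a fact about the locality and nothing more; whether to georeference the record anyway remains the georeferencer's decision.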

-------------------

XXXXXXXX said...

[

4.  There are many localities that are designated unique that simply

differ in syntax, spelling, etc.  They are not necessarily next to each

other.  Would editing our own version of the database first for these

errors and then downloading them into the Manis database work?

]

-------------------

Yes. In theory it could work, but it is not practical. In addition to the

reasons I gave above, this kind of activity would take a great deal of my

time, which I hope you would agree could be better spent on other things.

-------------------

XXXXXXX said...

[

In addition when a typo like Seatle is corrected, it no longer is unique

but of the same set as the correct spelling.  The typos will be deleted

from the static gazetteer after determining that they were corrected in

the institutional database (see comment from Barbara below)? It is unclear

to me how corrections in institutional databases will be mirrored in the

static gazetteer.

 

The comment from Barbara was...

[

...

Additional notes:  1) This gazetteer is a static snapshot of your data

compiled for the sole purpose of georeferencing unique localities.

Corrections to specific localities should be made directly in

institutional databases.  They will not be made in the gazetteer so

don't spend time fixing them in the downloaded files.

...

]

-------------------

XXXX's question is well founded. I have nowhere yet described what will

happen to the georeferenced localities. I'll try now to clear up this part

of the grand scheme. I've already explained that I would like the

georeferenced localities to be sent back to me so that I can proof them,

load them back into the gazetteer, and keep a running status of the

georeferencing aspect of the project. In principle, you could download

sets of georeferenced localities for your institution at any time and load

them into your own database. But that isn't the most efficient way to go

about the problem. It would be better to wait until all georeferencing is

done, then download all localities for your institution and create the

lat_long records for them all at once, with my help, if necessary. Note

that I am not explaining how to create the lat_long records or how to

incorporate them in your database. The reason is that (almost) everyone's

database structure is different from everyone else's, so there is no one

single solution to fit all. That's why I offer my help to get these data

back into your databases, but I can only afford to do it one time for each

institution that needs it.

 

Now back to XXXX's question. Changes in your databases will not be

mirrored in the static gazetteer. There will be no changes whatsoever to

localities in the static gazetteer, as per Barbara's additional notes. If

you correct typographical errors in your database it will not affect the

georeferencing process. If you make a substantive change to a locality

(one that would affect how the locality is georeferenced), then there will

be an easily discernible discrepancy that can be resolved at the time when

lat_longs are incorporated into your database. Nevertheless, the more

changes you make to your localities during the georeferencing period, the

more work you will potentially create for yourself later.

-------------------

XXXXXX said...

[

Although the idea of compiling a static gazetteer of unique localities

seemed like a good idea at the beginning, it does not seem doable at this

point.  I would prefer to go back to the original plan of each institution

dealing with their own records and offering assistance to others as

needed.  Once georeferencing is started  and we get $ for the servers, the

gazetteer could be produced dynamically, or at least by frequent uploads -

rather than statically - and can be consulted, updated, corrected,

winnowed as needed.

]

I hope I've done something to counter the above sentiment. Let me add

another note about the static gazetteer. It is an interim tool intended to

help us divide up the georeferencing responsibilities and to monitor

georeferencing progress. Your databases are not static. Yet, to function

effectively, we need a fixed target. The real end product of this endeavor

will include a dynamic gazetteer that will draw from the

continually-updated locality data contained in the participating

databases. At that point, when you add new data, or change existing data,

it will be reflected in the dynamic gazetteer without intervention.

I hope this clarifies the reasoning behind our approach to

georeferencing. Considerable thought and effort have gone into

establishing and testing the methods set forth here and elsewhere in the

MaNIS documents. Barbara and I remain convinced that this is the most

reasonable approach to an otherwise daunting task.

 

John W.

 

>>> Posting number 127, dated 16 Nov 2001 11:51:55

Date:         Fri, 16 Nov 2001 11:51:55 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: Collaborative Georeferencing Theory II

In-Reply-To:  <Pine.GSO.4.21.0111160737280.29268-100000@socrates.Berkeley.EDU>

Mime-version: 1.0

Content-type: text/plain; charset="US-ASCII"

Content-transfer-encoding: 7bit

 

John,

 

My intention was never to clean up our locality data with geo-referencing

funds!  I was operating on the assumption that we would be responsible for

our own data and therefore it would have been worthwhile to clean it up on

our own dime before geo-referencing.  Which gets to another question.  I

have cleaned up localities in our database since downloading it to you.  Is

this going to cause problems in downloading the newly geo-referenced

localities from MANIS into our current database?  Can I continue to clean up

our own database?  Did I understand you correctly when you said to leave

localities that have lat/long alone?  The reason I ask is that I noticed

that when you transferred our lat/long to the Manis database, the minutes

were incorrectly interpreted as decimal degrees.  Should I worry about this?

Will we have to change our database to accept decimal degrees?  I appreciate

your thorough responses.  I am trying to clarify and simplify our tasks.

That is my bottom line.

 

Cheers, XXXX

 

PS I didn't put this on the site, because I am seeking clarity not a debate.

 

>>> Posting number 128, dated 16 Nov 2001 15:59:44

 

>>> Posting number 129, dated 17 Nov 2001 12:55:06

Date:         Sat, 17 Nov 2001 12:55:06 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: Collaborative Georeferencing Theory II

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Dear All:  Thanks to John W. for the overview and examples.  In summary, we

are georeferencing unique geographical entries rather than unique

localities.  Unique can be a function of geography, institutional acronym,

syntax, typos, punctuation and errors.   The goal is clearer.

 

 

XXXXXX

 

 

>>> Posting number 130, dated 17 Nov 2001 12:55:48

Date:         Sat, 17 Nov 2001 12:55:48 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Barbara R. Stein" <bstein@OZ.NET>

Subject:      Re: Questions about Georeferencing

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii

Content-Transfer-Encoding: 7bit

 

XXXXXXXX wrote:

 

> Thanks for all of the great georeferencing information, steps, and

> guidelines! XXXXXXX and I have been familiarizing ourselves with the

> guidelines, steps and very helpful weblinks.  We downloaded the Ingham

> County (Michigan) records into the Access template, and I feel that this

> county is a comfortable starting place for us (it is our institution's

> county).

 

Go for it!  Starting is half the battle.

 

> Before we begin, I would appreciate clarification on a couple of items.

> Thank you for your time.

 

As always, I will provide my thoughts and John will weigh in when he's next

online.

 

> 1)  Is it okay to use available "online" latitude and longitude

> coordinates, as long as Datum information, etc. are available?

 

Yes.  Just make sure you specify the source of those coordinates in the

designated field on your spreadsheet.

 

> For example, the Township, Range, Section Information website

> (http://www.esg.montana.edu/gl/trs-data.html.) that is listed in the MaNis

> Georeferencing Guidelines has links whereby one can search for a named

> place, and the decimal degrees coordinates (to four decimal places) come up

> for that place (example, City of Mason, Michigan).  Is it okay to use such

> on-line coordinates for georeferencing place names, or should all

> georeferencing should be done with "hard copy" references?

 

We encourage you to take advantage of all available tools, that's why we

provided those URLs.  There may be others as well.  Just make sure your sources

are credible.

 

> 2)  If the answer to the above question is that all georeferencing should

> be done with "hard copy" references, then ignore this one.

> 

> A related question to 1): from the same website mentioned above, one can

> link to "TerraServer" and get (really interesting) aerial photos of places.

>  With the aid of a labelled map, one can zoom in and find specific

> buildings (such as the Michigan State University Swine Barn - a real Ingham

> County example).  From a zoomed aerial image, you can click on "Image Info"

> and get lat and long (non-decimal) coordinates for "tiles" (corners of

> squares) surrounding the image.  Datum information is included in "Image

> Info".

> 

> So my question is, is it okay for us to use these types of on-line aerial

> images for georeferencing?

 

I'm including this question just for completeness.  The answer is, of course,

yes.  And remember, do not worry about the type of coordinate data you record.

The error calculator will be able to convert data provided in any format (e.g.,

deg, min, sec; dec. degrees; etc.) into any other format.  Knowing the datum,

providing the source of your coordinates, and noting any assumptions you have

made in assigning those coordinates are what's crucial.

 

> 3)  With regard to the "DeterminedDate" data field in the download file -

> is there a specific format for the date data(i.e. MM/DD/YYYY or DD Month

> Spelled Out YYYY) that you would like us to use?

 

No, because most spreadsheet programs will dictate a format.  It seemed

worthless for us to specify one.  John will have to deal with that variety

later.

 

Best,

Barbara

 

>>> Posting number 131, dated 17 Nov 2001 14:01:10

Date:         Sat, 17 Nov 2001 14:01:10 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      download of GNIS dataset

Comments:

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

GNIS locality datasets for states can be downloaded from:

 

http://mapping.usgs.gov/www/gnis/gnisftp.html

 

The dataset for Washington consisted of 32K+ localities and Oregon had

50K+.  Both loaded into Excel without problems (after unzipping), and

provide a good start on an authority file for locations + lat/longs.  I

wish I had it back when we originally entered our data.  Locations can be

found with a search or scrolling in Excel, or by loading into a database

program.  As long as you don't need a map, lookup on the downloaded file is

faster than via the GNIS webpage.   The downloaded file also has lat/longs

as decimals, which don't appear to be accessible on the GNIS webpage.

These can be entered into two fields of MaNIS with a copy/paste rather than

parsing or typing the dddmmss + direction string into the eight fields

required for ddd, mm, entry.
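
For anyone who does end up with packed dddmmss + direction strings, the parsing can be sketched in Python as follows. The exact string layout ("473622N", "1221955W") is an assumption made for illustration, and the function name is hypothetical; check it against the file you actually download.

```python
def dms_string_to_decimal(value):
    """Convert a packed string such as '473622N' or '1221955W' to decimal degrees.

    Assumes two-digit minutes and seconds, preceded by two or three digits of
    degrees and followed by one hemisphere letter. South and west are negative.
    """
    value = value.strip().upper()
    if len(value) < 2:
        raise ValueError(f"too short to be a coordinate: {value!r}")
    hemisphere, digits = value[-1], value[:-1]
    if hemisphere not in "NSEW" or not digits.isdigit() or len(digits) not in (6, 7):
        raise ValueError(f"unrecognized coordinate string: {value!r}")
    degrees = int(digits[:-4])
    minutes = int(digits[-4:-2])
    seconds = int(digits[-2:])
    if minutes >= 60 or seconds >= 60:
        raise ValueError(f"minutes or seconds out of range: {value!r}")
    decimal = degrees + minutes / 60 + seconds / 3600
    return -decimal if hemisphere in "SW" else decimal


# Illustrative values only.
latitude = dms_string_to_decimal("473622N")
longitude = dms_string_to_decimal("1221955W")
print(round(latitude, 6), round(longitude, 6))
```

Rejecting minutes or seconds of 60 and above catches strings that were not in the assumed layout to begin with.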

 

 

 

>>> Posting number 132, dated 19 Nov 2001 07:53:03

 

>>> Posting number 133, dated 20 Nov 2001 10:41:10

 

>>> Posting number 134, dated 20 Nov 2001 10:57:38

 

>>> Posting number 135, dated 20 Nov 2001 18:52:31

Date:         Tue, 20 Nov 2001 18:52:31 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Vertical Datum?

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Dear Barbara,

 

Thanks for your reply to my earlier message.  I have another question for

both you and John:

 

Do we need to note the "Vertical Datum" if one is provided on a map source?

 One of the Michigan USGS maps that I looked at this week had the following:

Horizontal Datum:  NAD1927

Vertical Datum:  NGVD 1929

 

Also, it looks like we'll be using Topozone

(www.topozone.com/findplace.asp) for georeferencing some of the Michigan

localities (just point the cursor anywhere on the map and the coordinates

of choice [UTM, lat-long decimal, or lat-long degrees and minutes] appear

on the lower part of the screen).

 

XXXXXXXX

 

 

 

>>> Posting number 136, dated 23 Nov 2001 10:20:39

Date:         Fri, 23 Nov 2001 10:20:39 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Collaborative Georeferencing Theory II

In-Reply-To:  <B81AAE5B.EB7%jrozdil@u.washington.edu>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

On Fri, 16 Nov 2001, XXXXXXXX wrote:

 

> John,

> 

> My intention was never to clean up our locality data with geo-referencing

> funds!  I was operating on the assumption that we would be responsible for

> our own data and therefore it would have been worthwhile to clean it up on

> our own dime before geo-referencing.  Which gets to another question.  I

> have cleaned up localities in our database since downloading it to you.  Is

> this going to cause problems in downloading the newly geo-referenced

> localities from MANIS into our current database?  Can I continue to clean up

> our own database?

 

 

XXXX and all,

There has been some confusion with respect to localities,

lat_longs, higher geographies, and the means by which data get back into

your local databases. I have neglected the discussion so far in favor of

getting people working, but clearly there is a great deal of anticipation

on the subject.  I'll explain this stuff in detail on my trip into town

next week.

 

In the meantime, continue as you were. If you are in the midst of cleaning

up locality data and have a good reason to continue doing so at the

moment, go ahead. If you weren't cleaning up locality data, don't do so

for the sake of MaNIS.

 

>Did I understand you correctly when you said to leave

> localities that have lat/long alone?  The reason I ask is that I noticed

> that when you transferred our lat/long to the Manis database.  The minutes

> were incorrectly interpreted as decimal degrees.  Should I worry about this?

 

It seems I have misinterpreted your latitude and longitude data, is that

correct? The original data should be ddmmss, not dd.dddd? Is this true of

all lat_long entries? If so, then I need to update the gazetteer with the

correct data. I can do this from here in Argentina, but I'll have to do it

the next time I come to town. You were right to worry about this. Even

though we don't have to georeference those localities that already have

coordinates (at least not in the first pass), we do want to be able to use

them for reference, so they should be made correct. It's probably a good

idea for every institution that provided some lat_long data to do a little bit

of double checking to see if I've made the correct interpretation of your

data. If I made one mistake, I certainly am capable of making others.
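
To show why this kind of misreading matters, here is a small worked example in Python. The coordinate is invented, and the kilometres-per-degree figure is a rough average for latitude, so treat the result as an order of magnitude only.

```python
KM_PER_DEGREE_LATITUDE = 111.0  # rough average, assumed for illustration


def degrees_minutes_to_decimal(degrees, minutes, seconds=0.0):
    """Convert degrees, minutes, and seconds to decimal degrees."""
    return degrees + minutes / 60 + seconds / 3600


# An invented latitude of 47 degrees 36 minutes.
true_value = degrees_minutes_to_decimal(47, 36)

# The same digits misread as decimal degrees (47.36 instead of 47.60).
misread_value = 47.36

error_degrees = abs(true_value - misread_value)
error_km = error_degrees * KM_PER_DEGREE_LATITUDE
print(f"true {true_value:.2f}, misread {misread_value:.2f}, "
      f"off by about {error_km:.0f} km")
```

An offset of tens of kilometres is far larger than the uncertainty of most locality descriptions, which is why the existing coordinates are worth double checking before they are used for reference.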

 

> Will we have to change our database to accept decimal degrees?  I appreciate

> your thorough responses.  I am trying to clarify and simplify our tasks.

> That is my bottom line.

 

You will not have to make changes in your database to accept decimal

degrees. You can use whatever coordinate system you like locally, and I

can give you your data in that format when it comes time to download data

from the gazetteer into your database.
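
Going the other way, from decimal degrees back to degrees, minutes, and seconds, is equally mechanical. The following Python sketch is one possible way to do it, not a description of the MaNIS error calculator; the function name and the rounding choice are assumptions.

```python
def decimal_to_dms(value, is_latitude, places=2):
    """Convert signed decimal degrees to (degrees, minutes, seconds, hemisphere).

    Works in rounded total seconds so that values such as 47.6 come out as
    47 36 0.0 rather than 47 35 59.99...
    """
    if is_latitude:
        hemisphere = "N" if value >= 0 else "S"
    else:
        hemisphere = "E" if value >= 0 else "W"
    total_seconds = round(abs(value) * 3600, places)
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return int(degrees), int(minutes), round(seconds, places), hemisphere


print(decimal_to_dms(47.6, is_latitude=True))
print(decimal_to_dms(-122.5, is_latitude=False))
```

Because the conversion is lossless to within the chosen rounding, a local database can keep whichever format it already uses.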

 

For better or worse, I have been trying to simplify explanations -

sometimes at the expense of explaining the complete plan. I guess it's

turning out OK though, because all of the right questions are being asked.

 

John W.

 

>>> Posting number 137, dated 23 Nov 2001 10:24:34

Date:         Fri, 23 Nov 2001 10:24:34 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Questions about Georeferencing

In-Reply-To:  <3BF6CED4.5296BDB@oz.net>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Dear All,

Barbara has answered everything below perfectly well. I'm just "weighing

in" to say so.

 

On Sat, 17 Nov 2001, Barbara R. Stein wrote:

 

> XXXXXXXX wrote:

> 

> > Thanks for all of the great georeferencing information, steps, and

> > guidelines! Robin Bolig and I have been familiarizing ourselves with the

> > guidelines, steps and very helpful weblinks.  We downloaded the Ingham

> > County (Michigan) records into the Access template, and I feel that this

> > county is a comfortable starting place for us (it is our institution's

> > county).

> 

> Go for it!  Starting is half the battle.

> 

> > Before we begin, I would appreciate clarification on a couple of items.

> > Thank you for your time.

> 

> As always, I will provide my thoughts and John will weigh in when he's next

> online.

> 

> > 1)  Is it okay to use available "online" latitude and longitude

> > coordinates, as long as Datum information, etc. are available?

> 

> Yes.  Just make sure you specify the source of those coordinates in the

> designated field on your spreadsheet.

> 

> > For example, the Township, Range, Section Information website

> > (http://www.esg.montana.edu/gl/trs-data.html.) that is listed in the MaNis

> > Georeferencing Guidelines has links whereby one can search for a named

> > place, and the decimal degrees coordinates (to four decimal places) come up

> > for that place (example, City of Mason, Michigan).  Is it okay to use such

> > on-line coordinates for georeferencing place names, or should all

> > georeferencing should be done with "hard copy" references?

> 

> We encourage you to take advantage of all available tools, that's why we

> provided those URLs.  There may be others as well.  Just make sure your sources

> are credible.

> 

> > 2)  If the answer to the above question is that all georeferencing should

> > be done with "hard copy" references, then ignore this one.

> >

> > A related question to 1): from the same website mentioned above, one can

> > link to "TerraServer" and get (really interesting) aerial photos of places.

> >  With the aid of a labelled map, one can zoom in and find specific

> > buildings (such as the Michigan State University Swine Barn - a real Ingham

> > County example).  From a zoomed aerial image, you can click on "Image Info"

> > and get lat and long (non-decimal) coordinates for "tiles" (corners of

> > squares) surrounding the image.  Datum information is included in "Image

> > Info".

> >

> > So my question is, is it okay for us to use these types of on-line aerial

> > images for georeferencing?

> 

> I'm including this question just for completeness.  The answer is, of course,

> yes.  And remember, do not worry about the type of coordinate data you record.

> The error calculator will be able to convert data provided in any format (e.g.,

> deg, min, sec; dec. degrees; etc.) into any other format.  Knowing the datum,

> providing the source of your coordinates, and noting any assumptions you have

> made in assigning those coordinates are what's crucial.

> 

> > 3)  With regard to the "DeterminedDate" data field in the download file -

> > is there a specific format for the date data (i.e., MM/DD/YYYY or DD Month

> > Spelled Out YYYY) that you would like us to use?

> 

> No, because most spreadsheet programs will dictate a format.  It seemed

> worthless for us to specify one.  John will have to deal with that variety

> later.

> 

> Best,

> Barbara

> 
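Because no DeterminedDate format was prescribed, whoever merges the returned files has to cope with a mix of formats. A minimal sketch of that cleanup follows. It assumes only the two formats named in the question (MM/DD/YYYY and DD Month YYYY) plus an ISO-style date, and English month names; the function name and the format list are illustrative and are not part of any MaNIS tool.

```python
from datetime import datetime

# Candidate formats (an assumption; extend as new variants turn up).
FORMATS = ("%m/%d/%Y", "%d %B %Y", "%Y-%m-%d")

def normalize_determined_date(text):
    """Return a date for the first format that matches, or None if none do."""
    cleaned = " ".join(text.split())  # collapse stray whitespace
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue
    return None  # leave unparseable values for a person to inspect
```

Returning None rather than guessing keeps ambiguous entries visible instead of silently loading a wrong date.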

 

>>> Posting number 138, dated 23 Nov 2001 10:27:35

Date:         Fri, 23 Nov 2001 10:27:35 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Vieglias routine (fwd)

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

XXXX and all,

Don't confuse the lat_long determination with the error determination. You

can get the lat_long without the extents, but you need to use the extents

as one of the sources of uncertainty - which contributes to the maximum

error distance, but does not affect the lat_long itself.

 

The guidelines do allow the distance bearing computation to be made from

GNIS coordinates, and I agree, it would be a crime not to use those data.

I would very much like to provide the tool that can parse the localities

and calculate the lat_longs from any gazetteer. In February I'll likely be

collaborating with the Alexandria Digital Library Project to do just that.

I am currently awaiting the development of a protocol to communicate with

their Digital Gazetteer.

 

There are really two tools that would be nice. I've already mentioned the

first one, which would be based on Dave Vieglais' SPPFind tool, which I

have not yet tested. The second is the error calculator, which is

referenced in the MaNIS web pages, but is not yet functional. I've

finished the Error Calculator Tool except for the datum error

contributions and testing. I would like to suggest that charging ahead on

the lat_long determinations is fine, but leave off the error stuff until

the tool is ready for prime-time.  That error stuff is just too burdensome

to do by hand. Doing one pass for lat_longs and one for errors might

actually be more efficient, but we'll need evidence "from the trenches"

to figure out if this is true.

 

John W.

 

---------- Forwarded message ----------

Date: Sat, 17 Nov 2001 13:21:47 -0800

From:

To: tuco@socrates.Berkeley.EDU

Cc: bstein@oz.net

Subject: Vieglias routine

 

John W.  So much for theory.  On a more practical matter.  The rules indicate

that  "If the [SpecLoc] description includes an offset, use the furthest

extent of the named place in the direction of the offset."   So we should

NOT compute terminal lat/longs from the GNIS lat/longs and bearing?   I ask

because GNIS locs don't appear to take into account the furthest extent of

the named place.  Related, should we wait for the georeferencing tool

mentioned in the 10/18/01 email or just charge ahead?  I assume it was to

take GNIS locs and try to match them with occurrences in the MaNIS file

(from project description), then compute terminal lat/longs based on

distance and bearing.   Modifying the rules to allow the distance-bearing

computation based on GNIS lat/long would really increase georeferencing

rate, and as long as the technique was referenced, I don't see a problem.
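
The distance-and-bearing computation discussed in this exchange can be sketched as follows. This is a spherical-Earth approximation with an assumed mean radius of 6371 km; the function name and the compass table are illustrative and are not taken from the MaNIS tools or from GNIS. It yields only the lat_long. The extent of the named place still enters separately as a source of uncertainty.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; a sphere, not a datum ellipsoid

# Compass headings in degrees clockwise from north (illustrative subset).
BEARINGS = {"N": 0.0, "NE": 45.0, "E": 90.0, "SE": 135.0,
            "S": 180.0, "SW": 225.0, "W": 270.0, "NW": 315.0}

def offset_point(lat, lon, distance_km, bearing_deg):
    """Return (lat, lon) in decimal degrees reached from the start point."""
    phi1, lam1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_km / EARTH_RADIUS_KM  # angular distance in radians
    phi2 = math.asin(math.sin(phi1) * math.cos(delta)
                     + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    lam2 = lam1 + math.atan2(math.sin(theta) * math.sin(delta) * math.cos(phi1),
                             math.cos(delta) - math.sin(phi1) * math.sin(phi2))
    lon2 = (math.degrees(lam2) + 540.0) % 360.0 - 180.0  # wrap to [-180, 180)
    return math.degrees(phi2), lon2
```

A locality such as "10 km N of" a gazetteer point would call offset_point(lat, lon, 10.0, BEARINGS["N"]). Distances in miles need converting to kilometres first.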

 

 

 

 

>>> Posting number 139, dated 23 Nov 2001 10:29:58

Date:         Fri, 23 Nov 2001 10:29:58 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Vertical Datum?

In-Reply-To:  <3.0.32.20011120185230.00718380@pilot.msu.edu>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

XXXXX and all,

 

The Vertical Datum refers to the geometric model from which elevations are

determined. In our data we consider altitude (or elevation) as an

attribute of the locality, not as an attribute of the position. Or, to say

it another way, when we record positions digitally, we include latitude,

longitude, and horizontal datum, but we do not include elevation and

vertical datum.  In short, we treat elevation as a part of the locality,

so we do not need to consider the vertical datum since it has no bearing

on our georeferencing.

 

Note, unless I am mistaken there is no way to know the datum when using

Topozone. Someone please correct me if I'm wrong. This isn't really a big

problem as long as the error is calculated with an unknown datum.

 

John W.

 

On Tue, 20 Nov 2001, XXXXXXXXXX wrote:

 

> Dear Barbara,

> 

> Thanks for your reply to my earlier message.  I have another question for

> both you and John:

> 

> Do we need to note the "Vertical Datum" if one is provided on a map source?

>  One of the Michigan USGS maps that I looked at this week had the following:

> Horizontal Datum:  NAD1927

> Vertical Datum:  NGVD 1929

> 

> Also, it looks like we'll be using Topozone

> (www.topozone.com/findplace.asp) for georeferencing some of the Michigan

> localities (just point the cursor anywhere on the map and the coordinates

> of choice [UTM, lat-long decimal, or lat-long degrees and minutes] appear

> on the lower part of the screen).

> 

> Thanks,

> XXXXX

> 

> 

 

 

>>> Posting number 140, dated 26 Nov 2001 10:20:20

Date:         Mon, 26 Nov 2001 10:20:20 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Topozone - Datum

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Hi John,

 

According to the topozone website (address below), it appears that the

given coordinates are based on NAD27.  (This is listed next to the

coordinate buttons - UTM, DecLatLong, etc. - on the website).

 

Please let me know if you have other information about this.

 

Thanks,

XXXXX

 

 

 

>Note, unless I am mistaken there is no way to know the datum when using

>Topozone. Someone please correct me if I'm wrong. This isn't really a big

>problem as long as the error is calculated with an unknown datum.

> 

>John W.

> 

 

> Also, it looks like we'll be using Topozone

> (www.topozone.com/findplace.asp) for georeferencing some of the Michigan

> localities (just point the cursor anywhere on the map and the coordinates

> of choice [UTM, lat-long decimal, or lat-long degrees and minutes] appear

> on the lower part of the screen).

> 

 

 

 

>>> Posting number 141, dated 3 Dec 2001 05:59:15

Date:         Mon, 3 Dec 2001 05:59:15 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Loading Lat_Longs back into databases

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Dear All,

 

Last week I promised a message about the relationship between the

gazetteer and your databases - the bigger picture.

 

We've already talked about the static nature of the current MaNIS

gazetteer. As I've said, the gazetteer in its current form is a temporary

tool to aid in collaborative georeferencing. Once the network gets going

there will be a dynamic gazetteer as described in the NSF proposal.

Because our "snapshot" data are static and our databases are not, the

differences between the two will increase over time, especially for those

who are specifically editing locality-related data. I guess that when

people made this realization it caused some concern.

 

I designed the gazetteer with the issue of changing data in mind, and I've

done a few things to aid in data reconciliation when the lat_longs get

loaded back into your databases. For example, I've stored much more

information in the MaNIS gazetteer than is visible in the online

interface, including information that relates the localities (and

therefore the lat_longs)  back to the specimens themselves. The structure

of the gazetteer may be of interest, so I will post the Gazetteer

Entity-Relationship diagram as a document on the MaNIS website when I get

back to civilization.

 

Since I stored all of the original locality-related information along with

the references to the specimens, it will be possible (when the time comes

to load lat_long information into your databases) to compare the snapshot

locality data with the then-current locality data. For all of those

localities where there has been no change, the lat_long data can be loaded

without question. This first step should take care of most records for

most institutions. For the rest of the records, where the locality data no

longer exactly match the snapshot data, some analyses can be done to

determine if the differences can be considered "substantive," by which I

mean that they would affect the determination of the lat_long. For

example, a snapshot locality that is the same as the then-current locality

except that an elevation has been added can be considered as not

substantively changed and can therefore have its lat_long record loaded.

This step will be a little different for each institution. After doing

some bulk checking for differences such as in the foregoing example, I

envision making one visual pass over the remaining records, with the

original and the then-current localities side-by-side, putting a checkmark

in a column called "substantive" for those records that have had

substantive changes. When that pass has been made, all of the lat_longs

for records without a checkmark can be loaded. This third step should take

care of most of the remainder of the localities. What's left will be

locality-specimen relationships that have changed since the time when the

snapshot was taken. These records will have to be resolved by the

individual institutions.
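
The reconciliation passes described above might be sketched like this. The record layout, the field names and the whitespace/case normalization are assumptions made for illustration; the rule that an added elevation is not substantive follows the example in the message. The real gazetteer schema is not shown in this posting.

```python
# Fields assumed to influence the lat_long. Elevation is deliberately left out,
# following the example in the message. All field names are hypothetical.
SUBSTANTIVE_FIELDS = ("country", "state", "county", "specific_locality")

def normalize(value):
    """Collapse case and whitespace so trivial edits do not count as changes."""
    return " ".join((value or "").lower().split())

def classify(snapshot, current):
    """Return 'load', 'load_nonsubstantive', or 'review' for one locality."""
    if snapshot == current:
        return "load"  # first pass: nothing changed at all
    same = all(normalize(snapshot.get(f)) == normalize(current.get(f))
               for f in SUBSTANTIVE_FIELDS)
    if same:
        return "load_nonsubstantive"  # bulk check: e.g. only an elevation added
    return "review"  # visual pass: compare the two side by side
```

Records that come back as 'review' are the ones that would get the side-by-side inspection and, where warranted, the "substantive" checkmark.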

 

There are some tricks and techniques I haven't presented yet, but I hope

that what I've written above helps to clarify the bigger picture with

respect to georeferencing. Questions have proven useful thus far, so if

there's anything else about which you'd care to have me elaborate, please

ask.

 

In the spirit of looking forward, another thing to think about for the

future is the incorporation of the coordinates and metadata into your own

local databases.  Some institutions don't have attributes in their

databases to hold lat_long information. Similarly, not everyone (but there

are some!) has an attribute to accommodate maximum error distance. It would

be a shame to throw away all of this hard-earned and valuable data. At

this point I'm asking you to consider the ramifications of storing these

data so that there are no unpleasant surprises when the time comes to load

the data back into your databases.

 

John W.

 

>>> Posting number 142, dated 7 Dec 2001 07:24:45

Date:         Fri, 7 Dec 2001 07:24:45 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Guide Revisions

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Dear All,

 

While working through the development of the Georeferencing Calculator I

discovered minor numerical and typographical errors in the Georeferencing

Guidelines document. This message is just to alert you that I have made

revisions to that document. One particular change worth noting is in the

section on "Uncertainty associated with coordinate precision." It seemed

to me quite reasonable to assume that the coordinate precision should be

the same for both coordinates, and so I've rewritten that section to

reflect this assumption.

 

I've also added some calculation examples against which you might test

your understanding of the georeferencing concepts.

 

One detail of reading the datum error from a file eludes me at the

moment. It is the last remaining issue before the Georeferencing

Calculator becomes available.

 

John W.

 

>>> Posting number 143, dated 10 Dec 2001 13:58:27

Date:         Mon, 10 Dec 2001 13:58:27 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         

Subject:      Information from Topozone  -  NAD 27 Datum

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Dear All,

 

John asked that I follow up with staff at Topozone

(www.topozone.com/findplace.asp) with regard to datum information on their

website's scanned maps (see previous message exchanges copied below).

 

Here is what I found out:

 

1.  For USGS QUAD MAPS (1:24,000 or 1:25,000):  the vast majority of these

original scanned maps on the Topozone website are based on the NAD 27.  If

any underlying Quad map was originally based on another datum (such as NAD

83 for example), Topozone has REPROJECTED that map into NAD 27.

 

2.  Thus, the Topozone cursor coordinates as well as the underlying Quad

map (whether original or reprojected) are ALWAYS in NAD 27.

 

3.  It was confirmed that all original MICHIGAN QUAD maps that were scanned

for the Topozone website are NAD 27.

 

John, please let us know if it is okay for us to list NAD 27 as the datum

instead of "Datum Unknown" for locality coordinates taken from the Topozone

website.

 

Thanks,

XXXXXX

 

 

 

>>> Posting number 144, dated 14 Dec 2001 15:31:53

 

>>> Posting number 145, dated 16 Dec 2001 11:40:45

Date:         Sun, 16 Dec 2001 11:40:45 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Information from Topozone  -  NAD 27 Datum

In-Reply-To:  <3.0.32.20011210135816.00717590@pilot.msu.edu>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Thanks XXXXX, this is most excellent. We can use Topozone coordinates with

NAD27 recorded. They have no idea how big a favor they have done for

us. Everyone please list NAD27 with any coordinates derived from Topozone

and remember to record the Reference_Source as "Topozone 1:24000" or the

like.

 

 

 

>>> Posting number 146, dated 3 Jan 2002 10:14:21

Date:         Thu, 3 Jan 2002 10:14:21 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      number of decimals on decimal degrees

Mime-Version: 1.0

Content-Type: text/plain; format=flowed

 

MaNIS:  How many decimals are folks attaching to lat/long determinations?

I'm going with four on decimal degrees even though this is more than is

justified by the offset distances to the nearest mile or fractional mile.

As I understand it, John W's error calculator will attach the correct error

to lat/long determinations based on the offset direction(s), distance and

units.  Sorry if I missed this in previous discussions?

 

 

 

>>> Posting number 147, dated 7 Jan 2002 09:46:27

Date:         Mon, 7 Jan 2002 09:46:27 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: number of decimals on decimal degrees

In-Reply-To:  <F100rz71znUp8acXUgZ000178f6@hotmail.com>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Hi Folks,

   I'm back. Argentina started rioting when I left for Chile. I won't claim

that my leaving was the cause.

   Anyway, my recommendation is to store as many decimal places as your

source gives you and not to confuse those digits with accuracy or precision

- that's why we're using the explicit maximum error distance. I would

certainly caution that to use fewer digits is to introduce extra,

unwarranted errors. Refer to the table in the Georeferencing Guide at

http://elib.cs.berkeley.edu/manis/GeorefGuide.html to see the magnitude of

these errors. If you use 5 digits in a decimal degree coordinate, the error

will be on the same order of magnitude as that for most of today's accurate

GPS readings. The error calculator will also take into account the

precision of the recorded coordinates when calculating maximum error distances.
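
To get a feel for the magnitudes involved, the following sketch estimates the ground distance represented by one unit in the last recorded decimal place. It assumes a spherical Earth with about 111.32 km per degree of latitude and combines the latitude and longitude components as a diagonal. This is an approximation for illustration only; it is not the formula or the table from the Georeferencing Guide.

```python
import math

KM_PER_DEGREE = 111.32  # assumed length of one degree of latitude on a sphere

def precision_uncertainty_m(latitude, decimals):
    """Approximate metres spanned by one unit in the last decimal place."""
    step = 10.0 ** -decimals  # e.g. 0.0001 degrees for four decimals
    lat_m = step * KM_PER_DEGREE * 1000.0
    lon_m = lat_m * math.cos(math.radians(latitude))  # meridians converge
    return math.hypot(lat_m, lon_m)
```

Under these assumptions, two decimal places correspond to something on the order of a kilometre at mid-latitudes, while five decimal places correspond to a metre or two.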

 

>MaNIS:  How many decimals are folks attaching to lat/long determinations?

>I'm going with four on decimal degrees even though this is more than is

>justified by the offset distances to the nearest mile or fractional mile.

>As I understand it, John W's error calculator will attach the correct error

>to lat/long determinations based on the offset direction(s), distance and

>units.  Sorry if I missed this in previous discussions?

> 

 

 

>>> Posting number 148, dated 7 Jan 2002 12:37:08

 

>>> Posting number 149, dated 7 Jan 2002 12:57:05

 

>>> Posting number 150, dated 7 Jan 2002 12:45:12

Date:         Mon, 7 Jan 2002 12:45:12 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Should not found SpecLocs default to county?

In-Reply-To:  <v02130501b85f9bf6b38a@[207.207.103.162]>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX, and all,

 

>John W.:  So I'm wondering about the Oregon records.  There are about 400

>with DecLat/longs that were already assigned when downloaded, but they only

>have two decimals.  Was this a formatting or rounding decision?  I'll leave

>them as is as I assume if someone assigned lat/long it is more accurate than

>the SpecLoc.

 

Actually, it was a formatting error. The decimal lat/longs that appear in

the download have been truncated to 2 decimal places. This wasn't my

original intention.  The truncation occurred somewhere in transferring

between Access and the Informix database from which the downloaded data are

taken. I'll try to find out where it occurred and fix the problem, then I

will update the decimal latitude and longitude values in the online

gazetteer. This shouldn't affect those who've already downloaded data

for georeferencing since we agreed that the localities that already have

lat/longs will not be georeferenced (again). If anyone is checking and

changing records that have lat_longs already, let me know.

 

>Related, if we cannot find a SpecLoc, should we default to county or leave

>it ungeoreferenced pending investigation by the contributing institution?

>So far not found SpecLocs are running at about 10%  due to discrepancies in

>SpecLoc and county, apparent typos, or ambiguous text.

 

If you cannot find the SpecLoc, leave it ungeoreferenced and say why in the

field called "NoGeorefBecause." If you find the SpecLoc and it is

unambiguously placed in the wrong county, go ahead and georeference it and

make a note to that effect in the "LocalityAnnotation" field in the

downloaded data file. These notes will eventually get back to the source

institution.

 

 

>>> Posting number 151, dated 7 Jan 2002 14:52:05

Date:         Mon, 7 Jan 2002 14:52:05 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Oregon lat/longs.

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

 

XXXX, and all:

 

"...wondering about the Oregon records. There are about 400..."

 

The Oregon records that had lat/long for specimens in the KU collection

should be redone with the new system.  Those that were added here were done

a couple of years ago using a program that calculated them for us so they

will not be as accurate as the current system we are using.

 

 

 

>>> Posting number 152, dated 8 Jan 2002 20:57:38

 

>>> Posting number 153, dated 16 Jan 2002 15:03:38

Date:         Wed, 16 Jan 2002 15:03:38 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Error Calculator

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

At long last I'm ready to introduce the Georeferencing Error Calculator.

It's been some time in the making, and I apologize for the delay, but I

wanted to give you a product that wouldn't be a moving target due to

constant revision.  The application has been pretty well tested and I

believe you can use it with confidence in the results it

gives.  Nevertheless, if something doesn't seem quite right, try to figure

out why. Usually it means that the coordinate precision is set too low (the

coordinate precision always reverts to "nearest degree" if you change the

coordinate system). If you exhaust all possibilities of making sense of the

maximum error value that the program gives you (this includes reading the

manual and the georeferencing guidelines), then feel free to send me a

message asking what's going on. If you do, please be explicit about what

you are doing and what all of the parameters are for the calculation that

puzzles you.

 

The Georeferencing Guidelines and the Georeferencing Steps documents have

been modified to include references to the Error Calculator, and the Error

Calculator Manual has been added to the list on the Documents page on the

MaNIS website at the following URL:

 

http://dlp.cs.berkeley.edu/manis/Documents.html

 

Please read the manual so you know what to expect when loading the

Calculator into your browser.  In particular, you should be aware of the

browser constraints and the size of the java applet. It can be quite slow

to load the first time if your connection is slow.

 

Two points about making calculations are also worth emphasizing in advance.

I've already mentioned the first, which is that the coordinate precision

will revert to "nearest degree" if you change the coordinate system. If you

get an error that you think is excessive, the coordinate precision is

likely to be the culprit. Another possible culprit is having the datum set

to "not recorded" if you actually know what datum the coordinates were

taken in. The second important point is that all distance measurements in a

given calculation must be in the same units. For example, don't mix an

offset of 10 miles with an extent of named place of 3 kilometers. Both

measures need to be in one system or the other. The error distance will be

given in the same units as the measurements and all will be governed by

your choice in the Distance Units drop-down list.
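
The same-units rule can be enforced before any numbers are entered into the Calculator. A small sketch follows, assuming a hypothetical helper that converts every distance to the unit chosen in the Distance Units list. The conversion factors are the standard definitions, but the function and its names are not part of the Calculator.

```python
# Metres per unit (standard definitions).
METRES_PER_UNIT = {"km": 1000.0, "mi": 1609.344, "m": 1.0, "ft": 0.3048}

def to_units(value, from_unit, to_unit):
    """Convert a distance between any two units listed in METRES_PER_UNIT."""
    return value * METRES_PER_UNIT[from_unit] / METRES_PER_UNIT[to_unit]
```

For the example in the message, to_units(10, "mi", "km") turns the 10-mile offset into about 16.09 km, which can then be used alongside the 3 km extent.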

 

Enjoy!

 

>>> Posting number 154, dated 16 Jan 2002 15:28:45

Date:         Wed, 16 Jan 2002 15:28:45 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: CNMA: mammal collection at UNAM

In-Reply-To:  <5.1.0.14.1.20020107123724.00a00090@ibunam.ibiologia.unam.mx>

Mime-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"; format=flowed

Content-Transfer-Encoding: quoted-printable

 

Dear All,

 

I have changed all references to UNAM in the MaNIS documents and database

to be CNMA based on the following request from Fernando Cervantes. The

acronym was not changed in the Project Description document, which is a

copy of the document sent as part of the NSF grant application. Those of

you who downloaded localities previous to 16 January 2002 will still have

UNAM as a CollectionCode in your downloaded data. This will not present a

problem when you return the georeferenced data to me.

 

John W

 

>Dear John

> 

>    To better describe who and where we are at I would like to ask you for

> the following:

> 

>1. In the list of institutions participating in MaNIS and the contacts

>(web site), please include the name, position, and e-mail account of my

>assistants:

> 

>Yolanda Hortelano, yolahm@ibiologia.unam.mx

>Julieta Vargas, jvargas@ibiologia.unam.mx

> 

>2. In addition, please change the acronym of our collection.  Our mammal

>collection is known and registered as CNMA (after Colección Nacional de

>Mamíferos) and is hosted by Instituto de Biología, that belongs to

>Universidad Nacional Autónoma de México (UNAM).

> 

>Thank you for your help,

> 

>Fernando

>------------------------------------------------

>Fernando A. Cervantes

>Zoologia. Instituto de Biologia, UNAM

>Apartado Postal 70-153, Coyoacan

>Mexico, D. F. 04510

>Mexico

> 

>tel.: (525) 622 9143; fax: (525) 550 0164

>e-mail: fac@ibiologia.unam.mx

>sitio web: www.ibiologia.unam.mx/cnma

>------------------------------------------------

 

>>> Posting number 155, dated 17 Jan 2002 09:38:42

Date:         Thu, 17 Jan 2002 09:38:42 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: georeferencing

In-Reply-To:  <5.1.0.14.1.20020107123124.00a00ec0@ibunam.ibiologia.unam.mx>

Mime-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"; format=flowed

Content-Transfer-Encoding: quoted-printable

 

XXXXXXX,

Now that the Error Calculator is done and on the web I was able to check

the data you sent me in December.  I had no problems importing those data

into my system.  When I do this step I check for inconsistencies in the

data and fix them if I can.  The Determination References you provided are

excellent. I wish we could figure out which datum those sources use.

 

I'm curious why you chose to record degrees minutes seconds instead of

decimal degrees for the localities your team georeferenced. I'm only asking

to point out that it would have been easier to just copy and paste the two

decimal degree values. This would have been a little faster and it would

have left less room for error. Even so, there was only one coordinate error

I could find in your data. There was a 10 for decimal seconds where there

should have been a 0.

 

There are some limitations of the Alexandria Digital Library Data of which

everyone should be aware. As long as you recognize these limitations, the

ADL gazetteer is extremely useful. I'm including, below, a message to and a

response from Linda Hill about these limitations.

 

I noticed that none of the localities in the records you georeferenced have

maximum error distances. I hope you will provide these data in the future,

especially now that I've released the Error Calculator, which is supposed

to make the calculations much easier. When you do make error calculations,

be sure to use a coordinate precision of "nearest minute" for Alexandria

Digital Library data that come from NIMA. If you look at the values that

come up there is always either a 0 or a 59 in the seconds field for non-USA

named places. There is something wrong with a coordinate translation

algorithm somewhere that produces this problem. I recommend using the

decimal degree coordinates since they err less than the degrees minutes

seconds.
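
The 0-or-59 pattern is what one would expect if minute-resolution coordinates were stored as single-precision decimal degrees and then converted back to degrees minutes seconds by truncating rather than rounding. The single-precision storage and the truncation are guesses, not something confirmed in this thread, but the following sketch reproduces the values reported for the Huauchinango example under those assumptions. The function names are illustrative.

```python
import struct

def as_float32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def dms_truncated(dd):
    """Degrees, minutes, seconds with each part truncated (lossy)."""
    v = abs(dd)
    degrees = int(v)
    minutes_full = (v - degrees) * 60.0
    minutes = int(minutes_full)
    seconds = int((minutes_full - minutes) * 60.0)
    return degrees, minutes, seconds

def dms_rounded(dd):
    """Round to whole seconds first, then split, so 59.99 seconds carries over."""
    total_seconds = round(abs(dd) * 3600.0)
    degrees, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return degrees, minutes, seconds

lat = as_float32(20 + 11 / 60)    # source value: 20 deg 11' 00" N
lon = as_float32(-(98 + 3 / 60))  # source value: 98 deg 03' 00" W
```

Rounding to whole seconds before splitting into degrees, minutes and seconds is one way to keep the two representations in sync. It does not restore resolution the source never had, which is why "nearest minute" remains the right precision setting for these data.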

 

I especially appreciate the Locality Annotations your team provided and I

hope the other recipients of your georeferenced data do as well.

 

 

 

>John: Here's the situation. The data in our gazetteer for the example you

>used is NIMA. The original NIMA coordinates are:

> 

>NIMA: 20° 11' 00" N 098° 03' 00" W

> 

>NIMA points are all limited to 1 minute resolution, I believe, although they

>don't document this anyway that I have seen.

> 

>We have two clients and they show the coordinates as:

> 

>CDL-Middleware client to ADL Gazetteer: Longitude W 98° 03' Latitude N 20° 11'

> 

>AOL client to ADL Gazetteer: Longitude: -98.050003 (98°3'0"W) Latitude:

>20.183332 (20°10'59"N)

> 

>The problem with the AOL client is that the original ddmmss values were

>converted to decimal degrees and then the ddmmss values that are shown in

>the interface are calculated from them, giving the impression that there is

>more resolution in the location than is warranted. As you point out, in your

>example there is obviously a problem with the '3' as the last digit in the

>longitude value. We are aware of these problems but have not gone back and

>fixed them. We have limited staff to work on the gazetteer and have put more

>work into other developments. What we intend to do is to phase out the AOL

>client and replace it with a client based on our middleware software (like

>the CDL client). We will be storing decimal degrees in our database but need

>to be smarter about the specificity.

> 

>Neither the USGS nor NIMA clearly reference the geodetic basis of their

>coordinates. We are assuming that they are using WGS-84. In our revised

>Gazetteer Content Standard there is an element to declare the geodetic basis

>for the coordinates. We are setting the default value as WGS-84 but other

>bases can be entered. With our current gazetteer, I think you will not go far

>wrong with assuming WGS-84. Also, we have elements for making a statement

>about the 'accuracy' of the coordinates. In the future as we build up better

>data, these statements could give assistance in making the estimates that

>you need.

> 

>I had a look at your 'estimator' for maximum geospatial error in specifying

>locations. It looks very useful. I passed the URL on to our gazetteer team

>here so that they can see what you are doing.

> 

>We are still working on getting our gazetteer protocol server working

>properly. We solved a major parsing problem today. There is still more to do

>but you might start thinking about how you might embed gazetteer lookup in

>your script using our gazetteer service protocol.

> 

>I appreciate your feedback and apologize for the limitations of our

>gazetteer. We continue to work on it and welcome collaboration to 'make it

>right'.

> 

>- Linda

> 

> 

>John Wieczorek wrote:

> 

> > Hi again,

> > I have people engaged in georeferencing for the MaNIS Project now. My first

> > set of georeferenced data have just been returned and the ADL gazetteer was

> > among the Reference Sources used to get coordinates for the data. My

> > questions are about the coordinates themselves. I'll use a specific example

> > to better illustrate the questions.

> >

> > The locality in question is Huauchinango, Puebla, Mexico. The gazetteer shows

> > coordinates in two units, decimal degrees and degrees minutes seconds.

> > Specifically, for this example, the decimal degrees are 20.183332,

> > -98.050003. The degrees minutes seconds are 20°10'59"N, 98°3'0"W. These two

> > aren't the same when you get out to that sixth decimal place in longitude,

> > and they differ even more in latitude. I'm wondering whether there is a way

> > to know which is the original coordinate system (i.e., the one without the

> > error introduced by translation). Both coordinates actually have tell-tale

> > signs of tampering. That 3 out at the end of the decimal longitude looks

> > like a floating point error. The fact that so many of the named places from

> > this region have only 0 or 59 in the seconds fields is also highly suspect.

> > So, I wonder at what step the translation(s) was(were) made - whether it

> > comes from the original data source (in this case NIMA) or whether it is

> > post-processing done on your end. If it is the former, I suppose we're

> > stuck with it, but if it's the latter I wonder if a better algorithm could

> > be used to keep the coordinates in sync. I can offer one, if that helps.

> >

> > Finally, I've probably asked this before, but is it possible to get the

> > datum information along with the coordinates. I suspect that information is

> > missing as metadata from the original data sources, but if it isn't

> > missing, is there any possibility that it could be among the data you

> > provide in the ADL gazetteer interface? It makes a great deal of difference

> > sometimes in determining the maximum error distance for the coordinates

> > assigned to a locality, and this will, in turn, affect analyses further on

> > down the road.

> >

> > Thanks bunches,

> > John W
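
The mismatch discussed in this exchange (20.183332 shown as 20°10'59"N) can be reproduced with a small sketch. It is not the gazetteer's actual code and not the algorithm John offered; the function names are hypothetical, and the only assumption is that the displayed seconds were obtained by truncating rather than rounding. Rounding to the nearest whole second, with carry into minutes and degrees, keeps the two representations in sync for this example.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus hemisphere letter to signed decimal degrees."""
    sign = -1 if hemisphere in ("S", "W") else 1
    return sign * (degrees + minutes / 60 + seconds / 3600)


def decimal_to_dms(value, is_latitude, truncate=False):
    """Convert signed decimal degrees to (degrees, minutes, seconds, hemisphere).

    With truncate=True the fractional second is dropped, which is the
    (assumed) behaviour that yields 59" instead of 60" -> carry.
    """
    if is_latitude:
        hemisphere = "N" if value >= 0 else "S"
    else:
        hemisphere = "E" if value >= 0 else "W"
    total = abs(value) * 3600
    # Work in whole seconds so that rounding carries cleanly into minutes/degrees.
    total_seconds = int(total) if truncate else round(total)
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return degrees, minutes, seconds, hemisphere


# Latitude from the example: truncation reproduces the suspicious 59 seconds.
truncated = decimal_to_dms(20.183332, is_latitude=True, truncate=True)
rounded = decimal_to_dms(20.183332, is_latitude=True)
longitude = decimal_to_dms(-98.050003, is_latitude=False)
```

Here `truncated` is (20, 10, 59, "N") while `rounded` is (20, 11, 0, "N"), and 20°11'0" converts back to 20.183333..., which agrees with the stored 20.183332 to within the last digit.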

 

>>> Posting number 156, dated 20 Jan 2002 10:39:23

 

>>> Posting number 157, dated 31 Jan 2002 15:13:44

 

>>> Posting number 158, dated 31 Jan 2002 16:18:34

Date:         Thu, 31 Jan 2002 16:18:34 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Update

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

I write for two purposes. The first is that I'm curious to know how many of

you have actually begun to georeference. So far, I know that CNMA and the

MVZ have begun. The reason I ask is that I would like to begin a discussion

on the list of techniques to make the task go faster. I don't really want

to do that until most everyone is actually getting their hands dirty. In

this way everyone will be able to benefit from the discussion. So, please

let me know either that you have already begun georeferencing, or when you

anticipate beginning.

 

My second purpose is to let you know that, due to my ignorance of the

details of two of your esteemed collections databases, I made some faulty

assumptions when I first processed the data for the online gazetteer.  As a

result, I need to reload data for UWBM and for ROM.  I have already

reprocessed the UWBM data and I'll try to load it into the gazetteer as

soon as possible (hopefully by Monday). The situation with ROM is more

complex and I anticipate making an update to the ROM data in about one

month. There are a few implications of this unfortunate necessity.

 

1) If you have not yet downloaded localities for georeferencing, wait to

make your downloads at least until I announce that the update for UWBM has

been done. Don't wait for the ROM update to be done unless for some reason

you weren't going to begin georeferencing for another month anyway.

 

2) If you have downloaded localities, but have not yet begun georeferencing

them, throw away the downloaded file(s) you have and download them again

after I announce that the UWBM update is complete. Again, don't wait for

the ROM update to be done unless you weren't going to begin georeferencing

for another month anyway.

 

3) If you downloaded and began georeferencing files that include UWBM

and/or ROM records, please discard those records (only) from your record

set, even if you happen to have already georeferenced some of them. My

suspicion is that not much actual georeferencing has commenced to date

(though I'd love to hear otherwise), so this is unlikely to be a big

problem. After discarding the UWBM and ROM records, please do another

download with the same criteria you used last time, but this time please

select UWBM in the Institution drop-down box. This will give you only the

UWBM records from your geographic area of interest. After they download

successfully, append these UWBM records to the records you've already begun

georeferencing and proceed as if nothing had happened.

 

When the ROM records are ready I'll make another announcement to the list

about downloading only ROM records to append to your working files. The

process will be exactly the same as in scenario 3, above. In the meantime,

ROM records will still be in the gazetteer, but please do not spend time

georeferencing them. Throw them out now, or when I make the announcement, as

you prefer.

 

Thanks, and my sincere apologies for the inconvenience. I promise to try to

not make assumptions about other people's data anymore. I should know

better by now.

 

John W

 

>>> Posting number 159, dated 1 Feb 2002 17:29:10

 

>>> Posting number 160, dated 1 Feb 2002 17:33:01

 

>>> Posting number 161, dated 1 Feb 2002 18:24:45

 

>>> Posting number 162, dated 1 Feb 2002 15:42:04

 

>>> Posting number 163, dated 1 Feb 2002 19:27:30

Date:         Fri, 1 Feb 2002 19:27:30 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Gazetteer update

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

The promised gazetteer update is complete. Download with abandon!

 

John W

 

>>> Posting number 164, dated 4 Feb 2002 17:36:54

Date:         Mon, 4 Feb 2002 17:36:54 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Fwd: Georeferencing by MSU

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

Please read, there are some excellent questions raised here.

 

>Date: Mon, 04 Feb 2002 17:58:51 -0500

>To: tuco@socrates.Berkeley.EDU

>From:

>Subject: Georeferencing by MSU

> 

>Hi John,

> 

>XXXXXXX and I want to give you an update on georeferencing and relay

>some concerns/questions.

> 

>In late November, we downloaded records for several Michigan counties and

>have since practiced on the different types of localities.  Using

>Topozone, we worked individually on Eaton and Barry Counties and then

>compared and discussed our approaches and results.  Prior to the

>introduction of the error calculator, we reported our results as UTM

>coordinates in the Access file template provided.

> 

>With the availability of the error calculator (thank you very much!) and

>recent revised guidelines, we began recording original coordinates as

>decimal degrees for Barry County.  Our plan is to send you each of our

>Barry County files in the next few days.  We would appreciate your

>comments on our results and techniques before we proceed with the "real

>thing".

> 

>We have some questions and comments:

> 

>1. Evolving Guidelines - We would appreciate an announcement whenever

>there is an update to the guidelines, specifying which sections are

>altered, to ensure that we are always working with the most recent

>information.  Thanks again for all of your hard work with this!!

 

Point well taken. I've tried to be good about announcing the updates, but I

haven't always completely described what the changes were.

 

>2. Guidelines Questions - In the calculation example of Distance Along

>Orthogonal Directions, the Direction Precision is given as 45 degrees.  It

>seemed earlier in the document that "directional imprecision can be

>ignored" in such an example.  Are we misunderstanding something?

 

I included one too many lines in my copy and paste. The Direction Precision

should not figure into that calculation. I will remove the extraneous line

from the Georeferencing Guidelines.

 

>In the calculation example of Named Place Only/Bakersfield, the

>coordinates are 35 degrees, 22', 24"N and 119 degrees, 1' 4" W.  We

>understand from the example that these are the GNIS coordinates for

>Bakersfield.  In other  examples (e.g. Distance Along Orthogonal

>Directions and Distance at a Heading) the latitude and longitude

>coordinates are the same as for the Named Place/Bakersfield example. Since

>the actual localities are different (from Bakersfield), shouldn't the

>coordinates be different as well?

 

Absolutely. You win a prize for catching those mistakes. The "Distance

Along a Path" example was similarly problematic. I have changed the wording

as well as the values for Latitude, Longitude, Decimal Latitude, and

Decimal Longitude for these examples to reflect that the coordinates of the

locality are different from the coordinates of the named place mentioned in

the locality description.

 

>3.  Coordinates for the Center of a Township  -  If a locality is a

>township name only, is it preferable to use the coordinates for the

>township that are automatically provided by Topozone (via the place name

>search), or use the coordinates for the intersection of Sections 15, 16,

>21, and 22 (assuming the township consists of the "standard" 36 one-mile

>square sections)?

 

I was unaware that one could (and unable to figure out how to) find a

township, in the TRS sense, from the place name search on Topozone. I did

notice that you can find named townships (Michigan is full of them), but I

don't believe their coordinates correspond with the TRS sections they

occupy. Nevertheless, the coordinates we're looking for are those of the

intersection of center sections, as Laura mentioned above.
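
As a quick check on why sections 15, 16, 21, and 22 are the ones that meet at the center, here is a minimal sketch. It assumes the "standard" 36-section township mentioned in the question, numbered in the usual serpentine order starting from the northeast corner; the function name and the row/column indexing are my own for illustration.

```python
def section_number(row, col):
    """Section number for a standard 6 x 6 township.

    row 0 is the northernmost row, col 0 the westernmost column.
    Numbering starts at 1 in the northeast corner and snakes back and
    forth: rows 0, 2, 4 run east to west, rows 1, 3, 5 run west to east.
    """
    if row % 2 == 0:
        return row * 6 + (6 - col)
    return row * 6 + col + 1


# The township center is the shared corner of the four middle cells:
# rows 2 and 3, columns 2 and 3.
center_sections = sorted(section_number(r, c) for r in (2, 3) for c in (2, 3))
```

`center_sections` comes out as [15, 16, 21, 22], so the common corner of those four sections is the point whose coordinates are wanted. Irregular townships would need to be checked on the map instead.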

 

>4.  Extent of an intersection - One of the localities that we recently

>georeferenced in Barry County was the intersection of two roads.  We used

>the coordinates from Topozone and estimated the extent of the intersection

>to be 50 meters.  Is this a reasonable estimate to use in general for this

>type of locality?  (The locality was considered as a named place for

>calculation of error).

 

That seems like a generous extent unless the roads are 12-lane highways or

something.  I would opt for something more like 10 meters for your everyday

two-lane roads.  Certainly, feel free to override my opinion if the

circumstances warrant it.

 

>5. Extent of a named place that lacks bounding boxes  - We have

>encountered named places that lack bounding boxes on both the Topozone

>image as well as a Michigan County Gazetteer book.  We have estimated

>extents of such places based on the clusters of buildings that appear as

>black squares on Topozone in 1:25,000 scale.  Is this type of estimate okay?

 

That's what I'd do, and that's what my georeferencers have been doing from

the outset.

 

>6.  Cursor Accuracy - Robin and I have different model computers that

>utilize different web browsers (I have Netscape; Robin has

>Explorer).  When Robin connects to Topozone, her computer cursor

>automatically changes to a crosshair.  I manually changed my computer

>cursor from the "standard" arrow to a crosshair.  I believe this has made

>a difference in attempting to pinpoint localities on the Topozone map.

 

Good idea. It hadn't occurred to me because we're all using Netscape, and

we're only using Topozone occasionally.  Just as a point of information,

for California we most often use Terrain Navigator from MapTech

(http://maptech.com/) to do our georeferencing.

 

>Thanks for all of your help!!

 

Thanks for your excellent questions and comments.

 

>XXXXXXXXX

 

 

>>> Posting number 165, dated 6 Feb 2002 11:48:03

Date:         Wed, 6 Feb 2002 11:48:03 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Should we save extents?

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

MaNISers:

Should we save extents?  In georeferencing, one variable that will not be

saved is the extent used to compute the error.  The extent cannot be

inferred from the locality descriptions, unlike coordinate and offset

imprecision.  In addition, an extent for a populated place will vary

depending on the scale, map, and year.  For many records it is the largest

component of the error.  To give folks an idea of how I computed the error,

I am annotating each record with the extent I used.   One could go

overboard and reference the extent, but I am assuming the same system used

to get lat/long (GNIS).   Would it be too much trouble to save extents in

the annotation field?

 

For TRS lat/longs, I am using the extents in the Guidelines update.  For

lookup on the MontanaTRS site I am assuming unknown datum and no error due

to scale as done in the Georef Guidelines examples for placename only.

Correct?

 

 

 

 

>>> Posting number 166, dated 7 Feb 2002 09:55:12

Date:         Thu, 7 Feb 2002 09:55:12 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Datum error significance

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

I figured it was worth answering this question on the list in case others

were wondering the same thing. The commonly used datums in the US are the

North American Datum 1927 (NAD 27), the North American Datum 1983 (NAD 83), and

the World Geodetic System 1984 (WGS 84). The difference between NAD 83 and

WGS 84 is quite small compared to the difference between NAD 27 and NAD 83.

All of the USGS maps are in one or the other of NAD 27 or NAD 83. I haven't

done an exhaustive search, but it looks like most US Forest Service and

Bureau of Land Management maps use NAD 27.

Anyway, the 79 m used in the Bakersfield example is the actual distance

between two points having the exact same latitude and longitude, but with

one of the points based on NAD 27 and the other based on WGS 84.  The Error

Calculator uses a pre-calculated matrix of the greatest difference between

these two datums in every 0.2 by 0.2 degree cell in the region between

84.69 degrees North, 179.48 degrees West and 13.69 degrees North, 51.48

degrees West. Outside of this region the calculator uses the assumption of

1km error due to an unknown datum as documented in the Georeferencing

Guidelines.

When entering coordinates in the calculator it is important to enter the

correct hemisphere. Perhaps that goes without saying, but it is pretty easy

to enter decimal longitude erroneously (without the negative sign in front)

for localities in the western hemisphere. Doing so could seriously affect

the error contribution from an unknown datum.

 

John W.

 

 

>Date: Wed, 6 Feb 2002 11:45:19 -0800

>To: tuco@socrates.Berkeley.EDU

>From:

> 

>John:  Unknown datum question.  Fig 1 in the guidelines has the ranges of

>error for unknown datum.  For Bakersfield the range is 76-100 m error.

>Oregon, which I am georeferencing, is in the same 76-100 m band, so a

>midpoint would be 88 m.  Does 79 m used in the Georeferencing Guidelines

>examples for Bakersfield have some significance?  I realize this doesn't

>matter when using the web calculator, but just wondering because it makes a

>difference of several m when using Excel calculator.
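
The lookup John describes in this posting can be sketched roughly as follows. The region bounds, the 0.2 degree cell size, the 1 km fallback, and the 79 m Bakersfield value come from the message; everything else (the function name, the row/column indexing, storing the matrix as a dictionary, and falling back to 1 km for a missing cell) is an assumption made for illustration and is not the Error Calculator's actual code. The real matrix values are not reproduced here; a single cell is filled in for the example.

```python
# Region covered by the pre-calculated NAD 27 / WGS 84 difference matrix.
NORTH, SOUTH = 84.69, 13.69
WEST, EAST = -179.48, -51.48
CELL_DEGREES = 0.2
DEFAULT_ERROR_M = 1000.0  # assumption of 1 km outside the region


def datum_error_m(lat, lon, grid):
    """Return the unknown-datum error in meters for signed decimal coordinates.

    `grid` is a hypothetical dict mapping (row, col) cell indices to the
    greatest datum difference within that cell, in meters.
    """
    if not (SOUTH <= lat <= NORTH and WEST <= lon <= EAST):
        return DEFAULT_ERROR_M
    row = int((NORTH - lat) / CELL_DEGREES)   # rows counted southward
    col = int((lon - WEST) / CELL_DEGREES)    # columns counted eastward
    return grid.get((row, col), DEFAULT_ERROR_M)


# One illustrative cell: Bakersfield (35 22' 24" N, 119 1' 4" W) -> 79 m.
grid = {(246, 302): 79.0}

correct_sign = datum_error_m(35.373333, -119.017778, grid)
missing_sign = datum_error_m(35.373333, 119.017778, grid)
```

With the negative longitude the sketch returns 79.0; with the sign dropped the point falls outside the region and the 1 km default applies, which is the hemisphere pitfall mentioned above.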

> 

 

 

>>> Posting number 167, dated 13 Feb 2002 11:35:40

Date:         Wed, 13 Feb 2002 11:35:40 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      MSU PRACTICE RECORDS

In-Reply-To:  <3.0.32.20020213132000.00687df0@pilot.msu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

Below are extracts from an exchange between me, XXXXXXXXXXX, and

XXXXXX stemming from a request to review a set of records that each of

them had georeferenced independently. Several points of interest to the

readers of this list were raised, including a continuing discussion of the

issue of extents raised by XXXXXXX on 6 Feb 2002.

 

I'd like to report that this exercise turned out to be a wonderful field

test of the georeferencing guidelines. The coordinates and errors were

remarkably similar, with the largest deviations corresponding to the most

vague locality descriptions. Go team!

 

John W

 

> >Topozone actually has

> >only two underlying scales, 1:25,000, and 1:100,000. The 1:50,000 and

> >1:200,000 versions are just zoomed out by a factor of two from their

> >counterparts. Also, the 1:25,000 Topozone scales actually consist of (were

> >derived from) a combination of 1:25,000 and 1:24,000 maps, which were

> >"resized." It doesn't make all that much difference in the error

> >determination if you use 1:25,000 vs. 1:24,000. However, I hope you are

> >using the 1:25,000 map scale contribution in the error calculator for the

> >1:50,000 Topozone maps. Similarly, use the 1:100,000 map scale for the

> >1:200,000 Topozone maps.

> 

>Good to know all of the above.  Actually, we used "gazetteer" from the

>dropdown on the error calculator for all of the Topozone practice records.

>We were following the example from the georeferencing guidelines where the

>coordinate source (Topozone) was considered to be a gazetteer, and thus

>selected "gazetteer" on the error calculator.  It sounds like we need to

>redo the MAX ERROR with the map scale incorporated.

 

Actually, there is a subtle distinction to make. In the Georeferencing

Guidelines document I said that the source for that "Distance Only" example

was a gazetteer, because the coordinates were for a named place and

Topozone uses the GNIS data to plot named places; thus, the ultimate source

of the coordinates for that example is the GNIS database, which is a

gazetteer. If you had used Topozone to measure on a map, then the map

itself is the source of the coordinates and should be so reflected in the

error calculations by selecting an appropriate map scale.

 

> >I'm very happy to see the extent information in there. I am ruminating over

> >the inclusion of a field in the download data for the extent. I'm

> >interested in your opinion on the subject. It seems like it would actually

> >be easier than writing it out in the remarks, especially if you can copy

> >and paste it among several records. However, I think we'd do well to add a

> >NamedPlace field as well so we know to what the extent refers.

> 

>XXXX and I have been meaning to reply to XXXX's message about extent.  My

>opinion is that extent should be included somewhere (and in the remarks

>field is fine with me) as a record of what was done in the georeferencing

>process.

 

I think the general sentiment is that the complete determination

(coordinates AND error) would be fully documented if we go ahead and add

the value of the extent to the data we capture. By having a base set of

rules along with recording extent, we will know the magnitude of

every contribution to the determination. Without recording extent, we are

left to wonder how the georeferencer arrived at his/her result. Would it be

onerous to include the extent in its own field? I think it will be easier

than adding it to the remarks, both for the georeferencer and for the

compiler of named place extents (me). Part of the reason I ask this is that

I'm thinking even bigger than MaNIS to the ubiquitous problem of

georeferencing, which could benefit by having a database of extents. The

GNIS data allows for features to be described by bounding boxes, which can

be interpreted to find extents. However, for most features the bounding box

reduces to a single point. This is true of all but the largest populated

place features in the GNIS database.  Given the paucity of extent data

available, and given that we (MaNIS georeferencers) will have to determine

extents for every named place we run across, we could assemble these data

and use them to provide added value to existing gazetteers. Furthermore,

these additional data could be used in the future to automate the process

of georeferencing and error calculation. If this is, indeed, a worthy goal,

then it makes sense to capture the information in its own field so that it

need not be parsed from remarks in the future.
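
As one possible reading of "bounding boxes, which can be interpreted to find extents," here is a minimal sketch. It assumes the extent is taken as the distance from the center of the box to its farthest corner, uses a spherical Earth of radius 6,371,000 m, and ignores boxes that cross the 180 degree meridian. The function names are hypothetical; this is not how GNIS or the Error Calculator defines anything.

```python
import math

EARTH_RADIUS_M = 6371000.0  # spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def extent_from_bbox(south, west, north, east):
    """Distance in meters from the box center to its farthest corner."""
    center_lat = (south + north) / 2
    center_lon = (west + east) / 2
    corners = [(south, west), (south, east), (north, west), (north, east)]
    return max(haversine_m(center_lat, center_lon, lat, lon) for lat, lon in corners)


# A bounding box that has collapsed to a single point yields no extent at all.
point_extent = extent_from_bbox(35.0, -119.0, 35.0, -119.0)
```

`point_extent` is 0.0, which illustrates the problem: for most GNIS features the box gives no usable extent, so a value recorded by the georeferencer in its own field is the only source.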

 

Comments are hereby solicited.

 

> >Overall, the agreement in the coordinates and the errors is astonishing.

> >The mean deviation in coordinates across the whole dataset is only about

> >300 meters and most of this is due to the two vague localities ("Barry

> >State Game Area" and "Yankee Springs Area"). For the most part the errors

> >take care of the differences. You have bolstered my faith in the system.

> 

>Yes - these were large areas that were actually adjacent to one another. I

>found them to be somewhat difficult to georeference.

> 

> >The one locality for which I cannot understand the discrepancy is "Clear

> >Lake Camp, 6 mi. E Delton." You might want to revisit that one to see where

> >the problem occurred.

> 

>I know what happened here - operator difference (or assumption error?) pure

>and simple.  I believe that Robin treated this as an offset, and I

>completely ignored the offset and focused on a "church camp" on the map

>that was on the shore of Clear Lake (the lake was about 6.5 miles east of

>Delton).  Thus, I treated this as a named place (and perhaps my assumption

>was an unwarranted big stretch) and Robin treated it as an offset.  I

>believe that Robin's choice was the better of the two.

> >

> >Nice.

> 

>Thanks again!

> 

>XXXXX

> 

> 

> >

> >John W

> >

> >>Attached are two files containing identical Barry County localities that we

> >>have georeferenced individually as practice with the MaNIS guidelines.  We

> >>would sincerely appreciate your critique of our work before we submit files

> >>for inclusion in the project.

> >>

> >>Thanks for all of your help.

> >>

> >>Sincerely,

> >>

> >>XXXXXX

 

 

>>> Posting number 168, dated 13 Feb 2002 11:55:01

Date:         Wed, 13 Feb 2002 11:55:01 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Should we save extents?

In-Reply-To:  <v0213050ab886050432cf@[207.207.103.162]>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

>For TRS lat/longs, I am using the extents in the Guidelines update.  For

>lookup on the MontanaTRS site I am assuming unknown datum and no error due

>to scale as done in the Georef Guidelines examples for placename only.

>Correct?

 

Correct.

 

>>> Posting number 169, dated 13 Feb 2002 12:21:25

Date:         Wed, 13 Feb 2002 12:21:25 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: MSU Practice  - More Comments

Comments:

In-Reply-To:  <3.0.32.20020213145319.00720da8@pilot.msu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

More relevant exchanges.

 

> >>We do not and will not be using Excel for georeferencing.  We just used it

> >>this one time to send you the sample records via e-mail.  I am hoping that

> >>the data were not altered.

> >

> >Will you use Access then?

> 

>Yes - we are extremely happy with your Access template!  (Why would anyone

>want to use something else?)

 

Good question! I would have no problem accepting that everyone used it.

 

>On the template, we have found it useful to just "close up" the columns

>that we don't want to look at while georeferencing.  (You probably noticed

>this in the Excel version).

> 

>XXXXXXX (our IT person) will help us send the "real" files using the

>project protocol.

 

In so doing, be sure to preserve all of the precision in the numeric

fields. There are two ways to do this. The first is to bypass protocol and

just send me the Access database mdb file (preferably with a date in the

filename, e.g., msu_barry020213.mdb). The second is to change the data type

of those fields to text after the georeferencing is all done and then

export the data into a tab-delimited text file.
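
The second option works because text values are written out exactly as stored, whereas numeric fields are subject to whatever display format the export applies. A minimal sketch of the idea outside of Access, with made-up field names and values used only for illustration:

```python
import csv
import io


def export_tab_delimited(records, fieldnames):
    """Write records to tab-delimited text, treating every value as text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for record in records:
        # str() leaves text untouched, so no digits are dropped on the way out.
        writer.writerow({name: str(record[name]) for name in fieldnames})
    return buffer.getvalue()


# Hypothetical record; the coordinates are illustrative, not real data.
fields = ["DecimalLatitude", "DecimalLongitude", "MaxErrorInMeters"]
records = [{"DecimalLatitude": "42.595833",
            "DecimalLongitude": "-85.408333",
            "MaxErrorInMeters": "1234"}]

text = export_tab_delimited(records, fields)

# By contrast, a numeric export with a two-decimal display format loses precision.
lossy = f"{float('42.595833'):.2f}"
```

The exported row keeps all six decimal places, while `lossy` is "42.60".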

 

> >I'm composing a reply to your previous message, which I'll send out to the

> >list due to common items of interest, and as a way of introducing more

> >information on the issue of extents.

> >

>Okay.  Robin replied to me (from home) about extents.  Here is her "vote".

>FROM XXXXXX:  I'd vote for an actual column regarding extent

>information to assure that it was remembered.  I view the column headings

>as a checklist of things I need to provide and without reference to it, it

>could easily be forgotten with all the other components.

 

This is a valuable, practical point with which I entirely agree.

 

John W

 

>>> Posting number 170, dated 13 Feb 2002 12:33:47

Date:         Wed, 13 Feb 2002 12:33:47 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: MSU PRACTICE RECORDS

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

John, XXXX, XXXX:  Can I get copies of the data files?  I'd like to run

them through the lat/long calculator for comparison.

 

 

 

>>> Posting number 171, dated 14 Feb 2002 18:30:16

Date:         Thu, 14 Feb 2002 18:30:16 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Error Calculator:Coordinate Source & Topozone.com

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Hi John,

 

Thanks for the helpful information about map scales and choices to make on

the error calculator when using Topozone.com for georeferencing.  I have

some additional questions about this.  The message exchanges (from

Mammal-Z-Net) are copied below.

 

>From John:

>> >Topozone actually has

>> >only two underlying scales, 1:25,000, and 1:100,000. The 1:50,000 and

>> >1:200,000 versions are just zoomed out by a factor of two from their

>> >counterparts. Also, the 1:25,000 Topozone scales actually consist of (were

>> >derived from) a combination of 1:25,000 and 1:24,000 maps, which were

>> >"resized." It doesn't make all that much difference in the error

>> >determination if you use 1:25,000 vs. 1:24,000. However, I hope you are

>> >using the 1:25,000 map scale contribution in the error calculator for the

>> >1:50,000 Topozone maps. Similarly, use the 1:100,000 map scale for the

>> >1:200,000 Topozone maps.

>> 

>From XXXXX:

>>Good to know all of the above.  Actually, we used "gazetteer" from the

>>dropdown on the error calculator for all of the Topozone practice records.

>>We were following the example from the georeferencing guidelines where the

>>coordinate source (Topozone) was considered to be a gazetteer, and thus

>>selected "gazetteer" on the error calculator.  It sounds like we need to

>>redo the MAX ERROR with the map scale incorporated.

 

>From John:

>Actually, there is a subtle distinction to make. In the Georeferencing

>Guidelines document I said that the source for that "Distance Only" example

>was a gazetteer, because the coordinates were for a named place and

>Topozone uses the GNIS data to plot named places; thus, the ultimate source

>of the coordinates for that example is the GNIS database, which is a

>gazetteer. If you had used Topozone to measure on a map, then the map

>itself is the source of the coordinates and should be so reflected in the

>error calculations by selecting an appropriate map scale.

> 

My questions:

 

1.  I understand (from exchange above) that if the locality that we want to

georeference is a named place (such as East Lansing or Beaver Island or

Fine Lake) and we enter this into the Place Name Search in Topozone and

Topozone gives us the coordinates of that place, then the Coordinate Source

that we select on the Error Calculator will be a Gazetteer (because

Topozone got those coordinates from GNIS).  Thus, I believe that we

calculated the error correctly in the practice records that contained

coordinates given by Topozone for named places (such as Fine Lake).  Is

this correct?

 

2.  Are the Topozone maps considered to be USGS or non-USGS maps?  For

example, if we used a Topozone.com map at 1:25,000 scale to measure the

distance from a town, shall we select USGS Map 1:25,000 or non-USGS Map

1:25,000 from the Coordinate Source dropdown on the Error Calculator?

 

Thanks again,

XXXXX

 

 

 

 

>>> Posting number 172, dated 14 Feb 2002 15:39:16

Date:         Thu, 14 Feb 2002 15:39:16 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Error Calculator:Coordinate Source & Topozone.com

In-Reply-To:  <3.0.32.20020214183015.0072e530@pilot.msu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXXX, and all,

 

You are correct with respect to question 1, below. You got the coordinates

indirectly from GNIS for named places, therefore, the appropriate source is

a gazetteer.  If you use Topozone to find a locality, but do any kind of

measuring on the Topozone maps, then you are indirectly using a USGS map,

and you should select the appropriate scale in the coordinate source

dropdown box in the error calculator application. So, to explicitly answer

question 2, below, use "USGS Map 1:25,000" for Topozone maps at either

1:25,000 or 1:50,000. Use "USGS Map 1:100,000" for Topozone maps at either

1:100,000 or 1:200,000. While we're at it, here's a reminder to always use

NAD27 for Topozone-derived coordinates, whether from the gazetteer or from

the maps.

 

John W
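
The rule in this reply can be written down as a small decision function. This is only a sketch of the advice given here: the function name and arguments are hypothetical, and the returned strings stand in for the entries in the Error Calculator's Coordinate Source dropdown, whose exact labels may differ.

```python
TOPOZONE_DATUM = "NAD27"  # per the reminder above, for both gazetteer and map use


def topozone_coordinate_source(display_scale, measured_on_map):
    """Pick the coordinate source for coordinates obtained through Topozone.

    display_scale: denominator of the Topozone scale in use, e.g. 50000.
    measured_on_map: True if any measuring was done on the map itself,
    False if the coordinates came straight from the place name search.
    """
    if not measured_on_map:
        # Place name search results come indirectly from GNIS.
        return "gazetteer"
    if display_scale in (25000, 50000):
        return "USGS Map 1:25,000"
    if display_scale in (100000, 200000):
        return "USGS Map 1:100,000"
    raise ValueError(f"unexpected Topozone scale 1:{display_scale}")


named_place = topozone_coordinate_source(25000, measured_on_map=False)
zoomed_out = topozone_coordinate_source(50000, measured_on_map=True)
```

Here `named_place` is "gazetteer" and `zoomed_out` is "USGS Map 1:25,000", since the 1:50,000 view is just the 1:25,000 map zoomed out by a factor of two.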

 

>Thanks for the helpful information about map scales and choices to make on

>the error calculator when using Topozone.com for georeferencing.  I have

>some additional questions about this.  The message exchanges (from

>Mammal-Z-Net) are copied below.

> 

> >From John:

> >> >Topozone actually has

> >> >only two underlying scales, 1:25,000, and 1:100,000. The 1:50,000 and

> >> >1:200,000 versions are just zoomed out by a factor of two from their

> >> >counterparts. Also, the 1:25,000 Topozone scales actually consist of (were

> >> >derived from) a combination of 1:25,000 and 1:24,000 maps, which were

> >> >"resized." It doesn't make all that much difference in the error

> >> >determination if you use 1:25,000 vs. 1:24,000. However, I hope you are

> >> >using the 1:25,000 map scale contribution in the error calculator for the

> >> >1:50,000 Topozone maps. Similarly, use the 1:100,000 map scale for the

> >> >1:200,000 Topozone maps.

> >>

> >From XXXXX:

> >>Good to know all of the above.  Actually, we used "gazetteer" from the

> >>dropdown on the error calculator for all of the Topozone practice records.

> >>We were following the example from the georeferencing guidelines where the

> >>coordinate source (Topozone) was considered to be a gazetteer, and thus

> >>selected "gazetteer" on the error calculator.  It sounds like we need to

> >>redo the MAX ERROR with the map scale incorporated.

> 

> >From John:

> >Actually, there is a subtle distinction to make. In the Georeferencing

> >Guidelines document I said that the source for that "Distance Only" example

> >was a gazetteer, because the coordinates were for a named place and

> >Topozone uses the GNIS data to plot named places; thus, the ultimate source

> >of the coordinates for that example is the GNIS database, which is a

> >gazetteer. If you had used Topozone to measure on a map, then the map

> >itself is the source of the coordinates and should be so reflected in the

> >error calculations by selecting an appropriate map scale.

> >

>My questions:

> 

>1.  I understand (from exchange above) that if the locality that we want to

>georeference is a named place (such as East Lansing or Beaver Island or

>Fine Lake) and we enter this into the Place Name Search in Topozone and

>Topozone gives us the coordinates of that place, then the Coordinate Source

>that we select on the Error Calculator will be a Gazetteer (because

>Topozone got those coordinates from GNIS).  Thus, I believe that we

>calculated the error correctly in the practice records that contained

>coordinates given by Topozone for named places (such as Fine Lake).  Is

>this correct?

> 

>2.  Are the Topozone maps considered to be USGS or non-USGS maps?  For

>example, if we used a Topozone.com map at 1:25,000 scale to measure the

>distance from a town, shall we select USGS Map 1:25,000 or non-USGS Map

>1:25,000 from the Coordinate Source dropdown on the Error Calculator?

> 

>Thanks again,

>XXXXX

> 

 

 

>>> Posting number 173, dated 15 Feb 2002 10:55:46

Date:         Fri, 15 Feb 2002 10:55:46 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: Error Calculator:Coordinate Source & Topozone.com

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Hi John,

 

Thanks for the information.  We'll go ahead and recalculate the Max Error

Values on our "practice" records.

 

One minor question with respect to the word "measuring" in your response

below:  For some localities, such as road intersections for example, we get

the coordinates by placing the cursor on the Topozone map, and then

clicking to get the target coordinates of that particular locality.  We

really aren't "measuring", but the coordinates are still considered to be

derived from Topozone, and so the map scale information gets applied to the

error calculator - correct?

 

Thanks,

XXXXX

 

 

 

At 03:39 PM 02/14/2002 -0800, you wrote:

>XXXXX, and all,

> 

>You are correct with respect to question 1, below. You got the coordinates

>indirectly from GNIS for named places, therefore, the appropriate source is

>a gazetteer.  If you use Topozone to find a locality, but do any kind of

>measuring on the Topozone maps, then you are indirectly using a USGS map,

>and you should select the appropriate scale in the coordinate source

>dropdown box in the error calculator application. So, to explicitly answer

>question 2, below, use "USGS Map 1:25,000" for Topozone maps at either

>1:25,000 or 1:50,000. Use "USGS Map 1:100,000" for Topozone maps at either

>1:100,000 or 1:200,000. While we're at it, here's a reminder to always use

>NAD27 for Topozone-derived coordinates, whether from the gazetteer or from

>the maps.

> 

>John W

> 

 

 

>>> Posting number 174, dated 15 Feb 2002 09:09:40

Date:         Fri, 15 Feb 2002 09:09:40 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Error Calculator:Coordinate Source & Topozone.com

In-Reply-To:  <3.0.32.20020215105545.0072c878@pilot.msu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXXX,

 

You aptly described exactly what I meant. Thank you.

 

John

 

>One minor question with respect to the word "measuring" in your response

>below:  For some localities, such as road intersections for example, we get

>the coordinates by placing the cursor on the Topozone map, and then

>clicking to get the target coordinates of that particular locality.  We

>really aren't "measuring", but the coordinates are still considered to be

>derived from Topozone, and so the map scale information gets applied to the

>error calculator - correct?

> 

>Thanks,

>XXXXX

> 

 

 

>>> Posting number 175, dated 15 Feb 2002 18:08:31

Date:         Fri, 15 Feb 2002 18:08:31 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: coordinate source?

Comments: cc: fsyu <fsyu@uaf.edu>

In-Reply-To:  <3C6D8138@webmail.uaf.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXXX and all,

 

There is no provision for georeferencing records that already have

coordinates, but this shouldn't necessarily deter you from doing so. If you

go this route, please be sure to note that you have provided these

additional data when you send them in to me. It makes a difference in how I

handle the data on this end.

 

To answer your specific question, you should put "original locality

description" in the DeterminationRef field in the downloaded data file and

use "locality description" as the Coordinate Source choice in the Error

Calculator.

 

John W

 

>Hi John,

> 

>Many Alaska data are already georeferenced, but don't have maximum error.

>I've

>been calculating max. error for them, but determination references are not

>recorded for most of them.  What should I enter in Coordinate source in Error

>Calculator?

> 

>XXXXXX

 

>>> Posting number 176, dated 20 Feb 2002 09:06:36

Date:         Wed, 20 Feb 2002 09:06:36 -1000

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Topo USA Ver. 3.0 by DeLorme

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

For anyone using DeLorme software Topo USA Ver. 3.0 (which I am using to do

Hawaii localities)  you will need this information for the georeferencing

calculator.  I just spoke with the Tech help people and got the information

that all topo maps, at all zoom levels, are based on USGS 1:24,000.  I

quite like this software as it allows me to place markers for all the

localities I've done which greatly speeds up any double checking I might

want to do.  Measuring distances is also easy, either by air or road.

 

XXXXX

 

 

>>> Posting number 177, dated 25 Feb 2002 14:36:49

Date:         Mon, 25 Feb 2002 14:36:49 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      MaNIS Server recommendations

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

Due to popular demand, I'm writing to give an updated recommendation for

the MaNIS server specifications. The requirements haven't changed since the

original specifications were sent out on 2 Oct 2001.  Nevertheless, I'll

reiterate the essentials of the configuration, ordered by importance:

 

1) dual processor Windows 2000 Professional - the Xeon processor is good

for our purposes; faster is better, but anything on the market today is

fast enough.

 

2) 512 MB RAM - more is better, but not at the cost of any of the other

essentials.

 

3) one fast SCSI hard drive - essential; faster is better; capacity is much

less important. 18GB is a good target capacity.

 

4) 10/100 Ethernet adapter - essential; most systems these days have one on

board.

 

5) 3 yr service on parts and labor - essential; we don't want anything to

break without warranty during the period of the grant.

 

6) CD-ROM drive - faster is better; a CD-RW may be a useful alternative, if

it fits your budget.

 

7) 17" Monitor - this machine is supposed to be a server, not a

workstation, so don't spend big money on a fancy display.

 

8) 1.44 MB diskette drive - less essential every day, but most machines

still come with one.

 

I've created a model system on the Dell website to give you an idea for a

recommended configuration. To look at the specifications for the system

you'll need to Retrieve EQuote #E001554835. You'll also need to enter

either the E-Quote name, which is "manis2," or my email address.

 

Let me know if you have any questions.

 

John W.

 

>>> Posting number 178, dated 27 Feb 2002 14:59:52

Date:         Wed, 27 Feb 2002 14:59:52 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Mystified

In-Reply-To:  <3.0.32.20020227173043.007327a8@pilot.msu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Hi XXXX, XXXX, and all,

 

I have noticed the syndrome you mentioned and I tried to ignore it. That's

harder to do when someone else notices it. It's even worse when two people

notice it - it gets harder to remove the witnesses. I think I know why it

occurs, but I don't have a satisfactory solution yet. I actually made the

interface show 3 decimal places in the Maximum Error field so that this

inconsistency would make less of an impact on the results, which may

currently differ from the expected by up to .001 distance units. So, the

worst case scenario occurs when your distance units are miles, and then the

error (in the error) amounts to about 5.3 feet. This is probably acceptable

and worth trading in your concern for a life. :)  In the meantime, I'll

remain cognizant of the problem and try to work on its resolution.
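
The size of that worst case can be checked with a few lines of Python. This is a sketch: the set of distance units (mi, km, m) is assumed from the units that appear in submitted records, and the helper name is hypothetical.

```python
# Feet per distance unit; the unit list is an assumption for this example.
FEET_PER_UNIT = {
    "mi": 5280.0,
    "km": 3280.84,   # approximate
    "m": 3.28084,    # approximate
}

# Largest discrepancy observed in the Maximum Error field, in distance units.
MAX_DISCREPANCY_UNITS = 0.001


def discrepancy_in_feet(unit: str) -> float:
    """Convert the 0.001-unit discrepancy to feet for the given unit."""
    return MAX_DISCREPANCY_UNITS * FEET_PER_UNIT[unit]


if __name__ == "__main__":
    for unit in FEET_PER_UNIT:
        print(unit, round(discrepancy_in_feet(unit), 3), "ft")
```

Miles give the largest value because a mile is the longest of the units considered.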

 

John

 

At 05:30 PM 2/27/02 -0500, you wrote:

>Hi John,

> 

>XXXX and I are mystified about some of the error values in our Barry

>County records (files sent to you in today's earlier message).

> 

>1.  In the first set of Barry County records (the files that we sent to you

>on 2/12/2002) we incorrectly chose Gazetteer as the error calculator

>coordinate source for Topozone for all records.  For the records that were

>TRS localities, we anticipated getting identical values for maximum error.

>This was not the case.  When XXXX used the error calculator on her

>computer, she got .716 as the error.  When I used the error calculator on

>my computer for these types of records, I got .715 as the error.

> 

>2.  In the second set of Barry County records (the files that we sent to

>you today 2/27/2002 where maximum error was recalculated with the

>appropriate Topozone map scale), our computers continue to give different

>error calculator values for some of the TRS localities that used an error

>calculator map scale of 1:25,000 (See Sec. 23, T1N, R7W,

>Sec. 24, T1N, R7W and

>T01N R07W Section 4)

> 

>3.  We were surprised at the above examples.  We then entered each other's

>coordinates using identical dropdown choices on the error calculator on our

>respective computers.  XXXX's computer still consistently returned an

>error of .723 for all of the TRS localities that had the 1:25,000 scale.

>However, XXXX's computer returned an error of .723 on some localities and

>.724 on others with the 1:25,000 scale.  Do we need to be concerned about

>this? (or shall we get a life?)

> 

>Thanks,

>XXXXX

 

 

>>> Posting number 179, dated 27 Feb 2002 16:24:51

Date:         Wed, 27 Feb 2002 16:24:51 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Sample of georeferencing from Baton Rouge

Comments: To:

In-Reply-To:  <OF09532E16.D5566143-ON86256B6D.00611AF2@lsu.edu>

Mime-Version: 1.0

Content-Type: multipart/mixed; boundary="=====================_-1683450515==_"

 

--=====================_-1683450515==_

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX, and all,

 

Very nicely done.  I can see that you've gone to a lot of trouble to

document the determination methods in the Remarks. There should be no

trouble for someone to figure out later what you did.  Some of the

techniques you used (and documented) will surely be useful to others, so

I'm attaching your file with this message to the mammal-z-net list.

 

I'm trying to decide if/how to make everyone's job a little easier, perhaps

by including a field for named place along with one for the extent. That

way we'll know unequivocally to what the extent refers. I've just started

having my georeferencers do this, and it seems to be better (faster anyway)

than trying to write that information out in plain English in the remarks.

I'm interested in feedback from you and anyone else with an opinion about

whether this change would have a positive effect on your georeferencing.

I'm hoping to set a policy on this subject once there has been ample time

for cogitation on it. In the meantime, I recommend that georeferencers add

two columns to their data, one for NamedPlace, followed by one for Extent,

and put these right before MaximumErrorDistance. Do not include an

ExtentUnits field; instead, use the same units as for the

MaximumErrorDistance and the MaxErrorUnits will refer to both measures.
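
For data kept as tab-delimited text, the column change could be scripted roughly as below. This is a sketch under assumptions: the function name is hypothetical, and the default column name follows the attached file, which spells the error column "MaxErrorDistance".

```python
from typing import List, Tuple


def add_extent_columns(
    header: List[str],
    rows: List[List[str]],
    error_column: str = "MaxErrorDistance",
) -> Tuple[List[str], List[List[str]]]:
    """Insert NamedPlace and Extent immediately before the error column.

    No ExtentUnits column is added; Extent shares MaxErrorUnits.
    """
    idx = header.index(error_column)  # raises ValueError if the column is missing
    new_header = header[:idx] + ["NamedPlace", "Extent"] + header[idx:]
    # Existing rows get empty values for the two new columns.
    new_rows = [row[:idx] + ["", ""] + row[idx:] for row in rows]
    return new_header, new_rows


if __name__ == "__main__":
    header = ["DecLat", "DecLong", "MaxErrorDistance", "MaxErrorUnits"]
    rows = [["30.4141", "-91.1759", "1.009", "mi"]]
    new_header, new_rows = add_extent_columns(header, rows)
    print("\t".join(new_header))
    for row in new_rows:
        print("\t".join(row))
```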

 

John W

 

>Hi John,

> 

>Here at LSU, we've downloaded all the Louisiana records from the MANIS

>database, and have begun georeferencing, starting with records from Baton

>Rouge (our home turf). We've learned a lot as we've worked through our

>first batch of records, especially from much of the recent email exchanges

>with other institutions, and we really appreciate the ease of use of the

>Error Calculator. We were wondering if you could look over a small (<20

>records) sample of some of the different types of localities we have

>georeferenced, just to see if we are on the right track. Our longest field

>is the LatLongRemarks, where we describe how we located the point and the

>extent that we estimated to calculate error with. We just wanted to make

>sure that you would be able to follow what we did if there are any

>questions with our georeferencing. Should we place the extents in a

>separate field, and if so, should we place it in any particular order with

>respect to the other fields? Let us know if you see any problems.

> 

>Many thanks,

> 

>XXXXXXX

 

>**********************************************************

 

--=====================_-1683450515==_

Content-Type: text/plain; charset="us-ascii"

Content-Disposition: attachment; filename="batonrouge.txt"

 

"LocalityID"    "CollectionCode"        "HigherGeog"    "SpecLocality"  "ElevationText" "MinElev"

"MaxElev"       "ElevUnits"     "LatText"       "LongText"      "TRS"   "Township"      "TownshipDir"

"Range" "RangeDir"      "TRSSection"    "TRSPart"       "DetByAgentID"  "DeterminedByPerson"

"DeterminedDate"        "DeterminationRef"      "OrigCoordSystem"       "Datum" "DecLat"

"DecLong"       "LatDeg"        "LatMin"        "LatSec"        "LatDir"        "LongDeg"

"LongMin"       "LongSec"       "LongDir"       "UTMZone"       "UTMEW" "UTMNS" "MaxErrorDistance"

"MaxErrorUnits" "LatLongRemarks"        "CaptiveFlag"   "NoGeorefBecause"       "LocalityAnnotation"

13056   "CAS"   "North America, USA, Louisiana" "Briar patch near LSU campus, East Baton Rouge"

"Dinakar Nethi" "1-22-02"       "Topozone - gazetteer"  "decimal degrees"       "NAD27" "30.4141"

"-91.1759"

"1.009" "mi"    "center point of LSU Campus obtained from topozone, estimated furthest extent of ""near

LSU campus"" from center as 1 mi"       "0"

28636   "FMNH"  "USA, Louisiana, Baton Rouge Par"       "Baton Rouge"

"Satya Maliakal"        "1-23-02"       "Topozone - gazetteer"  "decimal degrees"       "NAD27"

"30.4451"       "-91.1867"

"13.009"        "mi"    "used EBR Parish courthouse as center, furthest extent of BR city limits from

courthouse estimated at 13 mi"    "0"

47616   "KU"    "U S A, LOUISIANA, EAST BATON ROUGE PARISH"     "BATON ROUGE, 5 MI S OF"

"m"                                                                                     "Satya Maliakal"

"1-23-02"       "Topozone -1:100,000"   "decimal degrees"       "NAD27" "30.3725"       "-91.1867"

"15.903"        "mi"    "located point 5mi S of EBR Parish courthouse, furthest extent of BR city limits

from courthouse estimated at 13 mi"    "0"

71051   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "0.25 mi E jct. Highland and Lee (on

Highland), Baton Rouge"            "0"     "0"

"Satya Maliakal"        "1-28-02"       "Topozone -1:25,000"    "decimal degrees"       "NAD27"

"30.3911"       "-90.1562"

"38.412"        "m"     "located point 0.25 mi E of intersection of Highland and Lee on Highland,

estimated extent of intersection as 10 m"     "0"

71121   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "1 km S Baton Rouge, intersection Ben

Hur Rd. and Nicholson Rd., E tracks along fence line, 5 m"                "0"     "0"

"Satya Maliakal"        "1-28-02"       "Topozone -1:25,000"    "decimal degrees"       "NAD27"

"30.3841"       "-91.1687"

"43.413"        "m"     "located point at intersection of nicholson drive RR tracks and ben hur road,

assuming that 1 km S of BR refers to this intersection, estimated extent of intersection as 10 m with 5

m offset" "0"

71074   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "0.33 mi S of Baton Rouge City Limits on

Highland Rd"           "0"     "0"

"Satya Maliakal"        "1-28-02"       "Topozone -1:25,000"    "decimal degrees"       "NAD27"

"30.3687"       "-91.1227"

"38.414"        "m"     "point located .33 mi S of intersection of Highland Rd. and southern Baton Rouge

Corp. Limit on Highland Road, estimated extent of intersection as 10 m"        "0"

71248   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "10 mi S Baton Rouge on River Rd"

"16"    "16"    "meters"

"Satya Maliakal"        "1-28-02"       "Topozone -1:100,000"   "decimal degrees"       "NAD27"

"30.3533"       "-91.1808"

"14.041"        "mi"    "located point 10 mi S of EBR courthouse following River Road, furthest extent

of Baton Rouge city limits from courthouse estimated at 13 mi"   "0"

71268   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "11465 Robin Hood, Baton Rouge"

"0"     "0"

"Satya Maliakal"        "1-29-02"       "Topozone -1:25,000"    "decimal degrees"       "NAD27"

"30.4555"       "-91.0561"

"37.408"        "m"     "located 11465 Robin Hood with yahoo maps, then located this point with

topozone, estimated extent of property at 10 m" "0"

71243   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "10 mi N Baton Rouge, US 61"

"0"     "0"

"Satya Maliakal"        "1-29-02"       "Topozone -1:100,000"   "decimal degrees"       "NAD27"

"30.5503"       "-91.1969"

"14.041"        "mi"    "located point 10 mi N of BR along US 61 (starting from EBR Parish courthouse

latitude), furthest extent of Baton Rouge city limits estimated at 13 mi" "0"

71511   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "3.4 mi E, 1 mi N Baton Rouge on LA 37"

"0"     "0"

"Satya Maliakal"        "2-13-02"       "Topozone -1:25,000"    "decimal degrees"       "NAD27"

"30.4655"       "-91.1329"

"19.819"        "mi"    "located closest point 3.4 mi E and 1 mi N of EBR courthouse on LA 37, furthest

extent of BR city limits from courthouse estimated at 13 mi"    "0"

71294   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "2 mi N Baton Rouge on Miss. River"

"0"     "0"

"Satya Maliakal"        "2-08-02"       "Topozone -1:25,000"    "decimal degrees"       "NAD27"

"30.4733"       "-91.1927"

"14.017"        "mi"    "located point 2 mi N of EBR Parish courthouse following Mississippi River,

furthest extent of BR city limits from courthouse estimated at 13 mi"       "0"

71801   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "Baton Rouge on River Road"

"16"    "16"    "meters"

"Satya Maliakal"        "2-20-02"       "Topozone -1:100,000"   "decimal degrees"       "NAD27"

"30.3749"       "-91.2249"

"5.041" "mi"    "located point at center of River Rd. in Baton Rouge, estimated furthest exent of River

Rd. in BR from center at 5 mi"  "0"

71802   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "Baton Rouge Quad. 15' Sec 51, T7S, R2E"

"45"    "45"    "feet"

"Satya Maliakal"        "2-21-02"       "Topozone -1:25,000"    "decimal degrees"       "NAD27"

"30.4277"       "-91.0072"

"4.260" "mi"    "located point at center of T7S, R2E (unable to locate Quad. 15' Sec. 51)"      "0"

71897   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "Baton Rouge, Tulane Ave"

"0"     "0"

"Dinakar Nethi" "02-25-02"      "Topozone -1:25,000"    "decimal degrees"       "NAD27" "30.4019"

"-91.1652"

"0.527" "km"    "point located at approximate center of Tulane Ave., furthest extent of Tulane avenue

from center point estimated as .5 km"     "0"

71821   "LSU"   "USA, Louisiana, East Baton Rouge Parish"       "Baton Rouge, 2100 Stanford"

"0"     "0"

"Dinakar Nethi" "02-08-02"      "Topozone - 1:25,000"   "decimal degrees"       "NAD27" "30.4187"

"-91.1536"

"37.410"        "m"     "located 2100 Stanford with yahoo maps and then located this point on topozone,

extent of property estimated at 10 m"   "0"

 

 

--=====================_-1683450515==_

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

 

--=====================_-1683450515==_--

 

>>> Posting number 180, dated 7 Mar 2002 14:15:38

Date:         Thu, 7 Mar 2002 14:15:38 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: MaNIS

In-Reply-To:  <a05100301b8ad6761b1be@[141.211.110.228]>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX and all,

 

>Hello John,

>My apologies, when I am georeferencing I use the "hide" command under

>"column" in the "format" menu of excel to close down columns that I seldom

>or never use. In this way, I can see the decimal latitude and longitude

>columns, for example, directly next to the locality column on my computer

>screen.  I inadvertently forgot to "unhide" a few columns when I sent the

>excel files back to you.

 

I should have looked for that.

 

>A question for you: I have some localities where the data is obviously in

>error but cannot be corrected by me. Do you prefer that I reference the

>county center with a note in the locality annotation column, or not

>georeference the locality with a note in the NoGeorefBecause column?

 

There are two different classes of locality errors that you need to worry

about, those with internal inconsistencies that make the locality

impossible to determine (e.g., Hogback Creek, Inyo County - there are two

of these), and those that have an obvious error that can be corrected

unambiguously (e.g., Needles, Mojave Co., California - Mojave Co. is in

Arizona and Needles is in San Bernardino Co, California).

 

If there is an internal inconsistency in the locality information that

makes the locality impossible to determine unambiguously, do not provide

coordinates and error, but do put something like "internal inconsistency"

in the NoGeorefBecause field and explain the problem in the

LocalityAnnotation field (e.g., "there are two Hogback Creeks in Inyo

Co."). When the source institution gets the georeferenced data back,

they'll be able to see what the problem was for each locality that was not

georeferenced.

 

If there is an obvious error that doesn't make the georeferencing

ambiguous, go ahead and georeference the locality, but put your assumptions

in the LatLongRemarks field and definitely point out the error in the

LocalityAnnotation field. The source institution will be able to see what

your assumptions were and they'll be able to fix the errors you uncovered.

 

In summary, LatLongRemarks should be filled with information about how you

georeferenced, LocalityAnnotation should be filled with information about

errors or ambiguities - intended for the source institution, and

NoGeorefBecause should be a brief phrase describing your reason for not

georeferencing a locality (e.g., "internal inconsistency", "too vague", "no

specific locality").
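
The division of labor among the three fields can be sketched in Python. The function names are hypothetical and only the three text fields are modeled; coordinates and error would be filled in separately for the correctable case.

```python
from typing import Dict


def fields_for_undeterminable(reason: str, explanation: str) -> Dict[str, str]:
    """Locality cannot be determined unambiguously: do not georeference.

    reason is a brief phrase such as "internal inconsistency";
    explanation describes the problem for the source institution.
    """
    return {
        "LatLongRemarks": "",
        "LocalityAnnotation": explanation,
        "NoGeorefBecause": reason,
    }


def fields_for_correctable_error(assumptions: str, error_found: str) -> Dict[str, str]:
    """Obvious error that leaves the locality unambiguous: georeference it."""
    return {
        "LatLongRemarks": assumptions,      # how the locality was georeferenced
        "LocalityAnnotation": error_found,  # the error, for the source institution
        "NoGeorefBecause": "",
    }


if __name__ == "__main__":
    print(fields_for_undeterminable(
        "internal inconsistency",
        "there are two Hogback Creeks in Inyo Co.",
    ))
    print(fields_for_correctable_error(
        "assumed Needles, San Bernardino Co., California",
        "Mojave Co. is in Arizona; Needles is in San Bernardino Co., California",
    ))
```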

 

John W

 

 

 

>>> Posting number 181, dated 9 Mar 2002 11:19:22

Date:         Sat, 9 Mar 2002 11:19:22 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Some other useful Excel operations for MaNIS work

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

In addition to Hide columns, some other Excel operations I have found

useful are:

 

1. AutoFilter (similar to Access):

First select a column or columns, then choose

 

Data Menu>AutoFilter>select (Custom...) from the scrollable pick list>pick

contains and enter data of interest.

 

Using custom contains filtering, you can pull out all records for a county

from the backward HighGeo field or get all occurrences of a placename in

SpecLoc.   Records can be worked with as desired.

 

Show All just under AutoFilter on the Data menu brings all records back.

 

2. Protect Worksheet: This will prevent inadvertent changes to MaNIS

records handed down from the mount but cells, columns, or rows can be left

open for data entry if you first select them, then under Format

cells>Protection tab>click unlocked.  Once a worksheet is locked you can

enter data manually or automatically (e.g., DecLat, DecLong, error) but still

lock out changes to the locality fields.   Protecting disables the Sort

capability.

 

3.  LookUp:

Works great for dynamic lookup (as you type) and automatic assignment of

data like a placename lat/long from another list like the GNIS download.

With about 5000 of these links in the Oregon records, my machine (196 MB

RAM) starts to bog down.   To get rid of the links but retain the data, do

a Copy, Paste Special, click Value.

 

I've been using LookUp in four columns after LocAnnotation. I enter

placename (winnowed by user) that is then looked up and values for GNIS

placename, type of locality, county,  and DecLat, DecLong are returned.

Placename, type and county are for user verification and lat & long are for

computing lat/longs based on  offsets.

 

4. Concatenation:  For a text field this is done with "&", eg, columns A,

B, C  can be appended to D with

 "=D:D&", "&A:A&", "&B:B&", "&C:C" .  Enter this in the first field, then

fill down as needed.  Used to add misc. notes to memo fields of MaNIS.

 

You can flip the HighGeo to have county first for sorting by doing a Text

to columns (Data menu), then concatenating the columns with the county

column first.  Of course leave the original HighGeo unaltered.
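
For anyone working outside Excel, the "contains" filter and the county-first flip can be approximated in Python. This is a sketch: the function names are hypothetical, the sample values are taken from a submitted Louisiana file, and it assumes the county is the last comma-separated element of the higher geography.

```python
from typing import Dict, List


def contains_filter(records: List[Dict[str, str]], field: str, text: str) -> List[Dict[str, str]]:
    """Keep records whose field contains text (case-insensitive by choice here)."""
    needle = text.lower()
    return [r for r in records if needle in r.get(field, "").lower()]


def county_first(higher_geog: str) -> str:
    """Reverse the comma-separated parts so the most specific part comes first.

    Returns a new string; the original value should be left unaltered.
    """
    parts = [p.strip() for p in higher_geog.split(",")]
    return ", ".join(reversed(parts))


if __name__ == "__main__":
    records = [
        {"HigherGeog": "USA, Louisiana, East Baton Rouge Parish",
         "SpecLocality": "10 mi S Baton Rouge on River Rd"},
        {"HigherGeog": "USA, Louisiana, East Baton Rouge Parish",
         "SpecLocality": "Baton Rouge, Tulane Ave"},
    ]
    for r in contains_filter(records, "SpecLocality", "river rd"):
        print(county_first(r["HigherGeog"]), "|", r["SpecLocality"])
```

Sorting on the flipped value then groups records by county, as with the Text to Columns approach.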

 

When you get tired of these, there is the underlying Visual Basic macro

editor which is fun if you like that sort of thing.

 

I'll probably stick with Excel through the project due to our "Mac-enabled"

status in the museum.  I use Windows at home and will in the museum as soon as

our server arrives.

 

 

 

>>> Posting number 182, dated 11 Mar 2002 12:02:55

Date:         Mon, 11 Mar 2002 12:02:55 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Sending Data from MSU

In-Reply-To:  <3.0.32.20020311144354.006e023c@pilot.msu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX, and all,

 

>XXXX and I have data from a few Michigan counties to send to you.  So far,

>we have Access.mdb files for Barry, Branch, and Muskegon ready to go, and

>Kent, Ionia, and Montcalm are forthcoming.  We have two questions for you:

> 

>1.  We minimize the width (what I call "closing up") of many columns on the

>template (basically ones that we don't fill in with data, or don't want to

>look at).  Do you want us to open these columns back up before we send the

>file to you?

 

Nope. They're fine all closed up.

 

>2.  Do you have a preference for how often we send files to you?  (Aren't

>you getting bombarded with georeferencing data??)

 

Yes, the deluge has begun. Well, it's best to have the work backed up, so

it seems that you should send them as you finish them. Keep a copy on your

end too, for the sake of safety - you never know when we'll get hit by "the

Big One."  To minimize the threat of loss, it's probably best to upload

them as described in the Georeferencing Steps document (i.e., ftp to

galaxy.cs.berkeley.edu/incoming/mvz). Then send me messages as they arrive

safely. Of course, if you are sending Excel (.xls) or Access (.mdb) files,

you don't need to export as tab-delimited text and you should change the

file type to binary when ftp-ing.
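
If the upload is scripted, the choice of transfer mode might look like the sketch below, using Python's standard ftplib. The helper names are hypothetical, anonymous login is an assumption, and the host and directory are simply the ones quoted above. Binary mode matters because a text-mode transfer may translate line endings, which corrupts .xls and .mdb files.

```python
import os
from ftplib import FTP

# File types that must be transferred in binary mode.
BINARY_EXTENSIONS = {".xls", ".mdb"}


def needs_binary(filename: str) -> bool:
    """True for Excel and Access files; tab-delimited text can go as text."""
    return os.path.splitext(filename)[1].lower() in BINARY_EXTENSIONS


def upload(path: str,
           host: str = "galaxy.cs.berkeley.edu",
           directory: str = "incoming/mvz") -> None:
    """Upload one file, choosing the transfer mode from its extension."""
    name = os.path.basename(path)
    with FTP(host) as ftp, open(path, "rb") as fh:
        ftp.login()  # anonymous login: an assumption for this sketch
        ftp.cwd(directory)
        if needs_binary(name):
            ftp.storbinary("STOR " + name, fh)
        else:
            ftp.storlines("STOR " + name, fh)


if __name__ == "__main__":
    # Only the mode selection is exercised here; no connection is made.
    for name in ("Barry.mdb", "Branch.xls", "Muskegon.txt"):
        print(name, "binary" if needs_binary(name) else "text")
```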

 

>Thanks,

>XXXX

> 

>P.S. Thanks for "secretly" adding the NamedPlace and Extent fields to the

>template.  (We moved them over next to the MaxError column in our tables).

 

OK, the secret is out. For those of you who may not be aware of it, there

is an Access Database template for georeferencing that can be accessed

through a link in Step Five on the GeorefSteps document at the following URL:

 

http://dlp.cs.berkeley.edu/manis/GeorefSteps.html

 

 

 

>>> Posting number 183, dated 11 Mar 2002 13:49:55

Date:         Mon, 11 Mar 2002 13:49:55 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      MaNIS Servers

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

I've been asked a couple of times about making hardware substitutions in

the Equipment portion of MaNIS subcontract budgets. The bottom line is that

each institution must have, when the time comes to connect to the network,

a DEDICATED machine with the specifications highlighted in my 25 Feb

message "MaNIS Server recommendations." Dedicated means that the sole

purpose of the machine is to support data provision to the network. Beyond

that, I'm not picky.

 

John W.

 

>>> Posting number 184, dated 12 Mar 2002 14:45:06

 

>>> Posting number 185, dated 19 Mar 2002 10:46:55

Date:         Tue, 19 Mar 2002 10:46:55 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Fwd: fraction format in the error calculator

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear XXXX, and all,

 

I'm glad you uncovered this bug. The error calculator is actually not as

smart as you expected it to be. The discrepancy you're experiencing arises

because the calculator interprets 1/2 as 1, ignoring everything after the

/. Therefore, please use only decimals or whole numbers in the Offset

Distance and Extent of Named Place fields.

 

John

 

 

> 

>Hi John

> 

>I notice that maximum error is noticeably affected by the format of the

>extent entered on the error calculator if the extent contains a

>fraction.  Since the extent field accepts both decimal and common

>fractions, I experimented with 0.5 and 1/2 for the locality of 3/8 mi. N

>of Casnovia, Kent County, MI.  I approached the situation "by road," used

>decimal degrees on Topozone, and obtained the coordinates of 43.2401 and

>-85.7901.  Datum is NAD27; coordinate precision, 0.0001; coordinate

>source, USGS map 1:25,000.  Distance precision of 1/8 was selected from

>the drop-down.  When the extent of the bounding box is expressed as 0.5 (a

>logical choice for TopoZone users), the maximum error is 0.641; but when

>it is expressed as 1/2 (in keeping with the format of distance precision),

>maximum error is 1.141.

> 

>Depending on the extent, one format may be easier to use than the

>other.  However, if both formats are allowed by the calculator but only

>one yields the desired maximum error, shouldn't the field be restricted to

>that format?  [Actually now I believe the extent is slightly less than 0.5

>miles, but remain curious about the discrepancy.]  Again, your assistance

>will be greatly appreciated.

> 

>XXXXX
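The behaviour described in this exchange can be reproduced in a few lines of Python. The sketch below is not the error calculator's code: truncating_parse only mimics the reported symptom (everything after the "/" is ignored), and parse_distance is a hypothetical helper showing one way to turn either format into a decimal before typing it into the Offset Distance or Extent of Named Place fields.

```python
from fractions import Fraction


def truncating_parse(text):
    """Mimic the reported behaviour: keep only what precedes the first '/'."""
    return float(text.split("/")[0])


def parse_distance(text):
    """Convert '0.5', '1/2', '3/8' or '1 1/2' to a decimal number."""
    parts = text.split()
    if not 1 <= len(parts) <= 2:
        raise ValueError(f"cannot read distance: {text!r}")
    # Fraction accepts decimal strings ('0.5') as well as common fractions ('1/2')
    return float(sum(Fraction(part) for part in parts))


print(truncating_parse("1/2"))  # the symptom: 1/2 is read as 1
print(parse_distance("1/2"))    # the intended value
print(parse_distance("3/8"))    # offset distance from the example locality
```

Reading 1/2 as 1 instead of 0.5 inflates the extent by 0.5, which is consistent with the gap between the two maximum errors reported in the message (1.141 versus 0.641).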

 

 

>>> Posting number 186, dated 21 Mar 2002 15:51:25

Date:         Thu, 21 Mar 2002 15:51:25 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: georeferencing rivers

In-Reply-To:  <Pine.OSF.4.33.0203211410400.8199-100000@aurora.uaf.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX and all,

 

These are good questions. I'll put the answers right below each one.

 

>1. When I georeference rivers, should I take coordinates of the source or

>the drainage of the river? How much should extent of the river be?

 

The coordinates should be at the geographic center of the river, on the

river itself. The extent should be the distance to the furthest reach of

the river in either direction.

 

>2. An example: specific locality is "Brooks Range, Anaktiktoot", where

>Anaktiktoot is not on the map. Should I georeference for Brooks Range

>(which will be more than 600 miles in length)? There are many cases that

>higher geography is followed by unknown specific locality.

 

You should go ahead and put coordinates on the vague localities, even

though the maximum_error_distance will be large. Some of the higher

geographies that have no value or "no specific locality" in the locality

field can still be specific, such as islands.

 

>3. Related to my question 2: how much is too big to georeference? In many

>cases, only the name of the island, mountains, peninsula etc. are

>provided.

 

Do them all. The maximum_error_distance will be useful even if it is large.

 

John

 

>>> Posting number 187, dated 30 Mar 2002 09:00:41

Date:         Sat, 30 Mar 2002 09:00:41 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      UAM declat/longs truncated in MaNIS?

Mime-Version: 1.0

Content-Type: text/plain; format=flowed

 

John:  It looks like the UAM records in the gazetteer have the same problem

that KU's records had -- declat/longs only go to two decimals.  KU (XXX

XXXX) asked me to recompute KU's Oregon so I am overwriting  calculated

declat/longs.  Please advise on UAM records - there are several hundred.

 

Examples:

LocalityID  CollectionCode  Datum         DecLat   DecLong    LatDeg  LatMin  LatSec  LatDir  LongDeg  LongMin  LongSec  LongDir

186407      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       17       W

186662      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       10       W

186663      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       1        W

186721      UAM             not recorded  45.1600  -123.7300  45      10      1       N       123      44       6        W

186731      UAM             not recorded  45.2100  -123.6400  45      13      1       N       123      38       42       W

186514      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       32       W

186515      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       21       W

186516      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       2        W

186556      UAM             not recorded  44.2800  -123.7600  44      17      2       N       123      46       2        W

186557      UAM             not recorded  44.2800  -123.7500  44      17      2       N       123      45       2        W

186689      UAM             not recorded  45.3300  -123.7800  45      20      2       N       123      47       2        W

186690      UAM             not recorded  45.3300  -123.6400  45      20      2       N       123      38       49       W

186691      UAM             not recorded  45.3300  -123.6300  45      20      2       N       123      38       2        W

 

 

 

>>> Posting number 188, dated 1 Apr 2002 14:19:03

Date:         Mon, 1 Apr 2002 14:19:03 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: MaNIS questions

In-Reply-To:  <5.1.0.14.0.20020327144722.01df95c0@mail.fmnh.org>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear XXXX, and all,

 

I know that Barbara made a preliminary answer to the questions raised here.

I'll try to add a few points of explanation from which everyone on the list

might benefit.

 

I agree with Barbara's statement of the georeferencing priorities within

the MaNIS context. To summarize them, the MaNIS grant covers (only)

complete georeferencing for localities that have no lat_longs. Our hope is

that, through innovation and properly-guided cooperation, we will be able

to follow through on our promise to finish this. In fact, we hope that we

will be able to refine the process and the tools enough to actually get

ahead of the game. If we do get ahead, we will be able to turn our

attention next to those localities for which lat_longs exist without

supporting metadata.

 

I know we all have the desire to have consistent data quality, especially

when faced with making those data public. Within the context of our

project, however, cleaning up locality descriptions is neither covered, nor

is it recommended. Every change made to locality descriptions on your end

since the data were collected for the MaNIS gazetteer has the potential to

confound the process of properly reconnecting the georeferenced localities

with specimens in your database.

I have not yet explained the reconnecting part of the process, thinking

that what I've presented thus far is enough to swallow for the time being.

Perhaps a brief synopsis now would be of use to illustrate the potential

complications and to get people to think about the future of locality data

in institutional databases.

 

In the MaNIS gazetteer I have rendered unique occurrences of localities by

institution. These you can query on and see as results in the online MaNIS

gazetteer. Behind the scenes there is another table to cross-reference

unique localities to specimens. The specimens are linked to the localities

(and hence to the coordinates and metadata that georeferencing provide)

based on the locality string. Thus, if you change the locality string in

your database, it will not match the locality string for the same specimen

in the gazetteer. This is the crux of the issue, so it is important to

understand when it matters, and when it doesn't.

If the locality string in your database doesn't match the locality string

in the MaNIS gazetteer, but the locality really is exactly the same place

and would get the same coordinates when georeferenced, then the change

doesn't matter - the specimen will get the correct coordinates anyway.

However, if the change in your database effectively changes the place that

is described (resulting in different coordinates when georeferenced) then

the change DOES matter - it is what I have elsewhere called "substantive."

If a substantive change is made in your database and I apply the

georeferenced coordinates to the specimens that once referred to that

locality, the georeferenced data will be wrong. Therefore, there needs to

be a verification process when re-associating georeferenced localities with

individual databases. There are two steps to this process. The first is to

determine if the locality string in your database is the same as that in

the gazetteer. For all of those localities for which the locality strings

match, the georeferenced data can go into your database automatically, no

fuss, no questions asked. For the rest of the georeferenced localities from

the gazetteer, a comparison will have to be made between the then-current

locality and the georeferenced locality to determine if they still refer to

the same place. Imagine putting a check mark by each pair that still match.

The amount of checking to be done in this step is directly determined by

the number of changes you make to your locality strings between the time

when I collected the data for the gazetteer and the time when the data go

back into your database. Clearly, fewer changes mean less checking.
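The two-step verification described above can be sketched in Python. This is only an illustration of the logic, not the script that will be used for MaNIS: the function name, the dictionary layout, the specimen identifiers, and the second locality with its coordinates are all hypothetical. Only the Casnovia locality and its coordinates come from an earlier posting on this list.

```python
def partition_localities(gazetteer, current):
    """Split specimens into automatic updates and pairs needing a manual check.

    gazetteer: {specimen_id: (locality_string, coordinates)} as georeferenced
    current:   {specimen_id: locality_string} as it now stands in the
               institution's database
    """
    automatic, needs_review = {}, []
    for specimen_id, (georef_locality, coordinates) in gazetteer.items():
        now = current.get(specimen_id)
        if now == georef_locality:
            # Step 1: identical strings, so the coordinates go in automatically
            automatic[specimen_id] = coordinates
        else:
            # Step 2: someone must decide whether both strings name the same place
            needs_review.append((specimen_id, georef_locality, now))
    return automatic, needs_review


# Hypothetical identifiers; the second entry uses placeholder coordinates
gazetteer = {
    "spec-1": ("3/8 mi. N of Casnovia", (43.2401, -85.7901)),
    "spec-2": ("1 mi. S of Casnovia", (43.0, -85.0)),
}
current = {
    "spec-1": "3/8 mi. N of Casnovia",  # unchanged since the gazetteer was built
    "spec-2": "2 mi. S of Casnovia",    # edited, possibly a substantive change
}
automatic, needs_review = partition_localities(gazetteer, current)
print(automatic)
print(needs_review)
```

The length of needs_review grows with every locality string edited after the gazetteer snapshot, which is the practical reason for leaving the strings alone until the data go back.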

 

OK? Take a breath. Now, a topic for rumination as the project progresses.

Start thinking about incorporating the georeferenced coordinates and

metadata into your individual databases. Not one of the participating

institutions currently has the structure in its database to capture all of

the metadata we are gathering. It would be nice if we all could. We don't

want to throw away all of this hard work after all.

 

John W

 

 

 

>>> Posting number 189, dated 1 Apr 2002 15:37:38

Date:         Mon, 1 Apr 2002 15:37:38 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: UAM declat/longs truncated in MaNIS?

In-Reply-To:  <F569gG8WPbLgJyAUypU000104d9@hotmail.com>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX and XXXXX,

 

The problem is not exactly the same. UAM has both decimal lat_long and

degrees minutes seconds in its database. The decimal lat_longs often have

only two decimal places when there are fully specified degrees minutes

seconds, but this shouldn't affect what you're doing unless you want to

copy and paste lat_longs that UAM had already done to localities for other

institutions. If that's the case, recompute the decimal lat_longs for UAM

using the degrees minutes seconds values where the OrigCoordSystem is "deg.

min. sec."
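The recomputation suggested above is simple arithmetic, sketched here in Python. The field names follow the column headings in the quoted table and the OrigCoordSystem value follows the wording of this message. The function names, the rounding to six decimal places, and the OrigCoordSystem value assigned to the sample record are assumptions made for illustration.

```python
def dms_to_decimal(degrees, minutes, seconds, direction):
    """Convert degrees, minutes, seconds plus a hemisphere letter to decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative by convention
    return -value if direction in ("S", "W") else value


def recompute_decimal(record):
    """Return a copy of the record with DecLat/DecLong rebuilt from the DMS fields."""
    if record.get("OrigCoordSystem") != "deg. min. sec.":
        return dict(record)  # leave records from other coordinate systems alone
    updated = dict(record)
    updated["DecLat"] = round(dms_to_decimal(
        record["LatDeg"], record["LatMin"], record["LatSec"], record["LatDir"]), 6)
    updated["DecLong"] = round(dms_to_decimal(
        record["LongDeg"], record["LongMin"], record["LongSec"], record["LongDir"]), 6)
    return updated


# LocalityID 186407 from the quoted table; OrigCoordSystem is assumed here
sample = {
    "LocalityID": 186407, "OrigCoordSystem": "deg. min. sec.",
    "DecLat": 45.26, "DecLong": -123.88,
    "LatDeg": 45, "LatMin": 16, "LatSec": 1, "LatDir": "N",
    "LongDeg": 123, "LongMin": 53, "LongSec": 17, "LongDir": "W",
}
print(recompute_decimal(sample))
```

For this record the stored two-decimal values (45.26, -123.88) differ from the recomputed ones by several thousandths of a degree, which is the discrepancy the original question was about.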

 

XXXXX, you may want to put XXXX on recomputing decimal lat_longs for the

conditions described above.

 

General Reminder: Lat_Long recomputations should not be on MaNIS time

until/unless we finish the georeferencing of localities without lat_longs.

 

>John:  It looks like the UAM records in the gazetteer have the same problem

>that KU's records had -- declat/longs only go to two decimals.  KU (XXX

>XXXX) asked me to recompute KU's Oregon so I am overwriting  calculated

>declat/longs.  Please advise on UAM records - there are several hundred.

> 

>Examples:

>LocalityID  CollectionCode  Datum         DecLat   DecLong    LatDeg  LatMin  LatSec  LatDir  LongDeg  LongMin  LongSec  LongDir

>186407      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       17       W

>186662      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       10       W

>186663      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       1        W

>186721      UAM             not recorded  45.1600  -123.7300  45      10      1       N       123      44       6        W

>186731      UAM             not recorded  45.2100  -123.6400  45      13      1       N       123      38       42       W

>186514      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       32       W

>186515      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       21       W

>186516      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       2        W

>186556      UAM             not recorded  44.2800  -123.7600  44      17      2       N       123      46       2        W

>186557      UAM             not recorded  44.2800  -123.7500  44      17      2       N       123      45       2        W

>186689      UAM             not recorded  45.3300  -123.7800  45      20      2       N       123      47       2        W

>186690      UAM             not recorded  45.3300  -123.6400  45      20      2       N       123      38       49       W

>186691      UAM             not recorded  45.3300  -123.6300  45      20      2       N       123      38       2        W

> 

 

 

>>> Posting number 190, dated 1 Apr 2002 16:47:19

Date:         Mon, 1 Apr 2002 16:47:19 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: MaNIS questions

In-Reply-To:  <5.0.0.25.2.20020401125307.024018f0@socrates.berkeley.edu>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Fellow MANES:

John's message closed with this statement:

"Not one of the participating

institutions currently has the structure in its database to capture all of

the metadata we are gathering. It would be nice if we all could. We don't

want to throw away all of this hard work after all."

 

My response:  It has been a surprise to find ourselves dealing with the

topic of error estimates, etc in lat/long data, since that was not part of

the original scope of the project.  And indeed (in light of the above

quote) we do not have a capacity to absorb such information into our

present databases, let alone deciding how much time we have to care about

this.  Seeing the impact of the request for so much attention to error

estimates, I find it hard to support so much allocation of additional time

to this effort.

 

I have witnessed, over the years, many publications based on massive

datasets in which the authors were not able to document (or even care)

about variance in the quality and accuracy of the data.  Typically, they

just put on their blinders and accepted all the "AVAILABLE" data.  This is

just an inherent problem for those who move up the scale (allometric

analyses, macroecology, or whatever), and at such LARGE scales of analyses

they usually say that small local errors become insignificant, because of

the LARGE SCALE of the overall analysis.

 

I hope we can strike a balance here and get the big data entry and

conversion project done.  I don't want to see the project slowed down by

such a big commitment to accounting for aspects of the data (and the

corresponding time commitment) that were not built in to our original

estimates of what it would take to carry out the project.

 

Is this a helpful comment?

 

 

>>> Posting number 191, dated 1 Apr 2002 17:01:39

Date:         Mon, 1 Apr 2002 17:01:39 -0900

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Organization: University of Alaska Museum

Subject:      Re: MaNIS questions

MIME-Version: 1.0

Content-Type: multipart/mixed; boundary="------------4C7E03390063999F5E48C0EE"

 

 

UAM's online database (along with MVZ's) is displaying error estimates through

the Berkeley Digital Library Project's GIS viewer.  I assume that the

"finished" MaNIS project could look about the same.  That is, error estimates

will be a prominent and critical feature of the system.  Given that the GIS

viewer will map data points over satellite photos of much of the U.S., the

precision associated with the data points is critical.  The implication of "no

error" on such a fine-scale GIS layer is that the specimen came from a

specific tree or bush!  Our database contains max_errors from as small as a

few meters to as large as several tens of kilometers.  These are not arcane

details.

 

    XXXXXX

 

 

 

>>> Posting number 192, dated 2 Apr 2002 11:09:56

Date:         Tue, 2 Apr 2002 11:09:56 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Lat_Long metadata

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Oops, my mistake. There IS a collection with the structure to capture all

of the metadata. Two others, UAM and MVZ, have everything except "Extent of

Named Place."

 

Thanks XXXX, bright spot appreciated.

 

John W

 

 

>X-Sender: carlak@mail.bishopmuseum.org

>X-Mailer: QUALCOMM Windows Eudora Version 5.0.2

>Date: Tue, 02 Apr 2002 08:39:08 -1000

>To: John Wieczorek <tuco@socrates.Berkeley.EDU>

>From:

>Subject:

> 

>FYI:  in reference to your statement below....................

> 

>Start thinking about incorporating the georeferenced coordinates and

>metadata into your individual databases. Not one of the participating

>institutions currently has the structure in its database to capture all of

>the metadata we are gathering. It would be nice if we all could. We don't

>want to throw away all of this hard work after all.

> 

>Here's a bright spot to your day:  I have incorporated the MANIS locality

>structure into my Locality table and will thus be saving all the metadata

>for the BPBM specimens and for all new specimens into the collection that

>are completely georeferenced.

> 

>XXXX

 

 

>>> Posting number 193, dated 2 Apr 2002 12:00:20

Date:         Tue, 2 Apr 2002 12:00:20 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: georeferencing rivers

In-Reply-To:  <.20020401170449.0099fc90@pilot.msu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX, and all,

 

First, I want to apologize for having given contradictory opinions on how

these vague localities should be treated. I stated at least once in the

past that we shouldn't bother with these kinds of localities. However, that

opinion was not based on unassailable logic. In both of the circumstances

described below in XXXX's message the coordinates will be of limited

utility due to their very large maximum error. Nevertheless, providing the

coordinates and maximum error will allow the user to determine the extent

to which they ARE useful.

In replying to XXXX I first expressed the opinion that we should provide

maximum errors even in the truly vague cases. My unstated personal

justification for that opinion was that it makes the rules simpler. More

philosophically, by georeferencing all non-contradictory localities, we

don't need to answer the question "How big of an area is too vague?" We

cannot fully anticipate all of the uses to which the data will be put, so

we don't really have a basis on which to make that judgement. A locality

with coordinates and a maximum error distance is always more useful than a

locality without them. End of apology.

 

Now, back to the questions.

 

>John:

> 

>XXXXX's questions and your responses prompted additional questions re:

>georeferencing rivers and vague localities.

> 

>1.  Is it correct to assume that when one measures the length of a river

>to determine its geographic center the river's possibly winding path is

>taken into consideration; however, the extent is determined "as the crow

>flies" from the geographic center to the furthest reach?

 

You don't need to know the length of the river to determine its geographic

center, you need only take the means of the extremes of latitude and

longitude encompassing it. After that, you need to find the point on the

river nearest the geographic center. From there, the extent would be the

distance to the furthest point on the river.
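The three steps in this answer (midpoint of the latitude and longitude extremes, nearest point on the river, distance to the furthest point) can be sketched in Python. Everything here is an assumption made for illustration: the river is represented as a list of digitized (latitude, longitude) vertices, the "point on the river" is restricted to those vertices, distances are great-circle distances on a sphere, and the sample coordinates are invented. Rivers crossing the 180th meridian are not handled.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def river_point_and_extent(points):
    """Return (geographic center, river vertex nearest to it, extent in km)."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    # Step 1: mean of the extremes of latitude and longitude
    center = ((min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2)
    # Step 2: the vertex on the river closest to that center
    anchor = min(points, key=lambda p: haversine_km(center, p))
    # Step 3: straight-line distance from that vertex to the furthest vertex
    extent = max(haversine_km(anchor, p) for p in points)
    return center, anchor, extent


# Invented vertices for a hypothetical river
river = [(44.0, -123.0), (44.2, -123.4), (44.5, -123.5), (45.0, -124.0)]
center, anchor, extent_km = river_point_and_extent(river)
print(center, anchor, round(extent_km, 1))
```

Note that the winding length of the river never enters the calculation, which matches the reply: only the bounding extremes and straight-line distances are used.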

 

>2.  Should we put coordinates on the following vague locality:

> 

>HigherGeog: Michigan, Barry County

>SpecLocality: "no specific locality recorded"

> 

>XXXX and I have not georeferenced such localities thus far, but it

>appears from your response that county center coordinates and the extent

>of Barry County should be provided.

 

Yes. These should be georeferenced. However, there isn't really a need for

you to do it. Such localities can be georeferenced automatically from a

table of county centroids when we're all done. In retrospect, it would have

probably been useful for me to do that before making the gazetteer

"public," but I didn't think it worth the delay at the time.

 

John W
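The automatic county-centroid step mentioned in the reply above could be sketched in Python as follows. This is not the procedure that will be used for MaNIS. The field names HigherGeog and SpecLocality come from the quoted question, while the function name, the table layout, and the centroid and extent values are rough placeholders for illustration, not surveyed figures.

```python
# (state, county) -> (latitude, longitude, extent_km)
# Placeholder values for illustration only; a real table would come from an
# authoritative source of county centroids.
COUNTY_CENTROIDS = {
    ("Michigan", "Barry County"): (42.6, -85.3, 30.0),
}

VAGUE = "no specific locality recorded"


def georeference_vague(record, centroids=COUNTY_CENTROIDS):
    """Return (lat, lon, extent_km) for a county-only locality, else None."""
    specific = record["SpecLocality"].strip().strip('"').lower()
    if specific != VAGUE:
        return None  # a real description exists; georeference it by hand
    key = tuple(part.strip() for part in record["HigherGeog"].split(","))
    return centroids.get(key)  # None when the county is not in the table


record = {
    "HigherGeog": "Michigan, Barry County",
    "SpecLocality": '"no specific locality recorded"',
}
print(georeference_vague(record))
```

Localities that return None would stay in the manual queue, so running such a lookup first can only reduce the hand work, never replace a more specific description.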

 

 

>>XXXX and all,

>> 

>>These are good questions. I'll put the answers right below each one.

>> 

>>>1. When I georeference rivers, should I take coordinates of the source or

>>>the drainage of the river? How much should extent of the river be?

>> 

>>The coordinates should be at the geographic center of the river, on the

>>river itself. The extent should be the distance to the furthest reach of

>>the river in either direction.

>> 

>>>2. An example: specific locality is "Brooks Range, Anaktiktoot", where

>>>Anaktiktoot is not on the map. Should I georeference for Brooks Range

>>>(which will be more than 600 miles in length)? There are many cases that

>>>higher geography is followed by unknown specific locality.

>> 

>>You should go ahead and put coordinates on the vague localities, even

>>though the maximum_error_distance will be large. Some of the higher

>>geographies that have no value or "no specific locality" in the locality

>>field can still be specific, such as islands.

>> 

>>>3. Related to my question 2: how much is too big to georeference? In many

>>>cases, only the name of the island, mountains, peninsula etc. are

>>>provided.

>> 

>>Do them all. The maximum_error_distance will be useful even if it is large.

>> 

>>John

> 

 

>>> Posting number 194, dated 2 Apr 2002 21:23:13

Date:         Tue, 2 Apr 2002 21:23:13 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Barbara R. Stein" <bstein@OZ.NET>

Subject:      Re: MaNIS questions

MIME-Version: 1.0

Content-Type: multipart/alternative;

              boundary="------------8D04441FBD1587A8D66E30D2"

 

 

Dear XXX et al.,

 

From the outset, this project has proceeded, and proceeded successfully,

because we have all been "on the same page."  Your email (see below) provides

an opportunity to reiterate what we said we were going to do, what we intend

to do, and exactly why we are doing it as stated.

 

John and I (particularly John) are extremely grateful to those of you who have

immersed yourselves in the intricacies of georeferencing and have been willing

to share your thoughts and insights with the list.  However, such discussions

in and of themselves have not added to the work load that was initially

budgeted or funded.  Quite to the contrary, both the "Coordinate

Georeferencing Activities" and "Implement Specimen Data Model" sections in the

MaNIS Project Description described providing georeferencing metadata as well

as the coordinates.  And we stated emphatically,

 

"Well-documented, georeferenced collecting events are crucial to biogeographic

data...."

 

This is exactly what we are doing.

 

The error calculator and spreadsheet templates that John provided make the

addition of metadata such as lat/long error a relatively trivial exercise and

one that should not be confused with the discussion of such topics on this

list.  Several individuals have chosen to probe that tool more closely and we

have all benefited from their interest and experimentation.  Their comments

have enhanced our understanding of the process and the resulting data, and

improved the tool, but they have not created more work.

 

Where confusion may have arisen is in the following:

 

> And indeed (in light of the above

> quote) we do not have a capacity to absorb such information into our

> present databases, let alone deciding how much time we have to care about

> this.  Seeing the impact of the request for so much attention to error

> estimates, I find it hard to support so much allocation of additional time

> to this effort.

 

It is not your job to incorporate such information into your present databases

and we apologize for any confusion that John might have engendered in his

previous email.  This is a topic we will be discussing at our meeting at ASM

in June but perhaps it is worth clarifying now what John was intimating when

he made reference to this issue.

 

Think of your current dbms in two parts, the databases themselves and the

interfaces you now use to input, query and display those data in-house.  For

most of you, neither your databases nor your interfaces are currently designed

to handle any new fields (e.g., lat/long error).  However, we are expending a

great deal of time and effort to collect such data and want to make them

available to researchers.  Whereas it is a fairly tricky task (given

constraints of time and budget) to modify each of your interfaces to add new

fields, it is relatively easy to add those fields to your current databases

and migrate the data directly to the MaNIS servers along with your specimen

data.  This will happen when John writes the  migration scripts for each of

your institutions.  Hence, the data will be displayed over the network and

available to you without impacting your current set-ups in-house.  In raising

this issue, he was merely letting you know that we are, in fact, moving ahead

and beginning to work on the next step of the project, creating the migration

scripts and software that will make the network function.

 

> I have witnessed, over the years, many publications based on massive

> datasets in which the authors were not able to document (or even care)

> about variance in the quality and accuracy of the data.  Typically, they

> just put on their blinders and accepted all the "AVAILABLE" data.  This is

> just an inherent problem for those who move up the scale (allometric

> analyses, macroecology, or whatever), and at such LARGE scales of analyses

> they usually say that small local errors become insignificant, because of

> the LARGE SCALE of the overall analysis.

 

Here I will part company with XXX and argue that it is our intention to do

better than what has always been done or has been done previously.  Neither

John nor I see this "inherent problem," particularly with the advent of

increased computing technology.  I participated in one of the planning

workshops for NEON (National Ecological Observatories Network) two years ago

and I can state unequivocally that the standard is changing/has changed.  The

kinds of publications to which XXX refers will no longer be acceptable (if

they even are at this time) because it is possible to document variance in

quality and accuracy of data, even for extremely large datasets.  Furthermore,

we believe we have designed the georeferencing protocol to do just that,

with relatively little overhead and impact to the participating institutions.

 

At this point everyone has at least begun the georeferencing process and from

what we can gather, once initial inertia is overcome, things actually progress

quite smoothly and quickly.  I may be premature in saying so, but it is our

hope that MVZ will have completed georeferencing the ca. 40,000+ localities

for California in the next two months.  How have we done this?  I would remind

each of you that our first priority is to provide georeferenced data to those

localities in our collections that currently have none!  It is not to add

error to localities that already have lat/long coordinates assigned to them,

it is not to verify already georeferenced localities, and it is not to clean

up locality descriptions.  Our budget figures were based on the number of

unique localities in our collections that lacked lat/long coordinates of any

sort.  I would also add that, while we cannot dictate whom you hire to do

georeferencing, your money will go lots farther if you hire undergraduates,

and it will go farthest if you hire work-study students.

 

We have all taken the first giant step.  What is needed now is to just keep

putting one foot in front of the other.  I guarantee you will amaze

yourselves.

 

Best,

Barbara

 

 

 

>>> Posting number 195, dated 3 Apr 2002 08:38:21

 

>>> Posting number 196, dated 3 Apr 2002 10:52:59

Date:         Wed, 3 Apr 2002 10:52:59 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Contemporary informatics science, etc.

In-Reply-To:  <3CAA91C1.79E00185@oz.net>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Barbara et al.,

I appreciate the comments and forum that exist among our MaNIS group, and

I thank Barbara for her most recent.  I also agree that the developing

field of informatics is helping us to raise the bar on scientific

standards in general, and I don't wish my comments to be taken as an

endorsement of the crudeness of broad synthetic work done in the past

(without error estimates).  I also realize that for the many data fields

that we have entered into our XXXX mammal database (other than lat/long)

we will probably continue without error estimates for some time to come.

On the other hand we can only await the further development of these kinds

of massive data management projects in the future, assuming that financial

resources will remain available for this kind of thing.  It will be great

if we can be surprised by continued improvements in the overall quality of

the data that stand behind the specimens we hold in our collections.  I

obviously remain committed to assuring that we get our job done on this

current project.

XXX

 

 

 

>>> Posting number 197, dated 5 Apr 2002 15:49:39

Date:         Fri, 5 Apr 2002 15:49:39 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear all,

 

In late February when I was fixing my mistake with the UWBM Lat_Longs I

mentioned that I would be reloading ROM data at some time as well. That

time has come. The new ROM data have now been loaded into the gazetteer.

What does this mean for you?  If you haven't begun georeferencing yet

(though as far as I know, everyone has), you just need to download your

localities again and proceed as described in the Georeferencing Steps

document ( http://dlp.CS.Berkeley.EDU/manis/GeorefSteps.html ). If you have

downloaded localities and started georeferencing them, first you need to

remove any ROM records from the set. Next make another query in the MaNIS

gazetteer just like the original query that gave you the records you are

working on, but this time pick ROM in the Institution box on the MaNIS

Gazetteer page to get only ROM records for that combination of higher

geography. Download these ROM records and append them to the end of the

file you are working on.

 

Sorry for this inconvenience. I'm pretty sure I've got everything correct

now and that this kind of thing won't happen any more. So, everyone,

proceed with confidence.

 

My next undertaking will be to write the documentation for a new Calculator

that can calculate not only errors, but also coordinates. This calculator

will be VERY similar to the Error Calculator, so there won't be much new to

learn. The new calculator has already been tested; the results agree with

those given by Gary Shugart's Excel tool for the same localities. This is

good. I'll announce the new calculator as soon as I've posted the manual

for it, which should be next Friday or so after I return from San Diego.

 

Happy georeferencing!

 

John W

 

>>> Posting number 198, dated 5 Apr 2002 17:47:21

 

>>> Posting number 199, dated 15 Apr 2002 16:52:52

Date:         Mon, 15 Apr 2002 16:52:52 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      GNIS Website Gazeteering

 

Hello All,

 

I am a recent addition to the group, and I have thrown myself headlong into

the midst, hopefully well.

 

That said, I do have a question about a source.  I am using the USGS GNIS

website http://geonames.usgs.gov/pls/gnis/web_query.gnis_web_query_form and

I was wondering what, if any, experiences have been had.  Specifically, if

I read it correctly, it is a database of information culled for the USGS

maps.  I am just unsure of a few things...:

 

First, datum, scale, and other info.  The site refers to "7.5' by 7.5'

Map"; what other data can be culled just from that?

 

Second, it at times gives coordinates from multiple maps that are slightly

different.  How do I reconcile these variances?  Do I give my own best

combination, or has a process been agreed upon that I have missed in going

through the past postings?

 

Thanks, and greetings to you all.

 

>>> Posting number 200, dated 15 Apr 2002 15:45:02

Date:         Mon, 15 Apr 2002 15:45:02 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      GNIS Info

MIME-Version: 1.0

Content-Type: multipart/alternative;

              boundary="----=_NextPart_000_0088_01C1E494.81AC73E0"

 


 

XXXXXX

 

I too am new at working on this MaNIS project, just started this week.

Anyways, I had the exact same question as you and talked to John

Wieczorek this morning at the Museum of Vertebrate Zoology in Berkeley,

CA.  He said that using the GNIS data is fine even though the

source of the database is not from one place.  These are the "givens"

for GNIS use with the "Error Calculator":

 

1)  Coordinate System: decimal degrees

2)  Coordinate Source: USGS map 1:25,000

3)  Datum: NAD27 (North American Datum 1927)

 

Make sure that you fill out the "Extent of Named Place Field" as much as

possible each time.  If anyone from this board has other suggestions, I

would be glad to hear them.

 

Is anyone else converting the GNIS database to a shape file to be used

in ArcView to calculate distances?  If there are a lot of you, I will

start posting ArcView questions pertaining to this project here.

Thanks.

 

XXXXXXX

 

>>> Posting number 201, dated 16 Apr 2002 09:49:31

Date:         Tue, 16 Apr 2002 09:49:31 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         

Subject:      Re: GNIS Info

MIME-Version: 1.0

Content-Type: multipart/alternative;

              boundary="----=_NextPart_000_0022_01C1E52C.020D1CA0"

 


 

XXXXX--

 

Thanks for the quick reply; it was very helpful.

 

I am always interested in other people's experiences with ArcView.

 

 

John Wieczorek & Group--

 

I still am on the fence with the locations that give me two or more

different georeferencing points.  I have the feeling that, as they are

both "legitimate" sources (different USGS maps), I can just choose

one, and indicate in the proper field in the database which I chose.

Does this seem acceptable/appropriate?

 

Thanks

 

XXXXX

 


 

 

 

 

>>> Posting number 202, dated 16 Apr 2002 10:04:18

Date:         Tue, 16 Apr 2002 10:04:18 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: GNIS Info

In-Reply-To:  <002501c1e555$eb1a1320$b16f0a0a@fmnh.org>

Mime-Version: 1.0

Content-Type: multipart/alternative;

              boundary="=====================_12236688==_.ALT"

 


 

XXXX and others

Having already georeferenced thousands of South American localities, I find this an

important and poorly understood question.  My strong conviction is that simply

picking a point arbitrarily is apt to prove more misleading than leaving the

point undetermined.  If there are 28 "San Martin"s in Peru, for example, and

there is no additional information for specifying this (e.g., compiling an

expedition itinerary, locations of field activities immediately beforehand and

afterwards, and (rarely) the distributions of animals themselves), then

guessing--and being explicit about your guesses--can only be misleading.

 

Following this strategy with the Field Museum's 2300 locality records from Peru

led me to leave 14% of the localities unspecified.  However, I am confident

that the remaining 86% came from where they plot.

 

I would be interested in hearing the experiences of others and the druthers of

curators/collection managers on the data fidelity (vs accuracy) question.

Clearly, we need to embrace a community-wide standard.

XXXXX

 

 

 

>>> Posting number 203, dated 16 Apr 2002 09:11:41

Date:         Tue, 16 Apr 2002 09:11:41 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: GNIS Info

In-Reply-To:  <4.1.20020416095834.00a94a90@mail.fmnh.org>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

I agree wholeheartedly with XXXXX. If there is ambiguity in terms of a

multitude of potential named places for a given locality we should NOT

georeference it, but give the reason ("ambiguous" or "multiple possible

places" or something like that) in the NoGeorefBecause field. It may be

that some of these localities can be resolved by the host institution by

looking in field notes and the like. However, that's a time-consuming

activity and we should leave that until after the coordinates get

redistributed.

For the record, the other type of locality we should NOT georeference is

one that is in question (e.g., "Bakersfield?"). For these, put something

like "locality questionable" in the NoGeorefBecause field. The reason for

filling out the NoGeorefBecause field is so that the host institution knows

that someone actually looked at the locality. You wouldn't otherwise know

this if the Lat and Long were just blank. While reviewing, I might as well

remind everyone to make use of the Remarks field to alert host institutions

of likely errors such as misspellings as well as unusual assumptions that

were made in the course of the coordinate determination.
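
The rule above (a reviewed locality carries either coordinates or a stated reason, never a silent blank) can be sketched as a small check. This is only an illustration and not part of the MaNIS tools: the field names NoGeorefBecause and Remarks come from this discussion, while the Python names `Determination` and `was_reviewed` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Determination:
    """One georeferencing result for a locality (illustrative structure only)."""
    locality: str
    lat: Optional[float] = None
    long: Optional[float] = None
    no_georef_because: str = ""  # mirrors the NoGeorefBecause field
    remarks: str = ""            # mirrors the Remarks field


def was_reviewed(d: Determination) -> bool:
    """True if the record shows that someone looked at it: it has both
    coordinates, or it states why it was not georeferenced."""
    has_coords = d.lat is not None and d.long is not None
    return has_coords or bool(d.no_georef_because.strip())


# A questionable locality with a reason given, and one left entirely blank.
questionable = Determination("Bakersfield?", no_georef_because="locality questionable")
untouched = Determination("Bakersfield")

print(was_reviewed(questionable), was_reviewed(untouched))
```

A host institution could run a check of this kind over returned records to separate localities that were examined and set aside from those nobody has looked at yet.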

 

It's nice to see the list serving its purpose. Thanks for the questions and

responses!

 

John W

 

 

 

 

>>> Posting number 204, dated 16 Apr 2002 11:21:06

Date:         Tue, 16 Apr 2002 11:21:06 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         

Subject:      Re: GNIS Info

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: 7bit

 

All,

 

I believe I am not being as clear in my situation as I thought.  Here is an

example.

 

Aurora, Illinois, is a city/town that spreads across multiple counties, and

is on 3 different USGS maps, according to the query form results I received.

As I understand it, the information on the site comes about in the same

manner as if I had all of these maps myself, and were picking the point, and

best approximating the lat & long according to those lines given on the map.

But, as there are 3 different maps, 3 slightly different sets of coordinates are

arrived at:  41 45 38 N, 88 19 12 W; 41 44 45 N, 88 18 31 W; & 41 45 45 N, 88

22 45 W to be exact.  Each of these is agreed (by the consent to use the

information from the site at all) to be from a reliable source; as in, if there

were only one, there would be NO problem.  So, my question is, can I "pick"

one, and then indicate which map it was taken from??  Please use the query

page to see what I mean

http://geonames.usgs.gov/pls/gnis/web_query.gnis_web_query_form
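
To compare the three coordinate sets quoted above, it helps to put them into decimal degrees first. The following is a minimal sketch of that conversion; the function name `dms_to_decimal` is hypothetical and this is not the MaNIS Calculator's code.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter
    to signed decimal degrees (south and west are negative)."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if hemisphere in ("S", "W") else value


# The three coordinate sets reported for Aurora, Illinois.
aurora_points = [
    ((41, 45, 38, "N"), (88, 19, 12, "W")),
    ((41, 44, 45, "N"), (88, 18, 31, "W")),
    ((41, 45, 45, "N"), (88, 22, 45, "W")),
]

aurora_decimal = [
    (dms_to_decimal(*lat), dms_to_decimal(*lon)) for lat, lon in aurora_points
]

for lat, lon in aurora_decimal:
    print(f"{lat:.5f}, {lon:.5f}")
```

The conversion changes only the notation; which of the three points (if any) to use remains the open question.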

 

If I am beating a dead horse, please let me know, but this is a quite

specific question.

 

Thank you all.

 

 

>>> Posting number 205, dated 16 Apr 2002 10:46:47

Date:         Tue, 16 Apr 2002 10:46:47 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         

Subject:      Re: GNIS Info

Comments: To:

In-Reply-To:  <008f01c1e562$b62135b0$b16f0a0a@fmnh.org>

MIME-Version: 1.0

Content-Type: text/plain; charset=ISO-8859-1

Content-Transfer-Encoding: 8bit

 

XXXX

 

Rather than using the GNIS query form, you can open the GNIS gazetteer for IL:

 

ftp://mapping.usgs.gov/pub/gnis/IL_deci

 

This is an alphabetical listing of named places.  There is one entry

for "Aurora", followed by specific locations (Aurora city hall, etc).  For

georeferencing Utah localities, we are using the GNIS UT gazetteer of named

places and digitized 1:24,000 maps from the National Geographic TOPO! series

which provide very accurate readings.  So far (for Utah) this combination works

very well and gazetteer and maps are in very close agreement.

 

XXXX

 

 

 

 

>>> Posting number 206, dated 16 Apr 2002 11:36:27

Date:         Tue, 16 Apr 2002 11:36:27 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: GNIS Info

In-Reply-To:  <1018975607.3cbc5577d6871@bluebird.umnh.utah.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX,

 

I'm sorry I didn't address your specific question. Your example is an

interesting and uncommon one. The populated place is actually given three

different sets of coordinates - one for each of the three 7.5' maps on

which it appears. In this particular case I would actually choose the

coordinates of the city hall (41 45' 24" N, 88 18' 52" W) as an

unambiguous solution. Make sure to comment to that effect in the Locality

Remarks, and be sure to use the distance from the city hall to the furthest

edge of town as the extent of Aurora in the error calculations.
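
As a rough illustration of the distances involved, the sketch below measures how far each of the three GNIS coordinate sets lies from the city hall point. It assumes a spherical Earth and uses the haversine formula; the helper name `haversine_km` is hypothetical, and this is not how the MaNIS Calculator works internally.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# City hall: 41 45' 24" N, 88 18' 52" W, converted to decimal degrees.
city_hall = (41 + 45 / 60 + 24 / 3600, -(88 + 18 / 60 + 52 / 3600))

# The three GNIS coordinate sets reported for Aurora, in decimal degrees.
gnis_points = [
    (41 + 45 / 60 + 38 / 3600, -(88 + 19 / 60 + 12 / 3600)),
    (41 + 44 / 60 + 45 / 3600, -(88 + 18 / 60 + 31 / 3600)),
    (41 + 45 / 60 + 45 / 3600, -(88 + 22 / 60 + 45 / 3600)),
]

distances_km = [haversine_km(*city_hall, *p) for p in gnis_points]
for d in distances_km:
    print(f"{d:.2f} km from city hall")
```

These distances only show how far apart the listed points are. The extent itself still has to be measured on a map, from the city hall to the furthest edge of town.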

 

John

 

 

 

>>> Posting number 207, dated 16 Apr 2002 03:01:04

Date:         Tue, 16 Apr 2002 03:01:04 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      related thoughts on GNIS data- hierarchy of placename types

In-Reply-To:  <008f01c1e562$b62135b0$b16f0a0a@fmnh.org>

Mime-Version: 1.0

Content-Type: multipart/alternative;

              boundary="============_-1193171221==_ma============"

 


 

MaNIS: I have been using the GNIS download for Oregon

(http://geonames.usgs.gov/gnisftp.html) to get decimal lat/longs for

placenames. After getting a placename and the lat/long, the lat/long

for the record is calculated based on the offsets from the placename.

In looking up placenames, I typically find that there are multiple

possibilities for sites such as Hood River that could be a populated

place (ppl), post office (po), river, county, locale, etc.  Another

example I was just trying to estimate extent on was Agate Beach -

ppl, beach, or po?  I picked ppl.  This ambiguity seems to be

characteristic of localities such as ppls, pos, crossroads, locales

that were named after some feature of the landscape.  It is rare to

have a record that unambiguously states what is referenced (e.g.,

Hood River, town of).   The point is that a SpecLocality with just a

name often will be ambiguous and the ambiguity increases the more you

look and the better your reference dataset because you find more

possibilities.  I've been going with a hierarchy of ppl, po, locale,

then if these don't exist, whatever else looks good.  My thinking is

that a collector would have used a po, ppl, or locale as a first

choice.  As long as I provide a reference to what I did, there should

be no problem.  The reference will include the Placename, placename

type, county, and placename lat/long plus the other data needed to

calculate lat/long and error.  The contributing institution can

accept or reject the lat/long.
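
Calculating a record's lat/long from a placename plus an offset can be sketched as below. This is an assumption-laden illustration, not the method used by any MaNIS tool: it treats the Earth as a sphere, the function name `offset_point` is hypothetical, and the starting coordinates are made up for the example rather than taken from GNIS.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation
KM_PER_MILE = 1.609344


def offset_point(lat, lon, distance_km, bearing_deg):
    """Return (lat, lon) reached by traveling distance_km from the start point
    along an initial bearing (degrees clockwise from north) on a sphere."""
    phi1 = math.radians(lat)
    lam1 = math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_km / EARTH_RADIUS_KM  # angular distance in radians

    sin_phi2 = (math.sin(phi1) * math.cos(delta)
                + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    phi2 = math.asin(sin_phi2)
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * sin_phi2,
    )
    return math.degrees(phi2), math.degrees(lam2)


# Hypothetical named place at 45.0 N, 121.5 W, with a locality "2 mi N" of it.
new_lat, new_lon = offset_point(45.0, -121.5, 2 * KM_PER_MILE, 0.0)
print(f"{new_lat:.5f}, {new_lon:.5f}")
```

The arithmetic is the easy part. Which feature the placename refers to, and therefore which starting point to use, is the part that carries the uncertainty discussed here.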

 

 

 

 

>>> Posting number 208, dated 17 Apr 2002 18:19:56

Date:         Wed, 17 Apr 2002 18:19:56 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: related thoughts on GNIS data- hierarchy of placename types

In-Reply-To:  <p05100301b8e1716dad3e@[207.207.104.113]>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX, and all,

 

I disagree with the methods described below on three counts. The most

important of my reasons is that some of the determinations using this

methodology will simply be wrong, and there won't be any way to know if

they are wrong even with the Locality Remarks. My second reason is that

it will be difficult to find these localities among the data in order to

filter them out from analyses for which they aren't appropriate. Finally,

it seems to me we gain no benefit from actually having coordinates for

ambiguous localities, especially given that they take up precious time to

georeference. By ambiguous I mean that there is more than one possible

distinct place to which the locality may refer. By distinct I mean that the

maximum error circles do not overlap.
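
The two definitions above translate directly into a test. The sketch below is one possible reading, assuming each candidate place is a point with a maximum error distance and that distances are measured on a sphere; the function names and the sample candidates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def are_distinct(place_a, place_b):
    """Each place is (lat, lon, max_error_km). Two places are distinct
    when their maximum error circles do not overlap."""
    lat_a, lon_a, err_a = place_a
    lat_b, lon_b, err_b = place_b
    return haversine_km(lat_a, lon_a, lat_b, lon_b) > err_a + err_b


def is_ambiguous(candidates):
    """True if at least two candidate places are distinct from one another."""
    return any(
        are_distinct(a, b)
        for i, a in enumerate(candidates)
        for b in candidates[i + 1:]
    )


# Hypothetical candidates: two far apart, and two close together.
far_apart = [(45.0, -121.5, 5.0), (46.0, -121.5, 5.0)]
close_together = [(45.0, -121.5, 2.0), (45.01, -121.5, 2.0)]

print(is_ambiguous(far_apart), is_ambiguous(close_together))
```

Under this reading a locality with several candidate places is only ambiguous when at least one pair of candidates cannot be covered by overlapping error circles.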

 

One could, sometimes appropriately, choose the geographic center of

multiple possible places for the coordinates of a locality and have the

extent cover all of them. In this case the error would be larger than it

would be by choosing any one of the possible localities, but the

determination (coordinates plus maximum error distance) would not be wrong.

I would use this method sparingly, however, given that it does take quite a

bit of time to make a determination of this kind. The method is most

appropriate when the distances between the possible places are relatively

small so that the maximum error distance itself remains small. I have no

objection to this approach, but I would argue vehemently to avoid

determinations that are likely to be wrong.
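
The "center of multiple possible places" approach can be sketched as follows. Several things here are assumptions: the mean of the coordinates stands in for the geographic center (reasonable only for places close together), distances are measured on a sphere, the function names are hypothetical, and the extents of 0.0 are placeholders. The three points are the Aurora coordinates quoted earlier in this thread, reused only because they are the coordinates at hand.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def enclosing_determination(candidates):
    """candidates: list of (lat, lon, extent_km).
    Returns (lat, lon, radius_km): the mean position, and a radius that
    reaches the far edge of every candidate place."""
    lat_c = sum(c[0] for c in candidates) / len(candidates)
    lon_c = sum(c[1] for c in candidates) / len(candidates)
    radius = max(
        haversine_km(lat_c, lon_c, lat, lon) + extent
        for lat, lon, extent in candidates
    )
    return lat_c, lon_c, radius


# Aurora's three GNIS points in decimal degrees; extents are placeholders.
aurora = [
    (41.760556, -88.320000, 0.0),
    (41.745833, -88.308611, 0.0),
    (41.762500, -88.379167, 0.0),
]

center_lat, center_lon, radius_km = enclosing_determination(aurora)
print(f"{center_lat:.5f}, {center_lon:.5f}, radius {radius_km:.2f} km")
```

The resulting radius would feed into the error calculation as the extent, which is why the approach only pays off when the candidate places are close to one another.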

 

John W

 


 

 

>>> Posting number 209, dated 19 Apr 2002 10:49:10

Date:         Fri, 19 Apr 2002 10:49:10 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      New Calculator is ready

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

I've finished upgrading the Georeferencing Calculator to calculate not only

Errors, but also Coordinates with errors. Links throughout the MaNIS

website now point to this Calculator instead of the Error Calculator. A

link to the manual for the new Calculator can be found on the MaNIS

Documents page at the following URL:

 

http://dlp.cs.berkeley.edu/manis/Documents.html

 

This new Calculator will look familiar to anyone who has used its

predecessor, but please be sure to read the manual to be sure you

understand how it differs.

 

John W

 

>>> Posting number 210, dated 19 Apr 2002 15:29:58

Date:         Fri, 19 Apr 2002 15:29:58 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: New Calculator is ready

 

John, and all--

 

I have an issue regarding the Calculator, and the OrigCoordSystem heading.

When entering data with a distance component (e.g. Alton, 2 mi N), if the

coordinates of the named place have been determined as and are entered in

deg min sec format, and the calculator gives out decimal degrees, what

should be entered in as the OrigCoordSystem?  Is it decimal degrees, since

the ultimate determination of the coords by the calculator is such, or is

it d/m/s since that is what the actual, e.g. gazetteer, data is from?  More

simply, maybe, can the OrigCoordSystem say "deg min sec", but the actual

data be given in decimal degrees?  Hope that is clear.

 

Thanks

 

>>> Posting number 211, dated 19 Apr 2002 14:06:08

Date:         Fri, 19 Apr 2002 14:06:08 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: New Calculator is ready

In-Reply-To:  <MAMMAL-Z-NET%2002041915295851@USOBI.ORG>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX,

 

The quick answer is to use the value given in the long line of blue

tab-delimited data at the bottom of the new calculator after you click on the Calculate

button. We want to record the coordinate system from which the

determination was made, which is to say, the one upon which the coordinate

precision was based.

 

John W

 

 

 

>>> Posting number 212, dated 22 Apr 2002 00:48:24

Date:         Mon, 22 Apr 2002 00:48:24 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         

Subject:      SpecLocality placenames, simply wrong vs probably right

In-Reply-To:  <5.0.0.25.2.20020417173410.02732420@socrates.berkeley.edu>

Mime-Version: 1.0

Content-Type: multipart/alternative;

              boundary="============_-1192660781==_ma============"

 


 

John:  It wasn't clear what methods or methodology you disagree with.

Regarding assumptions, the comment that  "some of the determinations

using this methodology will simply be wrong" seems overly optimistic.

We will never know "simply" if many localities are wrong or right.

It is more productive to think in probabilities.  We have to make

assumptions regarding what SpecLocality values represent given that

we are georeferencing other institutions' records and formats

without consulting primary data.  So I have to assume, as does

everyone else, that entries like "Hood River" from different

collectors and institutions probably refer to the same locality and

the same type of locality (city rather than river).  For determining

lat/longs and errors, we also assume that the lat/longs and boundaries

of population centers like Hood River probably have not shifted or

expanded/contracted significantly over the years.   These basic

assumptions make determinations probabilities rather than right or

wrong.

 

Another possibility is that the comment "this methodology will simply

be wrong" refers to calculating lat/longs based on a placename and

offsets?  Formal release of the web lat/long calculator validates

this methodology.  But, regardless of the methodology the assumptions

are the same.

 

A third possibility is that perhaps you were referring to the

reference string that Hood River, the ppl, was georeferenced but that

there are other possibilities? Many placenames with a landscape

feature in the name (e.g., river, falls, beach, spring) will be in

the category of at least two types of locality, so for these I am

looking (a few keystrokes, so little time is wasted) and I am

annotating as I find them with a standard "could also be ...".  About

10% of the Oregon records are in this category.  If you don't want

them georeferenced let me know.

 

 

 


 

 

 

>>> Posting number 213, dated 22 Apr 2002 14:56:54

Date:         Mon, 22 Apr 2002 14:56:54 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: SpecLocality placenames, simply wrong vs probably right

MIME-Version: 1.0

Content-Type: multipart/alternative;

              boundary="----=_NextPart_000_0048_01C1EA0D.F12B0F50"

 


 

All--

 

Does anyone know of any resource(s) that has/have listed named

cities/towns and their georef. boundaries?  This would definitely help

greatly in cases where a number of smaller towns are concentrated in a

small place without obvious boundaries.

 

 

 

>>> Posting number 214, dated 29 Apr 2002 11:24:34

 

>>> Posting number 215, dated 29 Apr 2002 11:29:13

 

>>> Posting number 216, dated 29 Apr 2002 09:30:15

 

>>> Posting number 217, dated 29 Apr 2002 09:39:26

 

>>> Posting number 218, dated 29 Apr 2002 09:49:47

 

>>> Posting number 219, dated 1 May 2002 13:10:17

Date:         Wed, 1 May 2002 13:10:17 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Localitly changed locations...

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding:  quoted-printable

 

All--

 

I have an interesting dilemma, and would like to get thoughts, comments.

 

I have a list of different versions of the same locality:  Chicago,

FMNH; FMNH Boiler Room; Field Museum; Field Museum Building, etc. for at

least nine "unique" localities.  The problem lies in the fact that both

the name and location of the Field Museum have changed over the course of

years.  According to our records, the specimens collected in the

locality named "Field Museum" span from 1907 to 1999.  But in 1921 the

museum relocated, which was not taken into account.  Similarly for

"Field Museum Building".  Specimens from 1921 onwards would have one set

of coordinates, while those prior to '21 would have an entirely different

set (locations are 10 km apart, by air).

 

So--is this just a cleanup issue to be taken up by the Museum itself, or

can it be addressed here?

XXXXXX

 

 

>>> Posting number 220, dated 1 May 2002 17:39:09

Date:         Wed, 1 May 2002 17:39:09 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Localitly changed locations...

In-Reply-To:  <007101c1f13b$72e26a50$b16f0a0a@fmnh.org>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX, and all,

 

This is a very good question. There are a few possible ways to proceed. I

have recorded the Verbatim Collecting Dates from the information that was

sent to me by each institution. I didn't put that information into the

gazetteer, but I assembled it for just this kind of issue. If you send me a

list of distinct LocalityIDs I can send back the dates associated with each

one. After that it might get complicated, but we might do that much and see

how it goes.

 

Alternatively, it may be worth investigating if the localities in question

refer to specimens that were captive. If so, the localities need not be

georeferenced and the reason (in the NoGeorefBecause field) could be set to

"captive."

 

Another solution is to defer the georeferencing of these localities to the

individual institutions. In this case, you might want to put "locality is

time dependent" or something to that effect, in the NoGeorefBecause field.

 

John
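
A minimal sketch of how a time-dependent locality such as the one in the question quoted below might be split by collecting year. The 1921 cutoff comes from the thread; the record layout, function names, and site labels are assumptions made for illustration, and no coordinates are given because the thread does not state them.

```python
# Hypothetical sketch: group specimen records by the museum site implied
# by the collecting year. Only the 1921 cutoff is taken from the thread;
# field names and labels are illustrative assumptions.
RELOCATION_YEAR = 1921

def site_for_year(year):
    """Return which museum site a collecting year corresponds to."""
    if year is None:
        return "undetermined"      # no date, so the site cannot be resolved
    if year >= RELOCATION_YEAR:
        return "post-1921 site"    # "1921 onwards" in the original posting
    return "pre-1921 site"

def split_by_site(records):
    """Group (locality_id, year) pairs by the site implied by the year."""
    groups = {}
    for locality_id, year in records:
        groups.setdefault(site_for_year(year), []).append(locality_id)
    return groups

records = [("A", 1907), ("B", 1921), ("C", 1999), ("D", None)]
print(split_by_site(records))
```

Records that land in the "undetermined" group are the ones for which a note such as "locality is time dependent" in the NoGeorefBecause field would still apply.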

 

At 01:10 PM 5/1/02 -0500, you wrote:

>All--

> 

>I have an interesting dilemma, and would like to get thoughts, comments.

> 

>I have a list of different versions of the same locality:  Chicago, FMNH;

>FMNH Boiler Room; Field Museum; Field Museum Building, etc. for at least

>nine "unique" localities.  The problem lies in the fact that both the name

>and location of the Field Museum has changed over the course of

>years.  According to our records, the specimens collected in the locality

>named "Field Museum" span from 1907 to 1999.  But in 1921 the museum

>relocated, which was not taken into account.  Similarly for "Field Museum

>Building".  Specimens from 1921 onwards would have one set of coordinates,

>while the prior to '21 would be an entirely different set (locations are

>10 km apart, by air).

> 

>So--is this just a cleanup issue to be taken up by the Museum itself, or

>can it be addressed here?

>XXXX

 

 

>>> Posting number 221, dated 1 May 2002 18:31:05

Date:         Wed, 1 May 2002 18:31:05 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Thanks and Can O' Worms??

In-Reply-To:  <3.0.32.20020501154229.006d7aa0@pilot.msu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear XXXXX, and all,

 

>Hi John,

>Okay - thanks for the comments on the coordinates, and sorry for the FTP

>address mistake.  Robin and I will make sure to send future files to

>"incoming/mvz/manis" instead of what we have been doing (our most recent

>printout of the steps - and step 8 -  didn't have this information).

 

Not a problem - that's just the best I could come up with for a criticism.

 

>Also, I have been thinking about named places, their extents, possible

>future issues with such information, and how I am now recording info in the

>"NamedPlace" field.

> 

>I have some questions for you (sorry if this opens a can of worms).

> 

>1)  Should there be a format for listing named places, such as Beaumont

>Tower, MSU Campus?  I guess I could just call the place "Beaumont Tower",

>but currently I have listed it as "Beaumont Tower, Michigan State

>University".  I then wondered if I should have put "Michigan State

>University" first, so if these named places are alphabetically listed

>somewhere as part of our project, then all of the "MSU" items would appear

>together.  The same applies to TRS examples.  I often list the TRS

>information in the NamedPlace field as it is given in the gazetteer -

>sometimes these begin with 1/4 of 1/4 of a section, and sometimes they

>begin with the "T" information.  Should there be a format for these as

>well?  (I know I really need to get a life).  What do you think?

 

Good questions. The bottom line, I think, is that the Named Place itself

must be uniquely identifiable. For example, in "Beaumont Tower, Michigan

State University," no one is going to confuse that with any other Beaumont

Tower. In terms of which should come first, I think it will be slightly

more useful to put the less specific part of the named place before the

more specific for the reasons you stated, thus, "Michigan State University,

Beaumont Tower" would be the preferable entry.

 

As for TRS data, we will not likely ever use those to make a gazetteer,

since there are already tools to extract coordinates from TRS. Because of

this, it is probably sufficient to simply record "TRS" as the named place

rather than to copy the TRS data into the Named Place field. Nevertheless,

if the TRS data are recorded in a consistent manner in the Named Place

field, the coordinates could be checked (or even assigned)

programmatically. The easiest TRS format to parse programmatically would be

something like "T7N R13E S17 NW1/4 of SE1/4."  In my example, it doesn't

matter if you have extra white space between any of the letters and their

adjacent numbers, but it would be helpful to keep the same order as well as

the word "of."

 

>2)  For offsets from a named place, should what I enter in the NamedPlace

>field reflect the direction of the offset?  For example, if I have the

>following localities:  Mason, 2 miles west of Mason, and 1 mile NW of

>Mason, I will be recording the following respective extents for Mason:

>greatest extent; extent to the west; and the northwestern extent.

>Currently I am listing Mason as the NamedPlace for all of these records,

>even though the extents are different.  Should I be entering a name to

>reflect the offset direction (e.g. Mason, extent to the west)?

 

Interesting point. I think it is best to record just the name of the place

without reference to offsets. The named place is intended to show the

starting point for a coordinate calculation, but I think it is asking too

much to include the direction information. Think of how horrible it would

be for orthogonal offsets (e.g., "2 mi N and 3 mi E of Mason").

 

John
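
The TRS format suggested above ("T7N R13E S17 NW1/4 of SE1/4") lends itself to a small parser. The following is only a sketch under stated assumptions: the pattern, the function name parse_trs, and the returned dictionary keys are hypothetical, it handles quarter sections only (no halves), and it does not compute coordinates.

```python
import re

# Township, range, and section, with optional white space between each
# letter and its adjacent number, followed by optional quarter sections.
TRS_PATTERN = re.compile(
    r"""^\s*
    T\s*(?P<township>\d+)\s*(?P<ns>[NS])\s+
    R\s*(?P<range>\d+)\s*(?P<ew>[EW])\s+
    S\s*(?P<section>\d+)
    (?:\s+(?P<quarters>.+?))?
    \s*$""",
    re.VERBOSE | re.IGNORECASE,
)
QUARTER = re.compile(r"^(NE|NW|SE|SW)\s*1/4$", re.IGNORECASE)

def parse_trs(text):
    """Parse a TRS string into its parts, or return None if it does not fit."""
    m = TRS_PATTERN.match(text)
    if m is None:
        return None
    quarters = []
    if m.group("quarters"):
        # Quarter sections are separated by the word "of", smallest first.
        for part in re.split(r"\s+of\s+", m.group("quarters"), flags=re.IGNORECASE):
            q = QUARTER.match(part.strip())
            if q is None:
                return None
            quarters.append(q.group(1).upper())
    return {
        "township": int(m.group("township")),
        "township_dir": m.group("ns").upper(),
        "range": int(m.group("range")),
        "range_dir": m.group("ew").upper(),
        "section": int(m.group("section")),
        "quarters": quarters,
    }

print(parse_trs("T7N R13E S17 NW1/4 of SE1/4"))
```

Keeping the order fixed and the word "of" present, as recommended above, is what lets a single pattern like this one cover every entry.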

 

 

 

>>> Posting number 222, dated 2 May 2002 10:27:33

Date:         Thu, 2 May 2002 10:27:33 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Updated manisgeoreftemplate.mdb file

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear all,

 

It has been brought to my attention that there was a data type problem in

the Extent field in my Access97 manisgeoreftemplate.mdb file. Up until

moments ago, that field was of type Long Integer, which wasn't very useful

for recording fractional extents. I have changed the data type to Double

and posted the new file in place of the old one. The file can be accessed

through the Georeferencing Steps document on the MaNIS web site.

 

Thanks for finding that, XXX,

 

John

 

>>> Posting number 223, dated 3 May 2002 09:46:43

 

>>> Posting number 224, dated 3 May 2002 12:03:26

Date:         Fri, 3 May 2002 12:03:26 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Area

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding:  quoted-printable

 

Quick question...

 

If the extent of a town/city is vague due to proximity to other

towns/cities, is the area of the place useful at all??

 

 

XXXXXX

 

>>> Posting number 225, dated 3 May 2002 10:12:46

Date:         Fri, 3 May 2002 10:12:46 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Area

In-Reply-To:  <002c01c1f2c4$710cf170$9f6e0a0a@FMNHCJ>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX,

 

In and of itself, the area describing the town/city may not be too useful,

but in terms of determining the maximum error distance, it is essential. Do

you have something more specific in mind?

 

John

 

At 12:03 PM 5/3/02 -0500, you wrote:

>Quick question...

> 

>If the extent of a town/city is vague due to proximity to other

>towns/cities, is the area of the place useful at all??

> 

> 

>XXXXXXXX

 

>>> Posting number 226, dated 3 May 2002 12:17:17

Date:         Fri, 3 May 2002 12:17:17 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: Area

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: 7bit

 

Yes, that.  I have Cicero IL, but the maps I have from Topozone are not as

helpful, as Cicero is closely bordered with Chicago and other suburbs.

But I managed to find a reference on the web that listed the area of Cicero,

and thought that it would be helpful for the error determination.

 

So, should I just take the area itself as the max extent such as the

following:  if the area is 6 sq. mi.,  and area is length times width,

shouldn't the assumption be in favor of the greatest error, as in 6 by 1,

not 3 by 2?  Or 4.8 by 1.25, etc.?

 

----- Original Message -----

From: "John Wieczorek" <tuco@SOCRATES.BERKELEY.EDU>

To: <MAMMAL-Z-NET@USOBI.ORG>

Sent: Friday, May 03, 2002 12:12 PM

Subject: Re: [MANIS] Area

 

 

> XXXX,

> 

> In and of itself, the area describing the town/city may not be too useful,

> but in terms of determining the maximum error distance, it is essential. Do

> you have something more specific in mind?

> 

> John

> 

> At 12:03 PM 5/3/02 -0500, you wrote:

> >Quick question...

> >

> >If the extent of a town/city is vague due to proximity to other

> >towns/cities, is the area of the place useful at all??

> >

> >

> >XXXXXX

 

>>> Posting number 227, dated 3 May 2002 10:45:32

Date:         Fri, 3 May 2002 10:45:32 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Area

In-Reply-To:  <006001c1f2c6$60721460$9f6e0a0a@FMNHCJ>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX,

 

Hmmm. This is probably a less-than-desirable approach for the simple reason

that we can't know from an area what the shape of the town is. Taking the

square root of the area would be a simple rule to employ, but it would

always result in an underestimate of the furthest extent of the town. We

want to use the furthest extent in our calculations so that the resulting

maximum error distance satisfies the criterion that it MUST encompass the

actual locality.

 

John
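
To illustrate why an area alone does not fix the extent, here is a small sketch that treats the town as a rectangle of 6 square miles, using the three shapes from the question quoted below. The rectangle model and the function name half_diagonal are assumptions for illustration only; real town boundaries are rarely rectangular.

```python
import math

def half_diagonal(length, width):
    """Distance from the center of a rectangle to any of its corners."""
    return math.hypot(length, width) / 2

# Three rectangles with the same area (6 square miles) but different shapes.
shapes = [(6.0, 1.0), (4.8, 1.25), (3.0, 2.0)]
for length, width in shapes:
    distance = half_diagonal(length, width)
    print(f"{length} x {width} mi: center to corner = {distance:.2f} mi")
```

The same area yields center-to-corner distances of roughly 3.04, 2.48, and 1.80 miles, so the furthest extent has to be measured from a map rather than derived from the area.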

 

>Yes, that.  I have Cicero IL, but the maps I have from Topozone are not as

>helpful, as Cicero is closely bordered with Chicago and other suburbs.

>But I managed to find a reference on the web that listed the area of Cicero,

>and thought that it would be helpful for the error determination.

> 

>So, should I just take the area itself as the max extent such as the

>following:  if the area is 6 sq. mi.,  and area is length times width,

>shouldn't the assumption be in favor of the greatest error, as in 6 by 1,

>not 3 by 2?  Or 4.8 by 1.25, etc.?

> 

 

 

>>> Posting number 228, dated 3 May 2002 13:03:42

Date:         Fri, 3 May 2002 13:03:42 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Fwd: different extents for 0,1 or 2 offets

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX,

 

This particular point has been bugging me somewhat as well, though from the

standpoint of georeferencing automation. Basically, it makes life difficult

to try to maintain multiple extents for a given named place. More rules

means greater difficulty and greater opportunity for inconsistency.

 

So, I'm willing to simplify the Extents rule to read as follows, unless

there are any objections, which should be voiced immediately.

 

<begin proposed update>

Uncertainty due to the extent of a locality

Named places are not single points; they have extents. Although there are

conventions for placing the coordinates of a named place at the post

office, courthouse, or geographic center of a town, one cannot be sure that

the person who recorded the locality used a particular convention. Use the

distance from the geographic center of the named place to its furthest

extent as the uncertainty.

<end proposed update>

 

Yep, that's nice and succinct. I like it.

 

You asked about whether the uncertainty in the extent would be half of this

distance. Under most circumstances I would argue that you are correct, but

the rule, as it stands, covers any original measurement made from within

the named place. It's conservative, but at least it isn't going to be wrong.

 

Thanks for motivating me to make a commitment on this issue.

 

John
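
The simplified rule proposed above can be sketched as a single maximum over boundary points. This assumes planar coordinates in one consistent unit, measured relative to the geographic center; the function name extent and the point list are hypothetical, and the two boundary points are the W and N extents from the forwarded example below.

```python
import math

def extent(center, boundary_points):
    """Largest planar distance from the center to any boundary point."""
    cx, cy = center
    return max(math.hypot(x - cx, y - cy) for x, y in boundary_points)

# Boundary reaches 3 mi west and 2 mi north of the center.
center = (0.0, 0.0)
boundary = [(-3.0, 0.0), (0.0, 2.0)]
print(extent(center, boundary))  # 3.0
```

With one value per named place, the same extent is used whether the locality has zero, one, or two offsets.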

 

>From:

>To: <tuco@socrates.Berkeley.EDU>

>Subject: different extents for 0,1 or 2 offets

>Date: Fri, 3 May 2002 12:26:00 -0700

> 

>While working through extents the concept of a circumscribing circle with

>radius to encompass the most distant point of a placename, and thus all

>points,  is floating around in my head.  I'm wondering  for single offsets

>why we are taking extents as the distance from the center of a placename

>to the boundary in the direction of the offset.  An example I just did had

>a W extent of 3 mi and the N extent is 2 mi.

> 

>The relevant section from the guidelines is:

>Uncertainty due to the extent of a locality

>Named places are not single points; they have extents. Although there are

>conventions for placing the coordinates of a named place at the post

>office, courthouse, or geographic center of a town, one cannot be sure

>that the person who recorded the locality used a particular convention. If

>only the named place is given in the locality description use the distance

>from the geographic center of the named place to its furthest extent as

>the uncertainty.  If the description includes an offset, use the distance

>from the geographic center to furthest extent of the named place in the

>direction of the offset. For multiple offsets, (e.g., 3 km N, 5 km E of

>Bakersfield) use the furthest of the extents from the geographic center of

>the named place in the two cardinal directions.

> 

>I glossed over this in previous reading because I thought it sort of made

>sense.  But after doing extents, these guidelines seem arbitrary and I'm

>just wondering why different rules for different situations.  It seems if

>we don't know what the conventions were for the placename, regardless of

>the number of extents, it is only reasonable to assume the first and "use

>the distance from the geographic center of the named place to its furthest

>extent as the uncertainty."   Using distance from the center to boundary

>in the direction of the offset assumes what?  Perhaps that the collector

>was referring to the center or the boundary?   If true, then wouldn't the

>extent (for computation of the error radius)  be half the distance between

>the center and boundary?   But who knows where in the placename the

>collector was referencing.  It doesn't appear to be a big deal, the

>difference is usually less than a mile and usually insignificant combined

>with degree error.  If ok, I'll just go with the max, with annotation of

>course.

> 

> 

>----------

 

 

>>> Posting number 229, dated 3 May 2002 16:54:21

Date:         Fri, 3 May 2002 16:54:21 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: Fwd: different extents for 0,1 or 2 offets

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding:  quoted-printable

 

John:  Sounds ok, but doesn't the two offset technique use the linear

extent as a side of the bounding box to calculate the extent based on

center to corner hypotenuse while the single/zero offset technique uses

the linear extent as a component of d'?  Whew.  So for the same linear

extent, the two offset extent will be greater.  But this is in line with

the maximal approach to estimating error.

----- Original Message -----

From: John Wieczorek

Sent: Friday, May 03, 2002 1:01 PM

To: MAMMAL-Z-NET@USOBI.ORG

Subject: [MANIS] Fwd: different extents for 0,1 or 2 offets

 

XXXX,

 

This particular point has been bugging me somewhat as well, though from the

standpoint of georeferencing automation. Basically, it makes life difficult

to try to maintain multiple extents for a given named place. More rules

means greater difficulty and greater opportunity for inconsistency.

 

So, I'm willing to simplify the Extents rule to read as follows, unless

there are any objections, which should be voiced immediately.

 

<begin proposed update>

Uncertainty due to the extent of a locality

Named places are not single points; they have extents. Although there are

conventions for placing the coordinates of a named place at the post

office, courthouse, or geographic center of a town, one cannot be sure that

the person who recorded the locality used a particular convention. Use the

distance from the geographic center of the named place to its furthest

extent as the uncertainty.

<end proposed update>

 

Yep, that's nice and succinct. I like it.

 

>>> Posting number 230, dated 3 May 2002 17:44:06

Date:         Fri, 3 May 2002 17:44:06 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Fwd: different extents for 0,1 or 2 offets

In-Reply-To:  <OE486LSpSpwbuZ8f0SY0000325d@hotmail.com>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX,

 

The two offset technique is not used to calculate an extent. The extent is

used to calculate the maximum error distance. The fact that we will use the

most conservative extent of a named place does not change how that extent

is used in the calculations.

 

While it is true that the extent contributes to the size of the bounding

box, it is not equal to the length of the side of the bounding box. The

length of the extent can be no more than half of the length of the side of

the bounding box, and that only if there are no other contributions of

distance uncertainties. The single/zero offset technique will still use the

linear extent as a component of d'. Also, the contribution to the maximum

error distance for a given extent will still be the square root of two

times greater for two offsets than it will be for one offset, regardless of

how big the offsets are.

 

The description of error calculations for "Combinations of uncertainties:

distance" is already based on the greatest single extent of the named place

in the two orthogonal offsets. Sorry if the explanations above are

confusing. The bottom line is that the proposed simplification in how to

determine the extent does not affect how the extent is used in calculations. No

change to the documentation or methodology is required beyond the proposed

simplification in determining extents.

 

John
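
The square-root-of-two relationship described above can be checked with a short sketch. It models only the extent term, assuming that with two orthogonal offsets the extent applies along both axes; the function name extent_contribution is hypothetical, and the other uncertainty sources combined in the Georeferencing Guidelines are deliberately left out.

```python
import math

def extent_contribution(extent, two_orthogonal_offsets):
    """Contribution of a named-place extent to the maximum error distance.

    With zero or one offset the extent acts along a single line. With two
    orthogonal offsets it acts along both axes, so the contribution is the
    center-to-corner distance of a square whose half-side is the extent.
    """
    if two_orthogonal_offsets:
        return math.hypot(extent, extent)
    return extent

one = extent_contribution(3.0, False)
two = extent_contribution(3.0, True)
print(one, two, two / one)
```

The ratio is the same for any extent, which matches the statement that the factor holds regardless of how big the offsets are.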

 

At 04:54 PM 5/3/02 -0700, you wrote:

>John:  Sounds ok, but doesn't the two offset technique use the linear

>extent as a side of the bounding box to calculate the extent based on

>center to corner hypotenuse while the single/zero offset technique uses

>the linear extent as a component of d'?  Whew.  So for the same linear

>extent, the two offset extent will be greater.  But this is in line with

>the maximal approach to estimating error.

>----- Original Message -----

>From: John Wieczorek

>Sent: Friday, May 03, 2002 1:01 PM

>To: MAMMAL-Z-NET@USOBI.ORG

>Subject: [MANIS] Fwd: different extents for 0,1 or 2 offets

> 

>XXXX,

> 

>This particular point has been bugging me somewhat as well, though from the

>standpoint of georeferencing automation. Basically, it makes life difficult

>to try to maintain multiple extents for a given named place. More rules

>means greater difficulty and greater opportunity for inconsistency.

> 

>So, I'm willing to simplify the Extents rule to read as follows, unless

>there are any objections, which should be voiced immediately.

> 

><begin proposed update>

>Uncertainty due to the extent of a locality

>Named places are not single points; they have extents. Although there are

>conventions for placing the coordinates of a named place at the post

>office, courthouse, or geographic center of a town, one cannot be sure that

>the person who recorded the locality used a particular convention. Use the

>distance from the geographic center of the named place to its furthest

>extent as the uncertainty.

><end proposed update>

> 

>Yep, that's nice and succinct. I like it.

 

>>> Posting number 231, dated 5 May 2002 09:24:51

Date:         Sun, 5 May 2002 09:24:51 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: Fwd: different extents for 0,1 or 2 offets

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding:  quoted-printable

 

John:   I meant to say half the side of the bounding box hence the center

to corner hypotenuse.

 

----- Original Message -----

From: John Wieczorek

Sent: Friday, May 03, 2002 5:42 PM

To: MAMMAL-Z-NET@USOBI.ORG

Subject: Re: [MANIS] Fwd: different extents for 0,1 or 2 offets

 

XXXX,

 

The two offset technique is not used to calculate an extent. The extent is

used to calculate the maximum error distance. The fact that we will use the

most conservative extent of a named place does not change how that extent

is used in the calculations.



While it is true that the extent contributes to the size of the bounding

box, it is not equal to the length of the side of the bounding box. The

length of the extent can be no more than half of the length of the side of

the bounding box, and that only if there are no other contributions of

distance uncertainties. The single/zero offset technique will still use the

linear extent as a component of d'. Also, the contribution to the maximum

error distance for a given extent will still be the square root of two

times greater for two offsets than it will be for one offset, regardless of

how big the offsets are.



The description of error calculations for "Combinations of uncertainties:

distance" is already based on the greatest single extent of the named place

in the two orthogonal offsets. Sorry if the explanations above are

confusing. The bottom line is that the proposed simplification in how to

determine the extent does not affect how the extent is used in calculations. No

change to the documentation or methodology is required beyond the proposed

simplification in determining extents.

 

John

 

 

 

>>> Posting number 232, dated 7 May 2002 12:50:12

 

>>> Posting number 233, dated 7 May 2002 21:30:01

 

>>> Posting number 234, dated 8 May 2002 17:58:12

Date:         Wed, 8 May 2002 17:58:12 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing Step Nine

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

Recently there have been postings to this list announcing data uploads. I

applaud those who have done so for their ability to follow directions. That

protocol is exactly what I had written in the Georeferencing Steps

document. Though these are useful reminders of progress to everyone, in

retrospect it seems that this kind of traffic to the list is unnecessary,

and that I am the only one who really needs to know when the data are

available. Consequently I have changed Step Nine in the Georeferencing

Steps document. Please review it if you have any questions.

 

In addition, I took the liberty of making the change discussed on 3 May

about simplifying the protocol for determining extents of named places.

That change can be reviewed under the section in the Georeferencing

Guidelines document entitled "Uncertainty due to the extent of a locality."

 

John

 

>>> Posting number 235, dated 9 May 2002 13:13:29

Date:         Thu, 9 May 2002 13:13:29 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Named Place

In-Reply-To:  <002901c1f781$c9acc730$9f6e0a0a@Sbober>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX, and all,

 

Good eye. The downloaded data did not contain the fields "Extent" and

"Named Place" until moments ago, though the Access97 database

manisgeoreftemplate.mdb did include them.

 

No other column is needed for units. All distance units for a given

calculation must be the same as the original units used in the locality

description, and these units are recorded in the MaxErrorUnits field. The

Named Place field should be filled with the proper name (e.g., Bakersfield,

or "junction of Hwy. 5 and Hwy. 80") rather than the type of named place

(e.g., "ppl", or "lake").

 

John

 

At 12:48 PM 5/9/02 -0500, XXXXXXXXXX wrote:

>John-

> 

>I just noticed the 'Extent' and Named Place fields listed on the

>Georeferencing Steps page.  Are the files downloaded supposed to already

>have them?  I just added them, but was unsure if another column was needed

>for units, or if that should all be covered with the max error distance

>units (as it all needs to be normalized anyway).  Also, is the named place

>column a specific or general term?  Is it the proper name of the place

>used to find the coord.'s, or should it just be something along the lines of

>'populated place', 'lake', etc.

> 

 

>>> Posting number 236, dated 9 May 2002 19:05:15

Date:         Thu, 9 May 2002 19:05:15 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Fwd: SpecLocality placenames, simply wrong vs probably right

Mime-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"; format=flowed

Content-Transfer-Encoding: quoted-printable

 

XXXX, and all,

 

I have not forgotten this, and I wanted to tell you so, even though I can't

spare the time to do it justice at the moment. It's been like that a lot

lately. I WILL reply though.

 

John

 

 

>X-Sender: gshugart@mail.ups.edu

>Date:         Mon, 22 Apr 2002 00:48:24 -0700

>Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

>Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

>From:

>Subject:      SpecLocality placenames, simply wrong vs probably right

>To: MAMMAL-Z-NET@USOBI.ORG

> 

>John:  It wasn't clear what methods or methodology you disagree

>with.  Regarding assumptions, the comment that  "some of the

>determinations using this methodology will simply be wrong" seems overly

>optimistic.  We will never know "simply" if many localities are wrong or

>right.  It is more productive to think in probabilities.  We have to make

>assumptions regarding what SpecLocality values represent given that we are

>georeferencing  other institution's records and formats without consulting

>primary data.  So I have to assume, as does everyone else, that entries

>like "Hood River" from different collectors and institutions probably

>refer to the same locality and the same type of locality (city rather than

>river).  For determining lat/lons and errors, we also assume that the

>lat/longs and boundaries of population centers like Hood River probably

>have not shifted or expanded/contracted significantly over the

>years.   These basic assumptions make determinations probabilities rather

>than right or wrong.

> 

>Another possibility is that the comment "this methodology will simply be

>wrong" refers to calculating lat/longs based on a placename and

>offsets?  Formal release of the web lat/long calculator validates this

>methodology.  But, regardless of the methodology the assumptions are the same.

> 

>A third possibility is that perhaps you were referring to the reference

>string that Hood River, the ppl, was georeferenced but that there are

>other possibilities? Many placenames with a landscape feature in the name

>(e.g., river, falls, beach, spring) will be in the category of at least

>two types of locality, so for these I am looking (a few keystrokes, so

>little time is wasted) and I am annotating as I find them with a standard

>"could also be ...".  About 10% of the Oregon records are in this

>category.  If you don't want them georeferenced let me know.

> 

> 

> 

>>XXXX, and all,

>> 

>>I disagree with the methods described below on three counts. The most

>>important of my three reasons is that some of the determinations using this

>>methodology will simply be wrong, and there won't be any way to know if

>>they are wrong even with the Locality Remarks. My secondary reason is that

>>it will be difficult to find these localities among the data in order to

>>filter them out from analyses for which they aren't appropriate. Finally,

>>it seems to me we gain no benefit from actually having coordinates for

>>ambiguous localities, especially given that they take up precious time to

>>georeference. By ambiguous I mean that there is more than one possible

>>distinct place to which the locality may refer. By distinct I mean that the

>>maximum error circles do not overlap.

>> 

>>One could, sometimes appropriately, choose the geographic center of

>>multiple possible places for the coordinates of a locality and have the

>>extent cover all of them. In this case the error would be larger than it

>>would be by choosing any one of the possible localities, but the

>>determination (coordinates plus maximum error distance) would not be wrong.

>>I would use this method sparingly, however, given that it does take quite a

>>bit of time to make a determination of this kind. The method is most

>>appropriate when the distances between the possible places are relatively

>>small so that the maximum error distance itself remains small. I have no

>>objection to this approach, but I would argue vehemently to avoid

>>determinations that are likely to be wrong.

>>John W

> 

> 

>--
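
The two ideas in the quoted reply above, "distinct" meaning that maximum error circles do not overlap, and a single determination whose extent covers all candidate places, can be sketched as follows. Planar coordinates in one consistent unit are assumed, the "geographic center" is approximated by the mean of the candidate centers, and the function names are hypothetical.

```python
import math

def circles_overlap(c1, r1, c2, r2):
    """True when two maximum error circles share at least one point."""
    return math.dist(c1, c2) <= r1 + r2

def covering_circle(circles):
    """Circle centered on the mean of the candidate centers that contains
    every candidate circle. Each item is ((x, y), radius)."""
    xs = [center[0] for center, _ in circles]
    ys = [center[1] for center, _ in circles]
    center = (sum(xs) / len(xs), sum(ys) / len(ys))
    radius = max(math.dist(center, c) + r for c, r in circles)
    return center, radius

# Two candidate places 10 units apart with error radii of 1 and 2.
candidates = [((0.0, 0.0), 1.0), ((10.0, 0.0), 2.0)]
print(circles_overlap(*candidates[0], *candidates[1]))  # False: distinct places
print(covering_circle(candidates))                      # ((5.0, 0.0), 7.0)
```

The covering radius grows with the separation between candidates, which is why the combined determination is most useful when the possible places are close together.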

 

>>> Posting number 237, dated 21 May 2002 18:26:31

Date:         Tue, 21 May 2002 18:26:31 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Fwd: sufficiently georeferenced?

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

XXXXX has brought up another good question. I consider the examples she has

given below to be specific localities that just happen to contain

coordinates, and they should therefore be georeferenced. Localities that

have values in the DecLat and DecLong fields should be considered to have

been georeferenced for the purposes of this discussion.

 

John

 

>From:

>Subject: sufficiently georeferenced?

> 

>John:

> 

>The subject of georeferencing records that already have coordinates has

>been discussed in the past, but I had not encountered any prior to Benzie

>County, Michigan.

> 

>(Copied here is part of your reply to XXXXXX on 2/15/02:

> 

>XXXXXX and all,

>There is no provision for georeferencing records that already have

>coordinates, but this shouldn't necessarily deter you from doing so. If you

>go this route, please be sure to note that you have provided these

>additional data when you send them in to me. It makes a difference in how I

>handle the data on this end....)

> 

> 

>Would you please advise if the following Benzie examples are what you

>referred to as having coordinates?  Are these searchable and provide

>sufficient information on the localities, or should I add decimal degrees

>et al. in the columns provided in the ACCESS template?

> 

>SpecLocality: SLEEPING BEAR DUNES NTL LAKESHORE, ESCH RD AT ARAL

>LatText: 44D45'86D04'30"

> 

>SpecLocality: ESCH RD AT ARAL, LINE 7

>TRS:  44D45'N 86D04'30"W, 185M

> 

>Thanks again for your assistance.
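Coordinate strings like the two above can be converted to decimal degrees for the DecLat and DecLong columns with a small parser. The sketch below assumes the `44D45'N 86D04'30"W` pattern seen in these examples; the function name and the regular expression are illustrative, not part of the ACCESS template. When the hemisphere letters are missing, as in the first example, the values come back unsigned and the sign of the longitude has to be applied by hand.

```python
import re

# degrees "D" minutes "'" optional seconds '"' optional hemisphere letter
DMS = re.compile(r"""(\d+)D(\d+)'(?:(\d+(?:\.\d+)?)")?\s*([NSEW])?""")


def dms_to_decimal(text):
    """Return the decimal-degree values found in text, in order of appearance.

    Values marked S or W are returned as negative numbers; values without a
    hemisphere letter are returned as positive numbers.
    """
    values = []
    for deg, minutes, seconds, hemi in DMS.findall(text):
        value = int(deg) + int(minutes) / 60 + float(seconds or 0) / 3600
        if hemi in ("S", "W"):
            value = -value
        values.append(value)
    return values


lat, lon = dms_to_decimal("44D45'N 86D04'30\"W, 185M")
```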

> 

 

 

>>> Posting number 238, dated 21 May 2002 20:22:25

 

>>> Posting number 239, dated 23 May 2002 11:49:56

Date:         Thu, 23 May 2002 11:49:56 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: urban boundaries Topo USA 4.0

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii" ; format="flowed"

 

MaNISers:  Haven't heard back yet if there is an accessible boundary

file, but the boundaries are boundaries rather than some artefact.

There seems to be a trend to annex or incorporate airports and

watersheds which increases the extents in modern times.  See original

query below regarding boundaries.

 

>Dear XXXX,

> 

>The boundaries are selectable but you are not able to determine a perimeter

>distance for the boundary. Also, I would only use the program as a general

>reference tool and not for something that requires precise measurement.

> 

> 

>Regards,

>Dan Lee

>Technical Support Specialist

> 

>DeLorme Tech Support

>Two DeLorme Drive

>Yarmouth, Maine 04096

>E-Mail:  tech_inbox@DeLorme.com

> 

>-----Original Message-----

>From:

>Sent: Tuesday, May 07, 2002 1:44 PM

>To: tech@delorme.com

>Subject: urban boundaries Topo USA 4.0

> 

> 

>Support:  I recently purchased Topo USA 4.0 for a

>mapping/georeferencing project.  I'm impressed with the program.  For

>the project it is important that I can come up with the length &

>width  or extents of cities, urban areas and suburbs.  I noticed that

>a right click on an urban area (brown) then the create route option

>shows a green line highlighting what appear to be boundaries of the

>city, suburb, incorporated area.   This also seems to work with

>parks.  This boundary feature doesn't appear to be documented in the

>manual.  Are these the official city/urban boundaries or some route

>that the program creates independent of boundaries?

> 

>I queried the knowledge base but didn't come up with anything

>specific on limits, city limits, urban boundaries with and without

>create route.

>--

 

 

 

 

>>> Posting number 240, dated 26 May 2002 23:34:58

 

>>> Posting number 241, dated 27 May 2002 09:17:14

 

>>> Posting number 242, dated 29 May 2002 18:14:13

 

>>> Posting number 243, dated 29 May 2002 18:17:09

 

>>> Posting number 244, dated 4 Jun 2002 22:18:54

 

>>> Posting number 245, dated 5 Jun 2002 14:33:55

Date:         Wed, 5 Jun 2002 14:33:55 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      georeferencing rates

Comments: cc:

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: 7bit

 

Barbara & John,

 

I have finished approximately 2700 records since the second week of April.

However, the first two weeks were spent getting a computer, setting it up

with the correct software and familiarizing myself with the project and how

I was going to tackle it.  The third week was when I really started on the

project.  Here is the break down by counties in Nevada:

 

Washoe County:

809 records

17 working days to complete

119 working hrs.

6.79 records/hr. average

FTP'ed 5/7/2002

 

Storey, Carson City, Douglas, Lyon, and Mineral Counties:

822 records

10 working days to complete

70 working hours

11.74 records/hour average

FTP'ed 5/23/2002

 

Humboldt County:

605 records

6 working days to complete

42 working hours

14.4 records/hour average

FTP'ed 5/31/2002

 

Pershing County:

136 records

2 working days to complete

14 working hours

9.7 records/hour average

FTP'ed 6/4/2002

 

Total Average Records/Hour = 10.65 Records/Hour
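The county figures above can be cross-checked with a short script. This is a sketch only, with illustrative variable names. It suggests that the total quoted above is close to the plain mean of the four county rates, whereas dividing all records by all hours gives a somewhat lower pooled rate, because the slowest county also took the most hours.

```python
# (records, working hours) per batch, copied from the breakdown above
counties = {
    "Washoe": (809, 119),
    "Storey, Carson City, Douglas, Lyon, Mineral": (822, 70),
    "Humboldt": (605, 42),
    "Pershing": (136, 14),
}

# Rate for each batch, in records per hour
rates = {name: records / hours for name, (records, hours) in counties.items()}

# Unweighted mean of the four batch rates
mean_of_rates = sum(rates.values()) / len(rates)

# Pooled rate: all records divided by all hours
total_records = sum(records for records, _ in counties.values())
total_hours = sum(hours for _, hours in counties.values())
pooled_rate = total_records / total_hours

print(f"mean of rates: {mean_of_rates:.2f}  pooled: {pooled_rate:.2f}")
```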

 

My working hours are calculated for a 7 hour working day.  I am in the

office for 9 hours, however, 1 hour is used for lunch and the other hour is

for my miscellaneous walks around the museum to keep me fresh and awake,

therefore, 7 working hour days.

 

I use:

MS Access 2000 for data manipulation and compiling

ArcView 3.2 for viewing localities

USGS Geographic Names Information System (GNIS) for the state of Nevada.

Feature data used in ArcView.

NV Atlas & Gazetteer by DeLorme for verifying locations and getting the

"extents" or size of features.

Topozone.com to look for features that are not found using GNIS or to get a

better topographic view than what DeLorme delivers.

http://www.esg.montana.edu/gl/trs-data.html - for converting TRS data to

latitude and longitude

Picking XXXXXXXX' brain for some locational stumpers!

 

John has asked previously what I have done to get the rates that I am

getting and I had answered him in another e-mail about that.  In short, I

manipulate the data as much as I can before I start working on it.  I divide

the NV database into individual counties.  I then arrange the localities

within the counties to be clumped together so that I can work on one

locality at a time.  Saves me time by not having to look up new

latitudes and longitudes for different areas.  If you want details, let me

know and I will write a much more detailed explanation.

 

Hope this helps.  Take care.

 

Sincerely,

XXXXXXXXXXX

Curatorial Assistant

 

>>> Posting number 246, dated 6 Jun 2002 09:45:36

Date:         Thu, 6 Jun 2002 09:45:36 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      plotting lat/longs in Topo USA 4.0

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding:  quoted-printable

 

Barb and all:

 

A cool item in the tip category that might be useful for the meeting, or

for just reviewing, proofing, or displaying records, is the capability to

display your work in Topo USA 4.0.  All that is required is a file with

records having lat, long, and a label.  Import and open the file (in text

format), and the localities appear with little flags or a label of your

choice on the Topo maps.  Multiple records for the same point could use

some preprocessing to count and display the number.  Other programs do the

same, but I was so impressed I just thought I would pass it along.
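The preprocessing step mentioned above, counting records that share a point before import, might look like the following. It is a sketch under assumptions: the comma-delimited layout with a `lat,long,label` header is a guess at a workable text format, not the documented Topo USA 4.0 import format, and the function names are hypothetical.

```python
import csv
import io
from collections import Counter


def summarize_points(records):
    """Collapse (lat, lon, label) records into one row per point.

    The label of the first record at each point is kept and the number of
    records at that point is appended in parentheses.
    """
    counts = Counter()
    labels = {}
    for lat, lon, label in records:
        key = (round(lat, 5), round(lon, 5))
        counts[key] += 1
        labels.setdefault(key, label)
    return [(lat, lon, f"{labels[(lat, lon)]} ({n})") for (lat, lon), n in counts.items()]


def to_text(rows):
    """Render rows as comma-delimited text with a header line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["lat", "long", "label"])
    writer.writerows(rows)
    return out.getvalue()
```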

 

>>> Posting number 247, dated 6 Jun 2002 09:57:24

Date:         Thu, 6 Jun 2002 09:57:24 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: georeferencing rates

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii

Content-Transfer-Encoding: 7bit

 

Barbara and John,

 

Our  georeferencer for UAM, XXXXXXXXXX, has just left for the field for

the remainder of the summer.  I have looked over the Access file that she

has been using as well as her timesheets to figure out how much time she has

been spending on this and how many records are completed.  This is what I

found:

 

4078 records from Alaska have been georeferenced.

A total of 200 hours have been charged to the MaNIS account for her time.

20.39 records/hour

 

XXXXX works 20 hours per week and does nothing but georeference. She has

spent 10 weeks on this project.  This average (20/hr) seems very high. I am

wondering if it is just a consequence of the Alaska data - there are many

duplicate localities (e.g. 50 different localities that contain Fairbanks -

the city)  XXXXXX looks up the lat/long once (or a few times, depending on

the data) and varies the ME as needed. There are also quite a few that she

was not able to georeference.  We have been mainly using a CD map program

"All Topo Maps: ALASKA"  It contains all the USGS maps of Alaska. It is very

handy for georeferencing rivers, bays, lakes, distances from named places,

etc.

 

We have georeferenced about 46% of the localities that we started with (that

needed georeferencing).  We still have about 4734 records to georeference.

We have broken the Alaska localities into 2 groups, those from other

museums, and those from UAM.  All of the localities that are currently

georeferenced are from other museums - we only have 740 more to go.  As for

the other 4000 or so left to georeference, they are all from UAM.

 

Let me know if you need any additional clarification.  Cheers!

 

XXXXXXXXXXXXXXX

 

 

 

"Barbara R. Stein" wrote:

 

> Dear All,

> 

> John and I are very interested in how the georeferencing is

> progressing.  It is one of many topics we would like to discuss at the

> ASM meeting, but time will be short and I am not sure it is the best use

> of that forum.  I would much rather demo the network to you and begin

> talking about how we will be bringing collections on line!

> Consequently, we would like to ask each institution to calculate their

> current georeferencing rate and determine if you are doing less than,

> the same as, or better than the rates on which we based our proposal

> budget.  They were:

> 

> 9/hr for US localities

> 6/hr for non-US North American localities

> 3/hr for non-North American localities

> 

> We suspect that your rates will have improved now that you are all

> familiar with the process, but if you are easily exceeding these figures

> we would like to know if there are tricks you have developed and ask

> that you post these to the list.  If you are pretty much on target, this

> is good for us to know as well.  And if you are lagging, now is the time

> to speak up and let's figure out how you can catch up.  If you would

> send this information within the next week, John and I will summarize

> the stats in time for the meeting and present a quick overview.

> 

> FYI, John and I are getting an increasing number of queries about MaNIS,

> from folks who are anxious to access our data, both the georeferenced

> localities and the specimen records to which they will eventually be

> linked, and from those who are considering pursuing similar projects on

> their own.  We have pointed a number of individuals to the

> georeferencing steps and guidelines posted on the web site and we will

> continue to update these and streamline them as you provide us with

> feedback.

> 

> Looking forward to seeing you in Lake Charles.

> 

> Best,

> Barbara

 

>>> Posting number 248, dated 6 Jun 2002 13:25:30

Date:         Thu, 6 Jun 2002 13:25:30 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: georeferencing rates

In-Reply-To:  <3CFD9F3D.EB78F191@oz.net>

Mime-version: 1.0

Content-transfer-encoding: 7bit

Content-type: text/plain; charset="US-ASCII"

 

Dear Barbara, John, et al.,

My team--XXXXXXXXXX and XXXXXXXXXX--seem to be working on schedule

and on pace.  XXXXX tells me that most localities can be georeferenced at a

rate of about 9-10 per hour, but they occasionally hit snags that slow them

down to about 5, or so, per hour.   Not surprisingly, their overall rate is

still improving as they gain more and more experience.

See you in Lake Charles.   XXXXX

 

 

 

 

>>> Posting number 249, dated 7 Jun 2002 09:34:28

Date:         Fri, 7 Jun 2002 09:34:28 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: georeferencing rates

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding:  quoted-printable

 

Barb and all:  I was doing about 25 records/hr using GNIS placenames and

calculated lat/longs in Excel, even with having to look up extents on Topo

USA 4.0.  The rate has slowed to about 10/hr doing road miles and other

non-GNIS placename records.  TRS records should be in the 20/hr range.  I

should finish up with the 6,752 Oregon dataset sometime in July.

 

Some ideas to maximize georeferencing rates for future work are to

concentrate on those records with GNIS placenames.  This would get about

60-70% in the US.  Also clean your data first; GNIS provides a great

filter/dictionary file for US records, and cleaning can be done at a rate of

about 100 records/hour using semi-automation.  Eliminating or saving extents

for later would push the georeferencing rate to over 100/hr with clean data.
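Using GNIS as a filter/dictionary, as suggested above, could be semi-automated along these lines. This is a rough sketch: the row layout (name, county, lat, lon), the normalization rules, and the function names are assumptions, and the sample rows in the usage lines are made-up placeholders rather than real GNIS entries. Names that match more than one feature are deliberately left unresolved for manual work.

```python
def normalize(name):
    """Upper-case, drop periods, and collapse runs of whitespace."""
    return " ".join(name.upper().replace(".", "").split())


def build_index(gnis_rows):
    """Index rows of (feature_name, county, lat, lon) by normalized name and county."""
    index = {}
    for name, county, lat, lon in gnis_rows:
        index.setdefault((normalize(name), normalize(county)), []).append((lat, lon))
    return index


def lookup(index, locality, county):
    """Return (lat, lon) for a unique match, or None if missing or ambiguous."""
    hits = index.get((normalize(locality), normalize(county)), [])
    return hits[0] if len(hits) == 1 else None


# Placeholder rows for illustration only
rows = [
    ("Example Creek", "County A", 44.5, -123.3),
    ("Twin Butte", "County B", 43.9, -122.8),
    ("Twin Butte", "County B", 44.1, -122.5),
]
index = build_index(rows)
match = lookup(index, "example  creek", "county a")
```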

 

>>> Posting number 250, dated 10 Jun 2002 09:26:26

Date:         Mon, 10 Jun 2002 09:26:26 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: georeferencing rates

In-Reply-To:  <3CFD9F3D.EB78F191@oz.net>

MIME-Version: 1.0

Content-Type: text/plain; charset=ISO-8859-1

Content-Transfer-Encoding: 8bit

 

Barbara, John, et al.

 

We (UMNH) are georeferencing Utah localities at a much faster rate than

anticipated using a very effective combination of the GNIS gazetteer and the

National Geographic TOPO! software (USGS 1:24,000 maps).  I estimate 20-30/h

for unambiguous localities, and 5-10/h for those that are more difficult (i.e.

vague).  We are flagging very few as unresolved problems.

 

XXXX

 

>>> Posting number 251, dated 10 Jun 2002 10:50:20

Date:         Mon, 10 Jun 2002 10:50:20 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: georeferencing rates

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii

Content-Transfer-Encoding: 7bit

 

Barbara, John, and all-

 

Here at CAS, I have been georeferencing US localities at a rate of 12-15 per

hour for straightforward localities and 8-10 per hour for localities that

require more detective work.  I have been using a combination of GNIS

gazetteer and Terrain Navigator 2001 for each New England state I have

worked on thus far.

 

XXX

 

 

 

>>> Posting number 252, dated 11 Jun 2002 11:51:06

Date:         Tue, 11 Jun 2002 11:51:06 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: georeferencing rates

In-Reply-To:  <3CFD9F3D.EB78F191@oz.net>

MIME-version: 1.0

Content-type: text/plain; format=flowed; charset=us-ascii

Content-transfer-encoding: 7BIT

 

Dear Barb and John,

 

XXX's georeferencing progress to date is as follows:

 

Texas- 10049 localities, 4095 finished, 290 not georeferenced due to lack

of data.

Oklahoma- 1233 localities, 218 finished, 33 insufficient locality data.

 

Total- 4699 records finished

91 working days @ 4 hrs per day = 364 hours.

 

Average georeferencing rate = 13 per hour.

 

We are using Topozone and USGS data for georeferencing.  Many of the

records are being done semi-automatically using the UTM Converter developed

at TTU several years ago.  The data have to be cleaned up considerably

before this process can occur, however.  Ambiguous localities are located

(where possible) using National Geographic's "TOPO! Texas" CD set and Texas

State Department of Highways and Public Transportation's published county

maps (paper).

 

Sincerely,

XXXXXX.

 

 

 

 

>>> Posting number 253, dated 11 Jun 2002 19:25:25

Date:         Tue, 11 Jun 2002 19:25:25 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      georeferencing rates

In-Reply-To:  <4.2.2.20020611114004.00aa3478@packrat.musm.ttu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"; format=flowed

Content-Transfer-Encoding:  quoted-printable

 

Dear all:

 

    CNMA's (Colección Nacional de Mamíferos, at Instituto de Biología,

UNAM) georeferencing progress is as follows:

 

Mexican states:  Puebla and Tlaxcala.

effective working period: March - May 2002 = 66 working days;

6 h per day = 396 h;

total number of different localities (not records) georeferenced = 547;

georeferencing rate = 1.4 localities/h;

 

>We are using INEGI (Instituto Nacional de Estadística, Geografía e

>Informática, México) data and maps for georeferencing.  Most data had to

>be cleaned up before georeferencing; most Mexican localities cannot be

>automatically done since they have not been georeferenced.  Our skills are

>improving though.

 

>  XXXXXXXX

 

 

------------------------------------------------

 

>>> Posting number 254, dated 12 Jun 2002 09:26:20

 

>>> Posting number 255, dated 12 Jun 2002 13:05:00

Date:         Wed, 12 Jun 2002 13:05:00 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Georef. Rate

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding:  7bit

 

For Illinois, I have gotten up to about 15 per hour with unambiguous, or

down to about 5-6 per hour for those requiring a bit more work.  Still

pursuing available resources for speeding this up as needed.

 

XXXXXXXXX

 

>>> Posting number 256, dated 12 Jun 2002 14:02:26

Date:         Wed, 12 Jun 2002 14:02:26 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Organization: Mammalogy

Subject:      Re: georeferencing rates

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii

Content-Transfer-Encoding: 7bit

 

Folks:

 

We have averaged 9 localities per hour.  At times we can move at 20+ per

hour; others are show-stoppers.  New Mexico - with ca. 17,000 unique

localities to work on - started slow, but we have beat the learning

curve and should move along nicely this summer.

 

-XXXXXXX

 

 

>>> Posting number 257, dated 19 Jun 2002 09:25:52

 

>>> Posting number 258, dated 19 Jun 2002 10:29:00

 

>>> Posting number 259, dated 19 Jun 2002 10:52:01

 

>>> Posting number 260, dated 19 Jun 2002 08:51:14

 

>>> Posting number 261, dated 26 Jun 2002 12:54:15

Date:         Wed, 26 Jun 2002 12:54:15 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Automated Georeferencer

In-Reply-To:  <p05100302b8e9651a4b72@[207.207.104.113]>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX,

 

Here's the URL for the test site for the International Automated Georeferencer:

 

http://129.237.201.122/manis_si.html

 

>>> Posting number 262, dated 26 Jun 2002 15:13:32

 

>>> Posting number 263, dated 26 Jun 2002 13:22:40

Date:         Wed, 26 Jun 2002 13:22:40 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Automated Georeferencer

In-Reply-To:  <5.0.0.25.2.20020626125318.024ce160@socrates.berkeley.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

This site is in development. Feel free to look at it and try it, but please

do not try to incorporate it into your daily routine yet. Reed Beaman and I

are working to make a system that is streamlined with our georeferencing

techniques. We will announce the tool as ready for prime time when the

current pending issues have been addressed and I have written documentation

on how to use it properly.

 

Thanks,

John W

 

 

>XXXX,

> 

>Here's the URL for the test site for the International Automated

>Georeferencer:

> 

>http://129.237.201.122/manis_si.html

 

>>> Posting number 264, dated 26 Jun 2002 13:35:29

 

>>> Posting number 265, dated 28 Jun 2002 12:50:56

Date:         Fri, 28 Jun 2002 12:50:56 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Captive flag, Zoo's

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding:  7bit

 

John, and all--

 

I just explored the various Zoo records, for Illinois, and felt I should

inform you of what I found.  Two localities that were captive flagged were

actually from the zoo grounds, and were not captive specimens, but animals

living wild on the grounds themselves.  I found this with our own records,

but cannot investigate to confirm the other records.  I will change the

captive flag, and add georeferencing information on the 2, but thought you,

and all, might want to know about this potential (albeit somewhat nitpicky)

snag.

 

XXXX

 

 

 

>>> Posting number 266, dated 28 Jun 2002 11:50:31

Date:         Fri, 28 Jun 2002 11:50:31 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      georeferencing streams, creeks, rivers for SpecLoc

Mime-Version: 1.0

Content-Type: text/plain; format=flowed

 

MaNISers: Hard to believe, but creeks, rivers and streams are used as

SpecLoc for many Oregon records and probably others.  I've checked some of

our records against the specimen tags/catalogs and they are not typos or

omission of data.  I've been georeferencing them taking a midpoint

(straight line or creek miles).  The GNIS download has the mouth and source

dec lat/longs for streams, creeks, rivers.  Plotting these first saves much

time vs trying to follow a creek onscreen.  Once ends are marked the

measuring tool gives the total distance and midpoint.

 

Topo USA 4.0 will also give intersections of roads/water courses and water

courses using the street intersection option.
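The straight-line variant of the midpoint approach above can be computed directly from the mouth and source coordinates in the GNIS download. The sketch below uses the standard spherical midpoint and haversine formulas; the function names and the Earth radius are assumptions of this example. Note that a straight-line midpoint may fall off the water course itself, so the result still needs a look on the map, and a midpoint in creek miles would require the actual stream geometry.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def midpoint(lat1, lon1, lat2, lon2):
    """Midpoint of the great-circle segment between two points, in decimal degrees."""
    p1, l1, p2, l2 = map(math.radians, (lat1, lon1, lat2, lon2))
    bx = math.cos(p2) * math.cos(l2 - l1)
    by = math.cos(p2) * math.sin(l2 - l1)
    pm = math.atan2(math.sin(p1) + math.sin(p2),
                    math.sqrt((math.cos(p1) + bx) ** 2 + by ** 2))
    lm = l1 + math.atan2(by, math.cos(p1) + bx)
    return math.degrees(pm), math.degrees(lm)


def stream_determination(mouth, source):
    """Return (lat, lon, half_length_km) from (lat, lon) of mouth and source.

    Half the straight-line length is the distance from the midpoint to
    either end of the segment.
    """
    lat, lon = midpoint(mouth[0], mouth[1], source[0], source[1])
    half = haversine_km(mouth[0], mouth[1], source[0], source[1]) / 2
    return lat, lon, half
```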

 

 

 

 

>>> Posting number 267, dated 28 Jun 2002 12:31:00

Date:         Fri, 28 Jun 2002 12:31:00 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Captive flag, Zoo's

In-Reply-To:  <GMENIPGLHCAIHELBOECBIEJNCAAA.sbober@fmnh.org>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=US-ASCII

 

Dear XXXX, and all,

 

This is indeed an important point, of which all data providers and

georeferencers should remain aware. Here's a point of information: Every

record that comes with the captive flag turned on (except for those from

MVZ and UAM) had it set by me when I parsed the data for the gazetteer. My

purpose in doing so was to highlight records that are likely not to

represent localities that are valid for species distributions. By setting

this flag, I identified records that should be checked by the contributing

institutions. At times, as XXXX notes, these records actually refer to

collecting events that are valid for the species distribution. However, as

georeferencers without access to the primary material we are unable to

determine this. The plan, then, is to georeference these localities as you

would any other. It will be up to the individual institutions

to determine whether to include the data for particular

specimens. Institutions that feel the need may later want

to add a field to their databases that is similar to the captive flag (we

call it the Valid_Distribution_Flag) to accommodate this issue.

 

John W

 

On Fri, 28 Jun 2002, XXXXXXXXX wrote:

 

> John, and all--

> 

> I just explored the various Zoo records, for Illinois, and felt I should

> inform you of what I found.  Two localities that were captive flagged were

> actually from the zoo grounds, and were not captive specimens, but animals

> living wild on the grounds themselves.  I found this with our own records,

> but cannot investigate to confirm the other records.  I will change the

> captive flag, and add georeferencing information on the 2, but thought you,

> and all, might want to know about this potential (albeit somewhat nitpicky)

> snag.

> 

> XXXX

> 

 

 

>>> Posting number 268, dated 28 Jun 2002 12:40:29

Date:         Fri, 28 Jun 2002 12:40:29 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: georeferencing streams, creeks, rivers for SpecLoc

In-Reply-To:  <F84thODt9Cz93MgPPf300000ff7@hotmail.com>

MIME-Version: 1.0

Content-Type: TEXT/PLAIN; charset=X-UNKNOWN

Content-Transfer-Encoding: QUOTED-PRINTABLE

 

Nice. And remember, if a Locality says "Bitterroot River" and the county is

"Missoula County", use only that part of the Bitterroot River that is in

Missoula County for the determination.

 

On Fri, 28 Jun 2002, XXXXXXXXX wrote:

 

> MaNISers: Hard to believe, but creeks, rivers and streams are used as

> SpecLoc for many Oregon records and probably others.  I've checked some of

> our records against the specimen tags/catalogs and they are not typos or

> omission of data.  I've been georeferencing them taking a midpoint

> (straight line or creek miles).  The GNIS download has the mouth and source

> dec lat/longs for streams, creeks, rivers.  Plotting these first saves much

> time vs trying to follow a creek onscreen.  Once ends are marked the

> measuring tool gives the total distance and midpoint.

> 

> Topo USA 4.0 will also give intersections of roads/water courses and water

> courses using the street intersection option.

> 

> 

> 

> 

 

>>> Posting number 269, dated 28 Jun 2002 13:39:01

Date:         Fri, 28 Jun 2002 13:39:01 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: georeferencing streams, creeks, rivers for SpecLoc

Mime-Version: 1.0

Content-Type: text/plain; format=flowed

 

John:  Thanks, forgot to say that.  The mouth and source are often in

different counties.  GNIS gives the county for the mouth which often doesn't

match the HigherGeog MaNIS county.  In these cases I am assuming county is

correct and going with the part of the water course in the HigherGeog

county.  Annotating as needed.  We never did discuss the assumption that

county is correct.

 

 

 

>>> Posting number 270, dated 28 Jun 2002 15:50:52

Date:         Fri, 28 Jun 2002 15:50:52 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         

Subject:      Re: georeferencing streams, creeks, rivers for SpecLoc

In-Reply-To:  <F123RvITt7Qs6hvfPAZ00001126@hotmail.com>

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: 8bit

 

As for the county issue, I generally assume its correctness, unless given

reason to doubt.  I have run into discrepancies that were cleared up by

looking at a paper map with counties, and seeing that certain things run

through multiple counties.  Also, there have been two counties listed for

the same named locality, that I initially assumed were identical, until GNIS

gave two choices.  Annotation/remarks seem to be the way to keep things

clear, unless there are other ideas.

 

XXXXXXXX

 

-----Original Message-----

From: Mammal Networked Information System

[mailto:MAMMAL-Z-NET@USOBI.ORG]On Behalf Of Gary Shugart

Sent: Friday, June 28, 2002 3:39 PM

To: MAMMAL-Z-NET@USOBI.ORG

Subject: Re: [MANIS] georeferencing streams, creeks, rivers for SpecLoc

 

 

John:  Thanks, forgot to say that.  The mouth and source are often in

different counties.  GNIS gives the county for the mouth which often doesn't

match the HigherGeog MaNIS county.  In these cases I am assuming county is

correct and going with the part of the water course in the HigherGeog

county.  Annotating as needed.  We never did discuss the assumption that

county is correct.

 

 

 

 

>>> Posting number 271, dated 9 Jul 2002 10:21:16

Date:         Tue, 9 Jul 2002 10:21:16 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      batch processing TRS data

Mime-Version: 1.0

Content-Type: text/plain; format=flowed

 

MaNISer:  After having assistants do a number of records via the TRS web

site I happened on the batch processor link. The converter program can be

downloaded to your desktop and data can be submitted as a file.  It takes

some parsing and concatenating of the MaNIS (or other) TRS strings or fields

to get the data in an acceptable format, but is fairly easy to do with

Excel.  Lat/long are output with the TRS identifier thus allowing a link

back to the MaNIS file.  If anyone is interested and has 100's of TRS

records to do this might be a speedy option to consider. I'll help with the

data formatting if needed.
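The parsing and concatenating step could also be scripted instead of done in Excel. The sketch below is hypothetical throughout: it assumes TRS strings shaped like `T12N R3E S14` (with some tolerance for periods, commas, and "Sec."), and it writes a tab-separated line with a record identifier. The input format the batch converter actually expects is not described here, so the output layout would need to be adapted to it.

```python
import re

# Township number + N/S, range number + E/W, section number (assumed pattern)
TRS = re.compile(
    r"T\.?\s*(\d+)\s*([NS])\W*R\.?\s*(\d+)\s*([EW])\W*S(?:EC)?\.?\s*(\d+)",
    re.IGNORECASE,
)


def parse_trs(text):
    """Return (township, N/S, range, E/W, section) or None if no TRS is found."""
    m = TRS.search(text)
    if m is None:
        return None
    t, ns, r, ew, s = m.groups()
    return int(t), ns.upper(), int(r), ew.upper(), int(s)


def batch_line(record_id, text):
    """Return 'id<TAB>T..  R..  S..' in a normalized form, or None if unparseable."""
    parsed = parse_trs(text)
    if parsed is None:
        return None
    t, ns, r, ew, s = parsed
    return f"{record_id}\tT{t}{ns} R{r}{ew} S{s}"
```

Keeping the record identifier on every line is what allows the lat/long output to be linked back to the MaNIS file afterwards.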

 

 

 

 

 

>>> Posting number 272, dated 9 Jul 2002 10:28:02

Date:         Tue, 9 Jul 2002 10:28:02 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      batch processing TRS data, the link

Mime-Version: 1.0

Content-Type: text/plain; format=flowed

 

The link for the TRS batch processor is

http://www.geocities.com/jeremiahobrien/trs2ll.html.

 

Also, I tried the program and it works.

 

 

 

 

>>> Posting number 273, dated 18 Jul 2002 10:18:17

Date:         Thu, 18 Jul 2002 10:18:17 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: georeferencing Congo

Comments: To:

In-Reply-To:  <AA33E10E16DAD411BDFD0008C7CF50E6097FFB48@hawk.mail.ukans.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear XXXXXX and all,

 

The question raised below is a good one, and it is the most difficult of

the problems we'll encounter with localities for which detailed maps are

not available.  I'll include here an excerpt from the forthcoming ASM

Meeting Notes document and open my proposed solution to discussion.

 

"A question was raised about determining errors for foreign localities if

you do not know the extent of the nearest named place.  For instance, you

may know that you were 4.6km NW of Hotazel, South Africa, but if you don't

know the extent of the village of Hotazel itself, how do you determine the

extent?

 

This is a tricky problem to which there are numerous possible

solutions.  An ideal solution is one that is simple to remember and simple

to implement so that it is executed consistently under all

circumstances.  The first thing to remember is that we have no dictum

saying that the maximum error distance has to be as small as possible.

Instead, it has to be as large as necessary to ensure that we are not

over-representing the accuracy of the data.  With that in mind, I recommend

the following approach to determining the extent of a named place when it

cannot be determined directly from the maps, gazetteers, or any of the

other tools at hand.

 

1)      Determine the location of the named place that is nearest to the

one for which you are trying to determine the extent.  Let's call that

named place the "nearest neighbor."

 

2)      Use one-half the distance between the named place of interest and

its nearest neighbor as the extent of the named place of interest. At times

this may turn out to be an unrealistically large extent, but there is no

harm in that. Into the future, estimates of the error distance can be

refined as better information becomes available."
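As a rough illustration of steps 1 and 2 above, the sketch below finds the nearest of several candidate named places by great-circle distance and returns half that distance as the extent. The function names, the spherical-Earth radius, and the choice of the haversine formula are assumptions made for illustration only; they are not taken from the MaNIS guidelines or the Georeferencing Calculator.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation (assumption)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def extent_from_nearest_neighbor(place, neighbors):
    """Return (nearest_neighbor, extent_km).

    place     -- (lat, lon) of the named place of interest
    neighbors -- list of (lat, lon) for other named places from a gazetteer
    The extent is one-half the distance to the nearest neighbor.
    """
    if not neighbors:
        raise ValueError("need at least one neighboring named place")
    nearest = min(neighbors, key=lambda n: haversine_km(*place, *n))
    return nearest, haversine_km(*place, *nearest) / 2.0
```

The extent returned this way may be generous, which is consistent with the point made above that the error distance should be as large as necessary rather than as small as possible.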

 

At 09:15 AM 7/18/02 -0500, XXXXXXXX wrote:

>Hi.

> 

>I'm a graduate student at XX and I'm georeferencing Congo localities for the

>MaNIS project. I have a question related to calculating the maximum error

>distance; I don't know what to do for the extent of localities. I've been

>using gazetteers and atlases to find localities but I couldn't find good

>maps to determine the extent of localities.

> 

>Are there any chances I could set the extent of most localities at unknown?!

>I'm sure this is a dumb question, but I ran into this problem when I started

>to use the georeferencing calculator and I don't know what to do.

> 

>Please let me know if you have suggestions.

>Thank you.

>XXXXXXX

> 

 

 

>>> Posting number 274, dated 18 Jul 2002 14:48:37

 

>>> Posting number 275, dated 19 Jul 2002 17:42:23

Date:         Fri, 19 Jul 2002 17:42:23 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      TRS records

Mime-Version: 1.0

Content-Type: text/plain; format=flowed

 

MaNIS:  Any guidance on records dealing with 1/4 of 1/4 sections

appreciated.  Example Oregon records appear in the gazetteer as:

 

T2N,R45E,Sec. 25 NW SE (UW)

SW .25,NE .25 sec.12,T11S,R5W (LSU)

 

Some of the ambiguity might result from concatenating separate fields for 1/4

and 1/4 of 1/4 but I can't tell.

 

On these I stop at the section or use the placename + offset if error

appears to be smaller.  Just wondered if I missed a standard order

somewhere? (Note: KU's TRSs are truncated so I am stopping at section on KU

records at the suggestion of KU).

 

One way of dealing with the ambiguity I found in the UW records:

T35S,R6E,Sec. 10 SW1/4,SE1/16

Presumably this is the SE 1/4 of the SW 1/4 of Sec 10 and not a typo?

 

To finish off the Oregon TRS records I used the batch processor.  It worked

great and matched those I had done one at a time on the web site.  It

doesn't like TR without a section however.  On the few without section, I

used section 15 and calculated the lat/long .5 mi S and .5 mi W to get the

center of the TR.  For some with just TR a smaller error resulted from using

the placename and offset.  In this case I went with the placename + offset.
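The section 15 workaround described above can be checked with a small sketch. It assumes an idealized 6 x 6 mile township with the standard serpentine section numbering (section 1 in the northeast corner, section 6 in the northwest, section 36 in the southeast); real townships are often irregular, and the function names here are hypothetical.

```python
def section_center_offset(section):
    """Offset (east_mi, north_mi) of a section's center from the township center.

    Assumes an idealized township of 36 one-mile-square sections numbered
    in the standard serpentine order, starting at the northeast corner.
    """
    if not 1 <= section <= 36:
        raise ValueError("section must be between 1 and 36")
    row = (section - 1) // 6          # 0 = northernmost row
    pos = (section - 1) % 6           # position along the numbering direction
    # Rows 0, 2, 4 are numbered east to west; rows 1, 3, 5 west to east.
    col = 5 - pos if row % 2 == 0 else pos   # 0 = westernmost column
    east = (col + 0.5) - 3.0
    north = 3.0 - (row + 0.5)
    return east, north


def offset_to_township_center(section):
    """Offset (east_mi, north_mi) to move from a section's center to the township center."""
    east, north = section_center_offset(section)
    return -east, -north
```

Under these assumptions, `offset_to_township_center(15)` gives `(-0.5, -0.5)`, that is, 0.5 mi W and 0.5 mi S of the center of section 15, which agrees with the offsets used above.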

 

 

 

>>> Posting number 276, dated 19 Jul 2002 20:03:17

Date:         Fri, 19 Jul 2002 20:03:17 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: TRS records

In-Reply-To:  <F102xrUDugdSORDNUjE00011800@hotmail.com>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX, and others doing locales with TRS data from UW and LSU,

 

I constructed the localities from the Burke Museum from data that were

originally contained in three separate fields for the TRS data. For the

Section part of the localities you see, I concatenated the verbatim value

of the Section field to the abbreviation "Sec.", thus, for your first

example below, the original data had "25 NW SE" in the Section field. For

your second UW locality below, the original section field contained "10

SW1/4,SE1/16". There are quite a few examples like these. Would a

representative from UW please clarify these issues?

 

The TRS data for LSU are all contained in the Locality field itself and

were not interpreted or parsed in any way. An LSU representative will have

to let us know if there is a consistent rule about how to interpret the TRS

for LSU localities.

 

Have there been any other ambiguities such as these?

 

Thanks for bringing these to our attention, XXXX.

 

John

 

 

>>> Posting number 277, dated 19 Jul 2002 23:16:04

Date:         Fri, 19 Jul 2002 23:16:04 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: TRS records

Mime-Version: 1.0

Content-Type: text/plain; format=flowed

 

Other ambiguous 1/4 sections:

 

We (PSM) have about 20 records in Oregon like:

Brookings; T40S, R14W, S36, NW .25, SE .25

There is no way to tell 1/4 vs 1/4 of 1/4 from the record, but our format is

TRS, 1/4 section, 1/4 of 1/4 section unless the text says something

different.  These follow the format of the collector, but are ambiguous and

will be corrected when I get to it.

 

Also MVS had one T2S,R10E,S21 SW1/4,NE1/4 (MaNIS # 278049).

 

 

>From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

>Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

>To: MAMMAL-Z-NET@USOBI.ORG

>Subject: Re: [MANIS] TRS records

>Date: Fri, 19 Jul 2002 20:03:17 -0700

> 

>....

> 

>Have there been any other ambiguities such as these?

> 

>....

>John

> 

> 

 

 

 

>>> Posting number 278, dated 22 Jul 2002 09:08:40

Date:         Mon, 22 Jul 2002 09:08:40 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: TRS records

In-Reply-To:  <5.0.0.25.2.20020719195028.02489710@socrates.berkeley.edu>

Mime-version: 1.0

Content-type: text/plain; charset="US-ASCII"

Content-transfer-encoding: 7bit

 

The example cited section 10 SW1/4, SE1/16  means the SE1/16 of the SW1/4 of

section 10.  I could check with the collector, but that is the way it is

written in his field catalog and I have to presume that this is correct.

The hierarchy of locality is to go from larger to smaller entities.

 

The T2N, R45E, Sec 25 NW SE is straight from the collector's descriptions

and I would presume that in this case it means the SE1/4 of the NW1/4 of

Sec. 25.  The alternative is to not assume anything and truncate the

locality to the section  with a comment on the ambiguity.

 

Hope that this helps.

 

Cheers, XXXX

 

 


 

>>> Posting number 279, dated 22 Jul 2002 12:16:29

 

>>> Posting number 280, dated 22 Jul 2002 20:00:22

Date:         Mon, 22 Jul 2002 20:00:22 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Chile is done

Comments: cc: acaiozzi@uclink.berkeley.edu

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear all,

 

Having a native Chilean student to do georeferencing of Chilean localities

has proven to be extremely beneficial.  First, on her trip back home to

Santiago in June, she acquired some nice 1:500,000 military maps of most of

the country at fairly low cost. She was then able to use those maps, the

Alexandria Digital Library Gazetteer, the US Board on Geographic Names

Gazetteer, and a variety of other sources on the web to georeference the

877 localities in 75.5 hours for a mean rate of 11.62 localities per hour.

Huzzah. Moral of the story: go out and find willing and able natives to do

georeferencing of foreign localities.

 

John

 

>>> Posting number 281, dated 23 Jul 2002 08:14:23

 

>>> Posting number 282, dated 23 Jul 2002 08:15:09

 

>>> Posting number 283, dated 23 Jul 2002 18:48:47

Date:         Tue, 23 Jul 2002 18:48:47 -0400

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: TRS records

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Dear MaNis Users,

 

I wanted to comment on interpretation of fractional divisions of sections

for MSU Museum Township-Range-Section (TRS) records.

 

For fractional land divisions, MSU uses the format based on the legal

description/U.S. Public Land Survey system, in that the smallest fractional

division is listed first going from left to right, and the comma(s)

separating fractions are read as "of the".

 

The abbreviated description NW 1/4, NE 1/4, Sec. 5, T2N, R1W would be read

as:  The Northwest quarter of the Northeast quarter of Section 5 of

Township 2 North, Range 1 West.

 

In our records, the fractions may appear after the TRS information, but the

first fraction in the list is always the smallest.  The abbreviated

description T. 2 N., R. 1 E., Sec. 16, SW 1/4, SE 1/4, NW 1/4 would be read

as: The Southwest quarter of the Southeast quarter of the Northwest quarter

of Section 16, Township 2 North, Range 1 East.
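The reading rule above (smallest division first, each comma read as "of the") can be sketched as follows. The code spells out a list of quarter abbreviations and also works out where the center of the smallest division lies relative to the section center. It assumes an idealized one-mile-square section; the function names are hypothetical, and other institutions may order their fractions differently, as earlier postings show.

```python
# Full name and (east, north) sign for each quarter abbreviation.
DIRECTIONS = {
    "NE": ("Northeast", 1, 1),
    "NW": ("Northwest", -1, 1),
    "SE": ("Southeast", 1, -1),
    "SW": ("Southwest", -1, -1),
}


def read_aloud(quarters):
    """Spell out fractions listed smallest first, e.g. ["NW", "NE"]."""
    return " of the ".join(f"{DIRECTIONS[q][0]} quarter" for q in quarters)


def center_offset_miles(quarters, section_side=1.0):
    """Offset (east, north) in miles of the smallest division's center
    from the section center, for fractions listed smallest first."""
    east = north = 0.0
    half = section_side / 2.0
    # Work from the largest division (last in the list) down to the smallest.
    for q in reversed(quarters):
        _, sx, sy = DIRECTIONS[q]
        half /= 2.0            # center of a quarter lies half-way to the edge
        east += sx * half
        north += sy * half
    return east, north
```

For the first example, `read_aloud(["NW", "NE"])` returns "Northwest quarter of the Northeast quarter", and `center_offset_miles(["NW", "NE"])` places its center 0.125 mi east and 0.375 mi north of the center of the section.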

 

(We have found the following sources helpful:

 

http://www.outfitters.com/genealogy/land/twprangemap.html

 

Muehrcke, P.C. and J.O., 1998.  Map Use:  Reading, Analysis, and

Interpretation, JP Publications, Madison, Wisconsin)

 

Also - We calculated the extent of a 1/4 of 1/4 of 1/4 (quarter of quarter

of quarter) section to be 0.088 miles.  John W, is this okay?
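The 0.088 mile figure can be reproduced with a short calculation, assuming that a section is exactly one mile square and that "extent" means the distance from the center of the square to its farthest corner (half the diagonal). Both assumptions and the function name are illustrative.

```python
import math


def trs_extent_miles(quarter_levels, section_side_miles=1.0):
    """Half-diagonal of a square section subdivision, in miles.

    quarter_levels -- 0 for a full section, 1 for a 1/4 section,
                      2 for 1/4 of 1/4, 3 for 1/4 of 1/4 of 1/4, ...
    """
    if quarter_levels < 0:
        raise ValueError("quarter_levels must be non-negative")
    side = section_side_miles / (2 ** quarter_levels)  # each quartering halves the side
    return side * math.sqrt(2) / 2


for levels in range(4):
    print(levels, round(trs_extent_miles(levels), 3))
```

A quarter of a quarter of a quarter section has a side of 0.125 miles, so its half-diagonal is 0.125 x 0.7071, or about 0.088 miles.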

 

Thanks,

XXXXXXXXXXXXX

 

 

 

 

 

>>> Posting number 284, dated 24 Jul 2002 11:48:10

 

>>> Posting number 285, dated 30 Jul 2002 18:07:53

Date:         Tue, 30 Jul 2002 18:07:53 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear all,

 

I have made a minor amendment to Step Seven of the Georeferencing Steps

document (http://dlp.cs.berkeley.edu/manis/GeorefSteps.html). Basically, I

realized it will be a little easier for me to "curate" the finished

georeferencing files if I know not only when they were done and by which

institution, but also if I know which geographic regions are contained

within them.  There is no need to worry about any of the files that have

already been sent to me. I'll just suffer for not having thought of this

earlier - it's how I train myself.

 

The new content of Step Seven of the Georeferencing Steps document is

repeated here for convenience.

 

Step Seven - Export Finished Localities

When a downloaded set of localities is done being georeferenced, export the

complete set of data (all records with all fields and a column header row)

to a tab-delimited text file. Rename the file to reflect the institution,

the geographic scope of its content, and the date the file was finished.

For example, a file of Peruvian localities finished by the Field Museum on

Halloween would be FMNH-Peru-2001-10-31.txt. Make a backup of this file and

store it in a safe place until the data have been loaded into the MaNIS

Gazetteer.
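For anyone scripting the export, the naming rule in Step Seven might be sketched like this. The function names and the column names in the usage note are hypothetical; only the file name pattern and the tab-delimited format with a header row come from the step above.

```python
import csv
from datetime import date


def finished_filename(institution, region, finished):
    """Build a name such as FMNH-Peru-2001-10-31.txt from its three parts."""
    return f"{institution}-{region}-{finished.isoformat()}.txt"


def export_localities(rows, fieldnames, out):
    """Write all records, with a column header row, as tab-delimited text.

    rows       -- list of dicts, one per locality record
    fieldnames -- column names in the order they should appear
    out        -- an open text file (or any file-like object)
    """
    writer = csv.DictWriter(out, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
```

Usage would look something like `open(finished_filename("FMNH", "Peru", date(2001, 10, 31)), "w", newline="")` passed as `out`, with `fieldnames` set to the full list of columns in the downloaded file.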

 

 

John W

 

>>> Posting number 286, dated 31 Jul 2002 13:25:46

 

>>> Posting number 287, dated 31 Jul 2002 12:37:37

 

>>> Posting number 288, dated 1 Aug 2002 13:49:40

Date:         Thu, 1 Aug 2002 13:49:40 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Fwd: neat maps

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

Chris Conroy was kind enough to forward this map resource to me. Please

take note of it if you have any interest in georeferencing any of these areas.

 

John W

 

>From: Christopher Conroy <ondatra@socrates.Berkeley.EDU>

>Subject: neat maps

> 

>Folks,

> 

>      Here is a link to neat maps of former soviet republics for your

> viewing pleasure.

>http://www.reisenett.no/map_collection/commonwealth.html

> 

>Chris

>--

 

>>> Posting number 289, dated 6 Aug 2002 12:23:05

Date:         Tue, 6 Aug 2002 12:23:05 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      TRS updates to the Georeferencing Guidelines

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

Due to popular demand I have added a few paragraphs to the Georeferencing

Guidelines about calculating the coordinates of Townships and subsections

thereof. I also added a row to the table on TRS extents to include 1/4 of

1/4 of 1/4 section as well as a column for the extent of Township divisions

when orthogonal offsets are used to calculate the coordinates. Finally, I

added two new links to URLs that explain Townships quite nicely. Enjoy.

 

John W.

 

>>> Posting number 290, dated 12 Aug 2002 11:30:02

 

>>> Posting number 291, dated 13 Aug 2002 20:12:32

 

>>> Posting number 292, dated 16 Aug 2002 10:24:58

Date:         Fri, 16 Aug 2002 10:24:58 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Selecting a "Coordinate Source" for Namibia in Africa

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding:  quoted-printable

 

LACM has started work on Namibia after downloading many shapefiles and

data tables from the "Atlas of Namibia" website located here:

 

http://www.dea.met.gov.na/data/Atlas/Atlas_web.htm#2Elevations,%20relief,%20profiles

 

I have cross-referenced (quality-checked the latitudes and longitudes)

the data with source data of Namibia from the GNS website of NIMA

(National Imagery and Mapping Agency) and the location data from the

Atlas seems to be very accurate for the needs of MaNIS.  Therefore, I

created a customized topographic map with place names using ArcView for

georeferencing.

 

Finally, my question.

 

In the "georeferencing calculator" under the heading "Coordinate Source"

there are a number of choices, but none really describe my situation.

Would selecting "Gazetteer" be my best bet when using data from the

Atlas?  When using GNS data from the NIMA website should I also select

"Gazetteer"?  The link to the NIMA website for data downloads for many

countries is here:

 

http://164.214.2.59/gns/html/index.html

 

My thought is that since the source data for the Atlas and NIMA are from

a variety of sources, that the logical choice is "Gazetteer", but I

wanted to check with the board first.  Sorry if this question seemed too

"simple".  This is my first international site and wanted to be sure on

this.  Thanks.

 

 

Sincerely,

XXXXXXXXXXXXXXXX

 

 

>>> Posting number 293, dated 16 Aug 2002 10:42:08

Date:         Fri, 16 Aug 2002 10:42:08 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Selecting a "Coordinate Source" for Namibia in Africa

In-Reply-To:  <000c01c24549$d8b624e0$180775ce@Subaru>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

These are definitely reasonable questions, the answers to which are "Yes,

use Gazetteer as the Coordinate Source in the Georeferencing Calculator for

compilations of numeric coordinates for named places - as opposed to maps

from which the coordinates are "measured" by the georeferencer."

 

 

At 10:24 AM 8/16/02 -0700, you wrote:

>LACM has started work on Namibia after downloading many shapefiles and

>data tables from the "Atlas of Namibia" website located here:

> 

>http://www.dea.met.gov.na/data/Atlas/Atlas_web.htm#2Elevations,%20relief,%20profiles

> 

>I have cross-referenced (quality-checked the latitudes and longitudes) the

>data with source data of Namibia from the GNS website of NIMA (National

>Imagery and Mapping Agency) and the location data from the Atlas seems to

>be very accurate for the needs of MaNIS.  Therefore, I created a

>customized topographic map with place names using ArcView for georeferencing.

> 

>Finally, my question.

> 

>In the "georeferencing calculator" under the heading "Coordinate Source"

>there are a number of choices, but none really describe my

>situation.  Would selecting "Gazetteer" be my best bet when using data

>from the Atlas?  When using GNS data from the NIMA website should I also

>select "Gazetteer"?  The link to the NIMA website for data downloads for

>many countries is here:

> 

>http://164.214.2.59/gns/html/index.html

> 

>My thought is that since the source data for the Atlas and NIMA are from a

>variety of sources, that the logical choice is "Gazetteer", but I wanted

>to check with the board first.  Sorry if this question seemed too

>"simple".  This is my first international site and wanted to be sure on

>this.  Thanks.

> 

 

>>> Posting number 294, dated 21 Aug 2002 10:45:00

Date:         Wed, 21 Aug 2002 10:45:00 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Previously parsed records - what to do?

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: 7bit

 

If one downloads records with the locations already parsed (by John

Wieczorek), am I allowed to copy the information for new records?  Or am I

to do the new records from "scratch".  Typically, my findings are pretty

similar to John W's parsed records except I have more decimal places.

 

If I am allowed to copy the information, do I note that I copied it from

John W's previously finished records?  Thanks in advance.

 

Sincerely,

XXXXXXXXXX

 

>>> Posting number 295, dated 21 Aug 2002 14:38:40

Date:         Wed, 21 Aug 2002 14:38:40 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Previously parsed records - what to do?

In-Reply-To:  <001e01c2493a$795c5420$180775ce@Subaru>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear all,

 

This is a tough one, actually.  Here are my two cents worth, though I'd

encourage discussion if there is disagreement or further insight.

 

One thing is for sure, *do not* modify the records that say "Parsed by John

Wieczorek from data provided by ..." in the LatLongRemarks field. This

means that the data are original, from the source institution. They may

even be "exact" coordinates directly from the collector, and therefore may

be more specific than the locality description. We don't really have a way

to know.

 

Given the above, it may be that the coordinates in those parsed records do

not accurately reflect the locality in the records of another institution.

Even if they do, they may not be as precise as they might be if you

georeference them using our MaNIS georeferencing guidelines.

 

One case I can think of where one might argue to use the existing parsed

coordinates is where you have multiple occurrences of the same locality

from the same institution and some of  them do not already have

coordinates. It may be more likely in this case that the localities really

do reflect the same exact place. However, this may be more likely for some

institutions than for others. I don't have enough familiarity with the data

and collecting practices of every institution to say for sure.

 

So, at the risk of looking like I'm waffling, I'll make this recommendation:

Use existing parsed coordinates to help you find localities on the map, but

georeference them as you would any other locality. If you are

georeferencing localities from your own institution's records and have good

reason to believe that the parsed coordinates accurately reflect the

position of all similar localities *from your institution's records*, go

ahead and use them. However, in the latter case, do not copy the "Parsed by

John Wieczorek from data provided by ..." in the LatLongRemarks field to

any other records. Instead, say something like "Coordinates copied from a

similar locality."

 

John W

 

At 10:45 AM 8/21/02 -0700, you wrote:

>If one downloads records with the locations already parsed (by John

>Wieczorek), am I allowed to copy the information for new records?  Or am I

>to do the new records from "scratch".  Typically, my findings are pretty

>similar to John W's parsed records except I have more decimal places.

> 

>If I am allowed to copy the information, do I note that I copied it from

>John W's previously finished records?  Thanks in advance.

> 

 

 

>>> Posting number 296, dated 21 Aug 2002 17:00:26

Date:         Wed, 21 Aug 2002 17:00:26 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: Previously parsed records - what to do?

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding:  quoted-printable

 

I thought the instructions from Barbara (Feb or March) were to leave the

previously georeferenced records as they are and do not verify lat/longs,

proof or add errors.  So it doesn't seem possible to georeference them as

you would any other locality but leave them as they are.  Did you mean

"but do not georeference them"?

 

 

----- Original Message -----

From: John Wieczorek

Sent: Wednesday, August 21, 2002 4:46 PM

To: MAMMAL-Z-NET@USOBI.ORG

Subject: Re: Previously parsed records - what to do?

 

Dear all,

 

...So, at the risk of looking like I'm waffling, I'll make this recommendation:

Use existing parsed coordinates to help you find localities on the map, but

georeference them as you would any other locality. ...

 

>>> Posting number 297, dated 21 Aug 2002 17:24:19

Date:         Wed, 21 Aug 2002 17:24:19 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Previously parsed records - what to do?

In-Reply-To:  <OE17mQbaYgUCwMtkDFl00011a64@hotmail.com>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX, and all,

 

You caught an unfortunate glitch in clarity. Let me restate that whole blurb.

 

I'll make this recommendation:

Use existing parsed coordinates to help you find localities on the map, but

do not georeference them. Use the standard georeferencing guidelines for

all other localities, even if they are similar to localities with original

parsed coordinates. Exception: If you are georeferencing localities from

your own institution's records and have good reason to believe that the

parsed coordinates accurately reflect the position of all similar

localities *from your institution's records*, go ahead and use them.

However, in the latter case, do not copy the "Parsed by John Wieczorek from

data provided by ..." in the LatLongRemarks field to any other records.

Instead, say something like "Coordinates copied from a similar locality."

 

Thanks for keeping me honest.

John

 

At 05:00 PM 8/21/02 -0700, you wrote:

>I thought the instructions from Barbara (Feb or March) were to leave the

>previously georeferenced records as they are and do not verify lat/longs,

>proof or add errors.  So it doesn't seem possible to georeference them as

>you would any other locality but leave them as they are.  Did you mean

>"but do not georeference them"?

> 

> 

>----- Original Message -----

>From: John Wieczorek

>Sent: Wednesday, August 21, 2002 4:46 PM

>To: MAMMAL-Z-NET@USOBI.ORG

>Subject: Re: Previously parsed records - what to do?

> 

>Dear all,

> 

>...So, at the risk of looking like I'm waffling, I'll make this

>recommendation:

>Use existing parsed coordinates to help you find localities on the map, but

>georeference them as you would any other locality. ...

 

>>> Posting number 298, dated 23 Aug 2002 18:26:08

 

>>> Posting number 299, dated 26 Aug 2002 16:23:28

 

>>> Posting number 300, dated 2 Sep 2002 09:46:51

 

>>> Posting number 301, dated 4 Sep 2002 10:16:08

 

>>> Posting number 302, dated 4 Sep 2002 15:58:26

 

>>> Posting number 303, dated 5 Sep 2002 10:52:29

 

>>> Posting number 304, dated 5 Sep 2002 15:34:03

 

>>> Posting number 305, dated 5 Sep 2002 17:30:55

Date:         Thu, 5 Sep 2002 17:30:55 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         

Subject:      Re: Initiating Discussion of Concept Info

In-Reply-To:  <1244.140.107.26.190.1031248349.squirrel@www.oz.net>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii" ; format="flowed"

 

The proposed MaNIS structure

(http://dlp.cs.berkeley.edu/manis/darwin2jrwConceptInfo.htm) and DC2

looks fine to me for an online museum catalog.   If this is the goal

then I favor adoption of the DC2 structure and concepts and let

contributors use their discretion to populate the fields.   The

online catalog seems to be a broadening of the original MaNIS goal of

a network of georeferenced records that could be used for plotting

specimen locations.  But no problem.

 

I plan to include as much data as are available in our in-house

records when writing data out to the server.  I had planned on

leaving out specimens that could not be georeferenced but can include

them.

 

Predefined result sets (as counts) specifically for georeferencing:

 

species with lat/long data  (or w/wo lat/long data)

species with lat/long data by area of interest (county, state,

country, continent)

species with lat/long by error radii (categorical)

species with lat/long data and month and year of collection - for plotting

species - all data - for Excel or Access work

 

for general mammal work:

 

species by prep type

species by locality

 

I am assuming there will be clickable "plot the localities" and

"plot the localities and errors" buttons on the web page.

 

A question on the bounding box field.  (Also in DC1 and 2).  Is this

really a field/concept?  It would seem to be more along the lines of

a query that pops up and asks for opposing corners (lat/long) that

define the sides of the box.
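The kind of query described here, where the user supplies two opposing corners, might look like the sketch below. It is only an illustration under simple assumptions: coordinates are decimal degrees, the box does not cross the 180 degree meridian, and the function and field names are hypothetical rather than anything defined in DC1, DC2, or the MaNIS concept list.

```python
def in_bounding_box(lat, lon, corner_a, corner_b):
    """True if (lat, lon) falls inside the box defined by two opposing corners.

    corner_a, corner_b -- (lat, lon) pairs in decimal degrees, given in any order.
    Assumes the box does not cross the 180 degree meridian.
    """
    (lat_a, lon_a), (lat_b, lon_b) = corner_a, corner_b
    south, north = sorted((lat_a, lat_b))
    west, east = sorted((lon_a, lon_b))
    return south <= lat <= north and west <= lon <= east


def records_in_box(records, corner_a, corner_b):
    """Filter records (dicts with 'lat' and 'lon' keys) to those inside the box."""
    return [r for r in records
            if in_bounding_box(r["lat"], r["lon"], corner_a, corner_b)]
```

A search with an irregular boundary would need a point-in-polygon test instead of the two range comparisons used here.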

 

Anticipated queries:  I just got a NPS request for all vertebrates in

our collection from US National Parks.  Plants too, but they aren't

online.   My response was to wait a bit and this type of request can

be answered online if the requestor had lat/long boundary files or

target counties/areas.   This type of a request would be best handled

via a batch request of a list of lat/long boundaries or target areas.

Searching with an irregular boundary, rather than simple box, also

would be a great utility.

 

 

>>> Posting number 306, dated 6 Sep 2002 08:48:12

 

>>> Posting number 307, dated 6 Sep 2002 09:24:18

 

>>> Posting number 308, dated 6 Sep 2002 15:12:05

Date:         Fri, 6 Sep 2002 15:12:05 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      UAM missing Concepts

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii; format=flowed

Content-Transfer-Encoding: 7bit

 

 Looks good.  From UAM, we could address data to most of the proposed

concepts.  (There might be a water shrew that swam into a minnow trap,

but we can not yet provide depth data for a mammal.)  Nevertheless,

there are a couple fields implemented in our present web interface that

are missing.

 

1.)  We have a field for "Geographic Feature" which contains geographic

descriptors below the state/province level such as parks, refuges, etc.

 This field has been crucial in pleasing various agencies.  They can

generate their own real-time report of our holdings from their

jurisdictions, map them, etc.  (NPS can search on any particular unit,

but not on all NPS holdings.)

 

2.)  Analogous to OtherCatalogNumber, we have Other_ID, but

supporting it, we have Other_ID_Type.  An Other_ID_Type could be

"GenBank accession number," and the associated Other_ID would be

something like "Z123456."  I suppose, that to avoid a one-to-many

relationship, these could be concatenated into another string.  This

would complicate, or slow, some searches that are possible from our

present site, e.g. taxon="Sorex" + Other_ID_Type = "GenBank..." +

country="Russia" will show all the red-toothed shrews from Russia with

GenBank accession numbers.  "Atomic" searches are also possible. (e.g.,

Other_ID_Type="GenBank..." + Other_ID = "Z123456")
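To make the one-to-many arrangement concrete, here is a small sketch of specimens that each carry a list of (Other_ID_Type, Other_ID) pairs, with a search that combines taxon, country, and identifier criteria. It is not the UAM implementation: the class, the search function, and the field names other than Other_ID and Other_ID_Type are hypothetical, and any records passed to it in an example would be made up.

```python
from dataclasses import dataclass, field


@dataclass
class Specimen:
    catalog_number: str
    taxon: str
    country: str
    # One-to-many: each entry is an (Other_ID_Type, Other_ID) pair.
    other_ids: list = field(default_factory=list)


def search(specimens, taxon=None, country=None, other_id_type=None, other_id=None):
    """Return specimens matching every criterion that is not None.

    taxon matches as a prefix (e.g. a genus name); the identifier criteria
    must both be satisfied by the same (type, id) pair.
    """
    hits = []
    for s in specimens:
        if taxon is not None and not s.taxon.startswith(taxon):
            continue
        if country is not None and s.country != country:
            continue
        if other_id_type is not None or other_id is not None:
            matched = any(
                (other_id_type is None or t == other_id_type)
                and (other_id is None or i == other_id)
                for t, i in s.other_ids
            )
            if not matched:
                continue
        hits.append(s)
    return hits
```

If the type and identifier were instead concatenated into one string, the `other_id_type` test would turn into a substring match on that string, which is the complication mentioned above.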

 

(Try it at

http://arctos.museum.uaf.edu:8080/cgi-bin/uam_db/specimensearch.cgi)

 

This example has had consequences.  In individual specimen records, we

display a GenBank number as a hyperlink to the sequence page on GenBank.

 Now, NCBI has set up "LinkOuts" from "our" sequence pages to the

specimen records at UAM.  NCBI is enthusiastic about repository

databases as much-needed supplemental documentation of GenBank and will

be making some announcements soon.  We should consider if and how

OtherCatalogNumber might be used by GenBank's "Entrez" for LinkOuts to

MaNIS.

 

XXXXXX

 

>>> Posting number 309, dated 9 Sep 2002 09:46:40

 

>>> Posting number 310, dated 10 Sep 2002 08:16:56

 

>>> Posting number 311, dated 11 Sep 2002 09:13:26

 

>>> Posting number 312, dated 16 Sep 2002 09:01:13

 

>>> Posting number 313, dated 16 Sep 2002 10:42:31

 

>>> Posting number 314, dated 20 Sep 2002 11:02:33

 

>>> Posting number 315, dated 20 Sep 2002 11:13:40

Date:         Fri, 20 Sep 2002 11:13:40 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Initiating Discussion of Concept Info

In-Reply-To:  <p05100306b99d6117277d@[207.207.104.113]>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

More comments, interspersed below.

 

John

 

At 05:30 PM 9/5/02 -0700, you wrote:

>The proposed MaNIS structure

>(http://dlp.cs.berkeley.edu/manis/darwin2jrwConceptInfo.htm) and DC2

>looks fine to me for an online museum catalog.   If this is the goal

>then I favor adoption of the DC2 structure and concepts and let

>contributors use their discretion to populate the fields.   The

>online catalog seems to be a broadening of the original MaNIS goal of

>a network of georeferenced records that could be used for plotting

>specimen locations.  But no problem.

 

Actually, most of the expanded concepts were in the proposal from the

beginning as we wanted to be able to serve the scientific community with

the kinds of information to which they are accustomed when they write to us

now, including things such as preparation types.

 

>I plan to include as much data as are available in our in-house

>records when writing data out to the server.  I had planned on

>leaving out specimens that could not be georeferenced but can include

>them.

 

I would hope that others agree that the ungeoreferenced data can still be

useful in other contexts.

 

>Predefined result sets (as counts) specifically for georeferencing:

> 

>species with lat/long data  (or w/wo lat/long data)

>species with lat/long data by area of interest (county, state,

>country, continent)

>species with lat/long by error radii (categorical)

>species with lat/long data and month and year of collection - for plotting

>species - all data - for Excel or Access work

> 

>for general mammal work:

> 

>species by prep type

>species by locality

 

Queries for all of these questions will be possible. Barbara and I will try

to engineer predefined result set schemata and post them here for review. I

know already that there will be a "Full" result set, which contains

everything on the Concept Info page. Beyond that, we'll try to identify

special purpose result sets.

 

>I am assuming there will be a clickable "plot the localities" and

>"plot the localities and errors" buttons on the web page.

 

Not immediately, but yes, that is my hope as well.

 

>A question on the bounding box field.  (Also in DC1 and 2).  Is this

>really a field/concept?  It would seem to be more along the lines of

>a query that pops up and asks for opposing corners (lat/long) that

>define the sides of the box.

 

BoundingBox, like JulianDay, is a concept calculated from other concepts

that map to fields in the database. It is not a separate field expected to

be found in your database.
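
A minimal sketch of what "calculated from other concepts" could look like in practice. It assumes that JulianDay means the ordinal day of the year and that a bounding box is given as opposing corners in decimal degrees; the function names are hypothetical and not part of MaNIS or DC2.

```python
from datetime import date


def julian_day(year, month, day):
    """Ordinal day of the year (1-366), derived from stored date fields."""
    return date(year, month, day).timetuple().tm_yday


def in_bounding_box(lat, lon, south, west, north, east):
    """True if a point lies inside the box defined by opposing corners.

    Assumes decimal degrees and a box that does not cross the 180th meridian.
    """
    return south <= lat <= north and west <= lon <= east
```

Neither value needs its own column in a contributing database: both are computed at query time from the date and coordinate fields that are already there.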

 

>Anticipated queries:  I just got a NPS request for all vertebrates in

>our collection from US National Parks.  Plants too, but they aren't

>online.   My response was to wait a bit and this type of request can

>be answered online if the requestor had lat/long boundary files or

>target counties/areas.   This type of a request would be best handled

>via a batch request of a list of lat/long boundaries or target areas.

>Searching with an irregular boundary, rather than simple box, also

>would be a great utility.

 

I would hope that an application for this kind of spatial query can be

built on top of MaNIS. I envision drawing the area of interest on a map as

one of the criteria in the query itself.  Work along these lines is being

investigated here at the MVZ.

 

 

>>> Posting number 316, dated 20 Sep 2002 11:27:55

Date:         Fri, 20 Sep 2002 11:27:55 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: UAM missing Concepts

In-Reply-To:  <3D793645.2050001@uaf.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

More comments, interspersed below.

 

At 03:12 PM 9/6/02 -0800, you wrote:

>Looks good.  From UAM, we could address data to most of the proposed

>concepts.  (There might be a water shrew that swam into a minnow trap,

>but we can not yet provide depth data for a mammal.)  Nevertheless,

>there are a couple fields implemented in our present web interface that

>are missing.

> 

>1.)  We have a field for "Geographic Feature" which contains geographic

>descriptors below the state/province level such as parks, refuges, etc.

>This field has been crucial in pleasing various agencies.  They can

>generate their own real-time report of our holdings from their

>jurisdictions, map them, etc.  (NPS can search on any particular unit,

>but not on all NPS holdings.)

 

Features are interesting and useful, as Gary also pointed out in a previous

message. They can also be problematic the way they have been implemented at

MVZ and UAM. In short, it isn't possible to have a specimen located in more

than one feature. This means you couldn't have a specimen that was both in

Yosemite National Park and in the Sierra Nevada Range.  Recognizing this

problem and seeing spatial query capabilities on the horizon, we at MVZ

have backed off of the wholesale use of the Feature field. I suggest that

its eventual fate is in question, so we probably shouldn't try to include it.

 

>2.)  Analogous to OtherCatalogNumber, we have Other_ID, but

>supporting it, we have Other_ID_Type.  An Other_ID_Type could be

>"GenBank accession number," and the associated Other_ID would be

>something like "Z123456."  I suppose, that to avoid a one-to-many

>relationship, these could be concatenated into another string.  This

>would complicate, or slow, some searches that are possible from our

>present site, e.g. taxon="Sorex" + Other_ID_Type = "GenBank..." +

>country="Russia" will show all the red-toothed shrews from Russia with

>GenBank accession numbers.  "Atomic" searches are also possible. (e.g.,

>Other_ID_Type="GenBank..." + Other_ID = "Z123456")

> 

>(Try it at

>http://arctos.museum.uaf.edu:8080/cgi-bin/uam_db/specimensearch.cgi)

> 

>This example has had consequences.  In individual specimen records, we

>display a GenBank number as a hyperlink to the sequence page on GenBank.

>Now, NCBI has set up "LinkOuts" from "our" sequence pages to the

>specimen records at UAM.  NCBI is enthusiastic about repository

>databases as much-needed supplemental documentation of GenBank and will

>be making some announcements soon.  We should consider if and how

>OtherCatalogNumber might be used by GenBank's "Entrez" for LinkOuts to

>MaNIS.

 

MaNIS can support one to many relationships if we want it to. If you look

at the original DC2 specification you'll see that I didn't include some

fields that would require one-to-many relationships to be really useful.

The most important one is RelatedCatalogedItem. Presumably if a specimen

can be related to another one, it can be related to many other ones, even

across collections. If there is interest in resurrecting this concept for

MaNIS I am willing to do so.
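
To make the one-to-many option concrete, here is a minimal sketch using an in-memory SQLite database. The table layout, column names, catalog numbers, and the second accession number are hypothetical illustrations, not the actual UAM or MaNIS schema; only the field names Other_ID and Other_ID_Type and the example search come from the discussion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE specimen (
    catalog_number TEXT PRIMARY KEY,
    taxon          TEXT,
    country        TEXT
);
-- One specimen may have many (Other_ID_Type, Other_ID) pairs.
CREATE TABLE other_id (
    catalog_number TEXT REFERENCES specimen(catalog_number),
    other_id_type  TEXT,
    other_id       TEXT
);
""")

# Made-up sample rows for illustration only.
conn.executemany("INSERT INTO specimen VALUES (?, ?, ?)", [
    ("UAM 1", "Sorex", "Russia"),
    ("UAM 2", "Sorex", "USA"),
    ("UAM 3", "Microtus", "Russia"),
])
conn.executemany("INSERT INTO other_id VALUES (?, ?, ?)", [
    ("UAM 1", "GenBank accession number", "Z123456"),
    ("UAM 1", "collector number", "ABC 42"),
    ("UAM 3", "GenBank accession number", "Z654321"),
])


def search(taxon, id_type_prefix, country):
    """Specimens of a taxon from a country that carry a given ID type."""
    rows = conn.execute(
        """
        SELECT s.catalog_number, o.other_id
        FROM specimen AS s
        JOIN other_id AS o ON o.catalog_number = s.catalog_number
        WHERE s.taxon = ? AND o.other_id_type LIKE ? AND s.country = ?
        ORDER BY s.catalog_number
        """,
        (taxon, id_type_prefix + "%", country),
    )
    return rows.fetchall()
```

Calling `search("Sorex", "GenBank", "Russia")` on this sample data returns the single specimen that has a GenBank number. With the type and value concatenated into one string, the same search would need pattern matching inside that string instead of a comparison on a separate column.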

 

 

 

>>> Posting number 317, dated 20 Sep 2002 15:28:31

 

>>> Posting number 318, dated 20 Sep 2002 17:42:08

Date:         Fri, 20 Sep 2002 17:42:08 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: UAM missing Concepts

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii; format=flowed

Content-Transfer-Encoding: 7bit

 

Some remarks on two segments below:

   

 

> 

> Features are interesting and useful, as XXXX also pointed out in a previous

> message. They can also be problematic the way they have been implemented at

> MVZ and UAM. In short, it isn't possible to have a specimen located in more

> than one feature. This means you couldn't have a specimen that was both in

> Yosemite National Park and in the Sierra Nevada Range.  Recognizing this

> problem and seeing spatial query capabilities on the horizon, we at MVZ

> have backed off of the wholesale use of the Feature field. I suggest that

> its eventual fate is in question, so we probably shouldn't try to include it.

 

I do not disagree with your reasoning, but between now and "the

horizon," we might find there was substantial use for this.  We

definitely have some fuzzy stuff in there but we could restrict the use

in MaNIS to units that collections have in common and units for which

there are unequivocal boundaries, e.g., national and state parks,

wildlife refuges, etc.

 

> 

> MaNIS can support one to many relationships if we want it to. If you look

> at the original DC2 specification you'll see that I didn't include some

> fields that would require one-to-many relationships to be really useful.

> The most important one is RelatedCatalogedItem. Presumably if a specimen

> can be related to another one, it can be related to many other ones, even

> across collections. If there is interest in resurrecting this concept for

> MaNIS I am willing to do so.

 

UAM's vote is clear.  There will be some specimens split among the

present MaNIS collections.   I know we have vouchers for some tissues at

Texas Tech, etc.  But in the long run, specimen databases should be able

to link to result databases.

 

XXXXXX

 

>>> Posting number 319, dated 21 Sep 2002 12:18:02

Date:         Sat, 21 Sep 2002 12:18:02 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Datum

In-Reply-To:  <3D8BF1A9.C7E952C9@oz.net>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXXXX and all,

 

Good find. You definitely should use a datum when you can find it. If you

use "datum not recorded" you'll add a systematic error of 1 km for

localities outside of the area shown on the map in the "Uncertainty due to

an unknown datum" section of the Georeferencing Guidelines Document on the

MaNIS website.  As Barbara suspected, I did go ahead and add "Japanese

Geodetic Datum 2000" to the list of datums from which to choose in the

GeorefCalculator.  Thanks for bringing this new Datum to our attention.

 

John

 

>XXXXXXXXXXXX wrote:

> 

> > Barbara,

> >

> > Could you give me some advice as to how to deal with Datum on the

> > Georeferencing Calculator?

> >

> > In the pop-up list for 'Datum' in the Calculator, a variety of coordinate

> > systems adopted worldwide or locally are listed including Tokyo Datum.

> >

> > However, the datum of the maps I am using for geocoding is not listed there and

> > differs from Tokyo Datum, the traditional standard that had long been

> > adopted in Japan until a few years ago.

> > If I use the newest Japanese maps, which are based on the Japanese

> > Geodetic Datum 2000 (JGD2000), should I choose 'not recorded' from the

> > pop-up list of datums?

> >

> > For your convenience in understanding the difference between these two

> > geodetic data, the following site may help.

> >

> > http://ivs.crl.go.jp/mirror/publications/gm2002/imakiire/

> >

> > Thank you very much for your assistance.

> >

> > - XXXXXX

> >

 

 

>>> Posting number 320, dated 23 Sep 2002 11:31:28

Date:         Mon, 23 Sep 2002 11:31:28 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      updates to existing files

In-Reply-To:  <3D8BCE70.8010102@uaf.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

         This is primarily a question for John, but of relevance to all.

 

         It is my understanding that we are not to make any changes to

locality data in our existing databases, because it would mess things up

for John (I don't know the details, but don't need to).

         However, it is not clear whether it would cause problems if we

update IDs.  We have a continual stream of re-identifications that come in

for many taxa, from our own on-going research and from people who borrow

our material for study.  It would be quite inconvenient to wait for two

years to update these changes in identification.

         What about other fields, such as date, collector, etc., if we

happen to notice misspellings, errors, etc.?

         John, your thoughts?  Anyone else concerned about this?

 

                                                                         XXXXX

 

 

 

 

>>> Posting number 321, dated 23 Sep 2002 12:48:01

Date:         Mon, 23 Sep 2002 12:48:01 -0400

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: updates to existing files

Mime-Version: 1.0

Content-Type: text/plain; charset=US-ASCII

Content-Transfer-Encoding: quoted-printable

 

My understanding of the georeferencing data is that any non-geographic

data within the "home db" can be edited without impact to the MaNIS

mission.  I will leave it to John to clearly explain the locality part.

 

On a similar line though, it occurred to me the other day, as I was

deleting 102 duplicate records from our db, that record deletion may well

be a real problem for John.  As I understand it, new records to our db

would not pose a similar problem.  Correct, John?

 

XXXXX

 

 

 

>>> Posting number 322, dated 23 Sep 2002 13:00:11

 

>>> Posting number 323, dated 23 Sep 2002 22:02:29

Date:         Mon, 23 Sep 2002 22:02:29 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         Barbara Stein <bstein@OZ.NET>

Subject:      Re: updates to existing files

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii

Content-Transfer-Encoding: 7bit

 

> My understanding of the georeferencing data is that any non-geographic data within the "home db" can

be edited without impact to the MaNIS mission.  I will leave it to John to clearly explain the locality

part.

 

I agree with XXXXX that you are free to update any non-geography fields in your institutional dbs.

 

When John extracted the distinct set of localities from each of your dbs, he assigned a unique

identifier to each locality that will allow him to reassociate that locality with all of the specimen

records with which it was previously associated.  Being

reductionist, I will simply say that if you alter the geography in your dbs, you may seriously

compromise that reassociation process.  That would include deleting records.  He's a clever guy, but the

fewer problems, the better.

 

You are all familiar with the unique geog "ID" that John added to each of the files you download for

georeferencing.  Even if you do not disturb the geog. data in your dbs, altering this identifier will

also seriously compromise his ability to reassociate the

georeferenced localities with your specimen records.

 

That having been said, I would ask you to refrain from using the abbreviation "ID" when you are

referring to taxonomic identifications.  "IDs" have a specific meaning in a db context.  They are

identifiers, not identifications, and reserving the use of "ID" for

identifiers is just one easy but important way to lessen confusion in future postings.

 

So do keep updating your taxonomy, dates of collection, prep types, coll. names and nos. etc.  We

absolutely want the best data we can get on the network.  And keep posting questions.  There is a lot

going on and it is good to reiterate these things periodically.

 

Best,

Barbara

 

> On a similar line though, it occurred to me the other day, as I was deleting 102 duplicate records

from our db that record deletion may well be a real problem for John.  As I understand it new records to

our db would not pose a similar problem.  Correct John?

> 

> XXXXX

> 

 

>>> Posting number 324, dated 25 Sep 2002 09:37:52

 

>>> Posting number 325, dated 30 Sep 2002 10:30:05

 

>>> Posting number 326, dated 1 Oct 2002 20:42:54

 

>>> Posting number 327, dated 2 Oct 2002 16:30:47

 

>>> Posting number 328, dated 2 Oct 2002 17:54:45

 

>>> Posting number 329, dated 2 Oct 2002 22:52:02

 

>>> Posting number 330, dated 3 Oct 2002 10:30:11

Date:         Thu, 3 Oct 2002 10:30:11 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: updates to existing files

In-Reply-To:  <3D8FF1E5.BED78CD7@oz.net>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

Sorry to have been preoccupied for so long. I'm trying to catch up again,

so here goes.

 

Additions of new records to your databases have no bearing on the

georeferencing process. Changing data does have the potential to impact the

re-association of georeferences to specimens, but only in a limited set of

circumstances, as follows:

 

1) changing catalog numbers

2) removing specimen records

3) changing locality descriptions

 

The first case isn't likely to happen very often, but if it does, it's

still not the end of the world - we can still match the localities for

those records to ones in the gazetteer.

The second case isn't much of an issue; if the specimen record isn't there

anymore, it's not going to be of much use to georeference it anyway.

The third case is the most difficult one. If a locality description has

changed in your database, it will show up as being different when it comes

time to re-associate georeferences with specimens, because part of the

process will be to compare the original to the then-current locality. At

that point it will be necessary to look at the gazetteer version next to

your new version to see if the new locality is actually a different place.

If it is a different place, throw away the georeference for it. If it isn't

substantively different, use the georeference. Not such a big deal unless

you've systematically changed locality descriptions.
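
The comparison step described above can be sketched as follows. This is only an illustration under stated assumptions: the dictionaries keyed by a locality identifier and the function name `reassociate` are hypothetical, and the actual MaNIS re-association process may differ.

```python
def reassociate(gazetteer, current_localities):
    """Sort gazetteer entries by what has happened to them in the home db.

    gazetteer: dict of locality id -> (original description, georeference)
    current_localities: dict of locality id -> then-current description

    Returns (reuse, review, missing):
      reuse   - id -> georeference, description unchanged
      review  - (id, original, current) tuples a person must compare to
                decide whether the new text describes a different place
      missing - ids no longer present in the home database
    """
    reuse, review, missing = {}, [], []
    for loc_id, (original, georef) in gazetteer.items():
        current = current_localities.get(loc_id)
        if current is None:
            missing.append(loc_id)
        elif current == original:
            reuse[loc_id] = georef
        else:
            review.append((loc_id, original, current))
    return reuse, review, missing
```

Only the `review` list costs human time, which is why systematic edits to locality descriptions are the expensive case.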

 

So, make all of the changes you like, but try to avoid changes to locality

descriptions. Misspellings and inconsistencies will be in the reports you

get back for your localities when georeferencing is done. And once the

georeferencing is done, it won't matter nearly as much how clear or

consistently formatted your localities are - someone will have already been

through the pain to figure out where they really are.  In addition, there

will be a whole set of localities that were not possible to georeference

for one reason or another, so those who are truly bored will have a nice

set of problem localities to spend their evenings and weekends in solving.

 

John

 

At 10:02 PM 9/23/02 -0700, you wrote:

> > My understanding of the georeferencing data is that any non-geographic

> data within the "home db" can be edited without impact to the MaNIS

> mission.  I will leave it to John to clearly explain the locality part.

> 

>I agree with XXXXX that you are free to update any non-geography fields in

>your institutional dbs.

> 

>When John extracted the distinct set of localities from each of your dbs,

>he assigned a unique identifier to each locality that will allow him to

>reassociate that locality with all of the specimen records with which it

>was previously associated.  Being

>reductionist, I will simply say that if you alter the geography in your

>dbs, you may seriously compromise that reassociation process.  That would

>include deleting records.  He's a clever guy, but the fewer problems, the

>better.

> 

>You are all familiar with the unique geog "ID" that John added to each of

>the files you download for georeferencing.  Even if you do not disturb the

>geog. data in your dbs, altering this identifier will also seriously

>compromise his ability to reassociate the

>georeferenced localities with your specimen records.

> 

>That having been said, I would ask you to refrain from using the

>abbreviation "ID" when you are referring to taxonomic

>identifications.  "IDs" have a specific meaning in a db context.  They are

>identifiers, not identifications, and reserving the use of "ID" for

>identifiers is just one easy but important way to lessen confusion in

>future postings.

> 

>So do keep updating your taxonomy, dates of collection, prep types, coll.

>names and nos. etc.  We absolutely want the best data we can get on the

>network.  And keep posting questions.  There is a lot going on and it is

>good to reiterate these things periodically.

> 

>Best,

>Barbara

> 

 

>>> Posting number 331, dated 4 Oct 2002 09:05:25

Date:         Fri, 4 Oct 2002 09:05:25 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: updates to existing files

In-Reply-To:  <5.0.0.25.2.20021003095534.02637790@socrates.berkeley.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Hi John,

 

Thanks for the message.  Here at the Field, when we say "locality

descriptions" we are often referring to the specific locality AND the

elevation AND the lat and long.  As you know, we have different fields for

each.  When you say we should not change locality descriptions are you

referring to all three fields or just the specific locality?  I suspect

that we are free to change incorrect lat and longs (since that is what the

project is about), but I want to be sure before I change any incorrect

elevations.

 

So for example, if I come across

 

"Chicago, 50 m" in the specific locality field I should leave it alone

 

but if I come across "Chicago" in the specific locality field and "50 m" in

the elevation field can I change the 50?

 

We are also wondering whether we might set up a copy of our specific

locality on our in-house database that we could correct as we come across

problems.  The idea would be to copy this over to the "proper" catalogue

after you are finished with the reassociation of georeferences to our

specimens.  In this way we could make corrections as we come across

problems rather than putting them in my "TO DO" pile that currently extends

through the roof of the Field Museum and has a red blinking light on top of

it to warn planes.  Any thoughts on this?

 

Cheers,

 

XXXX

 

 

>>> Posting number 332, dated 4 Oct 2002 10:43:04

Date:         Fri, 4 Oct 2002 10:43:04 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: updates to existing files

In-Reply-To:  <5.1.0.14.1.20021004082747.02103ec0@mail.fmnh.org>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXX, and all,

 

I agree with your concept of the locality - it includes any piece of

information, no matter which field it "lives in," that helps to describe

the spatial bounds of a collecting event. So, State/Province, County,

Specific Locality, Township/Range, Elevation, Depth, UTM coordinates,

Lat/Long, are all parts of the locality description.

 

I can appreciate your imagery of the changes backlog, and the problems I've

caused. I don't really want you to have to change your daily business while

waiting for the georeferencing to get done, but I don't want you to go out

of your way to clean up localities now because it will end up creating more

work for you later.

 

Here's a way to minimize the work later while allowing you to decrease the

aviation hazard now.  Make a column in your database that can flag a record

as having been changed with respect to locality. For example, you could

create a column called "GeoreferenceAgain." Whenever you edit a locality in

such a way that it would be georeferenced differently, put a "yes" in this

field. Whenever you edit a locality, but it wouldn't change how you

georeference it, put a "no" in the GeoreferenceAgain field. The point is to

always put something in there if you edit the locality, that way you (and

I) know how to deal with it when re-associating coordinates with specimens.

 

Alternatively, you could do this with remarks in a Comments field if you

are very careful to be consistent about how you record the information so

that it can be parsed by a computer later. For example, if you edit a

locality in such a way that it would be georeferenced differently, append

"; Georeference again" to the end of your Comments field, and if you edit

the locality, but the change wouldn't affect the georeference, append "; do

not Georeference again" to the Comments field.

 

If you accept one of my solutions, you could go ahead and change "Chicago,

50m" to have "Chicago" in the SpecificLocality field and "50m" in the

elevation field. Then put "no" in the GeoreferenceAgain field or "; do not

Georeference again" at the end of a Comments field.

 

Suppose you had "Colorado River, 100ft" in your SpecificLocality field and

you came across some information that placed the collecting event at 100m

instead. This change would clearly affect how you georeference the

locality. Thus, regardless of whether you just change the SpecificLocality

field to be "Colorado River, 100m" or if you set the SpecificLocality to

"Colorado River" and set the Elevation to "100m", you should put "yes" in

the GeoreferenceAgain field, or put "; Georeference again" at the end of

the Comments field.

 

If you can commit to a solution such as the ones I've described above,

there will be no extra work later to figure out if a locality that you've

edited in your database has changed enough to warrant a new georeference.
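
As a sketch of how the Comments convention could be parsed later, assuming the two markers are always appended exactly as described and always sit at the end of the field (the function name is hypothetical):

```python
def needs_new_georeference(comments):
    """Read the trailing marker from a Comments field.

    Returns True for "; Georeference again", False for
    "; do not Georeference again", and None when neither marker is
    present (the locality was not edited).
    """
    text = comments.strip().lower()
    # Check the longer marker first so the two can never be confused.
    if text.endswith("; do not georeference again"):
        return False
    if text.endswith("; georeference again"):
        return True
    return None
```

A dedicated GeoreferenceAgain column avoids this parsing altogether, which is why consistency matters so much if the Comments route is taken.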

 

Hope that helps,

 

John

 

 

>>> Posting number 333, dated 10 Oct 2002 16:55:34

 

>>> Posting number 334, dated 10 Oct 2002 18:10:52

 

>>> Posting number 335, dated 11 Oct 2002 17:19:29

 

>>> Posting number 336, dated 11 Oct 2002 17:48:29

 

>>> Posting number 337, dated 14 Oct 2002 11:09:14

 

>>> Posting number 338, dated 14 Oct 2002 09:21:04

 

>>> Posting number 339, dated 14 Oct 2002 10:34:42

 

>>> Posting number 340, dated 15 Oct 2002 09:01:44

 

>>> Posting number 341, dated 16 Oct 2002 11:25:41

 

>>> Posting number 342, dated 16 Oct 2002 11:38:42

 

>>> Posting number 343, dated 17 Oct 2002 10:31:07

 

>>> Posting number 344, dated 21 Oct 2002 16:30:30

 

>>> Posting number 345, dated 23 Oct 2002 15:25:41

 

>>> Posting number 346, dated 26 Oct 2002 12:50:33

 

>>> Posting number 347, dated 26 Oct 2002 19:56:01

 

>>> Posting number 348, dated 27 Oct 2002 13:21:08

 

>>> Posting number 349, dated 27 Oct 2002 12:26:49

 

>>> Posting number 350, dated 28 Oct 2002 09:31:32

 

>>> Posting number 351, dated 28 Oct 2002 13:02:50

 

>>> Posting number 352, dated 28 Oct 2002 14:22:20

 

>>> Posting number 353, dated 29 Oct 2002 13:39:47

 

>>> Posting number 354, dated 30 Oct 2002 14:53:58

 

>>> Posting number 355, dated 31 Oct 2002 18:22:33

 

>>> Posting number 356, dated 1 Nov 2002 15:19:31

 

>>> Posting number 357, dated 2 Nov 2002 02:32:36

 

>>> Posting number 358, dated 2 Nov 2002 03:40:24

Date:         Sat, 2 Nov 2002 03:40:24 -0600

Reply-To:    

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         

Subject:      Apology

MIME-version: 1.0

Content-type: text/plain; format=flowed; charset=us-ascii

Content-transfer-encoding: 7BIT

 

Dear All,

 

I am so sorry that the last message I sent should have been addressed

personally, not to the mailing list.

 

Instead, let me talk about this example that I just encountered while I was

working on georeferencing.  Someone might give us a good suggestion.  The

records from 'Japan' include one from the Taiwan area.  As some of you know,

Taiwan used to be also called Formosa for a certain period of time

including when it was politically under control of the Japanese government.

Presumably the specimen was collected and recorded during that time.  When

we publish the georeference data of this specimen on the MANIS platform,

are there, if any, standards or agreements we should be ethically aware of,

especially when we deal with such potentially sensitive issues in terms of

the geographic regions which may have experienced change in their higher

geographic (country) names for any political or historical reasons?

 

 

>>> Posting number 359, dated 4 Nov 2002 13:37:40

 

>>> Posting number 360, dated 5 Nov 2002 20:36:41

Date:         Tue, 5 Nov 2002 20:36:41 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         Barbara Stein <bstein@OZ.NET>

Subject:      Re: Apology

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii

Content-Transfer-Encoding: 7bit

 

> The records from 'Japan' include one from the Taiwan area.  As some of you know,

> 

> Taiwan used to be also called Formosa for a certain period of time

> including when it was politically under control of the Japanese government.

> Presumably the specimen was collected and recorded during that time.  When

> we publish the georeference data of this specimen on the MANIS platform,

> are there, if any, standards or agreements we should be ethically aware of,

> especially when we deal with such potentially sensitive issues in terms of

> the geographic regions which may have experienced change in their higher

> geographic (country) names for any political or historical reasons?

 

XXXXXX,

 

I agree completely that sensitivity to geographic place name changes is one of

the most important and difficult issues data managers of natural history

information confront.  Because it is essential that our databases are both

useful and historically accurate, the best suggestion I have is to create two

fields in your database, one for verbatim locality, i.e., the locality as it

was exactly written on the specimen tag or in the field notes at the time of

collection, and specific locality, i.e., the currently recognized locality.

This approach respects history while providing users with appropriate

information to assess species distributions, and both localities can be

provided to users in response to queries.  It is usually extremely awkward to

generate complete result sets without querying on cleaned-up specific

localities.  At the same time, it is often only possible to make sense of the

resulting data by having access to the verbatim locality.

 

Best,

Barbara

 

>>> Posting number 361, dated 12 Nov 2002 12:09:12

 

>>> Posting number 362, dated 16 Nov 2002 16:09:32

 

>>> Posting number 363, dated 19 Nov 2002 12:16:55

 

>>> Posting number 364, dated 19 Nov 2002 14:34:37

 

>>> Posting number 365, dated 20 Nov 2002 16:00:18

 

>>> Posting number 366, dated 25 Nov 2002 09:30:52

 

>>> Posting number 367, dated 11 Dec 2002 11:11:11

 

>>> Posting number 368, dated 11 Dec 2002 09:41:46

 

>>> Posting number 369, dated 8 Jan 2003 15:59:43

 

>>> Posting number 370, dated 9 Jan 2003 04:24:02

Date:         Thu, 9 Jan 2003 04:24:02 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Idaho time trials

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding:  quoted-printable

 

Dear All:

 

 

 

Below are my Idaho time trial summaries for informational purposes (for

herp and bird efforts).  Unlike Oregon georeferencing, I kept track of

hours spent on various aspects.  Of the total records georeferenced:

 

Method used                            Total    % of Total
georef'd prior to download 7/27/02       121          5.7%
GNIS-calculated lat/long                1568         74.2%
Topo USA 4.0                             299         14.1%
TRS (TRS2LL batch)                        31          1.5%
Insufficient data                         95          4.5%
Grand Total                             2114        100.0%

 

Comments:

 

GNIS records were done by looking up the placename and its lat/long, then

calculating the locality's lat/long from the placename using offsets and a

macro similar to the web-based calculator.  The error calculation is done

simultaneously.
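
The offset step described above can be sketched roughly as follows. This is a local flat-earth approximation written for illustration, not the macro or the web-based calculator themselves; the function name, the miles-per-degree constant, and the sample coordinates are all assumptions:

```python
import math

MILES_PER_DEGREE_LAT = 69.0  # rough average; an assumption for illustration


def offset_point(lat, lon, distance_mi, bearing_deg):
    """Shift a named place's coordinates by a distance along a compass
    bearing (0 = north, 90 = east) using a local flat-earth approximation."""
    bearing = math.radians(bearing_deg)
    north_mi = distance_mi * math.cos(bearing)
    east_mi = distance_mi * math.sin(bearing)
    new_lat = lat + north_mi / MILES_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    miles_per_degree_lon = MILES_PER_DEGREE_LAT * math.cos(math.radians(lat))
    new_lon = lon + east_mi / miles_per_degree_lon
    return new_lat, new_lon


# Hypothetical locality: "5 mi N of" a place at 43.6 N, 116.2 W.
lat, lon = offset_point(43.6, -116.2, 5.0, 0.0)
print(round(lat, 4), round(lon, 4))
```

For offsets of a few miles the approximation is adequate for a sketch; a production tool would use a proper geodetic calculation and the datum of the source coordinates.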

 

Topo USA 4.0 was used for road miles, creek miles, etc.

 

TRS records were done using the batch program for western states

(TRS2LL).  Lat/longs for ¼ sections and ¼ of ¼ section centers

were calculated using the lat/long calculator.

 

Insufficient data (no specific locality, inconsistent information,

placenames not found).

 

Hours    Task
12.33    GNIS lookup of placename and lat/long and offset entry
17       lookup extents
16       Road miles, creek/river miles, junctions and TRS
15       Proofing, reading back through placenames and offsets and
         plotting calculated lat/longs by county
48.5     Total hours

 

.....

 

41.1 records/hr (1993/48.5)
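
The rates in this summary can be rechecked from the counts and hours reported in the message. The variable names below are my own; the figures are the ones given in the tables and in the GNIS discussion:

```python
total_records = 2114
prior_georefs = 121          # georeferenced before the 7/27/02 download
gnis_records = 1568
total_hours = 48.5           # total as stated in the hours table
gnis_lookup_hours = 12.33
extent_hours = 17

records_handled = total_records - prior_georefs
overall_rate = records_handled / total_hours
gnis_rate_with_extents = gnis_records / (gnis_lookup_hours + extent_hours)
gnis_rate_without_extents = gnis_records / gnis_lookup_hours

print(records_handled)                        # 1993
print(round(overall_rate, 1))                 # 41.1
print(round(gnis_rate_with_extents, 1))       # 53.5
print(round(gnis_rate_without_extents, 1))    # 127.2
```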

 


 

Greatest speed was in doing the GNIS records.  Even with 17 hr for

extents the rate was 53.5 records/hr (1568/29.33).  Eliminating extents

or putting them off for later would raise the rate to 127.2 records/hr

(1568/12.33).  This rate was

 

Summaries of extents and error radii:

 

(Only 3 records had units in km, which were converted to miles for the

summary of extents and errors.)

 

Extents of placenames were done as outlined in the MaNIS instructions.

The contribution of extent to error differs depending on the number of

offsets.

 

Range in mi           Count
extent < 0.1            515
0.1 < extent < 1        466
1 <= extent < 5         671
5 <= extent < 10        186
10 <= extent < 20        47
20 <= extent < 30         8
30 <= extent < 50         5
not done                216
Grand Total            2114

 

Error radii were not as large as I thought they would be.  The greatest

variable contributing to error seems to be the precision (N, NE, ENE) used

to describe the direction of the offset.
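
One way to see why heading precision matters so much: a heading given as N, NE, or ENE could stand for any bearing within half a compass step of the named one, and the sideways displacement that allows grows in proportion to the offset distance. The sketch below is a simplified geometric illustration under that assumption, not the MaNIS error calculation; the function name and the 10-mile example are hypothetical:

```python
import math

# Half-width of the bearing range implied by each heading vocabulary.
HALF_WIDTH_DEG = {
    "4-point (N, E, S, W)": 45.0,
    "8-point (adds NE, SE, ...)": 22.5,
    "16-point (adds ENE, NNE, ...)": 11.25,
}


def direction_error_mi(distance_mi, half_width_deg):
    """Chord between the point on the nominal bearing and the point on the
    edge of the bearing range, both at the given offset distance."""
    return 2.0 * distance_mi * math.sin(math.radians(half_width_deg) / 2.0)


for label, half_width in HALF_WIDTH_DEG.items():
    print(label, round(direction_error_mi(10.0, half_width), 2))
```

Under these assumptions the contribution roughly halves each time the heading vocabulary doubles in resolution, and it scales linearly with the offset distance.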

 

Range in mi           Count
error < 0.1             218
0.1 < error < 1         250
1 <= error < 5          887
5 <= error < 10         371
10 <= error < 20        153
20 <= error < 30         13
30 <= error < 50          6
not done                216
Grand Total            2114

 

The Oregon summary was similar, except that about 20% were TRS records.

I didn't keep track of hours spent on various aspects, so I haven't posted

that summary.  The lower rate of 20/hr for Oregon was partly due to getting a

system down.

 

If getting paid by the record, choose those localities with published

datasets for placenames and lat/longs and electronic maps for extents.

If by the hour, I would still do those with placename datasets before

delving into the one-at-a-time records.  The approach I would suggest is

to use a batch or semi-batch system, like mine, to do the US.

 

I'm doing British Columbia now.  I used the NIMA dataset (2500 records),

then discovered the BCGNIS dataset with 42K records.  However, maps are

not interactive (no Topo Canada) and extents may have to be done as

estimates.  Unlike NIMA and (US) GNIS, the datum may have to be unknown.

 

>>> Posting number 371, dated 9 Jan 2003 04:27:38

Date:         Thu, 9 Jan 2003 04:27:38 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      PSM will do MMNH Idaho and Oregon records

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: 7bit

 

John:  Just to make it official.  There are 100 Oregon and about 10 Idaho.

 

>>> Posting number 372, dated 9 Jan 2003 09:31:34

Date:         Thu, 9 Jan 2003 09:31:34 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Idaho time trials

In-Reply-To:  <BAY1-DAV65FYfYpwMIH0001048e@hotmail.com>

Mime-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"; format=flowed

Content-Transfer-Encoding: quoted-printable

 

XXXX,

 

Thanks for these wonderful stats. You're right, there are lots of folks who

will be interested to know this information, not the least of whom is me.

It gives us a good basis from which to measure the efficacy of

technological and/or methodological innovations.  Again, thanks,

 

John

 

 

 

 

 

>>> Posting number 373, dated 9 Jan 2003 15:07:18

 

>>> Posting number 374, dated 14 Jan 2003 12:48:13

 

>>> Posting number 375, dated 14 Jan 2003 10:49:37

 

>>> Posting number 376, dated 14 Jan 2003 15:08:42

 

>>> Posting number 377, dated 14 Jan 2003 16:05:30

 

>>> Posting number 378, dated 14 Jan 2003 16:09:19

 

>>> Posting number 379, dated 20 Jan 2003 08:50:00

 

>>> Posting number 380, dated 22 Jan 2003 13:56:17

 

>>> Posting number 381, dated 23 Jan 2003 11:09:35

 

>>> Posting number 382, dated 24 Jan 2003 12:52:37

 

>>> Posting number 383, dated 24 Jan 2003 17:53:34

 

>>> Posting number 384, dated 25 Jan 2003 12:59:55

Date:         Sat, 25 Jan 2003 12:59:55 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Georeferencing status check

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

Georeferencing has been going quite smoothly, overall, and I'd like to

thank all of the participants for their contributions. As more regions get

claimed and georeferenced, it's becoming more difficult to determine what

remains to be done. In an effort to make things easier, I'm trying to put

together a new interface for the CheckList page, including a graphic

depiction of our progress to date. To do that, I'm asking each

participating institution to send me (it doesn't have to go on the

listserv) a brief message telling me how much is finished for regions that

have been claimed but not yet submitted. It'll help me to get an accurate

accounting and thereby determine how we are progressing overall. I will

assemble the status page with information received by 31 Jan 2003, so

please get back to me with your status by then.

 

Just so you all know, MVZ has 1143 localities to go to finish the 58543

localities for California, 1170 localities remaining of the 3389 for Costa

Rica, and we haven't yet begun with The Netherlands, Argentina, or

Arizona.  If there is anyone out there who is interested in, ready, and has

the resources to georeference Arizona, the MVZ can relinquish it for

another, unclaimed region. Speak soon or we're likely to start on it as

soon as California is out of the way.

 

John

 

>>> Posting number 385, dated 26 Jan 2003 08:28:36

 

>>> Posting number 386, dated 27 Jan 2003 11:53:41

Date:         Mon, 27 Jan 2003 11:53:41 -1000

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      procedural question

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: quoted-printable

 

What are people doing in cases where you can't find the given locality

in gazetteers or on maps, but another institution has already provided

coordinates (in checking the coordinates on the map I see no place name

similar to the one given for locality)?  Do you use the coordinates from

the other institution?  What do you enter for the various fields needed

to find max error?

 

Sorry if this question has already been answered somewhere.

 

XXXXX

 

 

>>> Posting number 387, dated 28 Jan 2003 10:01:19

Date:         Tue, 28 Jan 2003 10:01:19 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: procedural question

In-Reply-To:  <9325D4CE29553F42A57E02B845C067F937D068@exchange.corp.bisho

              pmuseum.org>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear XXXXX, and all,

 

It seems to me that if the coordinates don't help you to find a locality on

a map then there is a fair chance that there is actually something wrong

with the coordinates. If you've exhausted your available map resources

without finding the place, I would put, for example, "unable to locate Burt

Ranch" in the NoGeorefBecause field. In the DeterminationRef you could

still include the references on which you were unable to find the locality.

In LatLongRemarks I would add, for example, "data from MVZ suggest that

this locality is in the vicinity of 34.34 -120.43."  The bottom line is

that we would like to avoid propagating suspect (or undocumented) data.
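
As a rough sketch of this convention, the record below is flagged as not georeferenced while the other institution's coordinates survive only as a remark. The field names NoGeorefBecause, DeterminationRef, and LatLongRemarks come from the message; the function, the coordinate field names, and the list of references checked are assumptions made for illustration:

```python
def mark_unlocated(record, place, references_checked, other_source=None, hint=None):
    """Return a copy of a locality record flagged as not georeferenced,
    keeping another institution's coordinates only as a remark."""
    updated = dict(record)
    updated["NoGeorefBecause"] = f"unable to locate {place}"
    updated["DeterminationRef"] = "; ".join(references_checked)
    if other_source and hint:
        lat, lon = hint
        updated["LatLongRemarks"] = (
            f"data from {other_source} suggest that this locality "
            f"is in the vicinity of {lat} {lon}"
        )
    return updated


# Hypothetical record and reference list, following the Burt Ranch example.
record = {"Locality": "Burt Ranch", "DecimalLatitude": None, "DecimalLongitude": None}
result = mark_unlocated(
    record,
    place="Burt Ranch",
    references_checked=["GNIS", "USGS 7.5' topo sheets"],
    other_source="MVZ",
    hint=(34.34, -120.43),
)
print(result["NoGeorefBecause"])
print(result["LatLongRemarks"])
```

The coordinate fields stay empty, so the suspect values are not propagated as a determination, yet nothing the other institution supplied is lost.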

 

John

 

 

>>> Posting number 388, dated 28 Jan 2003 12:29:35

Date:         Tue, 28 Jan 2003 12:29:35 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: procedural question

In-Reply-To:  <5.1.1.5.2.20030128093301.017f9138@socrates.berkeley.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

John,

I appreciate your intent in adopting this convention, but I would urge that it

be applied sparingly to historic collections and overseas localities.  As one

example, many Neotropical collectors in the early 20th century based their

localities on the Map of the Hispanic Americas.  I'm guessing that relatively

few collections have this source at hand during geo-referencing.  A set of

coordinates rather laboriously determined by hand from a place name (tag) and

the atlas (MHA) or even a hand-drawn map included in field notes could be lost

or demoted to an ancillary field not used in mapping by your method.

 

The bottom line seems to be "What sort of effort would be required to exhume

it?"

 

Anyhow, some thoughts on the basis of past experience with our collection....

XXXXX

 

 

 

>>> Posting number 389, dated 28 Jan 2003 09:17:12

Date:         Tue, 28 Jan 2003 09:17:12 -1000

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      procedural question

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: quoted-printable

 

Actually, I quite like John's solution.  It does not comment on the

validity of the coordinates provided by another institution, but it does

tell the institution whose locality is currently in question that I was

not able to find the locality in my sources but here is what someone

else has come up with, check it out and decide for yourself.  In this

way we don't, as John says, keep passing on possibly erroneous

information but we don't lose it either.  If either institution can

validate the data later, the full georeferencing data can be filled in

at that time.

 

XXXXX

 

 

>>> Posting number 390, dated 29 Jan 2003 07:59:20

Date:         Wed, 29 Jan 2003 07:59:20 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: procedural question

In-Reply-To:  <9325D4CE29553F42A57E02B845C067F91EDAA3@exchange.corp.bisho

              pmuseum.org>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

Perhaps in my hurry to rush off to lecture I left too much unsaid in the

earlier message--permit me to elaborate a bit.

 

Place names change over time, as we all appreciate. Old maps depict old place

names (albeit sometimes with crudely determined coordinates). If one is dealing

with old specimens referenced with old place names, then it sometimes happens

that only old maps will register those localities. If a place has been renamed,

or simply abandoned, it won't appear on a modern map, no matter the detail or

accuracy of the latter. And if someone has already determined that, and gone to

the trouble of determining its coordinates, wouldn't we be far better off

leaving those coordinates in the database?

 

I believe a better arbiter of a "suspect" distributional record is an otherwise

extralimital one. In the extreme, application of John's suggestion would

actually remove GPS-determined coordinates for a locality if (a) the fact that it

was recorded by the collector with a GPS were unnoted in the catalog, and (b)

the place name chosen for a reference point were too obscure or colloquial to

appear on the maps a given geo-referencer was using in his/her country/state

review. Personally, I would rather identify these after the fact--in

distributional context--rather than a priori, on whether they can be relocated.

By the way, all of these would lack the estimate of precision (accuracy) we are

associating with all newly determined coordinates--by mapping only records with

spatial error terms, we can automatically exclude such records, without ever

changing their coordinates.
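
The idea of mapping only records that carry a spatial error term can be sketched as a read-only filter. The record layout and the key names `lat`, `lon`, and `max_error_mi` are hypothetical:

```python
def mappable(records):
    """Keep only records that have coordinates AND a spatial error term;
    nothing is deleted or altered in the input."""
    return [
        r for r in records
        if r.get("lat") is not None
        and r.get("lon") is not None
        and r.get("max_error_mi") is not None
    ]


# Hypothetical records for illustration only.
records = [
    {"id": 1, "lat": 34.34, "lon": -120.43, "max_error_mi": 1.2},
    {"id": 2, "lat": 10.05, "lon": -84.10, "max_error_mi": None},  # legacy coordinates, no error estimate
    {"id": 3, "lat": None, "lon": None, "max_error_mi": None},
]

to_map = mappable(records)
print([r["id"] for r in to_map])
```

Record 2 is left out of the map but keeps its coordinates in the database, which is the point of the suggestion.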

 

 As I suggested earlier, John's standard is a good one that will work with

recent collections being geo-referenced with comprehensive, well documented map

sources. I believe it will fail (meaning we lose information we already have in

digital form) on historic collections being georeferenced without all the

sources utilized by or available to the collectors. To safeguard against the

latter, perhaps persons trying to georeference specimens with undocumented

coordinates could contact the host museum(s) to inquire about possible

alternative data sources BEFORE eliminating those records' coordinates. I'd

sure hate to throw the baby out with the bathwater...

Bruce

 

> Actually, I quite like John's solution. It does not comment on the

> validity of the coordinates provided by another institution, but it does

> tell the institution whose locality is currently in question that I was

> not able to find the locality in my sources but here is what someone

> else has come up with, check it out and decide for yourself. In this

> way we don't, as John says, keep passing on possibly erroneous

> information but we don't lose it either. If either institution can

> validate the data later, the full georeferencing data can be filled in

> at that time.

 

 

 

>>> Posting number 391, dated 29 Jan 2003 08:59:09

Date:         Wed, 29 Jan 2003 08:59:09 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: procedural question

In-Reply-To:  <4.1.20030129075122.00965460@mail.fmnh.org>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii" ; format="flowed"

 

For what it is worth, I agree totally with what XXXXX writes below.

There are lots of named ranches here in the western US, given as

localities, that are identified on old (e.g., 1927) 7.5 minute USGS

topo sheets but that do not appear on more recent paper versions and

thus on any of the digitally available maps.

XXX

 

>Perhaps in my hurry to rush off to lecture I left too much unsaid in the

>earlier message--permit me to elaborate a bit.

> 

>Place names change over time, as we all appreciate. Old maps depict old place

>names (albeit sometimes with crudely determined coordinates). If one

>is dealing

>with old specimens referenced with old place names, then it sometimes happens

>that only old maps will register those localities. If a place has

>been renamed,

>or simply abandoned, it won't appear on a modern map, no matter the detail or

>accuracy of the latter. And if someone has already determined that,

>and gone to

>the trouble of determining its coordinates, wouldn't we be far better off

>leaving those coordinates in the database?

> 

>I believe a better arbiter of a "suspect" distributional record is

>an otherwise

>extralimital one. In the extreme, application of John's suggestion would

>actually remove a GPS-determined coordinates for a locality if (a) the fact it

>was recorded by the collector with a GPS were unnoted in the catalog, and (b)

>the place name chosen for a reference point were too obscure or colloquial to

>appear on the maps a given geo-referencer was using in his/her country/state

>review. Personally, I would rather identify these after the fact--in

>distributional context--rather than a priori, on whether they can be

>relocated.

>By the way, all of these would lack the estimate of precision

>(accuracy) we are

>associating with all newly determined coordinates--by mapping only

>records with

>spatial error terms, we can automatically exclude such records, without ever

>changing their coordinates

> 

>  As I suggested earlier, John's standard is a good one that will work with

>recent collections being geo-referenced with comprehensive, well

>documented map

>sources. I believe it will fail (meaning we lose information we

>already have in

>digital form) on historic collections being georeferenced without all the

>sources utilized by or available to the collectors. To safeguard against the

>latter, perhaps persons trying to georeference specimens with undocumented

>coordinates could contact the host museum(s) to inquire about possible

>alternative data sources BEFORE eliminating those records' coordinates. I'd

>sure hate to throw the baby out with the bathwater...

>Bruce

> 

>> Actually, I quite like John's solution. It does not comment on the

>> validity of the coordinates provided by another institution, but it does

>> tell the institution whose locality is currently in question that I was

>> not able to find the locality in my sources but here is what someone

>> else has come up with, check it out and decide for yourself. In this

>> way we don't, as John says, keep passing on possibly erroneous

>> information but we don't lose it either. If either institution can

>> validate the data later, the full georeferencing data can be filled in

>> at that time.

> 

 

 

>>> Posting number 392, dated 29 Jan 2003 09:24:28

Date:         Wed, 29 Jan 2003 09:24:28 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: procedural question

In-Reply-To:  <p05100303ba5db78db1d2@[128.32.214.36]>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

I don't disagree with any of these sentiments, but I want to clarify my

original response to the original questions posed by XXXXX, which were:

 

"What are people doing in cases where you can't find the given locality in

gazetteers or on maps, but another institution has already provided

coordinates (in checking the coordinates on the map I see no place name

similar to the one given for locality).  Do you use the coordinates from

the other institution?  What do you enter for the various fields needed to

find max error?"

 

Remember, we're supposed to georeference localities that *do not* already

have coordinates, and we're supposed to leave localities that have

coordinates unchanged. It was never the intention to throw anything away,

not even the bathwater.

 

 

 

>>> Posting number 393, dated 29 Jan 2003 11:39:31

Date:         Wed, 29 Jan 2003 11:39:31 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: procedural question

In-Reply-To:  <5.1.1.5.2.20030129090833.01727938@socrates.berkeley.edu>

MIME-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

Content-Transfer-Encoding: 7bit

 

I have a question regarding extents and existing data.  I have found

identical localities from different museums, but one has coordinates as

parsed by John.  What should be done with these at error/extent time?

 

XXXXXXXXXXXX

 

 

>>> Posting number 394, dated 29 Jan 2003 09:58:00

Date:         Wed, 29 Jan 2003 09:58:00 -1000

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: procedural question

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: quoted-printable

 

Regarding XXXX's question:  per instructions, I leave any records that

come with coordinates absolutely alone: "we're supposed to georeference

localities that *do not* already have coordinates, and we're supposed to

leave localities that have coordinates unchanged."  I do find

coordinates and do the whole deal for the records that don't have the

coordinates added yet.  This may seem like duplicating the

georeferencing when another museum has coordinates for what looks like

the exact same locality, but I do it for three reasons:

 

1.  I find that the parsed coordinates are often less precise than the

ones I come up with; for instance, the decimal lat and long are often

derived from coordinates that are accurate only to the nearest minute.

My measurements off of maps are far more accurate.

 

2.  It's a good cross check of the information for both institutions.

 

3.  As pointed out, the parsed records do not come with all the

information needed for determining maximum error.  Since I need to find

these out for the records that do not come with coordinates, I may as

well determine the coordinates while I'm at it.
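
To put point 1 in rough numbers: coordinates recorded only to the nearest minute can be off by up to half a minute in each of latitude and longitude. The sketch below assumes a nominal 69 miles per degree of latitude, a round figure chosen for illustration; the function name is hypothetical:

```python
import math

MILES_PER_DEGREE_LAT = 69.0  # nominal value, assumed for illustration


def nearest_minute_uncertainty_mi(lat_deg):
    """Worst-case displacement when both latitude and longitude are
    rounded to the nearest minute (half a minute off in each)."""
    half_minute_deg = 0.5 / 60.0
    north_south = half_minute_deg * MILES_PER_DEGREE_LAT
    east_west = north_south * math.cos(math.radians(lat_deg))
    return math.hypot(north_south, east_west)


print(round(nearest_minute_uncertainty_mi(45.0), 2))
```

At mid-latitudes this comes to several tenths of a mile before any extent or offset error is added, which is why re-measuring from a map can give a tighter result.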

 

Regarding XXXXX's concerns: no data is ever eliminated from the MaNIS

records.  The records that contain coordinates are never changed in any

way, so that historical data is left untouched for those institutions that

have already done the work.  My question had to do with situations when

the locality from institution A does not show up in any of my available

sources, but I do see that institution B has provided coordinates.

Following John's suggestion, I will now let institution A know why their

locality has not been georeferenced, and refer them to the information

provided by institution B.

 

XXXXX

 

-----Original Message-----

From:

Sent: Wednesday, January 29, 2003 7:40 AM

To: MAMMAL-Z-NET@USOBI.ORG

Subject: Re: procedural question

 

 

I have a question regarding extents and existing data.  I have found

identical localities from different museums, but one has coordinates as

parsed by John.  What should be done with these at error/extent time?

 

 

>>> Posting number 395, dated 29 Jan 2003 12:26:14

Date:         Wed, 29 Jan 2003 12:26:14 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      finding ranches

In-Reply-To:  <p05100303ba5db78db1d2@[128.32.214.36]>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii" ; format="flowed"

 

In Oregon and Idaho, I found that zooming in all the way on Topozone

reveals many features that do not come up with the search utility and

are not in GNIS.   Supposedly Topozone maps are scans of original

USGS maps.  However, to find a ranch, flat, or cave you have to know

the general locality.    A fast connection and big screen also help.

 

>for what it is worth, I agree totally with what Bruce writes below.

>There are lots of named ranches here is the western US, given as

>localities, that are identified on old (e.g., 1927) 7.5 minute USGS

>topo sheets but that do not appear on more recent paper versions and

>thus on any of the digitally available maps.

 

>>> Posting number 396, dated 30 Jan 2003 11:49:22

 

>>> Posting number 397, dated 30 Jan 2003 10:00:47

 

>>> Posting number 398, dated 30 Jan 2003 12:12:44

 

>>> Posting number 399, dated 30 Jan 2003 16:08:52

 

>>> Posting number 400, dated 30 Jan 2003 16:37:05

 

>>> Posting number 401, dated 30 Jan 2003 15:40:20

 

>>> Posting number 402, dated 30 Jan 2003 16:52:20

 

>>> Posting number 403, dated 30 Jan 2003 14:06:32

 

>>> Posting number 404, dated 30 Jan 2003 16:59:54

 

>>> Posting number 405, dated 30 Jan 2003 17:34:18

 

>>> Posting number 406, dated 30 Jan 2003 18:29:53

 

>>> Posting number 407, dated 30 Jan 2003 18:35:05

 

>>> Posting number 408, dated 30 Jan 2003 18:36:56

 

>>> Posting number 409, dated 31 Jan 2003 10:19:27

 

>>> Posting number 410, dated 31 Jan 2003 15:12:23

 

>>> Posting number 411, dated 31 Jan 2003 14:03:38

 

>>> Posting number 412, dated 31 Jan 2003 19:47:53

 

>>> Posting number 413, dated 3 Feb 2003 13:44:57

 

>>> Posting number 414, dated 4 Feb 2003 18:58:30

 

>>> Posting number 415, dated 5 Feb 2003 09:44:07

 

>>> Posting number 416, dated 5 Feb 2003 12:30:28

 

>>> Posting number 417, dated 5 Feb 2003 12:37:31

 

>>> Posting number 418, dated 6 Feb 2003 13:49:21

 

>>> Posting number 419, dated 7 Feb 2003 17:03:42

 

>>> Posting number 420, dated 7 Feb 2003 16:24:36

 

>>> Posting number 421, dated 7 Feb 2003 20:42:49

 

>>> Posting number 422, dated 10 Feb 2003 10:19:24

 

>>> Posting number 423, dated 10 Feb 2003 10:21:43

 

>>> Posting number 424, dated 10 Feb 2003 19:37:19

Date:         Mon, 10 Feb 2003 19:37:19 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      MaNIS Georeferencing Status Report - 1 Feb 2003

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

Brace yourselves. I'm about to blast you with numbers. Below are several

ways of looking at the georeferencing progress to date. The data include

the localities and contributions of the Bell Museum, as well as the

contributions of CONABIO and Town Peterson's group at KU - to all of whom I

extend another hearty thanks.

 

The following data represent the georeferencing status as of 1 Feb 2003, 17

months into the MaNIS project. When looking at these data there are several

attendant issues to consider, namely:

 

1) MVZ actually started georeferencing before the grant did.

2) Most institutions did not start georeferencing until about Jan 2002.

3) One institution did not submit a report but has 3600 localities in progress.

4) There has been assistance from colleagues who were not budgeted into the

original calculations. This accounts for 18932 claimed localities of which

8545 have been georeferenced and 2808 had coordinates when we began.

5) Of the 34090 unclaimed localities, 4838 already have lat/longs.

6) The Bell Museum has contributed 5407 localities to the workload and has

thus far committed to georeferencing 3815 localities. Some of the Bell

Museum localities may not be georeferenced in the context of MaNIS.

 

Total number of localities:     296737

Total number georeferenced:     203881  (68.7%)

Total pre-georeferenced:        64073   (21.6%)

Georeferenced in MaNIS:         139808  (47.1%)

Remaining to georeference:      92856   (31.3%)

 

Breakdown by georef category:

USA georeferenced:              161906 of 192887        (83.9%)

CAN georeferenced:              8426 of 12197           (69.1%)

MEX georeferenced:              6705 of 30062           (22.3%)

Other georeferenced:            26844 of 61591          (43.6%)

 

The claims are as follows:

USA claimed:                    191125 of 192887        (99.1%)

CAN claimed:                    8202 of 12197           (67.2%)

MEX claimed:                    29898 of 30062          (99.5%)

Other claimed:                  33422 of 61591          (54.3%)

Total claimed:                  262647 of 296737        (88.5%)
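
The tallies above are internally consistent, which can be checked with a few lines. The dictionary layout and the helper `percent` are my own; the counts are copied from the report:

```python
total_localities = 296737

georeferenced = {"USA": 161906, "CAN": 8426, "MEX": 6705, "Other": 26844}
claimed = {"USA": 191125, "CAN": 8202, "MEX": 29898, "Other": 33422}
totals = {"USA": 192887, "CAN": 12197, "MEX": 30062, "Other": 61591}


def percent(part, whole):
    """Percentage rounded to one decimal place, as in the report."""
    return round(100.0 * part / whole, 1)


total_georeferenced = sum(georeferenced.values())
total_claimed = sum(claimed.values())

print(total_georeferenced, percent(total_georeferenced, total_localities))
print(total_claimed, percent(total_claimed, total_localities))
print(total_localities - total_georeferenced)   # remaining to georeference
print(total_localities - total_claimed)         # unclaimed
for region in totals:
    print(region,
          percent(georeferenced[region], totals[region]),
          percent(claimed[region], totals[region]))
```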

 

Using reasonable (I think) estimates of the georeferencing rates for the

three original categories (18/hr, 12/hr, and 9/hr for USA, non-USA North

America, and non-North America respectively), we have finished 214 weeks of

georeferencing and have another 196 weeks of georeferencing to go. In other

words, 52.2% of the workload is behind us. These weeks are not based on the

actual hours spent georeferencing. However, given that georeferencing takes

time, and time is money (if we're to believe the adage), another way to

assess our progress is to look at how much money has been spent to get to

our current status. As of 1 Feb 2003, 49% of the money available for

georeferencing has been spent. Given the caveats in items 1-6 above and

that there is some variability in the georeferencing rates, I'm happy to

report that we're right on target!
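
The remaining-workload figure can be approximated from the per-category counts and the quoted rates. The sketch assumes a 40-hour working week and treats CAN plus MEX as "non-USA North America" and "Other" as "non-North America"; neither assumption is stated in the message, so this is a plausibility check rather than a reconstruction of the actual calculation:

```python
HOURS_PER_WEEK = 40  # assumption; the message does not state the figure used

# (total localities, georeferenced so far, assumed rate per hour)
categories = {
    "USA": (192887, 161906, 18),
    "non-USA North America": (12197 + 30062, 8426 + 6705, 12),  # CAN + MEX
    "non-North America": (61591, 26844, 9),                     # "Other"
}

remaining_hours = sum(
    (total - done) / rate for total, done, rate in categories.values()
)
remaining_weeks = remaining_hours / HOURS_PER_WEEK
print(round(remaining_weeks))

# Share of the workload completed, from the weeks quoted in the message.
finished_weeks, weeks_to_go = 214, 196
share_done = round(100 * finished_weeks / (finished_weeks + weeks_to_go), 1)
print(share_done)
```

Under these assumptions the remaining effort comes out close to the 196 weeks quoted, and 214 of 410 weeks gives the 52.2% reported.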

 

To celebrate, I've made changes to the Georef Checklist

(http://dlp.cs.berkeley.edu/manis/Checklist.html) page to reflect that we

have done more than we have left to do. We now have a map of claimed

georeferencing regions courtesy of Robert Hijmans. In addition, I've added

a new table with an alphabetical listing of geographic regions that have

not yet been claimed for georeferencing, including the number of localities

and the number left to georeference. Because I'm feeling particularly

generous (relieved is probably more accurate, actually), I've also made an

update to the georeferencing calculator to include a few new map scales

that have been encountered since beginning to georeference far-away places

where we'd all rather be.

 

I guess I'll take this opportunity to claim Albania, Djibouti, Macau,

Guinea-Bissau, and the Maldives for georeferencing while I'm at it. (You

should probably look on the new checklist before being too laudatory).

 

Thanks, and good work,

John

 

>>> Posting number 425, dated 20 Feb 2003 11:25:34

 

>>> Posting number 426, dated 25 Feb 2003 11:37:32

Date:         Tue, 25 Feb 2003 11:37:32 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         "Patricia W. Freeman" <pfreeman1@UNL.EDU>

Subject:      smaller collections and overlap

In-Reply-To:  <5.0.2.1.0.20030220112241.02400a00@nhm.org>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"

 

I meant to pass this along to the list a couple of weeks ago.  It brings up

an interesting issue.  The University of Nebraska State Museum (UNSM)

division of Zoology had already done much of the work that MaNIS set out to do

when the proposal went out, or at least we had much of the work in place

and had created the UNSM Georeferencing Calculator.  It then became an

issue as to whether Steve Hinsaw at Michigan would "do" Nebraska for MaNIS

or whether UNSM would.  We both volunteered.

 

I hope we can be linked in some fashion to MaNIS.  Other  state/regional

collections may be in the same boat.  We concentrate on Nebraska and the

Northern Great Plains region and have the largest collection of Nebraska

mammals (and birds, and herps, and fish).

 

Here is my letter.

 

10/2/03

 

Dear Steve and John-

 

If you want consistency with Manis, Steve should probably go ahead and

calculate localities for Nebraska.

Our algorithm [*], based on ideas from that Texas Tech paper, is similar

but figuring accuracy is a different matter.  We have applied our

Calculator method to all four of our vertebrate collections here.  We try

to have two ways of reporting locality now.   For the third, retroactively

figuring lat/long, we feel, is art rather than science and the buyers will

have to beware, particularly when it comes to different centuries, town

centers and post offices over time,  different models of GPS units, and

whether those units have been corrected.

 

I hope, however, Manis will consider links to smaller state and regional

collections that are georeferenced and have a good knowledge of and are

closer to old place names in our state or region.

 

 

Regarding the data-sharing issue:

We have two levels of information.  The public/online one gives localities

to counties only.  These lead to professional inquiries that come straight

to my Collections Manager or myself.  It has worked very well over the last

several years and researchers contact us on a regular, and increasing

basis. We handle endangered species localities on a case by case basis. [**

then***]

 

Best regards-

Trish Freeman

 

 

my notes today that are more explanatory:

 

 *which we have now named the UNSM Georeferencing Calculator; it was created

in 1999 or 2000 by Cliff Lemen, an all-round biologist who is computer

knowledgeable (ask Bruce Patterson)

 

** This way we know exactly who wants data and can assess for-profit

requests. Our UNSM management policy has suggested charging for these

requests.  They have not come up much, however.

 

*** We have a disclaimer on every report sent out that we cannot assure the

accuracy of the data (this would be both taxonomically and

georeference-wise).

 

 

Thanks-

Trish

 

Patricia W. Freeman

Professor/ Curator of Zoology

University of Nebraska State Museum

Lincoln NE 68588-0514

402-472-6606

402-472-8949 (fax)

pfreeman1@unl.edu

http://www-museum.unl.edu/research/zoology/zoology.html

 

>>> Posting number 427, dated 5 Mar 2003 09:46:19

 

>>> Posting number 428, dated 5 Mar 2003 13:07:44

 

>>> Posting number 429, dated 5 Mar 2003 16:58:32

 

>>> Posting number 430, dated 11 Mar 2003 21:02:37

 

>>> Posting number 431, dated 11 Mar 2003 09:47:45

 

>>> Posting number 432, dated 11 Mar 2003 12:02:27

 

>>> Posting number 433, dated 13 Mar 2003 15:53:15

Date:         Thu, 13 Mar 2003 15:53:15 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Georeferencing guidelines citation

In-Reply-To:  <5.1.1.5.2.20030311022218.018130b0@socrates.berkeley.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"; format=flowed

Content-Transfer-Encoding: quoted-printable

 

   Hello John.

   Do you have the proper citation for your 'Georeferencing guidelines'?

 

   Thanks

   XXXXXXXXXXXXXXXXXXX

  

 

>>> Posting number 434, dated 13 Mar 2003 15:56:31

Date:         Thu, 13 Mar 2003 15:56:31 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Georeferencing guidelines citation

In-Reply-To:  <5.2.0.9.0.20030313155151.00b5ce48@xolo.conabio.gob.mx>

Mime-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"; format=flowed

Content-Transfer-Encoding: quoted-printable

 

For the web page I suppose you'd use something like the following:

 

Wieczorek, J. R. 2001. Georeferencing Guidelines.

(http://dlp.cs.berkeley.edu/manis/GeorefGuide.html)

 

I'm in the process of preparing a paper on the subject for the International

Journal of GIS based on the GeorefGuide document. I'm going to try to

submit it in the next week or two.

 

 

At 03:53 PM 3/13/03 -0600, you wrote:

>   Hello John.

>   Do you have the proper citation for your 'Georeferencing guidelines'?

> 

>   Thanks

>   XXXXXXXXXXXXXXXXXXXXXXX

>  

 

>>> Posting number 435, dated 14 Mar 2003 12:04:44

 

>>> Posting number 436, dated 14 Mar 2003 16:34:31

 

>>> Posting number 437, dated 14 Mar 2003 18:56:55

Date:         Fri, 14 Mar 2003 18:56:55 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      New Georeferencing Calculator

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

By popular demand (way back when I first released the Georeferencing

Calculator, and more recently as new browser versions have become

available) I have developed a new Calculator. The new version is the same

as the old in every respect except the following:

 

1) The code for the new calculator is much smaller and should therefore

load much more quickly. The release consists of one file,

georefcalculator.jar, which is 244k.

 

2) I have removed an internal dependency that required access to

elib.cs.berkeley.edu to determine datum errors. The datum error data are

now included within the program. This means the calculator can be run with

appletviewer without requiring access to the internet.

 

3) I have created an enhanced web page that checks for and installs (if

necessary) a Java plugin in your browser before trying to load the applet.

 

4) I have made the applet Mac compatible in IE and Netscape versions later

than  4.x.

 

The calculator still seems to run fine in all of the previously supported

operating systems and browsers. Feel free to report findings to the contrary.

 

The new Calculator can be found at the following URL:

 

http://dlp.cs.berkeley.edu/manis/gc.html

 

When it is clear that this version is at least as stable as the old one, I

will replace all of the links in the MaNIS web pages to this new version

and "retire" the old one. Therefore, please start to use this new version

and tell me if anything is amiss.

 

Thanks,

 

John

 

>>> Posting number 438, dated 15 Mar 2003 13:37:26

 

>>> Posting number 439, dated 17 Mar 2003 15:18:21

 

>>> Posting number 440, dated 20 Mar 2003 15:28:02

Date:         Thu, 20 Mar 2003 15:28:02 -1000

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      New Georeferencing Calculator

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: quoted-printable

 

John,

 

your calculators (old and new) do not provide a selection for GPS

reading in the "coordinate source" box.  I've got new localities coming

in that require that option and there just may be some fairly recent

ones in the MaNIS gazetteer that will need it also.

 

XXXXX

 

>>> Posting number 441, dated 21 Mar 2003 13:05:20

Date:         Fri, 21 Mar 2003 13:05:20 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Updates for GPS-derived coordinates

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All, and especially XXXXXXXXXXXXX for the reminder,

 

I have made an update to the new Georeferencing Calculator

(http://elib.cs.berkeley.edu/manis/gc.html) to accommodate GPS as a

coordinate source. When the Locality Type is "Coordinates Only" and the

Coordinate Source is "GPS" a new text box appears in which you can enter

the GPS accuracy. The calculations treat GPS accuracy in the same way as an

extent of a named place. In this case the named place is the pair of

coordinates and the extent is the accuracy.

I have updated the Georeferencing Guidelines

(http://dlp.cs.berkeley.edu/manis/GeorefGuide.html) to include a section on

GPS accuracy as well as minor changes in the text to accommodate this

source of uncertainty. I'm including the text of the GPS section below for

your convenience.

 

The changes have not been made to the old calculator, which I will retire when

I'm confident that this new calculator has no flaws more serious than the

old one has. To date I have had no reported bugs, which I take to be a good

sign rather than a sign that you have all given up georeferencing altogether.

 

John

 

 

Excerpt from Georeferencing Guidelines 21 Mar 2003:

 

Uncertainty due to GPS accuracy

The accuracy of the coordinate data reported by a GPS varies with time,

place, and equipment used. Previous to the order to cease Selective

Availability (deliberate GPS signal scrambling) at 8PM EST 1 May 2000,

uncorrected GPS receivers were subject to artificial inaccuracies of about

100 meters. Today, many GPS receivers have a function to determine the

estimated accuracy of a given reading, but this information is not

universally available, nor is it often recorded with the coordinates. It is

not possible to determine the actual accuracy of a GPS reading

retroactively if it was not recorded at the time of the reading. In fact,

many GPS receivers estimate accuracy poorly. My Garmin eTrex Summit, for

example, reports positions with putative accuracies of 7 meters that are

demonstrably off by 15 meters. Where extreme accuracy is required, be sure

of the capabilities of your GPS under the prevailing conditions when the

coordinates are recorded. For retrospective uncertainty estimates where

detailed information is not available, 30 meters is a reasonable,

conservative estimate of GPS accuracy in the absence of Selective

Availability.
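
The fallback rules in the excerpt above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the calculator's code: the function name gps_accuracy_m and its parameters are hypothetical, the cutoff is the "8PM EST 1 May 2000" time given in the excerpt converted to UTC, and using 100 meters as the default for readings taken before that time is an assumption drawn from the "about 100 meters" figure for uncorrected receivers.

```python
from datetime import datetime, timezone
from typing import Optional

# Cutoff as stated in the excerpt: 8PM EST on 1 May 2000.
# EST is UTC-5, so that is 01:00 UTC on 2 May 2000.
SA_OFF_UTC = datetime(2000, 5, 2, 1, 0, tzinfo=timezone.utc)

PRE_SA_ACCURACY_M = 100.0   # assumed default while Selective Availability was on
POST_SA_ACCURACY_M = 30.0   # conservative default from the excerpt


def gps_accuracy_m(reading_time_utc: datetime,
                   recorded_accuracy_m: Optional[float] = None) -> float:
    """Return the GPS accuracy in meters to enter as the extent of the coordinates."""
    # An accuracy recorded with the reading takes priority over any default.
    if recorded_accuracy_m is not None:
        return recorded_accuracy_m
    # Otherwise fall back on a default that depends on the date of the reading.
    if reading_time_utc < SA_OFF_UTC:
        return PRE_SA_ACCURACY_M
    return POST_SA_ACCURACY_M
```

Because the calculator treats GPS accuracy like the extent of a named place, the returned value would simply be entered in the GPS accuracy box. A recorded value should still be treated with caution, given the note above that receivers can estimate their own accuracy poorly.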

 

>>> Posting number 442, dated 24 Mar 2003 16:56:35

Date:         Mon, 24 Mar 2003 16:56:35 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Canadian Geographic Names Service

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

I have just discovered a Canadian corollary to the USGS GNIS, the

Geographical Names of Canada, at the following URL:

 

http://geonames.nrcan.gc.ca/index_e.php

 

My apologies to those of you who might already have known of this, but if

you did, why haven't you told the rest of us? :)

 

John

 

>>> Posting number 443, dated 24 Mar 2003 20:13:00

Date:         Mon, 24 Mar 2003 20:13:00 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: Canadian Geographic Names Service

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: 7bit

 

John and All:  I assumed everyone knew about it.  It is on the US GNIS

website under Links>Useful Geographic Names links.  I guess to differentiate

these from nonuseful links.  Back in the Idaho time trials posting I

mentioned BCGNIS and downloading placenames and lat/longs using the bounding

box.  I ended up with 42K+ placenames for BC for use in the Excel

calculator.  The problem then was in the datum.  I have since learned (will

forward the email when I locate it) that almost all lat/longs are NAD27.

 

In BCGNIS the lat/long are to the nearest minute leading to my current

conundrum:  Is it worth bothering with other components of error (especially

extent) when the lat/longs are to the nearest minute.  More as I think about

the problem.

 

----- Original Message -----

From: "John Wieczorek" <tuco@SOCRATES.BERKELEY.EDU>

To: <MAMMAL-Z-NET@USOBI.ORG>

Sent: Monday, March 24, 2003 4:56 PM

Subject: [MANIS] Canadian Geographic Names Service

 

 

> Dear All,

> 

> I have just discovered a Canadian corollary to the USGS GNIS, the

> Geographical Names of Canada, at the following URL:

> 

> http://geonames.nrcan.gc.ca/index_e.php

> 

> My apologies to those of you who might already have known of this, but if

> you did, why haven't you told the rest of us? :)

> 

> John

> 

 

>>> Posting number 444, dated 24 Mar 2003 20:14:39

Date:         Mon, 24 Mar 2003 20:14:39 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Fw: BCGNIS datum

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: 7bit

 

----- Original Message -----

From: "Mason, Janet SRM:EX" <Janet.Mason@gems4.gov.bc.ca>

To:

Sent: Thursday, March 20, 2003 8:09 AM

Subject: RE: BCGNIS datum

 

 

> The bounding box is certainly one way to pull all BC names. Note that

> applications using static gazetteer data are discouraged, because place

> names can and do change.

> 

> I'm familiar with the Topo USA 4.0 product, but you might get some value

> from the Internet Mapping Framework (IMF) utility being developed through

> the Land Information BC initiative in our ministry, Sustainable Resource

> Management. I've been finalizing protocols with IMF designers so that their

> 'search by name' utility will hit a live view of BCGNIS, hence obviate

> static retention of place name data.

> 

> http://maps.gov.bc.ca/ --> "Provincial Basemap".

> Select "Find Location" on toolbar, then 'Place Name'. Note that the utility

> zooms to the coordinates held in BCGNIS (ie. MOUTH of rivers, CENTRE of

> aerial features, SUMMIT of elevated features, so you'll still need to pan if

> your site is near  headwaters, etc.)  Merging with orthophotos might be

> useful for you. The amount & quality of toponymy on base maps is abysmal,

> but a markup tool allows you to customize maps.

> 

> Cheers,

> 

> Janet Mason

> Provincial Toponymist

> Base Mapping & Geomatic Services Branch

> Ministry of Sustainable Resource Management

> PO Box 9355  STN Prov Govt

> VICTORIA  BC   V8W 9M2

> *(250) 387-9328   fax (250) 356-7831

> * janet.mason@gems4.gov.bc.ca

> BC Geographical Names Information System:

> http://srmwww.gov.bc.ca/bcnames

> 

> 

> 

> -----Original Message-----

> From:

> Sent: Wednesday, March 19, 2003 6:00 AM

> To: Mason, Janet SRM:EX

> Subject: Re: BCGNIS datum

> 

> 

> Janet:  Your response was very helpful.  From your web site I downloaded

> about 42K localities using your bounding box search.  The bounding box was

> impressive.  These localities then went into an Excel workbook for lookup of

> BC placenames.

> 

> Do you know of any electronic (interactive) maps similar to Topo USA 4.0

> (from DeLorme) that encompass BC?  I have to determine extents for

> placenames (encompassing radius) and do a number of measurements and don't

> have the funds to purchase all the hard copy maps.

> 

> 

> ----- Original Message -----

> From: "Mason, Janet SRM:EX" <Janet.Mason@gems4.gov.bc.ca>

> To:

> Sent: Tuesday, March 18, 2003 10:25 AM

> Subject: RE: BCGNIS datum

> 

> 

> > Hi XXXX.

> >

> > The VAST majority of coordinates in BCGNIS are NAD 27; they have been

> > hand-scaled off federal 1:50k or 1:250k lithographed sheets over the past 5

> > decades, and - as you will have noticed - are usually rounded to the nearest

> > minute. At worst, rounding represents less than 1.5 km on the ground at our

> > latitudes.  Names adopted in the last 10 years (a few hundred, max) are

> > located to the nearest 5-second, but still scaled off the available litho

> > sheets, hence primarily NAD27.

> >

> > We're working to repopulate these values from locations on the 1:20 000

> > provincial base, TRIM (NAD 83), but I'm guessing we're a year or more away.

> > At that time, datum will be specified, and UTM and decimal-degrees will also

> > be displayed, or calculated on the fly.

> >

> > I hope this helps,

> >

> > Cheers,

> >

> > Janet Mason

> > Provincial Toponymist

> > Base Mapping & Geomatic Services Branch

> > Ministry of Sustainable Resource Management

> > PO Box 9355  STN Prov Govt

> > VICTORIA  BC   V8W 9M2

> > *(250) 387-9328   fax (250) 356-7831

> > * janet.mason@gems4.gov.bc.ca

> > BC Geographical Names Information System:

> > http://srmwww.gov.bc.ca/bcnames

> >

> > -----Original Message-----

> > From:

> > Sent: Friday, March 14, 2003 11:46 AM

> > To: Mason, Janet SRM:EX

> > Subject: BCGNIS datum

> >

> >

> >

> > Dear Janet:  I am working on a georeferencing project through Berkeley

> > Museum of Vertebrate Zoology ( http://dlp.cs.berkeley.edu/manis/ ).

> > I used BCGNIS to get lat/longs for mammal specimens from British Columbia.

> > One component of error we are using in determining lat/long is the datum

> > (known or unknown).  I'm just wondering if the datum (NAD27, NAD83, WGS84)

> > is specified for BCGNIS localities on the original maps or elsewhere.

> >

> >

> >

> >

> > XXXXXXXXXXXXXXXXXXX

> >
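
The statement in the correspondence above, that rounding to the nearest minute amounts to less than 1.5 km on the ground at BC latitudes, can be checked with a rough calculation. The sketch below assumes a spherical Earth with about 1852 m per arcminute of latitude and takes the worst case as half a minute off in both latitude and longitude. The function name rounding_error_m is hypothetical, and this is not necessarily how the Georeferencing Calculator defines coordinate-precision uncertainty.

```python
import math

# Approximation: one arcminute of latitude is about one nautical mile.
METERS_PER_ARCMINUTE_LAT = 1852.0


def rounding_error_m(latitude_deg: float, precision_arcmin: float = 1.0) -> float:
    """Worst-case ground distance (m) between a point and its rounded coordinates."""
    half = precision_arcmin / 2.0
    north_south = half * METERS_PER_ARCMINUTE_LAT
    # Meridians converge toward the poles, so an arcminute of longitude shrinks.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return math.hypot(north_south, east_west)


# Approximate southern, central, and northern latitudes of British Columbia.
for lat in (48.3, 54.0, 60.0):
    print(f"{lat:4.1f} N: {rounding_error_m(lat):.0f} m")
```

Under these assumptions the result stays a little over 1 km across the province, which is consistent with the "less than 1.5 km" figure and comparable in size to the extent of a small town.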

 

>>> Posting number 445, dated 25 Mar 2003 11:29:14

 

>>> Posting number 446, dated 25 Mar 2003 12:06:07

 

>>> Posting number 447, dated 25 Mar 2003 09:22:23

Date:         Tue, 25 Mar 2003 09:22:23 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Canadian Geographic Names Service

In-Reply-To:  <BAY1-DAV18SXI3acH0x0002cc62@hotmail.com>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear XXXX, and all,

 

For the sake of consistency all georeferencing should follow the MaNIS

guidelines. If an example of a reason not to neglect the extent

determinations would be helpful, here's one:

 

Vancouver Island

 

John

 

 

At 08:13 PM 3/24/03 -0800, you wrote:

>John and All:  I assumed everyone knew about it.  It is on the US GNIS

>website under Links>Useful Geographic Names links.  I guess to differentiate

>these from nonuseful links.  Back in the Idaho time trials posting I

>mentioned BCGNIS and downloading placenames and lat/longs using the bounding

>box.  I ended up with 42K+ placenames for BC for use in the Excel

>calculator.  The problem then was in the datum.  I have since learned (will

>forward the email when I locate it) that almost all lat/longs are NAD27.

> 

>In BCGNIS the lat/long are to the nearest minute leading to my current

>conundrum:  Is it worth bothering with other components of error (especially

>extent) when the lat/longs are to the nearest minute.  More as I think about

>the problem.

> 

>----- Original Message -----

>From: "John Wieczorek" <tuco@SOCRATES.BERKELEY.EDU>

>To: <MAMMAL-Z-NET@USOBI.ORG>

>Sent: Monday, March 24, 2003 4:56 PM

>Subject: [MANIS] Canadian Geographic Names Service

> 

> 

> > Dear All,

> >

> > I have just discovered a Canadian corollary to the USGS GNIS, the

> > Geographical Names of Canada, at the following URL:

> >

> > http://geonames.nrcan.gc.ca/index_e.php

> >

> > My apologies to those of you who might already have known of this, but if

> > you did, why haven't you told the rest of us? :)

> >

> > John

> >

 

>>> Posting number 448, dated 25 Mar 2003 09:53:09

Date:         Tue, 25 Mar 2003 09:53:09 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: Canadian Geographic Names Service

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: 7bit

 

If it was just Vancouver Island, I would put it down as "extent too large,

find more precise reference" with the idea that it would be a waste of time

to fiddle and annotate when the owning institution most likely had more

data.  However, I can georeference it with a 210 km extent and annotate it

as above.  In this case, the extent component of error swamps the other

components, esp coordinate (in)precision due to the minute bounding box.

Vancouver Is is the size of western Washington or Oregon which I did not

georeference because of "extent too large" or "ambiguous reference" problem.

 

----- Original Message -----

From: "John Wieczorek" <tuco@SOCRATES.BERKELEY.EDU>

To: <MAMMAL-Z-NET@USOBI.ORG>

Sent: Tuesday, March 25, 2003 9:22 AM

Subject: Re: [MANIS] Canadian Geographic Names Service

 

 

> Dear XXXX, and all,

> 

> For the sake of consistency all georeferencing should follow the MaNIS

> guidelines. If an example of a reason not to neglect the extent

> determinations would be helpful, here's one:

> 

> Vancouver Island

> 

> John

> 

> 

 

>>> Posting number 449, dated 25 Mar 2003 10:42:56

Date:         Tue, 25 Mar 2003 10:42:56 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Canadian Geographic Names Service

In-Reply-To:  <BAY1-DAV717eG5yhCSp00030258@hotmail.com>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Granted that my example is extreme, but I wanted to illustrate

unequivocally that the extent matters. Extents will be in a variety of

sizes. At times they will be larger than other sources of uncertainty, and

at other times they will be trivial in comparison.

 

 

At 09:53 AM 3/25/03 -0800, you wrote:

>If it was just Vancouver Island, I would put it down as "extent too large,

>find more precise reference" with the idea that it would be a waste of time

>to fiddle and annotate when the owning institution most likely had more

>data.  However, I can georeference it with a 210 km extent and annotate it

>as above.  In this case, the extent component of error swamps the other

>components, esp coordinate (in)precision due to the minute bounding box.

>Vancouver Is is the size of western Washington or Oregon which I did not

>georeference because of "extent too large" or "ambiguous reference" problem.

> 

>----- Original Message -----

>From: "John Wieczorek" <tuco@SOCRATES.BERKELEY.EDU>

>To: <MAMMAL-Z-NET@USOBI.ORG>

>Sent: Tuesday, March 25, 2003 9:22 AM

>Subject: Re: [MANIS] Canadian Geographic Names Service

> 

> 

> > Dear XXXX, and all,

> >

> > For the sake of consistency all georeferencing should follow the MaNIS

> > guidelines. If an example of a reason not to neglect the extent

> > determinations would be helpful, here's one:

> >

> > Vancouver Island

> >

> > John

> >

> >

 

>>> Posting number 450, dated 25 Mar 2003 11:28:58

Date:         Tue, 25 Mar 2003 11:28:58 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      plotting different datums

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: 7bit

 

John:  Just wondering how the plotting software is going to deal with the

lat/longs from different datums (or data)?  BC will have about 55% WGS84

(NIMA) and 35% NAD27 (NAD27)?  I can envision an overlay for each datum or

conversions prior to plotting?  The difference isn't that great for North

America at least, so depending on scale/magnification it might not matter?

 

>>> Posting number 451, dated 25 Mar 2003 11:41:41

Date:         Tue, 25 Mar 2003 11:41:41 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: plotting different datums

In-Reply-To:  <BAY1-DAV56eACbzCWfb00030af3@hotmail.com>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

If, by plotting software, you mean applications running "on top of" MaNIS,

using our georeferenced data, then we envision a datum transformation layer

that can take our original data (in whatever datum) and transform it to the

datum of choice for the purpose of visualization or analysis. Coordinates

for which the datum is "not recorded" will not be transformed, hence the

need for the unknown datum uncertainty in the calculations.

 

In BC the uncertainty from not knowing whether the source is NAD27 or WGS84

ranges between about 70 m in the extreme SE to about 110 m in the extreme

NW. If we didn't do datum transformations, then those localities that are

well specified (uncertainties of about the same scale as the datum

transformation distance) would

be quite obviously displaced from their correct positions.

 

At 11:28 AM 3/25/03 -0800, you wrote:

>John:  Just wondering how the plotting software is going to deal with the

>lat/longs from different datums (or data)?  BC will have about 55% WGS84

>(NIMA) and 35% NAD27 (NAD27)?  I can envision an overlay for each datum or

>conversions prior to plotting?  The difference isn't that great for North

>America at least, so depending on scale/magnification it might not matter?
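
The "datum transformation layer" described in this exchange might look roughly like the sketch below. It is an illustration only: the Georef record, the to_display_datum function, and the shape of the transform callback are all hypothetical names, and the actual datum shift is left to whatever geodesy library an application chooses to plug in. The one rule taken from the posting is that coordinates whose datum is "not recorded" are not transformed and rely on the unknown-datum uncertainty already included in the record.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

LatLon = Tuple[float, float]
# Hypothetical callback: (lat, lon, source_datum, target_datum) -> (lat, lon)
Transform = Callable[[float, float, str, str], LatLon]


@dataclass
class Georef:
    lat: float
    lon: float
    datum: Optional[str]   # None stands for "not recorded"
    uncertainty_m: float   # includes unknown-datum uncertainty when datum is None


def to_display_datum(rec: Georef, target: str, transform: Transform) -> Georef:
    """Return a copy of rec expressed in the target datum when that is possible."""
    if rec.datum is None or rec.datum == target:
        # Nothing to transform: either the datum is unknown or it already matches.
        return Georef(rec.lat, rec.lon, rec.datum, rec.uncertainty_m)
    lat, lon = transform(rec.lat, rec.lon, rec.datum, target)
    return Georef(lat, lon, target, rec.uncertainty_m)
```

Keeping the original record untouched and transforming on request means the stored data stay in the datum of their source, which matches the idea of applications running "on top of" the georeferenced data.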

 

>>> Posting number 452, dated 25 Mar 2003 14:33:57

Date:         Tue, 25 Mar 2003 14:33:57 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      How to deal with non-standard characters in Locality description?

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii

Content-Transfer-Encoding: 7bit

 

While it is not strictly related to "georeferencing"

rules, I recently discovered that special

characters in languages other than English pose a

problem. I also discovered that various

collections have different rules for how to deal with

these characters. For example, many European

languages use additional characters (think about

the German umlaut, or accent marks on certain

vowels). Some collections tried to use these

characters, while others used the closest

equivalent (instead of an umlaut a, an a is used).

Unfortunately, there are problems with both

approaches.

 

1. Due to changes in how these special characters are

mapped in computers (DOS-->Win3.1-->Win98 etc.),

characters can be lost. During my georeferencing

work, I found examples where characters were lost

(discarded) because they were special characters

outside the English alphabet.

 

2. One character can make a big difference:

certain locality names in Europe differ only by

one or two of these special characters. Some

locality names in Hungary were ambiguous because

those special characters were transcribed to an

English equivalent.

 

The reason why I am mentioning this problem is

that many people are starting to georeference non-US

localities, and this kind of problem will become more

common. When I georeferenced

Hungarian localities, I tried to resolve some

pretty distorted locality names. When I found

these distorted names (some of them were transcribed to

the English alphabet and had typos in them), I

noted the correct spelling of the place name in

the "NamedPlace" field. However, I am not sure how

long this data will survive after some transfer

and conversion between various databases and

operating systems. At the time when I was working

on the Hungarian localities I felt that it was

important to make these notes because they explain

what assumptions I made when I resolved the

location of each locality.

 

Anyway, do we have rules for how to deal with

names that contain non-standard characters? Is it

worth "correcting" the spelling of locality names, or

shall we just find the geographic coordinates for

the records?

 

Sincerely,

XXXXX         (spelled correctly with an accent

mark on the a)
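
The two failure modes described in this posting can be reproduced with a short Python sketch. It uses only the standard unicodedata module; the helper names fold_diacritics and drop_non_ascii are hypothetical, and the two sample strings are illustrative spellings chosen to differ by a single diacritic, not names taken from the MaNIS data.

```python
import unicodedata


def fold_diacritics(name: str) -> str:
    """Replace accented letters with their unaccented base letters."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def drop_non_ascii(name: str) -> str:
    """Mimic a lossy conversion that discards anything outside ASCII."""
    return name.encode("ascii", errors="ignore").decode("ascii")


# Illustrative strings only: o with diaeresis versus o with double acute.
a = "K\u00f6r\u00f6s"
b = "K\u0151r\u00f6s"

print(a == b)                                  # the two spellings are distinct
print(fold_diacritics(a), fold_diacritics(b))  # transcription makes them identical
print(drop_non_ascii(a))                       # lossy conversion loses the letters
```

Once two names have been folded to the same string, the original spelling cannot be recovered from the data alone, which is why recording the correct spelling during georeferencing is useful even if a given database cannot yet store it.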

 

 

>>> Posting number 453, dated 26 Mar 2003 10:06:18

Date:         Wed, 26 Mar 2003 10:06:18 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: How to deal with non-standard characters in Locality

              description?

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii; format=flowed

Content-Transfer-Encoding: 7bit

 

Nobody wants to say the "U word <http://www.unicode.org/> ."

    XXXXXX

 

 

>>> Posting number 454, dated 26 Mar 2003 13:52:39

 

>>> Posting number 455, dated 27 Mar 2003 13:41:00

Date:         Thu, 27 Mar 2003 00:13:41 -0600

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: How to deal with non-standard characters in Locality

              description?

In-Reply-To:  <01KTYKRN651Q9KO309@TTACS.TTU.EDU>

MIME-version: 1.0

Content-type: text/plain; charset=us-ascii; format=flowed

 

With regard to XXXXX's  (with an accent mark on the a) suggestion and

question, I am wondering if the "LocalityAnnotation" field rather than

"NamedPlace" should be used for a note when I find a weird

transliteration from a non-alphabetic language to the English alphabet for a

specific geographic name and make a reasonable assumption from such an

apparently incorrect spelling.  Every few lines, I came across

this sort of problem in georeferencing Japanese localities.

 

Sincerely,

 

XXXXXXXXXXXXX

 

>>> Posting number 456, dated 27 Mar 2003 09:16:43

Date:         Thu, 27 Mar 2003 09:16:43 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: How to deal with non-standard characters in Locality

              description?

In-Reply-To:  <p05111b00baa834ed4cbe@[129.118.175.4]>

Mime-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"; format=flowed

Content-Transfer-Encoding: quoted-printable

 

XXXXX, XXXXX, and All,

 

The LocalityAnnotation field is meant to alert people at the source

institution that there is something amiss in their locality description

that they should investigate. Misspellings fall into this category, as do

internal inconsistencies in the description. The NamedPlace field should

contain the name as used in the source for the coordinates.

 

I understand that not everyone has a Unicode-capable database. For those of

us who don't, making some of these spelling updates will not be possible

right now. Nevertheless, it is well worth providing this information during

georeferencing, because some institutions will be able to take advantage of it.

 

John

 

 

 

>>> Posting number 457, dated 31 Mar 2003 22:04:51

Date:         Mon, 31 Mar 2003 22:04:51 -0800

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      more on diacritics and Unicode

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: 7bit

 

John and all:  I ran across this link while looking for some maps.

Check the "Diacritics, Special Characters and their Codes" link on the

NIMA - GNS page at

http://www.nima.mil/gns/html/diacritic.html

 

>>> Posting number 458, dated 1 Apr 2003 12:43:08

 

>>> Posting number 459, dated 8 Apr 2003 16:31:21

 

>>> Posting number 460, dated 11 Apr 2003 14:08:36

 

>>> Posting number 461, dated 12 Apr 2003 10:42:18

 

>>> Posting number 462, dated 15 Apr 2003 13:33:18

Date:         Tue, 15 Apr 2003 13:33:18 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      GMT and the required DateLastModified field

In-Reply-To:  <5.1.1.5.2.20030315123544.016f79f8@socrates.berkeley.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii" ; format="flowed"

 

Dear All:  The manner that the required DateLastModified field

description is written seems to require the date last modified to

seconds.

 

DateLastModified

 

ISO 8601 date and time in UTC(GMT) when the record was last modified.

Example: "November 5, 1994, 8:15:30 am, US Eastern Standard Time"

would be "1994-11-05T13:15:30Z".

(http://dlp.cs.berkeley.edu/manis/darwin2ConceptInfo030315jrw.htm) .

 

However there appear to be six levels of date-time precision

(granularity) and it might be easier to just go with the third of

YYYY-MM-DD?

 

Formats (from http://www.w3.org/TR/NOTE-datetime).

 

Different standards may need different levels of granularity in the

date and time, so this profile defines six levels. Standards that

reference this profile should specify one or more of these

granularities. If a given standard allows more than one granularity,

it should specify the meaning of the dates and times with reduced

precision, for example, the result of comparing two dates with

different precisions.

 

The formats are as follows. Exactly the components shown here must be

present, with exactly this punctuation. Note that the "T" appears

literally in the string, to indicate the beginning of the time

element, as specified in ISO 8601.

 

 

    Year:

       YYYY (eg 1997)

    Year and month:

       YYYY-MM (eg 1997-07)

    Complete date:

       YYYY-MM-DD (eg 1997-07-16)

    Complete date plus hours and minutes:

       YYYY-MM-DDThh:mmTZD (eg 1997-07-16T19:20+01:00)

    Complete date plus hours, minutes and seconds:

       YYYY-MM-DDThh:mm:ssTZD (eg 1997-07-16T19:20:30+01:00)

    Complete date plus hours, minutes, seconds and a decimal fraction of a

second

       YYYY-MM-DDThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45+01:00)
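
As a rough illustration of the six granularities listed above, the following Python sketch classifies a string by which format it matches. The function name `granularity` and the level labels are made up for this example, and the patterns only check the shape of the string (digits and punctuation), not whether the month, day, or hour values are in range.

```python
import re

# Time zone designator (TZD): "Z" or an offset such as +01:00 or -08:00.
_TZD = r"(?:Z|[+-]\d{2}:\d{2})"
_DATE = r"\d{4}-\d{2}-\d{2}"

# The six levels from the W3C note, in the order they are listed there.
_LEVELS = [
    ("year", r"\d{4}"),
    ("year and month", r"\d{4}-\d{2}"),
    ("complete date", _DATE),
    ("date plus hours and minutes", _DATE + r"T\d{2}:\d{2}" + _TZD),
    ("date plus hours, minutes and seconds",
     _DATE + r"T\d{2}:\d{2}:\d{2}" + _TZD),
    ("date plus fractional seconds",
     _DATE + r"T\d{2}:\d{2}:\d{2}\.\d+" + _TZD),
]


def granularity(value):
    """Return the label of the matching level, or None if nothing matches."""
    for label, pattern in _LEVELS:
        if re.fullmatch(pattern, value):
            return label
    return None


# The DateLastModified example quoted above uses the fifth level.
level = granularity("1994-11-05T13:15:30Z")
```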

 

For Washington State the time offset from GMT is -8 hr (-7 hr

daylight saving time).

To figure the offset in hr from GMT, here is a useful site:

http://greenwichmeantime.com/local/usa/.

 

>>> Posting number 463, dated 15 Apr 2003 16:06:32

Date:         Tue, 15 Apr 2003 16:06:32 -0500

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         Dave Vieglais <vieglais@KU.EDU>

Subject:      Re: GMT and the required DateLastModified field

In-Reply-To:  <p05100301bac216189669@[207.207.104.113]>

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii; format=flowed

Content-Transfer-Encoding: 7bit

 

Hi All,

Perhaps one of the important things to realize here is the use of

this field.  It is intended to provide a timestamp indicating that the

record data is guaranteed not to have changed since this time.

Providing such a timestamp enables one to quickly determine if a copy of

a record needs to be updated from the source.

 

The actual data for this field should be stored as a DATETIME type in

the database, and will retain the precision native to that type in the

database.  This will generally be in the ballpark of hundredths of a second.

 

The representation of this field in queries and in response records will

be one of the ISO8601 formats, most likely that used in the description.

Note that the representation of date/time in the request or response

records is completely independent of the representation that is visible

to the user through either the database management interface or through

portal applications retrieving this information.

 

XXXXXXXXXX wrote:

> Dear All:  The manner that the required DateLastModified field

> description is written seems to require the date last modified to

> seconds.

 

Yes, but that does not mean that the data entry person needs to capture

this information to the nearest second.  Indeed, this field should

generally be determined by the database system, and not be manually

entered at all.

 

> 

> DateLastModified

> 

> ISO 8601 date and time in UTC(GMT) when the record was last modified.

> Example: "November 5, 1994, 8:15:30 am, US Eastern Standard Time"

> would be "1994-11-05T13:15:30Z".

> (http://dlp.cs.berkeley.edu/manis/darwin2ConceptInfo030315jrw.htm) .

> 

> However there appear to be six levels of date-time precision

> (granularity) and it might be easier to just go with the third of

> YYYY-MM-DD?

> 

> Formats (from http://www.w3.org/TR/NOTE-datetime).

> 

> Different standards may need different levels of granularity in the

> date and time, so this profile defines six levels. Standards that

> reference this profile should specify one or more of these

> granularities. If a given standard allows more than one granularity,

> it should specify the meaning of the dates and times with reduced

> precision, for example, the result of comparing two dates with

> different precisions.

> 

> The formats are as follows. Exactly the components shown here must be

> present, with exactly this punctuation. Note that the "T" appears

> literally in the string, to indicate the beginning of the time

> element, as specified in ISO 8601.

> 

> 

>    Year:

>       YYYY (eg 1997)

>    Year and month:

>       YYYY-MM (eg 1997-07)

>    Complete date:

>       YYYY-MM-DD (eg 1997-07-16)

>    Complete date plus hours and minutes:

>       YYYY-MM-DDThh:mmTZD (eg 1997-07-16T19:20+01:00)

>    Complete date plus hours, minutes and seconds:

>       YYYY-MM-DDThh:mm:ssTZD (eg 1997-07-16T19:20:30+01:00)

>    Complete date plus hours, minutes, seconds and a decimal fraction of a

> second

>       YYYY-MM-DDThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45+01:00)

> 

> For Washington State the time offset from GMT is -8 hr (-7 hr

> daylight saving time).

> To figure the offset in hr from GMT, here is a useful site:

> http://greenwichmeantime.com/local/usa/.

 

It is important that the correct timezone information is set on the

database server.  This is because in most cases, dates and times stored

in databases use local time, and generally do not capture the offset

from GMT.  Interface software (such as DiGIR) uses the system locale

information to determine the timezone of the system, and adjust incoming

requests to reflect local time so that the values can be compared with

the database entries.  Similarly for outgoing data.

 

So:

1.  Do not try to manually enter timestamps.  These values should be

computed by the database.

2. Make sure that the timezone information is correctly set on the

machine serving the data, and that all mirrors of the data use the same

timezone.

3. Make sure that the clocks on your system(s) are up to date.  Use some

time synching tools to make this happen automatically.
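
The conversion between local database time and the UTC form used in requests and responses can be sketched in Python as follows. This is only an illustration, not DiGIR's actual code: the function names are invented, and the offset from GMT is passed in as a fixed number of hours, whereas a real server would take it from the system's timezone settings (which also handle daylight saving time).

```python
from datetime import datetime, timedelta, timezone

ISO_UTC = "%Y-%m-%dT%H:%M:%SZ"


def local_to_utc_iso(local_naive, utc_offset_hours):
    """Render a naive local timestamp as an ISO 8601 UTC string."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    aware = local_naive.replace(tzinfo=local_zone)
    return aware.astimezone(timezone.utc).strftime(ISO_UTC)


def utc_iso_to_local(value, utc_offset_hours):
    """Turn an incoming ISO 8601 UTC string into a naive local timestamp."""
    utc = datetime.strptime(value, ISO_UTC).replace(tzinfo=timezone.utc)
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return utc.astimezone(local_zone).replace(tzinfo=None)


# The example from the field description: 8:15:30 am EST (UTC-5), 5 Nov 1994.
stamp = local_to_utc_iso(datetime(1994, 11, 5, 8, 15, 30), -5)
print(stamp)  # 1994-11-05T13:15:30Z
```

Going the other way, `utc_iso_to_local(stamp, -5)` gives back the local value that would be compared with the database entries.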

 

cheers,

   Dave V.

 

 

 

>>> Posting number 464, dated 15 Apr 2003 16:14:52

 

>>> Posting number 465, dated 15 Apr 2003 16:23:50

 

>>> Posting number 466, dated 15 Apr 2003 16:09:20

Date:         Tue, 15 Apr 2003 16:09:20 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Re: GMT and the required DateLastModified field

In-Reply-To:  <3E9C7458.9050305@ku.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii" ; format="flowed"

 

Dave:  You lost me.  I'm now thinking that this field has nothing to

do with the actual curatorial functions on the inhouse databases?

Here we have FileMaker Pro (inhouse) write data to the MaNIS server.

 

>Hi All,

>Perhaps one of the important things to realize here is the use of the

>this field.  It is intended to provide a timestamp indicating that the

>record data is guaranteed not to have changed since this time.

>Providing such a timestamp enables one to quickly determine if a copy of

>a record needs to be updated from the source.

 

  Seems obvious, but the source is what?  The MaNIS server or the

inhouse database?

 

........

 

>So:

>1.  Do not try to manually enter timestamps.

 

Meaning do not have auto enter date and time fields (and Datetime) in

the inhouse database stamp records?

 

>These values should be

>computed by the database.

 

Meaning the server software on the MaNIS server is going to stamp the

record as it is written to the server from the inhouse database?

Thus the inhouse database (FMP here) does not need the time stamp?

 

>2. Make sure that the timezone information is correctly set on the

>machine serving the data, and that all mirrors of the data use the same

>timezone.

Roger

 

>3. Make sure that the clock on your system(s) are up to date.  Use some

>time synching tools to make this happen automatically.

Roger, roger

 

> 

>cheers,

>   Dave V.

> 

 

 

 

>>> Posting number 467, dated 15 Apr 2003 17:08:17

Date:         Tue, 15 Apr 2003 17:08:17 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         Stan Blum <sblum@CALACADEMY.ORG>

Subject:      Re: GMT and the required DateLastModified field

In-Reply-To:  <p05100304bac23e0efa12@[207.207.104.113]>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

At 04:09 PM 4/15/03 -0700, you wrote:

>Dave:  You lost me.  I'm now thinking that this field has nothing to

>do with the actual curatorial functions on the inhouse databases?

>Here we have FileMaker Pro (inhouse) write data to the MaNIS server.

> 

>>Hi All,

>>Perhaps one of the important things to realize here is the use of the

>>this field.  It is intended to provide a timestamp indicating that the

>>record data is guaranteed not to have changed since this time.

>>Providing such a timestamp enables one to quickly determine if a copy of

>>a record needs to be updated from the source.

> 

>  Seems obvious, but the source is what?  The MaNIS server or the

>inhouse database?

 

XXXX et al.,

 

My understanding of this field is that its purpose is to lighten the load

of (ro)bots or spiders that will be going around to collection databases

and indexing their contents.  For example, the folks at ITIS Canada built

something that indexed collections for names, and so when someone searched

for a particular taxon, they could respond with links to databases holding

relevant specimen records.  Spiders like that don't want to have to copy

ALL the relevant data every time they visit, but only data that have

changed since they last visited.  They can keep track of when they last

visited; we just need to be able to respond to a query for everything "with

a DateLastModified newer than X".

 

Like most of you, we at CAS have our real databases separated from the

copies that are being queried from the Web.  Unfortunately, our original

databases don't have time-stamps to record when individual records were

last edited.  That means every time I update our web version, or

"DiGIR-resource", I export the whole thing.  I can, however, make life

easier for the spiders, by adding a time-stamp field to the web version and

setting it equal to the date-time of the upload.  In other words, I do the

upload to the web version (wiping out everything that was there before),

and then run an update query that sets "DateLastModified" to Now.  Only a

small portion of our records may be new, but I can't tell the spiders which

ones, so they have to "get them all".  I can tell them, when they come back

next, that nothing has been updated since they visited; they have the most

recent version that's available.
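
The reload-and-stamp approach described above, and the kind of query a spider would send, can be sketched as follows. This uses Python's built-in sqlite3 module as a stand-in for the SQL Server resource; the table name `specimen`, the function names, and the sample catalog numbers and dates are all invented for the illustration. Timestamps are stored as ISO 8601 UTC strings in a single fixed format, so plain string comparison orders them chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE specimen (CatalogNumberText TEXT, DateLastModified TEXT)"
)


def reload_resource(conn, catalog_numbers, upload_time_utc):
    """Wipe the web copy, load the full export, then stamp every record."""
    conn.execute("DELETE FROM specimen")
    conn.executemany(
        "INSERT INTO specimen (CatalogNumberText) VALUES (?)",
        [(number,) for number in catalog_numbers],
    )
    conn.execute("UPDATE specimen SET DateLastModified = ?", (upload_time_utc,))


def modified_since(conn, last_visit_utc):
    """Catalog numbers of records modified after the spider's last visit."""
    rows = conn.execute(
        "SELECT CatalogNumberText FROM specimen "
        "WHERE DateLastModified > ? ORDER BY CatalogNumberText",
        (last_visit_utc,),
    )
    return [row[0] for row in rows]


reload_resource(conn, ["101", "102", "103"], "2003-04-15T23:00:00Z")
first_visit = modified_since(conn, "2003-04-01T00:00:00Z")   # gets them all
second_visit = modified_since(conn, "2003-04-16T00:00:00Z")  # nothing new
```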

 

By the way, our DiGIR resource is in Microsoft SQL Server and the data type

I'm using for the DateLastModified field is smalldatetime.  Its precision

is only to the nearest minute, so when I ask it to spit out date time in

the (its version of the) ISO format, it pads the seconds with zeros.  I

think that's sufficient for our purposes.

 

Right now (daylight savings time) I have to add 8 hours to NOW to get

the correct value.  What I don't have yet is the right set up for dealing

with daylight savings time automatically...

 

This stuff is always so complicated.

 

-Stan

 

>>> Posting number 468, dated 17 Apr 2003 00:54:19

Date:         Thu, 17 Apr 2003 00:54:19 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Providing the DateLastModified field

In-Reply-To:  <5.2.0.9.0.20030415161716.04e6c6e0@mail.calacademy.org>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

For those of you whose data migrations I design and implement (i.e., those

conforming to my view of the world - heheh), the DateLastModified will be

constructed during the migration process. Each time the data get migrated,

I will compare the contents of each newly migrated specimen record to the

contents of the corresponding specimen record from the previous migration.

If the record has changed since the last migration, or if it is new since

the last migration, I will set the DateLastModified to the date when the

migration is made. If the record is unchanged, I will retain the previous

DateLastModified. I'm considering recording the nature of the changes and

making that resource available as well, but I haven't fully thought through

the implications or utility of creating such a resource. Comments are welcome.

 

For those subscribers who are interested, my proposed implementation for

determining the DateLastModified will work properly only for those records

that have a unique identifier (i.e., a unique combination of

InstitutionCode, CollectionCode, and CatalogNumberText, within a resource).

Records that have a duplicate identifier within the resource will all have

the DateLastModified set to the most recent migration date.
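
A minimal Python sketch of the comparison described above, assuming each migration is available as a list of dictionaries keyed by field name. The function names are invented, and treating an identifier that was duplicated in the previous migration as "changed" is an assumption of this sketch rather than something stated in the message.

```python
from collections import Counter

ID_FIELDS = ("InstitutionCode", "CollectionCode", "CatalogNumberText")


def record_id(record):
    """The identifier that should be unique within a resource."""
    return tuple(record[field] for field in ID_FIELDS)


def content(record):
    """Record contents without the timestamp itself."""
    return {k: v for k, v in record.items() if k != "DateLastModified"}


def stamp_migration(previous, current, migration_date):
    """Return the records in `current` with DateLastModified filled in."""
    current_counts = Counter(record_id(r) for r in current)
    previous_counts = Counter(record_id(r) for r in previous)
    previous_by_id = {record_id(r): r for r in previous}
    stamped = []
    for record in current:
        rid = record_id(record)
        old = previous_by_id.get(rid)
        # Only a uniquely identified, unchanged record keeps its old date.
        unchanged = (
            current_counts[rid] == 1
            and previous_counts[rid] == 1
            and content(old) == content(record)
        )
        new_record = content(record)
        new_record["DateLastModified"] = (
            old["DateLastModified"] if unchanged else migration_date
        )
        stamped.append(new_record)
    return stamped
```

New records, changed records, and records sharing a duplicate identifier all receive the migration date; everything else retains its earlier DateLastModified.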

 

John

 


 

>>> Posting number 469, dated 17 Apr 2003 01:49:19

Date:         Thu, 17 Apr 2003 01:49:19 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: two questiones

In-Reply-To:  <31320.1050426708@www21.gmx.net>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

XXX has some good questions, the discussion of which may be of benefit to

many of you. I've interspersed my commentary with the original message.

 

>I encountered a few more cases that are new to me :)) and would highly

>appreciate your suggestions on how to deal with them. Compared to Kenya, a

>lot of records in Guatemala have very precise distances, and many of them

>represent one of the cases listed below.

> 

>1 What value of "Distance precision" shall I enter if the distances given

>along orthogonal directions have different precisions, like 2 km E and

>6.5 km S San Marcos? Is the "distance precision" value 1 km or 0.5 km in

>this case? In other records it is like 2.0 km E and 6.5 km S, which seems

>clearer. In a few other records they differ even more, like 7 km E and

>1.25 km N. I am not sure how to proceed in these cases.

 

Use the more precise measurement as the gauge. The premise is that the one

who recorded the data was cognizant of that level of precision and executed

the measurement with equal precision even though the record does not

reflect it. So, for your example "7 km E and 1.25 km N" the distance

precision would be 1/4 km.
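
The rule of taking the more precise of the two measurements can be sketched in Python as below. The function names and the list of candidate denominators (halves, quarters, eighths, then decimal places) are assumptions made for this illustration, and whole-number distances are simply given a precision of 1 unit, which is a simplification.

```python
from fractions import Fraction

# Assumed fractions that people use when recording distances.
CANDIDATE_DENOMINATORS = (2, 4, 8, 10, 100, 1000)


def distance_precision(distance):
    """Precision implied by one distance, given as text such as "1.25"."""
    fractional_part = Fraction(distance) % 1
    if fractional_part == 0:
        return Fraction(1)  # whole number: assume a precision of 1 unit
    for denominator in CANDIDATE_DENOMINATORS:
        if (fractional_part * denominator).denominator == 1:
            return Fraction(1, denominator)
    return Fraction(1, fractional_part.denominator)


def offset_precision(*distances):
    """Use the most precise (smallest) value among the offset distances."""
    return min(distance_precision(d) for d in distances)


example = offset_precision("7", "1.25")
print(example)  # 1/4
```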

 

>2 A lot of distances are given along the road with the direction indicated,

>like 15 km S (by road) Santa Anna. We also have the same distance to the same

>place measured by air. Shall I try to follow/measure the road distance?

 

Here's my "official" stance on the subject - an excerpt from a paper I've

written for the International Journal of GIS with Qinghua Guo and Robert

Hijmans, both at UC Berkeley.

 

"3.4 Using offsets

Offsets generally consist of combinations of distances and directions from

a named place. Some locality descriptions explicitly state the path to

follow when measuring the offset (e.g. 'by road', 'by river', 'by air', 'up

the valley', etc.). In this case the georeferencer should follow the path

designated in the description using a map with the largest available scale

to find the coordinates of the offset from the named place. The smaller the

scale of the map used, the more the measured distance on the map is likely

to overshoot the intended target.

It is sometimes possible to infer the offset path from additional

supporting evidence in the locality description. For example, the

locality '58 km NW of Haines Junction, Kluane Lake' supports a measurement

by road since the final coordinates by that path are nearer to the lake

than going 58 km NW in a straight line. By convention, localities

containing two offsets in orthogonal directions (e.g. '10 km S and 5 km W

of Bikini Atoll') are always linear measurements.

Sometimes the environmental constraints of the collected specimen can imply

the method of measurement. For example, '30 km W of Boonville' if taken as

a linear measurement, would lie off the coast of northern California in the

Pacific Ocean. If the locality refers to the collection of a terrestrial

mammal, it is likely that the collector followed the road heading west out

of Boonville, winding toward the coast, in which case the animal was

collected on land.

If the above methods fail to distinguish the offset method, it

may be necessary to refer to more detailed supplementary sources, such as

field notes or itineraries, to determine this information. Supplementary

sources often do not exist, or do not contain additional information,

making it difficult to distinguish between offsets meant to be along a path

and those meant to be along a straight line. A particularly conservative

approach would be to not georeference localities that fall into this

category and instead record a comment explaining the reasoning. However,

value can still be derived from georeferencing localities that suffer from

the ambiguity described above. One solution is to determine the coordinates

based on one or the other of the offset paths. Another solution is to use the

midpoint between all possible paths. There may be discipline-specific

reasons to choose one solution over another, but the georeferencer should

always document the choice and accommodate the ambiguity in the uncertainty

calculations. "

 

>3 A lot of records in Guatemala refer to farms (called "fincas"). Most of

>them have their own coordinates. In addition to the farm name, the records

>give distances from a larger named place. The problem is that the distances

>from the same named place differ. I assume that the animals were captured

>in different areas of the farm. The differences in the distances are usually

>not very big. The question is: shall I refer to the farm as such, or

>calculate coordinates from the distances from the named place?

>Here are some examples:

>Finca Santa Julia, 1.5 km E and 0.75 km S San Rafael Pie de La Cuesta

>and

>Finca Santa Julia, 1.5 km SE (by air) San Rafael Pie de La Cuesta

>as well as

>Finca Santa Julia, ca. 1.25 km E, 0.75 km S (by air) San Rafael Pie de La

>Cuesta

 

This one is a very interesting problem. One is left to wonder, from such

descriptions, if the collector meant "all of Finca Santa Julia, which can

be found at 1.5 km E and 0.75 km S San Rafael Pie de La Cuesta" or "1.5 km

E and 0.75 km S San Rafael Pie de La Cuesta, which happens to be in Finca

Santa Julia."

 

The first question for me as a georeferencer is, "Can I find Finca Santa

Julia in a resource that tells me how big the Finca is?" If I cannot, then

I will ignore that part of the locality and use the alternative description

and I'll note "can't find Finca Santa Julia, georeferenced based on

offsets." If I can find Finca Santa Julia and it has a smaller extent than

the precision of the offsets, I will ignore the offset information and use

the location and size of the Finca. If I can find Finca Santa Julia and it

has a larger extent than the precision of the offsets, I will use the

intersection of the Finca with the bounding box or circle defined by the

offsets and their precision as my named place and calculate my uncertainty

based on the rules for a locality of the type "Named Place Only."
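
The three cases just described can be written down as a small decision function. This is only a sketch: the function name, its arguments, and the returned wording are invented here, and the case where the finca's extent exactly equals the offset precision is not covered in the message, so it is arbitrarily grouped with the "larger" case.

```python
def georeference_basis(finca_extent_km, offset_precision_km):
    """Pick what to georeference from.

    finca_extent_km is None when the finca cannot be found in any
    resource that gives its size.
    """
    if finca_extent_km is None:
        return "offsets only; note that the finca could not be found"
    if finca_extent_km < offset_precision_km:
        return "location and size of the finca; ignore the offsets"
    # Larger extent (or, by assumption, equal): combine both pieces.
    return "intersection of the finca with the offset area, as a named place"


basis = georeference_basis(None, 0.25)
```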

 

>4 The last one. How do I deal with records where animals were captured at a

>particular place (coordinates given) but died in captivity? Shall I georeference

>them like the others, or shall I add some remarks to the "CaptiveFlag"?

 

This is another interesting question. We're not concerned about where the

animal died per se. Instead, we're concerned about whence it was taken from

the wild. It will be difficult, when georeferencing without reference to

the specimens, to know if the specimens collected at that locality were

captive at the time or not, unless the locality itself says something

about captivity. Here are two fictitious examples:

 

"captive from fox farm 2 km E of Kluane Lake"

"fox farm 2 km E of Kluane Lake"

 

The first example is trivial - the CaptiveFlag should be "yes." The second

locality, however, could equally well refer to a captive fox from the fox

farm or to an unlucky itinerant ground squirrel that was collected there.

The bottom line is that the CaptiveFlag is not really a proper attribute of

the locality itself, but of the collecting event that associates the

locality with a specimen. I put it in the gazetteer to identify those

locality records that are known to refer only to collections of captive

animals.

 

>sorry for disturbance

 

Not at all. These were very good questions.

 

John

 

>Thanks

> 

>XXX

 

 

>>> Posting number 470, dated 18 Apr 2003 15:55:53

 

>>> Posting number 471, dated 20 Apr 2003 07:47:23

Date:         Sun, 20 Apr 2003 07:47:23 -1000

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         

Subject:      Re: Providing the DateLastModified field

In-Reply-To:  <5.1.1.5.2.20030417003736.0176ba08@socrates.berkeley.edu>

MIME-version: 1.0

Content-type: text/plain; charset=us-ascii

Content-transfer-encoding: 7bit

 

Hi John (and others),

 

> I'm considering recording the nature of the changes and

> making that resource available as well, but I haven't fully

> thought through

> the implications or utility of creating such a resource. Comments

> are welcome.

 

For what it's worth, I'm setting up our Access databases such that every

data edit is logged at the time of edit (similar to transaction logs of more

robust DBMS apps).  The table name is "EditLog", and the fields are:

 

Field Name      Type          Description

----------------------------------------------------------------------------

EditLogID       AutoNumber*   Unique Primary Key

TableName       Text (255)    Name of table in which record was edited

FieldName       Text (255)    Name of field in which record was edited

PKID            Long Int.     Unique Primary Key of record that was edited

PreviousValue   Memo          Previous value of Field in record that was edited

EditorID        Long Int.     ID number of the Person who edited the record

TimeStamp       Date/Time     Date and time when record was edited

 

*AutoNumber field automatically assigns a unique random long integer value

to each new record.

 

Thus, with a single table, I can track all record transactions.  This

structure works best if each table has a Long Integer as its primary key

(usually surrogate), but with some alterations to the PKID field(s) you

could probably accommodate multiple-field and other "natural" primary key

values. Only one transaction is logged for each new record addition

(FieldName="*"; PreviousValue="ADDED").  When records are deleted, I log the

value of all non-null fields for that record, and log an additional

transaction record with FieldName="*"; PreviousValue="DELETED".

 

Whenever a record is updated, Code is triggered to interrogate the record

for each field whose value has changed, so transaction logs are created for

only those fields (except for a DELETE transaction, in which case all

non-null values are logged). Although I apply this logging Code at the time

of record edit, the same code could be modified to work at the time of data

migration.

 

Note that only the previous values are logged.  Current values are obtained

from the active data tables.  The Code currently skips fields whose previous

value is NULL; the assumption being that if no edit log record exists for

the field, its previous value was NULL.  This approach is flawed for the

case where a record had a value, then was changed to NULL, then was later

changed to another value.  The first and second values would be logged, but

no log record would exist of the intermediate NULL state (and thus no way to

know that the field was NULL for a period of time in-between). I'm

considering several alternative solutions to this problem.
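
For readers who want to see the update logging in concrete form, here is a rough Python sketch of the behaviour described above, including the skipped NULL previous values. It is not the Access code itself: the function name `log_update` is invented, rows are plain dictionaries, None stands in for NULL, and EditLogID is a simple counter rather than an AutoNumber.

```python
from datetime import datetime, timezone


def log_update(edit_log, table_name, pk_id, old_row, new_row, editor_id):
    """Append one EditLog entry per changed field (previous value only)."""
    when = datetime.now(timezone.utc)
    for field, previous in old_row.items():
        if new_row.get(field) == previous:
            continue  # field unchanged, nothing to log
        if previous is None:
            continue  # as described above: NULL previous values are skipped
        edit_log.append({
            "EditLogID": len(edit_log) + 1,
            "TableName": table_name,
            "FieldName": field,
            "PKID": pk_id,
            "PreviousValue": previous,
            "EditorID": editor_id,
            "TimeStamp": when,
        })


# The flawed case: a value is set to NULL and later to another value.
log = []
log_update(log, "Locality", 1, {"County": "A"}, {"County": None}, editor_id=7)
log_update(log, "Locality", 1, {"County": None}, {"County": "B"}, editor_id=7)
previous_values = [entry["PreviousValue"] for entry in log]
```

After both edits the log holds only the entry for "A"; nothing records that the field was NULL in between.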

 

In any case, I've found it to be an extremely useful generic feature to add

to databases to maintain an edit history of each field.

 

Anyone interested in more details, please feel free to contact me.

 

XXXXX

 

>>> Posting number 472, dated 20 Apr 2003 22:23:26

Date:         Sun, 20 Apr 2003 22:23:26 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         

Subject:      A caution using NIMA and other gazetteers that are incomplete.

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding: 7bit

 

Dear All:

I downloaded the BC lat/longs from NIMA and did an automated then a

semi-automated filtered pass through the BC records.  There were 2,515

records in the NIMA dataset and I should have suspected that it was

incomplete.  Initially I was pleased as I got about 60% apparently good

hits. But then I downloaded the BCGNIS gazetteer of 43,690 records and

started looking up records not found in NIMA.  It wasn't long until I

realized that many (more than half) of the apparently singular and

apparently unambiguous hits from NIMA were not singular and hence ambiguous.

An example:

 

There are two Vernons in BCGNIS, but only one in NIMA:

Vernon, City 50.2583 -119.2667

Vernon, Community 50.0333 -126.3500

 

There were about 45 Vernon records in the MaNIS download and 37 had only

Vernon in SpecLoc.

 

Most are not this bad but only involve a few records with a possible city or

community named after a geographical feature (lake, creek, bay, mount).

 

In the BCGNIS records, there are 3,198 multiple placenames (city, community,

locale, lake etc.). A frequency count of the multiples (I was curious):

 

Occurrences of the same name  Number of such names in BCGNIS

2  2174

3  568

4  209

5  96

6  59

7  30

8  16

9  18

10  7

11  4

12  4

13  3

14  3

16  2

17  2

18  1  Summit Lake (1 community, 15 lakes, 2 localities)

19  1  Bear Creek (17 Creeks, 1 locality, 1 railway point)

20  1  Long Lake  (all Lakes)

 

So, use GNIS for Canada and avoid NIMA.
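
A check for this kind of ambiguity is easy to automate before trusting a "singular" hit. The Python sketch below assumes the gazetteer has been loaded as a list of (name, feature type, latitude, longitude) tuples; the function names are invented, and apart from the two Vernon entries quoted above, the sample row is a made-up placeholder.

```python
from collections import Counter


def name_multiplicity(gazetteer):
    """Map each place name to the number of entries that share it."""
    return Counter(name for name, _feature, _lat, _lon in gazetteer)


def ambiguous_names(gazetteer):
    """Names that occur more than once and so cannot be matched blindly."""
    counts = name_multiplicity(gazetteer)
    return sorted(name for name, count in counts.items() if count > 1)


def multiplicity_histogram(gazetteer):
    """How many names occur 2 times, 3 times, and so on."""
    counts = name_multiplicity(gazetteer)
    histogram = Counter(c for c in counts.values() if c > 1)
    return dict(sorted(histogram.items()))


sample = [
    ("Vernon", "City", 50.2583, -119.2667),
    ("Vernon", "Community", 50.0333, -126.3500),
    ("Sample Place", "Locale", 0.0, 0.0),  # hypothetical placeholder row
]
ambiguous = ambiguous_names(sample)
histogram = multiplicity_histogram(sample)
```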

 

The situation in the US is somewhat better with a county to assist in

filtering.  However, it is circular to use county to verify the SpecLoc.  We

assume county was entered with the locality but this might not be the case

and could have been added at any time to the computerized record.

 

Both NIMA and BCGNIS are to the nearest minute so there are no differences

in precision.

 

>>> Posting number 473, dated 23 Apr 2003 13:05:16

 

>>> Posting number 474, dated 27 Apr 2003 17:29:04

Date:         Sun, 27 Apr 2003 17:29:04 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Old GeorefCalculator made obsolete

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear All,

 

Having received no bug reports on the new GeorefCalculator since its

release on 14 Mar 2003, I decided to remove all references to the old

georeferencing calculator from the MaNIS web pages. All links to the

calculator are now supposed to point to the following URL:

 

http://elib.cs.berkeley.edu/manis/gc.html

 

Thanks for not breaking it!

 

John

 

>>> Posting number 475, dated 30 Apr 2003 16:02:52

 

>>> Posting number 476, dated 1 May 2003 13:59:55

 

>>> Posting number 477, dated 5 May 2003 14:18:11

 

>>> Posting number 478, dated 9 May 2003 13:11:22

 

>>> Posting number 479, dated 13 May 2003 10:02:58

 

>>> Posting number 480, dated 13 May 2003 16:29:35

 

>>> Posting number 481, dated 13 May 2003 16:50:29

 

>>> Posting number 482, dated 16 May 2003 12:52:08

 

>>> Posting number 483, dated 23 May 2003 13:10:51

 

>>> Posting number 484, dated 29 May 2003 14:30:19

 

>>> Posting number 485, dated 29 May 2003 16:34:24

 

>>> Posting number 486, dated 29 May 2003 11:48:00

 

>>> Posting number 487, dated 29 May 2003 15:23:01

Date:         Thu, 29 May 2003 15:23:01 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Martinique and Guadeloupe

In-Reply-To:  <5.2.1.1.0.20030529163213.00ab4888@packrat.musm.ttu.edu>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXXX, and all,

 

This mix up on my part is important to understand as we start claiming "the

dregs" of localities. The problems encountered are especially likely to

occur with islands, and even more likely for islands that are possessions

of another country.

 

 

>Do you mean that there were some Guadeloupe localities that showed up when

>you downloaded France?  TTU has already finished the regular Guadeloupe

>localities.  Please contact me.

 

Yes, there were. If you look on the country list you'll find that France

and Guadeloupe appear independently. When they downloaded based on

country=France, there were 83 records with Guadeloupe in the StateProvince

field. These are completely distinct from the records that would be

downloaded by a query on country=Guadeloupe, of which there are 17 records.

As it turns out, there are 3 Guadeloupe records that aren't located by

either of these "country=" queries because the word Guadeloupe is in some

other field. The only sure way to get them all is to query on Higher

Geography contains Guadeloupe, which returns a total of 103 records.
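The difference between the two kinds of query can be sketched in a few lines. This is an illustration only, assuming each record is a dictionary holding nothing but higher-geography fields; Country and StateProvince are named above, while the "Island" field, the sample values, and the function names are hypothetical.

```python
# Hypothetical records. Country and StateProvince are named in the message
# above; the "Island" field and all values are assumptions for illustration.
RECORDS = [
    {"Country": "France", "StateProvince": "Guadeloupe"},
    {"Country": "Guadeloupe", "StateProvince": ""},
    {"Country": "", "StateProvince": "", "Island": "Guadeloupe"},
    {"Country": "France", "StateProvince": "Bretagne"},
]


def by_country(records, country):
    """Records whose Country field equals the given value exactly."""
    return [r for r in records if r.get("Country") == country]


def by_higher_geography(records, region):
    """Records where any higher-geography field contains the region name."""
    needle = region.casefold()
    return [r for r in records
            if any(needle in str(value).casefold() for value in r.values())]


print(len(by_country(RECORDS, "Guadeloupe")))
print(len(by_higher_geography(RECORDS, "Guadeloupe")))
```

Comparing the two counts before claiming a region is a quick way to see whether any records would be orphaned.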

 

So, from here on out it is a good idea to put the region you are trying to

match into the Higher Geography field.  Please, also let me know the number

of localities that match your claim, so that I can check that nothing is

being "orphaned." Such is the price of heterogeneous data structures. This

is exactly the kind of thing that makes people fanatic about

standardization. (I'm not one of those people, for the record).

 

Sorry for the mix up,

 

John

 

>>> Posting number 488, dated 30 May 2003 14:46:26

 

>>> Posting number 489, dated 31 May 2003 17:09:52

 

>>> Posting number 490, dated 2 Jun 2003 06:30:41

 

>>> Posting number 491, dated 2 Jun 2003 10:05:19

 

>>> Posting number 492, dated 2 Jun 2003 12:21:30

 

>>> Posting number 493, dated 2 Jun 2003 13:16:08

 

>>> Posting number 494, dated 2 Jun 2003 14:04:52

 

>>> Posting number 495, dated 2 Jun 2003 22:38:22

 

>>> Posting number 496, dated 3 Jun 2003 10:34:20

Date:         Tue, 3 Jun 2003 10:34:20 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: Last doc

Comments: To: Juan Carlos Hernández Barrios

          <jhernan@xolo.conabio.gob.mx>

In-Reply-To:  <5.2.0.9.0.20030602160747.00b4bb30@xolo.conabio.gob.mx>

Mime-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"; format=flowed

Content-Transfer-Encoding: quoted-printable

 

Juan Carlos,

 

I have confirmed that the file has been received, and that all of the files

for all of the states that you claimed for georeferencing for MaNIS are

archived at MVZ.

 

I'd like to extend sincere appreciation for your efforts from all of the

MaNIS participants and from those who will benefit in perpetuity from your

contribution to this project.

 

Thank you,

 

John

 

At 08:45 AM 6/3/03 -0500, you wrote:

>   Hello John

>   After several months, we've finished the Mexican records that we have

> selected from the whole country, originally we had selected 12098 records

> out of 30 thousand+-.

>   In this document: CONABIO-Mexico-2003-6-2.txt,

>   we send you the records from Sonora and Sinaloa,

> 

>   They sum a total of 3,079,

>   2482 georeferenced and

>     597 not georeferenced

> 

>   This is our last delivery, let us know if you have some more

> questions about the files that we had sent to you.

> 

>   Cheers.

>   Juan Carlos Hernández Barrios

>   CONABIO

 

>>> Posting number 497, dated 3 Jun 2003 15:33:12

Date:         Tue, 3 Jun 2003 15:33:12 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      problems with Microsoft Excel

MIME-Version: 1.0

Content-Type: text/plain; charset=us-ascii

Content-Transfer-Encoding: 7bit

 

Hello fellow georeferencers,

 

I have been working on the MaNIS project for the past year and am

currently georeferencing Baja California, Mexico.  I usually work in

Microsoft Excel.  A few weeks ago I made the unfortunate discovery that

at some point during my georeferencing process I sorted my Baja Excel

file while a few columns were hidden.  In many versions of Excel (listed

below) a sort while columns are hidden results in the visible columns

being sorted without the hidden columns, thus scrambling the data.  In

my situation I had hidden the LocalityID and the CollectionCode columns

so that after the sort (or multiple sorts) these columns were no longer

associated with the locality information and my georeferenced results.

I was able to fix the problem by re-associating the LocalityID and

CollectionCode from original MaNIS downloaded files with the locality

description in my completed georeferenced data.  I am grateful however

that I discovered the issue before I sent the finished data to John with

mismatched data.  John informed me that they are not checking for this

type of problem as the data comes in, but that their data checking

methods would reveal such a problem later on in the process, at which

point he could re-associate localities with their LocalityIDs, though

with a great deal of effort.

 

I am sure all of you are aware of the sorting issues involved when

working in Excel and have not made the same mistake that I did.  If you

are not already doing so, I suggest that each georeferencer spot check

their data against the original files before submitting them to John.  I

could easily see how the problem I encountered would have gone unnoticed

if I hadn't referred back to the original file to check my work.
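The spot check suggested above can also be done for every row at once outside of Excel. The sketch below is one possible approach, not the checking method used at MVZ. It assumes both files have been read into lists of dictionaries and that LocalityID is unique within a file; the function name and the default field list are hypothetical.

```python
def find_mismatches(original_rows, final_rows, key="LocalityID",
                    fields=("CollectionCode", "HigherGeog", "SpecLocality")):
    """List (key value, problem) pairs where the final file disagrees
    with the original download for the same key."""
    original_by_key = {row[key]: row for row in original_rows}
    problems = []
    for row in final_rows:
        source = original_by_key.get(row[key])
        if source is None:
            problems.append((row[key], "missing from original"))
            continue
        for field in fields:
            if row.get(field) != source.get(field):
                problems.append((row[key], field))
    return problems


# Made-up rows: the final file has its localities swapped between IDs,
# which is what a sort with hidden ID columns would produce.
original = [
    {"LocalityID": "1", "CollectionCode": "A", "SpecLocality": "Locality one"},
    {"LocalityID": "2", "CollectionCode": "B", "SpecLocality": "Locality two"},
]
final = [
    {"LocalityID": "1", "CollectionCode": "A", "SpecLocality": "Locality two"},
    {"LocalityID": "2", "CollectionCode": "B", "SpecLocality": "Locality one"},
]
print(find_mismatches(original, final))
```

An empty result means every row still carries the locality text it was downloaded with.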

 

If you are using Excel XP this problem is no longer an issue as it was

revised so that now hidden columns are sorted with other columns.  In

the versions of Microsoft Excel listed below, the sorting feature does

not sort hidden rows or columns.

 

Microsoft Excel 2000

Microsoft Excel 2002

Microsoft Excel 97 for Windows

Microsoft Excel for Windows 95 7.0

Microsoft Excel for Windows 95 7.0a

Microsoft Excel for Windows 5.0

Microsoft Excel for Windows 5.0c

Microsoft Excel 98 Macintosh Edition

Microsoft Excel for the Macintosh 5.0

Microsoft Excel for the Macintosh 5.0a

 

Feel free to contact me if you want/need more specific information about

this problem.

 

XXXXXXXXXX

 

>>> Posting number 498, dated 4 Jun 2003 06:53:17

Date:         Wed, 4 Jun 2003 06:53:17 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:        

Subject:      Excel features

MIME-Version: 1.0

Content-Type: text/plain; charset="iso-8859-1"

Content-Transfer-Encoding:  quoted-printable

 

XXX et al:  Select the entire worksheet or row to avoid the partial sort

problem.  Avoid selecting cells for sorts.  Guess you know that, but others

might not.  Not sure how far back this works, but it does in Excel 2000.

 

Another nifty comparison via Excel is a cell by cell comparison.  Prior to

submitting data to JW, I sort a copy of the original download and the final

copy the same way.  Then enter in an unused col in cell 1 of a worksheet an

if statement:

 

=if(Final!A:A=OriginalDwnld!A:A,"", 1)

 

where I click col A in Final and col A OrigDwnld.  Dragging to the right

will do the same for cols B-D. Then a return does the comparison and then a

fill down of the entire col will compare entire cols.  If

the same, the cell will be blank, if different, a 1 will show.  Text can also

be substituted for the result values blank and 1.  For example "same" (need

quotes) and "different" gives the result in words.  This allows me to check

that I have not made any changes to the original MaNIS data (MaNIS

site, collection, HigherGeog, SpecLocality) prior to submitting.
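The same cell by cell comparison can be written as a short script. This is a rough equivalent of the worksheet formula, not a replacement for it. It assumes both sheets have been exported as lists of rows of text, with the original MaNIS columns first; the function name and sample rows are hypothetical.

```python
def compare_cells(original, final, n_cols):
    """Sort both tables on their first n_cols columns, then flag each of
    those cells as "" (same) or 1 (different), like the IF formula does."""
    def sort_key(row):
        return row[:n_cols]

    a = sorted(original, key=sort_key)
    b = sorted(final, key=sort_key)
    if len(a) != len(b):
        raise ValueError("the two tables have different numbers of rows")
    return [["" if ra[c] == rb[c] else 1 for c in range(n_cols)]
            for ra, rb in zip(a, b)]


# Made-up rows. The final copy has two extra columns (the georeference)
# and one locality string that was edited by accident.
original = [
    ["2", "COLL", "Region", "Locality B"],
    ["1", "COLL", "Region", "Locality A"],
]
final = [
    ["1", "COLL", "Region", "Locality A", "10.0", "-20.0"],
    ["2", "COLL", "Region", "Locality B, edited", "11.0", "-21.0"],
]
print(compare_cells(original, final, 4))
```

As in the worksheet, a row that is out of place shifts every comparison below it, so a long run of 1s usually points to a sorting problem rather than to many separate edits.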

 

Cell, column, row or range locking can also be done via

Format>Cell>Protection tab then Tools>Protection>Document.  Unlocked cells

can be changed but locked cells cannot be altered.

 

 

>>> Posting number 499, dated 4 Jun 2003 16:50:58

 

>>> Posting number 500, dated 4 Jun 2003 17:45:42

 

>>> Posting number 501, dated 6 Jun 2003 13:35:48

 

>>> Posting number 502, dated 6 Jun 2003 13:44:35

 

>>> Posting number 503, dated 6 Jun 2003 13:50:31

 

>>> Posting number 504, dated 9 Jun 2003 16:38:32

 

>>> Posting number 505, dated 10 Jun 2003 12:31:56

 

>>> Posting number 506, dated 10 Jun 2003 14:02:22

Date:         Tue, 10 Jun 2003 14:02:22 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Re: [HERPNET] Distances by road

Comments: To: HERPNET@USOBI.ORG

In-Reply-To:  <p0510031ebb0ad3cd7747@[207.207.103.216]>

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

XXXXXX, and all,

 

Please excuse the cross-posting, but I think this topic is critical for all

of our efforts.

 

I fully understand that it is, at times, possible to do better than

prescribed in the MaNIS Guidelines. For anyone interested, I also have a

pdf file of a manuscript that is currently in review for the International

Journal of Geographical Information Science. The manuscript goes into more

depth than the currently "published" guidelines

(http://dlp.cs.berkeley.edu/manis/GeorefGuide.html) and addresses just such

questions as we are now discussing. Please write to me (don't reply to this

message, as your reply will go to the HerpNet list) if you would like to

see that paper.

 

The Georeferencing Guidelines have a threefold purpose. First, and

foremost, they are intended to help educate the georeferencer with respect

to the complexities involved in taking a descriptive locality and making a

spatially explicit determination from it. It is not unlike making a

specimen identification - an opinion is rendered based on available

information. Second, they are intended to provide consistency across a

large-scale operation such as HerpNet, whether it is done collaboratively

or not. Third, the product resulting from the application of the guidelines

is intended to be maximally useful. It is the combination of these goals,

along with the limitations on resources, that should shape the course of

action.

 

OK, I think I'm done waxing philosophic. I'll try to address some specifics

interspersed in the original message.

 

At 06:13 PM 6/9/03 -0700, you wrote:

>John, I've spent a lot of time collecting in tropical countries,

>where there may be little villages and long stretches of road

>between them, and if I said I collected something in San Pedro, I

>would mean San Pedro, not something with an error figure "within 50

>km of San Pedro" just because the nearest village was 100 km away.

 

I don't disagree with you at all. However, if I don't know about you, or

your methods, how can I know what you mean by San Pedro? Do you mean the

center of San Pedro? Or the edge of San Pedro as you drive out of town?

This is where the real problem comes in - where *is* the edge of town? We

have done quite a bit of work to determine if one could estimate the size

of a town from its population, that being one piece of information that is

generally available. Whereas there is a correlation between these two

attributes, it is not consistent enough to provide a simple rule. So, we

try to figure out the sizes (extents) of towns and other features whenever

possible. In fact, we record that information, along with the source from

which we got the information, so that others can take advantage of it in

the future.

 

Back to your example. Clearly we are losing specificity by following the

rule in this case, and we're compromising the third goal, that of

maximizing the usefulness of the data. There are ways to do better, and

georeferencers are free to do so *if* they document their assumptions in

the LatLongRemarks field provided for that purpose. One way to do better

for this example is to find maps of a scale large enough to show the extent

of the named place. This really is the ideal, and suggests that whoever has

the best resources for a given geographic area should claim and

georeference it. Resources may include good maps, or extensive knowledge,

or supplemental material (such as field notes) for many of the localities

in a geographic region. Even so, there will be cases where an extent cannot

be determined. For such cases, a simple rule - no, guideline - is needed. I

chose a guideline that allowed the locality to be georeferenced, albeit

with a liberal margin of uncertainty. It is perfectly reasonable to

establish an alternative guideline that says "don't georeference a locality

for which the extent of the named place cannot be determined." No matter

which guideline is adopted, the determination (the georeference) is never

beyond revision. Suppose the "half-way" guideline is followed. One could

later discover candidate localities for refined georeferencing by searching

on the CoordinateUncertaintyInMeters  (see

http://dlp.cs.berkeley.edu/manis/darwin2ConceptInfo030315jrw.htm). For

example, I could search for all records where the collector was "John

Wieczorek" and the CoordinateUncertainty in meters is greater than 5000 m,

because I know where all of my collecting localities are to within that

level of uncertainty. I could look over the results and change the

georeference to reflect my knowledge of the collecting events. After doing

so, I could even update the metadata about the determination to say that it

has a VerificationStatus of "collector-verified," whereas before I

revisited it the VerificationStatus was "unverified." In the alternate

scenario, in which the guideline says that "no locality shall be

georeferenced without knowing (and recording) the extent of the named

place," I would fill in a NoGeorefBecause field to say "extent not found."

In this scenario I could later search for all localities where the

collector was "John Wieczorek" and the NoGeorefBecause field was not null.

I could look through those records to see if there were any that I could

georeference because of my special knowledge of the events. Either method

works. Remember, a georeference is just an opinion. It's essential to know

how the opinion was formed if you intend to use it for anything important.
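The two searches described above can be sketched as simple filters. This is an illustration under stated assumptions, not a MaNIS query interface: records are taken to be dictionaries, CoordinateUncertaintyInMeters and NoGeorefBecause are the fields named above, and the "Collector" key, the function names, and the sample records are hypothetical.

```python
def candidates_for_refinement(records, collector, min_uncertainty_m=5000):
    """Georeferenced records by one collector whose uncertainty is larger
    than that collector could improve on from personal knowledge."""
    return [r for r in records
            if r.get("Collector") == collector
            and r.get("CoordinateUncertaintyInMeters") is not None
            and r["CoordinateUncertaintyInMeters"] > min_uncertainty_m]


def candidates_without_georeference(records, collector):
    """Records by one collector that were left ungeoreferenced, with a reason."""
    return [r for r in records
            if r.get("Collector") == collector and r.get("NoGeorefBecause")]


# Made-up records covering both guidelines.
records = [
    {"Collector": "John Wieczorek", "CoordinateUncertaintyInMeters": 50000},
    {"Collector": "John Wieczorek", "CoordinateUncertaintyInMeters": 300},
    {"Collector": "John Wieczorek", "NoGeorefBecause": "extent not found"},
    {"Collector": "Someone Else", "CoordinateUncertaintyInMeters": 80000},
]
print(len(candidates_for_refinement(records, "John Wieczorek")))
print(len(candidates_without_georeference(records, "John Wieczorek")))
```

Either list is a work queue for the person with special knowledge of the collecting events.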

 

>I realize that some collectors in the past were much more casual about

>localities, but for the last 50 years, in my experience, specimens

>have been more accurately allocated, and I would argue for a little

>less liberal convention.

 

I have found that this is not a universal truth. Some of the finest

collectors today remain incapable of recording a reasonable locality

description. The problem isn't so much that collectors don't follow a

particular convention, instead, the problem is that they are almost never

specific enough. Interestingly, GPS has not solved this problem, it has

compounded it.  All of that aside, the basic problem in georeferencing

based on the locality description, which is all we have to go on in some

cases, is that there remains a gap in our knowledge about the extent of the

named place.

 

>   To me using such an error figure would

>severely compromise the value of the specimen record.  For example,

>San Pedro might be on the slope of the Andes, and 50 km in any

>direction might involve an altitudinal change of 3000 meters and

>passing through 3 or 4 major habitat types.

 

I wholeheartedly agree. The smaller the CoordinateUncertainty, the greater

the number of questions for which a locality can be useful.

 

>   I would have to ask what

>purpose is served by such a convention.  Either the record is

>believable (in which case, why calculate an error figure?), or, if

>not (i.e., with a huge error figure), is the point of such a

>convention merely to cast doubt on records?

 

It is a mistake to confuse the CoordinateUncertainty with "believability."

It is actually a measure of "specificity." The Georeferencing Guidelines

are intended to provide data that are all equally believable, and that are

all explicitly measured with respect to their specificity.

 

>This seems a not very constructive convention, and I wonder where it

>comes from.  It's certainly not anything I would have thought of if I

>were making up "extent rules."

 

It came from me, for the reasons described above. Remember, the rule is

flexible if you have additional information and you document it. Just in

case, I've offered another alternative above - to not georeference if the

extent cannot be determined, and to say so. Do you have another

alternative? My recommendation is that it be simple and universally applicable.

 

>No offense meant to those who constructed the MaNIS rules,

 

That was me. No offense taken. The rules themselves were built by being

continually challenged. It *has* to be that way, and I welcome it.

 

>but it

>might be worth a group of active museum herpetologists and

>ornithologists thinking about exactly what specimen locality records

>are used for

 

Though I don't doubt the value of the exercise, I think it is a mistake to

presume that you will come up with all of the possible, or even likely,

uses of the data. Why limit the scope a priori?

 

>  and how much value is added to that use by the

>complicated and time-consuming (up to 50% of time spent

>georeferencing, according to Gary Shugart who has done many MaNIS

>records) calculation of errors (which come largely from extents).

>I'm concerned that the time and money spent calculating errors in the

>relatively easily handled localities from the USA will in fact take

>away from the basic georeferencing of many specimens from other

>countries, and those are perhaps the specimens that need it most!

 

Some additional information may be of use here. Funding was based on

documenting coordinates and errors as set out in the MaNIS guidelines,

using the Georeferencing Calculator, and the following collaborative

paradigm. The georeferencing rates for the USA were different from the

rates for Canada and Mexico, and these in turn were different from the rate

for everywhere else. All of these rates were determined empirically, and

they were all conservative. In other words, the funds for georeferencing

were based on the known level of effort required (number of localities by

geographic category by institution), not for the effort based solely on the

rates for the USA. Since those rates were determined, additional means of

semi-automating the process have been developed, most notably by Gary

Shugart at PSM, and make the georeferencing rates even greater for most

geographic regions. I highly recommend that Gary's tools be fully developed

and documented so that everyone can take advantage of them. By doing so,

you will gain that extra time that can be spent to increase the specificity

of georeferences.

 

John W

 

>XXXXXX

> 

>>There are situations, especially using small-scale (large area) maps, where

>>you cannot tell how big a feature, such as a town, really is. The

>>convention in such cases, in the absence of better references (maps, remote

>>sensing data, etc.) is to use one-half the distance from the coordinates of

>>the named place (e.g., a town) to the nearest named place of the same type

>>(e.g., nearest town) as the extent. Automated georeferencing that is done

>>based on gazetteers that don't have extent information will also have to

>>rely on this convention for (liberally) estimating the extents. I believe

>>this helps answer the first question, below.

>> 

>>John Wieczorek

> 

>--
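The half-distance convention quoted above can be sketched as follows. This is a minimal sketch, not the Georeferencing Calculator: it assumes a spherical Earth (haversine distance) and gazetteer entries stored as dictionaries with "type", "lat", and "lon" keys; all names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def default_extent_m(place, gazetteer):
    """Half the distance from a place to the nearest other place of the same type."""
    distances = [haversine_m(place["lat"], place["lon"], other["lat"], other["lon"])
                 for other in gazetteer
                 if other is not place and other["type"] == place["type"]]
    if not distances:
        raise ValueError("no other place of the same type in the gazetteer")
    return min(distances) / 2


# Made-up gazetteer: two towns one degree of longitude apart on the equator,
# plus a nearer lake that is ignored because it is a different feature type.
town_a = {"type": "town", "lat": 0.0, "lon": 0.0}
town_b = {"type": "town", "lat": 0.0, "lon": 1.0}
lake = {"type": "lake", "lat": 0.0, "lon": 0.1}
print(round(default_extent_m(town_a, [town_a, town_b, lake])))
```

The result is deliberately liberal, so it is best treated as a fallback for when no map or other source gives the real extent.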

 

>>> Posting number 507, dated 10 Jun 2003 17:34:33

 

>>> Posting number 508, dated 11 Jun 2003 10:39:00

Date:         Wed, 11 Jun 2003 10:39:00 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Fwd: Re: [HERPNET] Distances by road

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

I'm cross-posting this response from XXXXXX.

 

>From:

>Subject:      Re: [HERPNET] Distances by road

>To: HERPNET@USOBI.ORG

> 

>Much thanks to John Wieczorek for the thoughtful and lengthy

>response.  I can see that the answer is in having good maps, and such

>maps should be available for most countries.  Then there won't be

>problems of the sort I envisioned.

> 

>I still have a residual question about the value of coordinate

>uncertainties and exactly how they will be used.  I know circles can

>be drawn around coordinates showing those CUs, but with enough dots

>(records) and enough circles of varying sizes around them, a plot of

>locality records will be cluttered to say the least, perhaps to the

>point of being undecipherable.  If I wanted to plot the range of a

>certain lizard from specimen records for a generic revision or a book

>on Costa Rican herps, my inclination would be to leave the circles

>off for the sake of the user.  In the past, after laboriously finding

>each locality on a map, we would have applied those dots, made a

>subjective judgment that we thought one or more of them were suspect

>or just plain false, and gone back and done further research on the

>specimen/locality in question.  We would then have either informed

>the reader of our suspicions (or even made the record a different

>symbol) or just struck the offending dot from the map.

> 

>To me, as an end user with great interest in the geographic range of

>species, the incalculable value in MaNIS and HerpNet and BirdNet (and

>OdoNet, in my dreams) will be our having been able to plot the

>coordinates for all of our thousands of specimens to make order from

>chaos, not the coordinate uncertainties, which strike me as more

>along the line of producing chaos from order.  The thought of all the

>time/money/energy spent calculating the uncertainties will continue

>to discomfit me, even though I have heard those calculations are

>among the factors that made the proposals attractive to the granting

>agency.

> 

>Happy georeferencing,

> 

>XXXXXX

>--

 

>>> Posting number 509, dated 11 Jun 2003 10:40:04

Date:         Wed, 11 Jun 2003 10:40:04 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Fwd: Re: [HERPNET] Distances by road

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

I'm cross-posting this one too.

 

>From: Stan Blum <sblum@CALACADEMY.ORG>

>Subject:      Re: [HERPNET] Distances by road

>To: HERPNET@USOBI.ORG

> 

>I would also like to thank John for the thorough description of the

>rationale behind his/our approach to geo-referencing.  Just so John doesn't

>feel so lonely, I'd like to add that a number of people who have been

>thinking hard about geo-referencing have settled on similar "best

>practices" -- some of them through discussion with John and his colleagues,

>and some of them independently.  This is not to say that there isn't room

>for further discussion.

> 

> 

>At 04:40 PM 6/10/2003 -0700, XXXXXXXXX wrote:

>>I still have a residual question about the value of coordinate

>>uncertainties and exactly how they will be used.

> 

>While I agree with John that we shouldn't limit ourselves to specific uses

>-- our goal is to create the most useful and accurate data for the long

>haul -- I will describe what I think the uses of coordinate uncertainty (or

>more broadly, locality uncertainty) are likely to be in the near

>future.  Over the last few years, we have seen the availability and

>precision of environmental layers (Digital Elevation Models, for example)

>grow and grow.  There is no reason to assume we have hit some sort of limit

>with environmental data.  With more and better data, we should be able to

>create better and better models of how environmental characteristics

>influence species' distributions.  The greater the uncertainty we have in

>our occurrence data, the more noise we are going to have in our

>models.  Having uncertainty expressed as a scalar (distance or area) will

>allow analysts to filter the data they feed into these models (e.g.,

>everything with an uncertainty less than 100 m).  We can't presuppose the

>dividing line between acceptable and unacceptable for any particular use,

>so we are recording uncertainty as a continuous measure and leaving it up

>to the future analyst.  So, in my view, data without any sort of associated

>uncertainty will be much less useful on the long run.

> 

>Another thing we can take heart from is that the cost of geo-referencing

>(even with an uncertainty measure) is marginal compared to the cost of

>collecting and preparing the specimens in the first place and then keeping

>them in collections for all these years.  And the geo-reference makes the

>data so much more useful.

> 

>These projects are the best thing that's come down the road for NH

>collections in a long, long time.

> 

>-Stan

 

>>> Posting number 510, dated 11 Jun 2003 16:00:57

 

>>> Posting number 511, dated 13 Jun 2003 12:49:40

 

>>> Posting number 512, dated 17 Jun 2003 08:58:07

 

>>> Posting number 513, dated 17 Jun 2003 13:46:43

 

>>> Posting number 514, dated 17 Jun 2003 15:16:59

 

>>> Posting number 515, dated 17 Jun 2003 19:07:26

 

>>> Posting number 516, dated 18 Jun 2003 16:35:10

 

>>> Posting number 517, dated 26 Jun 2003 15:18:48

 

>>> Posting number 518, dated 27 Jun 2003 17:27:27

 

>>> Posting number 519, dated 2 Jul 2003 10:31:05

 

>>> Posting number 520, dated 2 Jul 2003 12:19:01

 

>>> Posting number 521, dated 3 Jul 2003 14:05:09

 

>>> Posting number 522, dated 7 Jul 2003 10:28:39

 

>>> Posting number 523, dated 7 Jul 2003 13:30:13

 

>>> Posting number 524, dated 7 Jul 2003 18:29:30

 

>>> Posting number 525, dated 7 Jul 2003 18:44:46

 

>>> Posting number 526, dated 8 Jul 2003 08:23:58

 

>>> Posting number 527, dated 8 Jul 2003 07:54:05

 

>>> Posting number 528, dated 9 Jul 2003 12:02:37

 

>>> Posting number 529, dated 11 Jul 2003 10:24:59

 

>>> Posting number 530, dated 11 Jul 2003 12:56:25

 

>>> Posting number 531, dated 11 Jul 2003 13:57:30

 

>>> Posting number 532, dated 12 Jul 2003 12:07:54

Date:         Sat, 12 Jul 2003 12:07:54 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Fwd: georeferencing issue

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Dear XXX, and all,

 

The subject of data verification is an important one, so I'm including the

original message and my reply on the MaNIS list for the benefit of all.

I'll intersperse my comments with the original message.

 

 

>Date: Fri, 11 Jul 2003 16:09:42 -0700

>From:

>Subject: georeferencing issue

> 

>John-

> 

>This is probably one of those issues that is an inherent problem with

>georeferencing that there is no real solution for, but I wanted to know

>how host institutions should deal with these situations.

> 

>Here is the example in question:  today our other Curatorial Assistant,

>XXXXXX, was dealing with a marine mammal specimen and its locality and

>called me for verification (as a person who lived in Bolinas and

>georeferences for a living).  The speclocality read "Bolinas, Stinson

>Beach".  Two completely different towns.  I thought about it and decided

>as a georeferencer I would find the geographic center between the two

>and then assign an error that extends to the furthest extent of the two.

 

I have no problem with the method you would have chosen (given Bolinas and

Stinson Beach as two populated places) as long as the LatLongRemark

documented your choice.

Another way to georeference this locality would be to put the point in the

middle of the town of Stinson Beach and have the extent cover Bolinas. Or,

put the point in Bolinas and have the extent cover Stinson Beach.

Yet another reasonable alternative would have been to not georeference the

locality and say "internal inconsistency" in the NoGeorefBecause field

since there's no way to be in both towns at the same time.
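The first two options above differ only in where the point is placed and how large the radius must be to cover both places. The sketch below illustrates that arithmetic. It assumes each place is a center point with a known extent in meters, uses a spherical Earth, and averages coordinates for the midpoint, which is reasonable only for places close together. The coordinates and extents are made up and are not those of Bolinas or Stinson Beach.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation


def distance_m(a, b):
    """Haversine distance in meters between two (lat, lon) pairs in degrees."""
    p1, p2 = math.radians(a[0]), math.radians(b[0])
    dphi = p2 - p1
    dlmb = math.radians(b[1] - a[1])
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def midpoint_georeference(a, extent_a_m, b, extent_b_m):
    """Point halfway between the two places; radius reaches the far edge of both."""
    point = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    radius = distance_m(a, b) / 2 + max(extent_a_m, extent_b_m)
    return point, radius


def anchored_georeference(a, extent_a_m, b, extent_b_m):
    """Point at place a; radius covers a itself and the far edge of place b."""
    radius = max(extent_a_m, distance_m(a, b) + extent_b_m)
    return a, radius


# Made-up places: centers 0.02 degrees of longitude apart on the equator.
place_a, extent_a = (0.0, 0.0), 1000.0
place_b, extent_b = (0.0, 0.02), 500.0
print(midpoint_georeference(place_a, extent_a, place_b, extent_b))
print(anchored_georeference(place_a, extent_a, place_b, extent_b))
```

Whichever option is used, the choice belongs in the LatLongRemarks field so that a later reader knows how the radius was arrived at.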

 

There's a slight complication even beyond what you've mentioned, though,

which is that Stinson Beach is also a beach on the south side of the town

of Stinson Beach, with its own entry in the GNIS database. Given this extra

information, the "mid-point" that you would choose for your method of

georeferencing might be in a little different place than if you didn't

consider the beach.

For the second method mentioned above the point and extent would be

unaffected since the beach is closer to the town of Stinson Beach than

Bolinas is.

The third option mentioned above is also still (maybe even more so) a

reasonable choice.

 

>I then checked the MaNIS data to see how the georeferencer had handled

>this problem.  The info in the MaNIS file (CAS early delivery) was

>lat/long as Stinson Beach ppl with an error of .7 mi. I assume the

>georeferencer was unfamiliar with the area and assumed Stinson Beach was

>a more specific locality than Bolinas instead of two separate towns.

 

I agree that this is the assumption that must have been made - and it

wasn't in the LatLongRemarks.

 

>  In

>my georeferencing results the error would have been almost 2 miles vs .7

>mi.  Knowing the geography of the area I know that the MaNIS

>georeferenced data is not at all accurate to where the specimen was most

>likely collected, in fact the error does not even encompass the most

>probable true locality.

 

This brings up the most critical point of all, which is that our

georeferencing efforts are providing determinations (opinions) based on the

locality descriptions - not on the specimens. Without knowledge of the

specimens that are associated with the locality we are not able to make the

kind of judgement to which XXX is referring above. In other words, we

didn't know a priori that the specimens from that locality are marine mammals.

 

>Is this just an example of the host institution needing to verify the

>information pre (i.e. CAS should have noticed the conflict before

>submission) and post MaNIS?

 

I argue that it isn't efficient to pre-verify (or standardize) locality

data before georeferencing. It actually ends up taking more time overall

that way. My basic reasoning here is two-fold. First, in a large-scale

collaborative effort we would never have even begun georeferencing if we

waited for the pre-verification to take place. Second, the georeferencing

itself increases the efficiency with which we are able to isolate

problematic localities. We get to see a whole bunch of localities in a

context of other localities from the same general area and we get to see

patterns in recording techniques and formats. Eventually, we'll also be

able to group locality information by species with environmental

data.   What this means for us is that every georeference is unverified at

this stage. Think of them as the opinions of the georeferencers given the

information at hand.

 

The next logical phase is to validate the georeferences, which is my

responsibility. The first part of validation consists of checking that the

data provided by georeferencers are complete (e.g., NoGeorefBecause filled

out when there are no LatLongs, DeterminationRefs are provided, etc.). The

second part of validation is to make sure that the georeferences are

consistent with the higher geographic information (e.g., that records

putatively from Marin County actually lie within Marin County, or that the

LocalityAnnotation says that the county must have been wrong).

That is the limit of the validation that we can do without reference to the

rest of the specimen record, so it will be at this stage, when all

validation is finished for all localities for all institutions, that the

georeferenced locality information will be returned to the source

institutions to be re-associated with the specimens. Once that task is

accomplished the institutions will have the custodianship of the data and

the responsibility for verification in perpetuity. By verification I mean

that the specimens are checked to see that they were collected in the

locations described by the georeferences.  That's going to be an ongoing

task, for which it is my hope that we'll be able to provide valuable tools

to all participants in MaNIS. Funding for this purpose is being sought in

the context of the ORNIS project, which will be the Ornithological sister

of MaNIS, based on all of the same principles and technology. We envision

niche modelling tools to help isolate environmental outliers as well as

tools for following itineraries and mechanisms for users of the networks to

provide feedback to the source institutions for their verification. The

more different ways we have of looking at the data, the more data problems

will be exposed and fixed.
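
The completeness part of validation described above might look something like the following sketch. The field names NoGeorefBecause and DeterminationRefs come from the discussion; the Latitude and Longitude keys, the dictionary layout, and the function name are assumptions made only for this example.

```python
def completeness_problems(record):
    """Return a list of completeness problems for one georeferenced locality.

    `record` is a dict; "Latitude" and "Longitude" are hypothetical key names.
    """
    problems = []
    has_coords = (
        record.get("Latitude") is not None and record.get("Longitude") is not None
    )
    # A locality without coordinates must say why it was not georeferenced.
    if not has_coords and not (record.get("NoGeorefBecause") or "").strip():
        problems.append("no coordinates and NoGeorefBecause is empty")
    # A locality with coordinates must cite the references used to determine them.
    if has_coords and not (record.get("DeterminationRefs") or "").strip():
        problems.append("coordinates given but DeterminationRefs is empty")
    return problems


# Hypothetical records, for illustration only.
print(completeness_problems({"Latitude": None, "Longitude": None, "NoGeorefBecause": ""}))
print(completeness_problems({"Latitude": 37.9, "Longitude": -122.64, "DeterminationRefs": "GNIS"}))
```

A check like this says nothing about whether the coordinates are right; it only confirms that the georeferencer supplied everything needed for the later steps.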

 

>If so, how do you propose the host

>institutions proof the data now that it is finished?

 

I hope I've convinced you that we're not finished yet. In fact, those

pre-release data to CAS came specifically with the disclaimer that they

hadn't yet been validated by us, so don't put them in your database. You'll

get the whole batch again after validation.

 

>If Andrea hadn't

>been working with this specimen we probably wouldn't have noticed the

>error just by mapping the points from MaNIS.  I am curious as to your

>thoughts on this issue.

 

In addition to what I've said above, I propose that we track the

VerificationStatus of individual specimen or locality records, depending on

your database structure. Specifically, when the data come back from MaNIS,

they will have VerificationStatus = "unverified" and GeorefMethod = "MaNIS

Georeferencing Guidelines". At MVZ it is our intention to have other

possible values of verification status, such as "MVZ verified" which will

mean that staff of the MVZ checked the specimens against the locality and

found no inconsistency. The highest level of verification will be

"collector verified" which will mean that the collector actually looked at

a plot of the specimens based on the coordinates and errors and said "Yes,

all of those specimens came from within that circle and the circle is of

the correct size to describe the locality for all of them." It doesn't get

better than that. In order to engage the collectors, however, I think we'll

have to make some fun tools that we all can play with.  That's our goal anyway.
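
One possible way to represent the verification levels just described is sketched below. The three status strings and the GeorefMethod value are taken from this message; the numeric ranking, the dictionary layout, and the function names are assumptions for illustration, and each institution would adapt them to its own database structure.

```python
from enum import IntEnum


class VerificationStatus(IntEnum):
    """Assumed ranking: a higher value means a stronger verification."""
    UNVERIFIED = 0
    MVZ_VERIFIED = 1
    COLLECTOR_VERIFIED = 2


LABELS = {
    VerificationStatus.UNVERIFIED: "unverified",
    VerificationStatus.MVZ_VERIFIED: "MVZ verified",
    VerificationStatus.COLLECTOR_VERIFIED: "collector verified",
}
STATUS_BY_LABEL = {label: status for status, label in LABELS.items()}


def tag_returned_record(record):
    """Copy a record as it would come back from MaNIS, marked unverified."""
    tagged = dict(record)
    tagged["VerificationStatus"] = LABELS[VerificationStatus.UNVERIFIED]
    tagged["GeorefMethod"] = "MaNIS Georeferencing Guidelines"
    return tagged


def raise_status(record, new_status):
    """Upgrade the status, never silently downgrading an existing one."""
    current = STATUS_BY_LABEL[record["VerificationStatus"]]
    if new_status > current:
        record["VerificationStatus"] = LABELS[new_status]
    return record


# Hypothetical locality record, for illustration only.
rec = tag_returned_record({"SpecLocality": "Bolinas, Stinson Beach"})
raise_status(rec, VerificationStatus.MVZ_VERIFIED)
print(rec)
```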

 

Thanks for asking the tough and timely questions.

 

John

 

 

 

>>> Posting number 533, dated 14 Jul 2003 14:50:47

 

>>> Posting number 534, dated 16 Jul 2003 10:22:52

 

>>> Posting number 535, dated 16 Jul 2003 11:47:15

 

>>> Posting number 536, dated 17 Jul 2003 14:32:11

 

>>> Posting number 537, dated 18 Jul 2003 18:28:33

 

>>> Posting number 538, dated 23 Jul 2003 18:28:59

 

>>> Posting number 539, dated 23 Jul 2003 18:33:04

 

>>> Posting number 540, dated 24 Jul 2003 13:17:56

 

>>> Posting number 541, dated 25 Jul 2003 19:12:38

 

>>> Posting number 542, dated 25 Jul 2003 19:14:29

 

>>> Posting number 543, dated 25 Jul 2003 19:24:30

 

>>> Posting number 544, dated 29 Jul 2003 13:52:29

 

>>> Posting number 545, dated 31 Jul 2003 14:25:49

 

>>> Posting number 546, dated 31 Jul 2003 21:15:20

 

>>> Posting number 547, dated 1 Aug 2003 15:24:12

 

>>> Posting number 548, dated 1 Aug 2003 17:01:45

 

>>> Posting number 549, dated 2 Aug 2003 19:13:23

 

>>> Posting number 550, dated 4 Aug 2003 10:04:46

 

>>> Posting number 551, dated 4 Aug 2003 15:01:02

 

>>> Posting number 552, dated 4 Aug 2003 15:15:59

 

>>> Posting number 553, dated 5 Aug 2003 15:10:11

 

>>> Posting number 554, dated 6 Aug 2003 08:11:44

 

>>> Posting number 555, dated 6 Aug 2003 18:36:56

 

>>> Posting number 556, dated 11 Aug 2003 09:43:41

 

>>> Posting number 557, dated 15 Aug 2003 09:31:00

 

>>> Posting number 558, dated 15 Aug 2003 10:21:12

 

>>> Posting number 559, dated 19 Aug 2003 16:28:06

Date:         Tue, 19 Aug 2003 16:28:06 -0700

Reply-To:     Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

Sender:       Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>

From:         John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>

Subject:      Last claim: Russia

Mime-Version: 1.0

Content-Type: text/plain; charset="us-ascii"; format=flowed

 

Whoohoo. This is an historic day. The MVZ would like to claim the last of

the known universe for georeferencing.

 

Having come to this milestone, I would like to thank everyone for their

participation in this grand experiment. I don't mean to imply that we're

done yet, but at least the Checklist map is all filled in with either green

(denoting geographic regions for which georeferences have been completed)

or red (denoting regions in progress). This makes Robert Hijmans very

happy. I'm pretty happy too, but I'll be even happier when the whole map is

green, so here's my reminder to keep up the good work and get those

outstanding georeferences to me as soon as you can.

 

We've already begun the first phase of data validation and standardization

on the files that have been returned. Our hope is to be caught up

with this process as the last files come in so that we can do three final

important steps: 1) determine if there are localities that we missed, 2)

use GIS to do spatial validations on the georeferences against

administrative boundary layers, and 3) prepare the georeferences to be

returned to the source databases.
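
As a rough illustration of step 2, the sketch below tests whether a georeferenced point falls inside an administrative boundary using a standard ray-casting point-in-polygon test. It stands in for whatever GIS software is actually used; the function names, the record keys, and the square "county" are all hypothetical, and a real boundary layer would have many more vertices and would also need the error radius taken into account.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; `polygon` is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def outside_boundary(records, polygon):
    """Return the records whose point lies outside the boundary polygon."""
    return [
        r for r in records
        if not point_in_polygon(r["Longitude"], r["Latitude"], polygon)
    ]


# Hypothetical square boundary and records, for illustration only.
county = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
records = [
    {"LocalityID": 1, "Longitude": 2.0, "Latitude": 2.0},
    {"LocalityID": 2, "Longitude": 5.0, "Latitude": 2.0},
]
print(outside_boundary(records, county))
```

Records flagged this way are candidates for review, not automatic errors, since the county given in the original record may itself be wrong.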

 

In preparation for the remainder of this part of the MaNIS project, it

would be helpful if you could do two things at your earliest

convenience:

 

1) Look at the Georeferencing Checklist

(http://elib.cs.berkeley.edu/manis/Checklist.html) to see if my records of

outstanding claims are correct, and

 

2) Send me an estimate of when you expect to finish georeferencing the

regions for which there are claims outstanding.

 

Thanks to all,

 

John

 

>>> Posting number 560, dated 21 Aug 2003 12:04:31

 

>>> Posting number 561, dated 8 Sep 2003 10:32:10

 

>>> Posting number 562, dated 8 Sep 2003 18:53:07

 

>>> Posting number 563, dated 7 Oct 2003 19:47:47

 

>>> Posting number 564, dated 10 Dec 2003 13:03:43

 

>>> Posting number 565, dated 12 Dec 2003 09:43:00

 

>>> Posting number 566, dated 12 Dec 2003 13:53:59

 

>>> Posting number 567, dated 21 Jan 2004 11:19:04

 

>>> Posting number 568, dated 21 Jan 2004 13:00:12

 

>>> Posting number 569, dated 21 Jan 2004 15:16:04

 

>>> Posting number 570, dated 21 Jan 2004 15:18:13

 

>>> Posting number 571, dated 22 Jan 2004 12:46:18

 

>>> Posting number 572, dated 21 Jan 2004 15:11:33

 

>>> Posting number 573, dated 22 Jan 2004 14:21:24