MaNIS Georeferencing Discussion
Archive
Following are extracts of the Georeferencing Listserv discussions accumulated during the MaNIS georeferencing project. Missing postings were not relevant to georeferencing in perpetuity. Messages have been edited to protect the guilty by masking names of individuals with XXXXXX.
>>> Posting number 1, dated 17 Jul 1999 14:12:50
-----------------------------------------------------------------------------
>>> Posting number 2, dated 17 Jul 1999 14:15:23
-----------------------------------------------------------------------------
>>> Posting number 3, dated 17 Jul 1999 14:16:03
-----------------------------------------------------------------------------
>>> Posting number 4, dated 17 Jul 1999 14:19:25
-----------------------------------------------------------------------------
>>> Posting number 5, dated 17 Jul 1999 14:19:59
-----------------------------------------------------------------------------
>>> Posting number 6, dated 17 Jul 1999 14:26:41
-----------------------------------------------------------------------------
>>> Posting number 7, dated 17 Jul 1999 14:22:50
-----------------------------------------------------------------------------
>>> Posting number 8, dated 17 Jul 1999 14:23:12
-----------------------------------------------------------------------------
>>> Posting number 9, dated 19 Jul 1999 09:29:01
-----------------------------------------------------------------------------
>>> Posting number 10, dated 23 Jul 1999 16:35:41
>>> Posting number 11, dated 3 Sep 1999 16:17:55
>>> Posting number 12, dated 17 Sep 1999 15:19:38
>>> Posting number 13, dated 17 Sep 1999 13:13:14
>>> Posting number 14, dated 17 Sep 1999 14:57:30
>>> Posting number 15, dated 20 Sep 1999 09:04:17
>>> Posting number 16, dated 24 Sep 1999 17:01:21
>>> Posting number 17, dated 28 Sep 1999 12:50:27
>>> Posting number 18, dated 15 Oct 1999 19:37:37
>>> Posting number 19, dated 17 Oct 1999 16:37:27
>>> Posting number 20, dated 18 Oct 1999 16:50:30
>>> Posting number 21, dated 19 Oct 1999 11:15:26
>>> Posting number 22, dated 19 Oct 1999 16:35:19
>>> Posting number 23, dated 20 Oct 1999 15:51:18
>>> Posting number 24, dated 20 Oct 1999 11:34:55
>>> Posting number 25, dated 20 Oct 1999 16:00:18
>>> Posting number 26, dated 10 Nov 1999 10:52:01
>>> Posting number 27, dated 10 Nov 1999 13:54:04
>>> Posting number 28, dated 17 Nov 1999 15:12:19
>>> Posting number 29, dated 18 Nov 1999 12:38:15
>>> Posting number 30, dated 18 Nov 1999 10:08:56
>>> Posting number 31, dated 18 Nov 1999 13:22:25
>>> Posting number 32, dated 19 Nov 1999 14:35:52
>>> Posting number 33, dated 3 Dec 1999 10:21:24
>>> Posting number 34, dated 3 Jan 2000 11:48:10
>>> Posting number 35, dated 3 Jan 2000 16:24:25
>>> Posting number 36, dated 18 May 2000 16:51:23
>>> Posting number 37, dated 18 May 2000 19:49:29
>>> Posting number 38, dated 23 May 2000 18:41:45
>>> Posting number 39, dated 24 May 2000 09:38:19
-----------------------------------------------------------------------------
>>> Posting number 40, dated 24 May 2000 12:15:39
>>> Posting number 41, dated 12 Jun 2000 15:45:50
>>> Posting number 42, dated 13 Jun 2000 09:31:26
>>> Posting number 43, dated 13 Jun 2000 09:59:02
>>> Posting number 44, dated 13 Jun 2000 09:17:08
>>> Posting number 45, dated 13 Jun 2000 07:49:43
>>> Posting number 46, dated 13 Jun 2000 09:04:22
>>> Posting number 47, dated 13 Jun 2000 08:54:22
>>> Posting number 48, dated 13 Jun 2000 11:11:31
>>> Posting number 49, dated 13 Jun 2000 13:23:46
>>> Posting number 50, dated 30 Jun 2000 16:25:38
>>> Posting number 51, dated 30 Jun 2000 17:14:31
>>> Posting number 52, dated 30 Jun 2000 23:29:35
>>> Posting number 53, dated 1 Jul 2000 07:35:15
>>> Posting number 54, dated 4 Jul 2000 11:04:23
>>> Posting number 55, dated 4 Jul 2000 10:07:33
>>> Posting number 56, dated 6 Jul 2000 00:00:0/
>>> Posting number 57, dated 5 Jul 2000 19:40:11
>>> Posting number 58, dated 5 Aug 2000 09:24:55
>>> Posting number 59, dated 5 Aug 2000 12:31:07
>>> Posting number 60, dated 7 Aug 2000 13:45:33
>>> Posting number 61, dated 15 Aug 2000 21:54:23
>>> Posting number 62, dated 23 Aug 2000 16:24:48
>>> Posting number 63, dated 30 Aug 2000 11:20:17
>>> Posting number 64, dated 22 Sep 2000 09:36:34
>>> Posting number 65, dated 29 Sep 2000 08:51:23
>>> Posting number 66, dated 2 Oct 2000 10:35:12
>>> Posting number 67, dated 5 Oct 2000 09:40:24
>>> Posting number 68, dated 17 Oct 2000 18:13:33
>>> Posting number 69, dated 1 Nov 2000 07:48:24
>>> Posting number 70, dated 1 Nov 2000 08:06:24
>>> Posting number 71, dated 28 Nov 2000 18:26:18
>>> Posting number 72, dated 29 Nov 2000 21:09:35
>>> Posting number 73, dated 30 Nov 2000 08:31:10
>>> Posting number 74, dated 30 Nov 2000 11:33:07
>>> Posting number 75, dated 14 Dec 2000 20:41:28
>>> Posting number 76, dated 15 Dec 2000 07:59:04
>>> Posting number 77, dated 26 Apr 2001 09:00:01
>>> Posting number 78, dated 16 May 2001 18:29:45
>>> Posting number 79, dated 16 May 2001 17:36:59
>>> Posting number 80, dated 18 May 2001 08:29:49
>>> Posting number 81, dated 24 May 2001 10:19:20
>>> Posting number 82, dated 25 May 2001 09:43:37
>>> Posting number 83, dated 11 Jun 2001 12:01:03
>>> Posting number 84, dated 11 Jun 2001 15:02:51
>>> Posting number 85, dated 11 Jun 2001 15:44:56
>>> Posting number 86, dated 29 Jun 2001 21:12:37
>>> Posting number 87, dated 4 Jul 2001 14:24:24
Date: Wed, 4 Jul 2001 14:24:24 -0700
Reply-To: "Mammalogy Z39.50 Network (Private)" <MAMMAL-Z-NET@USOBI.ORG>
Sender: "Mammalogy Z39.50 Network (Private)" <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: ROM higher geography
In-Reply-To: <sb433743.076@romfs7.rom.on.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
I'm posting the following exchange to the list because there is information
contained herein that is relevant to everyone. The basic concepts of data
cleanliness, the gazetteer, and data updates are addressed in brief.
>Once I began working on the Bukedi inconsistency (2nd in your list) I saw
>that your methodology is missing many more errors/inconsistencies that
>exist in County and Province data.
Understood. My analysis reveals only the duplicates of
ORCT+ORCRY+ORPR+ORCY
I understand that there may be many other errors and inconsistencies in the
original data, but that is not a concern for the gazetteer. In fact, the
duplicates I pointed out aren't a problem either. I just wanted to alert
you to them since they came out in my analysis.
> The errors and inconsistencies are a direct reflection of the state of
> documentation on field catalogues or specimen cards, depending on the
> source of the automated record. We did not have the resources at the
> time of automation (nor do we now for that matter) to resolve what is a
> "Province" term and what is a "County" term for all
> countries. Additionally, we are looking at historical data that may no
> longer be reflected in the current political reality of our little world
> (e.g.,
> are used routinely to manage the collection and retrieve data. Continent
> and Country should be clean. The Province field should be clean for
>
> just finished cleaning up the Province field for
> County field should be clean for
> frequency listings for Country etc. for these priority sections of the db
> (and collection) in an effort to maintain the consistency of our
> data. For all other geographic locations, Province and County are not
> used for managing the collection, so the data clean up or enhancement has
> been a low priority. This is an ongoing situation that I have discussed
> with Judith with regard to the MaNIS Project. My understanding is that
> funding for documentational and staffing resources will be part of this
> "mission". I am afraid your listing of 13 inconsistencies barely
> scratches the surface of the data cleaning that is required and even more
> importantly, misses all kinds of erroneous or missing data. I currently
> do not have the maps, atlases, or gazetteers nor the staff/time to
> undertake this project which from a collections' perspective is of low
> priority. To do a proper job I cannot resolve all of the problems that
> you have identified without undertaking a full review of the entire
> country's data.
There is no requirement for any standard of cleanliness. It is my hope that
errors and inconsistencies will be noted during georeferencing and
forwarded to the attention of the institutions as a part of that
process. The tools are meant to identify the inconsistencies, not to
remedy them. What the institutions do with these notes is entirely up to them.
>I am not sure what you are currently attempting to do with the data so we
>may need to further discuss our respective needs to ensure that we are not
>working at cross purposes. If work is to be globally undertaken, I would
>like our data to be the db of record - making long lists of changes for
>you to then repeat is a waste of effort and time; you will see the work
>generated by having two dbs of record by the simple changes that I have
>made this afternoon. Also, errors in interpretation or typos that are
>bound to occur should be avoided. Finally, the data you have is already
>out of date, since changes are made by me on a daily basis as errors etc.
>are encountered during the normal activities of managing the collection,
>fulfilling data requests, etc.
The institutional databases will always be the database of record. The
data I have from all of the institutions is just a snapshot, to be used for
georeferencing. I will not ask for these data again during the project, nor
will I make changes to the data I have received. When we have a network,
the gazetteer will be created and updated automatically whenever data
change and the snapshot will be obsolete. I've only created the snapshot
so that we have combined data to work with. When people begin to do
georeferencing using the gazetteer they will not change the data - they
will only make commentaries. Even the latitude and longitude are
commentaries in a sense. It is up to each institution to accept or reject
the commentaries and make changes based on them in its database.
>Regards,
>
> >>> John Wieczorek <tuco@socrates.Berkeley.EDU> 07/02/01 08:50PM >>>
>Attached is a tab-delimited file with the first row containing column
>headings. The contents of the file are combinations of higher geographic
>fields for which you have more than one interpretation in your
>database. The first field (highergeog) is a concatenation of the fields of
>higher geography that reveal duplication. The second field (geogid) is an
>identifier unique to the ROM higher geography data with one row for every
>unique combination of ORCT, ORCRY, ORPR, and ORCY. As you can see by the
>rows in the table, there are 13 places for which there are inconsistent
>placements of county vs. province, for example. It is not critical for my
>purposes to have these resolved, but since I noticed them I thought I might
>as well tell you. If you do make changes to these combinations, let me
>know which are correct and I'll do so on this end as well.
>>> Posting number 88, dated 10 Jul 2001 12:01:24
Date: Tue, 10 Jul 2001 12:01:24 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: cave localities
Mime-version: 1.0
Content-type: text/plain; charset="US-ASCII"
Content-transfer-encoding: 7bit
I've noticed that the USGS GNIS web site does not give information on cave
sites. (It does give locations of variants such as Boulder Cave
Campground.) Is this a protocol we wish to follow? Are there other web
sites that do list cave localities? What do you think?
Cheers,
>>> Posting number 89, dated 10 Jul 2001 13:40:25
Date: Tue, 10 Jul 2001 13:40:25 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Filtering data
In-Reply-To: <sb4b0d4a.070@romfs7.rom.on.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
This message is in reply to a comment about
records for captive animals.
>I would recommend that you do not use any captive records for a
>gazetteer. Does that make sense?
In a restricted view of the utility of a gazetteer it does make sense to
exclude them. However, it is actually easier to include them, yet have them
flagged. This has the benefit that one can filter on the captive attribute.
This could be useful if you wanted to do a quick query of only captive
animals as well as for a query in which you want to leave them out. The
philosophy in general will be to have a home for all data that anyone deems
useful, yet to allow each institution to decide which data it will provide
through the filters implemented during migration.
A filter might do any one of the following:
1) exclude attributes altogether (e.g., not show a "CaptiveFlag" field)
2) exclude records based on the value of an attribute (e.g., not show
records of endangered species)
3) exclude certain values of an attribute (e.g., not show localities for
endangered species)
4) substitute a surrogate value for an attribute of a certain value (e.g.,
instead of showing locality with lat-long, show only county-level and
higher geography for endangered species)
These are just a few examples of what might be done at one institution, and
may vary between institutions. I encourage the participants to discuss
these issues, and to begin to make institutional decisions about filtering
rules when it comes time to set up the migration. The rules must be
clearly defined before I begin to create the creation scripts - I can't
afford to stay at any given institution (except maybe Hawaii, heh heh),
while the rules are being hashed out.
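
The four kinds of filter listed above could be sketched as small composable
functions. This is only an illustration under stated assumptions: the flat
record layout, the field names (CaptiveFlag, Endangered, Locality, County),
and every helper name below are hypothetical and are not taken from any
actual migration script.

```python
from typing import Callable, List, Optional

Record = dict  # hypothetical flat record: field name -> value
Filter = Callable[[Record], Optional[Record]]


def drop_attribute(name: str) -> Filter:
    """1) Exclude an attribute altogether."""
    def apply(rec: Record) -> Optional[Record]:
        return {k: v for k, v in rec.items() if k != name}
    return apply


def drop_records_where(name: str, value) -> Filter:
    """2) Exclude whole records whose attribute has the given value."""
    def apply(rec: Record) -> Optional[Record]:
        return None if rec.get(name) == value else rec
    return apply


def blank_where(target: str, name: str, value) -> Filter:
    """3) Hide the target attribute's value on matching records."""
    def apply(rec: Record) -> Optional[Record]:
        return {**rec, target: None} if rec.get(name) == value else rec
    return apply


def substitute_where(target: str, surrogate: str, name: str, value) -> Filter:
    """4) Replace the target value with a coarser surrogate field's value."""
    def apply(rec: Record) -> Optional[Record]:
        if rec.get(name) == value:
            return {**rec, target: rec.get(surrogate)}
        return rec
    return apply


def apply_filters(rec: Record, filters: List[Filter]) -> Optional[Record]:
    """Run filters in order; None means the record is withheld entirely."""
    for flt in filters:
        rec = flt(rec)
        if rec is None:
            return None
    return rec
```

An institution would then express its rules as an ordered list, for example
[drop_attribute("CaptiveFlag"), substitute_where("Locality", "County",
"Endangered", True)], and pass each record through apply_filters.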
>>> Posting number 90, dated 8 Aug 2001 13:10:05
>>> Posting number 91, dated 14 Sep 2001 08:48:17
>>> Posting number 92, dated 23 Sep 2001 17:24:24
>>> Posting number 93, dated 24 Sep 2001 20:07:31
Date: Mon, 24 Sep 2001 20:07:31 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing Guidelines
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
Now that we are officially up and running I would like to provide the first
of two documents on the MaNIS collaborative georeferencing effort. This
first document is meant to open for discussion the issues associated with
turning specific locality descriptions into well-documented latitudes and
longitudes. The document does not explain what tools to use, or how to use
any of them - that will be in a forthcoming document. Instead, this
document focuses on the "theoretical aspects" of the task, our methods and
assumptions, upon which it would be helpful for us all to agree. To that
end, please read the Georeferencing Guidelines page, accessible from the
Documents page on the MaNIS website (see below). Comment by sending
messages to MAMMAL-Z-NET@USOBI.ORG. Let's try to get through this
discussion by 6 Oct.
http://dlp.cs.berkeley.edu/manis/Documents.html
Anticipating your enthusiastic participation,
John Wieczorek
>>> Posting number 94, dated 25 Sep 2001 18:30:16
Date: Tue, 25 Sep 2001 18:30:16 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing text, for reference
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
It was pointed out to me that it might be prudent to have a text-only copy
of the document, with line numbers, to which everyone can refer in
discussions. I am including the full text of the GeorefGuide.html file
below for that purpose. The page itself can be found at the following URL:
http://dlp.cs.berkeley.edu/manis/GeorefGuide.html
1 MaNIS
2 The Mammal Networked Information System
3
4 John Wieczorek
5 24 September 2001
6 _________________________________________________
7
8 Georeferencing Guidelines
9
10 This document contains information about assigning geographic
11 coordinates and maximum errors for those coordinates to specific
12 locality descriptions. This document does not attempt to
13 describe the tools and methods for finding named places on maps
14 or gazetteers. The process of assigning coordinates and errors,
15 called georeferencing, can be rather complicated. The complexity
16 of the process can be greatly reduced and the consistency of the
17 results can be greatly increased by establishing simple
18 guidelines that cover most commonly encountered locality
19 descriptions. The guidelines for assigning coordinates for named
20 places are presented with examples in the section Determining
21 Latitude & Longitude.
22
23 There are several fundamental sources of error for specific
24 locality descriptions, and these vary in magnitude. It is
25 essential during georeferencing to determine and record the
26 greatest source of error among all possible sources. There are
27 numerous ways in which the maximum error of a geographic
28 coordinate might be expressed, but the most convenient is as a
29 distance, because its size and shape are constant over any
30 geodetic surface model. The sources of error and their
31 magnitudes are discussed primarily in the section Determining
32 Error.
33
34 An Appendix containing a description of the data that should be
35 captured for each georeferenced locality, a glossary, and
36 references are appended for the convenience of the reader.
37
38 Determining Latitude & Longitude
39
40 Geographic coordinates can be expressed in a number of different
41 coordinate systems (e.g. decimal degrees, degrees minutes
42 seconds, degrees decimal minutes, UTM, etc.). Conversions can be
43 made readily between coordinate systems, but decimal degrees
44 provide the most convenient coordinates to use for
45 georeferencing for no more profound a reason than that a
46 specific locality can be described with only two attributes:
47 decimal latitude and decimal longitude.
48
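The conversion from degrees minutes seconds to decimal degrees mentioned
above can be sketched as follows. The function name and argument order are
illustrative assumptions; the sign convention (south and west negative) is
the one given for Decimal_Latitude and Decimal_Longitude in the Appendix.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds plus N/S/E/W to signed decimal degrees."""
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative by convention.
    return -value if hemisphere in ("S", "W") else value


# The two Appendix examples: 42d 30' 36" S and 122d 29' 24" W.
print(round(dms_to_decimal(42, 30, 36, "S"), 6))   # -42.51
print(round(dms_to_decimal(122, 29, 24, "W"), 6))  # -122.49
```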
49 Named Places
50
51 The simplest of specific locality descriptions consist of only a
52 named place. Use the geographic center of a named place for the
53 latitude and longitude, and use the distance from that point to
54 the furthest point within that named place for the maximum error
55 distance. If the geographic center of the named place is not
56 within the confines of the shape of the named place, use the
57 point nearest to the geographic center that lies within the
58 shape.
59
60 Example: "Bakersfield"
61
62 Township Range Section (TRS) descriptions are essentially no
63 different from that of any other named place. It is necessary to
64 understand how TRS descriptions work and how they describe a
65 place. See the References section, below, for links to TRS
66 information.
67
68 Example: "E of Bakersfield, T29S R29E Sec. 34 NE 1/4"
69
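The named-place rule above (geographic center, plus the distance to the
furthest point of the place as the maximum error) might be computed along
these lines. This is a rough sketch under several assumptions that are not
part of the guidelines: the place outline is available as a list of
(latitude, longitude) vertices, the earth is treated as a sphere of mean
radius 6371 km, and the outline does not cross the 180 degree meridian. The
case where the center falls outside the shape is not handled.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geographic_center(vertices):
    """Mean of the latitude extremes and mean of the longitude extremes."""
    lats = [lat for lat, _ in vertices]
    lons = [lon for _, lon in vertices]
    return (min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2


def named_place_georeference(vertices):
    """Return (latitude, longitude, maximum error in km) for an outline."""
    lat, lon = geographic_center(vertices)
    # The furthest point of the outline is approximated by its furthest vertex.
    max_error = max(haversine_km(lat, lon, vlat, vlon) for vlat, vlon in vertices)
    return lat, lon, max_error
```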
70 Offsets
71
72 Offsets generally consist of combinations of distances and
73 directions from a named place. Use the geographic center of the
74 named place in the direction of the offset as a starting point.
75 Unless there is contrary information in the locality
76 description, measure the distance in the offset direction to
77 find the spot for the geographic coordinates. Offsets that do
78 not explicitly say that they were measured by air or by some
79 contour (e.g., by road, river, valley, etc.) should be
80 determined as if by air in a straight line.
81
82 Example: "10 mi E (by air) Bakersfield"
83
84 Example: "10 mi E of Bakersfield"
85
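A straight-line ("by air") offset such as "10 mi E of Bakersfield" could be
turned into coordinates as sketched below. The spherical-earth destination
formula, the mean radius of 3958.8 mi, and the function and table names are
assumptions made for illustration; any starting coordinates supplied for a
named place would have to come from a map or gazetteer.

```python
import math

EARTH_RADIUS_MI = 3958.8  # assumed mean radius of a spherical earth, in miles

# Sixteen compass points mapped to headings in degrees clockwise from north.
POINTS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
          "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]
BEARINGS = {name: i * 22.5 for i, name in enumerate(POINTS)}


def offset_point(lat, lon, distance_mi, direction):
    """Return (lat, lon) reached by travelling distance_mi along a compass point."""
    phi1 = math.radians(lat)
    lmb1 = math.radians(lon)
    theta = math.radians(BEARINGS[direction.upper()])
    delta = distance_mi / EARTH_RADIUS_MI  # angular distance in radians
    phi2 = math.asin(math.sin(phi1) * math.cos(delta)
                     + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    lmb2 = lmb1 + math.atan2(math.sin(theta) * math.sin(delta) * math.cos(phi1),
                             math.cos(delta) - math.sin(phi1) * math.sin(phi2))
    # Normalize longitude to the range -180 to 180.
    return math.degrees(phi2), (math.degrees(lmb2) + 540) % 360 - 180
```

For example, offset_point(35.37, -119.02, 10, "E") gives the point 10 mi due
east of an assumed, approximate center for Bakersfield.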
86 However, if there is no mention of the mode of measurement in
87 the locality description, but the measurement includes fractions
88 (e.g., 10.2 miles) and there is a road in the vicinity, use road
89 miles. Offsets that were described in the specific locality as
90 being measured by road should be determined using the contours
91 of the road rather than using a straight line. The methods for
92 determining the maximum error distances for these types of
93 specific locality descriptions are given in the Determining
94 Error section, below.
95
96 Example: "10.2 mi E of Bakersfield"
97
98 Example: "13 mi E (by road) Bakersfield"
99
100 Vagueness
101
102 At times, specific locality descriptions are fraught with
103 vagueness. It is not the purpose here to belittle localities of
104 this type; in fact, an honest admission of the unknown is
105 preferable to masking it with unwarranted precision.
106
107 The most important type of vagueness in a specific locality
108 description is one in which the locality is in question. No such
109 locality should be georeferenced.
110
111 Example: "Bakersfield?"
112
113 Many locality descriptions imply an offset from a named place
114 without definitive directions or distances. Use the geographic
115 center of the named place for the geographic coordinates. For
116 the maximum error distance, use the greatest distance that is
117 not likely to be considered in the area of another named place.
118 Clearly there is a measure of subjectivity involved here. Let
119 common sense prevail and document the assumptions made.
120
121 Example: "near Bakersfield"
122
123 Sometimes offset information is vague either in its direction or
124 in its distance. If the direction information is vague, record
125 the geographic coordinates of the center of the named place and
126 add the offset distance to the greatest extent of the named
127 place to get the maximum error distance.
128
129 Example: "5 mi from Bakersfield"
130
131 Uncertainty in the offset distance is a fact of the business.
132 Almost no localities are recorded with error estimates,
133 therefore every offset distance is inherently uncertain. The
134 addition of a modifier in the locality description, while an
135 honest observation, should not change the determination of the
136 geographic coordinates or of the maximum error.
137
138 Example: "about 3 mi E of Bakersfield"
139
140 The worst of situations arises when a specific locality
141 description is internally inconsistent. There are numerous
142 possible causes for inconsistencies. It is the task of those
143 georeferencing to determine the part of the description most
144 likely to be in error, ignore it for the purpose of the
145 determination, and document the decision to do so. The most
146 common source of inconsistency in a locality description comes
147 from trying to match elevation information with the rest of the
148 description. If there is no reasonable way to reconcile the
149 discrepancy, ignore the elevation.
150
151 Example: "10 mi W of Bakersfield, 6000 ft"
152
153 Determining Error
154
155 The process of georeferencing includes an assessment of the
156 possible sources of error in a geographic coordinate
157 determination. Errors may arise due to the extent of a locality,
158 due to unspecified precision in original measurements (distance
159 precision and directional precision), or due to not knowing the
160 datum under which coordinates were determined. It is essential
161 to determine which of these yields the greatest error and record
162 that value as the maximum error distance. Potential error
163 sources and guidelines for determining the magnitude of each for
164 a given specific locality are given in the paragraphs below.
165
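Choosing the greatest of the candidate errors can be sketched as a simple
maximum over the sources, provided all of them are first expressed in the
same unit. The function name and the magnitudes used in the example are
hypothetical.

```python
def maximum_error(errors_by_source):
    """Return (source, error) for the largest error; values must share one unit."""
    if not errors_by_source:
        raise ValueError("at least one error source is required")
    source = max(errors_by_source, key=errors_by_source.get)
    return source, errors_by_source[source]


# Hypothetical magnitudes, all in miles, for a single locality.
example = {
    "extent": 3.0,
    "datum": 0.05,
    "distance precision": 1.0,
    "direction precision": 4.142,
}
print(maximum_error(example))  # ('direction precision', 4.142)
```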
166 Error due to the shape of a locality
167
168 Named places are not single points; they have extents. If a
169 locality description is no more specific than to describe a
170 named place or an offset from a named place, then the size of
171 the named place is a source of error. The treatment of error due
172 to the extent of a locality is described under the examples of
173 determining latitude and longitude, above.
174
175 Error due to an unknown datum
176
177 Seldom have geographic coordinates been recorded for a locality
178 in a natural history collection in which the underlying datum of
179 the coordinate system was given. Even now, when GPS coordinates
180 are being taken as definitive evidence of a location, the
181 geodetic datum is being ignored. Without recording the datum
182 with the coordinates, potential accuracy is being lost. Figure 1
183 shows the magnitude of error (in meters) over North America
184 based on not knowing the datum from which the coordinates were
185 taken.
186
187 [datumerror.jpg]
188
189 Figure 1. Map of North America showing the magnitude of
190 potential error from not knowing whether coordinates were taken
191 from NAD27, NAD83, or WGS84.
192
193 This map can be used as a rough guide for determining the
194 magnitude of error due to not knowing the datum from which the
195 geographic coordinates were recorded.
196
197 Precision
198
199 Precision is difficult to gauge from specific locality
200 descriptions; it may be reflected in the locality description,
201 but it is seldom, if ever, explicitly recorded. Furthermore, a
202 database record may not reflect, or may reflect incorrectly, the
203 precision inherent in the original measurement, especially if
204 the locality description has undergone interpretation from the
205 verbatim original description. Precision issues arise from both
206 distance measurements and directions in a locality description.
207 Potential errors from each of these sources are discussed in the
208 paragraphs below.
209
210 Error associated with distance precision
211
212 Distance may be recorded in a specific locality description with
213 or without significant digits, and those digits may or may not
214 be warranted. A conservative way to ensure that distance
215 precision is not inflated is to treat distance measurements as
216 integers with fractional remainders. Thus 10.25 becomes 10 1/4,
217 10.5 becomes 10 1/2, etc. Calculate the error for these distances
218 based on the fractional part of the distance, using 1 divided by
219 the denominator of the fraction.
220
221 Example: "10.5 mi N of Bakersfield" Fraction is 1/2, error should
222 be 0.5 mi.
223
224 Example: "10.6 mi N of Bakersfield" Fraction is 1/10, error
225 should be 0.1 mi.
226
227 Example: "10.75 mi N of Bakersfield" Fraction is 3/4, error should
228 be 0.25 mi.
229
230 If the distance is an integer, use an error of one unit.
231
232 Example: "10 mi N of Bakersfield" Error should be 1 mi.
233
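One way to reproduce the four examples above is sketched below. The list of
candidate denominators (2, 4, 10, 100) is an assumption chosen so that 10.6
is read as tenths, as in the example; the guidelines do not enumerate the
fractions to consider. The function name and the fallback to the last
written decimal place are likewise illustrative.

```python
from decimal import Decimal

# Assumed candidate denominators, tried from coarsest to finest.
DENOMINATORS = (2, 4, 10, 100)


def distance_precision_error(distance_text):
    """Return the error, in the distance's own units, for a distance as written."""
    value = Decimal(distance_text)
    fraction = value % 1
    if fraction == 0:
        return Decimal(1)  # integer distance: error of one unit
    for denominator in DENOMINATORS:
        # Use the coarsest fraction that represents the remainder exactly.
        if (fraction * denominator) % 1 == 0:
            return Decimal(1) / Decimal(denominator)
    # Otherwise fall back to the place value of the last digit written.
    return Decimal(1).scaleb(value.as_tuple().exponent)


for text in ("10.5", "10.6", "10.75", "10"):
    print(text, distance_precision_error(text))
```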
234 Error associated with directional precision
235
236 Direction is almost always expressed in specific locality
237 descriptions using cardinal and intercardinal directions rather
238 than degree headings. A conservative interpretation of these
239 directions allows for an error of 22.5 degrees to either side of
240 the recorded direction. Thus, ENE can be any direction between E
241 and NE, while NE can be any direction between ENE and NNE.
242
243 [directionerror.jpg]
244
245 The error distance resulting from imprecision in direction
246 increases with increasing offset distance. In fact the error
247 distance due to directional imprecision is 0.4142 times the
248 offset. Note, however, that when a locality description uses two
249 offsets based on cardinal directions (e.g., 1 mi N and 3 mi E of
250 Bakersfield), the distances and directions are likely to have
251 been measured on a map. In this case, directional imprecision
252 should be ignored.
253
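The factor 0.4142 above matches the tangent of 22.5 degrees, which is what
one gets by taking the offset as one leg of a right triangle and the
sideways displacement at the edge of the 22.5 degree allowance as the
other. A minimal sketch under that reading, with an illustrative function
name:

```python
import math


def direction_error(offset_distance, half_angle_deg=22.5):
    """Error distance from directional imprecision, in the offset's own units."""
    return offset_distance * math.tan(math.radians(half_angle_deg))


print(round(direction_error(1), 4))   # 0.4142
print(round(direction_error(10), 3))  # 4.142, e.g. for "10 mi E of Bakersfield"
```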
254 Appendix
255
256 Geographic Coordinate Data
257
258 Following are the essential attributes to be captured for each
259 locality while georeferencing.
260
261 Decimal_Latitude - the latitude coordinate (in decimal degrees) at
262 the center of a circle encompassing the whole of a specific
263 locality. Convention holds that decimal latitudes north of the
264 equator are positive numbers less than or equal to 90, while
265 those south are negative numbers greater than or equal to -90.
266 Example: -42.51 degrees (which is the same as 42d 30' 36" S).
267
268 Decimal_Longitude - the longitude coordinate (in decimal degrees)
269 at the center of a circle encompassing the whole of a specific
270 locality. Decimal longitudes west of the Greenwich Meridian are
271 considered negative and must be greater than or equal to -180,
272 while eastern longitudes are positive and less than or equal to
273 180. Example: -122.49 degrees (which is the same as 122d 29' 24"
274 W).
275
276 Maximum_Error_Distance - the upper limit of the distance from the
277 given latitude and longitude within which the described locality
278 must lie.
279
280 Maximum_Error_Units - the units of length in which the maximum
281 error is recorded (e.g., mi, km, m, and ft). Express maximum
282 error distance in the same units as the distance measurement in
283 the specific locality description.
284
285 Datum - the geometric description of a geodetic surface model
286 (e.g., NAD27, NAD83, WGS84). Datums are often recorded on maps
287 and in gazetteers, and can be specifically set for most GPS
288 devices. Use "not recorded" when the datum is not known.
289
290 Original_Coord_System - the coordinate system in which the raw
291 data are being entered. For the purpose of collaborative
292 georeferencing this value will be "decimal degrees." However,
293 existing geographic coordinates may be entered in degrees
294 minutes seconds, degrees decimal minutes, or UTM coordinates.
295
296 Reference - the reference source (e.g., map, gazetteer, or
297 software) used to determine the coordinates. Such information
298 should provide enough detail so that anyone can locate the
299 actual reference that was used (e.g., name, edition or version,
300 year). Lat_Long_Determined_By - the person or organization by
301 which the determination was made.
302
303 Lat_Long_Determined_Date - the date on which the determination was
304 made.
305
306 Remarks - comments on methods and assumptions used in determining
307 coordinates or errors when those methods or assumptions differ
308 from or expand upon the accepted guidelines.
309
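The attributes listed above could be carried in a record structure such as
the following. The class name, the Python field names, and the choice to
reject out-of-range coordinates at construction time are assumptions for
illustration; only the attribute meanings and the default values "not
recorded" and "decimal degrees" come from the descriptions above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Georeference:
    decimal_latitude: float
    decimal_longitude: float
    maximum_error_distance: float
    maximum_error_units: str          # e.g., "mi", "km", "m", "ft"
    reference: str                    # map, gazetteer, or software used
    lat_long_determined_by: str
    lat_long_determined_date: date
    datum: str = "not recorded"
    original_coord_system: str = "decimal degrees"
    remarks: str = ""

    def __post_init__(self):
        # Enforce the coordinate ranges described for the two decimal fields.
        if not -90 <= self.decimal_latitude <= 90:
            raise ValueError("decimal_latitude must be between -90 and 90")
        if not -180 <= self.decimal_longitude <= 180:
            raise ValueError("decimal_longitude must be between -180 and 180")
        if self.maximum_error_distance < 0:
            raise ValueError("maximum_error_distance must not be negative")
```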
310 Glossary
311
312 Datum - A geodetic datum describes the size, shape, origin, and
313 orientation of a coordinate system for mapping the surface of
314 the earth.
315
316 Decimal degrees - degrees expressed as a single real number (e.g.,
317 -22.343456) rather than as a composite of degrees, minutes,
318 seconds, and direction (e.g., 7d 54' 18.32" E).
319
320 Geodetic surface model - a geometric description of the surface of
321 the earth.
322
323 Geographic coordinates - latitude and longitude, measured in any
324 of various coordinate systems.
325
326 Geographic center - To find the geographic center of a shape,
327 first find the extremes of both latitude and longitude within
328 the shape and then take their respective means.
329
330 UTM - Universal Transverse Mercator. A grid coordinate system
331 specifying a datum, zone, and offsets from the equator and from
332 the meridian of the zone. See the References section, below, for
333 more information.
334
335 References
336
337 Township Range Section Information:
338
339 http://www.esg.montana.edu/gl/trs-data.html
340
341 Datum Information:
342
343 http://www.colorado.edu/geography/gcraft/notes/datum/datum_f.html
344 http://164.214.2.59/GandG/tm83581/tr83581a.htm
345 http://biology.usgs.gov/geotech/documents/datum.html
346
347 UTM Information:
348
349 http://www.nps.gov/prwi/readutm.htm
350 http://www.dmap.co.uk/ll2tm.htm
351
352 Note
353
354 Specific locality descriptions are inexact and seldom give
355 estimates of error. An ideal description of a specific locality
356 has no error. One way to achieve this ideal is to describe the
357 locality by a shape within which the exact locality must
358 certainly lie. The capture of shape data is certainly possible
359 with current GIS technology, and is even demonstrably more
360 efficient than the methods described above. However, there are
361 technical challenges yet to be met in order to make the capture
362 of shape data feasible in a collaborative Internet-based
363 georeferencing environment.
364
365 An alternative to using a shape to describe a locality is to use
366 a definitive point of arbitrarily high precision with an
367 attendant maximum error. This method, described in the foregoing
368 document, is a conservative expression of the locality which
369 satisfies the requirement that the exact locality must lie
370 within the space described.
371
372
373 _________________________________________________
374
375 Rev. 24 September 2001, JRW
376
377 University of California, Berkeley, CA 94720, Copyright 2001,
378 The Regents of the University of California.
>>> Posting number 95, dated 27 Sep 2001 10:45:45
Date: Thu, 27 Sep 2001 10:45:45 -1000
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Georeferencing document
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
John,
I went through your document this morning and find most of it clear and in
agreement with my own practices of georeferencing. I have some
observations and questions as follows:
A.
140 The worst of situations arises when a specific locality
141 description is internally inconsistent. There are numerous
142 possible causes for inconsistencies. It is the task of the those
143 georeferencing to determine the part of the description most
144 likely to be in error, ignore it for the purpose of the
145 determination, and document the decision to do so. The most
146 common source of inconsistency in a locality description comes
147 from trying to match elevation information with the rest of the
148 description. If there is no reasonable way to reconcile the
149 discrepancy, ignore the elevation.
150
151 Example: "10 mi W of Bakersfield, 6000 ft"
I have recently been through a georeferencing exercise in the herp
collection for which obtaining coordinates that agreed with the elevations
was critical. It was only through trying to match the description of the
location (distance and direction from X village) with the elevation given,
and finding that the given elevation at the described site was impossible,
that I uncovered major problems in the locality data provided for a large
number of herps on one particular collecting trip. In this case I was able
to contact the collector to ask about the inconsistencies and he determined
that his original distances were totally off because he was using miles on
a metric map. In this case the elevations were the correct piece of
information. I therefore caution against ignoring elevations out of hand.
B.
The section on Determining Latitude and Longitude does not include an example
for when coordinates are provided. For the sake of completeness, should
such an example be included, or, since they are being provided and not
determined, should this be taken up in another section? For example, when
coordinates are provided in degrees, minutes and seconds, do we translate
into decimals? How many decimal places do we go for minutes? For
seconds? Does it matter who provided the coordinates? Collector? Previous
museum person? Someone else? Under what circumstances, if any, should we
recalculate coordinates when they are provided by some previous source?
C.
210 Error associated with distance precision
211
212 Distance may be recorded in a specific locality description with
213 or without significant digits, and those digits may or may not
214 be warranted. A conservative way to insure that distance
215 precision is not inflated is to treat distance measurements as
216 integers with fractional remainders. Thus 10.25 becomes 10 1/4,
217 10.5 becomes 10 1/2, etc. Calculate the error for these distances
218 based on the fractional part of the distance, using 1 divided by
219 the denominator of the fraction.
Lines 217-219. Does this mean to "replace" the numerator with 1, and
divide by the denominator?
221 Example: "10.5 mi N of Bakersfield" Fraction is 1/2, error should
222 be 0.5 mi.
numerator is 1 to begin with, so doesn't answer the question.
224 Example: "10.6 mi N of Bakersfield" Fraction is 1/10, error
225 should be 0.1 mi.
Isn't the fraction of .6, 6/10? Did you replace the 6 with a 1 in order
to calculate the error?
227 Example: "10.75 mi N of Bakersfield" Fraction is 3/4, error should
228 be 0.25 mi.
Fraction this time is given as 3/4, not 1/4, but you could only get an
error of 0.25 by replacing the 3 with a 1 before dividing by 4.
As you can see, the examples are confusing.
All in all, it's a sound document. Thanks much.
>>> Posting number 96, dated 27 Sep 2001 20:34:47
Date: Thu, 27 Sep 2001 20:34:47 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: Gordon Jarrell <fnghj@AURORA.UAF.EDU>
Subject: Re: Georeferencing document
In-Reply-To: <5.0.2.1.1.20010927104434.00a2f7e0@mail.bishopmuseum.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Some good points. I've inserted my comments.
On Thu, 27 Sep 2001, XXXXXXX wrote:
> A.
> 140 The worst of situations arises when a specific locality
> 141 description is internally inconsistent. There are numerous
> 142 possible causes for inconsistencies. It is the task of the those
> 143 georeferencing to determine the part of the description most
> 144 likely to be in error, ignore it for the purpose of the
> 145 determination, and document the decision to do so. The most
> 146 common source of inconsistency in a locality description comes
> 147 from trying to match elevation information with the rest of the
> 148 description. If there is no reasonable way to reconcile the
> 149 discrepancy, ignore the elevation.
> 150
> 151 Example: "10 mi W of Bakersfield, 6000 ft"
>
> I have recently been through a georeferencing exercise in the herp
> collection for which obtaining coordinates that agreed with the elevations
> was critical. It was only through trying to match the description of the
> location (distance and direction from X village) with the elevation given,
> and finding that the given elevation at the described site was impossible,
> that I uncovered major problems in the locality data provided for a large
> number of herps on one particular collecting trip. In this case I was able
> to contact the collector to ask about the inconsistencies and he determined
> that his original distances were totally off because he was using miles on
> a metric map. In this case the elevations were the correct piece of
> information. I therefore caution against ignoring elevations out of hand.
>
The key words here are, "IF there is no way to reconcile the
discrepancy..." A possible resolution of the discrepancy might be to
treat it as "specific locality unknown." This might best be left to the
discretion of the individual collections. We have to judge individually
how bad our bad data are, i.e., whether or not we can reconcile them.
> B.
> The section on Determining Latitude and Longitude does not include an example
> for when coordinates are provided. For the sake of completeness, should
> such an example be included, or, since they are being provided and not
> determined, should this be taken up in another section? For example, when
> coordinates are provided in degrees, minutes and seconds, do we translate
> into decimals? How many decimal places do we go for minutes? For
> seconds? Does it matter who provided the coordinates? Collector? Previous
> museum person? Someone else? Under what circumstances, if any, should we
> recalculate coordinates when they are provided by some previous source?
>
(I know John's answer to some of this one.) The coordinates define an
infinitely small point, no matter what the format. Precision is measured
with max_error, not the number of significant figures.
Nevertheless, we will have coordinates in which precision was implied by
the recorded format. We have to convert this implied imprecision into a
measure of max_error. At UAM we are using 2 km, a little over a nautical
mile, for coordinates that were recorded to the nearest whole minute.
There are other examples, similar to the problems with distance precision:
64D 28' 30" N - What they meant to say, in terms of significant
figures, was probably 64D 28.5' N. I suppose in this example we would use
max_error= 1 km
We probably do need to develop a standard here. And yes, I'll bet we want
to be able to keep track of various determinations, re-determinations, who
did it, when, and how.
> C.
> 210 Error associated with distance precision
> 211
> 212 Distance may be recorded in a specific locality description with
> 213 or without significant digits, and those digits may or may not
> 214 be warranted. A conservative way to insure that distance
> 215 precision is not inflated is to treat distance measurements as
> 216 integers with fractional remainders. Thus 10.25 becomes 10 1/4,
> 217 10.5 becomes 10 1/2, etc. Calculate the error for these distances
> 218 based on the fractional part of the distance, using 1 divided by
> 219 the denominator of the fraction.
>
> Lines 217-219. Does this mean to "replace" the numerator with 1, and
> divide by the denominator?
>
> 221 Example: "10.5 mi N of Bakersfield" Fraction is 1/2, error should
> 222 be 0.5 mi.
>
> numerator is 1 to begin with, so doesn't answer the question.
>
> 224 Example: "10.6 mi N of Bakersfield" Fraction is 1/10, error
> 225 should be 0.1 mi.
>
> Isn't the fraction of .6, 6/10? Did you replace the 6 with a 1 in order
> to calculate the error?
>
> 227 Example: "10.75 mi N of Bakersfield" Fraction is 3/4, error should
> 228 be 0.25 mi.
>
> Fraction this time is given as 3/4, not 1/4, but you could only get an
> error of 0.25 by replacing the 3 with a 1 before dividing by 4.
>
> As you can see, the examples are confusing.
>
>
Looks like a typo in line 224.
I suggest replacing the sentence beginning in line 217 with:
The error is the resolution implied by the denominator. It can be
calculated as a distance by dividing one unit of distance by the
denominator.
Is that better? Or worse?
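
One way to read the reworded rule is sketched below in Python. The order in which denominators are tried (halves, then quarters, then tenths, hundredths, and so on) is an assumption, chosen so that the three examples from the document come out as 0.5, 0.1, and 0.25 mi. The function name is hypothetical, and whole-number distances are left out because the quoted rule does not cover them.

```python
from decimal import Decimal


def distance_precision_error(distance: str) -> Decimal:
    """Error implied by the fractional part of a recorded distance."""
    fraction = Decimal(distance) % 1
    if fraction == 0:
        raise ValueError("whole-number distances are not covered by this rule")
    # Prefer the familiar fractions first: 1/2, then 1/4 (assumed order).
    for denominator in (2, 4):
        if (fraction * denominator) % 1 == 0:
            return Decimal(1) / denominator
    # Otherwise fall back to the decimal resolution: tenths, hundredths, ...
    decimals = -fraction.as_tuple().exponent
    return Decimal(1) / (Decimal(10) ** decimals)


for recorded in ("10.5", "10.6", "10.75"):
    print(recorded, "mi ->", distance_precision_error(recorded), "mi")
```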
>>> Posting number 97, dated 28 Sep 2001 12:53:09
Date: Fri, 28 Sep 2001 12:53:09 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Georeferencing guidelines
Mime-version: 1.0
Content-type: multipart/alternative;
boundary="MS_Mac_OE_3084526390_196216_MIME_Part"
John et al.,
The georeferencing guidelines look great to me. The only (minor) quibble I
have would be with the second item under the subheading "Offsets" (lines
86-89). Here, you suggest that a locality that contains distance fractions
(such as "10.2 mi E Bakersfield") should be assumed to be road miles rather
than air miles. I see it the other way around. Most field workers I know are
careful to state "by road" if their mileage was actually measured along a
road. Otherwise, the mileage is assumed to be taken directly from a map
(i.e., air miles). I don't see that the inclusion of fractions in the
mileage should automatically signal that the mileage was read from an
odometer...it's easy to get that level of precision using the distance scale
printed on the map.
Let's see what the others think. Well done.
>>> Posting number 98, dated 28 Sep 2001 11:33:22
Date: Fri, 28 Sep 2001 11:33:22 -0700
Reply-To: Peter Rauch <peterr@socrates.Berkeley.EDU>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Georeferencing guidelines
In-Reply-To: <OF482A362E.E38FA255-ON86256AD5.00621E6D@lsu.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
On Fri, 28 Sep 2001, XXXXXXXX wrote:
> The georeferencing guidelines look great to me. The only
> (minor) quibble I have would be with the second item under
> the subheading "Offsets" (lines 86-89). Here, you suggest
> that a locality that contains distance fractions (such as
> "10.2 mi E Bakersfield") should be assumed to be road miles
> rather than air miles. I see it the other way around. Most
> field workers I know are careful to state "by road" if their
> mileage was actually measured along a road.
On insect labels ;>) "by road" is just that much more text to
cram onto tiny labels. Maybe things are different with
vertebrate folks, especially for those who keep detailed field
notebooks. I think lots of folks keep careful track of their
odometers, and record road/track miles quite often. I suspect
that *either* assumption is likely to be wrong too often (i.e.,
when no explicit indication is given of which type of
measurement is done). Perhaps the classification should be
"Basis of measure not indicated" and let the "buyer beware"?
(I.e., the geographic analyst can then choose how she wishes to
interpret the distances --perhaps choosing to measure both ways
if a locality seems out of place under one or the other
measurement scheme.)
> Otherwise, the
> mileage is assumed to be taken directly from a map (i.e.,
> air miles). I don't see that the inclusion of fractions in
> the mileage should automatically signal that the mileage was
> read from an odometer...it's easy to get that level of
> precision using the distance scale printed on the map.
>>> Posting number 99, dated 30 Sep 2001 13:35:49
Date: Sun, 30 Sep 2001 13:35:49 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: FW: Locality comment
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
John et al.:
With regard to assigning coordinates to localities, there is a convention
that has been used here at KU for at least 50 years that will help with
localities that are given with reference to towns in the US. When the town
(e.g. Lawrence) was a county seat, distances were measured from the
courthouse. Frequently this was near the center of town, but it reduces the
error in estimating the distance from town because we don't need to worry
about the distance being measured from the city limits. If the locality is
3.5 mi NW of Lawrence, we still have the uncertainty associated with the
angular component. If the town is not a county seat, the Post Office is
frequently specified as the point of reference. We think this system was
exported to several other collections that are part of MANIS. In general,
your suggestions look quite reasonable (and conservative).
>>> Posting number 100, dated 12 Oct 2001 16:22:06
Date: Fri, 12 Oct 2001 16:22:06 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing Commentary synopsis
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Hi folks,
I've been ruminating over the responses to the Georeferencing Guidelines
document, which was posted on the MaNIS website on 24 Sep 2001. That
document has generated interest in a wider community, including the
Alexandria Digital Library Project, so I feel it worthwhile to spend a
little extra effort to fill in some omissions. Below I will address the
points brought up in discussion and try to provide satisfactory solutions.
I would like to know if there are any objections to these solutions. My
next step will be to incorporate this information into the Guidelines
document and then announce the existence of that document to NHCOLL.
XXXXXXXX mentioned a convention to use the courthouse for a
point of reference for a county seat and to use a post office as a point of
reference for other towns. Since the Board on Geographic Names GNIS data
often follows this convention as well I see no conflict. Of course, this
convention applies only to the US, and only to those towns where there is a
single identifiable post office or a courthouse. For all other
determinations the current geographic center of the town, or the
coordinates given in a gazetteer, should be used. In either case it is best
to note something akin to "measured from the post office" or "measured from
the geographic center of Bakersfield" in the determination remarks.
XXXXXXXX brought up the topic of elevations as a critical part of the
determination criteria. I agree with her assessment and I propose that we
follow XXXXXXXX's advice, namely, that localities for which there are
internal inconsistencies should be deferred to the parent institution for
further investigation. I have designed the collaborative gazetteer to
allow annotations to both localities and higher geography. Through the
annotations, georeferencers can note inconsistencies for follow-up work.
Collaborators will be able to check the gazetteer for annotations that
apply to the data from their institution.
XXXXX also noted that there was no example of how to deal with existing
geographic coordinates. My original thought was that we should count these
localities as finished. Yet, there is merit in revisiting existing data,
both for validation and for edification, especially since none of the
existing coordinates have associated error. Nevertheless, we must remain
cognizant of our budgetary constraints. We were given funds to georeference
localities for which we didn't already have coordinates. All that aside,
XXXXX's point is well-taken. I will provide guidelines for existing
geographic coordinates in the forthcoming revised Georeferencing Guideline
document.
XXXXX asked whether we should translate coordinates from other coordinate
systems into decimal degrees for data entry. The gazetteer currently
accommodates the following coordinate systems:
decimal degrees
degrees, decimal minutes
degrees, minutes, decimal seconds
UTM
But that doesn't answer the question. I will endeavor to create an
interface in which the user will select the original coordinate system and
provide the data in that system. Behind the scenes the data will be stored
in that system AND will be translated to decimal degrees. There will be
decimal degrees and the original coordinates for every determination.
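
A minimal sketch of that storage idea, in Python: the record keeps the coordinate exactly as entered alongside a derived decimal-degree value. The class and field names are hypothetical and are not the gazetteer's actual schema, and only the degrees, minutes, seconds case is shown.

```python
from dataclasses import dataclass


def dms_to_decimal(degrees, minutes, seconds, direction):
    """Convert degrees, minutes, seconds plus N/S/E/W to decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative by convention.
    return -value if direction.upper() in ("S", "W") else value


@dataclass(frozen=True)
class StoredCoordinate:
    original: str           # text exactly as supplied, never altered
    original_system: str    # e.g. "degrees, minutes, decimal seconds"
    decimal_degrees: float  # derived value, kept alongside the original


lat = StoredCoordinate(
    original='64D 28\' 30" N',
    original_system="degrees, minutes, decimal seconds",
    decimal_degrees=dms_to_decimal(64, 28, 30, "N"),
)
print(lat)
```

Keeping both values means the conversion can be redone later without any loss if a mistake is found in it.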
XXXXX's next topic was with respect to the precision stored in the
coordinate fields. There is no reason to truncate the values of coordinates
to conform to a predefined level of precision. For reasons described under
the section on Precision in the Georeferencing Guidelines document, it is
inappropriate to try to store precision information in the coordinate data.
Since the values of the coordinates do not make a statement about the
precision of the determination, keeping as many digits as your source
provides is the preferred method. Discarding digits may have an effect on
accuracy, so it is not recommended. Just for edification, a decimal degree
that records five digits to the right of the decimal can distinguish
between two places on the earth roughly one meter apart. Similarly, if you
want to maintain accuracy down to one meter, degrees and decimal minutes
should be recorded with 4 decimal places in the decimal minutes, and
degrees minutes seconds should be recorded with 2 decimal places in the
decimal seconds. Conversely, degrees minutes seconds measured to whole
seconds can introduce inaccuracies of up to 31 meters. Those measured to
whole minutes can introduce inaccuracies of up to 1.85 km. I'll make a
chart of this information for the document revision.
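
The figures in this paragraph can be checked with a short Python sketch. It assumes a spherical earth on which one degree of latitude spans about 111.32 km, so the results are approximate and apply to latitude only. The constant and function names are illustrative.

```python
KM_PER_DEGREE_LAT = 111.32  # approximate; assumes a spherical earth


def resolution_m(unit, decimal_places):
    """Ground distance (m) of one step in the last recorded digit."""
    per_degree = {"degrees": 1, "minutes": 60, "seconds": 3600}[unit]
    step_in_degrees = 10 ** -decimal_places / per_degree
    return step_in_degrees * KM_PER_DEGREE_LAT * 1000


for unit, places in [("degrees", 5), ("minutes", 4), ("seconds", 2),
                     ("seconds", 0), ("minutes", 0)]:
    meters = resolution_m(unit, places)
    print(f"{unit:8s} {places} decimal places: {meters:10.2f} m")
```

Under these assumptions the last two rows come out close to the 31 m and 1.85 km figures quoted above.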
XXXXX's final question has to do with recording the information about who
determined the coordinates. This should certainly be among the best
practices within museums. At the MVZ these data are recorded by making a
reference to the actual person who made the determination. Since the data
are internal to the museum we can tell whether that person was also the
collector or another person on staff. Another possibility is to record the
role of the person who made the determination (e.g., 'collector',
'curatorial assistant', 'Joe's specific locality munger', etc.). Or, if you
only care whether the collector was the one to provide the coordinates, you
could include a DeterminedByCollector field. For MaNIS I intend to use the
name of the person who determines the coordinates, this name being
determined from a login to the online georeferencing interface.
A point of clarification is in order. When determinations are made, I
intend to treat them as opinions. They will not be stored directly with the
locality record, rather, they will refer to it. This allows any number of
lat/long opinions to be registered. The individual institutions will be
able to decide which one (if there are multiple opinions) will be the
"accepted" determination when they put the data back in their databases.
All of the coordinates that were provided in the data sent to me have been
turned into opinions and are already in the gazetteer.
XXXXXX made the following observation:
"There are other examples, similar to the problems with distance precision:
64D 28' 30" N - What they meant to say, in terms of significant
figures, was probably 64D 28.5' N. I suppose in this example we would use
max_error= 1 km"
I agree with XXXXXX's assessment of significance; however, the
determination of error is more complicated. Not all degrees are created
equal. Contrary to popular opinion, the distance between 64 degrees N and
65 degrees N is not the same as the distance between 10 degrees N and 11
degrees N. This is due to the oblateness (flattening from a perfect sphere)
of the earth. This may be a minor point, but longitudinal degrees vary
greatly, being roughly 110 km at the equator and 0 km at the poles. My
point is that I need to provide an interface in which one can enter
coordinates and the digits of precision and get back an error distance
based on those criteria.
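
The kind of calculation such an interface would perform can be sketched as follows. This Python fragment is not the planned error calculator. It uses a spherical approximation (so it ignores the oblateness mentioned above) with an assumed 111.32 km per degree, it takes the diagonal of the cell implied by the recorded precision as the error (also an assumption), and all names are hypothetical.

```python
import math

KM_PER_DEGREE = 111.32  # assumed spherical-earth value


def km_per_degree_longitude(latitude_deg):
    """Length of one degree of longitude at the given latitude."""
    return KM_PER_DEGREE * math.cos(math.radians(latitude_deg))


def precision_error_km(latitude_deg, precision_deg):
    """Diagonal of the cell implied by coordinates recorded to precision_deg."""
    north_south = precision_deg * KM_PER_DEGREE
    east_west = precision_deg * km_per_degree_longitude(latitude_deg)
    return math.hypot(north_south, east_west)


half_minute = 0.5 / 60  # 64D 28.5' N is recorded to the nearest half minute
print(round(precision_error_km(64.475, half_minute), 2))
print(round(precision_error_km(10.0, half_minute), 2))
```

The same recorded precision yields a smaller error distance at 64 degrees N than at 10 degrees N, because the east-west extent of the cell shrinks toward the poles.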
I will amend my wording and typos with respect to using fractions in the
distance precision error section.
XXXXXXXXX brought up a reasonable alternative view of how offsets should
be handled. The judgement of whether measurements are "by road" or "by air"
can be a tricky one. I want to propose a solution and see if I can get a
consensus.
Specific localities that actually say what the measurement method is (e.g.,
"2.8 mi (by road) E of Marysville") should use that method for determining
coordinates and errors. No special remark is necessary in these cases.
Specific localities that have two orthogonal measurements in them (e.g.,
"2.5 mi E and 1.5 mi N of Bakersfield") are always assumed to be "by
air." No special remark is necessary in these cases either. Furthermore,
no error due to direction imprecision should be used.
So much for the easy stuff.
Specific localities that have one linear offset measurement from a named
place, but that do not specify how that measurement was taken (e.g., "10.2
mi E of Yuma") are open for a case-by-case judgment. I propose that the
judgement itself always be documented in the remarks for the determination
(e.g., "Assumed 'by air' - no roads E out of Yuma", or "Assumed 'by road'
on Hwy. 80"). If there is no clear best choice, then use the midpoint
between the two possibilities as the geographic coordinate and assign an
error large enough to encompass the coordinates and errors of both methods.
In this case I would remark something like "Error encompasses both distance
by air and distance by road (Hwy. 80)". This is a conservative solution,
but it is relatively simple to do and to remember. This method is also
never "wrong," if by "wrong" we mean that the actual place is certainly
within our error distance from the given coordinates.
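
The "no clear best choice" case can be sketched in Python. The sketch assumes both candidate positions have already been expressed as planar offsets in kilometers (east, north) from the named place, which is a simplification. The function name and the sample numbers are invented for illustration.

```python
import math


def combine_candidates(by_air, by_road, error_air_km, error_road_km):
    """Midpoint of two candidate points and a radius covering both.

    Each point is an (east_km, north_km) offset from the named place.
    """
    mid = ((by_air[0] + by_road[0]) / 2, (by_air[1] + by_road[1]) / 2)
    half_gap = math.dist(by_air, by_road) / 2
    # Reach the far edge of whichever candidate's error circle is larger.
    radius = half_gap + max(error_air_km, error_road_km)
    return mid, radius


mid, radius = combine_candidates((16.4, 0.0), (14.0, 2.0), 1.0, 1.5)
print(mid, round(radius, 2))
```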
XXXXXXXXX brought up a question about what units should be used
for maximum error distance. I have set up the gazetteer so that the units
are entered (chosen actually) from a list of possible values (m, km, ft,
yds, mi). The distance and units should be chosen to make sense in the
context of the locality description. My conservative stance on translation
and recalculation issues is to "never adulterate data that can be
adulterated later." If you decide to put these data back into your
databases (and I certainly hope that you will), you can decide at that time
whether to normalize to a single unit of measure.
XXXXXXX also brought up an essential issue of whether errors propagate and
should therefore be summed rather than simply choosing the greatest single
source of error. The answer is not a simple one, so bear with me.
XXXXXXX's specific example, "3 km N + 2 km W Bakersfield" is an instance
of a type of locality description for which I did not provide an example. A
proper description of the error for this example would be a bounding box
centered on the point 3 km N and 2 km W of Bakersfield. Each side of the
box would be 2 km in length (1 km error in any direction). Since we're
using a point and radius to characterize the error, we need a circle that
will circumscribe the above-mentioned bounding box. To do this, the radius
has to be the distance from the center coordinate to a corner. This could
either be calculated by the geometry of the bounding box (in the above
example it would be the 1 km distance from the center to a side times the
square root of 2, or about 1.4 km) or measured on a map.
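
As a check on the geometry, here is a short Python sketch. It assumes the error is given separately for the north-south and east-west offsets, and the function name is hypothetical.

```python
import math


def circumscribing_radius(ns_error, ew_error):
    """Radius of the circle through the corners of the error box."""
    return math.hypot(ns_error, ew_error)


# "3 km N + 2 km W Bakersfield": 1 km of error on each offset.
print(round(circumscribing_radius(1.0, 1.0), 3))  # 1.414
```

Simply adding the two 1 km errors would give 2 km, which is larger than the circle actually needs to be.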
There remains the more general question of whether errors propagate. They
do, and they are non-linear, so to sum them is a mistake. The paragraph
above shows how a sum is not a satisfactory method of accommodating
multiple sources of error. As more sources of error come to bear, the
propagation gets even more "interesting." I'll spare you the details here,
but I'll make a point of explaining these sources and how they should be
dealt with in the Guidelines revision.
In addition to the issues brought up so far in discussion, I have a few to
add independently. First, I got the calculation for directional error
wrong. I'll update that in the revision. Second, it is probably obvious,
but I still need to state that the directional error can be ignored when
the distance is measured either "by road" or when the description gives two
orthogonal offsets (e.g., "2 mi E and 4 mi N"). Third, there is another
source of error inherent to reading maps. This error is based on the scale
and it reflects inherent errors in the maps themselves. I will quantify
these errors in the revision.
Aside from the revised georeferencing document, I'm currently working on
interfaces to do the georeferencing online. I'll send out a how-to guide
when the interface is ready to use. It is too soon to know when that will be.
So that everyone knows, my field season is about to begin. Eileen and I are
scheduled to leave for Argentina on 3 Nov and to return around New Year's day.
That's it for my update. Feel free to discourse on my proposed amendments
and thanks to everyone for the comments thus far.
John
>>> Posting number 101, dated 16 Oct 2001 12:43:55
>>> Posting number 102, dated 18 Oct 2001 19:30:33
Date: Thu, 18 Oct 2001 19:30:33 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing Guideline Document Updated
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
It took almost two weeks, but the eagerly-awaited revision to the
Georeferencing Guidelines Document is finally complete. I have replaced the
original document, so the following URL now points to the revision:
http://dlp.cs.berkeley.edu/manis/GeorefGuide.html
I'm not including the line-numbered text of the document here since we are
presumably past the heated debates. Nevertheless, commentary is
always welcome.
When you read the revised document you are likely to be struck by the
complexities of determining error properly. Don't despair. My next task is
to create an error calculator. The idea is to have a web page on which you
can enter the relevant parameters and get a maximum error distance. This
tool will be a supplement to the georeferencing tool itself, the
development of which is underway.
John
>>> Posting number 103, dated 19 Oct 2001 12:29:38
>>> Posting number 104, dated 4 Nov 2001 21:44:44
Date: Sun, 4 Nov 2001 21:44:44 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Barbara R. Stein" <bstein@OZ.NET>
Subject: MaNIS--ready, set, georeference!
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="------------24FB9C29A003860042ABE8C3"
Dear All,
This is the moment I know you have all been waiting for! You will
notice a new Gazetteer link at the bottom of the MaNIS home page
(http://dlp.cs.berkeley.edu/manis). This is your gateway to hours of
georeferencing fun. But before starting to work, please read this
message in its entirety, print it out and post it next to the computer
that will be used for georeferencing. You’ll see why you need to print
it when you get near the bottom.
To begin, please review the updated Georeferencing Guidelines.
Next, you will want to read the Georeferencing Steps document. A hot
link to it appears at the top of the gazetteer page.
You will also want to read the text below the query screen on the
gazetteer main page.
After reading all of the above, you will query the gazetteer for a
locality of interest. The "Search" button returns a list of all higher
geographies containing the term entered and indicates how many unique
localities are contained in the result set. The list will not tell you
how many of those localities are already georeferenced. You will see
those data once you download the localities.
You may choose to “View” the queried localities either before or after
downloading BUT this function will not aid you in assigning lat/long
coordinates. Only those localities for which coordinates have already
been assigned get plotted using the GIS viewer (this is the same tool we
showed you at the ASM meeting, courtesy of the Berkeley Digital Library
Project).
Where the GIS viewer is most helpful is in pointing out erroneous
coordinates (e.g., if you view the georeferenced localities from
Algeria, 3 specimens appear in the Atlantic Ocean). By clicking on that
point on the map, you can see the locality record(s) for that point and
correct it/them or, if the locality is not yours, you can contact the
appropriate institution. The viewer also allows you to see how much
work you have accomplished!
Notes about the viewer: This is a java applet and takes time to load.
Do not attempt to use it on older machines with inadequate memory.
Also, not all map layers exist for all parts of the world (e.g., you
will only get USGS 7.5' topo maps for the U.S.). How far you can zoom
and the level of resolution you see will depend on the map layers
available.
Additional notes: 1) This gazetteer is a static snapshot of your data
compiled for the sole purpose of georeferencing unique localities.
Corrections to specific localities should be made directly in
institutional databases. They will not be made in the gazetteer so
don't spend time fixing them in the downloaded files. 2) Below the
georeferencing steps you will see the complete list of fields that will
appear in your downloaded files. Those that are in bold are fields you
will fill. Those not in bold are needed by John to reassociate the data
in the gazetteer with the data in your institutional databases. DO NOT
alter the values in these fields!
For security purposes, we are not posting instructions on how to upload
georeferenced localities on the web site. Below is the complete text
for Step Eight of the Georeferencing Steps document. These instructions
are also being archived on the listserv should you forget to print out
this message. Follow the instructions below for uploading completed
files:
Step Eight - Upload Finished Localities
Upload the finished file of georeferenced localities by anonymous
FTP to galaxy.cs.berkeley.edu in the directory incoming/mvz/manis. Use
your favorite FTP client to connect to galaxy.cs.berkeley.edu. Log in as
anonymous, providing your email address as a password. Set the file type
to text. Change to the incoming/mvz/manis directory on galaxy. Transfer
your file.
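For anyone who would rather script the upload than use an FTP client, the steps above can be sketched with Python's standard `ftplib`. The host and directory are the ones given in Step Eight. The function name, its parameters, and the `ftp_factory` hook (which lets the function be exercised without a network connection) are assumptions made for illustration.

```python
import ftplib

HOST = "galaxy.cs.berkeley.edu"      # from Step Eight
REMOTE_DIR = "incoming/mvz/manis"    # from Step Eight


def upload_localities(local_path, remote_name, email, ftp_factory=ftplib.FTP):
    """Upload a text file by anonymous FTP, following Step Eight."""
    ftp = ftp_factory(HOST)                  # connects to the host
    try:
        ftp.login("anonymous", email)        # email address as the password
        ftp.cwd(REMOTE_DIR)
        # storlines performs a text (ASCII) transfer; it expects a file
        # object opened in binary mode.
        with open(local_path, "rb") as fh:
            ftp.storlines("STOR " + remote_name, fh)
    finally:
        ftp.quit()
```

A call might look like `upload_localities("ca_localities.txt", "ca_localities.txt", "you@example.edu")`, where the file name and address are placeholders.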
Notice that the MVZ has already laid claim to all California localities
(see MaNIS Georef. Checklist in Step 2). Try as you might, we will not
relinquish this claim! It is therefore incumbent upon each of you to
lay claim to an equally prestigious set of localities.
Those of you paying attention will realize that John is now in Argentina
for two months. He hoped to have the Error Calculator completed before
leaving. He did not. However, once completed, you will simply enter
your lat/long coordinates and it will do all the work of calculating the
error in those values for you-- so it is worth the wait. Go ahead and
start georeferencing now. You will soon be able to go back and fill in
the errors needed as he will post the calculator from the field.
I wish I had more to report on the status of your subcontracts, but I do
not. Some of you will be able to begin work regardless. The
bureaucracy has a timeline of its own. We simply have to proceed as best
we can in the meantime.
Please continue to address any questions or comments to the list.
Ready, set, georeference!
Best,
Barbara
>>> Posting number 105, dated 6 Nov 2001 09:51:19
>>> Posting number 106, dated 6 Nov 2001 09:00:24
>>> Posting number 107, dated 6 Nov 2001 12:24:23
>>> Posting number 108, dated 6 Nov 2001 14:29:22
>>> Posting number 109, dated 6 Nov 2001 16:52:12
>>> Posting number 110, dated 6 Nov 2001 16:06:24
Date: Tue, 6 Nov 2001 16:06:24 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Patricia W. Freeman" <pfreeman1@UNL.EDU>
Subject: Re: MaNIS--ready, set, georeference!
Comments: cc: hgenoways1@unl.edu
In-Reply-To: <4.2.2.20011106122240.00abdfb8@packrat.musm.ttu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Dear members of MaNIS-
I am actually out of your official MaNIS loop, but I have already
georeferenced Nebraska for mammals, birds, herps, and fish (over 60,000
specimens) and will probably do South Dakota as well. I salvaged 8,000
herps and about 1,500 mammals from USD about two years ago.
All four vertebrate groups are on our web page and searchable to county.
Although we have already georeferenced all four collections, the complete
localities will not be put on the webpage until next semester (I hope). My
computer expert, using the Texas Tech georeferencing idea, modified and
wrote a conversion program that changes all our geographic localities to
georeferenced localities.
We now have a large NT server that has the USGS maps and gazetteers on it.
Since Hugh Genoways is rewriting the Mammals of Nebraska and has already
started gathering specimens for that purpose, all mammals and mammal data
used for that study will be automatically georeferenced and those data will
accompany the loaned materials on return to their home institution. I
expect that he has or will contact most of you who have Nebraska material.
Regards-
Trish Freeman
PS. Can any of you direct me to FISHNET or BIRDNET if there are such
things? I am already involved with HERPNET, although I do not know what is
happening with it. Maybe someday we will have VERTNET.
Patricia W. Freeman
Professor/ Curator of Zoology
University of Nebraska State Museum
Lincoln NE 68588-0514
402-472-6606
402-472-8949 (fax)
Natural history museums archive biological diversity.
http://www-museum.unl.edu/research/zoology/zoology.html
>>> Posting number 111, dated 7 Nov 2001 09:09:31
>>> Posting number 112, dated 7 Nov 2001 08:32:12
>>> Posting number 113, dated 8 Nov 2001 14:03:13
>>> Posting number 114, dated 8 Nov 2001 14:39:28
Date: Thu, 8 Nov 2001 14:39:28 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: MaNIS--ready, set, georeference!
In-Reply-To: <3BE6274C.F9AC2E10@oz.net>
Mime-version: 1.0
Content-type: multipart/alternative;
boundary="MS_Mac_OE_3088075168_258732_MIME_Part"
Dear all,
1. I have Internet Explorer 5 for Macintosh on a G4. I haven't been able
to download records from the Manis website.
2. Our grant submission allotted funds to each institution based on their
records to be geo-referenced. Does committing to a state/province or region
change all of this?
3. The process has changed considerably between when our records were
downloaded for John and the ASM meeting. I thought that our records were
being submitted so that John would have a snapshot of what the different
databases looked like in order to design the Manis database. I had planned
to clear up any inconsistencies, spelling errors, etc in our localities
before we geo-referenced and downloaded to the Manis database. This seems
to make sense, since many errors in locality records can be cleared up only
with the use of in-house resources such as field notes and catalogs. Now we
are committing to a region and giving our best opinion on perceived errors
(to be noted in the Locality Annotation) to other institutions (and
ourselves!) for them to rectify (or not) at their leisure. Since I haven't
been able to download records, I don't know how much this new scheme will
save time overall or be more time consuming!
4. There are many localities that are designated unique that simply differ
in syntax, spelling, etc. They are not necessarily next to each other.
Would editing our own version of the database first for these errors and
then downloading them into the Manis database work?
Cheers,
XXXXXXXXX
>>> Posting number 115, dated 8 Nov 2001 21:20:18
Date: Thu, 8 Nov 2001 21:20:18 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: permutations on "unique" localities in the gazetteer
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Dear All: I was wondering about many of the same points that XXXX
XXXXXXXXX mentioned in his email of 8 Nov, especially after perusing the
gazetteer and seeing many permutations on "unique" localities. E.g.,
localities like Seattle, 20 mi N, 20 mi N of Seattle, Seattle, 20 mi north,
and north of Seattle 20 miles, have to be allowed because of institutional
style or preference. However, an entry such as Seatle, 20 mi N could be
corrected. Each is a unique record to the computer and will receive the
same lat/long by georeferencers? Once georeferenced, the permutations can
be identified, but if localities are entered differently, how much
efficiency is gained by having one institution georeference all records for
a region vs having each georeference their own records? In addition when
a typo like Seatle is corrected, it no longer is unique but of the same set
as the correct spelling. The typos will be deleted from the static
gazetteer after determining that they were corrected in the institutional
database (see comment from Barbara below)? It is unclear to me how
corrections in institutional databases will be mirrored in the static
gazetteer.
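The permutation problem described above can be illustrated with a small normalization sketch. It reduces each locality string to an order-independent set of tokens, so that differences of word order, punctuation, and spelled-out directions collapse to one key. The synonym table and function names are assumptions for illustration, not part of any MaNIS tool.

```python
import re
from collections import defaultdict

# Spelled-out words mapped to the abbreviations used in other records.
SYNONYMS = {"north": "n", "south": "s", "east": "e", "west": "w",
            "mile": "mi", "miles": "mi"}
FILLER = {"of"}


def locality_key(text):
    """Reduce a locality string to an order-independent tuple of tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return tuple(sorted(SYNONYMS.get(t, t) for t in tokens if t not in FILLER))


def group_permutations(localities):
    """Group locality strings that share the same normalized key."""
    groups = defaultdict(list)
    for loc in localities:
        groups[locality_key(loc)].append(loc)
    return dict(groups)


examples = ["Seattle, 20 mi N", "20 mi N of Seattle", "Seattle, 20 mi north",
            "north of Seattle 20 miles", "Seatle, 20 mi N"]
groups = group_permutations(examples)
```

With these five strings, the four stylistic variants fall into one group and the misspelled "Seatle" stays on its own, which is the distinction drawn above between style and error. Because word order is discarded, a compound locality naming two places could collide with a different one, so a key like this is a review aid, not a substitute for a person reading the records.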
Although the idea of compiling a static gazetteer of unique localities
seemed like a good idea at the beginning, it does not seem doable at this
point. I would prefer to go back to the original plan of each institution
dealing with their own records and offering assistance to others as needed.
Once georeferencing is started and we get $ for the servers, the
gazetteer could be produced dynamically, or at least by frequent uploads -
rather than statically - and can be consulted, updated, corrected, winnowed
as needed.
From 4 Nov email of Barbara:
...
Additional notes: 1) This gazetteer is a static snapshot of your data
compiled for the sole purpose of georeferencing unique localities.
Corrections to specific localities should be made directly in
institutional databases. They will not be made in the gazetteer so
don't spend time fixing them in the downloaded files.
...
>>> Posting number 116, dated 9 Nov 2001 08:57:34
Date: Fri, 9 Nov 2001 08:57:34 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Barbara R. Stein" <bstein@OZ.NET>
Subject: Re: MaNIS--ready, set, georeference!
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
> 1. I have Internet Explorer 5 for Macintosh on a G4. I haven't been able
> to download records from the Manis website.
XXXX et al.,
We are checking this out and, with luck, will have a fix today. In the
meantime, you can download from a Mac using Netscape.
> 2. Our grant submission allotted funds to each institution based on their
> records to be geo-referenced. Does committing to a state/province or
> region change all of this?
No, it does not. It was presumed that, in most instances, the majority of
localities for a given state, and the geographic expertise and resources
to untangle geographic problems, would reside with the institution in that
state. Therefore, it made sense that we should work cooperatively to
georeference. Each institution naturally will have many other specimens
collected outside that state. Each can choose to do only its own
localities, thereby encouraging duplicate effort, or we can attempt a more
altruistic approach and realize economies of scale. If, after
georeferencing all of California, the MVZ looks at its remaining
collections and sees that it has a tremendous amount of material from
Brazil, Peru and Argentina, and recognizes that it also has more
geographic expertise in these regions than any of the other institutions
(and presumably more maps, gazetteers, etc.), then we are going to offer
to do all localities from those countries for the sake of efficiency and
making the money go as far as possible. In return, we know we will benefit
from the Bishop Museum doing our PNG material, of which we have a fair
number of specimens. We could do it, yes. But they can probably do it more
quickly and easily. This approach also allows those with an interest in a
particular region of the world to get a good handle on what exists in our
joint collections and, I suspect, reach some very interesting summaries
about those regions and the state of our knowledge of their mammalian
fauna.
> 3. The process has changed considerably between when our records were
> downloaded for John and the ASM meeting.
No it has not. All of this was discussed online during the proposal
preparation process beginning more than a year ago.
> I thought that our records were being submitted so that John would have
> a snapshot of what the different databases looked like in order to design
> the Manis database.
That is also true. There were always two objectives in giving John your
data.
> I had planned to clear up any inconsistencies, spelling errors, etc in
> our localities before we geo-referenced and downloaded to the Manis
> database.
The time to have cleared up those problems was before the data were sent
to John. Since this approach was outlined in the first proposal submission
over a year ago, it should not have come as a surprise. The money we
receive from NSF was never intended to pay institutions to clean up their
locality records. It is to georeference those records.
> This seems to make sense, since many errors in locality records can be
> cleared up only with the use of in-house resources such as field notes
> and catalogs. Now we are committing to a region and giving our best
> opinion on perceived errors (to be noted in the Locality Annotation) to
> other institutions (and ourselves!) for them to rectify (or not) at their
> leisure.
Since you haven't started to georeference, you will have to take my word
that your fears are probably worse than reality. Truly erroneous
localities become obvious quite quickly and if they are not your own,
simply email a query to the institution to which that locality belongs.
Multiple versions of the same locality also jump out quickly. The
advantage of using a single individual to georeference a region is that
s/he quickly becomes familiar with the localities in that place. My own
personal suggestion is that each PI sit down with the data and try this
process him- or herself before hiring a student to really get going on it.
It will give you confidence and a much better feel for how it all works.
And, if you love maps like I do, it can actually be quite a seductive
exercise. Your problem will be to keep working and not to get distracted
by the geography and all the places you would like to collect, have
collected, etc. Perhaps the most difficult aspect is recognizing place
names that are no longer in use. Again, review the georeferencing
guidelines which remind you not to dwell on any single seemingly
intractable locality.
> 4. There are many localities that are designated unique that simply
> differ in syntax, spelling, etc. They are not necessarily next to each
> other. Would editing our own version of the database first for these
> errors and then downloading them into the Manis database work?
I don't believe so. As mentioned above, each institution has known about
this approach for more than a year and could have, in that time, chosen to
direct part of its routine curatorial effort to cleaning up localities in
its db. The final distributed db will have whatever corrected specific
localities get made during the georeferencing process. We were not given
money to clean up our localities. We received this money to georeference.
You are under no obligation to correct localities for other institutions.
You are merely being asked to georeference them. Even if related
localities do not fall out in line with one another in your downloaded
files, if one individual works on all the localities for a given region,
s/he will not have trouble recalling that a lat/long for a similar place
was assigned just two days ago and one can scroll up the list to find it.
I am sure John will want to add his own comments to what I have written.
He generally has access to email about once a week. In the meantime, I
will let you know as soon as we solve the download problem. That does not
have to wait for him.
Best, Barbara
>>> Posting number 117, dated 9 Nov 2001 09:28:19
Date: Fri, 9 Nov 2001 09:28:19 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Barbara R. Stein" <bstein@OZ.NET>
Subject: Re: permutations on "unique" localities in the gazetteer
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
> Each is a unique record to the computer and will receive the
> same lat/long by georeferencers?
Yes.
> Once georeferenced, the permutations can
> be identified, but if localities are entered differently, how much
> efficiency is gained by having one institution georeference all records for
> a region vs having each georeference their own records?
Please refer to my reply to XXXXXX.'s previous message on this issue. Having a
fair amount of experience doing georeferencing, the MVZ and other instigators
of this proposal believe strongly that much efficiency can be gained by a
cooperative approach. Proof of our commitment is that the MVZ has agreed to do
all California localities for this project even though we have completed
georeferencing our own localities for many counties in the state more than a
year ago. We believe we can just do it more efficiently and more painlessly
than any of you folks can. Even LACM didn't fight us on this point. I can
change the oil in my car but...
> In addition when
> a typo like Seatle is corrected, it no longer is unique but of the same set
> as the correct spelling. The typos will be deleted from the static
> gazetteer after determining that they were corrected in the institutional
> database (see comment from Barbara below)?
No, the typos will not be deleted from the static gazetteer. The static
gazetteer exists simply as a way to unite all localities from our respective
dbs for georeferencing and then return the georeferenced locs to their
respective dbs.
> It is unclear to me how
> corrections in institutional databases will be mirrored in the static
> gazetteer.
I repeat-- corrections in institutional dbs will not be mirrored in the static
gazetteer. Rather, your efforts will be mirrored in the final product--a
geographic dictionary coupled with the distributed db network and GIS viewer.
Please review our NSF proposal.
> Although the idea of compiling a static gazetteer of unique localities
> seemed like a good idea at the beginning, it does not seem doable at this
> point.
It has been done, for the purpose it was designed to carry out.
> I would prefer to go back to the original plan of each institution
> dealing with their own records and offering assistance to others as needed.
That is not what was agreed to or specified in the proposal.
> Once georeferencing is started and we get $ for the servers, the
> gazetteer could be produced dynamically, or at least by frequent uploads -
> rather than statically - and can be consulted, updated, corrected, winnowed
> as needed.
And it will be. You are exactly right.
Best,
Barbara
>>> Posting number 118, dated 9 Nov 2001 14:20:26
>>> Posting number 119, dated 9 Nov 2001 14:57:01
Date: Fri, 9 Nov 2001 14:57:01 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Static Gazetteer
MIME-version: 1.0
Content-type: multipart/alternative;
boundary="Boundary_(ID_WoSRKrESJwWTVCyCL0UPxw)"
Dear All,
To add to my last message, I don't think the static gazetteer was a
surprise, rather the timing of it was. When I sent the TTU site data to
John early in the summer, I told him that we are in the middle of verifying
and correcting our database. (We have been working on checking and
correcting our database for nearly three years; I happily report that we
are all but done now.) At the time, I told John that the corrected data
were NOT what was being sent to him. He implied that this was okay and
that the static gazetteer would be created at a later time. However, I may
have misunderstood him. Now, it seems that several of us have data that we
are not comfortable with in the already compiled gazetteer.
I did understand that the NSF money was to meant to cover database
corrections, but I thought we'd begin georeferencing only after the data
had been corrected. I think we're all looking for ways to simplify the
process and having the idiosyncrasies of years of data entry already fixed
would greatly facilitate the process. Is there some way to address this
problem (uncorrected data in the gazetteer)? Or do we push ahead with the
gazetteer as it is? In my mind, going ahead with it as it is will create
some additional work for those doing the georeferencing (because of the
duplications), but it will create a great deal of additional work for each
institution as errors are corrected. In our case at TTU, we will have to
go through the gazetteer (once we get the georeferenced records back),
compare all those records to the file we just spent three years updating
and update the whole thing all over again. Remember that not all of the
corrections will be simple typos or punctuation problems. We're correcting
incorrect data as well (e.g., wrong county names entered). If we could
have the opportunity to update the gazetteer with corrected data before the
process is too far along, it would help considerably.
>>> Posting number 120, dated 9 Nov 2001 15:09:14
>>> Posting number 121, dated 9 Nov 2001 15:59:31
Date: Fri, 9 Nov 2001 15:59:31 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Correction
MIME-version: 1.0
Content-type: multipart/alternative;
boundary="Boundary_(ID_spZx6thUFA8HhEMCCdkxcQ)"
Correction to my last note: I did understand that NSF money was NOT to be
used to make corrections to the databases.
Sorry for the slip.
XXXX.
>>> Posting number 122, dated 9 Nov 2001 15:13:09
Date: Fri, 9 Nov 2001 15:13:09 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Barbara R. Stein" <bstein@OZ.NET>
Subject: Re: Static Gazetteer
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
> ... We're correcting incorrect data as well (e.g., wrong county names entered). If we could have the
opportunity to update the gazetteer with corrected data before the process is too far along, it would
help considerably.
XXXXXX,
I am very sympathetic to the argument you put forth and am quite sure I would be operating out of my
league if I were to speak for John on this issue. However, I would like to offer several thoughts--
First, an encouraging thought, with the caveat that John will surely correct me if I am wrong-- The
locality ID field in your downloaded files (the one you have been warned not to alter!) will be used to
reassociate the georeferenced data with the records in your dbs--regardless of the content of those
records. So do not despair if you have corrected some of your localities since you sent John the data.
This was to be hoped for and should not present a problem. If records did have erroneous data (like a
wrong county), these will likely be difficult to georeference on the first pass and may be skipped, but
they should be easy to deal with by the home institutions once all the data are returned and we each
look for remaining unreferenced localities in our own dbs.
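The reassociation described above can be sketched as a join on the locality ID: coordinates are copied across by ID, and the locality text in the institutional record is never consulted. This is only an illustration of the idea, not John's actual procedure. The field names (`locality_id`, `locality`, `lat`, `long`) are placeholders, and the coordinates are made up.

```python
def reassociate(institution_records, georeferenced_rows, key="locality_id"):
    """Copy coordinates onto institutional records that share a locality ID.

    Other fields are left alone, so corrections made locally since the
    snapshot survive. Returns the IDs that still lack coordinates.
    """
    coords = {
        row[key]: row
        for row in georeferenced_rows
        if row.get("lat") not in (None, "") and row.get("long") not in (None, "")
    }
    unreferenced = []
    for record in institution_records:
        match = coords.get(record[key])
        if match is None:
            unreferenced.append(record[key])
            continue
        record["lat"] = match["lat"]
        record["long"] = match["long"]
    return unreferenced


# Hypothetical data: the institution fixed a typo after the snapshot was
# taken, and one locality was skipped during georeferencing.
records = [
    {"locality_id": 101, "locality": "Seattle, 20 mi N"},
    {"locality_id": 102, "locality": "Lincoln"},
]
gazetteer = [
    {"locality_id": 101, "locality": "Seatle, 20 mi N", "lat": 47.9, "long": -122.33},
    {"locality_id": 102, "locality": "Lincoln", "lat": "", "long": ""},
]
missing = reassociate(records, gazetteer)
```

The returned list is the set of localities the home institution would still need to look at after the data come back.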
Second, we have committed to quite a large project over the course of three years and it is imperative
that we start working ASAP. It is simply not possible to delay georeferencing while each collection
takes time to verify and correct its locality data. Have the majority of collections made substantive
changes/corrections to their locality data since those data were sent to John? I don't know, but I
suspect the majority has not, even though we are all continually cleaning up our data on a daily basis.
So how long do we wait? Despite the fact that you have not received your money, we are already two
months into this project. We need to begin work. It could also be argued that we should delay because
of all the new specimens that have been entered into our dbs since the data were sent to John.... At
some point we must draw the line.
What I ask is that each institution lay claim to a set of localities, that they download those data, and
then spend a bit of time examining what's really there. Begin georeferencing. Become familiar with the
process we've outlined. It may be slow going initially, but as with all new techniques, it will become
quicker and easier with practice.
I sincerely regret any misunderstandings that may have occurred. It is important to keep communicating
and I thank you for your contributions.
Best, Barbara
>>> Posting number 123, dated 9 Nov 2001 16:10:16
Date: Fri, 9 Nov 2001 16:10:16 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Barbara R. Stein" <bstein@OZ.NET>
Subject: alternative download method
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
XXXX et al.,
Beneath the "Download" button there is now an alternative option for
those who may have experienced problems. Click on the link that says
"Alternate download method is here." A text file with the data should
display in the browser window. Go to the "File" menu and select "Save
As..." to save the file on your computer. Then open Excel and import
the file.
Best,
Barbara
>>> Posting number 124, dated 15 Nov 2001 08:18:59
Date: Thu, 15 Nov 2001 08:18:59 -0800
Reply-To: bstein@oz.net
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: Barbara Stein <bstein@OZ.NET>
Subject: downloading problems solved
Dear All,
I believe that the problems some individuals were having with downloading
locality data are now solved.
For those using IE on a Mac, an alternative download button has been added with
instructions. Click to download after viewing the list of specific localities
that result from your search and you will see the alternative option beneath
the original download button.
There is also no longer a problem with downloading large numbers of records
(e.g., >8500) so I hope you will feel emboldened.
Remember, the downloaded files need to be imported into your spreadsheet of
choice before you will see the headers and the data lined up in a way that
makes sense to you. Do not attempt to simply work with the downloaded files as
is.
Lastly, the subcontract budgets have been set up and are in the hands of
Berkeley's SPO. It is up to that office to notify your SPOs that the money is
available. It is out of the MVZ's control at this point.
Best,
Barbara
>>> Posting number 125, dated 15 Nov 2001 11:09:32
>>> Posting number 126, dated 16 Nov 2001 07:38:49
Date: Fri, 16 Nov 2001 07:38:49 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Collaborative Georeferencing Theory II
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Dear all,
In this message I am responding to the discussion begun by XXXXXXXXX
on 8 Nov and continued by XXXXXXXXX. I will refer to both of their
messages herein. I realize that Barbara has already answered these points
while I was out contracting chilblains in the Patagonian wind, but it may
be a comfort to some to see the extent to which we are in agreement
without having had the benefit of communicating.
XXXXXXXX said...
[
2. Our grant submission allotted funds to each institution based on their
records to be geo-referenced. Does committing to a state/province or
region change all of this?
]
-------------------
No. Funding was based on the number (and difficulty) of the localities in
your collection that need to be georeferenced. In theory, if everyone does
the amount of georeferencing for which they were funded at the speeds we
deduced from experience, then all of the localities without coordinates
will be georeferenced under the funding we were given. In order to take
advantage of the pooling of like localities (i.e., those in the same area
on the map regardless of their source institution) we need to have people
commit to geographic areas that best suit them. Suitability includes not
only geographic areas of interest and of expertise, but also of scope. For
example, if I am institution X, given funding for 10 weeks of
georeferencing, then committing to a geographic area that will take 20
weeks to georeference may be good citizenship, but it is not good
finance. Basically, spend as many weeks on georeferencing as you are
listed for in the NSF Project Description. Details on georeferencing rates
(i.e., localities per hour for different classes of geography) were given
in the Project Implementation section of the NSF Project Description. If
you need to estimate what you are committing to in terms of time, read
that section. It will probably be worthwhile for everyone to monitor
his/her georeferencing rates. If your rates are significantly different
from those projected, send a message to the list. If you are going a lot
faster, we want to know how you're doing it. If you're going a lot slower,
maybe we can help increase your efficiency.
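The arithmetic behind "don't commit to 20 weeks when you are funded for 10" can be sketched as follows. The rates and counts below are placeholders chosen to make the sums easy; the real localities-per-hour figures are in the Project Implementation section of the NSF Project Description and are not reproduced here. The 40-hour week is also an assumption.

```python
def weeks_needed(counts, rates_per_hour, hours_per_week=40):
    """Estimate weeks of effort from locality counts and hourly rates.

    counts and rates_per_hour are dicts keyed by geographic class.
    """
    hours = sum(counts[cls] / rates_per_hour[cls] for cls in counts)
    return hours / hours_per_week


# Placeholder numbers only, not the rates from the proposal.
rates = {"domestic": 20, "foreign": 10}        # localities per hour
counts = {"domestic": 4000, "foreign": 2000}   # localities in the claimed area
estimate = weeks_needed(counts, rates)         # 200 h + 200 h over 40 h weeks
funded_weeks = 10
over_committed = estimate > funded_weeks
```

Re-running the same calculation with the rates actually observed after a week or two gives an early indication of whether a claimed area fits the funded time.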
-------------------
XXXXXXXX said...
[
3. The process has changed considerably between when our records were
downloaded for John [W.] and the ASM meeting. I thought that our records
were being submitted so that John [W.] would have a snapshot of what the
different databases looked like in order to design the Manis database.
]
-------------------
The last point is true, but it is not the only reason I gathered the
data. Following is an excerpt from the original message from Barbara Stein
asking that data be sent to John W.:
"NOTE: The data you send him will not be distributed in any way, shape,
or form; he will do nothing more than examine it and compare the structure
and general content of the files and then use this data to make the
initial global locality file that will be available for general
reference. This is extra work that is being done on MVZ's nickel, but
something we feel will keep this project on track and give you the most
bang for your buck."
At that point in time we already knew we would use a combined locality
gazetteer; it just wasn't clearly stated how we would use
it. By the time of the ASM meeting I had almost finished the gazetteer and
its purpose was more definitively stated. Following is a quote from the
ASM 2001 meeting notes:
"While John [W.] begins work on developing the network, participants will
begin georeferencing. This is why John [W.] asked for your data. From those
data he will create a combined snapshot of unique localities, which will be
used for georeferencing."
-------------------
XXXXXXX said...
[
I had planned to clear up any inconsistencies, spelling errors, etc in our
localities before we geo-referenced and downloaded to the Manis
database. This seems to make sense, since many errors in locality records
can be cleared up only with the use of in-house resources such as field
notes and catalogs. Now we are committing to a region and giving our best
opinion on perceived errors (to be noted in the Locality Annotation) to
other institutions (and ourselves!) for them to rectify (or not) at their
leisure. Since I haven't been able to download records, I don't know how
much this new scheme will save time overall or be more time consuming!
]
and XXXXXXX said...
[
Dear All: I was wondering about many of the same points that XXXX
XXXXXX mentioned in his email of 8 Nov. Especially after perusing the
gazetteer and seeing many permutations on "unique" localities. E.g.,
localities like Seattle, 20 mi N, 20 mi N of Seattle, Seattle, 20 mi
north, and north of Seattle 20 miles, have to be allowed because of
institutional style or preference. However, an entry such as Seatle, 20
mi N could be corrected. Each is a unique record to the computer and will
receive the same lat/long by georeferencers? Once georeferenced, the
permutations can be identified, but if localities are entered
differently, how much efficiency is gained by having one institution
georeference all records for a region vs having each georeference their
own records?
]
-------------------
First, it would be nice if we each had clean and consistent data in our
databases. We don't. We vary greatly in how close we are to achieving that
aim, not only in terms of the raw amount of cleaning to do, but especially in
how long it would take each of us to do it. For this reason we cannot wait
for localities to be cleaned up before we start georeferencing.
Second, NSF provided funds to georeference localities, not to clean up
existing data. Nor did our methods and time estimates in the NSF proposal
depend on "clean" localities. I agree that it would be more efficient to
georeference ALREADY clean localities, but it is faster to georeference
them as they are than it is to clean them up and then georeference them.
Third, in answer to XXXX's last question, the methods presented in our
proposal have been tested and shown to be much more efficient than the
alternative of having each institution georeference only its own
localities. Forgive my digression into a lengthy answer, but this is an
extremely important matter.
The concept of uniqueness is, as XXXX points out, defined by the
computer's ability to distinguish one locality from another. Thus, "20 mi
N of Seattle" is a different record from "Seattle, 20 mi N." Furthermore,
there might be two localities "20 mi N of Seattle", one for UWBM and one
for PSM. There are several reasons for keeping these separate, the most
obvious and important of which is to be able to identify from which
institution a locality description came. So, with the MaNIS gazetteer I've
basically given everyone a list of their unique localities, but you could
each have done that yourselves. The real purpose behind the gazetteer is
to combine localities for all institutions by geographic regions. By far
the most time-consuming aspect of georeferencing is finding places on a
map. Thus, it behooves you to assemble localities that are likely to be in
roughly the same place and then find them on a map all at once. Once you
are on the right map you can get coordinates for all of the localities in
that area. So, suppose I have downloaded localities for which the county
is "Kern." At the top of my list of localities for Kern County is one from
UWBM that says "Bakersfield, 10 mi E; Rattlesnake Grade." I see that the
named place is Bakersfield, so I filter my Kern County records to show me
only those which contain the word "Bakersfield." It turns out that in Kern
County there are 117 localities from 10 institutions that mention
"Bakersfield." I get out my map of the Bakersfield area and start looking
for "Rattlesnake Grade." I can't find it on my map right away so I'm going
to skip this locality for the moment. The next twelve localities on my
list are from six different institutions, but they all have some variation
on "3 mi E of Bakersfield." I find this location on my map once, get the
coordinates and copy them to all twelve localities that match this
place. The next locality on my list is from MVZ and it says "Bakersfield,
6 mi N, 9 mi E; Rancheria Road (Rattlesnake Grade)." Oh, so that's where
Rattlesnake Grade is - on Rancheria Road. Now I can go figure out that
first locality, which I skipped at first.
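The grouping step in the Bakersfield example can be sketched in code. What follows is a minimal illustration only, not the tool used to build the MaNIS gazetteer: the patterns, the names locality_key and group_localities, and the sample records are all assumptions made for this sketch. It reduces simple "place plus offset" descriptions to a common key so that coordinates found once can be copied to every variant, while leaving the original strings and institution codes untouched.

```python
import re
from collections import defaultdict

# Hypothetical spellings of compass directions; extend as needed.
DIRECTIONS = {"n": "N", "north": "N", "s": "S", "south": "S",
              "e": "E", "east": "E", "w": "W", "west": "W"}

NUM = r"(?P<dist>\d+(?:\.\d+)?)"
UNIT = r"(?:miles?|mi\.?)"
PATTERNS = [
    # "Seattle, 20 mi N" or "Seattle, 20 mi north"
    re.compile(rf"^(?P<place>[^,;]+),\s*{NUM}\s*{UNIT}\s+(?P<dir>[A-Za-z]+)$", re.I),
    # "20 mi N of Seattle"
    re.compile(rf"^{NUM}\s*{UNIT}\s+(?P<dir>[A-Za-z]+)\s+of\s+(?P<place>[^,;]+)$", re.I),
    # "north of Seattle 20 miles"
    re.compile(rf"^(?P<dir>[A-Za-z]+)\s+of\s+(?P<place>[^,;]+?)\s+{NUM}\s*{UNIT}$", re.I),
]


def locality_key(text):
    """Return (place, miles, direction) for a simple offset locality, else None."""
    cleaned = " ".join(text.split())  # collapse runs of whitespace
    for pattern in PATTERNS:
        match = pattern.match(cleaned)
        if match:
            direction = DIRECTIONS.get(match.group("dir").lower())
            if direction is None:
                continue  # not a recognised compass direction
            return (match.group("place").strip().lower(),
                    float(match.group("dist")), direction)
    return None


def group_localities(records):
    """Group (institution, locality) pairs by key; keep unparsed ones aside."""
    groups = defaultdict(list)
    unparsed = []
    for institution, text in records:
        key = locality_key(text)
        if key is None:
            unparsed.append((institution, text))
        else:
            groups[key].append((institution, text))
    return dict(groups), unparsed


records = [
    ("UWBM", "Seattle, 20 mi N"),
    ("PSM", "20 mi N of Seattle"),
    ("MVZ", "north of Seattle 20 miles"),
    ("UWBM", "Seatle, 20 mi N"),  # typo stays in its own group
    ("MVZ", "Bakersfield, 6 mi N, 9 mi E; Rancheria Road (Rattlesnake Grade)"),
]
groups, unparsed = group_localities(records)
for key, members in groups.items():
    print(key, len(members))
print("left for manual review:", len(unparsed))
```

In this sketch the misspelled "Seatle" record lands in its own group and the compound Rancheria Road locality is left for manual review, which matches the point made in the thread: the computer treats each string as unique, and a person still has to decide which ones are really the same place.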
So, to answer XXXX's last question again, there are multiple ways in which
the combined localities aid in the overall efficiency of the
georeferencing process. From the illustrative example above, only the MVZ
had to possess the Kern County map; nobody had to go out and buy one. Only
one person had to find Bakersfield on a map, rather than one person from
each of the ten institutions that had localities from that area. It was
possible to find Rattlesnake Grade for all localities that mentioned it,
not just for the one that also happened to locate it on Rancheria Road. It
might not otherwise have been possible to georeference this locality or
maybe the error would have been much greater than it needed to be. The
single locality 3 mi E of Bakersfield could be found and measured once and
the results copied to all twelve localities that were really the same
place. While the foregoing is all well and good in theory, empirical
testing at the MVZ backs it up with hard numbers. Georeferencing rates
doubled when localities from three collections were combined versus when
they were done separately. Further increasing the number of collections
will result in even greater efficiency.
Now let me go back and address part of XXXX's comment that I have
neglected thus far.
XXXXXXXX said...
[
"Now we are committing to a region and giving our best opinion on
perceived errors (to be noted in the Locality Annotation) to other
institutions (and ourselves!) for them to rectify (or not) at their
leisure."
]
-------------------
I'm not sure what XXXX's point is here, but I'll try to explain the
Locality Annotation again. Locality Annotation is one of the fields in the
downloaded locality data. This field is provided as a courtesy to alert
the institution that provided a locality that there is something
inconsistent about it. It's not meant to be filled with opinions on
perceived errors, it is meant to note definitive inconsistencies. For
example, if I get a locality in the downloaded file for Inyo County that
says "Bakersfield", then there is a problem with the locality. It's not an
opinion, and it isn't a perceived error; it is simply true that
Bakersfield is not in Inyo County. It's up to me as the georeferencer to
decide whether this is enough of a problem to not georeference the
locality. In this particular case I could either choose to georeference
the locality, because I know that Bakersfield is in Kern County, or I
could choose not to georeference it simply because I'm doing Inyo County
and Bakersfield is out of my "jurisdiction." If I took the latter
option, it would not be because I'm a stickler for boundaries; it's just
that I'd have to go get another map and that would waste time. It might be
better to leave some inconsistent localities until later. Nevertheless,
since I've spent the energy to figure out that there is a problem with the
locality, I might as well extend the courtesy of noting what the problem
is. It'll save time for someone else later on. It is this philosophy that
led me to include the NoGeorefBecause field in the download as well. If
I'm able to determine that a locality cannot be georeferenced, I might as
well say so, and why, so that the next person who sees that this locality
doesn't have coordinates will not bother to try to determine them.
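The Inyo County example can be expressed as a small consistency check. This is a sketch under stated assumptions, not part of the MaNIS download template: the PLACE_COUNTY lookup table and the function name county_annotation are hypothetical, and the naive substring match would need a real gazetteer and better place-name matching in practice. It only shows the kind of definitive inconsistency that belongs in the Locality Annotation field.

```python
# Hypothetical lookup of named places to counties; a real check would use
# a full gazetteer rather than this two-entry table.
PLACE_COUNTY = {"bakersfield": "Kern", "bishop": "Inyo"}


def county_annotation(locality_text, stated_county, place_county=PLACE_COUNTY):
    """Return a note for each known place that lies outside the stated county."""
    notes = []
    text = locality_text.lower()
    for place, county in place_county.items():
        # Naive substring match: an assumption made to keep the sketch short.
        if place in text and county.lower() != stated_county.lower():
            notes.append(
                f"{place.title()} is in {county} County, not {stated_county} County"
            )
    return "; ".join(notes)


print(county_annotation("Bakersfield", "Inyo"))
print(repr(county_annotation("Bakersfield, 10 mi E; Rattlesnake Grade", "Kern")))
```

An empty result means no inconsistency was detected by this check; it does not mean the locality is correct.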
-------------------
XXXXXXXX said...
[
4. There are many localities that are designated unique that simply
differ in syntax, spelling, etc. They are not necessarily next to each
other. Would editing our own version of the database first for these
errors and then downloading them into the Manis database work?
]
-------------------
Yes. In theory it could work, but it is not practical. In addition to the
reasons I gave above, this kind of activity would take a great deal of my
time, which I hope you would agree could be better spent on other things.
-------------------
XXXXXXX said...
[
In addition when a typo like Seatle is corrected, it no longer is unique
but of the same set as the correct spelling. The typos will be deleted
from the static gazetteer after determining that they were corrected in
the institutional database (see comment from Barbara below)? It is unclear
to me how corrections in institutional databases will be mirrored in the
static gazetteer.
The comment from Barbara was...
[
...
Additional notes: 1) This gazetteer is a static snapshot of your data
compiled for the sole purpose of georeferencing unique localities.
Corrections to specific localities should be made directly in
institutional databases. They will not be made in the gazetteer so
don't spend time fixing them in the downloaded files.
...
]
-------------------
XXXX's question is well founded. I have nowhere yet described what will
happen to the georeferenced localities. I'll try now to clear up this part
of the grand scheme. I've already explained that I would like the
georeferenced localities to be sent back to me so that I can proof them,
load them back into the gazetteer, and keep a running status of the
georeferencing aspect of the project. In principle, you could download
sets of georeferenced localities for your institution at any time and load
them into your own database. But that isn't the most efficient way to go
about the problem. It would be better to wait until all georeferencing is
done, then download all localities for your institution and create the
lat_long records for them all at once, with my help, if necessary. Note
that I am not explaining how to create the lat_long records or how to
incorporate them in your database. The reason is that (almost) everyone's
database structure is different from everyone else's, so there is no one
single solution to fit all. That's why I offer my help to get these data
back into your databases, but I can only afford to do it one time for each
institution that needs it.
Now back to XXXX's question. Changes in your databases will not be
mirrored in the static gazetteer. There will be no changes whatsoever to
localities in the static gazetteer, as per Barbara's additional notes. If
you correct typographical errors in your database it will not affect the
georeferencing process. If you make a substantive change to a locality
(one that would affect how the locality is georeferenced), then there will
be an easily discernible discrepancy that can be resolved at the time when
lat_longs are incorporated into your database. Nevertheless, the more
changes you make to your localities during the georeferencing period, the
more work you will potentially create for yourself later.
-------------------
XXXXXX said...
[
Although the idea of compiling a static gazetteer of unique localities
seemed like a good idea at the beginning, it does not seem doable at this
point. I would prefer to go back to the original plan of each institution
dealing with their own records and offering assistance to others as
needed. Once georeferencing is started and we get $ for the servers, the
gazetteer could be produced dynamically, or at least by frequent uploads -
rather than statically - and can be consulted, updated, corrected,
winnowed as needed.
]
I hope I've done something to counter the above sentiment. Let me add
another note about the static gazetteer. It is an interim tool intended to
help us divide up the georeferencing responsibilities and to monitor
georeferencing progress. Your databases are not static. Yet, to function
effectively, we need a fixed target. The real end product of this endeavor
will include a dynamic gazetteer that will be drawn from the
continually-updated locality data contained in the participating
databases. At that point, when you add new data, or change existing data,
it will be reflected in the dynamic gazetteer without intervention.
I hope this clarifies the reasoning behind our approach to
georeferencing. Considerable thought and effort have gone into
establishing and testing the methods set forth here and elsewhere in the
MaNIS documents. Barbara and I remain convinced that this is the most
reasonable approach to an otherwise daunting task.
John W.
>>> Posting number 127, dated 16 Nov 2001 11:51:55
Date: Fri, 16 Nov 2001 11:51:55 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Collaborative Georeferencing Theory II
In-Reply-To: <Pine.GSO.4.21.0111160737280.29268-100000@socrates.Berkeley.EDU>
Mime-version: 1.0
Content-type: text/plain; charset="US-ASCII"
Content-transfer-encoding: 7bit
John,
My intention was never to clean up our locality data with geo-referencing
funds! I was operating on the assumption that we would be responsible for
our own data and therefore it would have been worthwhile to clean it up on
our own dime before geo-referencing. Which gets to another question. I
have cleaned up localities in our database since downloading it to you. Is
this going to cause problems in downloading the newly geo-referenced
localities from MANIS into our current database? Can I continue to clean up
our own database? Did I understand you correctly when you said to leave
localities that have lat/long alone? The reason I ask is that I noticed
that when you transferred our lat/long to the Manis database, the minutes
were incorrectly interpreted as decimal degrees. Should I worry about this?
Will we have to change our database to accept decimal degrees? I appreciate
your thorough responses. I am trying to clarify and simplify our tasks.
That is my bottom line.
Cheers, XXXX
PS I didn't put this on the site, because I am seeking clarity not a debate.
>>> Posting number 128, dated 16 Nov 2001 15:59:44
>>> Posting number 129, dated 17 Nov 2001 12:55:06
Date: Sat, 17 Nov 2001 12:55:06 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Collaborative Georeferencing Theory II
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Dear All: Thanks to John W. for the overview and examples. In summary, we
are georeferencing unique geographical entries rather than unique
localities. Unique can be a function of geography, institutional acronym,
syntax, typos, punctuation and errors. The goal is clearer.
XXXXXX
>>> Posting number 130, dated 17 Nov 2001 12:55:48
Date: Sat, 17 Nov 2001 12:55:48 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Barbara R. Stein" <bstein@OZ.NET>
Subject: Re: Questions about Georeferencing
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
XXXXXXXX wrote:
> Thanks for all of the great georeferencing information, steps, and
> guidelines! XXXXXXX and I have been familiarizing ourselves with the
> guidelines, steps and very helpful weblinks. We downloaded the Ingham
> County (Michigan) records into the Access template, and I feel that this
> county is a comfortable starting place for us (it is our institution's
> county).
Go for it! Starting is half the battle.
> Before we begin, I would appreciate clarification on a couple of items.
> Thank you for your time.
As always, I will provide my thoughts and John will weigh in when he's next
online.
> 1) Is it okay to use available "online" latitude and longitude
> coordinates, as long as Datum information, etc. are available?
Yes. Just make sure you specify the source of those coordinates in the
designated field on your spreadsheet.
> For example, the Township, Range, Section Information website
> (http://www.esg.montana.edu/gl/trs-data.html.) that is listed in the MaNis
> Georeferencing Guidelines has links whereby one can search for a named
> place, and the decimal degrees coordinates (to four decimal places) come up
> for that place (example, City of Mason, Michigan). Is it okay to use such
> on-line coordinates for georeferencing place names, or should all
> georeferencing be done with "hard copy" references?
We encourage you to take advantage of all available tools, that's why we
provided those URLs. There may be others as well. Just make sure your sources
are credible.
> 2) If the answer to the above question is that all georeferencing should
> be done with "hard copy" references, then ignore this one.
>
> A related question to 1): from the same website mentioned above, one can
> link to "TerraServer" and get (really interesting) aerial photos of places.
> With the aid of a labelled map, one can zoom in and find specific
> buildings (such as the Michigan State University Swine Barn - a real Ingham
> County example). From a zoomed aerial image, you can click on "Image Info"
> and get lat and long (non-decimal) coordinates for "tiles" (corners of
> squares) surrounding the image. Datum information is included in "Image
> Info".
>
> So my question is, is it okay for us to use these types of on-line aerial
> images for georeferencing?
I'm including this question just for completeness. The answer is, of course,
yes. And remember, do not worry about the type of coordinate data you record.
The error calculator will be able to convert data provided in any format (e.g.,
deg, min, sec; dec. degrees; etc.) into any other format. Knowing the datum,
providing the source of your coordinates, and noting any assumptions you have
made in assigning those coordinates are what's crucial.
> 3) With regard to the "DeterminedDate" data field in the download file -
> is there a specific format for the date data (i.e. MM/DD/YYYY or DD Month
> Spelled Out YYYY) that you would like us to use?
No, because most spreadsheet programs will dictate a format. It seemed
worthless for us to specify one. John will have to deal with that variety
later.
Best,
Barbara
>>> Posting number 131, dated 17 Nov 2001 14:01:10
Date: Sat, 17 Nov 2001 14:01:10 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: download of GNIS dataset
Comments:
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
GNIS locality datasets for states can be downloaded from:
http://mapping.usgs.gov/www/gnis/gnisftp.html
The dataset for Washington consisted of 32K+ localities and Oregon had
50K+. Both loaded into Excel without problems (after unzipping), and
provide a good start on an authority file for locations + lat/longs. I
wish I had it back when we originally entered our data. Locations can be
found with a search or scrolling in Excel, or by loading into a database
program. As long as you don't need a map, lookup on the downloaded file is
faster than via the GNIS webpage. The downloaded file also has lat/longs
as decimals, which don't appear to be accessible on the GNIS webpage.
These can be entered into two fields of MaNIS with a copy/paste rather than
parsing or typing the dddmmss + direction string into the eight fields
required for ddd, mm, entry.
>>> Posting number 132, dated 19 Nov 2001 07:53:03
>>> Posting number 133, dated 20 Nov 2001 10:41:10
>>> Posting number 134, dated 20 Nov 2001 10:57:38
>>> Posting number 135, dated 20 Nov 2001 18:52:31
Date: Tue, 20 Nov 2001 18:52:31 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Vertical Datum?
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Dear Barbara,
Thanks for your reply to my earlier message. I have another question for
both you and John:
Do we need to note the "Vertical Datum" if one is provided on a map source?
One of the Michigan USGS maps that I looked at this week had the following:
Horizontal Datum: NAD1927
Vertical Datum: NGVD 1929
Also, it looks like we'll be using Topozone
(www.topozone.com/findplace.asp) for georeferencing some of the Michigan
localities (just point the cursor anywhere on the map and the coordinates
of choice [UTM, lat-long decimal, or lat-long degrees and minutes] appear
on the lower part of the screen).
XXXXXXXX
>>> Posting number 136, dated 23 Nov 2001 10:20:39
Date: Fri, 23 Nov 2001 10:20:39 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Collaborative Georeferencing Theory II
In-Reply-To: <B81AAE5B.EB7%jrozdil@u.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
On Fri, 16 Nov 2001, John Rozdilsky wrote:
> John,
>
> My intention was never to clean up our locality data with geo-referencing
> funds! I was operating on the assumption that we would be responsible for
> our own data and therefore it would have been worthwhile to clean it up on
> our own dime before geo-referencing. Which gets to another question. I
> have cleaned up localities in our database since downloading it to you. Is
> this going to cause problems in downloading the newly geo-referenced
> localities from MANIS into our current database? Can I continue to clean up
> our own database?
XXXX and all,
There has been some confusion with respect to localities,
lat_longs, higher geographies, and the means by which data get back into
your local databases. I have neglected the discussion so far in favor of
getting people working, but clearly there is a great deal of anticipation
on the subject. I'll explain this stuff in detail on my trip into town
next week.
In the meantime, continue as you were. If you are in the midst of cleaning
up locality data and have a good reason to continue doing so at the
moment, go ahead. If you weren't cleaning up locality data, don't do so
for the sake of MaNIS.
>Did I understand you correctly when you said to leave
> localities that have lat/long alone? The reason I ask is that I noticed
> that when you transferred our lat/long to the Manis database, the minutes
> were incorrectly interpreted as decimal degrees. Should I worry about this?
It seems I have misinterpreted your latitude and longitude data, is that
correct? The original data should be ddmmss, not dd.dddd? Is this true of
all lat_long entries? If so, then I need to update the gazetteer with the
correct data. I can do this from here in Argentina, but I'll have to do it
the next time I come to town. You were right to worry about this. Even
though we don't have to georeference those localities that already have
coordinates (at least not in the first pass), we do want to be able to use
them for reference, so they should be made correct. It's probably a good
idea if every institution that provided some lat_long data do a little bit
of double checking to see if I've made the correct interpretation of your
data. If I made one mistake, I certainly am capable of making others.
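The difference between the two readings is easy to check numerically. The sketch below is illustrative only; the function names dms_to_decimal and parse_packed_dms are hypothetical, and the packed "ddmmss" or "dddmmss" layout is an assumption based on the formats mentioned in this thread. It shows why 47 degrees 36 minutes must not be stored as 47.36.

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees, minutes, seconds plus hemisphere to signed decimal degrees."""
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in the range [0, 60)")
    value = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    # South and West are negative by convention.
    return -value if hemisphere.upper() in ("S", "W") else value


def parse_packed_dms(packed, hemisphere):
    """Parse a packed 'ddmmss' or 'dddmmss' string (assumed layout)."""
    digits = packed.strip()
    if not digits.isdigit() or len(digits) not in (6, 7):
        raise ValueError(f"expected 6 or 7 digits, got {packed!r}")
    seconds = int(digits[-2:])
    minutes = int(digits[-4:-2])
    degrees = int(digits[:-4])
    return dms_to_decimal(degrees, minutes, seconds, hemisphere)


correct = dms_to_decimal(47, 36)  # 47 degrees 36 minutes
misread = 47.36                   # the same digits read as decimal degrees
print(correct, misread, correct - misread)
```

Here the two readings differ by about a quarter of a degree of latitude, far more than any reasonable georeferencing error, so it is worth checking a few known localities against a map after any bulk transfer.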
> Will we have to change our database to accept decimal degrees? I appreciate
> your thorough responses. I am trying to clarify and simplify our tasks.
> That is my bottom line.
You will not have to make changes in your database to accept decimal
degrees. You can use whatever coordinate system you like locally, and I
can give you your data in that format when it comes time to download data
from the gazetteer into your database.
For better or worse, I have been trying to simplify explanations -
sometimes at the expense of explaining the complete plan. I guess it's
turning out OK though, because all of the right questions are being asked.
John W.
>>> Posting number 137, dated 23 Nov 2001 10:24:34
Date: Fri, 23 Nov 2001 10:24:34 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Questions about Georeferencing
In-Reply-To: <3BF6CED4.5296BDB@oz.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Dear All,
Barbara has answered everything below perfectly well. I'm just "weighing
in" to say so.
On Sat, 17 Nov 2001, Barbara R. Stein wrote:
> XXXXXXXX wrote:
>
> > Thanks for all of the great georeferencing information, steps, and
> > guidelines! XXXXXXX and I have been familiarizing ourselves with the
> > guidelines, steps and very helpful weblinks. We downloaded the Ingham
> > County (Michigan) records into the Access template, and I feel that this
> > county is a comfortable starting place for us (it is our institution's
> > county).
>
> Go for it! Starting is half the battle.
>
> > Before we begin, I would appreciate clarification on a couple of items.
> > Thank you for your time.
>
> As always, I will provide my thoughts and John will weigh in when he's next
> online.
>
> > 1) Is it okay to use available "online" latitude and longitude
> > coordinates, as long as Datum information, etc. are available?
>
> Yes. Just make sure you specify the source of those coordinates in the
> designated field on your spreadsheet.
>
> > For example, the Township, Range, Section Information website
> > (http://www.esg.montana.edu/gl/trs-data.html.) that is listed in the MaNis
> > Georeferencing Guidelines has links whereby one can search for a named
> > place, and the decimal degrees coordinates (to four decimal places) come up
> > for that place (example, City of Mason, Michigan). Is it okay to use such
> > on-line coordinates for georeferencing place names, or should all
> > georeferencing be done with "hard copy" references?
>
> We encourage you to take advantage of all available tools, that's why we
> provided those URLs. There may be others as well. Just make sure your sources
> are credible.
>
> > 2) If the answer to the above question is that all georeferencing should
> > be done with "hard copy" references, then ignore this one.
> >
> > A related question to 1): from the same website mentioned above, one can
> > link to "TerraServer" and get (really interesting) aerial photos of places.
> > With the aid of a labelled map, one can zoom in and find specific
> > buildings (such as the Michigan State University Swine Barn - a real Ingham
> > County example). From a zoomed aerial image, you can click on "Image Info"
> > and get lat and long (non-decimal) coordinates for "tiles" (corners of
> > squares) surrounding the image. Datum information is included in "Image
> > Info".
> >
> > So my question is, is it okay for us to use these types of on-line aerial
> > images for georeferencing?
>
> I'm including this question just for completeness. The answer is, of course,
> yes. And remember, do not worry about the type of coordinate data you record.
> The error calculator will be able to convert data provided in any format (e.g.,
> deg, min, sec; dec. degrees; etc.) into any other format. Knowing the datum,
> providing the source of your coordinates, and noting any assumptions you have
> made in assigning those coordinates are what's crucial.
>
> > 3) With regard to the "DeterminedDate" data field in the download file -
> > is there a specific format for the date data (i.e. MM/DD/YYYY or DD Month
> > Spelled Out YYYY) that you would like us to use?
>
> No, because most spreadsheet programs will dictate a format. It seemed
> worthless for us to specify one. John will have to deal with that variety
> later.
>
> Best,
> Barbara
>
>>> Posting number 138, dated 23 Nov 2001 10:27:35
Date: Fri, 23 Nov 2001 10:27:35 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Vieglias routine (fwd)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
XXXX and all,
Don't confuse the lat_long determination with the error determination. You
can get the lat_long without the extents, but you need to use the extents
as one of the sources of uncertainty - which contributes to the maximum
error distance, but does not affect the lat_long itself.
The guidelines do allow the distance bearing computation to be made from
GNIS coordinates, and I agree, it would be a crime not to use those data.
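For readers who want to try the distance and bearing computation, here is a minimal sketch. It is not the MaNIS tool or the method prescribed in the guidelines: it assumes a spherical Earth with a mean radius of 6371.0088 km, the function name offset_point and the BEARINGS table are hypothetical, and the starting coordinates are illustrative rather than taken from GNIS. It returns only the lat_long; it says nothing about the extent of the named place or the maximum error distance.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation
KM_PER_MILE = 1.609344

# Hypothetical mapping from compass directions to bearings in degrees.
BEARINGS = {"N": 0.0, "NE": 45.0, "E": 90.0, "SE": 135.0,
            "S": 180.0, "SW": 225.0, "W": 270.0, "NW": 315.0}


def offset_point(lat, lon, distance_miles, bearing_deg):
    """Return (lat, lon) reached by travelling a distance along a bearing."""
    lat1 = math.radians(lat)
    lon1 = math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_miles * KM_PER_MILE / EARTH_RADIUS_KM  # angular distance

    sin_lat2 = (math.sin(lat1) * math.cos(delta)
                + math.cos(lat1) * math.sin(delta) * math.cos(theta))
    sin_lat2 = max(-1.0, min(1.0, sin_lat2))  # guard against rounding error
    lat2 = math.asin(sin_lat2)
    lon2 = lon1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(lat1),
        math.cos(delta) - math.sin(lat1) * sin_lat2,
    )
    # Normalise longitude to the range [-180, 180).
    lon2_deg = (math.degrees(lon2) + 540.0) % 360.0 - 180.0
    return math.degrees(lat2), lon2_deg


# "20 mi N" of an illustrative starting point (not a GNIS value).
print(offset_point(47.6062, -122.3321, 20, BEARINGS["N"]))
```

Whatever datum the source coordinates use carries over to the result unchanged, so the datum and the source still need to be recorded alongside the computed point.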
I would very much like to provide the tool that can parse the localities
and calculate the lat_longs from any gazetteer. In February I'll likely be
collaborating with the Alexandria Digital Library Project to do just that.
I am currently awaiting the development of a protocol to communicate with
their Digital Gazetteer.
There are really two tools that would be nice. I've already mentioned the
first one, which would be based on Dave Vieglais' SPPFind tool, which I
have not yet tested. The second is the error calculator, which is
referenced in the MaNIS web pages, but is not yet functional. I've
finished the Error Calculator Tool except for the datum error
contributions and testing. I would like to suggest that charging ahead on
the lat_long determinations is fine, but leave off the error stuff until
the tool is ready for prime-time. That error stuff is just too burdensome
to do by hand. Doing one pass for lat_longs and one for errors might
actually be more efficient, but we'll need evidence "from the trenches"
to figure out if this is true.
John W.
---------- Forwarded message ----------
Date: Sat, 17 Nov 2001 13:21:47 -0800
From:
To: tuco@socrates.Berkeley.EDU
Cc: bstein@oz.net
Subject: Vieglias routine
John W. So much for theory. On a more practical matter, the rules indicate
that "If the [SpecLoc] description includes an offset, use the furthest
extent of the named place in the direction of the offset." So we should
NOT compute terminal lat/longs from the GNIS lat/longs and bearing? I ask
because GNIS locs don't appear to take into account the furthest extent of
the named place. Related, should we wait for the georeferencing tool
mentioned in the 10/18/01 email or just charge ahead? I assume it was to
take GNIS locs and try to match them with occurrences in the MaNIS file
(from project description), then compute terminal lat/longs based on
distance and bearing. Modifying the rules to allow the distance-bearing
computation based on GNIS lat/long would really increase georeferencing
rate, and as long as the technique was referenced, I don't see a problem.
>>> Posting number 139, dated 23 Nov 2001 10:29:58
Date: Fri, 23 Nov 2001 10:29:58 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Vertical Datum?
In-Reply-To: <3.0.32.20011120185230.00718380@pilot.msu.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
XXXXX and all,
The Vertical Datum refers to the geometric model from which elevations are
determined. In our data we consider altitude (or elevation) as an
attribute of the locality, not as an attribute of the position. Or, to say
it another way, when we record positions digitally, we include latitude,
longitude, and horizontal datum, but we do not include elevation and
vertical datum. In short, we treat elevation as a part of the locality,
so we do not need to consider the vertical datum since it has no bearing
on our georeferencing.
Note, unless I am mistaken there is no way to know the datum when using
Topozone. Someone please correct me if I'm wrong. This isn't really a big
problem as long as the error is calculated with an unknown datum.
John W.
On Tue, 20 Nov 2001, XXXXXXXXXX wrote:
> Dear Barbara,
>
> Thanks for your reply to my earlier message. I have another question for
> both you and John:
>
> Do we need to note the "Vertical Datum" if one is provided on a map source?
> One of the Michigan USGS maps that I looked at this week had the following:
> Horizontal Datum: NAD1927
> Vertical Datum: NGVD 1929
>
> Also, it looks like we'll be using Topozone
> (www.topozone.com/findplace.asp) for georeferencing some of the Michigan
> localities (just point the cursor anywhere on the map and the coordinates
> of choice [UTM, lat-long decimal, or lat-long degrees and minutes] appear
> on the lower part of the screen).
>
> Thanks,
> XXXXX
>
>
>>> Posting number 140, dated 26 Nov 2001 10:20:20
Date: Mon, 26 Nov 2001 10:20:20 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Topozone - Datum
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Hi John,
According to the topozone website (address below), it appears that the
given coordinates are based on NAD27. (This is listed next to the
coordinate buttons - UTM, DecLatLong, etc. - on the website).
Please let me know if you have other information about this.
Thanks,
XXXXX
>Note, unless I am mistaken there is no way to know the datum when using
>Topozone. Someone please correct me if I'm wrong. This isn't really a big
>problem as long as the error is calculated with an unknown datum.
>
>John W.
>
> Also, it looks like we'll be using Topozone
> (www.topozone.com/findplace.asp) for georeferencing some of the Michigan
> localities (just point the cursor anywhere on the map and the coordinates
> of choice [UTM, lat-long decimal, or lat-long degrees and minutes] appear
> on the lower part of the screen).
>
>>> Posting number 141, dated 3 Dec 2001 05:59:15
Date: Mon, 3 Dec 2001 05:59:15 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Loading Lat_Longs back into databases
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Dear All,
Last week I promised a message about the relationship between the
gazetteer and your databases - the bigger picture.
We've already talked about the static nature of the current MaNIS
gazetteer. As I've said, the gazetteer in its current form is a temporary
tool to aid in collaborative georeferencing. Once the network gets going
there will be a dynamic gazetteer as described in the NSF proposal.
Because our "snapshot" data are static and our databases are not, the
differences between the two will increase over time, especially for those
who are specifically editing locality-related data. I guess that when
people made this realization it caused some concern.
I designed the gazetteer with the issue of changing data in mind, and I've
done a few things to aid in data reconciliation when the lat_longs get
loaded back into your databases. For example, I've stored much more
information in the MaNIS gazetteer than is visible in the online
interface, including information that relates the localities (and
therefore the lat_longs) back to the specimens themselves. The structure
of the gazetteer may be of interest, so I will post the Gazetteer
Entity-Relationship diagram as a document on the MaNIS website when I get
back to civilization.
Since I stored all of the original locality-related information along with
the references to the specimens, it will be possible (when the time comes
to load lat_long information into your databases) to compare the snapshot
locality data with the then-current locality data. For all of those
localities where there has been no change, the lat_long data can be loaded
without question. This first step should take care of most records for
most institutions. For the rest of the records, where the locality data no
longer exactly match the snapshot data, some analyses can be done to
determine if the differences can be considered "substantive," by which I
mean that they would affect the determination of the lat_long. For
example, a snapshot locality that is the same as the then-current locality
except that an elevation has been added can be considered as not
substantively changed and can therefore have its lat_long record loaded.
This step will be a little different for each institution. After doing
some bulk checking for differences such as in the foregoing example, I
envision making one visual pass over the remaining records, with the
original and the then-current localities side-by-side, putting a checkmark
in a column called "substantive" for those records that have had
substantive changes. When that pass has been made, all of the lat_longs
for records without a checkmark can be loaded. This third step should take
care of most of the remainder of the localities. What's left will be
locality-specimen relationships that have changed since the time when the
snapshot was taken. These records will have to be resolved by the
individual institutions.
There are some tricks and techniques I haven't presented yet, but I hope
that what I've written above helps to clarify the bigger picture with
respect to georeferencing. Questions have proven useful thus far, so if
there's anything else about which you'd care to have me elaborate, please
ask.
In the spirit of looking forward, another thing to think about for the
future is the incorporation of the coordinates and metadata into your own
local databases. Some institutions don't have attributes in their
databases to hold lat_long information. Similarly, not everyone (but there
are some!) has an attribute to accommodate maximum error distance. It would
be a shame to throw away all of this hard-earned and valuable data. At
this point I'm asking you to consider the ramifications of storing these
data so that there are no unpleasant surprises when the time comes to load
the data back into your databases.
John W.
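The reconciliation passes described in this message can be sketched roughly as below. This is only an illustration under stated assumptions: the record layout (a dict of locality fields keyed by a locality id), the field name "elevation", and the function name are all hypothetical and are not taken from the MaNIS gazetteer schema. Unchanged localities, and localities that differ only in fields judged non-substantive, are marked for loading; everything else is set aside for the visual pass.

```python
def reconcile(snapshot, current, ignorable_fields=("elevation",)):
    """Split snapshot localities into those safe to load and those needing review.

    snapshot, current: dicts mapping a locality id to a dict of locality fields.
    ignorable_fields: fields whose changes are treated as non-substantive
    (hypothetical example: an elevation added after the snapshot was taken).
    Returns (load, review, missing) as lists of locality ids.
    """
    def strip(record):
        # Drop the fields that do not affect the lat_long determination.
        return {k: v for k, v in record.items() if k not in ignorable_fields}

    load, review, missing = [], [], []
    for loc_id, snap in snapshot.items():
        cur = current.get(loc_id)
        if cur is None:
            missing.append(loc_id)      # relationship changed; institution resolves
        elif cur == snap or strip(cur) == strip(snap):
            load.append(loc_id)         # identical, or only non-substantive changes
        else:
            review.append(loc_id)       # needs the side-by-side visual pass
    return load, review, missing
```

The list of ignorable fields would differ for each institution, which matches the remark above that this step will be a little different for each one.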
>>> Posting number 142, dated 7 Dec 2001 07:24:45
Date: Fri, 7 Dec 2001 07:24:45 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing Guide Revisions
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Dear All,
While working through the development of the Georeferencing Calculator I
discovered minor numerical and typographical errors in the Georeferencing
Guidelines document. This message is just to alert you that I have made
revisions to that document. One particular change worth noting is in the
section on "Uncertainty associated with coordinate precision." It seemed
to me quite reasonable to assume that the coordinate precision should be
the same for both coordinates, and so I've rewritten that section to
reflect this assumption.
I've also added some calculation examples against which you might test
your understanding of the georeferencing concepts.
One detail of reading the datum error from a file eludes me at the
moment. It is the last remaining issue before the Georeferencing
Calculator becomes available.
John W.
>>> Posting number 143, dated 10 Dec 2001 13:58:27
Date: Mon, 10 Dec 2001 13:58:27 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Information from Topozone - NAD 27 Datum
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Dear All,
John asked that I follow up with staff at Topozone
(www.topozone.com/findplace.asp) with regard to datum information on their
website's scanned maps (see previous message exchanges copied below).
Here is what I found out:
1. For USGS QUAD MAPS (1:24,000 or 1:25,000): the vast majority of these
original scanned maps on the Topozone website are based on the NAD 27. If
any underlying Quad map was originally based on another datum (such as NAD
83 for example), Topozone has REPROJECTED that map into NAD 27.
2. Thus, the Topozone cursor coordinates as well as the underlying Quad
map (whether original or reprojected) are ALWAYS in NAD 27.
3. It was confirmed that all original MICHIGAN QUAD maps that were scanned
for the Topozone website are NAD 27.
John, please let us know if it is okay for us to list NAD 27 as the datum
instead of "Datum Unknown" for locality coordinates taken from the Topozone
website.
Thanks,
XXXXXX
>>> Posting number 144, dated 14 Dec 2001 15:31:53
>>> Posting number 145, dated 16 Dec 2001 11:40:45
Date: Sun, 16 Dec 2001 11:40:45 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Information from Topozone - NAD 27 Datum
In-Reply-To: <3.0.32.20011210135816.00717590@pilot.msu.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Thanks XXXXX, this is most excellent. We can use Topozone coordinates with
NAD27 recorded. They have no idea how big a favor they have done for
us. Everyone please list NAD27 with any coordinates derived from Topozone
and remember to record the Reference_Source as "Topozone 1:24000" or the
like.
>>> Posting number 146, dated 3 Jan 2002 10:14:21
Date: Thu, 3 Jan 2002 10:14:21 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: number of decimals on decimal degrees
Mime-Version: 1.0
Content-Type: text/plain; format=flowed
MaNIS: How many decimals are folks attaching to lat/long determinations?
I'm going with four on decimal degrees even though this is more than is
justified by the offset distances to the nearest mile or fractional mile.
As I understand it, John W's error calculator will attach the correct error
to lat/long determinations based on the offset direction(s), distance and
units. Sorry if I missed this in previous discussions?
>>> Posting number 147, dated 7 Jan 2002 09:46:27
Date: Mon, 7 Jan 2002 09:46:27 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: number of decimals on decimal degrees
In-Reply-To: <F100rz71znUp8acXUgZ000178f6@hotmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Hi Folks,
I'm back. Argentina started rioting when I left for Chile. I won't claim
that my leaving was the cause.
Anyway, my recommendation is to store as many decimal places as your
source gives you and not to confuse those digits with accuracy or precision
- that's why we're using the explicit maximum error distance. I would
certainly caution that to use fewer digits is to introduce extra,
unwarranted errors. Refer to the table in the Georeferencing Guide at
http://elib.cs.berkeley.edu/manis/GeorefGuide.html to see the magnitude of
these errors. If you use 5 digits in a decimal degree coordinate, the error
will be on the same order of magnitude as that for most of today's accurate
GPS readings. The error calculator will also take into account the
precision of the recorded coordinates when calculating maximum error distances.
>MaNIS: How many decimals are folks attaching to lat/long determinations?
>I'm going with four on decimal degrees even though this is more than is
>justified by the offset distances to the nearest mile or fractional mile.
>As I understand it, John W's error calculator will attach the correct error
>to lat/long determinations based on the offset direction(s), distance and
>units. Sorry if I missed this in previous discussions?
>
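To give a feel for the magnitudes involved, here is a rough sketch of the ground distance implied by the last recorded decimal place of a decimal degree coordinate. It assumes a spherical Earth (mean radius 6371 km) and the same precision for both coordinates, and it is not the table from the Georeferencing Guide nor the Error Calculator's own code; the function name is hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation (assumption)

def precision_uncertainty_m(latitude_deg, decimal_places):
    """Approximate ground distance, in meters, spanned by one unit in the
    last recorded decimal place of both latitude and longitude."""
    step_deg = 10.0 ** (-decimal_places)
    north_south = math.radians(step_deg) * EARTH_RADIUS_M
    # Meridians converge toward the poles, so east-west distance shrinks.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    # Combine the two components as the diagonal of the resulting cell.
    return math.hypot(north_south, east_west)
```

Under these assumptions, five decimal places correspond to a meter or two, while two decimal places at mid latitudes correspond to more than a kilometer.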
>>> Posting number 148, dated 7 Jan 2002 12:37:08
>>> Posting number 149, dated 7 Jan 2002 12:57:05
>>> Posting number 150, dated 7 Jan 2002 12:45:12
Date: Mon, 7 Jan 2002 12:45:12 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Should not found SpecLocs default to county?
In-Reply-To: <v02130501b85f9bf6b38a@[207.207.103.162]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX, and all,
>John W.: So I'm wondering about the Oregon records. There are about 400
>with DecLat/longs that were already assigned when downloaded, but they only
>have two decimals. Was this a formatting or rounding decision? I'll leave
>them as is as I assume if someone assigned lat/long it is more accurate than
>the SpecLoc.
Actually, it was a formatting error. The decimal lat/longs that appear in
the download have been truncated to 2 decimal places. This wasn't my
original intention. The truncation occurred somewhere in transferring
between Access and the Informix database from which the downloaded data are
taken. I'll try to find out where it occurred and fix the problem, then I
will update the decimal latitude and longitude values in the online
gazetteer. This shouldn't affect those who've already downloaded data
for georeferencing since we agreed that the localities that already have
lat/longs will not be georeferenced (again). If anyone is checking and
changing records that have lat_longs already, let me know.
>Related, if we cannot find a SpecLoc, should we default to county or leave
>it ungeoreferenced pending investigation by the contributing institution?
>So far not found SpecLocs are running at about 10% due to discrepancies in
>SpecLoc and county, apparent typos, or ambiguous text.
If you cannot find the SpecLoc, leave it ungeoreferenced and say why in the
field called "NoGeorefBecause." If you find the SpecLoc and it is
unambiguously placed in the wrong county, go ahead and georeference it and
make a note to that effect in the "LocalityAnnotation" field in the
downloaded data file. These notes will eventually get back to the source
institution.
>>> Posting number 151, dated 7 Jan 2002 14:52:05
Date: Mon, 7 Jan 2002 14:52:05 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Oregon lat/longs.
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
XXXX, and all:
"...wondering about the Oregon records. There are about 400..."
The Oregon records that had lat/long for specimens in the KU collection
should be redone with the new system. Those that were added here were done
a couple of years ago using a program that calculated them for us so they
will not be as accurate as the current system we are using.
>>> Posting number 152, dated 8 Jan 2002 20:57:38
>>> Posting number 153, dated 16 Jan 2002 15:03:38
Date: Wed, 16 Jan 2002 15:03:38 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing Error Calculator
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
At long last I'm ready to introduce the Georeferencing Error Calculator.
It's been some time in the making, and I apologize for the delay, but I
wanted to give you a product that wouldn't be a moving target due to
constant revision. The application has been pretty well tested and I
believe you can use it with confidence in the results it
gives. Nevertheless, if something doesn't seem quite right, try to figure
out why. Usually it means that the coordinate precision is set too low (the
coordinate precision always reverts to "nearest degree" if you change the
coordinate system). If you exhaust all possibilities of making sense of the
maximum error value that the program gives you (this includes reading the
manual and the georeferencing guidelines), then feel free to send me a
message asking what's going on. If you do, please be explicit about what
you are doing and what all of the parameters are for the calculation that
puzzles you.
The Georeferencing Guidelines and the Georeferencing Steps documents have
been modified to include references to the Error Calculator, and the Error
Calculator Manual has been added to the list on the Documents page on the
MaNIS website at the following URL:
http://dlp.cs.berkeley.edu/manis/Documents.html
Please read the manual so you know what to expect when loading the
Calculator into your browser. In particular, you should be aware of the
browser constraints and the size of the java applet. It can be quite slow
to load the first time if your connection is slow.
Two points about making calculations are also worth emphasizing in advance.
I've already mentioned the first, which is that the coordinate precision
will revert to "nearest degree" if you change the coordinate system. If you
get an error that you think is excessive, the coordinate precision is
likely to be the culprit. Another possible culprit is having the datum set
to "not recorded" if you actually know what datum the coordinates were
taken in. The second important point is that all distance measurements in a
given calculation must be in the same units. For example, don't mix an
offset of 10 miles with an extent of named place of 3 kilometers. Both
measures need to be in one system or the other. The error distance will be
given in the same units as the measurements and all will be governed by
your choice in the Distance Units drop-down list.
Enjoy!
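The point about consistent units can be handled before any values are entered. The following is a small sketch of converting every distance to kilometers first; the function and table names are hypothetical, and it does not reproduce the Error Calculator's error formula.

```python
# Conversion factors to kilometers (exact by definition of the international
# foot, yard, and mile).
KM_PER_UNIT = {
    "km": 1.0,
    "m": 0.001,
    "mi": 1.609344,
    "yd": 0.0009144,
    "ft": 0.0003048,
}

def to_km(value, unit):
    """Convert a distance in the given unit to kilometers."""
    try:
        return value * KM_PER_UNIT[unit]
    except KeyError:
        raise ValueError(f"unknown distance unit: {unit!r}")
```

An offset of 10 miles and an extent of 3 kilometers would then be entered as to_km(10, "mi") and to_km(3, "km"), with kilometers selected as the distance unit.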
>>> Posting number 154, dated 16 Jan 2002 15:28:45
Date: Wed, 16 Jan 2002 15:28:45 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: CNMA: mammal collection at UNAM
In-Reply-To: <5.1.0.14.1.20020107123724.00a00090@ibunam.ibiologia.unam.mx>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; format=flowed
Content-Transfer-Encoding: quoted-printable
Dear All,
I have changed all references to UNAM in the MaNIS documents and database
to be CNMA based on the following request from Fernando Cervantes. The
acronym was not changed in the Project Description document, which is a
copy of the document sent as part of the NSF grant application. Those of
you who downloaded localities previous to 16 January 2002 will still have
UNAM as a CollectionCode in your downloaded data. This will not present a
problem when you return the georeferenced data to me.
John W
>Dear John
>
> To better describe who and where we are at I would like to ask you for
> the following:
>
>1. In the list of institutions participating in MaNIS and the contacts
>(web site), please include the name, position, and e-mail account of my
>assistants:
>
>Yolanda Hortelano, yolahm@ibiologia.unam.mx
>Julieta Vargas, jvargas@ibiologia.unam.mx
>
>2. In addition, please change the acronym of our collection. Our mammal
>collection is known and registered as CNMA (after Colección Nacional de
>Mamíferos) and is hosted by Instituto de Biología, that belongs to
>Universidad Nacional Autónoma de México (UNAM).
>
>Thank you for your help,
>
>Fernando
>------------------------------------------------
>Fernando A. Cervantes
>Zoologia. Instituto de Biologia, UNAM
>Apartado Postal 70-153, Coyoacan
>Mexico, D. F. 04510
>Mexico
>
>tel.: (525) 622 9143; fax: (525) 550 0164
>e-mail: fac@ibiologia.unam.mx
>sitio web: www.ibiologia.unam.mx/cnma
>------------------------------------------------
>>> Posting number 155, dated 17 Jan 2002 09:38:42
Date: Thu, 17 Jan 2002 09:38:42 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: georeferencing
In-Reply-To: <5.1.0.14.1.20020107123124.00a00ec0@ibunam.ibiologia.unam.mx>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; format=flowed
Content-Transfer-Encoding: quoted-printable
XXXXXXX,
Now that the Error Calculator is done and on the web I was able to check
the data you sent me in December. I had no problems importing those data
into my system. When I do this step I check for inconsistencies in the
data and fix them if I can. The Determination References you provided are
excellent. I wish we could figure out which datum those sources use.
I'm curious why you chose to record degrees minutes seconds instead of
decimal degrees for the localities your team georeferenced. I'm only asking
to point out that it would have been easier to just copy and paste the two
decimal degree values. This would have been a little faster and it would
have left less room for error. Even so, there was only one coordinate error
I could find in your data. There was a 10 for decimal seconds where there
should have been a 0.
There are some limitations of the Alexandria Digital Library Data of which
everyone should be aware. As long as you recognize these limitations, the
ADL gazetteer is extremely useful. I'm including, below, a message to and a
response from Linda Hill about these limitations.
I noticed that none of the localities in the records you georeferenced have
maximum error distances. I hope you will provide these data in the future,
especially now that I've released the Error Calculator, which is supposed
to make the calculations much easier. When you do make error calculations,
be sure to use a coordinate precision of "nearest minute" for Alexandria
Digital Library data that come from NIMA. If you look at the values that
come up there is always either a 0 or a 59 in the seconds field for non-USA
named places. There is something wrong with a coordinate translation
algorithm somewhere that produces this problem. I recommend using the
decimal degree coordinates since they err less than the degrees minutes
seconds.
I especially appreciate the Locality Annotations your team provided and I
hope the other recipients of your georeferenced data do as well.
>John: Here's the situation. The data in our gazetteer for the example you
>used is NIMA. The original NIMA coordinates are:
>
>NIMA: 20° 11' 00" N 098° 03' 00" W
>
>NIMA points are all limited to 1 minute resolution, I believe, although they
>don't document this anywhere that I have seen.
>
>We have two clients and they show the coordinates as:
>
>CDL-Middleware client to ADL Gazetteer: Longitude W 98° 03' Latitude N 20° 11'
>
>AOL client to ADL Gazetteer: Longitude: -98.050003 (98°3'0"W) Latitude:
>20.183332 (20°10'59"N)
>
>The problem with the AOL client is that the original ddmmss values were
>converted to decimal degrees and then the ddmmss values that are shown in the
>interface are calculated from them, giving the impression that there is more
>resolution in the location than is warranted. As you point out, in your
>example there is obviously a problem with the '3' as the last digit in the
>longitude value. We are aware of these problems but have not gone back and
>fixed it. We have limited staff to work on the gazetteer and have put more
>work into other developments. What we intend to do is to phase out the AOL
>client and replace it with a client based on our middleware software (like
>the CDL client). We will be storing decimal degrees in our database but need
>to be smarter about the specificity.
>
>Neither the USGS nor NIMA clearly reference the geodetic basis of their
>coordinates. We are assuming that they are using WGS-84. In our revised
>Gazetteer Content Standard there is an element to declare the geodetic basis
>for the coordinates. We are setting the default value as WGS-84 but other
>bases can be entered. With our current gazetteer, I think you will not go far
>wrong with assuming WGS-84. Also, we have elements for making a statement
>about the 'accuracy' of the coordinates. In the future as we build up better
>data, these statements could give assistance in making the estimates that you
>need.
>
>I had a look at your 'estimator' for maximum geospatial error in specifying
>locations. It looks very useful. I passed the URL on to our gazetteer team
>here so that they can see what you are doing.
>
>We are still working on getting our gazetteer protocol server working
>properly. We solved a major parsing problem today. There is still more to do
>but you might start thinking about how you might embed gazetteer lookup in
>your script using our gazetteer service protocol.
>
>I appreciate your feedback and apologize for the limitations of our
>gazetteer. We continue to work on it and welcome collaboration to 'make it
>right'.
>
>- Linda
>
>
>John Wieczorek wrote:
>
> > Hi again,
> > I have people engaged in georeferencing for the MaNIS Project now. My first
> > set of georeferenced data have just been returned and the ADL gazetteer was
> > among the Reference Sources used to get coordinates for the data. My
> > questions are about the coordinates themselves. I'll use a specific example
> > to better illustrate the questions.
> >
> > The locality in question is Huauchinango, Puebla, Mexico. The gazetteer shows
> > coordinates in two units, decimal degrees and degrees minutes seconds.
> > Specifically, for this example, the decimal degrees are 20.183332,
> > -98.050003. The degrees minutes seconds are 20°10'59"N, 98°3'0"W. These two
> > aren't the same when you get out to that sixth decimal place in longitude,
> > and they differ even more in latitude. I'm wondering whether there is a way
> > to know which is the original coordinate system (i.e., the one without the
> > error introduced by translation). Both coordinates actually have tell-tale
> > signs of tampering. That 3 out at the end of the decimal longitude looks
> > like a floating point error. The fact that so many of the named places from
> > this region have only 0 or 59 in the seconds fields is also highly suspect.
> > So, I wonder at what step the translation(s) was(were) made - whether it
> > comes from the original data source (in this case NIMA) or whether it is
> > post-processing done on your end. If it is the former, I suppose we're
> > stuck with it, but if it's the latter I wonder if a better algorithm could
> > be used to keep the coordinates in sync. I can offer one, if that helps.
> >
> > Finally, I've probably asked this before, but is it possible to get the
> > datum information along with the coordinates? I suspect that information is
> > missing as metadata from the original data sources, but if it isn't
> > missing, is there any possibility that it could be among the data you
> > provide in the ADL gazetteer interface? It makes a great deal of difference
> > sometimes in determining the maximum error distance for the coordinates
> > assigned to a locality, and this will, in turn, affect analyses further on
> > down the road.
> >
> > Thanks bunches,
> > John W
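The symptoms discussed in this exchange can be reproduced with a short sketch. It rests on two assumptions that the messages suggest but do not confirm: that the decimal degrees were stored in single precision, and that the displayed degrees minutes seconds were derived from the stored value by truncation. The function names are hypothetical, and the rounding variant is only one possible way to keep the two forms in sync, not the algorithm offered in the message.

```python
import struct

def to_float32(x):
    """Round-trip a value through IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def dms_to_decimal(degrees, minutes, seconds):
    """Convert degrees, minutes, seconds to decimal degrees (unsigned)."""
    return degrees + minutes / 60.0 + seconds / 3600.0

def decimal_to_dms_truncating(dd):
    """Convert back by truncating at each step (loses up to a second)."""
    degrees = int(dd)
    rem = (dd - degrees) * 60.0
    minutes = int(rem)
    seconds = int((rem - minutes) * 60.0)
    return degrees, minutes, seconds

def decimal_to_dms_rounding(dd):
    """Convert back by rounding to the nearest whole second first."""
    total_seconds = round(dd * 3600.0)
    degrees, rem = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    return degrees, minutes, seconds

# The NIMA values quoted above: 20 deg 11' 00" N, 98 deg 03' 00" W.
lat = to_float32(dms_to_decimal(20, 11, 0))
lon = to_float32(dms_to_decimal(98, 3, 0))
print(f"{lat:.6f}", decimal_to_dms_truncating(lat), decimal_to_dms_rounding(lat))
print(f"{lon:.6f}", decimal_to_dms_truncating(lon), decimal_to_dms_rounding(lon))
```

Under these assumptions the stored latitude prints as 20.183332 and truncates to 20 10 59, while rounding to the nearest second recovers 20 11 0. The stored longitude prints as 98.050003, showing the stray trailing 3.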
>>> Posting number 156, dated 20 Jan 2002 10:39:23
>>> Posting number 157, dated 31 Jan 2002 15:13:44
>>> Posting number 158, dated 31 Jan 2002 16:18:34
Date: Thu, 31 Jan 2002 16:18:34 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing Update
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
I write for two purposes. The first is that I'm curious to know how many of
you have actually begun to georeference. So far, I know that CNMA and the
MVZ have begun. The reason I ask is that I would like to begin a discussion
on the list of techniques to make the task go faster. I don't really want
to do that until most everyone is actually getting their hands dirty. In
this way everyone will be able to benefit from the discussion. So, please
let me know either that you have already begun georeferencing, or when you
anticipate beginning.
My second purpose is to let you know that, due to my ignorance of the
details of two of your esteemed collections databases, I made some faulty
assumptions when I first processed the data for the online gazetteer. As a
result, I need to reload data for UWBM and for ROM. I have already
reprocessed the UWBM data and I'll try to load it into the gazetteer as
soon as possible (hopefully by Monday). The situation with ROM is more
complex and I anticipate making an update to the ROM data in about one
month. There are a few implications of this unfortunate necessity.
1) If you have not yet downloaded localities for georeferencing, wait to
make your downloads at least until I announce that the update for UWBM has
been done. Don't wait for the ROM update to be done unless for some reason
you weren't going to begin georeferencing for another month anyway.
2) If you have downloaded localities, but have not yet begun georeferencing
them, throw away the downloaded file(s) you have and download them again
after I announce that the UWBM update is complete. Again, don't wait for
the ROM update to be done unless you weren't going to begin georeferencing
for another month anyway.
3) If you downloaded and began georeferencing files that include UWBM
and/or ROM records, please discard those records (only) from your record
set, even if you happen to have already georeferenced some of them. My
suspicion is that not much actual georeferencing has commenced to date
(though I'd love to hear otherwise), so this is unlikely to be a big
problem. After discarding the UWBM and ROM records, please do another
download with the same criteria you used last time, but this time please
select UWBM in the Institution drop-down box. This will give you only the
UWBM records from your geographic area of interest. After they download
successfully, append these UWBM records to the records you've already begun
georeferencing and proceed as if nothing had happened.
When the ROM records are ready I'll make another announcement to the list
about downloading only ROM records to append to your working files. The
process will be exactly the same as in scenario 3, above. In the meantime,
ROM records will still be in the gazetteer, but please do not spend time
georeferencing them. Throw them out now, or when I make the announcement, as
you prefer.
Thanks, and my sincere apologies for the inconvenience. I promise to try to
not make assumptions about other people's data anymore. I should know
better by now.
John W
>>> Posting number 159, dated 1 Feb 2002 17:29:10
>>> Posting number 160, dated 1 Feb 2002 17:33:01
>>> Posting number 161, dated 1 Feb 2002 18:24:45
>>> Posting number 162, dated 1 Feb 2002 15:42:04
>>> Posting number 163, dated 1 Feb 2002 19:27:30
Date: Fri, 1 Feb 2002 19:27:30 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Gazetteer update
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
The promised gazetteer update is complete. Download with abandon!
John W
>>> Posting number 164, dated 4 Feb 2002 17:36:54
Date: Mon, 4 Feb 2002 17:36:54 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Fwd: Georeferencing by MSU
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
Please read, there are some excellent questions raised here.
>Date: Mon, 04 Feb 2002 17:58:51 -0500
>To: tuco@socrates.Berkeley.EDU
>From:
>Subject: Georeferencing by MSU
>
>Hi John,
>
>XXXXXXX and I want to give you an update on georeferencing and relay
>some concerns/questions.
>
>In late November, we downloaded records for several Michigan counties and
>have since practiced on the different types of localities. Using
>Topozone, we worked individually on Eaton and Barry Counties and then
>compared and discussed our approaches and results. Prior to the
>introduction of the error calculator, we reported our results as UTM
>coordinates in the Access file template provided.
>
>With the availability of the error calculator (thank you very much!) and
>recent revised guidelines, we began recording original coordinates as
>decimal degrees for Barry County. Our plan is to send you each of our
>Barry County files in the next few days. We would appreciate your
>comments on our results and techniques before we proceed with the "real
>thing".
>
>We have some questions and comments:
>
>1. Evolving Guidelines - We would appreciate an announcement whenever
>there is an update to the guidelines, specifying which sections are
>altered, to ensure that we are always working with the most recent
>information. Thanks again for all of your hard work with this!!
Point well taken. I've tried to be good about announcing the updates, but I
haven't always completely described what the changes were.
>2. Guidelines Questions - In the calculation example of Distance Along
>Orthogonal Directions, the Direction Precision is given as 45 degrees. It
>seemed earlier in the document that "directional imprecision can be
>ignored" in such an example. Are we misunderstanding something?
I included one too many lines in my copy and paste. The Direction Precision
should not figure into that calculation. I will remove the extraneous line
from the Georeferencing Guidelines.
>In the calculation example of Named Place Only/Bakersfield, the
>coordinates are 35 degrees, 22', 24"N and 119 degrees, 1' 4" W. We
>understand from the example that these are the GNIS coordinates for
>Bakersfield. In other examples (e.g. Distance Along Orthogonal
>Directions and Distance at a Heading) the latitude and longitude
>coordinates are the same as for the Named Place/Bakersfield example. Since
>the actual localities are different (from Bakersfield), shouldn't the
>coordinates be different as well?
Absolutely. You win a prize for catching those mistakes. The "Distance
Along a Path" example was similarly problematic. I have changed the wording
as well as the values for Latitude, Longitude, Decimal Latitude, and
Decimal Longitude for these examples to reflect that the coordinates of the
locality are different from the coordinates of the named place mentioned in
the locality description.
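
To make the point concrete, the following minimal Python sketch converts the Bakersfield coordinates quoted above to decimal degrees and then displaces them. The 10 mi due-north offset, the function names, and the spherical flat-earth approximation are assumptions made for illustration; they are not taken from the Georeferencing Guidelines.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation (assumption)

def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # Southern and western hemispheres are negative in decimal notation.
    return -value if hemisphere in ("S", "W") else value

def offset_point(lat, lon, distance_m, heading_deg):
    """Displace a point by a distance at a heading (0 = north, 90 = east).

    Flat-earth approximation, reasonable only for short offsets.
    """
    heading = math.radians(heading_deg)
    dlat = distance_m * math.cos(heading) / EARTH_RADIUS_M
    dlon = distance_m * math.sin(heading) / (EARTH_RADIUS_M * math.cos(math.radians(lat)))
    return lat + math.degrees(dlat), lon + math.degrees(dlon)

# GNIS coordinates of the named place, as quoted in the question.
bakersfield = (dms_to_decimal(35, 22, 24, "N"), dms_to_decimal(119, 1, 4, "W"))

# Hypothetical locality "10 mi N Bakersfield": its coordinates differ from the named place.
locality = offset_point(bakersfield[0], bakersfield[1], 10 * 1609.344, 0.0)
```

The locality's latitude moves north while the named place's coordinates stay where GNIS puts them, which is the distinction the corrected examples are meant to show.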
>3. Coordinates for the Center of a Township - If a locality is a
>township name only, is it preferable to use the coordinates for the
>township that are automatically provided by Topozone (via the place name
>search), or use the coordinates for the intersection of Sections 15, 16,
>21, and 22 (assuming the township consists of the "standard" 36 one-mile
>square sections)?
I was unaware that one could (and unable to figure out how to) find a
township, in the TRS sense, from the place name search on Topozone. I did
notice that you can find named townships (Michigan is full of them), but I
don't believe their coordinates correspond with the TRS sections they
occupy. Nevertheless, the coordinates we're looking for are those of the
intersection of center sections, as Laura mentioned above.
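
As a check on which sections meet at the center, here is a small Python sketch that lays out the standard serpentine numbering of a 36-section township (section 1 in the northeast corner, section 36 in the southeast corner). It assumes a regular township; the function names are hypothetical.

```python
def plss_section_grid():
    """Return a 6x6 grid of section numbers, rows north to south, columns west to east."""
    grid = []
    for row in range(6):
        start = row * 6 + 1
        numbers = list(range(start, start + 6))
        # Rows 0, 2 and 4 are numbered east to west, so reverse them
        # to store every row in west-to-east order.
        if row % 2 == 0:
            numbers.reverse()
        grid.append(numbers)
    return grid

def sections_at_center(grid):
    """Return the four sections that share the township's central corner."""
    return sorted([grid[2][2], grid[2][3], grid[3][2], grid[3][3]])

center_sections = sections_at_center(plss_section_grid())
```

For irregular townships (fewer sections, or sections that are not one mile square) this layout does not apply and the center has to be judged from the map.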
>4. Extent of an intersection - One of the localities that we recently
>georeferenced in Barry County was the intersection of two roads. We used
>the coordinates from Topozone and estimated the extent of the intersection
>to be 50 meters. Is this a reasonable estimate to use in general for this
>type of locality? (The locality was considered as a named place for
>calculation of error).
That seems like a generous extent unless the roads are 12-lane highways or
something. I would opt for something more like 10 meters for your everyday
two-lane roads. Certainly, feel free to override my opinion if the
circumstances warrant it.
>5. Extent of a named place that lacks bounding boxes - We have
>encountered named places that lack bounding boxes on both the Topozone
>image as well as a Michigan County Gazetteer book. We have estimated
>extents of such places based on the clusters of buildings that appear as
>black squares on Topozone in 1:25,000 scale. Is this type of estimate okay?
That's what I'd do, and that's what my georeferencers have been doing from
the outset.
>6. Cursor Accuracy - Robin and I have different model computers that
>utilize different web browsers (I have Netscape; Robin has
>Explorer). When Robin connects to Topozone, her computer cursor
>automatically changes to a crosshair. I manually changed my computer
>cursor from the "standard" arrow to a crosshair. I believe this has made
>a difference in attempting to pinpoint localities on the Topozone map.
Good idea. It hadn't occurred to me because we're all using Netscape, and
we're only using Topozone occasionally. Just as a point of information,
for California we most often use Terrain Navigator from MapTech
(http://maptech.com/) to do our georeferencing.
>Thanks for all of your help!!
Thanks for your excellent questions and comments.
>XXXXXXXXX
>>> Posting number 165, dated 6 Feb 2002 11:48:03
Date: Wed, 6 Feb 2002 11:48:03 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Should we save extents?
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
MaNISers:
Should we save extents? In georeferencing, one variable that will not be
saved is the extent used to compute the error. The extent cannot be
inferred from the locality descriptions unlike coordinate and offset
imprecision. In addition, an extent for a populated place will vary
depending on the scale, map, year. For many records it is the largest
component of the error. To give folks an idea of how I computed the error,
I am annotating each record with the extent I used. One could go
overboard and reference the extent, but I am assuming the same system used
to get lat/long (GNIS). Would it be too much trouble to save extents in
the annotation field?
For TRS lat/longs, I am using the extents in the Guidelines update. For
lookup on the MontanaTRS site I am assuming unknown datum and no error due
to scale as done in the Georef Guidelines examples for placename only.
Correct?
>>> Posting number 166, dated 7 Feb 2002 09:55:12
Date: Thu, 7 Feb 2002 09:55:12 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Datum error significance
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
I figured it was worth answering this question on the list in case others
were wondering the same thing. The commonly used datums in the US are the
North American Datum 1927 (NAD 27), the North American Datum 1983 (NAD 83), and
the World Geodetic System 1984 (WGS 84). The difference between NAD 83 and
WGS 84 is quite small compared to the difference between NAD 27 and NAD 83.
All of the USGS maps are in one or the other of NAD 27 or NAD 83. I haven't
done an exhaustive search, but it looks like most US Forest Service and
Bureau of Land Management maps use NAD 27.
Anyway, the 79 m used in the Bakersfield example is the actual distance
between two points having the exact same latitude and longitude, but with
one of the points based on NAD 27 and the other based on WGS 84. The Error
Calculator uses a pre-calculated matrix of the greatest difference between
these two datums in every 0.2 by 0.2 degree cell in the region between
84.69 degrees North, 179.48 degrees West and 13.69 degrees North, 51.48
degrees West. Outside of this region the calculator uses the assumption of
1km error due to an unknown datum as documented in the Georeferencing
Guidelines.
When entering coordinates in the calculator it is important to enter the
correct hemisphere. Perhaps that goes without saying, but it is pretty easy
to enter decimal longitude erroneously (without the negative sign in front)
for localities in the western hemisphere. Doing so could seriously affect
the error contribution from an unknown datum.
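
The lookup described above might be sketched in Python as follows. This is not the Error Calculator's actual code: the way cells are indexed (from the northwest corner), the dictionary standing in for the pre-calculated matrix, and all names are assumptions. Only the region bounds, the 0.2 degree cell size, and the 1 km fallback come from the text.

```python
# Region covered by the pre-calculated matrix, in signed decimal degrees.
NORTH, SOUTH = 84.69, 13.69
WEST, EAST = -179.48, -51.48
CELL_DEG = 0.2          # each cell is 0.2 by 0.2 degrees
FALLBACK_M = 1000.0     # 1 km assumed for an unknown datum outside the region

def datum_error_m(lat, lon, matrix):
    """Return the unknown-datum error contribution in meters.

    `matrix` is a hypothetical dict mapping (row, col) to the greatest
    NAD 27 / WGS 84 difference in that cell. Rows count south from NORTH
    and columns count east from WEST (an assumed layout).
    """
    if not (SOUTH <= lat <= NORTH and WEST <= lon <= EAST):
        return FALLBACK_M
    row = int((NORTH - lat) / CELL_DEG)
    col = int((lon - WEST) / CELL_DEG)
    return matrix.get((row, col), FALLBACK_M)
```

Under these assumptions, a western-hemisphere longitude entered without its negative sign falls outside the region, so the function returns the 1 km fallback rather than the much smaller value for the correct cell.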
John W.
>Date: Wed, 6 Feb 2002 11:45:19 -0800
>To: tuco@socrates.Berkeley.EDU
>From:
>
>John: Unknown datum question. Fig 1 in the guidelines has the ranges of
>error for unknown datum. For Bakersfield the range is 76-100 m error.
>Oregon, which I am georeferencing, is in the same 76-100 m band, so a
>midpoint would be 88 m. Does 79 m used in the Georeferencing Guidelines
>examples for Bakersfield have some significance? I realize this doesn't
>matter when using the web calculator, but just wondering because it makes a
>difference of several m when using Excel calculator.
>
>>> Posting number 167, dated 13 Feb 2002 11:35:40
Date: Wed, 13 Feb 2002 11:35:40 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: MSU PRACTICE RECORDS
In-Reply-To: <3.0.32.20020213132000.00687df0@pilot.msu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
Below are extracts from an exchange between me, XXXXXXXXXXX, and
XXXXXX stemming from a request to review a set of records that each of
them had georeferenced independently. Several points of interest to the
readers of this list were raised, including a continuing discussion of the
issue of extents raised by XXXXXXX on 6 Feb 2002.
I'd like to report that this exercise turned out to be a wonderful field
test of the georeferencing guidelines. The coordinates and errors were
remarkably similar, with the largest deviations corresponding to the most
vague locality descriptions. Go team!
John W
> >Topozone actually has
> >only two underlying scales, 1:25,000, and 1:100,000. The 1:50,000 and
> >1:200,000 versions are just zoomed out by a factor of two from their
> >counterparts. Also, the 1:25,000 Topozone scales actually consist of (were
> >derived from) a combination of 1:25,000 and 1:24,000 maps, which were
> >"resized." It doesn't make all that much difference in the error
> >determination if you use 1:25,000 vs. 1:24,000. However, I hope you are
> >using the 1:25,000 map scale contribution in the error calculator for the
> >1:50,000 Topozone maps. Similarly, use the 1:100,000 map scale for the
> >1:200,000 Topozone maps.
>
>Good to know all of the above. Actually, we used "gazetteer" from the
>dropdown on the error calculator for all of the Topozone practice records.
>We were following the example from the georeferencing guidelines where the
>coordinate source (Topozone) was considered to be a gazetteer, and thus
>selected "gazetteer" on the error calculator. It sounds like we need to
>redo the MAX ERROR with the map scale incorporated.
Actually, there is a subtle distinction to make. In the Georeferencing
Guidelines document I said that the source for that "Distance Only" example
was a gazetteer, because the coordinates were for a named place and
Topozone uses the GNIS data to plot named places; thus, the ultimate source
of the coordinates for that example is the GNIS database, which is a
gazetteer. If you had used Topozone to measure on a map, then the map
itself is the source of the coordinates and should be so reflected in the
error calculations by selecting an appropriate map scale.
> >I'm very happy to see the extent information in there. I am ruminating over
> >the inclusion of a field in the download data for the extent. I'm
> >interested in your opinion on the subject. It seems like it would actually
> >be easier than writing it out in the remarks, especially if you can copy
> >and paste it among several records. However, I think we'd do well to add a
> >NamedPlace field as well so we know to what the extent refers.
>
>XXXX and I have been meaning to reply to XXXX's message about extent. My
>opinion is that extent should be included somewhere (and in the remarks
>field is fine with me) as a record of what was done in the georeferencing
>process.
I think the general sentiment is that the complete determination
(coordinates AND error) would be fully documented if we go ahead and add
the value of the extent to the data we capture. By having a base set of
rules along with recording extent, we will know the magnitude of
every contribution to the determination. Without recording extent, we are
left to wonder how the georeferencer arrived at his/her result. Would it be
onerous to include the extent in its own field? I think it will be easier
than adding it to the remarks, both for the georeferencer and for the
compiler of named place extents (me). Part of the reason I ask this is that
I'm thinking even bigger than MaNIS to the ubiquitous problem of
georeferencing, which could benefit by having a database of extents. The
GNIS data allows for features to be described by bounding boxes, which can
be interpreted to find extents. However, for most features the bounding box
reduces to a single point. This is true of all but the largest populated
place features in the GNIS database. Given the paucity of extent data
available, and given that we (MaNIS georeferencers) will have to determine
extents for every named place we run across, we could assemble these data
and use them to provide added value to existing gazetteers. Furthermore,
these additional data could be used in the future to automate the process
of georeferencing and error calculation. If this is, indeed, a worthy goal,
then it makes sense to capture the information in its own field so that it
need not be parsed from remarks in the future.
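
A minimal Python sketch of what capturing the extent in its own field could look like, and how a table of named-place extents might then be compiled. The field names (`named_place`, `extent_m`) and the rule of keeping the largest reported extent per place are illustrative assumptions, not the MaNIS download format.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Determination:
    locality: str
    decimal_latitude: float
    decimal_longitude: float
    max_error_m: float
    named_place: Optional[str] = None   # hypothetical field: what the extent refers to
    extent_m: Optional[float] = None    # hypothetical field: extent used in the error

def compile_extents(records):
    """Collect one extent per named place, keeping the largest value reported."""
    extents = {}
    for rec in records:
        # Records without both fields contribute nothing to the table.
        if rec.named_place is None or rec.extent_m is None:
            continue
        current = extents.get(rec.named_place, 0.0)
        extents[rec.named_place] = max(current, rec.extent_m)
    return extents
```

With the extent in a dedicated field, a compilation like this needs no parsing of free-text remarks.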
Comments are hereby solicited.
> >Overall, the agreement in the coordinates and the errors is astonishing.
> >The mean deviation in coordinates across the whole dataset is only about
> >300 meters and most of this is due to the two vague localities ("Barry
> >State Game Area" and "Yankee Springs Area"). For the most part the errors
> >take care of the differences. You have bolstered my faith in the system.
>
>Yes - these were large areas that were actually adjacent to one another. I
>found them to be somewhat difficult to georeference.
>
> >The one locality for which I cannot understand the discrepancy is "Clear
> >Lake Camp, 6 mi. E Delton." You might want to revisit that one to see where
> >the problem occurred.
>
>I know what happened here - operator difference (or assumption error?) pure
>and simple. I believe that Robin treated this as an offset, and I
>completely ignored the offset and focused on a "church camp" on the map
>that was on the shore of Clear Lake (the lake was about 6.5 miles east of
>Delton). Thus, I treated this as a named place (and perhaps my assumption
>was an unwarranted big stretch) and Robin treated it as an offset. I
>believe that Robin's choice was the better of the two.
> >
> >Nice.
>
>Thanks again!
>
>XXXXX
>
>
> >
> >John W
> >
> >>Attached are two files containing identical Barry County localities that we
> >>have georeferenced individually as practice with the MaNIS guidelines. We
> >>would sincerely appreciate your critique of our work before we submit files
> >>for inclusion in the project.
> >>
> >>Thanks for all of your help.
> >>
> >>Sincerely,
> >>
> >>XXXXXX
>>> Posting number 168, dated 13 Feb 2002 11:55:01
Date: Wed, 13 Feb 2002 11:55:01 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Should we save extents?
In-Reply-To: <v0213050ab886050432cf@[207.207.103.162]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
>For TRS lat/longs, I am using the extents in the Guidelines update. For
>lookup on the MontanaTRS site I am assuming unknown datum and no error due
>to scale as done in the Georef Guidelines examples for placename only.
>Correct?
Correct.
>>> Posting number 169, dated 13 Feb 2002 12:21:25
Date: Wed, 13 Feb 2002 12:21:25 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: MSU Practice - More Comments
Comments:
In-Reply-To: <3.0.32.20020213145319.00720da8@pilot.msu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
More relevant exchanges.
> >>We do not and will not be using Excel for georeferencing. We just used it
> >>this one time to send you the sample records via e-mail. I am hoping that
> >>the data were not altered.
> >
> >Will you use Access then?
>
>Yes - we are extremely happy with your Access template! (Why would anyone
>want to use something else?)
Good question! I would have no problem accepting that everyone used it.
>On the template, we have found it useful to just "close up" the columns
>that we don't want to look at while georeferencing. (You probably noticed
>this in the Excel version).
>
>XXXXXXX (our IT person) will help us send the "real" files using the
>project protocol.
In so doing, be sure to preserve all of the precision in the numeric
fields. There are two ways to do this. The first is to bypass protocol and
just send me the Access database mdb file (preferably with a date in the
filename, e.g., msu_barry020213.mdb). The second is to change the data type
of those fields to text after the georeferencing is all done and then
export the data into a tab-delimited text file.
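
For anyone scripting the second approach outside Access, a rough Python sketch of the same idea follows: numeric values are turned into text before being written to a tab-delimited file, so no digits are lost to display formatting. The column names used in the test data and the function name are hypothetical.

```python
import csv

def export_tab_delimited(rows, fieldnames, stream):
    """Write rows (dicts) as tab-delimited text without rounding numeric fields."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # repr() of a Python float round-trips exactly; values that are
        # already text are passed through unchanged.
        writer.writerow({key: (repr(value) if isinstance(value, float) else value)
                         for key, value in row.items()})
```

Pass any open text file (or an `io.StringIO` buffer) as `stream`.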
> >I'm composing a reply to your previous message, which I'll send out to the
> >list due to common items of interest, and as a way of introducing more
> >information on the issue of extents.
> >
>Okay. Robin replied to me (from home) about extents. Here is her "vote".
>FROM XXXXXX: I'd vote for an actual column regarding extent
>information to assure that it was remembered. I view the column headings
>as a checklist of things I need to provide and without reference to it, it
>could easily be forgotten with all the other components.
This is a valuable, practical point with which I entirely agree.
John W
>>> Posting number 170, dated 13 Feb 2002 12:33:47
Date: Wed, 13 Feb 2002 12:33:47 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: MSU PRACTICE RECORDS
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
John, XXXX, XXXX: Can I get copies of the data files? I'd like to run
them through the lat/long calculator for comparison.
>>> Posting number 171, dated 14 Feb 2002 18:30:16
Date: Thu, 14 Feb 2002 18:30:16 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Error Calculator:Coordinate Source & Topozone.com
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Hi John,
Thanks for the helpful information about map scales and choices to make on
the error calculator when using Topozone.com for georeferencing. I have
some additional questions about this. The message exchanges (from
Mammal-Z-Net) are copied below.
>From John:
>> >Topozone actually has
>> >only two underlying scales, 1:25,000, and 1:100,000. The 1:50,000 and
>> >1:200,000 versions are just zoomed out by a factor of two from their
>> >counterparts. Also, the 1:25,000 Topozone scales actually consist of (were
>> >derived from) a combination of 1:25,000 and 1:24,000 maps, which were
>> >"resized." It doesn't make all that much difference in the error
>> >determination if you use 1:25,000 vs. 1:24,000. However, I hope you are
>> >using the 1:25,000 map scale contribution in the error calculator for the
>> >1:50,000 Topozone maps. Similarly, use the 1:100,000 map scale for the
>> >1:200,000 Topozone maps.
>>
>From XXXXX:
>>Good to know all of the above. Actually, we used "gazetteer" from the
>>dropdown on the error calculator for all of the Topozone practice records.
>>We were following the example from the georeferencing guidelines where the
>>coordinate source (Topozone) was considered to be a gazetteer, and thus
>>selected "gazetteer" on the error calculator. It sounds like we need to
>>redo the MAX ERROR with the map scale incorporated.
>From John:
>Actually, there is a subtle distinction to make. In the Georeferencing
>Guidelines document I said that the source for that "Distance Only" example
>was a gazetteer, because the coordinates were for a named place and
>Topozone uses the GNIS data to plot named places; thus, the ultimate source
>of the coordinates for that example is the GNIS database, which is a
>gazetteer. If you had used Topozone to measure on a map, then the map
>itself is the source of the coordinates and should be so reflected in the
>error calculations by selecting an appropriate map scale.
>
My questions:
1. I understand (from exchange above) that if the locality that we want to
georeference is a named place (such as East Lansing or Beaver Island or
Fine Lake) and we enter this into the Place Name Search in Topozone and
Topozone gives us the coordinates of that place, then the Coordinate Source
that we select on the Error Calculator will be a Gazetteer (because
Topozone got those coordinates from GNIS). Thus, I believe that we
calculated the error correctly in the practice records that contained
coordinates given by Topozone for named places (such as Fine Lake). Is
this correct?
2. Are the Topozone maps considered to be USGS or non-USGS maps? For
example, if we used a Topozone.com map at 1:25,000 scale to measure the
distance from a town, shall we select USGS Map 1:25,000 or non-USGS Map
1:25,000 from the Coordinate Source dropdown on the Error Calculator?
Thanks again,
XXXXX
>>> Posting number 172, dated 14 Feb 2002 15:39:16
Date: Thu, 14 Feb 2002 15:39:16 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Error Calculator:Coordinate Source & Topozone.com
In-Reply-To: <3.0.32.20020214183015.0072e530@pilot.msu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXXX, and all,
You are correct with respect to question 1, below. You got the coordinates
indirectly from GNIS for named places, therefore, the appropriate source is
a gazetteer. If you use Topozone to find a locality, but do any kind of
measuring on the Topozone maps, then you are indirectly using a USGS map,
and you should select the appropriate scale in the coordinate source
dropdown box in the error calculator application. So, to explicitly answer
question 2, below, use "USGS Map 1:25,000" for Topozone maps at either
1:25,000 or 1:50,000. Use "USGS Map 1:100,000" for Topozone maps at either
1:100,000 or 1:200,000. While we're at it, here's a reminder to always use
NAD27 for Topozone-derived coordinates, whether from the gazetteer or from
the maps.
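
The rules above can be summarized in a short Python sketch. The function and the method labels ("place_name_search", "map") are hypothetical; the returned strings follow the Coordinate Source choices named in this thread.

```python
# Datum to use for all Topozone-derived coordinates, per the reminder above.
TOPOZONE_DATUM = "NAD27"

def coordinate_source(method, topozone_scale=None):
    """Pick the Error Calculator coordinate source for a Topozone determination.

    method: "place_name_search" when Topozone returned GNIS coordinates for a
            named place, or "map" when coordinates were read off the map image.
    topozone_scale: scale denominator of the Topozone view (e.g. 50000),
            required only when method is "map".
    """
    if method == "place_name_search":
        return "gazetteer"
    if method == "map":
        # The 1:50,000 and 1:200,000 views are zoomed-out versions of the
        # 1:25,000 and 1:100,000 maps, so they share the same source scale.
        if topozone_scale in (25000, 50000):
            return "USGS Map 1:25,000"
        if topozone_scale in (100000, 200000):
            return "USGS Map 1:100,000"
        raise ValueError("unrecognized Topozone scale: %r" % (topozone_scale,))
    raise ValueError("unrecognized method: %r" % (method,))
```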
John W
>Thanks for the helpful information about map scales and choices to make on
>the error calculator when using Topozone.com for georeferencing. I have
>some additional questions about this. The message exchanges (from
>Mammal-Z-Net) are copied below.
>
> >From John:
> >> >Topozone actually has
> >> >only two underlying scales, 1:25,000, and 1:100,000. The 1:50,000 and
> >> >1:200,000 versions are just zoomed out by a factor of two from their
> >> >counterparts. Also, the 1:25,000 Topozone scales actually consist of (were
> >> >derived from) a combination of 1:25,000 and 1:24,000 maps, which were
> >> >"resized." It doesn't make all that much difference in the error
> >> >determination if you use 1:25,000 vs. 1:24,000. However, I hope you are
> >> >using the 1:25,000 map scale contribution in the error calculator for the
> >> >1:50,000 Topozone maps. Similarly, use the 1:100,000 map scale for the
> >> >1:200,000 Topozone maps.
> >>
> >From XXXXX:
> >>Good to know all of the above. Actually, we used "gazetteer" from the
> >>dropdown on the error calculator for all of the Topozone practice records.
> >>We were following the example from the georeferencing guidelines where the
> >>coordinate source (Topozone) was considered to be a gazetteer, and thus
> >>selected "gazetteer" on the error calculator. It sounds like we need to
> >>redo the MAX ERROR with the map scale incorporated.
>
> >From John:
> >Actually, there is a subtle distinction to make. In the Georeferencing
> >Guidelines document I said that the source for that "Distance Only" example
> >was a gazetteer, because the coordinates were for a named place and
> >Topozone uses the GNIS data to plot named places; thus, the ultimate source
> >of the coordinates for that example is the GNIS database, which is a
> >gazetteer. If you had used Topozone to measure on a map, then the map
> >itself is the source of the coordinates and should be so reflected in the
> >error calculations by selecting an appropriate map scale.
> >
>My questions:
>
>1. I understand (from exchange above) that if the locality that we want to
>georeference is a named place (such as East Lansing or Beaver Island or
>Fine Lake) and we enter this into the Place Name Search in Topozone and
>Topozone gives us the coordinates of that place, then the Coordinate Source
>that we select on the Error Calculator will be a Gazetteer (because
>Topozone got those coordinates from GNIS). Thus, I believe that we
>calculated the error correctly in the practice records that contained
>coordinates given by Topozone for named places (such as Fine Lake). Is
>this correct?
>
>2. Are the Topozone maps considered to be USGS or non-USGS maps? For
>example, if we used a Topozone.com map at 1:25,000 scale to measure the
>distance from a town, shall we select USGS Map 1:25,000 or non-USGS Map
>1:25,000 from the Coordinate Source dropdown on the Error Calculator?
>
>Thanks again,
>XXXXX
>
>>> Posting number 173, dated 15 Feb 2002 10:55:46
Date: Fri, 15 Feb 2002 10:55:46 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Error Calculator:Coordinate Source & Topozone.com
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Hi John,
Thanks for the information. We'll go ahead and recalculate the Max Error
Values on our "practice" records.
One minor question with respect to the word "measuring" in your response
below: For some localities, such as road intersections for example, we get
the coordinates by placing the cursor on the Topozone map, and then
clicking to get the target coordinates of that particular locality. We
really aren't "measuring", but the coordinates are still considered to be
derived from Topozone, and so the map scale information gets applied to the
error calculator - correct?
Thanks,
XXXXX
At 03:39 PM 02/14/2002 -0800, you wrote:
>XXXXX, and all,
>
>You are correct with respect to question 1, below. You got the coordinates
>indirectly from GNIS for named places, therefore, the appropriate source is
>a gazetteer. If you use Topozone to find a locality, but do any kind of
>measuring on the Topozone maps, then you are indirectly using a USGS map,
>and you should select the appropriate scale in the coordinate source
>dropdown box in the error calculator application. So, to explicitly answer
>question 2, below, use "USGS Map 1:25,000" for Topozone maps at either
>1:25,000 or 1:50,000. Use "USGS Map 1:100,000" for Topozone maps at either
>1:100,000 or 1:200,000. While we're at it, here's a reminder to always use
>NAD27 for Topozone-derived coordinates, whether from the gazetteer or from
>the maps.
>
>John W
>
>>> Posting number 174, dated 15 Feb 2002 09:09:40
Date: Fri, 15 Feb 2002 09:09:40 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Error Calculator:Coordinate Source & Topozone.com
In-Reply-To: <3.0.32.20020215105545.0072c878@pilot.msu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXXX,
You aptly described exactly what I meant. Thank you.
John
>One minor question with respect to the word "measuring" in your response
>below: For some localities, such as road intersections for example, we get
>the coordinates by placing the cursor on the Topozone map, and then
>clicking to get the target coordinates of that particular locality. We
>really aren't "measuring", but the coordinates are still considered to be
>derived from Topozone, and so the map scale information gets applied to the
>error calculator - correct?
>
>Thanks,
>XXXXX
>
>>> Posting number 175, dated 15 Feb 2002 18:08:31
Date: Fri, 15 Feb 2002 18:08:31 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: coordinate source?
Comments: cc: fsyu <fsyu@uaf.edu>
In-Reply-To: <3C6D8138@webmail.uaf.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXXX and all,
There is no provision for georeferencing records that already have
coordinates, but this shouldn't necessarily deter you from doing so. If you
go this route, please be sure to note that you have provided these
additional data when you send them in to me. It makes a difference in how I
handle the data on this end.
To answer your specific question, you should put "original locality
description" in the DeterminationRef field in the downloaded data file and
use "locality description" as the Coordinate Source choice in the Error
Calculator.
John W
>Hi John,
>
>Many Alaska data are already georeferenced, but don't have maximum error.
>I've been calculating max. error for them, but determination references are
>not recorded for most of them. What should I enter in Coordinate source in
>Error Calculator?
>
>XXXXXX
>>> Posting number 176, dated 20 Feb 2002 09:06:36
Date: Wed, 20 Feb 2002 09:06:36 -1000
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Topo USA Ver. 3.0 by DeLorme
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
For anyone using DeLorme software Topo USA Ver. 3.0 (which I am using to do
Hawaii localities) you will need this information for the georeferencing
calculator. I just spoke with the Tech help people and got the information
that all topo maps, at all zoom levels, are based on USGS 1:24,000. I
quite like this software as it allows me to place markers for all the
localities I've done which greatly speeds up any double checking I might
want to do. Measuring distances is also easy, either by air or road.
XXXXX
>>> Posting number 177, dated 25 Feb 2002 14:36:49
Date: Mon, 25 Feb 2002 14:36:49 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: MaNIS Server recommendations
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
Due to popular demand, I'm writing to give an updated recommendation for
the MaNIS server specifications. The requirements haven't changed since the
original specifications were sent out on 2 Oct 2001. Nevertheless, I'll
reiterate the essentials of the configuration, ordered by importance:
1) dual processor Windows 2000 Professional - the Xeon processor is good
for our purposes; faster is better, but anything on the market today is
fast enough.
2) 512 MB RAM - more is better, but not at the cost of any of the other
essentials.
3) one fast SCSI hard drive - essential; faster is better; capacity is much
less important. 18GB is a good target capacity.
4) 10/100 Ethernet adapter - essential; most systems these days have one on
board.
5) 3 yr service on parts and labor - essential; we don't want anything to
break without warranty during the period of the grant.
6) CD-ROM drive - faster is better; a CD-RW may be a useful alternative, if
it fits your budget.
7) 17" Monitor - this machine is supposed to be a server, not a
workstation, so don't spend big money on a fancy display.
8) 1.44 MB diskette drive - less essential every day, but most machines
still come with one.
I've created a model system on the Dell website to give you an idea for a
recommended configuration. To look at the specifications for the system
you'll need to Retrieve EQuote #E001554835. You'll also need to enter
either the E-Quote name, which is "manis2," or my email address.
Let me know if you have any questions.
John W.
>>> Posting number 178, dated 27 Feb 2002 14:59:52
Date: Wed, 27 Feb 2002 14:59:52 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Mystified
In-Reply-To: <3.0.32.20020227173043.007327a8@pilot.msu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Hi XXXX, XXXX, and all,
I have noticed the syndrome you mentioned and I tried to ignore it. That's
harder to do when someone else notices it. It's even worse when two people
notice it - it gets harder to remove the witnesses. I think I know why it
occurs, but I don't have a satisfactory solution yet. I actually made the
interface show 3 decimal places in the Maximum Error field so that this
inconsistency would make less of an impact on the results, which may
currently differ from the expected by up to .001 distance units. So, the
worst case scenario occurs when your distance units are miles, and then the
error (in the error) amounts to about 5.3 feet. This is probably acceptable
and worth trading in your concern for a life. :) In the meantime, I'll
remain cognizant of the problem and try to work on its resolution.
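
The worst-case figure above can be checked with a few lines of arithmetic. This is only an illustrative sketch in Python; the unit table and the function name are assumptions made here and are not part of the Error Calculator.

```python
# Feet per distance unit. The set of units is an illustrative assumption.
FEET_PER_UNIT = {"mi": 5280.0, "km": 3280.84, "m": 3.28084, "ft": 1.0}

def discrepancy_in_feet(units, discrepancy=0.001):
    """Convert a discrepancy given in `units` into feet."""
    return discrepancy * FEET_PER_UNIT[units]

# 0.001 mi is 5.28 ft, the largest value among the units listed here.
print(round(discrepancy_in_feet("mi"), 2))
```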
John
At 05:30 PM 2/27/02 -0500, you wrote:
>Hi John,
>
>XXXX and I are mystified about some of the error values in our Barry
>County records (files sent to you in today's earlier message).
>
>1. In the first set of Barry County records (the files that we sent to you
>on 2/12/2002) we incorrectly chose Gazetteer as the error calculator
>coordinate source for Topozone for all records. For the records that were
>TRS localities, we anticipated getting identical values for maximum error.
>This was not the case. When XXXX used the error calculator on her
>computer, she got .716 as the error. When I used the error calculator on
>my computer for these types of records, I got .715 as the error.
>
>2. In the second set of Barry County records (the files that we sent to
>you today 2/27/2002 where maximum error was recalculated with the
>appropriate Topozone map scale), our computers continue to give different
>error calculator values for some of the TRS localities that used an error
>calculator map scale of 1:25,000 (See Sec. 23, T1N, R7W,
>Sec. 24, T1N, R7W and
>T01N R07W Section 4)
>
>3. We were surprised at the above examples. We then entered each other's
>coordinates using identical dropdown choices on the error calculator on our
>respective computers. XXXX's computer still consistently returned an
>error of .723 for all of the TRS localities that had the 1:25,000 scale.
>However, XXXX's computer returned an error of .723 on some localities and
>.724 on others with the 1:25,000 scale. Do we need to be concerned about
>this? (or shall we get a life?)
>
>Thanks,
>XXXXX
>>> Posting number 179, dated 27 Feb 2002 16:24:51
Date: Wed, 27 Feb 2002 16:24:51 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Sample of georeferencing from Baton Rouge
Comments: To:
In-Reply-To: <OF09532E16.D5566143-ON86256B6D.00611AF2@lsu.edu>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=====================_-1683450515==_"
--=====================_-1683450515==_
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX, and all,
Very nicely done. I can see that you've gone to a lot of trouble to
document the determination methods in the Remarks. There should be no
trouble for someone to figure out later what you did. Some of the
techniques you used (and documented) will surely be useful to others, so
I'm attaching your file with this message to the mammal-z-net list.
I'm trying to decide if/how to make everyone's job a little easier, perhaps
by including a field for named place along with one for the extent. That
way we'll know unequivocally to what the extent refers. I've just started
having my georeferencers do this, and it seems to be better (faster anyway)
than trying to write that information out in plain English in the remarks.
I'm interested in feedback from you and anyone else with an opinion about
whether this change would have a positive effect on your georeferencing.
I'm hoping to set a policy on this subject once there has been ample time
for cogitation on it. In the meantime, I recommend that georeferencers add
two columns to their data, one for NamedPlace, followed by one for Extent,
and put these right before MaximumErrorDistance. Do not include an
ExtentUnits field; instead, use the same units as for the
MaximumErrorDistance and the MaxErrorUnits will refer to both measures.
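
For anyone working from tab-delimited text rather than Excel or Access, the column change could be sketched as follows. This is a hypothetical helper, not a MaNIS tool; it assumes the error column is named "MaxErrorDistance", as in the attached file's header, and that every record has the same number of fields as the header.

```python
def add_extent_columns(rows, before="MaxErrorDistance"):
    """Insert empty NamedPlace and Extent columns just before `before`.

    rows: list of lists, header first. Returns a new list of rows.
    """
    header, *records = rows
    i = header.index(before)  # raises ValueError if the column is missing
    new_header = header[:i] + ["NamedPlace", "Extent"] + header[i:]
    new_records = [r[:i] + ["", ""] + r[i:] for r in records]
    return [new_header] + new_records

table = [
    ["LocalityID", "MaxErrorDistance", "MaxErrorUnits"],
    ["13056", "1.009", "mi"],
]
for row in add_extent_columns(table):
    print("\t".join(row))
```

Extent is left without a units column of its own, so it is read in the same units as the maximum error.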
John W
>Hi John,
>
>Here at LSU, we've downloaded all the Louisiana records from the MANIS
>database, and have begun georeferencing, starting with records from Baton
>Rouge (our home turf). We've learned a lot as we've worked through our
>first batch of records, especially from much of the recent email exchanges
>with other institutions, and we really appreciate the ease of use of the
>Error Calculator. We were wondering if you could look over a small (<20
>records) sample of some of the different types of localities we have
>georeferenced, just to see if we are on the right track. Our longest field
>is the LatLongRemarks, where we describe how we located the point and the
>extent that we estimated to calculate error with. We just wanted to make
>sure that you would be able to follow what we did if there are any
>questions with our georeferencing. Should we place the extents in a
>separate field, and if so, should we place it in any particular order with
>respect to the other fields? Let us know if you see any problems.
>
>Many thanks,
>
>XXXXXXX
>**********************************************************
--=====================_-1683450515==_
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: attachment; filename="batonrouge.txt"
"LocalityID" "CollectionCode" "HigherGeog" "SpecLocality" "ElevationText" "MinElev"
"MaxElev" "ElevUnits" "LatText" "LongText" "TRS" "Township" "TownshipDir"
"Range" "RangeDir" "TRSSection" "TRSPart" "DetByAgentID" "DeterminedByPerson"
"DeterminedDate" "DeterminationRef" "OrigCoordSystem" "Datum" "DecLat"
"DecLong" "LatDeg" "LatMin" "LatSec" "LatDir" "LongDeg"
"LongMin" "LongSec" "LongDir" "UTMZone" "UTMEW" "UTMNS" "MaxErrorDistance"
"MaxErrorUnits" "LatLongRemarks" "CaptiveFlag" "NoGeorefBecause" "LocalityAnnotation"
13056 "CAS" "North America, USA, Louisiana" "Briar patch near LSU campus, East Baton Rouge"
"Dinakar Nethi" "1-22-02" "Topozone - gazetteer" "decimal degrees" "NAD27" "30.4141"
"-91.1759"
"1.009" "mi" "center point of LSU Campus obtained from topozone, estimated furthest extent of ""near
LSU campus"" from center as 1 mi" "0"
28636 "FMNH" "USA, Louisiana, Baton Rouge Par" "Baton Rouge"
"Satya Maliakal" "1-23-02" "Topozone - gazetteer" "decimal degrees" "NAD27"
"30.4451" "-91.1867"
"13.009" "mi" "used EBR Parish courthouse as center, furthest extent of BR city limits from
courthouse estimated at 13 mi" "0"
47616 "KU" "U S A, LOUISIANA, EAST BATON ROUGE PARISH" "BATON ROUGE, 5 MI S OF"
"m" "Satya Maliakal"
"1-23-02" "Topozone -1:100,000" "decimal degrees" "NAD27" "30.3725" "-91.1867"
"15.903" "mi" "located point 5mi S of EBR Parish courthouse, furthest extent of BR city limits
from courthouse estimated at 13 mi" "0"
71051 "LSU" "USA, Louisiana, East Baton Rouge Parish" "0.25 mi E jct. Highland and Lee (on
Highland), Baton Rouge" "0" "0"
"Satya Maliakal" "1-28-02" "Topozone -1:25,000" "decimal degrees" "NAD27"
"30.3911" "-90.1562"
"38.412" "m" "located point 0.25 mi E of intersection of Highland and Lee on Highland,
estimated extent of intersection as 10 m" "0"
71121 "LSU" "USA, Louisiana, East Baton Rouge Parish" "1 km S Baton Rouge, intersection Ben
Hur Rd. and Nicholson Rd., E tracks along fence line, 5 m" "0" "0"
"Satya Maliakal" "1-28-02" "Topozone -1:25,000" "decimal degrees" "NAD27"
"30.3841" "-91.1687"
"43.413" "m" "located point at intersection of nicholson drive RR tracks and ben hur road,
assuming that 1 km S of BR refers to this intersection, estimated extent of intersection as 10 m with 5
m offset" "0"
71074 "LSU" "USA, Louisiana, East Baton Rouge Parish" "0.33 mi S of Baton Rouge City Limits on
Highland Rd" "0" "0"
"Satya Maliakal" "1-28-02" "Topozone -1:25,000" "decimal degrees" "NAD27"
"30.3687" "-91.1227"
"38.414" "m" "point located .33 mi S of intersection of Highland Rd. and southern Baton Rouge
Corp. Limit on Highland Road, estimated extent of intersection as 10 m" "0"
71248 "LSU" "USA, Louisiana, East Baton Rouge Parish" "10 mi S Baton Rouge on River Rd"
"16" "16" "meters"
"Satya Maliakal" "1-28-02" "Topozone -1:100,000" "decimal degrees" "NAD27"
"30.3533" "-91.1808"
"14.041" "mi" "located point 10 mi S of EBR courthouse following River Road, furthest extent
of Baton Rouge city limits from courthouse estimated at 13 mi" "0"
71268 "LSU" "USA, Louisiana, East Baton Rouge Parish" "11465 Robin Hood, Baton Rouge"
"0" "0"
"Satya Maliakal" "1-29-02" "Topozone -1:25,000" "decimal degrees" "NAD27"
"30.4555" "-91.0561"
"37.408" "m" "located 11465 Robin Hood with yahoo maps, then located this point with
topozone, estimated extent of property at 10 m" "0"
71243 "LSU" "USA, Louisiana, East Baton Rouge Parish" "10 mi N Baton Rouge, US 61"
"0" "0"
"Satya Maliakal" "1-29-02" "Topozone -1:100,000" "decimal degrees" "NAD27"
"30.5503" "-91.1969"
"14.041" "mi" "located point 10 mi N of BR along US 61 (starting from EBR Parish courthouse
latitude), furthest extent of Baton Rouge city limits estimated at 13 mi" "0"
71511 "LSU" "USA, Louisiana, East Baton Rouge Parish" "3.4 mi E, 1 mi N Baton Rouge on LA 37"
"0" "0"
"Satya Maliakal" "2-13-02" "Topozone -1:25,000" "decimal degrees" "NAD27"
"30.4655" "-91.1329"
"19.819" "mi" "located closest point 3.4 mi E and 1 mi N of EBR courthouse on LA 37, furthest
extent of BR city limits from courthouse estimated at 13 mi" "0"
71294 "LSU" "USA, Louisiana, East Baton Rouge Parish" "2 mi N Baton Rouge on Miss. River"
"0" "0"
"Satya Maliakal" "2-08-02" "Topozone -1:25,000" "decimal degrees" "NAD27"
"30.4733" "-91.1927"
"14.017" "mi" "located point 2 mi N of EBR Parish courthouse following Mississippi River,
furthest extent of BR city limits from courthouse estimated at 13 mi" "0"
71801 "LSU" "USA, Louisiana, East Baton Rouge Parish" "Baton Rouge on River Road"
"16" "16" "meters"
"Satya Maliakal" "2-20-02" "Topozone -1:100,000" "decimal degrees" "NAD27"
"30.3749" "-91.2249"
"5.041" "mi" "located point at center of River Rd. in Baton Rouge, estimated furthest exent of River
Rd. in BR from center at 5 mi" "0"
71802 "LSU" "USA, Louisiana, East Baton Rouge Parish" "Baton Rouge Quad. 15' Sec 51, T7S, R2E"
"45" "45" "feet"
"Satya Maliakal" "2-21-02" "Topozone -1:25,000" "decimal degrees" "NAD27"
"30.4277" "-91.0072"
"4.260" "mi" "located point at center of T7S, R2E (unable to locate Quad. 15' Sec. 51)" "0"
71897 "LSU" "USA, Louisiana, East Baton Rouge Parish" "Baton Rouge, Tulane Ave"
"0" "0"
"Dinakar Nethi" "02-25-02" "Topozone -1:25,000" "decimal degrees" "NAD27" "30.4019"
"-91.1652"
"0.527" "km" "point located at approximate center of Tulane Ave., furthest extent of Tulane avenue
from center point estimated as .5 km" "0"
71821 "LSU" "USA, Louisiana, East Baton Rouge Parish" "Baton Rouge, 2100 Stanford"
"0" "0"
"Dinakar Nethi" "02-08-02" "Topozone - 1:25,000" "decimal degrees" "NAD27" "30.4187"
"-91.1536"
"37.410" "m" "located 2100 Stanford with yahoo maps and then located this point on topozone,
extent of property estimated at 10 m" "0"
--=====================_-1683450515==_
Content-Type: text/plain; charset="us-ascii"; format=flowed
--=====================_-1683450515==_--
>>> Posting number 180, dated 7 Mar 2002 14:15:38
Date: Thu, 7 Mar 2002 14:15:38 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: MaNIS
In-Reply-To: <a05100301b8ad6761b1be@[141.211.110.228]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX and all,
>Hello John,
>My apologies, when I am georeferencing I use the "hide" command under
>"column" in the "format" menu of excel to close down columns that I seldom
>or never use. In this way, I can see the decimal latitude and longitude
>columns, for example, directly next to the locality column on my computer
>screen. I inadvertently forgot to "unhide" a few columns when I sent the
>excel files back to you.
I should have looked for that.
>A question for you: I have some localities where the data is obviously in
>error but cannot be corrected by me. Do you prefer that I reference the
>county center with a note in the locality annotation column, or not
>georeference the locality with a note in the NoGeorefBecause column?
There are two different classes of locality errors that you need to worry
about, those with internal inconsistencies that make the locality
impossible to determine (e.g., Hogback Creek, Inyo County - there are two
of these), and those that have an obvious error that can be corrected
unambiguously (e.g., Needles, Mojave Co., California - Mojave Co. is in
Arizona and Needles is in San Bernardino Co, California).
If there is an internal inconsistency in the locality information that
makes the locality impossible to determine unambiguously, do not provide
coordinates and error, but do put something like "internal inconsistency"
in the NoGeorefBecause field and explain the problem in the
LocalityAnnotation field (e.g., "there are two Hogback Creeks in Inyo
Co."). When the source institution gets the georeferenced data back,
they'll be able to see what the problem was for each locality that was not
georeferenced.
If there is an obvious error that doesn't make the georeferencing
ambiguous, go ahead and georeference the locality, but put your assumptions
in the LatLongRemarks field and definitely point out the error in the
LocalityAnnotation field. The source institution will be able to see what
your assumptions were and they'll be able to fix the errors you uncovered.
In summary, LatLongRemarks should be filled with information about how you
georeferenced, LocalityAnnotation should be filled with information about
errors or ambiguities - intended for the source institution, and
NoGeorefBecause should be a brief phrase describing your reason for not
georeferencing a locality (e.g., "internal inconsistency", "too vague", "no
specific locality").
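
The field usage summarized above can be written down as a small decision sketch. The function and the "kind" labels are hypothetical and exist only for illustration; the field names and the example phrases come from the message.

```python
def problem_locality_fields(kind, annotation, assumptions=""):
    """Return the values to enter for a locality whose description has an error.

    kind: "ambiguous"   - internally inconsistent, cannot be determined
          "correctable" - an obvious error that can be corrected unambiguously
    """
    if kind == "ambiguous":
        # No coordinates or error; explain the problem for the source institution.
        return {"georeference": False,
                "NoGeorefBecause": "internal inconsistency",
                "LocalityAnnotation": annotation,
                "LatLongRemarks": ""}
    if kind == "correctable":
        # Georeference anyway; record assumptions and point out the error.
        return {"georeference": True,
                "NoGeorefBecause": "",
                "LocalityAnnotation": annotation,
                "LatLongRemarks": assumptions}
    raise ValueError(f"unknown kind: {kind!r}")

print(problem_locality_fields("ambiguous", "there are two Hogback Creeks in Inyo Co."))
```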
John W
>>> Posting number 181, dated 9 Mar 2002 11:19:22
Date: Sat, 9 Mar 2002 11:19:22 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Some other useful Excel operations for MaNIS work
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
In addition to Hide columns, some other Excel operations I have found
useful are:
1. AutoFilter (similar to Access):
First select a column or columns, then choose
Data Menu>AutoFilter>select (Custom...) from the scrollable pick list>pick
contains and enter data of interest.
Using custom contains filtering, you can pull out all records for a county
from the backward HighGeo field or get all occurrences of a placename in
SpecLoc. Records can be worked with as desired.
Show All just under AutoFilter on the Data menu brings all records back.
2. Protect Worksheet: This will prevent inadvertent changes to MaNIS
records handed down from the mount, but cells, columns, or rows can be left
open for data entry if you first select them, then under Format
cells>Protection tab>click unlocked. Once a worksheet is locked you can
enter data manually or automatically (e.g., DecLat, DecLong, error) but still
lock out changes to the locality fields. Protecting disables the Sort
capability.
3. LookUp:
Works great for dynamic lookup (as you type) and automatic assignment of
data like a placename lat/long from another list like the GNIS download.
With about 5000 of these links in the Oregon records, my machine (196 MB
RAM) starts to bog down. To get rid of the links but retain the data, do
a Copy, Paste Special, click Value.
I've been using LookUp in four columns after LocAnnotation. I enter a
placename (winnowed by user), which is then looked up, and values for GNIS
placename, type of locality, county, and DecLat, DecLong are returned.
Placename, type, and county are for user verification, and lat & long are for
computing lat/longs based on offsets.
4. Concatenation: For a text field this is done with "&", e.g., columns A,
B, C can be appended to D with
=D:D&", "&A:A&", "&B:B&", "&C:C
Enter this in the first field, then fill down as needed. Used to add misc.
notes to memo fields of MaNIS.
You can flip the HighGeo to have county first for sorting by doing a Text
to columns (Data menu), then concatenating the columns with the county
column first. Of course leave the original HighGeo unaltered.
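
Outside Excel, the same split-and-rejoin idea might look like the sketch below. It assumes the county is the last comma-separated element of the higher geography string, which holds for values such as "USA, Louisiana, East Baton Rouge Parish" but not necessarily for every institution's records. The function name is hypothetical.

```python
def county_first(higher_geog, sep=", "):
    """Return a sort key with the last element (assumed to be the county) moved to the front.

    The original value is not modified; this builds a new string.
    """
    parts = higher_geog.split(sep)
    return sep.join([parts[-1]] + parts[:-1])

print(county_first("USA, Louisiana, East Baton Rouge Parish"))
# East Baton Rouge Parish, USA, Louisiana
```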
When you get tired of these, there is the underlying Visual Basic macro
editor which is fun if you like that sort of thing.
I'll probably stick with Excel through the project due to our "Mac-enabled"
status in the museum. I use Windows at home and will in the museum as soon as
our server arrives.
>>> Posting number 182, dated 11 Mar 2002 12:02:55
Date: Mon, 11 Mar 2002 12:02:55 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Sending Data from MSU
In-Reply-To: <3.0.32.20020311144354.006e023c@pilot.msu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX, and all,
>XXXX and I have data from a few Michigan counties to send to you. So far,
>we have Access.mdb files for Barry, Branch, and Muskegon ready to go, and
>Kent, Ionia, and Montcalm are forthcoming. We have two questions for you:
>
>1. We minimize the width (what I call "closing up") of many columns on the
>template (basically ones that we don't fill in with data, or don't want to
>look at). Do you want us to open these columns back up before we send the
>file to you?
Nope. They're fine all closed up.
>2. Do you have a preference for how often we send files to you? (Aren't
>you getting bombarded with georeferencing data??)
Yes, the deluge has begun. Well, it's best to have the work backed up, so
it seems that you should send them as you finish them. Keep a copy on your
end too, for the sake of safety - you never know when we'll get hit by "the
Big One." To minimize the threat of loss, it's probably best to upload
them as described in the Georeferencing Steps document (i.e., ftp to
galaxy.cs.berkeley.edu/incoming/mvz). Then send me messages as they arrive
safely. Of course, if you are sending Excel (.xls) or Access (.mdb) files,
you don't need to export as tab-delimited text and you should change the
file type to binary when ftp-ing.
>Thanks,
>XXXX
>
>P.S. Thanks for "secretly" adding the NamedPlace and Extent fields to the
>template. (We moved them over next to the MaxError column in our tables).
OK, the secret is out. For those of you who may not be aware of it, there
is an Access Database template for georeferencing that can be accessed
through a link in Step Five on the GeorefSteps document at the following URL:
http://dlp.cs.berkeley.edu/manis/GeorefSteps.html
>>> Posting number 183, dated 11 Mar 2002 13:49:55
Date: Mon, 11 Mar 2002 13:49:55 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: MaNIS Servers
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
I've been asked a couple of times about making hardware substitutions in
the Equipment portion of MaNIS subcontract budgets. The bottom line is that
each institution must have, when the time comes to connect to the network,
a DEDICATED machine with the specifications highlighted in my 25 Feb
message "MaNIS Server recommendations." Dedicated means that the sole
purpose of the machine is to support data provision to the network. Beyond
that, I'm not picky.
John W.
>>> Posting number 184, dated 12 Mar 2002 14:45:06
>>> Posting number 185, dated 19 Mar 2002 10:46:55
Date: Tue, 19 Mar 2002 10:46:55 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Fwd: fraction format in the error calculator
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear XXXX, and all,
I'm glad you uncovered this bug. The error calculator is actually not as
smart as you expected it to be. The discrepancy you're experiencing arises
because the calculator interprets 1/2 as 1, ignoring everything after the
/. Therefore, please use only decimals or whole numbers in the Offset
Distance and Extent of Named Place fields.
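
Until the calculator handles fractions, a value such as 1/2 has to be converted to a decimal by hand before it is entered. A minimal sketch of that conversion, using Python's standard fractions module (the helper name is hypothetical and this is not the calculator's own code):

```python
from fractions import Fraction

def to_decimal(text):
    """Convert '1/2', '3/8', '0.5' or '2' to a float.

    Mixed numbers such as '1 1/2' are not handled and raise ValueError.
    """
    return float(Fraction(text.strip()))

print(to_decimal("1/2"))  # 0.5
print(to_decimal("3/8"))  # 0.375
```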
John
>
>Hi John
>
>I notice that maximum error is noticeably affected by the format of the
>extent entered on the error calculator if the extent contains a
>fraction. Since the extent field accepts both decimal and common
>fractions, I experimented with 0.5 and 1/2 for the locality of 3/8 mi. N
>of Casnovia, Kent County, MI. I approached the situation "by road," used
>decimal degrees on Topozone, and obtained the coordinates of 43.2401 and
>-85.7901. Datum is NAD27; coordinate precision, 0.0001; coordinate
>source, USGS map 1:25,000. Distance precision of 1/8 was selected from
>the drop-down. When the extent of the bounding box is expressed as 0.5 (a
>logical choice for TopoZone users), the maximum error is 0.641; but when
>it is expressed as 1/2 (in keeping with the format of distance precision),
>maximum error is 1.141.
>
>Depending on the extent, one format may be easier to use than the
>other. However, if both formats are allowed by the calculator but only
>one yields the desired maximum error, shouldn't the field be restricted to
>that format? [Actually now I believe the extent is slightly less than 0.5
>miles, but remain curious about the discrepancy.] Again, your assistance
>will be greatly appreciated.
>
>XXXXX
>>> Posting number 186, dated 21 Mar 2002 15:51:25
Date: Thu, 21 Mar 2002 15:51:25 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: georeferencing rivers
In-Reply-To: <Pine.OSF.4.33.0203211410400.8199-100000@aurora.uaf.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX and all,
These are good questions. I'll put the answers right below each one.
>1. When I georeference rivers, should I take coordinates of the source or
>the drainage of the river? How much should extent of the river be?
The coordinates should be at the geographic center of the river, on the
river itself. The extent should be the distance to the furthest reach of
the river in either direction.
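
If the course of the river is available as a list of digitized points, the rule above could be approximated as in this sketch. It makes two assumptions that are not in the message: "geographic center" is taken to be the vertex nearest the halfway point along the course, and distances are great-circle distances on a sphere of radius 6371 km.

```python
import math

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def river_center_and_extent(points):
    """points: the river course as ordered (lat, lon) vertices.

    Returns the vertex nearest the halfway point along the course, and the
    straight-line distance from it to the farthest vertex of the river.
    """
    cumulative = [0.0]
    for a, b in zip(points, points[1:]):
        cumulative.append(cumulative[-1] + haversine_km(a, b))
    half = cumulative[-1] / 2
    i = min(range(len(points)), key=lambda k: abs(cumulative[k] - half))
    center = points[i]
    extent = max(haversine_km(center, p) for p in points)
    return center, extent
```

Because the center is always one of the supplied vertices, it stays on the river itself, as the answer requires.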
>2. An example: specific locality is "Brooks Range, Anaktiktoot", where
>Anaktiktoot is not on the map. Should I georeference for Brooks Range
>(which will be more than 600 miles in length)? There are many cases that
>higher geography is followed by unknown specific locality.
You should go ahead and put coordinates on the vague localities, even
though the maximum_error_distance will be large. Some of the higher
geographies that have no value or "no specific locality" in the locality
field can still be specific, such as islands.
>3. Related to my question 2: how much is too big to georeference? In many
>cases, only the name of the island, mountains, peninsula etc. are
>provided.
Do them all. The maximum_error_distance will be useful even if it is large.
John
>>> Posting number 187, dated 30 Mar 2002 09:00:41
Date: Sat, 30 Mar 2002 09:00:41 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: UAM declat/longs truncated in MaNIS?
Mime-Version: 1.0
Content-Type: text/plain; format=flowed
John: It looks like the UAM records in the gazetteer have the same problem
that KU's records had -- declat/longs only go to two decimals. KU (XXX
XXXX) asked me to recompute KU's Oregon so I am overwriting calculated
declat/longs. Please advise on UAM records - there are several hundred.
Examples:
LocalityID  CollectionCode  Datum         DecLat   DecLong    LatDeg  LatMin  LatSec  LatDir  LongDeg  LongMin  LongSec  LongDir
186407      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       17       W
186662      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       10       W
186663      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       1        W
186721      UAM             not recorded  45.1600  -123.7300  45      10      1       N       123      44       6        W
186731      UAM             not recorded  45.2100  -123.6400  45      13      1       N       123      38       42       W
186514      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       32       W
186515      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       21       W
186516      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       2        W
186556      UAM             not recorded  44.2800  -123.7600  44      17      2       N       123      46       2        W
186557      UAM             not recorded  44.2800  -123.7500  44      17      2       N       123      45       2        W
186689      UAM             not recorded  45.3300  -123.7800  45      20      2       N       123      47       2        W
186690      UAM             not recorded  45.3300  -123.6400  45      20      2       N       123      38       49       W
186691      UAM             not recorded  45.3300  -123.6300  45      20      2       N       123      38       2        W
>>> Posting number 188, dated 1 Apr 2002 14:19:03
Date: Mon, 1 Apr 2002 14:19:03 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: MaNIS questions
In-Reply-To: <5.1.0.14.0.20020327144722.01df95c0@mail.fmnh.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear XXXX, and all,
I know that Barbara made a preliminary answer to the questions raised here.
I'll try to add a few points of explanation from which everyone on the list
might benefit.
I agree with Barbara's statement of the georeferencing priorities within
the MaNIS context. To summarize them, the MaNIS grant covers (only)
complete georeferencing for localities that have no lat_longs. Our hope is
that, through innovation and properly-guided cooperation, we will be able
to follow through on our promise to finish this. In fact, we hope that we
will be able to refine the process and the tools enough to actually get
ahead of the game. If we do get ahead, we will be able to turn our
attention next to those localities for which lat_longs exist without
supporting metadata.
I know we all have the desire to have consistent data quality, especially
when faced with making those data public. Within the context of our
project, however, cleaning up locality descriptions is neither covered, nor
is it recommended. Every change made to locality descriptions on your end
since the data were collected for the MaNIS gazetteer has the potential to
confound the process of properly reconnecting the georeferenced localities
with specimens in your database.
I have not yet explained the reconnecting part of the process, thinking
that what I've presented thus far is enough to swallow for the time being.
Perhaps a brief synopsis now would be of use to illustrate the potential
complications and to get people to think about the future of locality data
in institutional databases.
In the MaNIS gazetteer I have rendered unique occurrences of localities by
institution. These you can query on and see as results in the online MaNIS
gazetteer. Behind the scenes there is another table to cross-reference
unique localities to specimens. The specimens are linked to the localities
(and hence to the coordinates and metadata that georeferencing provides)
based on the locality string. Thus, if you change the locality string in
your database, it will not match the locality string for the same specimen
in the gazetteer. This is the crux of the issue, so it is important to
understand when it matters, and when it doesn't.
If the locality string in your database doesn't match the locality string
in the MaNIS gazetteer, but the locality really is exactly the same place
and would get the same coordinates when georeferenced, then the change
doesn't matter - the specimen will get the correct coordinates anyway.
However, if the change in your database effectively changes the place that
is described (resulting in different coordinates when georeferenced) then
the change DOES matter - it is what I have elsewhere called "substantive."
If a substantive change is made in your database and I apply the
georeferenced coordinates to the specimens that once referred to that
locality, the georeferenced data will be wrong. Therefore, there needs to
be a verification process when re-associating georeferenced localities with
individual databases. There are two steps to this process. The first is to
determine if the locality string in your database is the same as that in
the gazetteer. For all of those localities for which the locality strings
match, the georeferenced data can go into your database automatically, no
fuss, no questions asked. For the rest of the georeferenced localities from
the gazetteer, a comparison will have to be made between the then-current
locality and the georeferenced locality to determine if they still refer to
the same place. Imagine putting a check mark by each pair that still match.
The amount of checking to be done in this step is directly determined by
the number of changes you make to your locality strings between the time
when I collected the data for the gazetteer and the time when the data go
back into your database. Clearly, fewer changes mean less checking.
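
The two-step verification described above could be sketched like this. It is an illustration only, not the actual MaNIS reconnection procedure; it assumes each side can be represented as a mapping from a specimen identifier to its locality string, and the function name is hypothetical.

```python
def split_for_reconnection(gazetteer, current):
    """Separate specimens into those that can be updated automatically and
    those that need a manual comparison.

    gazetteer: {specimen_id: locality string as captured for the gazetteer}
    current:   {specimen_id: locality string now in the institution's database}
    """
    auto, review = [], []
    for specimen_id, georeferenced in gazetteer.items():
        if current.get(specimen_id) == georeferenced:
            auto.append(specimen_id)    # step 1: strings match exactly
        else:
            review.append(specimen_id)  # step 2: a person checks the pair
    return auto, review

gazetteer = {1: "Baton Rouge", 2: "Needles, Mojave Co., California"}
current = {1: "Baton Rouge", 2: "Needles, San Bernardino Co., California"}
print(split_for_reconnection(gazetteer, current))  # ([1], [2])
```

The review list grows with every locality string edited after the gazetteer snapshot, which is why fewer changes mean less checking.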
OK? Take a breath. Now, a topic for rumination as the project progresses.
Start thinking about incorporating the georeferenced coordinates and
metadata into your individual databases. Not one of the participating
institutions currently has the structure in its database to capture all of
the metadata we are gathering. It would be nice if we all could. We don't
want to throw away all of this hard work after all.
John W
>>> Posting number 189, dated 1 Apr 2002 15:37:38
Date: Mon, 1 Apr 2002 15:37:38 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: UAM declat/longs truncated in MaNIS?
In-Reply-To: <F569gG8WPbLgJyAUypU000104d9@hotmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX and XXXXX,
The problem is not exactly the same. UAM has both decimal lat_long and
degrees minutes seconds in its database. The decimal lat_longs often have
only two decimal places when there are fully specified degrees minutes
seconds, but this shouldn't affect what you're doing unless you want to
copy and paste lat_longs that UAM had already done to localities for other
institutions. If that's the case, recompute the decimal lat_longs for UAM
using the degrees minutes seconds values where the OrigCoordSystem is "deg.
min. sec."
XXXXX, you may want to put XXXX on recomputing decimal lat_longs for the
conditions described above.
General Reminder: Lat_Long recomputations should not be on MaNIS time
until/unless we finish the georeferencing of localities without lat_longs.
>John: It looks like the UAM records in the gazetteer have the same problem
>that KU's records had -- declat/longs only go to two decimals. KU (XXX
>XXXX) asked me to recompute KU's Oregon so I am overwriting calculated
>declat/longs. Please advise on UAM records - there are several hundred.
>
>Examples:
>LocalityID  CollectionCode  Datum         DecLat   DecLong    LatDeg  LatMin  LatSec  LatDir  LongDeg  LongMin  LongSec  LongDir
>186407      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       17       W
>186662      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       10       W
>186663      UAM             not recorded  45.2600  -123.8800  45      16      1       N       123      53       1        W
>186721      UAM             not recorded  45.1600  -123.7300  45      10      1       N       123      44       6        W
>186731      UAM             not recorded  45.2100  -123.6400  45      13      1       N       123      38       42       W
>186514      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       32       W
>186515      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       21       W
>186516      UAM             not recorded  44.2300  -123.8000  44      14      2       N       123      48       2        W
>186556      UAM             not recorded  44.2800  -123.7600  44      17      2       N       123      46       2        W
>186557      UAM             not recorded  44.2800  -123.7500  44      17      2       N       123      45       2        W
>186689      UAM             not recorded  45.3300  -123.7800  45      20      2       N       123      47       2        W
>186690      UAM             not recorded  45.3300  -123.6400  45      20      2       N       123      38       49       W
>186691      UAM             not recorded  45.3300  -123.6300  45      20      2       N       123      38       2        W
>
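The recomputation asked for in this posting is the standard degrees/minutes/seconds to decimal degrees conversion. A minimal sketch follows; the field names are taken from the example rows quoted above, while the function names and the dictionary-per-row layout are assumptions for illustration.

```python
def dms_to_decimal(degrees, minutes, seconds, direction):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # South latitudes and west longitudes are negative in decimal notation.
    return -value if direction.upper() in ("S", "W") else value


def recompute_decimal(record):
    """Return (DecLat, DecLong) recomputed from the DMS fields of one row."""
    lat = dms_to_decimal(record["LatDeg"], record["LatMin"],
                         record["LatSec"], record["LatDir"])
    lon = dms_to_decimal(record["LongDeg"], record["LongMin"],
                         record["LongSec"], record["LongDir"])
    return lat, lon
```

For LocalityID 186407 (45 16 1 N, 123 53 17 W) this gives a latitude near 45.2669 and a longitude near -123.8881, against the stored 45.2600 and -123.8800.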
>>> Posting number 190, dated 1 Apr 2002 16:47:19
Date: Mon, 1 Apr 2002 16:47:19 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: MaNIS questions
In-Reply-To: <5.0.0.25.2.20020401125307.024018f0@socrates.berkeley.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Fellow MANES:
John's message closed with this statement:
"Not one of the participating
institutions currently has the structure in its database to capture all of
the metadata we are gathering. It would be nice if we all could. We don't
want to throw away all of this hard work after all."
My response: It has been a surprise to find ourselves dealing with the
topic of error estimates, etc in lat/long data, since that was not part of
the original scope of the project. And indeed (in light of the above
quote) we do not have a capacity to absorb such information into our
present databases, let alone deciding how much time we have to care about
this. Seeing the impact of the request for so much attention to error
estimates, I find it hard to support so much allocation of additional time
to this effort.
I have witnessed, over the years, many publications based on massive
datasets in which the authors were not able to document (or even care)
about variance in the quality and accuracy of the data. Typically, they
just put on their blinders and accepted all the "AVAILABLE" data. This is
just an inherent problem for those who move up the scale (allometric
analyses, macroecology, or whatever), and at such LARGE scales of analyses
they usually say that small local errors become insignificant, because of
the LARGE SCALE of the overall analysis.
I hope we can strike a balance here and get the big data entry and
conversion project done. I don't want to see the project slowed down by
such a big commitment to accounting for aspects of the data (and the
corresponding time commitment) that were not built in to our original
estimates of what it would take to carry out the project.
Is this a helpful comment?
>>> Posting number 191, dated 1 Apr 2002 17:01:39
Date: Mon, 1 Apr 2002 17:01:39 -0900
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Organization: University of Alaska Museum
Subject: Re: MaNIS questions
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------4C7E03390063999F5E48C0EE"
UAM's online database (along with MVZ's) is displaying error estimates through
the Berkeley Digital Library Project's GIS viewer. I assume that the
"finished" MaNIS project could look about the same. That is, error estimates
will be a prominent and critical feature of the system. Given that the GIS
viewer will map data points over satellite photos of much of the U.S., the
precision associated with the data points is critical. The implication of "no
error" on such a fine-scale GIS layer is that the specimen came from a
specific tree or bush! Our database contains max_errors from as small as a
few meters to as large as several tens of kilometers. These are not arcane
details.
XXXXXX
>>> Posting number 192, dated 2 Apr 2002 11:09:56
Date: Tue, 2 Apr 2002 11:09:56 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Lat_Long metadata
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Oops, my mistake. There IS a collection with the structure to capture all
of the metadata. Two others, UAM and MVZ, have everything except "Extent of
Named Place."
Thanks XXXX, bright spot appreciated.
John W
>X-Sender: carlak@mail.bishopmuseum.org
>X-Mailer: QUALCOMM Windows Eudora Version 5.0.2
>Date: Tue, 02 Apr 2002 08:39:08 -1000
>To: John Wieczorek <tuco@socrates.Berkeley.EDU>
>From:
>Subject:
>
>FYI: in reference to your statement below....................
>
>Start thinking about incorporating the georeferenced coordinates and
>metadata into your individual databases. Not one of the participating
>institutions currently has the structure in its database to capture all of
>the metadata we are gathering. It would be nice if we all could. We don't
>want to throw away all of this hard work after all.
>
>Here's a bright spot to your day: I have incorporated the MANIS locality
>structure into my Locality table and will thus be saving all the metadata
>for the BPBM specimens and for all new specimens into the collection that
>are completely georeferenced.
>
>XXXX
>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> Posting number 193, dated 2 Apr 2002 12:00:20
Date: Tue, 2 Apr 2002 12:00:20 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: georeferencing rivers
In-Reply-To: <.20020401170449.0099fc90@pilot.msu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX, and all,
First, I want to apologize for having given contradictory opinions on how
these vague localities should be treated. I stated at least once in the
past that we shouldn't bother with these kinds of localities. However, that
opinion was not based on unassailable logic. In both of the circumstances
described below in Robin's message the coordinates will be of limited
utility due to their very large maximum error. Nevertheless, providing the
coordinates and maximum error will allow the user to determine the extent
to which they ARE useful.
In replying to XXXX I first expressed the opinion that we should provide
maximum errors even in the truly vague cases. My unstated personal
justification for that opinion was that it makes the rules simpler. More
philosophically, by georeferencing all non-contradictory localities, we
don't need to answer the question "How big of an area is too vague?" We
cannot fully anticipate all of the uses to which the data will be put, so
we don't really have a basis on which to make that judgement. A locality
with coordinates and a maximum error distance is always more useful than a
locality without them. End of apology.
Now, back to the questions.
>John:
>
>XXXXX's questions and your responses prompted additional questions re:
>georeferencing rivers and vague localities.
>
>1. Is it correct to assume that when one measures the length of a river
>to determine its geographic center the river's possibly winding path is
>taken into consideration; however, the extent is determined "as the crow
>flies" from the geographic center to the furthest reach?
You don't need to know the length of the river to determine its geographic
center, you need only take the means of the extremes of latitude and
longitude encompassing it. After that, you need to find the point on the
river nearest the geographic center. From there, the extent would be the
distance to the furthest point on the river.
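The river procedure just described (mean of the extremes, nearest point on the river, distance to the furthest point) might be sketched as below. The sketch assumes the river is available as a list of digitized (lat, lon) vertices and uses a great-circle distance on a spherical Earth; both are assumptions of the example, as are the function names. It only considers the supplied vertices and does not handle rivers that cross the 180th meridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def georeference_river(points):
    """Return (coordinates on the river, extent in km) for a list of vertices."""
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    # Geographic center: means of the extremes of latitude and longitude.
    center = ((min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2)
    # The coordinates must lie on the river, so take the vertex nearest the center.
    on_river = min(points, key=lambda p: haversine_km(center, p))
    # Extent: straight-line distance from that point to the furthest vertex.
    extent_km = max(haversine_km(on_river, p) for p in points)
    return on_river, extent_km
```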
>2. Should we put coordinates on the following vague locality:
>
>HigherGeog: Michigan, Barry County
>SpecLocality: "no specific locality recorded"
>
>XXXX and I have not georeferenced such localities thus far, but it
>appears from your response that county center coordinates and the extent
>of Barry County should be provided.
Yes. These should be georeferenced. However, there isn't really a need for
you to do it. Such localities can be georeferenced automatically from a
table of county centroids when we're all done. In retrospect, it would have
probably been useful for me to do that before making the gazetteer
"public," but I didn't think it worth the delay at the time.
John W
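The automatic pass over county-only localities mentioned above could look something like the following. This is a sketch under stated assumptions: the centroid table is a plain dictionary keyed on the HigherGeog string, each entry holds (lat, lon, extent), and the output field names DecLat, DecLong and Extent are chosen for the example. No real centroid values are supplied here.

```python
VAGUE_LOCALITIES = {"", "no specific locality recorded"}


def georeference_county_only(records, county_centroids):
    """Fill coordinates for records whose specific locality is empty or vague.

    records:          list of dicts with "HigherGeog" and "SpecLocality" keys.
    county_centroids: dict of HigherGeog -> (lat, lon, extent), hypothetical.
    """
    result = []
    for rec in records:
        spec = rec["SpecLocality"].strip().strip('"').lower()
        if spec in VAGUE_LOCALITIES and rec["HigherGeog"] in county_centroids:
            lat, lon, extent = county_centroids[rec["HigherGeog"]]
            # Copy the record so the caller's data are left untouched.
            rec = dict(rec, DecLat=lat, DecLong=lon, Extent=extent)
        result.append(rec)
    return result
```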
>>XXXX and all,
>>
>>These are good questions. I'll put the answers right below each one.
>>
>>>1. When I georeference rivers, should I take coordinates of the source or
>>>the drainage of the river? How much should extent of the river be?
>>
>>The coordinates should be at the geographic center of the river, on the
>>river itself. The extent should be the distance to the furthest reach of
>>the river in either direction.
>>
>>>2. An example: specific locality is "Brooks Range, Anaktiktoot", where
>>>Anaktiktoot is not on the map. Should I georeference for Brooks Range
>>>(which will be more than 600 miles in length)? There are many cases that
>>>higher geography is followed by unknown specific locality.
>>
>>You should go ahead and put coordinates on the vague localities, even
>>though the maximum_error_distance will be large. Some of the higher
>>geographies that have no value or "no specific locality" in the locality
>>field can still be specific, such as islands.
>>
>>>3. Related to my question 2: how much is too big to georeference? In many
>>>cases, only the name of the island, mountains, peninsula etc. are
>>>provided.
>>
>>Do them all. The maximum_error_number will be useful even if it is large.
>>
>>John
>
>>> Posting number 194, dated 2 Apr 2002 21:23:13
Date: Tue, 2 Apr 2002 21:23:13 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Barbara R. Stein" <bstein@OZ.NET>
Subject: Re: MaNIS questions
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="------------8D04441FBD1587A8D66E30D2"
Dear XXX et al.,
From the outset, this project has proceeded, and proceeded successfully,
because we have all been "on the same page." Your email (see below) provides
an opportunity to reiterate what we said we were going to do, what we intend
to do, and exactly why we are doing it as stated.
John and I (particularly John) are extremely grateful to those of you who have
immersed yourselves in the intricacies of georeferencing and have been willing
to share your thoughts and insights with the list. However, such discussions
in and of themselves have not added to the work load that was initially
budgeted or funded. Quite to the contrary, both the "Coordinate
Georeferencing Activities" and "Implement Specimen Data Model" sections in the
MaNIS Project Description described providing georeferencing metadata as well
as the coordinates. And we stated emphatically,
"Well-documented, georeferenced collecting events are crucial to biogeographic
data...."
This is exactly what we are doing.
The error calculator and spreadsheet templates that John provided make the
addition of metadata such as lat/long error a relatively trivial exercise and
one that should not be confused with the discussion of such topics on this
list. Several individuals have chosen to probe that tool more closely and we
have all benefited from their interest and experimentation. Their comments
have enhanced our understanding of the process and the resulting data, and
improved the tool, but they have not created more work.
Where confusion may have arisen is in the following:
> And indeed (in light of the above
> quote) we do not have a capacity to absorb such information into our
> present databases, let alone deciding how much time we have to care about
> this. Seeing the impact of the request for so much attention to error
> estimates, I find it hard to support so much allocation of additional time
> to this effort.
It is not your job to incorporate such information into your present databases
and we apologize for any confusion that John might have engendered in his
previous email. This is a topic we will be discussing at our meeting at ASM
in June but perhaps it is worth clarifying now what John was intimating when
he made reference to this issue.
Think of your current dbms in two parts, the databases themselves and the
interfaces you now use to input, query and display those data in-house. For
most of you, neither your databases nor your interfaces are currently designed
to handle any new fields (e.g., lat/long error). However, we are expending a
great deal of time and effort to collect such data and want to make them
available to researchers. Whereas it is a fairly tricky task (given
constraints of time and budget) to modify each of your interfaces to add new
fields, it is relatively easy to add those fields to your current databases
and migrate the data directly to the MaNIS servers along with your specimen
data. This will happen when John writes the migration scripts for each of
your institutions. Hence, the data will be displayed over the network and
available to you without impacting your current set-ups in-house. In raising
this issue, he was merely letting you know that we are, in fact, moving ahead
and beginning to work on the next step of the project, creating the migration
scripts and software that will make the network function.
> I have witnessed, over the years, many publications based on massive
> datasets in which the authors were not able to document (or even care)
> about variance in the quality and accuracy of the data. Typically, they
> just put on their blinders and accepted all the "AVAILABLE" data. This is
> just an inherent problem for those who move up the scale (allometric
> analyses, macroecology, or whatever), and at such LARGE scales of analyses
> they usually say that small local errors become insignificant, because of
> the LARGE SCALE of the overall analysis.
Here I will part company with XXX and argue that it is our intention to do
better than what has always been done or has been done previously. Neither
John nor I see this "inherent problem," particularly with the advent of
increased computing technology. I participated in one of the planning
workshops for NEON (National Ecological Observatories Network) two years ago
and I can state unequivocally that the standard is changing/has changed. The
kinds of publications to which XXX refers will no longer be acceptable (if
they even are at this time) because it is possible to document variance in
quality and accuracy of data, even for extremely large datasets. Furthermore,
we believe we have designed the georeferencing protocol to do just that,
with relatively little overhead and impact to the participating institutions.
At this point everyone has at least begun the georeferencing process and from
what we can gather, once initial inertia is overcome, things actually progress
quite smoothly and quickly. I may be premature in saying so, but it is our
hope that MVZ will have completed georeferencing the ca. 40,000+ localities
for California in the next two months. How have we done this? I would remind
each of you that our first priority is to provide georeferenced data to those
localities in our collections that currently have none! It is not to add
error to localities that already have lat/long coordinates assigned to them,
it is not to verify already georeferenced localities, and it is not to clean
up locality descriptions. Our budget figures were based on the number of
unique localities in our collections that lacked lat/long coordinates of any
sort. I would also add, that while we cannot dictate whom you hire to do
georeferencing, your money will go lots farther if you hire undergraduates,
and it will go farthest if you hire work-study students.
We have all taken the first giant step. What is needed now is to just keep
putting one foot in front of the other. I guarantee you will amaze
yourselves.
Best,
Barbara
>>> Posting number 195, dated 3 Apr 2002 08:38:21
>>> Posting number 196, dated 3 Apr 2002 10:52:59
Date: Wed, 3 Apr 2002 10:52:59 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Contemporary informatics science, etc.
In-Reply-To: <3CAA91C1.79E00185@oz.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Barbara et al.,
I appreciate the comments and forum that exist among our Manis group, and
I thank Barbara for her most recent. I also agree that the developing
field of informatics is helping us to raise the bar on scientific
standards in general, and I don't wish my comments to be taken as an
endorsement of the crudeness of broad synthetic work done in the past
(without error estimates). I also realize that for the many data fields
that we have entered into our XXXX mammal database (other than lat/long)
we will probably continue without error estimates for some time to come.
On the other hand we can only await the further development of these kinds
of massive data management projects in the future, assuming that financial
resources will remain available for this kind of thing. It will be great
if we can be surprised by continued improvements in the overall quality of
the data that stand behind the specimens we hold in our collections. I
obviously remain committed to assuring that we get our job done on this
current project.
XXX
>>> Posting number 197, dated 5 Apr 2002 15:49:39
Date: Fri, 5 Apr 2002 15:49:39 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear all,
In late February when I was fixing my mistake with the UWBM Lat_Longs I
mentioned that I would be reloading ROM data at some time as well. That
time has come. The new ROM data have now been loaded into the gazetteer.
What does this mean for you? If you haven't begun georeferencing yet
(though as far as I know, everyone has), you just need to download your
localities again and proceed as described in the Georeferencing Steps
document ( http://dlp.CS.Berkeley.EDU/manis/GeorefSteps.html ). If you have
downloaded localities and started georeferencing them, first you need to
remove any ROM records from the set. Next make another query in the MaNIS
gazetteer just like the original query that gave you the records you are
working on, but this time pick ROM in the Institution box on the MaNIS
Gazetteer page to get only ROM records for that combination of higher
geography. Download these ROM records and append them to the end of the
file you are working on.
Sorry for this inconvenience. I'm pretty sure I've got everything correct
now and that this kind of thing won't happen any more. So, everyone,
proceed with confidence.
My next undertaking will be to write the documentation for a new Calculator
that can calculate not only errors, but also coordinates. This calculator
will be VERY similar to the Error Calculator, so there won't be much new to
learn. The new calculator has already been tested; the results agree with
those given by Gary Shugart's Excel tool for the same localities. This is
good. I'll announce the new calculator as soon as I've posted the manual
for it, which should be next Friday or so after I return from San Diego.
Happy georeferencing!
John W
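For anyone handling the ROM reload in a script rather than by hand, the steps in this posting (drop the old ROM records from the working set, then append the freshly downloaded ones) amount to something like the sketch below. The function name and the use of a "CollectionCode" field to identify the institution are assumptions for illustration.

```python
def refresh_institution(working, fresh, code="ROM", field="CollectionCode"):
    """Replace one institution's records in a working set.

    working: list of dicts already being georeferenced.
    fresh:   list of dicts newly downloaded from the gazetteer for the same
             combination of higher geography.
    """
    # Step 1: remove any records of the reloaded institution from the set.
    kept = [rec for rec in working if rec[field] != code]
    # Step 2: append the newly downloaded records to the end of the file.
    return kept + [rec for rec in fresh if rec[field] == code]
```

Records from other institutions keep their existing order and any georeferencing work already done on them.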
>>> Posting number 198, dated 5 Apr 2002 17:47:21
>>> Posting number 199, dated 15 Apr 2002 16:52:52
Date: Mon, 15 Apr 2002 16:52:52 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: GNIS Website Gazeteering
Hello All,
I am a recent addition to the group, and I have thrown myself headlong into
the midst, hopefully well.
That said, I do have a question about a source. I am using the USGS GNIS
website http://geonames.usgs.gov/pls/gnis/web_query.gnis_web_query_form and
I was wondering what, if any, experiences have been had. Specifically, if
I read it correctly, it is a database of information culled from the USGS
maps. I am just unsure of a few things...:
First, datum, scale, and other info. The site refers to "7.5' by 7.5'
Map"; what other data can be culled just from that?
Second, it at times gives coordinates from multiple maps that are slightly
different. How do I reconcile these variances? Do I give my own best
combination, or has a process been agreed upon that I have missed in going
through the past postings?
Thanks, and greetings to you all.
>>> Posting number 200, dated 15 Apr 2002 15:45:02
Date: Mon, 15 Apr 2002 15:45:02 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: GNIS Info
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="----=_NextPart_000_0088_01C1E494.81AC73E0"
XXXXXX
I too am new at working on this MaNIS project, just started this week.
Anyways, I had the exact same question as you and talked to John
Wieczorek this morning at the Museum of Vertebrate Zoology in Berkeley,
CA. He said that using the GNIS data is fine even though the database
was not compiled from a single source. These are the "givens" for GNIS
use with the "Error Calculator":
1) Coordinate System: decimal degrees
2) Coordinate Source: USGS map 1:25,000
3) Datum: NAD27 (North American Datum 1927)
Make sure that you fill out the "Extent of Named Place" field as much as
possible each time. If anyone from this board has other suggestions, I
would be glad to hear them.
Is anyone else converting the GNIS database to a shape file to be used
in ArcView to calculate distances? If there are a lot of you, I will
start posting ArcView questions pertaining to this project here.
Thanks.
XXXXXXX
>>> Posting number 201, dated 16 Apr 2002 09:49:31
Date: Tue, 16 Apr 2002 09:49:31 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: GNIS Info
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="----=_NextPart_000_0022_01C1E52C.020D1CA0"
XXXXX--
Thanks for the quick reply; it was very helpful.
I am always interested in other people's experiences with ArcView.
John Wieczorek & Group--
I am still on the fence about the locations that give me two or more
different georeferencing points. I have the feeling that, as they are
both "legitimate" sources (different USGS maps), I can just choose
one and indicate in the proper field in the database which I chose.
Does this seem acceptable/appropriate?
Thanks
XXXXX
>>> Posting number 202, dated 16 Apr 2002 10:04:18
Date: Tue, 16 Apr 2002 10:04:18 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: GNIS Info
In-Reply-To: <002501c1e555$eb1a1320$b16f0a0a@fmnh.org>
Mime-Version: 1.0
Content-Type: multipart/alternative;
boundary="=====================_12236688==_.ALT"
XXXX and others
Having already georeferenced thousands of South American localities, I find this an
important and poorly understood question. My strong conviction is that simply
picking a point arbitrarily is apt to prove more misleading than leaving the
point undetermined. If there are 28 "San Martin"s in Peru, for example, and
there is no additional information for specifying this (e.g., compiling an
expedition itinerary, locations of field activities immediately beforehand and
afterwards, and (rarely) the distributions of animals themselves), then
guessing--and being explicit about your guesses--can only be misleading.
Following this strategy with the Field Museum's 2300 locality records from Peru
led me to leave 14% of the localities unspecified. However, I am confident
that the remaining 86% came from where they plot.
I would be interested in hearing the experiences of others and the druthers of
curators/collection managers on the data fidelity (vs accuracy) question.
Clearly, we need to embrace a community-wide standard.
XXXXX
>>> Posting number 203, dated 16 Apr 2002 09:11:41
Date: Tue, 16 Apr 2002 09:11:41 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: GNIS Info
In-Reply-To: <4.1.20020416095834.00a94a90@mail.fmnh.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
I agree wholeheartedly with XXXXX. If there is ambiguity in terms of a
multitude of potential named places for a given locality we should NOT
georeference it, but give the reason ("ambiguous" or "multiple possible
places" or something like that) in the NoGeorefBecause field. It may be
that some of these localities can be resolved by the host institution by
looking in field notes and the like. However, that's a time-consuming
activity and we should leave that until after the coordinates get
redistributed.
For the record, the other type of locality we should NOT georeference is
one that is in question (e.g., "Bakersfield?"). For these, put something
like "locality questionable" in the NoGeorefBecause field. The reason for
filling out the NoGeorefBecause field is so that the host institution knows
that someone actually looked at the locality. You wouldn't otherwise know
this if the Lat and Long were just blank. While reviewing, I might as well
remind everyone to make use of the Remarks field to alert host institutions
of likely errors such as misspellings as well as unusual assumptions that
were made in the course of the coordinate determination.
It's nice to see the list serving its purpose. Thanks for the questions and
responses!
John W
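The rules in this posting can be summarized as a small decision function. It is only a sketch: treating a trailing "?" as the sign of a questionable locality, the wording of the reason for a place that cannot be found, and the function and argument names are all assumptions. The NoGeorefBecause, DecLat and DecLong field names follow the gazetteer fields mentioned on the list.

```python
def georef_decision(locality, candidates):
    """Return the fields to record for one locality.

    locality:   the locality string from the gazetteer.
    candidates: list of (lat, lon) pairs, one per named place that could
                match the locality (hypothetical input).
    """
    if locality.rstrip().endswith("?"):
        # e.g. "Bakersfield?": the locality itself is in question.
        return {"NoGeorefBecause": "locality questionable"}
    if len(candidates) > 1:
        # Several possible places and nothing to choose between them.
        return {"NoGeorefBecause": "multiple possible places"}
    if not candidates:
        # Not covered in the posting; this reason text is an assumption.
        return {"NoGeorefBecause": "named place not found"}
    lat, lon = candidates[0]
    return {"DecLat": lat, "DecLong": lon, "NoGeorefBecause": ""}
```

Either way something is written for every locality examined, so the host institution can tell that someone actually looked at it.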
>>> Posting number 204, dated 16 Apr 2002 11:21:06
Date: Tue, 16 Apr 2002 11:21:06 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: GNIS Info
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
All,
I believe I was not as clear about my situation as I thought. Here is an
example.
Aurora, Illinois, is a city/town that spreads across multiple counties, and
is on 3 different USGS maps, according to the query form results I received.
As I understand it, the information on the site comes about in the same
manner as if I had all of these maps myself, and were picking the point, and
best approximating the lat & long according to those lines given on the map.
But, as there are 3 different maps, 3 slightly different sets of coordinates
are arrived at: 41 45 38 N, 88 19 12 W; 41 44 45 N, 88 18 31 W; and
41 45 45 N, 88 22 45 W, to be exact. Each of these is agreed (by the consent
to use the information from the site at all) to be from a reliable source; as
in, if there were only one, there would be NO problem. So, my question is,
can I "pick" one, and then indicate which map it was taken from? Please use
the query page to see what I mean:
http://geonames.usgs.gov/pls/gnis/web_query.gnis_web_query_form
If I am beating a dead horse, please let me know, but this is a quite
specific question.
Thank you all.
>>> Posting number 205, dated 16 Apr 2002 10:46:47
Date: Tue, 16 Apr 2002 10:46:47 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: GNIS Info
Comments: To:
In-Reply-To: <008f01c1e562$b62135b0$b16f0a0a@fmnh.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
XXXX
Rather than using the GNIS query form, you can open the GNIS gazetteer for IL:
ftp://mapping.usgs.gov/pub/gnis/IL_deci
This is an alphabetical listing of named places. There is one entry
for "Aurora", followed by specific locations (Aurora city hall, etc). For
georeferencing Utah localities, we are using the GNIS UT gazetteer of named
places and digitized 1:24,000 maps from the National Geographic TOPO! series
which provide very accurate readings. So far (for Utah) this combination works
very well and gazetteer and maps are in very close agreement.
XXXX
>>> Posting number 206, dated 16 Apr 2002 11:36:27
Date: Tue, 16 Apr 2002 11:36:27 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: GNIS Info
In-Reply-To: <1018975607.3cbc5577d6871@bluebird.umnh.utah.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX,
I'm sorry I didn't address your specific question. Your example is an
interesting and uncommon one. The populated place is actually given three
different sets of coordinates - one for each of the three 7.5' maps on
which it appears. In this particular case I would actually choose the
coordinates of the city hall (41 45' 24" N, 88 18' 52" W) as an
unambiguous solution. Make sure to comment to that effect in the Locality
Remarks, and be sure to use the distance from the city hall to the furthest
edge of town as the extent of Aurora in the error calculations.
John
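To get a feel for the distances involved, the sketch below measures how far each of the three map-specific GNIS points lies from the city hall coordinates given above. It uses the standard haversine formula on a spherical Earth, which is an approximation. The function name and the Earth radius constant are choices made for this example. The distance to the furthest edge of town, which is what the extent calls for, would still have to be read from a map; only the measuring step is shown here.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# City hall, 41 45' 24" N 88 18' 52" W, as signed decimal degrees.
city_hall = (41 + 45 / 60 + 24 / 3600, -(88 + 18 / 60 + 52 / 3600))

# The three map-specific GNIS points from the original question.
gnis_points = [
    (41 + 45 / 60 + 38 / 3600, -(88 + 19 / 60 + 12 / 3600)),
    (41 + 44 / 60 + 45 / 3600, -(88 + 18 / 60 + 31 / 3600)),
    (41 + 45 / 60 + 45 / 3600, -(88 + 22 / 60 + 45 / 3600)),
]

distances_km = [haversine_km(*city_hall, *point) for point in gnis_points]
for d in distances_km:
    print(f"{d:.2f} km from city hall")
```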
>>> Posting number 207, dated 16 Apr 2002 03:01:04
Date: Tue, 16 Apr 2002 03:01:04 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: related thoughts on GNIS data- hierarchy of placename types
In-Reply-To: <008f01c1e562$b62135b0$b16f0a0a@fmnh.org>
Mime-Version: 1.0
Content-Type: multipart/alternative;
boundary="============_-1193171221==_ma============"
MaNIS: I have been using the GNIS download for Oregon
(http://geonames.usgs.gov/gnisftp.html) to get declat/longs for
placenames. After getting a placename and the lat/long, the lat/long
for the record is calculated based on the offsets from the placename.
In looking up placenames, I typically find that there are multiple
possibilities for sites such as Hood River that could be a populated
place (ppl), post office (po), river, county, locale, etc. Another
example I was just trying to estimate extent on was Agate Beach -
ppl, beach, or po? I picked ppl. This ambiguity seems to be
characteristic of localities such as ppls, pos, crossroads, locales
that were named after some feature of the landscape. It is rare to
have a record that unambiguously states what is referenced (e.g.,
Hood River, town of). The point is that a SpecLocality with just a
name often will be ambiguous and the ambiguity increases the more you
look and the better your reference dataset because you find more
possibilities. I've been going with a hierarchy of ppl, po, locale,
then if these don't exist, whatever else looks good. My thinking is
that a collector would have used a po, ppl, or locale as a first
choice. As long as I provide a reference to what I did, there should
be no problem. The reference will include the Placename, placename
type, county, and placename lat/long plus the other data needed to
calculate lat/long and error. The contributing institution can
accept or reject the lat/long.
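The type hierarchy described in this posting (ppl, then po, then locale) could be expressed roughly as follows. This is a minimal sketch of one person's working rule, not an agreed MaNIS procedure. The record layout, the function name `pick_candidate`, and the wording of the remark are assumptions for illustration; the feature types for Agate Beach are the ones named in the posting. When none of the preferred types is present, the sketch returns no choice and leaves the decision to the georeferencer.

```python
# Preference order described in the posting: populated place, post office, locale.
TYPE_PRIORITY = ("ppl", "po", "locale")


def pick_candidate(candidates, priority=TYPE_PRIORITY):
    """Return (choice, alternatives) using the first feature type that matches.

    candidates: list of dicts with at least a "type" key (hypothetical layout).
    If no preferred type is present, choice is None and every candidate is
    returned as an alternative for manual review.
    """
    for feature_type in priority:
        matches = [c for c in candidates if c["type"] == feature_type]
        if matches:
            choice = matches[0]
            others = [c for c in candidates if c is not choice]
            return choice, others
    return None, list(candidates)


# Gazetteer hits for the bare name "Agate Beach" (feature types only).
hits = [
    {"name": "Agate Beach", "type": "beach"},
    {"name": "Agate Beach", "type": "po"},
    {"name": "Agate Beach", "type": "ppl"},
]

choice, others = pick_candidate(hits)
remark = "georeferenced as {}; could also be {}".format(
    choice["type"], ", ".join(sorted(o["type"] for o in others))
)
print(remark)
```

Recording the alternatives in the remark keeps the ambiguity visible to the contributing institution, which can then accept or reject the determination.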
>>> Posting number 208, dated 17 Apr 2002 18:19:56
Date: Wed, 17 Apr 2002 18:19:56 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: related thoughts on GNIS data- hierarchy of placename types
In-Reply-To: <p05100301b8e1716dad3e@[207.207.104.113]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX, and all,
I disagree with the methods described below on three counts. The most
important of my reasons is that some of the determinations using this
methodology will simply be wrong, and there won't be any way to know if
they are wrong even with the Locality Remarks. My secondary reason is that
it will be difficult to find these localities among the data in order to
filter them out from analyses for which they aren't appropriate. Finally,
it seems to me we gain no benefit from actually having coordinates for
ambiguous localities, especially given that they take up precious time to
georeference. By ambiguous I mean that there is more than one possible
distinct place to which the locality may refer. By distinct I mean that the
maximum error circles do not overlap.
One could, sometimes appropriately, choose the geographic center of
multiple possible places for the coordinates of a locality and have the
extent cover all of them. In this case the error would be larger than it
would be by choosing any one of the possible localities, but the
determination (coordinates plus maximum error distance) would not be wrong.
I would use this method sparingly, however, given that it does take quite a
bit of time to make a determination of this kind. The method is most
appropriate when the distances between the possible places are relatively
small so that the maximum error distance itself remains small. I have no
objection to this approach, but I would argue vehemently to avoid
determinations that are likely to be wrong.
John W
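The two ideas in this message, the test for "distinct" places and the single determination that covers several candidate places, can be sketched as below. This is an illustration under stated assumptions: distances use the haversine formula on a spherical Earth, the "geographic center" is taken as the plain mean of the candidate coordinates (reasonable only for nearby points that do not straddle the 180th meridian), and the two candidate places are made-up placeholders rather than real gazetteer entries. The resulting circle is guaranteed to cover every candidate circle, but it is not necessarily the smallest such circle.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def haversine_km(p, q):
    """Great-circle distance in km between (lat, lon) points in decimal degrees."""
    phi1, phi2 = math.radians(p[0]), math.radians(q[0])
    dphi = phi2 - phi1
    dlam = math.radians(q[1] - p[1])
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def are_distinct(center1, radius1_km, center2, radius2_km):
    """True if the two maximum error circles do not overlap."""
    return haversine_km(center1, center2) > radius1_km + radius2_km


def covering_determination(places):
    """Return (center, max_error_km) for a circle covering all candidate circles.

    places: list of ((lat, lon), radius_km) tuples.
    """
    lat = sum(center[0] for center, _ in places) / len(places)
    lon = sum(center[1] for center, _ in places) / len(places)
    center = (lat, lon)
    # Distance to each candidate center plus that candidate's own radius.
    max_error = max(haversine_km(center, c) + r for c, r in places)
    return center, max_error


# Hypothetical candidates: two places sharing a name, with their own error radii.
candidates = [((45.0, -121.0), 2.0), ((45.0, -121.1), 1.0)]

ambiguous = are_distinct(candidates[0][0], candidates[0][1],
                         candidates[1][0], candidates[1][1])
center, max_error_km = covering_determination(candidates)
print(ambiguous, center, round(max_error_km, 2))
```

The covering circle is larger than either candidate's own circle, which is the trade-off described above: a bigger error, but a determination that is not wrong.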
At 03:01 AM 4/16/02 -0700, you wrote:
>MaNIS: I have been using the GNIS download for Oregon
>(http://geonames.usgs.gov/gnisftp.html) to get declat/longs for
>placenames. After getting a placename and the lat/long, the lat/long for
>the record is calculated based on the offsets from the placename. In
>looking up placenames, I typically find that there are multiple
>possibilities for sites such as Hood River that could be a populated place
>(ppl), post office (po), river, county, locale, etc. Another example I
>was just trying to estimate extent on was Agate Beach - ppl, beach, or
>po? I picked ppl. This ambiguity seems to be characteristic of
>localities such as ppls, pos, crossroads, locales that were named after
>some feature of the landscape. It is rare to have a record that
>unambiguously states what is referenced (e.g., Hood River, town
>of). The point is that a SpecLocality with just a name often will be
>ambiguous and the ambiguity increases the more you look and the better
>your reference dataset because you find more possibilities. I've been
>going with a hierarchy of ppl, po, locale, then if these don't exist,
>whatever else looks good. My thinking is that a collector would have used
>a po, ppl, or locale as a first choice. As long as I provide a reference
>to what I did, there should be no problem. The reference will include the
>Placename, placename type, county, and placename lat/long plus the other
>data needed to calculate lat/long and error. The contributing institution
>can accept or reject the lat/long.
>
>
>>> Posting number 209, dated 19 Apr 2002 10:49:10
Date: Fri, 19 Apr 2002 10:49:10 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: New Calculator is ready
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
I've finished upgrading the Georeferencing Calculator to calculate not only
Errors, but also Coordinates with errors. Links throughout the MaNIS
website now point to this Calculator instead of the Error Calculator. A
link to the manual for the new Calculator can be found on the MaNIS
Documents page at the following URL:
http://dlp.cs.berkeley.edu/manis/Documents.html
This new Calculator will look familiar to anyone who has used its
predecessor, but please be sure to read the manual to be sure you
understand how it differs.
John W
>>> Posting number 210, dated 19 Apr 2002 15:29:58
Date: Fri, 19 Apr 2002 15:29:58 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: New Calculator is ready
John, and all--
I have an issue regarding the Calculator, and the OrigCoordSystem heading.
When entering data with a distance component (e.g. Alton, 2 mi N), if the
coordinates of the named place have been determined as and are entered in
deg min sec format, and the calculator gives out decimal degrees, what
should be entered in as the OrigCoordSystem? Is it decimal degrees, since
the ultimate determination of the coords by the calculator is such, or is
it d/m/s since that is what the actual, e.g. gazetteer, data is from?? More
simply, maybe, can the OrigCoordSystem say "deg min sec", but the actual
data be given in decimal degrees? Hope that is clear.
Thanks
>>> Posting number 211, dated 19 Apr 2002 14:06:08
Date: Fri, 19 Apr 2002 14:06:08 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: New Calculator is ready
In-Reply-To: <MAMMAL-Z-NET%2002041915295851@USOBI.ORG>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX,
The quick answer is to use the value given in the long line of blue
tab-delimited data at the bottom of the new calculator after you click on
the Calculate button. We want to record the coordinate system from which the
determination was made, which is to say, the one upon which the coordinate
precision was based.
John W
>>> Posting number 212, dated 22 Apr 2002 00:48:24
Date: Mon, 22 Apr 2002 00:48:24 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: SpecLocality placenames, simply wrong vs probably right
In-Reply-To: <5.0.0.25.2.20020417173410.02732420@socrates.berkeley.edu>
Mime-Version: 1.0
Content-Type: multipart/alternative;
boundary="============_-1192660781==_ma============"
John: It wasn't clear what methods or methodology you disagree with.
Regarding assumptions, the comment that "some of the determinations
using this methodology will simply be wrong" seems overly optimistic.
We will never know "simply" if many localities are wrong or right.
It is more productive to think in probabilities. We have to make
assumptions regarding what SpecLocality values represent given that
we are georeferencing other institutions' records and formats
without consulting primary data. So I have to assume, as does
everyone else, that entries like "Hood River" from different
collectors and institutions probably refer to the same locality and
the same type of locality (city rather than river). For determining
lat/lons and errors, we also assume that the lat/longs and boundaries
of population centers like Hood River probably have not shifted or
expanded/contracted significantly over the years. These basic
assumptions make determinations probabilities rather than right or
wrong.
Another possibility is that the comment "this methodology will simply
be wrong" refers to calculating lat/longs based on a placename and
offsets? Formal release of the web lat/long calculator validates
this methodology. But, regardless of the methodology the assumptions
are the same.
A third possibility is that perhaps you were referring to the
reference string that Hood River, the ppl, was georeferenced but that
there are other possibilities? Many placenames with a landscape
feature in the name (e.g., river, falls, beach, spring) will be in
the category of at least two types of locality, so for these I am
looking (a few keystrokes, so little time is wasted) and I am
annotating as I find them with a standard "could also be ...". About
10% of the Oregon records are in this category. If you don't want
them georeferenced let me know.
>>> Posting number 213, dated 22 Apr 2002 14:56:54
Date: Mon, 22 Apr 2002 14:56:54 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: SpecLocality placenames, simply wrong vs probably right
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="----=_NextPart_000_0048_01C1EA0D.F12B0F50"
All--
Does anyone know of any resource(s) that has/have listed named
cities/towns and their georeferenced boundaries? This would help
greatly in cases where a number of smaller towns are concentrated in a
small area without obvious boundaries.
>>> Posting number 214, dated 29 Apr 2002 11:24:34
>>> Posting number 215, dated 29 Apr 2002 11:29:13
>>> Posting number 216, dated 29 Apr 2002 09:30:15
>>> Posting number 217, dated 29 Apr 2002 09:39:26
>>> Posting number 218, dated 29 Apr 2002 09:49:47
>>> Posting number 219, dated 1 May 2002 13:10:17
Date: Wed, 1 May 2002 13:10:17 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Localitly changed locations...
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
All--
I have an interesting dilemma, and would like to get thoughts, comments.
I have a list of different versions of the same locality: Chicago,
FMNH; FMNH Boiler Room; Field Museum; Field Museum Building, etc., for at
least nine "unique" localities. The problem lies in the fact that both
the name and location of the Field Museum have changed over the course of
the years. According to our records, the specimens collected in the
locality named "Field Museum" span from 1907 to 1999. But in 1921 the
museum relocated, which was not taken into account. Similarly for
"Field Museum Building". Specimens from 1921 onwards would have one set
of coordinates, while those prior to '21 would have an entirely different
set (locations are 10 km apart, by air).
So--is this just a cleanup issue to be taken up by the Museum itself, or
can it be addressed here?
XXXXXX
>>> Posting number 220, dated 1 May 2002 17:39:09
Date: Wed, 1 May 2002 17:39:09 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Localitly changed locations...
In-Reply-To: <007101c1f13b$72e26a50$b16f0a0a@fmnh.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX, and all,
This is a very good question. There are a few possible ways to proceed. I
have recorded the Verbatim Collecting Dates from the information that was
sent to me by each institution. I didn't put that information into the
gazetteer, but I assembled it for just this kind of issue. If you send me a
list of distinct LocalityIDs I can send back the dates associated with each
one. After that it might get complicated, but we might do that much and see
how it goes.
Alternatively, it may be worth investigating if the localities in question
refer to specimens that were captive. If so, the localities need not be
georeferenced and the reason (in the NoGeorefBecause field) could be set to
"captive."
Another solution is to defer the georeferencing of these localities to the
individual institutions. In this case, you might want to put "locality is
time dependent" or something to that effect, in the NoGeorefBecause field.
John
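If collecting dates can be matched to each LocalityID, the date-dependent choice could be sketched along these lines. The function name, the site labels, and the handling of undated records are assumptions for illustration. The 1921 cut-off and the "locality is time dependent" wording come from the discussion. Records from the move year itself may deserve a manual check.

```python
MOVE_YEAR = 1921  # year the museum relocated, per the original question


def field_museum_site(collecting_year):
    """Return (site_label, no_georef_because) for a "Field Museum" locality.

    collecting_year: int, or None when no usable collecting date is available.
    """
    if collecting_year is None:
        # Without a date, the two sites (about 10 km apart) cannot be told apart.
        return None, "locality is time dependent"
    if collecting_year >= MOVE_YEAR:
        return "site from 1921 onwards", None
    return "site before 1921", None


for year in (1907, 1999, None):
    print(year, field_museum_site(year))
```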
>>> Posting number 221, dated 1 May 2002 18:31:05
Date: Wed, 1 May 2002 18:31:05 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Thanks and Can O' Worms??
In-Reply-To: <3.0.32.20020501154229.006d7aa0@pilot.msu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear XXXXX, and all,
>Hi John,
>Okay - thanks for the comments on the coordinates, and sorry for the FTP
>address mistake. Robin and I will make sure to send future files to
>"incoming/mvz/manis" instead of what we have been doing (our most recent
>printout of the steps - and step 8 - didn't have this information).
Not a problem - that's just the best I could come up with for a criticism.
>Also, I have been thinking about named places, their extents, possible
>future issues with such information, and how I am now recording info in the
>"NamedPlace" field.
>
>I have some questions for you (sorry if this opens a can of worms).
>
>1) Should there be a format for listing named places, such as Beaumont
>Tower, MSU Campus? I guess I could just call the place "Beaumont Tower",
>but currently I have listed it as "Beaumont Tower, Michigan State
>University". I then wondered if I should have put "Michigan State
>University" first, so if these named places are alphabetically listed
>somewhere as part of our project, then all of the "MSU" items would appear
>together. The same applies to TRS examples. I often list the TRS
>information in the NamedPlace field as it is given in the gazetteer -
>sometimes these begin with 1/4 of 1/4 of a section, and sometimes they
>begin with the "T" information. Should there be a format for these as
>well? (I know I really need to get a life). What do you think?
Good questions. The bottom line, I think, is that the Named Place itself
must be uniquely identifiable. For example, in "Beaumont Tower, Michigan
State University," no one is going to confuse that with any other Beaumont
Tower. In terms of which should come first, I think it will be slightly
more useful to put the less specific part of the named place before the
more specific for the reasons you stated, thus, "Michigan State University,
Beaumont Tower" would be the preferable entry.
As for TRS data, we will not likely ever use those to make a gazetteer,
since there are already tools to extract coordinates from TRS. Because of
this, it is probably sufficient to simply record "TRS" as the named place
rather than to copy the TRS data into the Named Place field. Nevertheless,
if the TRS data are recorded in a consistent manner in the Named Place
field, the coordinates could be checked (or even assigned)
programmatically. The easiest TRS format to parse programmatically would be
something like "T7N R13E S17 NW1/4 of SE1/4." In my example, it doesn't
matter if you have extra white space between any of the letters and their
adjacent numbers, but it would be helpful to keep the same order as well as
the word "of."
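A format like the one suggested here is straightforward to parse with a regular expression. The following is a minimal sketch, assuming the order township, range, section, then an optional "quarter of quarter" part, and tolerating extra white space between letters and numbers. The pattern, the function name `parse_trs`, and the field names are made up for this example and are not part of any MaNIS tool.

```python
import re

# Township, range, section, then an optional "XX1/4 of YY1/4" part.
TRS_PATTERN = re.compile(
    r"^\s*T\s*(?P<township>\d+)\s*(?P<township_dir>[NS])\s*"
    r"R\s*(?P<range>\d+)\s*(?P<range_dir>[EW])\s*"
    r"S\s*(?P<section>\d+)"
    r"(?:\s*(?P<quarter>[NS][EW])\s*1/4\s+of\s+(?P<of_quarter>[NS][EW])\s*1/4)?"
    r"\s*$",
    re.IGNORECASE,
)


def parse_trs(text):
    """Parse a TRS string into a dict of fields, or return None if it does not match."""
    match = TRS_PATTERN.match(text)
    if match is None:
        return None
    # Normalize letters to upper case; missing optional parts stay None.
    fields = {k: (v.upper() if v else None) for k, v in match.groupdict().items()}
    for key in ("township", "range", "section"):
        fields[key] = int(fields[key])
    return fields


print(parse_trs("T7N R13E S17 NW1/4 of SE1/4"))
print(parse_trs("T 7 N  R 13 E  S 17  NW 1/4 of SE 1/4"))
```

Both calls yield the same fields, which is the point of fixing the order and the word "of" while leaving white space free.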
>2) For offsets from a named place, should what I enter in the NamedPlace
>field reflect the direction of the offset? For example, if I have the
>following localities: Mason, 2 miles west of Mason, and 1 mile NW of
>Mason, I will be recording the following respective extents for Mason:
>greatest extent; extent to the west; and the northwestern extent.
>Currently I am listing Mason as the NamedPlace for all of these records,
>even though the extents are different. Should I be entering a name to
>reflect the offset direction (e.g. Mason, extent to the west)?
Interesting point. I think it is best to record just the name of the place
without reference to offsets. The named place is intended to show the
starting point for a coordinate calculation, but I think it is asking too
much to include the direction information. Think of how horrible it would
be for orthogonal offsets (e.g., "2 mi N and 3 mi E of Mason").
John
>>> Posting number 222, dated 2 May 2002 10:27:33
Date: Thu, 2 May 2002 10:27:33 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Updated manisgeoreftemplate.mdb file
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear all,
It has been brought to my attention that there was a data type problem in
the Extent field in my Access97 manisgeoreftemplate.mdb file. Up until
moments ago, that field was of type Long Integer, which wasn't very useful
for recording fractional extents. I have changed the data type to Double
and posted the new file in place of the old one. The file can be accessed
through the Georeferencing Steps document on the MaNIS web site.
Thanks for finding that, XXX,
John
>>> Posting number 223, dated 3 May 2002 09:46:43
>>> Posting number 224, dated 3 May 2002 12:03:26
Date: Fri, 3 May 2002 12:03:26 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Area
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Quick question...
If the extent of a town/city is vague due to proximity to other
towns/cities, is the area of the place useful at all??
XXXXXX
>>> Posting number 225, dated 3 May 2002 10:12:46
Date: Fri, 3 May 2002 10:12:46 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Area
In-Reply-To: <002c01c1f2c4$710cf170$9f6e0a0a@FMNHCJ>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX,
In and of itself, the area describing the town/city may not be too useful,
but in terms of determining the maximum error distance, it is essential. Do
you have something more specific in mind?
John
>>> Posting number 226, dated 3 May 2002 12:17:17
Date: Fri, 3 May 2002 12:17:17 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Area
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Yes, that. I have Cicero IL, but the maps I have from Topozone are not very
helpful, as Cicero is closely bordered by Chicago and other suburbs.
But I managed to find a reference on the web that listed the area of Cicero,
and thought that it would be helpful for the error determination.
So, should I just take the area itself as the basis for the max extent, as
in the following: if the area is 6 sq. mi., and area is length times width,
shouldn't the assumption be in favor of the greatest error, as in 6 by 1,
not 3 by 2? Or 4.8 by 1.25, etc.?
>>> Posting number 227, dated 3 May 2002 10:45:32
Date: Fri, 3 May 2002 10:45:32 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Area
In-Reply-To: <006001c1f2c6$60721460$9f6e0a0a@FMNHCJ>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX,
Hmmm. This is probably a less-than-desirable approach for the simple reason
that we can't know from an area what the shape of the town is. Taking the
square root of the area would be a simple rule to employ, but it would
always result in an underestimate of the furthest extent of the town. We
want to use the furthest extent in our calculations so that the resulting
maximum error distance satisfies the criterion that it MUST encompass the
actual locality.
John
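The point that an area alone does not fix the extent can be illustrated with the 6 sq. mi. example from the question. In the sketch below (function names are made up for this example), each rectangle has the same area but a different center-to-corner distance. The only thing an area guarantees is a lower bound: no shape of a given area can fit inside a circle smaller than the disc of that area, so the distance from any center to the furthest edge is at least sqrt(area / pi). There is no corresponding upper bound.

```python
import math


def min_center_to_edge(area):
    """Lower bound on the center-to-furthest-edge distance for a given area.

    The bound is reached only by a disc measured from its own center.
    """
    return math.sqrt(area / math.pi)


def rectangle_center_to_corner(length, width):
    """Distance from the center of a rectangle to any of its corners."""
    return math.hypot(length, width) / 2.0


area_sq_mi = 6.0
rectangles = [(6.0, 1.0), (3.0, 2.0), (4.8, 1.25)]  # each has area 6 sq. mi.

lower_bound = min_center_to_edge(area_sq_mi)
extents = {dims: rectangle_center_to_corner(*dims) for dims in rectangles}

print(f"lower bound from area alone: {lower_bound:.2f} mi")
for dims, extent in extents.items():
    print(f"{dims[0]} x {dims[1]} mi: center to corner {extent:.2f} mi")
```

The three rectangles give roughly 3.04, 1.80, and 2.48 mi respectively, against a lower bound of about 1.38 mi, so the extent still has to come from a map of the actual town boundary.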
>>> Posting number 228, dated 3 May 2002 13:03:42
Date: Fri, 3 May 2002 13:03:42 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Fwd: different extents for 0,1 or 2 offets
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX,
This particular point has been bugging me somewhat as well, though from the
standpoint of georeferencing automation. Basically, it makes life difficult
to try to maintain multiple extents for a given named place. More rules
means greater difficulty and greater opportunity for inconsistency.
So, I'm willing to simplify the Extents rule to read as follows, unless
there are any objections, which should be voiced immediately.
<begin proposed update>
Uncertainty due to the extent of a locality
Named places are not single points; they have extents. Although there are
conventions for placing the coordinates of a named place at the post
office, courthouse, or geographic center of a town, one cannot be sure that
the person who recorded the locality used a particular convention. Use the
distance from the geographic center of the named place to its furthest
extent as the uncertainty.
<end proposed update>
Yep, that's nice and succinct. I like it.
You asked about whether the uncertainty in the extent would be half of this
distance. Under most circumstances I would argue that you are correct, but
the rule, as it stands, covers any original measurement made from within
the named place. It's conservative, but at least it isn't going to be wrong.
Thanks for motivating me to make a commitment on this issue.
John
>X-Originating-IP: [67.25.99.222]
>From:
>To: <tuco@socrates.Berkeley.EDU>
>Subject: different extents for 0,1 or 2 offets
>Date: Fri, 3 May 2002 12:26:00 -0700
>X-Mailer: MSN Explorer 7.00.0021.1900
>X-OriginalArrivalTime: 03 May 2002 19:32:31.0624 (UTC)
>FILETIME=[449FE880:01C1F2D9]
>
>While working through extents, the concept of a circumscribing circle with
>radius to encompass the most distant point of a placename, and thus all
>points, is floating around in my head. I'm wondering for single offsets
>why we are taking extents as the distance from the center of a placename
>to the boundary in the direction of the offset. An example I just did had
>a W extent of 3 mi and the N extent is 2 mi.
>
>The relevant section from the guidelines is:
>Uncertainty due to the extent of a locality
>Named places are not single points; they have extents. Although there are
>conventions for placing the coordinates of a named place at the post
>office, courthouse, or geographic center of a town, one cannot be sure
>that the person who recorded the locality used a particular convention. If
>only the named place is given in the locality description use the distance
>from the geographic center of the named place to its furthest extent as
>the uncertainty. If the description includes an offset, use the distance
>from the geographic center to furthest extent of the named place in the
>direction of the offset. For multiple offsets (e.g., 3 km N, 5 km E of
>Bakersfield) use the furthest of the extents from the geographic center of
>the named place in the two cardinal directions.
>
>I glossed over this in previous reading because I thought it sort of made
>sense. But after doing extents, these guidelines seem arbitrary and I'm
>just wondering why different rules for different situations. It seems if
>we don't know what the conventions were for the placename, regardless of
>the number of extents, it is only reasonable to assume the first and "use
>the distance from the geographic center of the named place to its furthest
>extent as the uncertainty." Using distance from the center to boundary
>in the direction of the offset assumes what? Perhaps that the collector
>was referring to the center or the boundary? If true, then wouldn't the
>extent (for computation of the error radius) be half the distance between
>the center and boundary? But who knows where in the placename the
>collector was referencing. It doesn't appear to be a big deal, the
>difference is usually less than a mile and usually insignificant combined
>with degree error. If ok, I'll just go with the max, with annotation of
>course.
>
>
>----------
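The difference between the earlier guideline quoted in the forwarded message and the proposed simplification can be sketched as follows. The function names and the dictionary layout are assumptions for illustration. The W and N extents are the ones given in the forwarded example, and the other directions are left out because they are not stated there.

```python
def extent_old_rule(extents, offset_directions=()):
    """Extent under the earlier guideline.

    extents: dict mapping a cardinal direction to the distance from the
    geographic center to the boundary in that direction.
    No offset: furthest extent. One offset: extent in that direction.
    Two offsets: the furthest of the extents in the two directions.
    """
    if not offset_directions:
        return max(extents.values())
    return max(extents[d] for d in offset_directions)


def extent_new_rule(extents, offset_directions=()):
    """Extent under the proposed rule: always the furthest extent."""
    return max(extents.values())


# Example from the forwarded message: W extent 3 mi, N extent 2 mi.
extents_mi = {"W": 3.0, "N": 2.0}

old_north = extent_old_rule(extents_mi, ("N",))
new_north = extent_new_rule(extents_mi, ("N",))
print(old_north, new_north)
```

For an offset to the north, the earlier rule would use 2 mi and the simplified rule 3 mi; the simplified rule never gives the smaller value, which is why it is the conservative choice.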
>>> Posting number 229, dated 3 May 2002 16:54:21
Date: Fri, 3 May 2002 16:54:21 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Fwd: different extents for 0,1 or 2 offets
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
John: Sounds ok, but doesn't the two offset technique use the linear
extent as a side of the bounding box to calculate the extent based on
center to corner hypotenuse while the single/zero offset technique uses
the linear extent as a component of d'? Whew. So for the same linear
extent, the two offset extent will be greater. But this is in line with
the maximal approach to estimating error.
----- Original Message -----
From: John Wieczorek
Sent: Friday, May 03, 2002 1:01 PM
To: MAMMAL-Z-NET@USOBI.ORG
Subject: [MANIS] Fwd: different extents for 0,1 or 2 offets
XXXX,
This particular point has been bugging me somewhat as well, though from the
standpoint of georeferencing automation. Basically, it makes life difficult
to try to maintain multiple extents for a given named place. More rules
means greater difficulty and greater opportunity for inconsistency.
So, I'm willing to simplify the Extents rule to read as follows, unless
there are any objections, which should be voiced immediately.
<begin proposed update>
Uncertainty due to the extent of a locality
Named places are not single points; they have extents. Although there are
conventions for placing the coordinates of a named place at the post
office, courthouse, or geographic center of a town, one cannot be sure that
the person who recorded the locality used a particular convention. Use the
distance from the geographic center of the named place to its furthest
extent as the uncertainty.
<end proposed update>
Yep, that's nice and succinct. I like it.
>>> Posting number 230, dated 3 May 2002 17:44:06
Date: Fri, 3 May 2002 17:44:06 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Fwd: different extents for 0,1 or 2 offets
In-Reply-To: <OE486LSpSpwbuZ8f0SY0000325d@hotmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX,
The two offset technique is not used to calculate an extent. The extent is
used to calculate the maximum error distance. The fact that we will use the
most conservative extent of a named place does not change how that extent
is used in the calculations.
While it is true that the extent contributes to the size of the bounding
box, it is not equal to the length of the side of the bounding box. The
length of the extent can be no more than half of the length of the side of
the bounding box, and that only if there are no other contributions of
distance uncertainties. The single/zero offset technique will still use the
linear extent as a component of d'. Also, the contribution to the maximum
error distance for a given extent will still be the square root of two
times greater for two offsets than it will be for one offset, regardless of
how big the offsets are.
The description of error calculations for "Combinations of uncertainties:
distance" is already based on the greatest single extent of the named place
in the two orthogonal offsets. Sorry if the explanations above are
confusing. The bottom line is that the proposed simplification in how to
determine the extent does not affect how the extent is used in calculations. No
change to the documentation or methodology is required beyond the proposed
simplification in determining extents.
John
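
The square-root-of-two relationship described above can be illustrated with a small sketch. It assumes, for simplicity, that the extent is the only source of uncertainty (no map, datum, or distance-precision terms); the function name extent_contribution is hypothetical and is not part of the MaNIS calculator.

```python
import math


def extent_contribution(extent: float, n_offsets: int) -> float:
    """Contribution of a named-place extent to the maximum error distance.

    Simplifying assumption: the extent is the only uncertainty present.
    """
    if n_offsets in (0, 1):
        # Uncertainty acts along a single line, so the extent adds directly.
        return extent
    if n_offsets == 2:
        # The extent is half the side of the bounding box in each of the two
        # orthogonal directions; the error is the center-to-corner hypotenuse.
        return math.hypot(extent, extent)
    raise ValueError("this sketch handles only 0, 1, or 2 offsets")


one_offset = extent_contribution(2.0, 1)
two_offsets = extent_contribution(2.0, 2)
ratio = two_offsets / one_offset
print(ratio)
```

The ratio does not depend on the size of the extent or of the offsets, which is the point made in the message.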
At 04:54 PM 5/3/02 -0700, you wrote:
>John: Sounds ok, but doesn't the two offset technique use the linear
>extent as a side of the bounding box to calculate the extent based on
>center to corner hypotenuse while the single/zero offset technique uses
>the linear extent as a component of d'? Whew. So for the same linear
>extent, the two offset extent will be greater. But this is in line with
>the maximal approach to estimating error.
>----- Original Message -----
>From: John Wieczorek
>Sent: Friday, May 03, 2002 1:01 PM
>To: MAMMAL-Z-NET@USOBI.ORG
>Subject: [MANIS] Fwd: different extents for 0,1 or 2 offets
>
>XXXX,
>
>This particular point has been bugging me somewhat as well, though from the
>standpoint of georeferencing automation. Basically, it makes life difficult
>to try to maintain multiple extents for a given named place. More rules
>means greater difficulty and greater opportunity for inconsistency.
>
>So, I'm willing to simplify the Extents rule to read as follows, unless
>there are any objections, which should be voiced immediately.
>
><begin proposed update>
>Uncertainty due to the extent of a locality
>Named places are not single points; they have extents. Although there are
>conventions for placing the coordinates of a named place at the post
>office, courthouse, or geographic center of a town, one cannot be sure that
>the person who recorded the locality used a particular convention. Use the
>distance from the geographic center of the named place to its furthest
>extent as the uncertainty.
><end proposed update>
>
>Yep, that's nice and succinct. I like it.
>>> Posting number 231, dated 5 May 2002 09:24:51
Date: Sun, 5 May 2002 09:24:51 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Fwd: different extents for 0,1 or 2 offets
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
John: I meant to say half the side of the bounding box hence the center
to corner hypotenuse.
----- Original Message -----
From: John Wieczorek
Sent: Friday, May 03, 2002 5:42 PM
To: MAMMAL-Z-NET@USOBI.ORG
Subject: Re: [MANIS] Fwd: different extents for 0,1 or 2 offets
XXXX,
The two offset technique is not used to calculate an extent. The extent is
used to calculate the maximum error distance. The fact that we will use the
most conservative extent of a named place does not change how that extent
is used in the calculations.
While it is true that the extent contributes to the size of the bounding
box, it is not equal to the length of the side of the bounding box. The
length of the extent can be no more than half of the length of the side of
the bounding box, and that only if there are no other contributions of
distance uncertainties. The single/zero offset technique will still use the
linear extent as a component of d'. Also, the contribution to the maximum
error distance for a given extent will still be the square root of two
times greater for two offsets than it will be for one offset, regardless of
how big the offsets are.
The description of error calculations for "Combinations of uncertainties:
distance" is already based on the greatest single extent of the named place
in the two orthogonal offsets. Sorry if the explanations above are
confusing. The bottom line is that the proposed simplification in how to
determine the extent does not affect how the extent is used in calculations. No
change to the documentation or methodology is required beyond the proposed
simplification in determining extents.
John
>>> Posting number 232, dated 7 May 2002 12:50:12
>>> Posting number 233, dated 7 May 2002 21:30:01
>>> Posting number 234, dated 8 May 2002 17:58:12
Date: Wed, 8 May 2002 17:58:12 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing Step Nine
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
Recently there have been postings to this list announcing data uploads. I
applaud those who have done so for their ability to follow directions. That
protocol is exactly what I had written in the Georeferencing Steps
document. Though these are useful reminders of progress to everyone, in
retrospect it seems that this kind of traffic to the list is unnecessary,
and that I am the only one who really needs to know when the data are
available. Consequently I have changed Step Nine in the Georeferencing
Steps document. Please review it if you have any questions.
In addition, I took the liberty of making the change discussed on 3 May
about simplifying the protocol for determining extents of named places.
That change can be reviewed under the section in the Georeferencing
Guidelines document entitled "Uncertainty due to the extent of a locality."
John
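
The simplified extents rule (distance from the geographic center of the named place to its furthest extent) can be sketched as follows. This is only an illustration: the boundary is assumed to be available as a list of (lat, long) points, the coordinates are made up, and the haversine formula with a mean Earth radius of 6371 km is an assumption of the sketch, not something prescribed by the Georeferencing Guidelines.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def extent_km(center, boundary_points):
    """Distance from the center to the furthest boundary point."""
    return max(
        haversine_km(center[0], center[1], lat, lon)
        for lat, lon in boundary_points
    )


# Hypothetical named place: a center and three points on its boundary.
center = (35.0, -119.0)
boundary = [(35.1, -119.0), (35.0, -119.05), (34.95, -119.0)]
print(round(extent_km(center, boundary), 2))
```

Because only the single largest distance is kept, one extent value per named place is enough regardless of how many offsets the locality description contains.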
>>> Posting number 235, dated 9 May 2002 13:13:29
Date: Thu, 9 May 2002 13:13:29 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Named Place
In-Reply-To: <002901c1f781$c9acc730$9f6e0a0a@Sbober>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX, and all,
Good eye. The downloaded data did not contain the fields "Extent" and
"Named Place" until moments ago, though the Access97 database
manisgeoreftemplate.mdb did include them.
No other column is needed for units. All distance units for a given
calculation must be the same as the original units used in the locality
description, and these units are recorded in the MaxErrorUnits field. The
Named Place field should be filled with the proper name (e.g., Bakersfield,
or "junction of Hwy. 5 and Hwy. 80") rather than the type of named place
(e.g., "ppl", or "lake").
John
At 12:48 PM 5/9/02 -0500, XXXXXXXXXX wrote:
>John-
>
>I just noticed the 'Extent' and Named Place fields listed on the
>Georeferencing Steps page. Are the files downloaded supposed to already
>have them? I just added them, but was unsure if another column was needed
>for units, or if that should all be covered with the max error distance
>units (as it all needs to be normalized anyway). Also, is the named place
>column a specific or general term? Is it the proper name of the place
>used to find the coord.'s, or should just be something along the lines of
>'populated place', 'lake', etc.
>
>>> Posting number 236, dated 9 May 2002 19:05:15
Date: Thu, 9 May 2002 19:05:15 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Fwd: SpecLocality placenames, simply wrong vs probably right
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; format=flowed
Content-Transfer-Encoding: quoted-printable
XXXX, and all,
I have not forgotten this, and I wanted to tell you so, even though I can't
spare the time to do it justice at the moment. It's been like that a lot
lately. I WILL reply though.
John
>X-Sender: gshugart@mail.ups.edu
>Date: Mon, 22 Apr 2002 00:48:24 -0700
>Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
>Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
>From:
>Subject: SpecLocality placenames, simply wrong vs probably right
>To: MAMMAL-Z-NET@USOBI.ORG
>X-Status:
>X-Keywords:
>
>John: It wasn't clear what methods or methodology you disagree
>with. Regarding assumptions, the comment that "some of the
>determinations using this methodology will simply be wrong" seems overly
>optimistic. We will never know "simply" if many localities are wrong or
>right. It is more productive to think in probabilities. We have to make
>assumptions regarding what SpecLocality values represent given that we are
>georeferencing other institutions' records and formats without consulting
>primary data. So I have to assume, as does everyone else, that entries
>like "Hood River" from different collectors and institutions probably
>refer to the same locality and the same type of locality (city rather than
>river). For determining lat/longs and errors, we also assume that the
>lat/longs and boundaries of population centers like Hood River probably
>have not shifted or expanded/contracted significantly over the
>years. These basic assumptions make determinations probabilities rather
>than right or wrong.
>
>Another possibility is that the comment "this methodology will simply be
>wrong" refers to calculating lat/longs based on a placename and
>offsets? Formal release of the web lat/long calculator validates this
>methodology. But, regardless of the methodology the assumptions are the same.
>
>A third possibility is that perhaps you were referring to the reference
>string that Hood River, the ppl, was georeferenced but that there are
>other possibilities? Many placenames with a landscape feature in the name
>(e.g., river, falls, beach, spring) will be in the category of at least
>two types of locality, so for these I am looking (a few keystrokes, so
>little time is wasted) and I am annotating as I find them with a standard
>"could also be ...". About 10% of the Oregon records are in this
>category. If you don't want them georeferenced let me know.
>
>
>
>>XXXX, and all,
>>
>>I disagree with the methods described below on three counts. The most
>>important of my reasons is that some of the determinations using this
>>methodology will simply be wrong, and there won't be any way to know if
>>they are wrong even with the Locality Remarks. My secondary reason is that
>>it will be difficult to find these localities among the data in order to
>>filter them out from analyses for which they aren't appropriate. Finally,
>>it seems to me we gain no benefit from actually having coordinates for
>>ambiguous localities, especially given that they take up precious time to
>>georeference. By ambiguous I mean that there is more than one possible
>>distinct place to which the locality may refer. By distinct I mean that the
>>maximum error circles do not overlap.
>>
>>One could, sometimes appropriately, choose the geographic center of
>>multiple possible places for the coordinates of a locality and have the
>>extent cover all of them. In this case the error would be larger than it
>>would be by choosing any one of the possible localities, but the
>>determination (coordinates plus maximum error distance) would not be wrong.
>>I would use this method sparingly, however, given that it does take quite a
>>bit of time to make a determination of this kind. The method is most
>>appropriate when the distances between the possible places is relatively
>>small so that the maximum error distance itself remains small. I have no
>>objection to this approach, but I would argue vehemently to avoid
>>determinations that are likely to be wrong.
>>John W
>
>
>--
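
The definition of "distinct" quoted above (maximum error circles that do not overlap), and the alternative of one determination whose extent covers all candidate places, can be sketched as follows. The candidate coordinates and error distances are made up, distances use the haversine formula with an assumed mean Earth radius, and averaging latitudes and longitudes to get a center is a simplification that is only reasonable when the candidates are close together.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def are_distinct(a, b):
    """True if the maximum error circles of a and b do not overlap.

    Each candidate is a tuple (lat, long, max_error_km).
    """
    return haversine_km(a[0], a[1], b[0], b[1]) > a[2] + b[2]


def covering_circle(candidates):
    """One determination (lat, long, max_error_km) that covers every candidate."""
    lat = sum(c[0] for c in candidates) / len(candidates)
    lon = sum(c[1] for c in candidates) / len(candidates)
    radius = max(haversine_km(lat, lon, c[0], c[1]) + c[2] for c in candidates)
    return lat, lon, radius


# Hypothetical candidates for one ambiguous locality string.
town = (45.0, -121.0, 2.0)
river_mouth = (45.0, -121.5, 2.0)
suburb = (45.01, -121.0, 2.0)

print(are_distinct(town, river_mouth), are_distinct(town, suburb))
print(covering_circle([town, suburb]))
```

When the candidates are far apart the covering circle becomes large, which matches the advice to use the combined determination sparingly.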
>>> Posting number 237, dated 21 May 2002 18:26:31
Date: Tue, 21 May 2002 18:26:31 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Fwd: sufficiently georeferenced?
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
XXXXX has brought up another good question. I consider the examples she has
given below to be specific localities that just happen to contain
coordinates, and they should therefore be georeferenced. Localities that
have values in the DecLat and DecLong fields should be considered to have
been georeferenced for the purposes of this discussion.
John
>From:
>Subject: sufficiently georeferenced?
>
>John:
>
>The subject of georeferencing records that already have coordinates has
>been discussed in the past, but I had not encountered any prior to Benzie
>County, Michigan.
>
>(Copied here is part of your reply to XXXXXX on 2/15/02:
>
>XXXXXX and all,
>There is no provision for georeferencing records that already have
>coordinates, but this shouldn't necessarily deter you from doing so. If you
>go this route, please be sure to note that you have provided these
>additional data when you send them in to me. It makes a difference in how I
>handle the data on this end....)
>
>
>Would you please advise if the following Benzie examples are what you
>referred to as having coordinates? Are these searchable and provide
>sufficient information on the localities, or should I add decimal degrees
>et al. in the columns provided in the ACCESS template?
>
>SpecLocality: SLEEPING BEAR DUNES NTL LAKESHORE, ESCH RD AT ARAL
>LatText: 44D45'86D04'30"
>
>SpecLocality: ESCH RD AT ARAL, LINE 7
>TRS: 44D45'N 86D04'30"W, 185M
>
>Thanks again for your assistance.
>
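
Coordinates embedded in locality text, such as the second Benzie example, can be converted to decimal degrees for the DecLat and DecLong fields with a sketch like the one below. The pattern is an assumption based only on that example (degrees marked with D, optional seconds, a trailing hemisphere letter); it will not parse the first example, which has no hemisphere letters, and the function name is hypothetical.

```python
import re

# Matches text such as 44D45'N or 86D04'30"W (assumed format; seconds optional).
_DMS = re.compile(r"""(\d+)D(\d+)'(?:(\d+(?:\.\d+)?)")?([NSEW])""")


def dms_to_decimal(text: str) -> float:
    """Convert one degrees-minutes-seconds value to signed decimal degrees."""
    match = _DMS.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60
    if seconds is not None:
        value += float(seconds) / 3600
    # South and west are negative in decimal degrees.
    return -value if hemisphere in "SW" else value


dec_lat = dms_to_decimal("44D45'N")
dec_long = dms_to_decimal("86D04'30\"W")
print(dec_lat, dec_long)
```

A conversion like this only fills in the coordinates; the uncertainty still has to be determined separately.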
>>> Posting number 238, dated 21 May 2002 20:22:25
>>> Posting number 239, dated 23 May 2002 11:49:56
Date: Thu, 23 May 2002 11:49:56 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: urban boundaries Topo USA 4.0
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
MaNISers: Haven't heard back yet if there is an accessible boundary
file, but the boundaries are boundaries rather than some artefact.
There seems to be a trend to annex or incorporate airports and
watersheds which increases the extents in modern times. See original
query below regarding boundaries.
>Dear XXXX,
>
>The boundaries are selectable but you are not able to determine a perimeter
>distance for the boundary. Also, I would only use the program as a general
>reference tool and not for something that requires precise measurement.
>
>
>Regards,
>Dan Lee
>Technical Support Specialist
>
>DeLorme Tech Support
>Two DeLorme Drive
>Yarmouth, Maine 04096
>E-Mail: tech_inbox@DeLorme.com
>
>-----Original Message-----
>From:
>Sent: Tuesday, May 07, 2002 1:44 PM
>To: tech@delorme.com
>Subject: urban boundaries Topo USA 4.0
>
>
>Support: I recently purchased Topo USA 4.0 for a
>mapping/georeferencing project. I'm impressed with the program. For
>the project it is important that I can come up with the length &
>width or extents of cities, urban areas and suburbs. I noticed that
>a right click on an urban area (brown) then the create route option
>shows a green line highlighting what appear to be boundaries of the
>city, suburb, incorporated area. This also seems to work with
>parks. This boundary feature doesn't appear to be documented in the
>manual. Are these the official city/urban boundaries or some route
>that the program creates independent of boundaries?
>
>I queried the knowledge base but didn't come up with anything
>specific on limits, city limits, urban boundaries with and without
>create route.
>--
>>> Posting number 240, dated 26 May 2002 23:34:58
>>> Posting number 241, dated 27 May 2002 09:17:14
>>> Posting number 242, dated 29 May 2002 18:14:13
>>> Posting number 243, dated 29 May 2002 18:17:09
>>> Posting number 244, dated 4 Jun 2002 22:18:54
>>> Posting number 245, dated 5 Jun 2002 14:33:55
Date: Wed, 5 Jun 2002 14:33:55 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: georeferencing rates
Comments: cc:
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Barbara & John,
I have finished approximately 2700 records since the second week of April.
However, the first two weeks were spent getting a computer, setting it up
with the correct software and familiarizing myself with the project and how
I was going to tackle it. The third week was when I really started on the
project. Here is the break down by counties in Nevada:
Washoe County:
809 records
17 working days to complete
119 working hrs.
6.79 records/hr. average
FTP'ed 5/7/2002
Storey, Carson City, Douglas, Lyon, and Mineral Counties:
822 records
10 working days to complete
70 working hours
11.74 records/hour average
FTP'ed 5/23/2002
Humboldt County:
605 records
6 working days to complete
42 working hours
14.4 records/hour average
FTP'ed 5/31/2002
Pershing County:
136 records
2 working days to complete
14 working hours
9.7 records/hour average
FTP'ed 6/4/2002
Total Average Records/Hour = 10.65 Records/Hour
My working hours are calculated for a 7 hour working day. I am in the
office for 9 hours, however, 1 hour is used for lunch and the other hour is
for my miscellaneous walks around the museum to keep me fresh and awake,
therefore, 7 working hour days.
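
The county figures above can be rechecked with a short sketch. It uses only the record counts and working hours given in this message. The posted total of 10.65 appears to be the plain average of the four per-county rates (an inference from the arithmetic, not something the message states); pooling all records over all hours gives a lower figure, about 9.7 records/hour, because the slowest county took the most hours.

```python
# (records, working hours) per batch, as reported in the message above.
counties = {
    "Washoe": (809, 119),
    "Storey, Carson City, Douglas, Lyon, Mineral": (822, 70),
    "Humboldt": (605, 42),
    "Pershing": (136, 14),
}

# Rate for each batch.
rates = {name: records / hours for name, (records, hours) in counties.items()}

# Unweighted mean of the four rates (close to the posted "Total Average").
mean_of_rates = sum(rates.values()) / len(rates)

# Pooled rate: all records divided by all hours.
total_records = sum(records for records, _ in counties.values())
total_hours = sum(hours for _, hours in counties.values())
pooled_rate = total_records / total_hours

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} records/hour")
print(f"mean of rates: {mean_of_rates:.2f}")
print(f"pooled rate:   {pooled_rate:.2f}")
```

For comparing against the budgeted rates, the pooled figure is the one that reflects hours actually spent.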
I use:
MS Access 2000 for data manipulation and compiling
ArcView 3.2 for viewing localities
USGS Geographic Names Information System (GNIS) for the state of Nevada.
Feature data used in ArcView.
NV Atlas & Gazetteer by DeLorme for verifying locations and getting the
"extents" or size of features.
Topozone.com to look for features that are not found using GNIS or to get a
better topographic view than what DeLorme delivers.
http://www.esg.montana.edu/gl/trs-data.html - for converting TRS data to
latitude and longitude
Picking XXXXXXXX' brain for some locational stumpers!
John has asked previously what I have done to get the rates that I am
getting and I had answered him in another e-mail about that. In short, I
manipulate the data as much as I can before I start working on it. I divide
the NV database into individual counties. I then arrange the localities
within the counties to be clumped together so that I can work on one
locality at a time. This saves me time by not having to look up new
latitudes and longitudes for different areas. If you want details, let me
know and I will write a much more detailed explanation.
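
The pre-sorting described above (split by county, then clump identical localities so each is looked up once) could be done along these lines. This is a sketch, not the Access workflow actually used: SpecLocality is the field name used in the MaNIS data, while the ID and County field names, the sample records, and the normalization rule are assumptions.

```python
from collections import defaultdict


def normalize(locality: str) -> str:
    """Collapse case and whitespace so trivially different strings group together."""
    return " ".join(locality.upper().split())


def group_localities(records):
    """Map (county, normalized locality) to the list of record IDs sharing it."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["County"], normalize(rec["SpecLocality"]))
        groups[key].append(rec["ID"])
    # Sort so each county's localities sit next to each other.
    return dict(sorted(groups.items()))


# Made-up sample records.
records = [
    {"ID": 1, "County": "Washoe", "SpecLocality": "Reno,  3 mi N"},
    {"ID": 2, "County": "Storey", "SpecLocality": "Virginia City"},
    {"ID": 3, "County": "Washoe", "SpecLocality": "reno, 3 mi n"},
]

groups = group_localities(records)
for (county, locality), ids in groups.items():
    print(county, "|", locality, "|", ids)
```

Each group then needs only one coordinate lookup, with the result copied to every record ID in the group.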
Hope this helps. Take care.
Sincerely,
XXXXXXXXXXX
Curatorial Assistant
>>> Posting number 246, dated 6 Jun 2002 09:45:36
Date: Thu, 6 Jun 2002 09:45:36 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: plotting lat/longs in Topo USA 4.0
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Barb and all:
A cool item in the tip category that might be useful for the meeting or for
just reviewing, proofing, or displaying records is the capability to display
your work in Topo USA 4.0. All that is required is a file with records
having lat, long, and a label. Import & open the file (in text format), and
the localities appear with little flags or label of your choice on the Topo
maps. Multiple records for the same point could use some preprocessing to
count and display the number. Other programs do the same, but I was so
impressed just thought I would pass it along.
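
The preprocessing mentioned above (counting multiple records at the same point and showing the number in the label) might look like the sketch below. The output layout is an assumption: a comma-separated text file with lat, long, and label columns. The exact import format Topo USA 4.0 expects is not given here and should be checked against the program. The sample records are made up.

```python
import csv
import io
from collections import Counter


def summarize_points(records):
    """Collapse records sharing a point into one row labeled with the count.

    records: iterable of (lat, long, label) tuples in decimal degrees.
    """
    counts = Counter()
    first_label = {}
    for lat, lon, label in records:
        key = (round(lat, 5), round(lon, 5))
        counts[key] += 1
        first_label.setdefault(key, label)
    return [
        (lat, lon, f"{first_label[(lat, lon)]} ({n})")
        for (lat, lon), n in counts.items()
    ]


def to_text(rows):
    """Render rows as comma-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["lat", "long", "label"])
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample records: two at the same point, one elsewhere.
sample = [
    (45.7, -121.5, "Site A"),
    (45.7, -121.5, "Site A"),
    (44.0, -123.1, "Site B"),
]

rows = summarize_points(sample)
print(to_text(rows))
```

Rounding to five decimal places treats coordinates that differ only by floating-point noise as the same point; that threshold is a choice of this sketch.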
>>> Posting number 247, dated 6 Jun 2002 09:57:24
Date: Thu, 6 Jun 2002 09:57:24 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: georeferencing rates
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Barbara and John,
Our georeferencer for UAM, XXXXXXXXXX, has just left for the field for
the remainder of the summer. I have looked over the Access file that she
has been using as well as her timesheets to figure out how much time she has
been spending on this and how many records are completed. This is what I
found:
4078 records from Alaska have been georeferenced.
A total of 200 hours have been charged to the MaNIS account for her time.
20.39 records/hour
XXXXX works 20 hours per week and does nothing but georeference. She has
spent 10 weeks on this project. This average (20/hr) seems very high. I am
wondering if it is just a consequence of the Alaska data - there are many
duplicate localities (e.g. 50 different localities that contain Fairbanks -
the city). XXXXXX looks up the lat/long once (or a few times, depending on
the data) and varies the ME as needed. There are also quite a few that she
was not able to georeference. We have been mainly using a CD map program,
"All Topo Maps: ALASKA". It contains all the USGS maps of Alaska. It is very
handy for georeferencing rivers, bays, lakes, distances from named places,
etc.
We have georeferenced about 46% of the localities that we started with (that
needed georeferencing). We still have about 4734 records to georeference.
We have broken the Alaska localities into 2 groups, those from other
museums, and those from UAM. All of the localities that are currently
georeferenced are from other museums - we only have 740 more to go. As for
the other 4000 or so left to georeference, they are all from UAM.
Let me know if you need any additional clarification. Cheers!
XXXXXXXXXXXXXXX
"Barbara R. Stein" wrote:
> Dear All,
>
> John and I are very interested in how the georeferencing is
> progressing. It is one of many topics we would like to discuss at the
> ASM meeting, but time will be short and I am not sure it is the best use
> of that forum. I would much rather demo the network to you and begin
> talking about how we will be bringing collections on line!
> Consequently, we would like to ask each institution to calculate their
> current georeferencing rate and determine if you are doing less than,
> the same as, or better than the rates on which we based our proposal
> budget. They were:
>
> 9/hr for US localities
> 6/hr for non-US North American localities
> 3/hr for non-North American localities
>
> We suspect that your rates will have improved now that you are all
> familiar with the process, but if you are easily exceeding these figures
> we would like to know if there are tricks you have developed and ask
> that you post these to the list. If you are pretty much on target, this
> is good for us to know as well. And if you are lagging, now is the time
> to speak up and let's figure out how you can catch up. If you would
> send this information within the next week, John and I will summarize
> the stats in time for the meeting and present a quick overview.
>
> FYI, John and I are getting an increasing number of queries about MaNIS,
> from folks who are anxious to access our data, both the georeferenced
> localities and the specimen records to which they will eventually be
> linked, and from those who are considering pursuing similar projects on
> their own. We have pointed a number of individuals to the
> georeferencing steps and guidelines posted on the web site and we will
> continue to update these and streamline them as you provide us with
> feedback.
>
> Looking forward to seeing you in Lake Charles.
>
> Best,
> Barbara
>>> Posting number 248, dated 6 Jun 2002 13:25:30
Date: Thu, 6 Jun 2002 13:25:30 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: georeferencing rates
In-Reply-To: <3CFD9F3D.EB78F191@oz.net>
Mime-version: 1.0
Content-transfer-encoding: 7bit
Content-type: text/plain; charset="US-ASCII"
Dear Barbara, John, et al.,
My team--XXXXXXXXXX and XXXXXXXXXX--seem to be working on schedule
and on pace. XXXXX tells me that most localities can be georeferenced at a
rate of about 9-10 per hour, but they occasionally hit snags that slow them
down to about 5, or so, per hour. Not surprisingly, their overall rate is
still improving as they gain more and more experience.
See you in Lake Charles. XXXXX
>>> Posting number 249, dated 7 Jun 2002 09:34:28
Date: Fri, 7 Jun 2002 09:34:28 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: georeferencing rates
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Barb and all: I was doing about 25 records/hr using GNIS placenames and
calculated lat/longs in Excel, even with having to look up extents on Topo
USA 4.0. The rate has slowed to about 10/hr doing road miles and other
non-GNIS placename records. TRS records should be in the 20/hr range. I
should finish up with the 6,752 Oregon dataset sometime in July.
Some ideas to maximize georeferencing rates for future work are to
concentrate on those records with GNIS placenames. This would get about
60-70% in the US. Also clean your data first; GNIS provides a great
filter/dictionary file for US records and can be done at a rate of about 100
records/hour using semi-automation. Eliminating or saving extents for later
would push the georeferencing rate to over 100/hr with clean data.
>>> Posting number 250, dated 10 Jun 2002 09:26:26
Date: Mon, 10 Jun 2002 09:26:26 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: georeferencing rates
In-Reply-To: <3CFD9F3D.EB78F191@oz.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Barbara, John, et al.
We (UMNH) are georeferencing Utah localities at a much faster rate than
anticipated using a very effective combination of the GNIS gazetteer and the
National Geographic TOPO! software (USGS 1:24,000 maps). I estimate 20-30/h
for unambiguous localities, and 5-10/h for those that are more difficult (i.e.
vague). We are flagging very few as unresolved problems.
XXXX
>>> Posting number 251, dated 10 Jun 2002 10:50:20
Date: Mon, 10 Jun 2002 10:50:20 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: georeferencing rates
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Barbara, John, and all-
Here at CAS, I have been georeferencing US localities at a rate of 12-15 per
hour for straightforward localities and 8-10 per hour for localities that
require more detective work. I have been using a combination of GNIS
gazetteer and Terrain Navigator 2001 for each New England state I have
worked on thus far.
XXX
>>> Posting number 252, dated 11 Jun 2002 11:51:06
Date: Tue, 11 Jun 2002 11:51:06 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: georeferencing rates
In-Reply-To: <3CFD9F3D.EB78F191@oz.net>
MIME-version: 1.0
Content-type: text/plain; format=flowed; charset=us-ascii
Content-transfer-encoding: 7BIT
Dear Barb and John,
XXX's georeferencing progress to date is as follows:
Texas- 10049 localities, 4095 finished, 290 not georeferenced due to lack
of data.
Oklahoma- 1233 localities, 218 finished, 33 insufficient locality data.
Total- 4699 records finished
91 working days @ 4 hrs per day = 364 hours.
Average georeferencing rate = 13 per hour.
We are using Topozone and USGS data for georeferencing. Many of the
records are being done semi-automatically using the UTM Converter developed
at TTU several years ago. The data have to be cleaned up considerably
before this process can occur, however. Ambiguous localities are located
(where possible) using National Geographic's "TOPO! Texas" CD set and Texas
State Department of Highways and Public Transportation's published county
maps (paper).
Sincerely,
XXXXXX.
>>> Posting number 253, dated 11 Jun 2002 19:25:25
Date: Tue, 11 Jun 2002 19:25:25 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: georeferencing rates
In-Reply-To: <4.2.2.20020611114004.00aa3478@packrat.musm.ttu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; format=flowed
Content-Transfer-Encoding: quoted-printable
Dear all:
CNMA's (Colección Nacional de Mamíferos, at Instituto de Biología,
UNAM) georeferencing progress is as follows:
Mexican states: Puebla and Tlaxcala.
effective working period: March - May 2002 = 66 working days;
6 h per day = 396 h;
total number of different localities (not records) georeferenced = 547;
georeferencing rate = 1.4 localities/h;
>We are using INEGI (Instituto Nacional de Estadística, Geografía e
>Informática, México) data and maps for georeferencing. Most data had to
>be cleaned up before georeferencing; most Mexican localities cannot be
>automatically done since they have not been georeferenced. Our skills are
>improving though.
> XXXXXXXX
------------------------------------------------
>>> Posting number 254, dated 12 Jun 2002 09:26:20
>>> Posting number 255, dated 12 Jun 2002 13:05:00
Date: Wed, 12 Jun 2002 13:05:00 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Georef. Rate
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
For Illinois, I have gotten up to about 15 per hour with unambiguous, or
down to about 5-6 per hour for those requiring a bit more work. Still
pursuing available resources for speeding this up as needed.
XXXXXXXXX
>>> Posting number 256, dated 12 Jun 2002 14:02:26
Date: Wed, 12 Jun 2002 14:02:26 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Organization: Mammalogy
Subject: Re: georeferencing rates
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Folks:
We have averaged 9 localities per hour. At times we can move at 20+ per
hour, others are show-stoppers. New Mexico - with ca. 17,000 unique
localities to work on - started slow, but we have beat the learning
curve and should move along nicely this summer.
-XXXXXXX
>>> Posting number 257, dated 19 Jun 2002 09:25:52
>>> Posting number 258, dated 19 Jun 2002 10:29:00
>>> Posting number 259, dated 19 Jun 2002 10:52:01
>>> Posting number 260, dated 19 Jun 2002 08:51:14
>>> Posting number 261, dated 26 Jun 2002 12:54:15
Date: Wed, 26 Jun 2002 12:54:15 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Automated Georeferencer
In-Reply-To: <p05100302b8e9651a4b72@[207.207.104.113]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX,
Here's the URL for the test site for the International Automated Georeferencer:
http://129.237.201.122/manis_si.html
>>> Posting number 262, dated 26 Jun 2002 15:13:32
>>> Posting number 263, dated 26 Jun 2002 13:22:40
Date: Wed, 26 Jun 2002 13:22:40 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Automated Georeferencer
In-Reply-To: <5.0.0.25.2.20020626125318.024ce160@socrates.berkeley.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
This site is in development. Feel free to look at it and try it, but please
do not try to incorporate it into your daily routine yet. Reed Beaman and I
are working to make a system that is streamlined with our georeferencing
techniques. We will announce the tool as ready for prime time when the
current pending issues have been addressed and I have written documentation
on how to use it properly.
Thanks,
John W
>XXXX,
>
>Here's the URL for the test site for the International Automated
>Georeferencer:
>
>http://129.237.201.122/manis_si.html
>>> Posting number 264, dated 26 Jun 2002 13:35:29
>>> Posting number 265, dated 28 Jun 2002 12:50:56
Date: Fri, 28 Jun 2002 12:50:56 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Captive flag, Zoo's
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
John, and all--
I just explored the various Zoo records, for Illinois, and felt I should
inform you of what I found. Two localities that were captive flagged were
actually from the zoo grounds, and were not captive specimens, but animals
living wild on the grounds themselves. I found this with our own records,
but cannot investigate to confirm the other records. I will change the
captive flag, and add georeferencing information on the 2, but thought you,
and all, might want to know about this potential (albeit somewhat nitpicky)
snag.
XXXX
>>> Posting number 266, dated 28 Jun 2002 11:50:31
Date: Fri, 28 Jun 2002 11:50:31 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: georeferencing streams, creeks, rivers for SpecLoc
Mime-Version: 1.0
Content-Type: text/plain; format=flowed
MaNISers: Hard to believe, but creeks, rivers and streams are used as
SpecLoc for many Oregon records and probably others. I've checked some of
our records against the specimen tags/catalogs and they are not typos or
omissions of data. I've been georeferencing them by taking a midpoint
(straight line or creek miles). The GNIS download has the mouth and source
dec lat/longs for streams, creeks, rivers. Plotting these first saves much
time vs trying to follow a creek onscreen. Once ends are marked the
measuring tool gives the total distance and midpoint.
Topo USA 4.0 will also give intersections of roads/water courses and water
courses using the street intersection option.
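The straight-line midpoint described above can also be computed directly from the mouth and source coordinates instead of with an on-screen measuring tool. The sketch below is an illustration only: it assumes a spherical Earth, the function names and the sample coordinates are hypothetical, and it does not follow the actual course of the stream (the "creek miles" option).

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def midpoint(lat1, lon1, lat2, lon2):
    """Great-circle midpoint of two points, returned in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    bx = math.cos(p2) * math.cos(dlmb)
    by = math.cos(p2) * math.sin(dlmb)
    lat_m = math.atan2(math.sin(p1) + math.sin(p2),
                       math.hypot(math.cos(p1) + bx, by))
    lon_m = math.radians(lon1) + math.atan2(by, math.cos(p1) + bx)
    # Normalize longitude to the range [-180, 180).
    lon_deg = (math.degrees(lon_m) + 540) % 360 - 180
    return math.degrees(lat_m), lon_deg


# Hypothetical mouth and source coordinates, not taken from GNIS.
mouth = (45.50, -122.60)
source = (45.30, -122.20)
print(midpoint(*mouth, *source))
print(haversine_km(*mouth, *source) / 2)  # half the straight-line length, in km
```

Half the mouth-to-source distance is one candidate for the extent of such a locality, but that choice is an assumption here and not something stated in the posting.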
>>> Posting number 267, dated 28 Jun 2002 12:31:00
Date: Fri, 28 Jun 2002 12:31:00 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Captive flag, Zoo's
In-Reply-To: <GMENIPGLHCAIHELBOECBIEJNCAAA.sbober@fmnh.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Dear XXXX, and all,
This is indeed an important point, of which all data providers and
georeferencers should remain aware. Here's a point of information: for every
record that comes with the captive flag turned on (except for those from
MVZ and UAM), the flag was set by me when I parsed the data for the gazetteer. My
purpose in doing so was to highlight records that are likely not to
represent localities that are valid for species distributions. By setting
this flag, I identified records that should be checked by the contributing
institutions. At times, as XXXX notes, these records actually refer to
collecting events that are valid for the species distribution. However, as
georeferencers without access to the primary material we are unable to
determine this. The plan, then, is to georeference these localities as you
would any other. It will be up to the individual institutions
to determine whether to include the data for particular
specimens. Institutions that feel the need may later want
to add a field to their databases that is similar to the captive flag (we
call it the Valid_Distribution_Flag) to accommodate this issue.
John W
On Fri, 28 Jun 2002, XXXXXXXXX wrote:
> John, and all--
>
> I just explored the various Zoo records, for Illinois, and felt I should
> inform you of what I found. Two localities that were captive flagged were
> actually from the zoo grounds, and were not captive specimens, but animals
> living wild on the grounds themselves. I found this with our own records,
> but cannot investigate to confirm the other records. I will change the
> captive flag, and add georeferencing information on the 2, but thought you,
> and all, might want to know about this potential (albeit somewhat nitpicky)
> snag.
>
> XXXX
>
>>> Posting number 268, dated 28 Jun 2002 12:40:29
Date: Fri, 28 Jun 2002 12:40:29 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: georeferencing streams, creeks, rivers for SpecLoc
In-Reply-To: <F84thODt9Cz93MgPPf300000ff7@hotmail.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=X-UNKNOWN
Content-Transfer-Encoding: QUOTED-PRINTABLE
Nice. And remember, if a Locality says "Bitterroot River" and the county is
"Missoula County" use only that part of the Bitterroot River that is in
Missoula County for the determination.
On Fri, 28 Jun 2002, XXXXXXXXX wrote:
> MaNISers: Hard to believe, but creeks, rivers and streams are used as
> SpecLoc for many Oregon records and probably others. I've checked some of
> our records against the specimen tags/catalogs and they are not typos or
> omissions of data. I've been georeferencing them by taking a midpoint
> (straight line or creek miles). The GNIS download has the mouth and source
> dec lat/longs for streams, creeks, rivers. Plotting these first saves much
> time vs trying to follow a creek onscreen. Once ends are marked the
> measuring tool gives the total distance and midpoint.
>
> Topo USA 4.0 will also give intersections of roads/water courses and water
> courses using the street intersection option.
>
>>> Posting number 269, dated 28 Jun 2002 13:39:01
Date: Fri, 28 Jun 2002 13:39:01 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: georeferencing streams, creeks, rivers for SpecLoc
Mime-Version: 1.0
Content-Type: text/plain; format=flowed
John: Thanks, forgot to say that. The mouth and source are often in
different counties. GNIS gives the county for the mouth which often doesn't
match the HigherGeog MaNIS county. In these cases I am assuming county is
correct and going with the part of the water course in the HigherGeog
county. Annotating as needed. We never did discuss the assumption that
county is correct.
>>> Posting number 270, dated 28 Jun 2002 15:50:52
Date: Fri, 28 Jun 2002 15:50:52 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: georeferencing streams, creeks, rivers for SpecLoc
In-Reply-To: <F123RvITt7Qs6hvfPAZ00001126@hotmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
As for the county issue, I generally assume its correctness, unless given
reason to doubt. I have run into discrepancies that were cleared up by
looking at a paper map with counties, and seeing that certain things run
through multiple counties. Also, there have been two counties listed for
the same named locality, that I initially assumed were identical, until GNIS
gave two choices. Annotation/remarks seem to be the way to keep things
clear, unless there are other ideas.
XXXXXXXX
-----Original Message-----
From: Mammal Networked Information System
[mailto:MAMMAL-Z-NET@USOBI.ORG]On Behalf Of Gary Shugart
Sent: Friday, June 28, 2002 3:39 PM
To: MAMMAL-Z-NET@USOBI.ORG
Subject: Re: [MANIS] georeferencing streams, creeks, rivers for SpecLoc
John: Thanks, forgot to say that. The mouth and source are often in
different counties. GNIS gives the county for the mouth which often doesn't
match the HigherGeog MaNIS county. In these cases I am assuming county is
correct and going with the part of the water course in the HigherGeog
county. Annotating as needed. We never did discuss the assumption that
county is correct.
>>> Posting number 271, dated 9 Jul 2002 10:21:16
Date: Tue, 9 Jul 2002 10:21:16 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: batch processing TRS data
Mime-Version: 1.0
Content-Type: text/plain; format=flowed
MaNISer: After having assistants do a number of records via the TRS web
site I happened on the batch processor link. The converter program can be
downloaded to your desktop and data can be submitted as a file. It takes
some parsing and concatenating of the MaNIS (or other) TRS strings or fields
to get the data in an acceptable format, but is fairly easy to do with
Excel. Lat/long are output with the TRS identifier, thus allowing a link
back to the MaNIS file. If anyone is interested and has 100's of TRS
records to do this might be a speedy option to consider. I'll help with the
data formatting if needed.
>>> Posting number 272, dated 9 Jul 2002 10:28:02
Date: Tue, 9 Jul 2002 10:28:02 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: batch processing TRS data, the link
Mime-Version: 1.0
Content-Type: text/plain; format=flowed
The link for the TRS batch processor is
http://www.geocities.com/jeremiahobrien/trs2ll.html.
Also I tried the program and it works.
>>> Posting number 273, dated 18 Jul 2002 10:18:17
Date: Thu, 18 Jul 2002 10:18:17 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: georeferencing Congo
Comments: To:
In-Reply-To: <AA33E10E16DAD411BDFD0008C7CF50E6097FFB48@hawk.mail.ukans.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear XXXXXX and all,
The question raised below is a good one, and it is the most difficult of
the problems we'll encounter with localities for which detailed maps are
not available. I'll include here an excerpt from the forthcoming ASM
Meeting Notes document and open my proposed solution to discussion.
"A question was raised about determining errors for foreign localities if
you do not know the extent of the nearest named place. For instance, you
may know that you were 4.6km NW of Hotezel, South Africa, but if you don't
know the extent of the village of Hotezel itself, how do you determine the
extent?
This is a tricky problem to which there are numerous possible
solutions. An ideal solution is one that is simple to remember and simple
to implement so that it is executed consistently under all
circumstances. The first thing to remember is that we have no dictum
saying that the maximum error distance has to be as small as possible.
Instead, it has to be as large as necessary to ensure that we are not
over-representing the accuracy of the data. With that in mind, I recommend
the following approach to determining the extent of a named place when it
cannot be determined directly from the maps, gazetteers, or any of the
other tools at hand.
1) Determine the location of the named place that is nearest to the
one for which you are trying to determine the extent. Let's call that
named place the "nearest neighbor."
2) Use one-half the distance between the named place of interest and
its nearest neighbor as the extent of the named place of interest. At times
this may turn out to be an unrealistically large extent, but there is no
harm in that. Into the future, estimates of the error distance can be
refined as better information becomes available."
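The two steps in the excerpt above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: distances are great-circle distances on a spherical Earth, the function names are hypothetical, and the coordinates in the example are invented rather than taken from any gazetteer.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_neighbor_extent(place, neighbors):
    """Return (name of nearest neighbor, extent in km).

    place     -- (lat, lon) of the named place of interest
    neighbors -- dict mapping other place names to (lat, lon)
    The extent is one-half the distance to the nearest neighbor.
    """
    name, dist = min(
        ((n, haversine_km(place[0], place[1], lat, lon))
         for n, (lat, lon) in neighbors.items()),
        key=lambda pair: pair[1],
    )
    return name, dist / 2


# Invented coordinates for illustration only.
village = (-27.20, 22.95)
others = {"Place A": (-27.45, 23.05), "Place B": (-26.90, 22.60)}
print(nearest_neighbor_extent(village, others))
```

The extent obtained this way is deliberately generous, in keeping with the point that the error should be as large as necessary rather than as small as possible.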
At 09:15 AM 7/18/02 -0500, XXXXXXXX wrote:
>Hi.
>
>I'm a graduate student at XX and I'm georeferencing Congo localities for the
>MaNIS project. I have a question related to calculating the maximum error
>distance; I don't know what to do for the extent of localities. I've been
>using gazetteers and atlases to find localities but I couldn't find good
>maps to determine the extent of localities.
>
>Are there any chances I could set the extent of most localities at unknown?!
>I'm sure this is a dumb question, but I run into this problem when I started
>to use the georeferencing calculator and I don't know what to do.
>
>Please let me know if you have suggestions.
>Thank you.
>XXXXXXX
>
>>> Posting number 274, dated 18 Jul 2002 14:48:37
>>> Posting number 275, dated 19 Jul 2002 17:42:23
Date: Fri, 19 Jul 2002 17:42:23 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: TRS records
Mime-Version: 1.0
Content-Type: text/plain; format=flowed
MaNIS: Any guidance on records dealing with 1/4 of 1/4 sections
appreciated. Example Oregon records appear in the gazetteer as:
T2N,R45E,Sec. 25 NW SE (UW)
SW .25,NE .25 sec.12,T11S,R5W (LSU)
Some of the ambiguity might result from concatenating separate fields for 1/4
and 1/4 of 1/4 but I can't tell.
On these I stop at the section or use the placename + offset if error
appears to be smaller. Just wondered if I missed a standard order
somewhere? (Note: KU's TRSs are truncated so I am stopping at section on KU
records at the suggestion of KU).
One way of dealing with the ambiguity I found in the UW records:
T35S,R6E,Sec. 10 SW1/4,SE1/16
Presumably this is the SE 1/4 of the SW 1/4 of Sec 10 and not a typo?
To finish off the Oregon TRS records I used the batch processor. It worked
great and matched those I had done one at a time on the web site. It
doesn't like TR without a section however. On the few without section, I
used section 15 and calculated the lat/long .5 mi S and .5 mi W to get the
center of the TR. For some with just TR a smaller error resulted from using
the placename and offset. In this case I went with the placename + offset.
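The section 15 trick above relies on the standard section numbering within a township. The sketch below assumes an idealized township of 36 square sections, each one mile on a side, numbered in the usual serpentine order starting at the northeast corner; real townships are often irregular, so treat the offsets as approximate. The function names are hypothetical.

```python
def section_row_col(section: int) -> tuple[int, int]:
    """Return (row from north, column from west), both 0-5, for a section number.

    Sections 1-6 run east to west along the north edge, 7-12 run west to
    east in the next row, and so on down to section 36 in the southeast.
    """
    if not 1 <= section <= 36:
        raise ValueError("section must be between 1 and 36")
    row = (section - 1) // 6
    pos = (section - 1) % 6
    col = 5 - pos if row % 2 == 0 else pos
    return row, col


def offset_to_township_center(section: int) -> tuple[float, float]:
    """Miles (east, north) from the center of a section to the township center.

    Negative values mean west or south.
    """
    row, col = section_row_col(section)
    east_of_west_edge = col + 0.5          # section center, miles from west edge
    north_of_south_edge = 5.5 - row        # section center, miles from south edge
    return 3.0 - east_of_west_edge, 3.0 - north_of_south_edge


print(section_row_col(15))
print(offset_to_township_center(15))
```

For section 15 the offset comes out as half a mile west and half a mile south, which agrees with the approach described in the posting.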
>>> Posting number 276, dated 19 Jul 2002 20:03:17
Date: Fri, 19 Jul 2002 20:03:17 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: TRS records
In-Reply-To: <F102xrUDugdSORDNUjE00011800@hotmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX, and others doing localities with TRS data from UW and LSU,
I constructed the localities from the Burke Museum from data that were
originally contained in three separate fields for the TRS data. For the
Section part of the localities you see, I concatenated the verbatim value
of the Section field to the abbreviation "Sec.", thus, for your first
example below, the original data had "25 NW SE" in the Section field. For
your second UW locality below, the original section field contained "10
SW1/4,SE1/16". There are quite a few examples like these. Would a
representative from UW please clarify these issues?
The TRS data for LSU are all contained in the Locality field itself and
were not interpreted or parsed in any way. An LSU representative will have
to let us know if there is a consistent rule about how to interpret the TRS
for LSU localities.
Have there been any other ambiguities such as these?
Thanks for bringing these to our attention, XXXX.
John
>>> Posting number 277, dated 19 Jul 2002 23:16:04
Date: Fri, 19 Jul 2002 23:16:04 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: TRS records
Mime-Version: 1.0
Content-Type: text/plain; format=flowed
Other ambiguous 1/4 sections:
We (PSM) have about 20 records in Oregon like:
Brookings; T40S, R14W, S36, NW .25, SE .25
There is no way to tell 1/4 vs 1/4 of 1/4 from the record, but our format is
TRS, 1/4 section, 1/4 of 1/4 section unless the text says something
different. These follow the format of the collector, but are ambiguous and
will be corrected when I get to it.
Also MVS had one T2S,R10E,S21 SW1/4,NE1/4 (MaNIS # 278049).
>From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
>Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
>To: MAMMAL-Z-NET@USOBI.ORG
>Subject: Re: [MANIS] TRS records
>Date: Fri, 19 Jul 2002 20:03:17 -0700
>
>....
>
>Have there been any other ambiguities such as these?
>
>....
>John
>
>
>>> Posting number 278, dated 22 Jul 2002 09:08:40
Date: Mon, 22 Jul 2002 09:08:40 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: TRS records
In-Reply-To: <5.0.0.25.2.20020719195028.02489710@socrates.berkeley.edu>
Mime-version: 1.0
Content-type: text/plain; charset="US-ASCII"
Content-transfer-encoding: 7bit
The example cited section 10 SW1/4, SE1/16 means the SE1/16 of the SW1/4 of
section 10. I could check with the collector, but that is the way it is
written in his field catalog and I have to presume that this is correct.
The hierarchy of locality is to go from larger to smaller entities.
The T2N, R45E, Sec 25 NW SE is straight from the collector's descriptions
and I would presume that in this case it means the SE1/4 of the NW1/4 of
Sec. 25. The alternative is to not assume anything and truncate the
locality to the section with a comment on the ambiguity.
Hope that this helps.
Cheers, XXXX
>
>>> Posting number 279, dated 22 Jul 2002 12:16:29
>>> Posting number 280, dated 22 Jul 2002 20:00:22
Date: Mon, 22 Jul 2002 20:00:22 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Chile is done
Comments: cc: acaiozzi@uclink.berkeley.edu
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear all,
Having a native Chilean student to do georeferencing of Chilean localities
has proven to be extremely beneficial. First, on her trip back home to
Santiago in June, she acquired some nice 1:500,000 military maps of most of
the country at fairly low cost. She was then able to use those maps, the
Alexandria Digital Library Gazetteer, the US Board of Geographic Names
Gazetteer, and a variety of other sources on the web to georeference the
877 localities in 75.5 hours for a mean rate of 11.62 localities per hour.
Huzzah. Moral of the story: go out and find willing and able natives to do
georeferencing of foreign localities.
John
>>> Posting number 281, dated 23 Jul 2002 08:14:23
>>> Posting number 282, dated 23 Jul 2002 08:15:09
>>> Posting number 283, dated 23 Jul 2002 18:48:47
Date: Tue, 23 Jul 2002 18:48:47 -0400
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: TRS records
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Dear MaNIS Users,
I wanted to comment on interpretation of fractional divisions of sections
for MSU Museum Township-Range-Section (TRS) records.
For fractional land divisions, MSU uses the format based on the legal
description/U.S. Public Land Survey system, in that the smallest fractional
division is listed first going from left to right, and the comma(s)
separating fractions are read as "of the".
The abbreviated description NW 1/4, NE 1/4, Sec. 5, T2N, R1W would be read
as: The Northwest quarter of the Northeast quarter of Section 5 of
Township 2 North, Range 1 West.
In our records, the fractions may appear after the TRS information, but the
first fraction in the list is always the smallest. The abbreviated
description T. 2 N., R. 1 E., Sec. 16, SW 1/4, SE 1/4, NW 1/4 would be read
as: The Southwest quarter of the Southeast quarter of the Northwest quarter
of Section 16, Township 2 North, Range 1 East.
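The reading rule described above (smallest division first, each comma read as "of the") can be sketched as a small function that locates the center of the described parcel relative to the center of its section. This is an illustration only: it assumes an idealized one-mile-square section, and the function name and the `smallest_first` switch are hypothetical. The switch exists because other collections in this discussion list the larger division first.

```python
# Direction signs for each quarter: (east, north).
QUADRANT_SIGNS = {"NE": (1, 1), "NW": (-1, 1), "SE": (1, -1), "SW": (-1, -1)}


def aliquot_center(parts, smallest_first=True):
    """Return (east, north, side) in miles for a fractional section description.

    parts -- quarter names as written, e.g. ["SW", "SE", "NW"] for
             "SW 1/4, SE 1/4, NW 1/4"
    east, north are offsets of the parcel center from the section center;
    side is the side length of the parcel.
    """
    ordered = list(reversed(parts)) if smallest_first else list(parts)
    east = north = 0.0
    half = 0.5  # half the side of the current square, starting with the section
    for name in ordered:  # walk from the largest division to the smallest
        sx, sy = QUADRANT_SIGNS[name.upper()]
        half /= 2
        east += sx * half
        north += sy * half
    return east, north, 2 * half


# "SW 1/4, SE 1/4, NW 1/4" read as SW of the SE of the NW quarter.
print(aliquot_center(["SW", "SE", "NW"]))
```

Adding these offsets to the coordinates of the section center, after converting miles to degrees at the latitude in question, would give the coordinates of the parcel center; that conversion is left out of the sketch.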
(We have found the following sources helpful:
http://www.outfitters.com/genealogy/land/twprangemap.html
Muehrcke, P.C. and J.O., 1998. Map Use: Reading, Analysis, and
Interpretation, JP Publications, Madison, Wisconsin)
Also - We calculated the extent of a 1/4 of 1/4 of 1/4 (quarter of quarter
of quarter) section to be 0.088 miles. John W, is this okay?
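One way to arrive at the 0.088 mile figure is sketched below. It assumes that the extent is the distance from the center of a square parcel to its farthest corner and that a full section is an ideal square one mile on a side; both are assumptions for illustration, and the function name is hypothetical.

```python
import math


def aliquot_extent_miles(quarterings: int) -> float:
    """Center-to-corner distance of a section quartered the given number of times.

    quarterings = 0 is a full section, 1 is a 1/4 section,
    2 is a 1/4 of 1/4 section, 3 is a 1/4 of 1/4 of 1/4 section.
    """
    side = 1.0 / (2 ** quarterings)   # each quartering halves the side length
    return side * math.sqrt(2) / 2    # half the diagonal of the square


for n in range(4):
    print(n, round(aliquot_extent_miles(n), 3))
```

With three quarterings the side is 0.125 miles and half the diagonal rounds to 0.088 miles, matching the value in the posting.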
Thanks,
XXXXXXXXXXXXX
>>> Posting number 284, dated 24 Jul 2002 11:48:10
>>> Posting number 285, dated 30 Jul 2002 18:07:53
Date: Tue, 30 Jul 2002 18:07:53 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear all,
I have made a minor amendment to Step Seven of the Georeferencing Steps
document (http://dlp.cs.berkeley.edu/manis/GeorefSteps.html). Basically, I
realized it will be a little easier for me to "curate" the finished
georeferencing files if I know not only when they were done and by which
institution, but also if I know which geographic regions are contained
within them. There is no need to worry about any of the files that have
already been sent to me. I'll just suffer for not having thought of this
earlier - it's how I train myself.
The new content of Step Seven of the Georeferencing Steps document is
repeated here for convenience.
Step Seven - Export Finished Localities
When a downloaded set of localities is done being georeferenced, export the
complete set of data (all records with all fields and a column header row)
to a tab-delimited text file. Rename the file to reflect the institution,
the geographic scope of its content, and the date the file was finished.
For example, a file of Peruvian localities finished by the Field Museum on
Halloween would be FMNH-Peru-2001-10-31.txt. Make a backup of this file and
store it in a safe place until the data have been loaded into the MaNIS
Gazetteer.
John W
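For anyone exporting with a script instead of a spreadsheet, the naming convention and the tab-delimited format in Step Seven might be sketched as follows. The function names and the column names are hypothetical and are not the actual MaNIS field names; the demonstration writes to an in-memory buffer.

```python
import csv
import io
from datetime import date


def export_filename(institution: str, region: str, finished: date) -> str:
    """Build a name of the form Institution-Region-YYYY-MM-DD.txt."""
    return f"{institution}-{region}-{finished.isoformat()}.txt"


def write_export(rows, fieldnames, handle):
    """Write a header row and all records as tab-delimited text."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical columns and a single made-up record.
fields = ["LocalityID", "DecimalLatitude", "DecimalLongitude"]
records = [{"LocalityID": 1, "DecimalLatitude": -12.05, "DecimalLongitude": -77.04}]

name = export_filename("FMNH", "Peru", date(2001, 10, 31))
buffer = io.StringIO()
write_export(records, fields, buffer)
print(name)
print(buffer.getvalue())
```

To write a real file, the buffer would be replaced with something like `open(name, "w", newline="", encoding="utf-8")`, and a copy of the resulting file kept as the backup.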
>>> Posting number 286, dated 31 Jul 2002 13:25:46
>>> Posting number 287, dated 31 Jul 2002 12:37:37
>>> Posting number 288, dated 1 Aug 2002 13:49:40
Date: Thu, 1 Aug 2002 13:49:40 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Fwd: neat maps
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
Chris Conroy was kind enough to forward this map resource to me. Please
take note of it if you have any interest in georeferencing any of these areas.
John W
>From: Christopher Conroy <ondatra@socrates.Berkeley.EDU>
>Subject: neat maps
>
>Folks,
>
> Here is a link to neat maps of former soviet republics for your
> viewing pleasure.
>http://www.reisenett.no/map_collection/commonwealth.html
>
>Chris
>--
>>> Posting number 289, dated 6 Aug 2002 12:23:05
Date: Tue, 6 Aug 2002 12:23:05 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: TRS updates to the Georeferencing Guidelines
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
Due to popular demand I have added a few paragraphs to the Georeferencing
Guidelines about calculating the coordinates of Townships and subsections
thereof. I also added a row to the table on TRS extents to include 1/4 of
1/4 of 1/4 section as well as a column for the extent of Township divisions
when orthogonal offsets are used to calculate the coordinates. Finally, I
added two new links to URLs that explain Townships quite nicely. Enjoy.
John W.
>>> Posting number 290, dated 12 Aug 2002 11:30:02
>>> Posting number 291, dated 13 Aug 2002 20:12:32
>>> Posting number 292, dated 16 Aug 2002 10:24:58
Date: Fri, 16 Aug 2002 10:24:58 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Selecting a "Coordinate Source" for Namibia in Africa
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
LACM has started work on Namibia after downloading many shapefiles and
data tables from the "Atlas of Namibia" website located here:
http://www.dea.met.gov.na/data/Atlas/Atlas_web.htm#2Elevations,%20relief,%20profiles
I have cross-referenced (quality-checked the latitudes and longitudes)
the data with source data of Namibia from the GNS website of NIMA
(National Imagery and Mapping Agency) and the location data from the
Atlas seems to be very accurate for the needs of MaNIS. Therefore, I
created a customized topographic map with place names using ArcView for
georeferencing.
Finally, my question.
In the "georeferencing calculator" under the heading "Coordinate Source"
there are a number of choices, but none really describe my situation.
Would selecting "Gazetteer" be my best bet when using data from the
Atlas? When using GNS data from the NIMA website should I also select
"Gazetteer"? The link to the NIMA website for data downloads for many
countries is here:
http://164.214.2.59/gns/html/index.html
My thought is that since the source data for the Atlas and NIMA are from
a variety of sources, that the logical choice is "Gazetteer", but I
wanted to check with the board first. Sorry if this question seemed too
"simple". This is my first international site and wanted to be sure on
this. Thanks.
Sincerely,
XXXXXXXXXXXXXXXX
>>> Posting number 293, dated 16 Aug 2002 10:42:08
Date: Fri, 16 Aug 2002 10:42:08 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Selecting a "Coordinate Source" for Namibia in Africa
In-Reply-To: <000c01c24549$d8b624e0$180775ce@Subaru>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
These are definitely reasonable questions, the answers to which are "Yes,
use Gazetteer as the Coordinate Source in the Georeferencing Calculator for
compilations of numeric coordinates for named places - as opposed to maps
from which the coordinates are "measured" by the georeferencer."
At 10:24 AM 8/16/02 -0700, you wrote:
>LACM has started work on Namibia after downloading many shapefiles and
>data tables from the "Atlas of Namibia" website located here:
>
>http://www.dea.met.gov.na/data/Atlas/Atlas_web.htm#2Elevations,%20relief,%20profiles
>
>I have cross-referenced (quality-checked the latitudes and longitudes) the
>data with source data of Namibia from the GNS website of NIMA (National
>Imagery and Mapping Agency) and the location data from the Atlas seems to
>be very accurate for the needs of MaNIS. Therefore, I created a
>customized topographic map with place names using ArcView for georeferencing.
>
>Finally, my question.
>
>In the "georeferencing calculator" under the heading "Coordinate Source"
>there are a number of choices, but none really describe my
>situation. Would selecting "Gazetteer" be my best bet when using data
>from the Atlas? When using GNS data from the NIMA website should I also
>select "Gazetteer"? The link to the NIMA website for data downloads for
>many countries is here:
>
>http://164.214.2.59/gns/html/index.html
>
>My thought is that since the source data for the Atlas and NIMA are from a
>variety of sources, that the logical choice is "Gazetteer", but I wanted
>to check with the board first. Sorry if this question seemed too
>"simple". This is my first international site and wanted to be sure on
>this. Thanks.
>
>>> Posting number 294, dated 21 Aug 2002 10:45:00
Date: Wed, 21 Aug 2002 10:45:00 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Previously parsed records - what to do?
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
If one downloads records with the locations already parsed (by John
Wieczorek), am I allowed to copy the information for new records? Or am I
to do the new records from "scratch". Typically, my findings are pretty
similar to John W's parsed records except I have more decimal places.
If I am allowed to copy the information, do I note that I copied it from
John W's previously finished records? Thanks in advance.
Sincerely,
XXXXXXXXXX
>>> Posting number 295, dated 21 Aug 2002 14:38:40
Date: Wed, 21 Aug 2002 14:38:40 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Previously parsed records - what to do?
In-Reply-To: <001e01c2493a$795c5420$180775ce@Subaru>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear all,
This is a tough one, actually. Here are my two cents worth, though I'd
encourage discussion if there is disagreement or further insight.
One thing is for sure, *do not* modify the records that say "Parsed by John
Wieczorek from data provided by ..." in the LatLongRemarks field. This
means that the data are original, from the source institution. They may
even be "exact" coordinates directly from the collector, and therefore may
be more specific than the locality description. We don't really have a way
to know.
Given the above, it may be that the coordinates in those parsed records do
not accurately reflect the locality in the records of another institution.
Even if they do, they may not be as precise as they might be if you
georeference them using our MaNIS georeferencing guidelines.
One case I can think of where one might argue to use the existing parsed
coordinates is where you have multiple occurrences of the same locality
from the same institution and some of them do not already have
coordinates. It may be more likely in this case that the localities really
do reflect the same exact place. However, this may be more likely for some
institutions than for others. I don't have enough familiarity with the data
and collecting practices of every institution to say for sure.
So, at the risk of looking like I'm waffling, I'll make this recommendation:
Use existing parsed coordinates to help you find localities on the map, but
georeference them as you would any other locality. If you are
georeferencing localities from your own institution's records and have good
reason to believe that the parsed coordinates accurately reflect the
position of all similar localities *from your institution's records*, go
ahead and use them. However, in the latter case, do not copy the "Parsed by
John Wieczorek from data provided by ..." in the LatLongRemarks field to
any other records. Instead, say something like "Coordinates copied from a
similar locality."
John W
At 10:45 AM 8/21/02 -0700, you wrote:
>If one downloads records with the locations already parsed (by John
>Wieczorek), am I allowed to copy the information for new records? Or am I
>to do the new records from "scratch"? Typically, my findings are pretty
>similar to John W's parsed records except I have more decimal places.
>
>If I am allowed to copy the information, do I note that I copied it from
>John W's previously finished records? Thanks in advance.
>
>>> Posting number 296, dated 21 Aug 2002 17:00:26
Date: Wed, 21 Aug 2002 17:00:26 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Previously parsed records - what to do?
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
I thought the instructions from Barbara (Feb or March) were to leave the
previously georeferenced records as they are and do not verify lat/longs,
proof or add errors. So it doesn't seem possible to georeference them as
you would any other locality but leave them as they are. Did you mean
"but do not georeference them"?
----- Original Message -----
From: John Wieczorek
Sent: Wednesday, August 21, 2002 4:46 PM
To: MAMMAL-Z-NET@USOBI.ORG
Subject: Re: Previously parsed records - what to do?
Dear all,
...So, at the risk of looking like I'm waffling, I'll make this
recommendation:
Use existing parsed coordinates to help you find localities on the map, but
georeference them as you would any other locality. ...
>>> Posting number 297, dated 21 Aug 2002 17:24:19
Date: Wed, 21 Aug 2002 17:24:19 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Previously parsed records - what to do?
In-Reply-To: <OE17mQbaYgUCwMtkDFl00011a64@hotmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX, and all,
You caught an unfortunate glitch in clarity. Let me restate that whole blurb.
I'll make this recommendation:
Use existing parsed coordinates to help you find localities on the map, but
do not georeference them. Use the standard georeferencing guidelines for
all other localities, even if they are similar to localities with original
parsed coordinates. Exception: If you are georeferencing localities from
your own institution's records and have good reason to believe that the
parsed coordinates accurately reflect the position of all similar
localities *from your institution's records*, go ahead and use them.
However, in the latter case, do not copy the "Parsed by John Wieczorek from
data provided by ..." in the LatLongRemarks field to any other records.
Instead, say something like "Coordinates copied from a similar locality."
Thanks for keeping me honest.
John
At 05:00 PM 8/21/02 -0700, you wrote:
>I thought the instructions from Barbara (Feb or March) were to leave the
>previously georeferenced records as they are and do not verify lat/longs,
>proof or add errors. So it doesn't seem possible to georeference them as
>you would any other locality but leave them as they are. Did you mean
>"but do not georeference them"?
>
>
>----- Original Message -----
>From: John Wieczorek
>Sent: Wednesday, August 21, 2002 4:46 PM
>To: MAMMAL-Z-NET@USOBI.ORG
>Subject: Re: Previously parsed records - what to do?
>
>Dear all,
>
>...So, at the risk of looking like I'm waffling, I'll make this
>recommendation:
>Use existing parsed coordinates to help you find localities on the map, but
>georeference them as you would any other locality. ...
>>> Posting number 298, dated 23 Aug 2002 18:26:08
>>> Posting number 299, dated 26 Aug 2002 16:23:28
>>> Posting number 300, dated 2 Sep 2002 09:46:51
>>> Posting number 301, dated 4 Sep 2002 10:16:08
>>> Posting number 302, dated 4 Sep 2002 15:58:26
>>> Posting number 303, dated 5 Sep 2002 10:52:29
>>> Posting number 304, dated 5 Sep 2002 15:34:03
>>> Posting number 305, dated 5 Sep 2002 17:30:55
Date: Thu, 5 Sep 2002 17:30:55 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Initiating Discussion of Concept Info
In-Reply-To: <1244.140.107.26.190.1031248349.squirrel@www.oz.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
The proposed MaNIS structure
(http://dlp.cs.berkeley.edu/manis/darwin2jrwConceptInfo.htm) and DC2
look fine to me for an online museum catalog. If this is the goal
then I favor adoption of the DC2 structure and concepts and let
contributors use their discretion to populate the fields. The
online catalog seems to be a broadening of the original MaNIS goal of
a network of georeferenced records that could be used for plotting
specimen locations. But no problem.
I plan to include as much data as are available in our in-house
records when writing data out to the server. I had planned on
leaving out specimens that could not be georeferenced but can include
them.
Predefined result sets (as counts) specifically for georeferencing:
species with lat/long data (or w/wo lat/long data)
species with lat/long data by area of interest (county, state,
country, continent)
species with lat/long by error radii (categorical)
species with lat/long data and month and year of collection - for plotting
species - all data - for Excel or Access work
for general mammal work:
species by prep type
species by locality
I am assuming there will be a clickable "plot the localities" and
"plot the localities and errors" buttons on the web page.
A question on the bounding box field. (Also in DC1 and 2). Is this
really a field/concept? It would seem to be more along the lines of
a query that pops up and asks for opposing corners (lat/long) that
define the sides of the box.
Anticipated queries: I just got an NPS request for all vertebrates in
our collection from US National Parks. Plants too, but they aren't
online. My response was to wait a bit and this type of request can
be answered online if the requestor had lat/long boundary files or
target counties/areas. This type of a request would be best handled
via a batch request of a list of lat/long boundaries or target areas.
Searching with an irregular boundary, rather than simple box, also
would be a great utility.
>>> Posting number 306, dated 6 Sep 2002 08:48:12
>>> Posting number 307, dated 6 Sep 2002 09:24:18
>>> Posting number 308, dated 6 Sep 2002 15:12:05
Date: Fri, 6 Sep 2002 15:12:05 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: UAM missing Concepts
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Looks good. From UAM, we could address data to most of the proposed
concepts. (There might be a water shrew that swam into a minnow trap,
but we can not yet provide depth data for a mammal.) Nevertheless,
there are a couple fields implemented in our present web interface that
are missing.
1.) We have a field for "Geographic Feature" which contains geographic
descriptors below the state/province level such as parks, refuges, etc.
This field has been crucial in pleasing various agencies. They can
generate their own real-time report of our holdings from their
jurisdictions, map them, etc. (NPS can search on any particular unit,
but not on all NPS holdings.)
2.) Analogous to OtherCatalogNumber, we have Other_ID, but
supporting it, we have Other_ID_Type. An Other_ID_Type could be
"GenBank accession number," and the associated Other_ID would be
something like "Z123456." I suppose, that to avoid a one-to-many
relationship, these could be concatenated into another string. This
would complicate, or slow, some searches that are possible from our
present site, e.g. taxon="Sorex" + Other_ID_Type = "GenBank..." +
country="Russia" will show all the red-toothed shrews from Russia with
GenBank accession numbers. "Atomic" searches are also possible. (e.g.,
Other_ID_Type="GenBank..." + Other_ID = "Z123456")
(Try it at
http://arctos.museum.uaf.edu:8080/cgi-bin/uam_db/specimensearch.cgi)
This example has had consequences. In individual specimen records, we
display a GenBank number as a hyperlink to the sequence page on GenBank.
Now, NCBI has set up "LinkOuts" from "our" sequence pages to the
specimen records at UAM. NCBI is enthusiastic about repository
databases as much-needed supplemental documentation of GenBank and will
be making some announcements soon. We should consider if and how
OtherCatalogNumber might be used by GenBank's "Entrez" for LinkOuts to
MaNIS.
XXXXXX
>>> Posting number 309, dated 9 Sep 2002 09:46:40
>>> Posting number 310, dated 10 Sep 2002 08:16:56
>>> Posting number 311, dated 11 Sep 2002 09:13:26
>>> Posting number 312, dated 16 Sep 2002 09:01:13
>>> Posting number 313, dated 16 Sep 2002 10:42:31
>>> Posting number 314, dated 20 Sep 2002 11:02:33
>>> Posting number 315, dated 20 Sep 2002 11:13:40
Date: Fri, 20 Sep 2002 11:13:40 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Initiating Discussion of Concept Info
In-Reply-To: <p05100306b99d6117277d@[207.207.104.113]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
More comments, interspersed below.
John
At 05:30 PM 9/5/02 -0700, you wrote:
>The proposed MaNIS structure
>(http://dlp.cs.berkeley.edu/manis/darwin2jrwConceptInfo.htm) and DC2
>look fine to me for an online museum catalog. If this is the goal
>then I favor adoption of the DC2 structure and concepts and let
>contributors use their discretion to populate the fields. The
>online catalog seems to be a broadening of the original MaNIS goal of
>a network of georeferenced records that could be used for plotting
>specimen locations. But no problem.
Actually, most of the expanded concepts were in the proposal from the
beginning as we wanted to be able to serve the scientific community with
the kinds of information to which they are accustomed when they write to us
now, including things such as preparation types.
>I plan to include as much data as are available in our in-house
>records when writing data out to the server. I had planned on
>leaving out specimens that could not be georeferenced but can include
>them.
I would hope that others agree that the ungeoreferenced data can still be
useful in other contexts.
>Predefined result sets (as counts) specifically for georeferencing:
>
>species with lat/long data (or w/wo lat/long data)
>species with lat/long data by area of interest (county, state,
>country, continent)
>species with lat/long by error radii (categorical)
>species with lat/long data and month and year of collection - for plotting
>species - all data - for Excel or Access work
>
>for general mammal work:
>
>species by prep type
>species by locality
Queries for all of these questions will be possible. Barbara and I will try
to engineer predefined result set schemata and post them here for review. I
know already that there will be a "Full" result set, which contains
everything on the Concept Info page. Beyond that, we'll try to identify
special purpose result sets.
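As a rough illustration of what a predefined result set "as counts" might look like, the Python sketch below tallies records per species, split by whether coordinates are present. The record layout (dicts with `species`, `lat`, and `lon` keys) and the function name are assumptions made for this example; they are not taken from the MaNIS Concept Info page.

```python
from collections import Counter


def counts_by_species(records):
    """Count records per species, split by presence of coordinates.

    `records` is an iterable of dicts with hypothetical keys
    'species', 'lat', and 'lon' (lat/lon may be None or absent).
    Returns a Counter keyed by (species, has_coordinates).
    """
    counts = Counter()
    for rec in records:
        has_coords = rec.get("lat") is not None and rec.get("lon") is not None
        counts[(rec["species"], has_coords)] += 1
    return counts


# Invented sample data, for illustration only.
sample = [
    {"species": "Sorex cinereus", "lat": 64.8, "lon": -147.7},
    {"species": "Sorex cinereus", "lat": 61.2, "lon": -149.9},
    {"species": "Sorex cinereus", "lat": None, "lon": None},
    {"species": "Microtus oeconomus"},
]
summary = counts_by_species(sample)
```

The same grouping idea would extend to the other requested breakdowns (by area of interest, by error-radius category, by prep type) by changing the key that is counted.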
>I am assuming there will be a clickable "plot the localities" and
>"plot the localities and errors" buttons on the web page.
Not immediately, but yes, that is my hope as well.
>A question on the bounding box field. (Also in DC1 and 2). Is this
>really a field/concept? It would seem to be more along the lines of
>a query that pops up and asks for opposing corners (lat/long) that
>define the sides of the box.
BoundingBox, like JulianDay, is a concept calculated from other concepts
that map to fields in the database. It is not a separate field expected to
be found in your database.
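To make "calculated from other concepts" concrete, here is a minimal Python sketch of two derived values. It assumes that JulianDay means the ordinal day of the year computed from year, month, and day fields, and that a bounding-box search is a test of stored decimal latitude and longitude against two opposing corners. Both function names and both interpretations are assumptions for illustration, not the MaNIS implementation.

```python
from datetime import date


def julian_day(year, month, day):
    """Ordinal day of the year (1 January is day 1), derived from date fields."""
    return date(year, month, day).timetuple().tm_yday


def in_bounding_box(lat, lon, south, west, north, east):
    """True if a point lies inside the box given by two opposing corners.

    Assumes decimal degrees and a box that does not cross the
    180th meridian (that case would need extra handling).
    """
    return south <= lat <= north and west <= lon <= east
```

Neither value needs its own column: each can be computed at query time from fields a collection database already holds.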
>Anticipated queries: I just got an NPS request for all vertebrates in
>our collection from US National Parks. Plants too, but they aren't
>online. My response was to wait a bit and this type of request can
>be answered online if the requestor had lat/long boundary files or
>target counties/areas. This type of a request would be best handled
>via a batch request of a list of lat/long boundaries or target areas.
>Searching with an irregular boundary, rather than simple box, also
>would be a great utility.
I would hope that an application for this kind of spatial query can be
built on top of MaNIS. I envision drawing the area of interest on a map as
one of the criteria in the query itself. Work along these lines is being
investigated here at the MVZ.
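One common way to search with an irregular boundary is a point-in-polygon test. The sketch below uses the standard ray-casting method in Python; it treats latitude and longitude as planar coordinates, which is an assumption that is only reasonable for small areas away from the poles and the 180th meridian. The function name and the (lat, lon) tuple layout are hypothetical, and this is not a description of the work under way at the MVZ.

```python
def point_in_polygon(lat, lon, polygon):
    """Ray-casting test: is (lat, lon) inside the polygon?

    `polygon` is a list of (lat, lon) vertices in order; the last
    vertex is implicitly joined back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which this edge crosses the point's latitude.
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside


# Invented triangular "area of interest", for illustration only.
area = [(0.0, 0.0), (0.0, 10.0), (10.0, 0.0)]
```

A batch request of the kind described above could then be answered by filtering georeferenced records with this test, one boundary at a time.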
>>> Posting number 316, dated 20 Sep 2002 11:27:55
Date: Fri, 20 Sep 2002 11:27:55 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: UAM missing Concepts
In-Reply-To: <3D793645.2050001@uaf.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
More comments, interspersed below.
At 03:12 PM 9/6/02 -0800, you wrote:
>Looks good. From UAM, we could address data to most of the proposed
>concepts. (There might be a water shrew that swam into a minnow trap,
>but we can not yet provide depth data for a mammal.) Nevertheless,
>there are a couple fields implemented in our present web interface that
>are missing.
>
>1.) We have a field for "Geographic Feature" which contains geographic
>descriptors below the state/province level such as parks, refuges, etc.
>This field has been crucial in pleasing various agencies. They can
>generate their own real-time report of our holdings from their
>jurisdictions, map them, etc. (NPS can search on any particular unit,
>but not on all NPS holdings.)
Features are interesting and useful, as Gary also pointed out in a previous
message. They can also be problematic the way they have been implemented at
MVZ and UAM. In short, it isn't possible to have a specimen located in more
than one feature. This means you couldn't have a specimen that was both in
Yosemite National Park and in the Sierra Nevada Range. Recognizing this
problem and seeing spatial query capabilities on the horizon, we at MVZ
have backed off of the wholesale use of the Feature field. I suggest that
its eventual fate is in question, so we probably shouldn't try to include it.
>2.) Analogous to OtherCatalogNumber, we have Other_ID, but
>supporting it, we have Other_ID_Type. An Other_ID_Type could be
>"GenBank accession number," and the associated Other_ID would be
>something like "Z123456." I suppose, that to avoid a one-to-many
>relationship, these could be concatenated into another string. This
>would complicate, or slow, some searches that are possible from our
>present site, e.g. taxon="Sorex" + Other_ID_Type = "GenBank..." +
>country="Russia" will show all the red-toothed shrews from Russia with
>GenBank accession numbers. "Atomic" searches are also possible. (e.g.,
>Other_ID_Type="GenBank..." + Other_ID = "Z123456")
>
>(Try it at
>http://arctos.museum.uaf.edu:8080/cgi-bin/uam_db/specimensearch.cgi)
>
>This example has had consequences. In individual specimen records, we
>display a GenBank number as a hyperlink to the sequence page on GenBank.
>Now, NCBI has set up "LinkOuts" from "our" sequence pages to the
>specimen records at UAM. NCBI is enthusiastic about repository
>databases as much-needed supplemental documentation of GenBank and will
>be making some announcements soon. We should consider if and how
>OtherCatalogNumber might be used by GenBank's "Entrez" for LinkOuts to
>MaNIS.
MaNIS can support one to many relationships if we want it to. If you look
at the original DC2 specification you'll see that I didn't include some
fields that would require one-to-many relationships to be really useful.
The most important one is RelatedCatalogedItem. Presumably if a specimen
can be related to another one, it can be related to many other ones, even
across collections. If there is interest in resurrecting this concept for
MaNIS I am willing to do so.
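For readers unfamiliar with the term, the one-to-many arrangement under discussion can be sketched with Python's built-in `sqlite3` module. The table and column names below are hypothetical (loosely modeled on the Other_ID and Other_ID_Type fields described in the quoted message), and all rows except the "Z123456" example are invented placeholders. This is not the UAM or MaNIS schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE specimen (
    specimen_id INTEGER PRIMARY KEY,
    genus       TEXT,
    country     TEXT
);
-- One specimen may have many rows here: that is the one-to-many link.
CREATE TABLE other_id (
    specimen_id   INTEGER REFERENCES specimen(specimen_id),
    other_id_type TEXT,
    other_id      TEXT
);
""")
conn.executemany(
    "INSERT INTO specimen VALUES (?, ?, ?)",
    [(1, "Sorex", "Russia"), (2, "Sorex", "USA"), (3, "Microtus", "Russia")],
)
conn.executemany(
    "INSERT INTO other_id VALUES (?, ?, ?)",
    [
        (1, "GenBank accession number", "Z123456"),
        (1, "collector number", "ABC 42"),
        (3, "GenBank accession number", "Z999999"),
    ],
)


def specimens_with(conn, genus, country, id_type):
    """Specimen ids matching a genus, a country, and a type of other id."""
    rows = conn.execute(
        """
        SELECT DISTINCT s.specimen_id
        FROM specimen AS s
        JOIN other_id AS o ON o.specimen_id = s.specimen_id
        WHERE s.genus = ? AND s.country = ? AND o.other_id_type = ?
        ORDER BY s.specimen_id
        """,
        (genus, country, id_type),
    )
    return [row[0] for row in rows]
```

Keeping the id type and id value in separate columns is what allows both the combined search and the "atomic" search by type and value, without parsing a concatenated string.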
>>> Posting number 317, dated 20 Sep 2002 15:28:31
>>> Posting number 318, dated 20 Sep 2002 17:42:08
Date: Fri, 20 Sep 2002 17:42:08 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: UAM missing Concepts
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Some remarks on two segments below:
>
> Features are interesting and useful, as XXXX also pointed out in a
> previous
> message. They can also be problematic the way they have been
> implemented at
> MVZ and UAM. In short, it isn't possible to have a specimen located in
> more
> than one feature. This means you couldn't have a specimen that was both in
> Yosemite National Park and in the Sierra Nevada Range. Recognizing this
> problem and seeing spatial query capabilities on the horizon, we at MVZ
> have backed off of the wholesale use of the Feature field. I suggest that
> its eventual fate is in question, so we probably shouldn't try to include it.
I do not disagree with your reasoning, but between now and "the
horizon," we might find there was substantial use for this. We
definitely have some fuzzy stuff in there but we could restrict the use
in MaNIS to units that collections have in common and units for which
there are unequivocal boundaries, e.g., national and state parks,
wildlife refuges, etc.
>
> MaNIS can support one to many relationships if we want it to. If you look
> at the original DC2 specification you'll see that I didn't include some
> fields that would require one-to-many relationships to be really useful.
> The most important one is RelatedCatalogedItem. Presumably if a specimen
> can be related to another one, it can be related to many other ones, even
> across collections. If there is interest in resurrecting this concept for
> MaNIS I am willing to do so.
UAM's vote is clear. There will be some specimens split among the
present MaNIS collections. I know we have vouchers for some tissues at
Texas Tech, etc. But in the long run, specimen databases should be able
to link to result databases.
XXXXXX
>>> Posting number 319, dated 21 Sep 2002 12:18:02
Date: Sat, 21 Sep 2002 12:18:02 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Datum
In-Reply-To: <3D8BF1A9.C7E952C9@oz.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXXXX and all,
Good find. You definitely should use a datum when you can find it. If you
use "datum not recorded" you'll add a systematic error of 1 km for
localities outside of the area shown on the map in the "Uncertainty due to
an unknown datum" section of the Georeferencing Guidelines Document on the
MaNIS website. As Barbara suspected, I did go ahead and add "Japanese
Geodetic Datum 2000" to the list of datums from which to choose in the
GeorefCalculator. Thanks for bringing this new Datum to our attention.
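The rule in this message can be sketched in Python as follows. The only number taken from the message is the 1 km figure for an unrecorded datum outside the mapped area; the value that applies inside the mapped area has to be read from the Guidelines map and is deliberately left as a caller-supplied parameter. The function name, the parameter names, and the "not recorded" string comparison are assumptions for illustration, not the GeorefCalculator's actual logic.

```python
UNKNOWN_DATUM_DEFAULT_M = 1000.0  # the 1 km figure quoted in the message


def datum_uncertainty_m(datum, map_value_m=None):
    """Uncertainty contribution (in meters) from not knowing the datum.

    datum        -- datum name, or "not recorded" if it is unknown
    map_value_m  -- hypothetical: the value read from the Guidelines map
                    when the locality falls inside the mapped area
    """
    if datum != "not recorded":
        return 0.0  # datum known: no unknown-datum contribution
    if map_value_m is not None:
        return float(map_value_m)  # inside the mapped area: use the map
    return UNKNOWN_DATUM_DEFAULT_M  # outside the mapped area: 1 km
```

For example, `datum_uncertainty_m("Japanese Geodetic Datum 2000")` contributes nothing, which is why recording the datum whenever it can be found is worthwhile.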
John
>XXXXXXXXXXXX wrote:
>
> > Barbara,
> >
> > Could you give me some advice as to how to deal with Datum on the
> > Georeferencing Calculator?
> >
> > In the pop-up list for 'Datum' in the Calculator, a variety of coordinate
> > systems adopted worldwide or locally are listed including Tokyo Datum.
> >
> > However, the datum of the maps I am using for geocoding is not listed in it
> > and differs from Tokyo Datum, a traditional standard that had long been
> > adopted in Japan until a few years ago.
> > If I could use the newest version of Japanese maps based on the Japanese
> > Geodetic Datum 2000 (JGD2000) , should I choose 'not recorded' from a
> > pop-up list of Datum?
> >
> > For your convenience in understanding the difference between these two
> > geodetic data, the following site may help.
> >
> > http://ivs.crl.go.jp/mirror/publications/gm2002/imakiire/
> >
> > Thank you very much for your assistance.
> >
> > - XXXXXX
> >
>>> Posting number 320, dated 23 Sep 2002 11:31:28
Date: Mon, 23 Sep 2002 11:31:28 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: updates to existing files
In-Reply-To: <3D8BCE70.8010102@uaf.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
This is primarily a question for John, but of relevance to all.
It is my understanding that we are not to make any changes to
locality data in our existing databases, because it would mess things up
for John (I don't know the details, but don't need to).
However, it is not clear whether it would cause problems if we
update IDs. We have a continual stream of re-identifications that come in
for many taxa, from our own on-going research and from people who borrow
our material for study. It would be quite inconvenient to wait for two
years to update these changes in identification.
What about other fields, such as date, collector, etc., if we
happen to notice misspellings, errors, etc.?
John, your thoughts? Anyone else concerned about this?
XXXXX
>>> Posting number 321, dated 23 Sep 2002 12:48:01
Date: Mon, 23 Sep 2002 12:48:01 -0400
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: updates to existing files
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
My understanding of the georeferencing data is that any non-geographic
data within the "home db" can be edited without impact to the MaNIS
mission. I will leave it to John to clearly explain the locality part.
On a similar line though, it occurred to me the other day, as I was
deleting 102 duplicate records from our db, that record deletion may well
be a real problem for John. As I understand it, new records to our db
would not pose a similar problem. Correct, John?
XXXXX
>>> Posting number 322, dated 23 Sep 2002 13:00:11
>>> Posting number 323, dated 23 Sep 2002 22:02:29
Date: Mon, 23 Sep 2002 22:02:29 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: Barbara Stein <bstein@OZ.NET>
Subject: Re: updates to existing files
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
> My understanding of the georeferencing data is that any non-geographic data within the "home db" can
> be edited without impact to the MaNIS mission. I will leave it to John to clearly explain the locality
> part.
I agree with XXXXX that you are free to update any non-geography fields in your institutional dbs.
When John extracted the distinct set of localities from each of your dbs, he assigned a unique
identifier to each locality that will allow him to reassociate that locality with all of the specimen
records with which it was previously associated. Being reductionist, I will simply say that if you
alter the geography in your dbs, you may seriously compromise that reassociation process. That would
include deleting records. He's a clever guy, but the fewer problems, the better.
You are all familiar with the unique geog "ID" that John added to each of the files you download for
georeferencing. Even if you do not disturb the geog. data in your dbs, altering this identifier will
also seriously compromise his ability to reassociate the georeferenced localities with your specimen
records.
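The extraction step described here (one identifier per distinct locality, remembered for every specimen that shares it) can be sketched in Python as below. The record layout, the function name, and the use of sequential integers as identifiers are all assumptions for illustration; the actual identifiers in the downloaded files may be built differently.

```python
def extract_localities(specimens):
    """Build a table of distinct localities and link specimens to it.

    `specimens` is a list of dicts with hypothetical keys
    'catalog_number' and 'locality'.
    Returns (gazetteer, links):
      gazetteer -- locality id -> locality text (one row per distinct text)
      links     -- catalog number -> locality id
    """
    ids_by_text = {}
    gazetteer = {}
    links = {}
    for rec in specimens:
        text = rec["locality"]
        if text not in ids_by_text:
            new_id = len(ids_by_text) + 1  # sequential id, an assumption
            ids_by_text[text] = new_id
            gazetteer[new_id] = text
        links[rec["catalog_number"]] = ids_by_text[text]
    return gazetteer, links


# Invented sample data, for illustration only.
sample_specimens = [
    {"catalog_number": "A1", "locality": "Chicago"},
    {"catalog_number": "A2", "locality": "Berkeley"},
    {"catalog_number": "A3", "locality": "Chicago"},
]
gazetteer, links = extract_localities(sample_specimens)
```

Because each georeference is stored against the locality id, editing that id in a downloaded file breaks the link back to every specimen that shares the locality.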
That having been said, I would ask you to refrain from using the abbreviation "ID" when you are
referring to taxonomic identifications. "IDs" have a specific meaning in a db context. They are
identifiers, not identifications, and reserving the use of "ID" for identifiers is just one easy, but
important, way to lessen confusion in future postings.
So do keep updating your taxonomy, dates of collection, prep types, coll. names and nos. etc. We
absolutely want the best data we can get on the network. And keep posting questions. There is a lot
going on and it is good to reiterate these things periodically.
Best,
Barbara
> On a similar line though, it occurred to me the other day, as I was deleting 102 duplicate records
> from our db that record deletion may well be a real problem for John. As I understand it new records to
> our db would not pose a similar problem. Correct John?
>
> XXXXX
>
>>> Posting number 324, dated 25 Sep 2002 09:37:52
>>> Posting number 325, dated 30 Sep 2002 10:30:05
>>> Posting number 326, dated 1 Oct 2002 20:42:54
>>> Posting number 327, dated 2 Oct 2002 16:30:47
>>> Posting number 328, dated 2 Oct 2002 17:54:45
>>> Posting number 329, dated 2 Oct 2002 22:52:02
>>> Posting number 330, dated 3 Oct 2002 10:30:11
Date: Thu, 3 Oct 2002 10:30:11 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: updates to existing files
In-Reply-To: <3D8FF1E5.BED78CD7@oz.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
Sorry to have been preoccupied for so long. I'm trying to catch up again,
so here goes.
Additions of new records to your databases have no bearing on the
georeferencing process. Changing data does have the potential to impact the
re-association of georeferences to specimens, but only in a limited set of
circumstances, as follows:
1) changing catalog numbers
2) removing specimen records
3) changing locality descriptions
The first case isn't likely to happen very often, but if it does, it's
still not the end of the world - we can still match the localities for
those records to ones in the gazetteer.
The second case isn't much of an issue: if the specimen record isn't there
anymore, it's not going to be of much use to georeference it anyway.
The third case is the most difficult one. If a locality description has
changed in your database, it will show up as being different when it comes
time to re-associate georeferences with specimens, because part of the
process will be to compare the original to the then-current locality. At
that point it will be necessary to look at the gazetteer version next to
your new version to see if the new locality is actually a different place.
If it is a different place, throw away the georeference for it. If it isn't
substantively different, use the georeference. Not such a big deal unless
you've systematically changed locality descriptions.
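The three cases above can be sketched as a small Python routine. It assumes that the original extraction is available as a mapping from catalog number to locality id plus a gazetteer of locality id to original text, and that a georeference is stored per locality id. All names, the data layout, and the sample coordinates are hypothetical; the real re-association process may differ.

```python
def reassociate(current_specimens, links, gazetteer, georefs):
    """Re-attach georeferences to the then-current specimen records.

    current_specimens -- list of dicts with 'catalog_number' and 'locality'
    links             -- original catalog number -> locality id
    gazetteer         -- locality id -> original locality text
    georefs           -- locality id -> georeference (any value)
    Returns (matched, needs_review).
    """
    ids_by_text = {text: loc_id for loc_id, text in gazetteer.items()}
    matched = {}
    needs_review = []
    for rec in current_specimens:
        cat_no, text = rec["catalog_number"], rec["locality"]
        loc_id = links.get(cat_no)
        if loc_id is not None and gazetteer[loc_id] == text:
            # Unchanged record: use its georeference directly.
            matched[cat_no] = georefs.get(loc_id)
        elif text in ids_by_text:
            # Case 1 (or a new record): the catalog number is unknown,
            # but the locality text still matches a gazetteer entry.
            matched[cat_no] = georefs.get(ids_by_text[text])
        elif loc_id is not None:
            # Case 3: the description changed; a person must compare
            # the original and current text side by side.
            needs_review.append((cat_no, gazetteer[loc_id], text))
    # Case 2 needs no handling: removed records never appear here.
    return matched, needs_review


# Invented sample data, for illustration only.
old_gazetteer = {1: "Chicago", 2: "Berkeley"}
old_links = {"A1": 1, "A2": 2, "A3": 1}
old_georefs = {1: (41.88, -87.63), 2: (37.87, -122.27)}
current = [
    {"catalog_number": "A1", "locality": "Chicago"},
    {"catalog_number": "B9", "locality": "Berkeley"},
    {"catalog_number": "A3", "locality": "Chicago, Field Museum"},
]
matched, needs_review = reassociate(current, old_links, old_gazetteer, old_georefs)
```

Only the changed description ends up in the review list, which is why systematic edits to locality descriptions are the one kind of change that creates real work.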
So, make all of the changes you like, but try to avoid changes to locality
descriptions. Misspellings and inconsistencies will be in the reports you
get back for your localities when georeferencing is done. And once the
georeferencing is done, it won't matter nearly as much how clear or
consistently formatted your localities are - someone will have already been
through the pain to figure out where they really are. In addition, there
will be a whole set of localities that were not possible to georeference
for one reason or another, so those who are truly bored will have a nice
set of problem localities to spend their evenings and weekends in solving.
John
At 10:02 PM 9/23/02 -0700, you wrote:
> > My understanding of the georeferencing data is that any non-geographic
> data within the "home db" can be edited without impact to the MaNIS
> mission. I will leave it to John to clearly explain the locality part.
>
>I agree with XXXXX that you are free to update any non-geography fields in
>your institutional dbs.
>
>When John extracted the distinct set of localities from each of your dbs,
>he assigned a unique identifier to each locality that will allow him to
>reassociate that locality with all of the specimen records with which it
>was previously associated. Being reductionist, I will simply say that if
>you alter the geography in your dbs, you may seriously compromise that
>reassociation process. That would include deleting records. He's a clever
>guy, but the fewer problems, the better.
>
>You are all familiar with the unique geog "ID" that John added to each of
>the files you download for georeferencing. Even if you do not disturb the
>geog. data in your dbs, altering this identifier will also seriously
>compromise his ability to reassociate the georeferenced localities with
>your specimen records.
>
>That having been said, I would ask you to refrain from using the
>abbreviation "ID" when you are referring to taxonomic
>identifications. "IDs" have a specific meaning in a db context. They are
>identifiers, not identifications, and reserving the use of "ID" for
>identifiers is just one easy, but important way, to lessen confusion in
>future postings.
>
>So do keep updating your taxonomy, dates of collection, prep types, coll.
>names and nos. etc. We absolutely want the best data we can get on the
>network. And keep posting questions. There is a lot going on and it is
>good to reiterate these things periodically.
>
>Best,
>Barbara
>
>>> Posting number 331, dated 4 Oct 2002 09:05:25
Date: Fri, 4 Oct 2002 09:05:25 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: updates to existing files
In-Reply-To: <5.0.0.25.2.20021003095534.02637790@socrates.berkeley.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Hi John,
Thanks for the message. Here at the Field, when we say "locality
descriptions" we are often referring to the specific locality AND the
elevation AND the lat and long. As you know, we have different fields for
each. When you say we should not change locality descriptions, are you
referring to all three fields or just the specific locality? I suspect
that we are free to change incorrect lat and longs (since that is what the
project is about), but I want to be sure before I change any incorrect
elevations.
So for example, if I come across
"Chicago, 50 m" in the specific locality field I should leave it alone
but if I come across "Chicago" in the specific locality field and "50 m" in
the elevation field can I change the 50?
We are also wondering whether we might set up a copy of our specific
locality on our in-house database that we could correct as we come across
problems. The idea would be to copy this over to the "proper" catalogue
after you are finished with the reassociation of georeferences to our
specimens. In this way we could make corrections as we come across
problems rather than putting them in my "TO DO" pile that currently extends
through the roof of the Field Museum and has a red blinking light on top of
it to warn planes. Any thoughts on this?
Cheers,
XXXX
>>> Posting number 332, dated 4 Oct 2002 10:43:04
Date: Fri, 4 Oct 2002 10:43:04 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: updates to existing files
In-Reply-To: <5.1.0.14.1.20021004082747.02103ec0@mail.fmnh.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXX, and all,
I agree with your concept of the locality - it includes any piece of
information, no matter which field it "lives in," that helps to describe
the spatial bounds of a collecting event. So, State/Province, County,
Specific Locality, Township/Range, Elevation, Depth, UTM coordinates,
Lat/Long, are all parts of the locality description.
I can appreciate your imagery of the changes backlog, and the problems I've
caused. I don't really want you to have to change your daily business while
waiting for the georeferencing to get done, but I don't want you to go out
of your way to clean up localities now because it will end up creating more
work for you later.
Here's a way to minimize the work later while allowing you to decrease the
aviation hazard now. Make a column in your database that can flag a record
as having been changed with respect to locality. For example, you could
create a column called "GeoreferenceAgain." Whenever you edit a locality in
such a way that it would be georeferenced differently, put a "yes" in this
field. Whenever you edit a locality, but it wouldn't change how you
georeference it, put a "no" in the GeoreferenceAgain field. The point is to
always put something in there if you edit the locality; that way you (and
I) know how to deal with it when re-associating coordinates with specimens.
Alternatively, you could do this with remarks in a Comments field if you
are very careful to be consistent about how you record the information so
that it can be parsed by a computer later. For example, if you edit a
locality in such a way that it would be georeferenced differently, append
"; Georeference again" to the end of your Comments field, and if you edit
the locality, but the change wouldn't affect the georeference, append "; do
not Georeference again" to the Comments field.
If you accept one of my solutions, you could go ahead and change "Chicago,
50m" to have "Chicago" in the SpecificLocality field and "50m" in the
elevation field. Then put "no" in the GeoreferenceAgain field or "; do not
Georeference again" at the end of a Comments field.
Suppose you had "Colorado River, 100ft" in your SpecificLocality field and
you came across some information that placed the collecting event at 100m
instead. This change would clearly affect how you georeference the
locality. Thus, regardless of whether you just change the SpecificLocality
field to be "Colorado River, 100m" or if you set the SpecificLocality to
"Colorado River" and set the Elevation to "100m", you should put "yes" in
the GeoreferenceAgain field, or put "; Georeference again" at the end of
the Comments field.
If you can commit to a solution such as the ones I've described above,
there will be no extra work later to figure out if a locality that you've
edited in your database has changed enough to warrant a new georeference.
Hope that helps,
John
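As a rough illustration of the Comments-field convention described above, the sketch below reads the trailing marker back out of a comment string. The two marker strings come from the message; the function name `georeference_again`, the case-insensitive matching, and the `None` result for unmarked records are assumptions made for this example.

```python
def georeference_again(comments):
    """Interpret the trailing marker in a Comments field.

    Returns True for "; Georeference again", False for
    "; do not Georeference again", and None when neither marker is present
    (the locality was not edited, or the edit was not recorded).
    """
    text = comments.strip().lower()
    # The leading "; " keeps the two markers distinct, so the order of these
    # checks does not matter; the negative marker is simply tested first.
    if text.endswith("; do not georeference again"):
        return False
    if text.endswith("; georeference again"):
        return True
    return None


needs_redo = georeference_again("Colorado River, 100m; Georeference again")
no_redo = georeference_again("Chicago; do not Georeference again")
unmarked = georeference_again("skull only")
```

A dedicated GeoreferenceAgain column avoids the parsing step entirely, which is why the message presents it first.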
>>> Posting number 333, dated 10 Oct 2002 16:55:34
>>> Posting number 334, dated 10 Oct 2002 18:10:52
>>> Posting number 335, dated 11 Oct 2002 17:19:29
>>> Posting number 336, dated 11 Oct 2002 17:48:29
>>> Posting number 337, dated 14 Oct 2002 11:09:14
>>> Posting number 338, dated 14 Oct 2002 09:21:04
>>> Posting number 339, dated 14 Oct 2002 10:34:42
>>> Posting number 340, dated 15 Oct 2002 09:01:44
>>> Posting number 341, dated 16 Oct 2002 11:25:41
>>> Posting number 342, dated 16 Oct 2002 11:38:42
>>> Posting number 343, dated 17 Oct 2002 10:31:07
>>> Posting number 344, dated 21 Oct 2002 16:30:30
>>> Posting number 345, dated 23 Oct 2002 15:25:41
>>> Posting number 346, dated 26 Oct 2002 12:50:33
>>> Posting number 347, dated 26 Oct 2002 19:56:01
>>> Posting number 348, dated 27 Oct 2002 13:21:08
>>> Posting number 349, dated 27 Oct 2002 12:26:49
>>> Posting number 350, dated 28 Oct 2002 09:31:32
>>> Posting number 351, dated 28 Oct 2002 13:02:50
>>> Posting number 352, dated 28 Oct 2002 14:22:20
>>> Posting number 353, dated 29 Oct 2002 13:39:47
>>> Posting number 354, dated 30 Oct 2002 14:53:58
>>> Posting number 355, dated 31 Oct 2002 18:22:33
>>> Posting number 356, dated 1 Nov 2002 15:19:31
>>> Posting number 357, dated 2 Nov 2002 02:32:36
>>> Posting number 358, dated 2 Nov 2002 03:40:24
Date: Sat, 2 Nov 2002 03:40:24 -0600
Reply-To:
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Apology
MIME-version: 1.0
Content-type: text/plain; format=flowed; charset=us-ascii
Content-transfer-encoding: 7BIT
Dear All,
I am so sorry; the last message I sent should have been addressed
personally, not to the mailing list.
Instead, let me talk about this example that I just encountered while I was
working on georeferencing. Someone might give us a good suggestion. The
records from 'Japan' include one from the Taiwan area. As some of you know,
Taiwan was also called Formosa for a certain period of time,
including when it was politically under the control of the Japanese government.
Presumably the specimen was collected and recorded during that time. When
we publish the georeference data of this specimen on the MaNIS platform,
are there any standards or agreements we should be ethically aware of,
especially when we deal with such potentially sensitive issues in terms of
geographic regions which may have experienced a change in their higher
geographic (country) names for political or historical reasons?
>>> Posting number 359, dated 4 Nov 2002 13:37:40
>>> Posting number 360, dated 5 Nov 2002 20:36:41
Date: Tue, 5 Nov 2002 20:36:41 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: Barbara Stein <bstein@OZ.NET>
Subject: Re: Apology
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
> The records from 'Japan' includes one from Taiwan area. As some of you know,
> Taiwan used to be also called Formosa for a certain period of time
> including when it was politically under control of the Japanese government.
> Presumably the specimen was collected and recorded during that time. When
> we publish the georeference data of this specimen on the MANIS platform,
> are there, if any, standards or agreements we should be ethically aware of,
> especially when we deal with such potentially sensitive issues in terms of
> the geographic regions which may have experienced change in their higher
> geographic (country) names for any political or historical reasons?
XXXXXX,
I agree completely that sensitivity to geographic place name changes is one of
the most important and difficult issues data managers of natural history
information confront. Because it is essential that our databases are both
useful and historically accurate, the best suggestion I have is to create two
fields in your database: one for verbatim locality, i.e., the locality exactly
as it was written on the specimen tag or in the field notes at the time of
collection, and one for specific locality, i.e., the currently recognized locality.
This approach respects history while providing users with appropriate
information to assess species distributions, and both localities can be
provided to users in response to queries. It is usually extremely awkward to
generate complete result sets without querying on cleaned-up specific
localities. At the same time, it is often only possible to make sense of the
resulting data by having access to the verbatim locality.
Best,
Barbara
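The two-field approach suggested above might look like the following in practice. This is a sketch only: the field names `verbatim_locality` and `specific_locality`, the catalog numbers, and the sample values are invented for illustration. Queries run against the cleaned-up field, while the verbatim text travels with every result.

```python
# Invented sample records; the first keeps the historical name as written.
records = [
    {"catalog_no": "X-1", "verbatim_locality": "Formosa", "specific_locality": "Taiwan"},
    {"catalog_no": "X-2", "verbatim_locality": "Taiwan", "specific_locality": "Taiwan"},
    {"catalog_no": "X-3", "verbatim_locality": "Honshu", "specific_locality": "Honshu"},
]


def find_by_specific_locality(records, name):
    """Query on the cleaned-up field; return both localities for each hit."""
    return [
        (r["catalog_no"], r["specific_locality"], r["verbatim_locality"])
        for r in records
        if r["specific_locality"] == name
    ]


hits = find_by_specific_locality(records, "Taiwan")
```

Querying on the verbatim field alone would have missed one of the two records.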
>>> Posting number 361, dated 12 Nov 2002 12:09:12
>>> Posting number 362, dated 16 Nov 2002 16:09:32
>>> Posting number 363, dated 19 Nov 2002 12:16:55
>>> Posting number 364, dated 19 Nov 2002 14:34:37
>>> Posting number 365, dated 20 Nov 2002 16:00:18
>>> Posting number 366, dated 25 Nov 2002 09:30:52
>>> Posting number 367, dated 11 Dec 2002 11:11:11
>>> Posting number 368, dated 11 Dec 2002 09:41:46
>>> Posting number 369, dated 8 Jan 2003 15:59:43
>>> Posting number 370, dated 9 Jan 2003 04:24:02
Date: Thu, 9 Jan 2003 04:24:02 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Idaho time trials
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Dear All:
Below are my Idaho time trial summaries for informational purposes (for
herp and bird efforts). Unlike Oregon georeferencing, I kept track of
hours spent on various aspects. Of the total records georeferenced:

Method used                          Total   % of Total
georef'd prior to download 7/27/02     121      5.7%
GNIS-calculated lat/long              1568     74.2%
Topo USA 4.0                           299     14.1%
TRS (TRS2LL batch)                      31      1.5%
Insufficient data                       95      4.5%
Grand Total                           2114    100.0%

Comments:
GNIS records were done using a lookup of the name and lat/long, then
calculating lat/long from the placename using offsets and a macro similar
to the web-based calculator. Error calculation is done simultaneously.
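The poster's macro is not included in the archive. As a stand-in, here is a minimal sketch of how a lat/long can be calculated from a named place plus an offset, using a flat-earth approximation that is reasonable for offsets of a few miles. The function name, the mean Earth radius constant, and the sample coordinates are assumptions for illustration; this is not the MaNIS calculator's own algorithm.

```python
import math

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles (spherical approximation)


def offset_point(lat, lon, distance_mi, bearing_deg):
    """Return (lat, lon) reached by moving distance_mi from the start point.

    bearing_deg is measured clockwise from north (0 = N, 90 = E).
    Uses a local flat-earth approximation; not suitable near the poles
    or for very long offsets.
    """
    b = math.radians(bearing_deg)
    north_mi = distance_mi * math.cos(b)
    east_mi = distance_mi * math.sin(b)
    dlat = math.degrees(north_mi / EARTH_RADIUS_MI)
    dlon = math.degrees(east_mi / (EARTH_RADIUS_MI * math.cos(math.radians(lat))))
    return lat + dlat, lon + dlon


# Invented example: "5 mi N" and "5 mi E" of a placename at 43.6, -116.2.
north_lat, north_lon = offset_point(43.6, -116.2, 5.0, 0.0)
east_lat, east_lon = offset_point(43.6, -116.2, 5.0, 90.0)
```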
Topo USA 4.0 was used for road miles, creek miles, etc.

TRS records were done using the batch program for western states
(TRS2LL). Lat/longs for ¼ sections and ¼ of ¼ section centers
were calculated using lat/long calculator.

Insufficient data (no specific locality, inconsistent information,
placenames not found).

Hours   Task
12.33   GNIS lookup of placename and lat/long and offset entry
17      lookup extents
16      Road miles, creek/river miles, junctions and TRS
15      Proofing, reading back through placenames and offsets and
        plotting calculated lat/longs by county
48.5    Total hours

41.1 records/hr (1993/48.5)

Greatest speed was in doing the GNIS records. Even with 17 hr for
extents the rate was 53.5 records/hr (1568/29.33). Eliminating extents
or putting them off for later would raise the rate to 127.2 records/hr
(1568/12.33). This rate was
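The percentages and rates quoted above can be reproduced from the posted counts. The sketch below uses only figures from the message; the variable names are made up, and the posted total of 48.5 hours is taken as given rather than recomputed from the itemized lines.

```python
# Record counts by method, as posted.
counts = {
    "prior": 121,
    "gnis": 1568,
    "topo_usa": 299,
    "trs": 31,
    "insufficient": 95,
}
total = sum(counts.values())

# Share of each method, rounded to one decimal place as in the table.
percent = {k: round(100 * v / total, 1) for k, v in counts.items()}

# Records handled in this effort exclude those georeferenced before download.
worked = total - counts["prior"]
total_hours = 48.5           # total as posted
gnis_lookup_hours = 12.33
extent_hours = 17.0

overall_rate = round(worked / total_hours, 1)
gnis_rate_with_extents = round(counts["gnis"] / (gnis_lookup_hours + extent_hours), 1)
gnis_rate_without_extents = round(counts["gnis"] / gnis_lookup_hours, 1)
```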
Summaries of extents and error radii:

(Only 3 records had units in km, which were converted to miles for
summary of extents and errors.)

Extents of placenames were done as outlined in the MaNIS instructions.
The contribution of extent to error differs depending on the number of
offsets.

Range in mi          Count
extent < .1            515
.1 < extent < 1        466
1 <= extent < 5        671
5 <= extent < 10       186
10 <= extent < 20       47
20 <= extent < 30        8
30 <= extent < 50        5
not done               216
Grand Total           2114

Error radii were not as large as I thought they would be. The greatest
variable contributing to error seems to be precision (N, NE, ENE) used
to describe the direction of the offset.

Range in mi          Count
error < .1             218
.1 < error < 1         250
1 <= error < 5         887
5 <= error < 10        371
10 <= error < 20       153
20 <= error < 30        13
30 <= error < 50         6
not done               216
Grand Total           2114
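To see why direction precision matters so much, consider how far the true point could lie from the nominal one when only a coarse compass direction is given. The sketch below is a geometric illustration, not the MaNIS error formula. It assumes a heading is uncertain by half the spacing between adjacent compass points (45 degrees for a 4-point direction such as N, 22.5 for an 8-point direction such as NE, 11.25 for a 16-point direction such as ENE) and returns the straight-line distance between the nominal point and the point at the edge of that range.

```python
import math


def direction_error_mi(offset_mi, compass_points):
    """Distance between the nominal offset point and the worst-case point.

    compass_points: 4, 8, or 16, the precision of the stated direction.
    The half-width of the heading is assumed to be 180 / compass_points
    degrees; the result is the chord between two points at the same
    distance from the origin separated by that angle.
    """
    half_width = math.radians(180.0 / compass_points)
    return 2.0 * offset_mi * math.sin(half_width / 2.0)


# A 10 mile offset described with 4-, 8-, and 16-point directions.
err_n = direction_error_mi(10.0, 4)
err_ne = direction_error_mi(10.0, 8)
err_ene = direction_error_mi(10.0, 16)
```

Under these assumptions the contribution roughly halves with each step up in direction precision, and it scales linearly with the offset distance.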
The Oregon summary was similar, except that about 20% were TRS records.
I didn't keep track of hours spent on various aspects, so I haven't posted
that summary. A lower rate of 20/hr for Oregon was partly due to getting a
system down.

If getting paid by the record, choose those localities with published
datasets for placenames and lat/longs and electronic maps for extents.
If by the hour, I would still do those with placename datasets before
delving into the one-at-a-time records. The approach I would suggest is
to use a batch system or semi-batch, like mine, to do the US.

I'm doing British Columbia now. I used the NIMA dataset (2500 records),
then discovered the BCGNIS dataset with 42K records. However, maps are
not interactive (no Topo Canada) and extents may have to be done as
estimates. Unlike NIMA and (US) GNIS, the datum may have to be unknown.
>>> Posting number 371, dated 9 Jan 2003 04:27:38
Date: Thu, 9 Jan 2003 04:27:38 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: PSM will do MMNH Idaho and Oregon records
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
John: Just to make it official. There are 100 Oregon and about 10 Idaho.
>>> Posting number 372, dated 9 Jan 2003 09:31:34
Date: Thu, 9 Jan 2003 09:31:34 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Idaho time trials
In-Reply-To: <BAY1-DAV65FYfYpwMIH0001048e@hotmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; format=flowed
Content-Transfer-Encoding: quoted-printable
XXXX,
Thanks for these wonderful stats. You're right, there are lots of folks who
will be interested to know this information, not the least of which is me.
It gives us a good basis from which to measure the efficacy of
technological and/or methodological innovations. Again, thanks,
John
>>> Posting number 373, dated 9 Jan 2003 15:07:18
>>> Posting number 374, dated 14 Jan 2003 12:48:13
>>> Posting number 375, dated 14 Jan 2003 10:49:37
>>> Posting number 376, dated 14 Jan 2003 15:08:42
>>> Posting number 377, dated 14 Jan 2003 16:05:30
>>> Posting number 378, dated 14 Jan 2003 16:09:19
>>> Posting number 379, dated 20 Jan 2003 08:50:00
>>> Posting number 380, dated 22 Jan 2003 13:56:17
>>> Posting number 381, dated 23 Jan 2003 11:09:35
>>> Posting number 382, dated 24 Jan 2003 12:52:37
>>> Posting number 383, dated 24 Jan 2003 17:53:34
>>> Posting number 384, dated 25 Jan 2003 12:59:55
Date: Sat, 25 Jan 2003 12:59:55 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Georeferencing status check
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
Georeferencing has been going quite smoothly, overall, and I'd like to
thank all of the participants for their contributions. As more regions get
claimed and georeferenced, it's becoming more difficult to determine what
remains to be done. In an effort to make things easier, I'm trying to put
together a new interface for the CheckList page, including a graphic
depiction of our progress to date. To do that, I'm asking each
participating institution to send me (it doesn't have to go on the
listserv) a brief message telling me how much is finished for regions that
have been claimed but not yet submitted. It'll help me to get an accurate
accounting and thereby determine how we are progressing overall. I will
assemble the status page with information received by 31 Jan 2003, so
please get back to me with your status by then.
Just so you all know, MVZ has 1143 localities to go to finish the 58543
localities for California, 1170 localities remaining of the 3389 for Costa
Rica, and we haven't yet begun with The Netherlands, Argentina, or
Arizona. If there is anyone out there who is interested in, ready, and has
the resources to georeference Arizona, the MVZ can relinquish it for
another, unclaimed region. Speak soon or we're likely to start on it as
soon as California is out of the way.
John
>>> Posting number 385, dated 26 Jan 2003 08:28:36
>>> Posting number 386, dated 27 Jan 2003 11:53:41
Date: Mon, 27 Jan 2003 11:53:41 -1000
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: procedural question
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
What are people doing in cases where you can't find the given locality
in gazetteers or on maps, but another institution has already provided
coordinates (in checking the coordinates on the map I see no place name
similar to the one given for locality)? Do you use the coordinates from
the other institution? What do you enter for the various fields needed
to find max error?
Sorry if this question has already been answered somewhere.
XXXXX
>>> Posting number 387, dated 28 Jan 2003 10:01:19
Date: Tue, 28 Jan 2003 10:01:19 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: procedural question
In-Reply-To: <9325D4CE29553F42A57E02B845C067F937D068@exchange.corp.bishopmuseum.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear XXXXX, and all,
It seems to me that if the coordinates don't help you to find a locality on
a map then there is a fair chance that there is actually something wrong
with the coordinates. If you've exhausted your available map resources
without finding the place, I would put, for example, "unable to locate Burt
Ranch" in the NoGeorefBecause field. In the DeterminationRef you could
still include the references on which you were unable to find the locality.
In LatLongRemarks I would add, for example, "data from MVZ suggest that
this locality is in the vicinity of 34.34 -120.43." The bottom line is
that we would like to avoid propagating suspect (or undocumented) data.
John
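The convention suggested above could be applied as in the sketch below. NoGeorefBecause, DeterminationRef, and LatLongRemarks are the field names given in the message; the Latitude and Longitude keys, the function name, and the listed sources are hypothetical. The other institution's coordinates are kept as a remark and deliberately not copied into the coordinate fields.

```python
def unlocated_record(place, sources_checked, hint_source, hint_lat, hint_lon):
    """Build the georeferencing fields for a locality that could not be found."""
    return {
        "Latitude": None,   # hypothetical field names; left empty on purpose
        "Longitude": None,
        "NoGeorefBecause": f"unable to locate {place}",
        # References consulted without success are still worth recording.
        "DeterminationRef": "; ".join(sources_checked),
        "LatLongRemarks": (
            f"data from {hint_source} suggest that this locality is "
            f"in the vicinity of {hint_lat} {hint_lon}"
        ),
    }


# The place, institution, and coordinates echo the example in the message;
# the sources checked are invented for illustration.
rec = unlocated_record("Burt Ranch", ["GNIS", "USGS topo sheets"], "MVZ", 34.34, -120.43)
```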
>>> Posting number 388, dated 28 Jan 2003 12:29:35
Date: Tue, 28 Jan 2003 12:29:35 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: procedural question
In-Reply-To: <5.1.1.5.2.20030128093301.017f9138@socrates.berkeley.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
John,
I appreciate your intent in adopting this convention, but I would urge that it
be applied sparingly to historic collections and overseas localities. As one
example, many Neotropical collectors in the early 20th century based their
localities on the Map of the Hispanic Americas. I'm guessing that relatively
few collections have this source at hand during geo-referencing. A set of
coordinates rather laboriously determined by hand from a place name (tag) and
the atlas (MHA) or even a hand-drawn map included in field notes could be lost
or demoted to an ancillary field not used in mapping by your method.
The bottom line seems to be "What sort of effort would be required to exhume
it?"
Anyhow, some thoughts on the basis of past experience with our collection....
XXXXX
>>> Posting number 389, dated 28 Jan 2003 09:17:12
Date: Tue, 28 Jan 2003 09:17:12 -1000
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: procedural question
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Actually, I quite like John's solution. It does not comment on the
validity of the coordinates provided by another institution, but it does
tell the institution whose locality is currently in question that I was
not able to find the locality in my sources but here is what someone
else has come up with, check it out and decide for yourself. In this
way we don't, as John says, keep passing on possibly erroneous
information but we don't lose it either. If either institution can
validate the data later, the full georeferencing data can be filled in
at that time.
XXXXX
>>> Posting number 390, dated 29 Jan 2003 07:59:20
Date: Wed, 29 Jan 2003 07:59:20 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: procedural question
In-Reply-To: <9325D4CE29553F42A57E02B845C067F91EDAA3@exchange.corp.bishopmuseum.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Perhaps in my hurry to rush off to lecture I left too much unsaid in the
earlier message--permit me to elaborate a bit.
Place names change over time, as we all appreciate. Old maps depict old place
names (albeit sometimes with crudely determined coordinates). If one is dealing
with old specimens referenced with old place names, then it sometimes happens
that only old maps will register those localities. If a place has been renamed,
or simply abandoned, it won't appear on a modern map, no matter the detail or
accuracy of the latter. And if someone has already determined that, and gone to
the trouble of determining its coordinates, wouldn't we be far better off
leaving those coordinates in the database?
I believe a better arbiter of a "suspect" distributional record is an otherwise
extralimital one. In the extreme, application of John's suggestion would
actually remove GPS-determined coordinates for a locality if (a) the fact it
was recorded by the collector with a GPS were unnoted in the catalog, and (b)
the place name chosen for a reference point were too obscure or colloquial to
appear on the maps a given geo-referencer was using in his/her country/state
review. Personally, I would rather identify these after the fact--in
distributional context--rather than a priori, on whether they can be relocated.
By the way, all of these would lack the estimate of precision (accuracy) we are
associating with all newly determined coordinates--by mapping only records with
spatial error terms, we can automatically exclude such records, without ever
changing their coordinates.
As I suggested earlier, John's standard is a good one that will work with
recent collections being geo-referenced with comprehensive, well documented map
sources. I believe it will fail (meaning we lose information we already have in
digital form) on historic collections being georeferenced without all the
sources utilized by or available to the collectors. To safeguard against the
latter, perhaps persons trying to georeference specimens with undocumented
coordinates could contact the host museum(s) to inquire about possible
alternative data sources BEFORE eliminating those records' coordinates. I'd
sure hate to throw the baby out with the bathwater...
Bruce
> Actually, I quite like John's solution. It does not comment on the
> validity of the coordinates provided by another institution, but it does
> tell the institution whose locality is currently in question that I was
> not able to find the locality in my sources but here is what someone
> else has come up with, check it out and decide for yourself. In this
> way we don't, as John says, keep passing on possibly erroneous
> information but we don't lose it either. If either institution can
> validate the data later, the full georeferencing data can be filled in
> at that time.
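The suggestion in the message above, to map only records that carry a spatial error term while leaving undocumented coordinates in place, might be sketched as a simple filter. The field names `lat`, `lon`, and `max_error_m` and the sample records are assumptions for illustration, not the MaNIS schema.

```python
def mappable(records):
    """Return only records with coordinates and a spatial error term."""
    required = ("lat", "lon", "max_error_m")  # hypothetical field names
    return [r for r in records if all(r.get(k) is not None for k in required)]


# Invented sample data.
records = [
    {"id": 1, "lat": 34.34, "lon": -120.43, "max_error_m": 1500.0},
    {"id": 2, "lat": 10.0, "lon": -84.0, "max_error_m": None},  # legacy coordinates
    {"id": 3, "lat": None, "lon": None, "max_error_m": None},   # not georeferenced
]
shown = mappable(records)
```

Record 2 is left out of the map but keeps its coordinates in the data, so nothing is lost if a source for them turns up later.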
>>> Posting number 391, dated 29 Jan 2003 08:59:09
Date: Wed, 29 Jan 2003 08:59:09 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: procedural question
In-Reply-To: <4.1.20030129075122.00965460@mail.fmnh.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
For what it is worth, I agree totally with what XXXXX writes below.
There are lots of named ranches here in the western US, given as
localities, that are identified on old (e.g., 1927) 7.5 minute USGS
topo sheets but that do not appear on more recent paper versions and
thus on any of the digitally available maps.
XXX
>Perhaps in my hurry to rush off to lecture I left too much unsaid in the
>earlier message--permit me to elaborate a bit.
>
>Place names change over time, as we all appreciate. Old maps depict old place
>names (albeit sometimes with crudely determined coordinates). If one
>is dealing
>with old specimens referenced with old place names, then it sometimes happens
>that only old maps will register those localities. If a place has
>been renamed,
>or simply abandoned, it won't appear on a modern map, no matter the detail or
>accuracy of the latter. And if someone has already determined that,
>and gone to
>the trouble of determining its coordinates, wouldn't we be far better off
>leaving those coordinates in the database?
>
>I believe a better arbiter of a "suspect" distributional record is
>an otherwise
>extralimital one. In the extreme, application of John's suggestion would
>actually remove a GPS-determined coordinates for a locality if (a) the fact it
>was recorded by the collector with a GPS were unnoted in the catalog, and (b)
>the place name chosen for a reference point were too obscure or colloquial to
>appear on the maps a given geo-referencer was using in his/her country/state
>review. Personally, I would rather identify these after the fact--in
>distributional context--rather than a priori, on whether they can be
>relocated.
>By the way, all of these would lack the estimate of precision
>(accuracy) we are
>associating with all newly determined coordinates--by mapping only
>records with
>spatial error terms, we can automatically exclude such records, without ever
>changing their coordinates
>
> As I suggested earlier, John's standard is a good one that will work with
>recent collections being geo-referenced with comprehensive, well
>documented map
>sources. I believe it will fail (meaning we lose information we
>already have in
>digital form) on historic collections being georeferenced without all the
>sources utilized by or available to the collectors. To safeguard against the
>latter, perhaps persons trying to georeference specimens with undocumented
>coordinates could contact the host museum(s) to inquire about possible
>alternative data sources BEFORE eliminating those records' coordinates. I'd
>sure hate to throw the baby out with the bathwater...
>Bruce
>
>> Actually, I quite like John's solution. It does not comment on the
>> validity of the coordinates provided by another institution, but it does
>> tell the institution whose locality is currently in question that I was
>> not able to find the locality in my sources but here is what someone
>> else has come up with, check it out and decide for yourself. In this
>> way we don't, as John says, keep passing on possibly erroneous
>> information but we don't lose it either. If either institution can
>> validate the data later, the full georeferencing data can be filled in
>> at that time.
>>> Posting number 392, dated 29 Jan 2003 09:24:28
Date: Wed, 29 Jan 2003 09:24:28 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: procedural question
In-Reply-To: <p05100303ba5db78db1d2@[128.32.214.36]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
I don't disagree with any of these sentiments, but I want to clarify my
original response to the original questions posed by XXXXX, which were:
"What are people doing in cases where you can't find the given locality in
gazetteers or on maps, but another institution has already provided
coordinates (in checking the coordinates on the map I see no place name
similar to the one given for locality). Do you use the coordinates from
the other institution? What do you enter for the various fields needed to
find max error?"
Remember, we're supposed to georeference localities that *do not* already
have coordinates, and we're supposed to leave localities that have
coordinates unchanged. It was never the intention to throw anything away,
not even the bathwater.
>>> Posting number 393, dated 29 Jan 2003 11:39:31
Date: Wed, 29 Jan 2003 11:39:31 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: procedural question
In-Reply-To: <5.1.1.5.2.20030129090833.01727938@socrates.berkeley.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
I have a question regarding extents and existing data. I have found
identical localities from different museums, but one has coordinates as
parsed by John. What should be done with these at error/extent time?
XXXXXXXXXXXX
>>> Posting number 394, dated 29 Jan 2003 09:58:00
Date: Wed, 29 Jan 2003 09:58:00 -1000
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: procedural question
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Regarding XXXX's question: per instructions, I leave any records that
come with coordinates absolutely alone: "we're supposed to georeference
localities that *do not* already have coordinates, and we're supposed to
leave localities that have coordinates unchanged." I do find
coordinates and do the whole deal for the records that don't have
coordinates added yet. This may seem like duplicating the
georeferencing when another museum has coordinates for what looks like
the exact same locality, but I do it for three reasons:
1. I find that the parsed coordinates are often derived from less accurate
coordinates than I come up with. For instance, the decimal lat and long
are often determined from coordinates that are accurate only to the
closest minute. My measurements off of maps are far more accurate.
2. It's a good cross-check of the information for both institutions.
3. As pointed out, the parsed records do not come with all the
information needed for determining maximum error. Since I need to find
these out for the records that do not come with coordinates, I may as
well determine the coordinates while I'm at it.
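To put the first reason above in numbers: coordinates recorded only to the closest minute can be off by up to half a minute in each direction. The sketch below estimates the resulting worst-case displacement on the ground. It assumes a spherical Earth with one minute of latitude equal to about 1852 m (roughly one nautical mile); the function name is made up, and this is not the MaNIS error calculation.

```python
import math

METERS_PER_MINUTE_LAT = 1852.0  # about one nautical mile per minute of latitude


def minute_rounding_error_m(lat_deg):
    """Worst-case displacement when lat and long are rounded to the minute.

    Half a minute of latitude is a fixed distance; half a minute of
    longitude shrinks with the cosine of the latitude. The two are
    combined as the diagonal of the resulting rectangle.
    """
    north_south = METERS_PER_MINUTE_LAT / 2.0
    east_west = north_south * math.cos(math.radians(lat_deg))
    return math.hypot(north_south, east_west)


at_equator = minute_rounding_error_m(0.0)
at_45_north = minute_rounding_error_m(45.0)
```

At mid-latitudes this works out to a little over a kilometer, which is larger than many of the error radii a careful map measurement would produce.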
Regarding XXXXX's concerns: no data is ever eliminated from the MaNIS =
records. The records that contain coordinates are never changed in any =
way, so that historical data is left untouched for those instutions that =
have already done the work. My question had to do with situations when =
the locality from institution A does not show up in any of my available =
sources, but I do see that institution B has provided coordinates. =
Following John's suggestion, I will now let Instituion A know why their =
locality has not been georeferenced, and refer them to the information =
provided by Instuition B.
XXXXX
-----Original Message-----
From:
Sent: Wednesday, January 29, 2003 7:40 AM
To: MAMMAL-Z-NET@USOBI.ORG
Subject: Re: procedural question
I have a question regarding extents and existing data. I have found
identical localities from different museums, but one has coordinates as
parsed by John. What should be done with these at error/extent time?
>>> Posting number 395, dated 29 Jan 2003 12:26:14
Date: Wed, 29 Jan 2003 12:26:14 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: finding ranches
In-Reply-To: <p05100303ba5db78db1d2@[128.32.214.36]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
In Oregon and Idaho, I found that zooming in all the way on Topozone
reveals many features that do not come up with the search utility and
are not in GNIS. Supposedly Topozone maps are scans of original
USGS maps. However, to find a ranch, flat, or cave you have to know
the general locality. A fast connection and big screen also help.
>for what it is worth, I agree totally with what Bruce writes below.
>There are lots of named ranches here in the western US, given as
>localities, that are identified on old (e.g., 1927) 7.5 minute USGS
>topo sheets but that do not appear on more recent paper versions and
>thus on any of the digitally available maps.
>>> Posting number 396, dated 30 Jan 2003 11:49:22
>>> Posting number 397, dated 30 Jan 2003 10:00:47
>>> Posting number 398, dated 30 Jan 2003 12:12:44
>>> Posting number 399, dated 30 Jan 2003 16:08:52
>>> Posting number 400, dated 30 Jan 2003 16:37:05
>>> Posting number 401, dated 30 Jan 2003 15:40:20
>>> Posting number 402, dated 30 Jan 2003 16:52:20
>>> Posting number 403, dated 30 Jan 2003 14:06:32
>>> Posting number 404, dated 30 Jan 2003 16:59:54
>>> Posting number 405, dated 30 Jan 2003 17:34:18
>>> Posting number 406, dated 30 Jan 2003 18:29:53
>>> Posting number 407, dated 30 Jan 2003 18:35:05
>>> Posting number 408, dated 30 Jan 2003 18:36:56
>>> Posting number 409, dated 31 Jan 2003 10:19:27
>>> Posting number 410, dated 31 Jan 2003 15:12:23
>>> Posting number 411, dated 31 Jan 2003 14:03:38
>>> Posting number 412, dated 31 Jan 2003 19:47:53
>>> Posting number 413, dated 3 Feb 2003 13:44:57
>>> Posting number 414, dated 4 Feb 2003 18:58:30
>>> Posting number 415, dated 5 Feb 2003 09:44:07
>>> Posting number 416, dated 5 Feb 2003 12:30:28
>>> Posting number 417, dated 5 Feb 2003 12:37:31
>>> Posting number 418, dated 6 Feb 2003 13:49:21
>>> Posting number 419, dated 7 Feb 2003 17:03:42
>>> Posting number 420, dated 7 Feb 2003 16:24:36
>>> Posting number 421, dated 7 Feb 2003 20:42:49
>>> Posting number 422, dated 10 Feb 2003 10:19:24
>>> Posting number 423, dated 10 Feb 2003 10:21:43
>>> Posting number 424, dated 10 Feb 2003 19:37:19
Date: Mon, 10 Feb 2003 19:37:19 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: MaNIS Georeferencing Status Report - 1 Feb 2003
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
Brace yourselves. I'm about to blast you with numbers. Below are several
ways of looking at the georeferencing progress to date. The data include
the localities and contributions of the Bell Museum, as well as the
contributions of CONABIO and Town Peterson's group at KU - to all of whom I
extend another hearty thanks.
The following data represent the georeferencing status as of 1 Feb 2003, 17
months into the MaNIS project. When looking at these data there are several
attendant issues to consider, namely:
1) MVZ actually started georeferencing before the grant did.
2) Most institutions did not start georeferencing until about Jan 2002.
3) One institution did not submit a report but has 3600 localities in progress.
4) There has been assistance from colleagues who were not budgeted into the
original calculations. This accounts for 18932 claimed localities of which
8545 have been georeferenced and 2808 had coordinates when we began.
5) Of the 34090 unclaimed localities, 4838 already have lat/longs.
6) The Bell Museum has contributed 5407 localities to the workload and has
thus far committed to georeferencing 3815 localities. Some of the Bell
Museum localities may not be georeferenced in the context of MaNIS.
Total number of localities: 296737
Total number georeferenced: 203881 (68.7%)
Total pre-georeferenced: 64073 (21.6%)
Georeferenced in MaNIS: 139808 (47.1%)
Remaining to georeference: 92856 (31.3%)
Breakdown by georef category:
USA georeferenced: 161906 of 192887 (83.9%)
CAN georeferenced: 8426 of 12197 (69.1%)
MEX georeferenced: 6705 of 30062 (22.3%)
Other georeferenced: 26844 of 61591 (43.6%)
The claims are as follows:
USA claimed: 191125 of 192887 (99.1%)
CAN claimed: 8202 of 12197 (67.2%)
MEX claimed: 29898 of 30062 (99.5%)
Other claimed: 33422 of 61591 (54.3%)
Total claimed: 262647 of 296737 (88.5%)
Using reasonable (I think) estimates of the georeferencing rates for the
three original categories (18/hr, 12/hr, and 9/hr for USA, non-USA North
America, and non-North America respectively), we have finished 214 weeks of
georeferencing and have another 196 weeks of georeferencing to go. In other
words, 52.2% of the workload is behind us. These weeks are not based on the
actual hours spent georeferencing. However, given that georeferencing takes
time, and time is money (if we're to believe the adage), another way to
assess our progress is to look at how much money has been spent to get to
our current status. As of 1 Feb 2003, 49% of the money available for
georeferencing has been spent. Given the caveats in items 1-6 above and
that there is some variability in the georeferencing rates, I'm happy to
report that we're right on target!
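The arithmetic behind these percentages can be checked with a short script. This is only a sketch: the variable names are illustrative, and every figure is copied from the report above rather than taken from the MaNIS database.

```python
# Figures copied from the 1 Feb 2003 status report; names are illustrative.
total_localities = 296737
georeferenced = {"USA": 161906, "CAN": 8426, "MEX": 6705, "Other": 26844}
pre_georeferenced = 64073

# Estimated workload in weeks, as stated in the report.
weeks_done = 214
weeks_left = 196


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


total_georeferenced = sum(georeferenced.values())
done_in_manis = total_georeferenced - pre_georeferenced
remaining = total_localities - total_georeferenced
workload_behind_us = percent(weeks_done, weeks_done + weeks_left)

print("Total georeferenced:", total_georeferenced,
      percent(total_georeferenced, total_localities))
print("Georeferenced in MaNIS:", done_in_manis,
      percent(done_in_manis, total_localities))
print("Remaining:", remaining, percent(remaining, total_localities))
print("Workload behind us (%):", workload_behind_us)
```

The per-category counts sum to the reported total, and the 214 and 196 week estimates give the 52.2% quoted in the text.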
To celebrate, I've made changes to the Georef Checklist
(http://dlp.cs.berkeley.edu/manis/Checklist.html) page to reflect that we
have done more than we have left to do. We now have a map of claimed
georeferencing regions courtesy of Robert Hijmans. In addition, I've added
a new table with an alphabetical listing of geographic regions that have
not yet been claimed for georeferencing, including the number of localities
and the number left to georeference. Because I'm feeling particularly
generous (relieved is probably more accurate, actually), I've also made an
update to the georeferencing calculator to include a few new map scales
that have been encountered since beginning to georeference far-away places
where we'd all rather be.
I guess I'll take this opportunity to claim Albania, Djibouti, Macau,
Guinea-Bissau, and the Maldives for georeferencing while I'm at it. (You
should probably look on the new checklist before being too laudatory).
Thanks, and good work,
John
>>> Posting number 425, dated 20 Feb 2003 11:25:34
>>> Posting number 426, dated 25 Feb 2003 11:37:32
Date: Tue, 25 Feb 2003 11:37:32 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: "Patricia W. Freeman" <pfreeman1@UNL.EDU>
Subject: smaller collections and overlap
In-Reply-To: <5.0.2.1.0.20030220112241.02400a00@nhm.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
I meant to pass this along to the list a couple of weeks ago. It brings up
an interesting issue. The University of Nebraska State Museum (UNSM)
division of Zoology had already done much of the work that MaNIS set out to do
when the proposal went out, or at least we had much of the work in place
and had created the UNSM Georeferencing Calculator. It then became an
issue as to whether Steve Hinsaw at Michigan would "do" Nebraska for MaNIS
or that UNSM would. We both volunteered.
I hope we can be linked in some fashion to MaNIS. Other state/regional
collections may be in the same boat. We concentrate on Nebraska and the
Northern Great Plains region and have the largest collection of Nebraska
mammals (and birds, and herps, and fish)
Here is my letter.
10/2/03
Dear Steve and John-
If you want consistency with Manis, Steve should probably go ahead and
calculate localities for Nebraska.
Our algorithm [*], based on ideas from that Texas Tech paper, is similar
but figuring accuracy is a different matter. We have applied our
Calculator method to all four of our vertebrate collections here. We try
to have two ways of reporting locality now. For the third, retroactively
figuring lat /long, we feel, is art rather than science and the buyers will
have to beware, particularly when it comes to different centuries, town
centers and post offices over time, different models of GPS units, and
whether those units have been corrected.
I hope, however, Manis will consider links to smaller state and regional
collections that are georeferenced and have a good knowledge of and are
closer to old place names in our state or region.
As regarding the sharing of data issue.
We have two levels of information. The public/online one gives localities
to counties only. These lead to professional inquiries that come straight
to my Collections Manager or myself. It has worked very well over the last
several years and researchers contact us on a regular, and increasing
basis. We handle endangered species localities on a case by case basis. [**
then***]
Best regards-
Trish Freeman
my notes today that are more explanatory:
* which we have now named the UNSM Georeferencing Calculator; it was created
in 1999 or 2000 by Cliff Lemen, an all-round biologist who is computer
knowledgeable (ask Bruce Patterson)
** This way we know exactly who wants data and can assess for-profit
requests. Our UNSM management policy has suggested charging for these
requests. They have not come up much, however.
*** We have a disclaimer on every report sent out that we cannot assure the
accuracy of the data (this would be both taxonomically and
georeference-wise).
Thanks-
Trish
Patricia W. Freeman
Professor/ Curator of Zoology
University of Nebraska State Museum
Lincoln NE 68588-0514
402-472-6606
402-472-8949 (fax)
pfreeman1@unl.edu
http://www-museum.unl.edu/research/zoology/zoology.html
>>> Posting number 427, dated 5 Mar 2003 09:46:19
>>> Posting number 428, dated 5 Mar 2003 13:07:44
>>> Posting number 429, dated 5 Mar 2003 16:58:32
>>> Posting number 430, dated 11 Mar 2003 21:02:37
>>> Posting number 431, dated 11 Mar 2003 09:47:45
>>> Posting number 432, dated 11 Mar 2003 12:02:27
>>> Posting number 433, dated 13 Mar 2003 15:53:15
Date: Thu, 13 Mar 2003 15:53:15 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Georeferencing guidelines citation
In-Reply-To: <5.1.1.5.2.20030311022218.018130b0@socrates.berkeley.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; format=flowed
Content-Transfer-Encoding: quoted-printable
Hello John.
Do you have the proper citation for your 'Georeferencing guidelines'?
Thanks
XXXXXXXXXXXXXXXXXXX
>>> Posting number 434, dated 13 Mar 2003 15:56:31
Date: Thu, 13 Mar 2003 15:56:31 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Georeferencing guidelines citation
In-Reply-To: <5.2.0.9.0.20030313155151.00b5ce48@xolo.conabio.gob.mx>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; format=flowed
Content-Transfer-Encoding: quoted-printable
For the web page I suppose you'd use something like the following:
Wieczorek, J. R. 2001. Georeferencing Guidelines.
(http://dlp.cs.berkeley.edu/manis/GeorefGuide.html)
I'm in the process of preparing a paper on the subject for the International
Journal of GIS based on the GeorefGuide document. I'm going to try to
submit it in the next week or two.
At 03:53 PM 3/13/03 -0600, you wrote:
> Hello John.
> Do you have the proper citation for your 'Georeferencing guidelines'?
>
> Thanks
> XXXXXXXXXXXXXXXXXXXXXXX
>
>>> Posting number 435, dated 14 Mar 2003 12:04:44
>>> Posting number 436, dated 14 Mar 2003 16:34:31
>>> Posting number 437, dated 14 Mar 2003 18:56:55
Date: Fri, 14 Mar 2003 18:56:55 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: New Georeferencing Calculator
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
By popular demand (way back when I first released the Georeferencing
Calculator, and more recently as new browser versions have become
available) I have developed a new Calculator. The new version is the same
as the old in every respect except the following:
1) The code for the new calculator is much smaller and should therefore
load much more quickly. The release consists of one file,
georefcalculator.jar, which is 244k.
2) I have removed an internal dependency that required access to
elib.cs.berkeley.edu to determine datum errors. The datum error data are
now included within the program. This means the calculator can be run with
appletviewer without requiring access to the internet.
3) I have created an enhanced web page that checks for and installs (if
necessary) a Java plugin in your browser before trying to load the applet.
4) I have made the applet Mac compatible in IE and Netscape versions later
than 4.x.
The calculator still seems to run fine in all of the previously supported
operating systems and browsers. Feel free to report findings to the contrary.
The new Calculator can be found at the following URL:
http://dlp.cs.berkeley.edu/manis/gc.html
When it is clear that this version is at least as stable as the old one, I
will replace all of the links in the MaNIS web pages to this new version
and "retire" the old one. Therefore, please start to use this new version
and tell me if anything is amiss.
Thanks,
John
>>> Posting number 438, dated 15 Mar 2003 13:37:26
>>> Posting number 439, dated 17 Mar 2003 15:18:21
>>> Posting number 440, dated 20 Mar 2003 15:28:02
Date: Thu, 20 Mar 2003 15:28:02 -1000
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: New Georeferencing Calculator
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
John,
your calculators (old and new) do not provide a selection for GPS
reading in the "coordinate source" box. I've got new localities coming
in that require that option and there just may be some fairly recent
ones in the MaNIS gazetteer that will need it also.
XXXXX
>>> Posting number 441, dated 21 Mar 2003 13:05:20
Date: Fri, 21 Mar 2003 13:05:20 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Updates for GPS-derived coordinates
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All, and especially XXXXXXXXXXXXX for the reminder,
I have made an update to the new Georeferencing Calculator
(http://elib.cs.berkeley.edu/manis/gc.html) to accommodate GPS as a
coordinate source. When the Locality Type is "Coordinates Only" and the
Coordinate Source is "GPS" a new text box appears in which you can enter
the GPS accuracy. The calculations treat GPS accuracy in the same way as an
extent of a named place. In this case the named place is the pair of
coordinates and the extent is the accuracy.
I have updated the Georeferencing Guidelines
(http://dlp.cs.berkeley.edu/manis/GeorefGuide.html) to include a section on
GPS accuracy as well as minor changes in the text to accommodate this
source of uncertainty. I'm including the text of the GPS section below for
you convenience.
The changes have not been to the old calculator, which I will retire when
I'm confident that this new calculator has no flaws more serious than the
old one has. To date I have had no reported bugs, which I take to be a good
sign rather than a sign that you have all given up georeferencing altogether.
John
Excerpt from Georeferencing Guidelines 21 Mar 2003:
Uncertainty due to GPS accuracy
The accuracy of the coordinate data reported by a GPS varies with time,
place, and equipment used. Previous to the order to cease Selective
Availability (deliberate GPS signal scrambling) at 8PM EST 1 May 2000,
uncorrected GPS receivers were subject to artificial inaccuracies of about
100 meters. Today, many GPS receivers have a function to determine the
estimated accuracy of a given reading, but this information is not
universally available, nor is it often recorded with the coordinates. It is
not possible to determine the actual accuracy of a GPS reading
retroactively if it was not recorded at the time of the reading. In fact,
many GPS receivers estimate accuracy poorly. My Garmin eTrex Summit, for
example, reports positions with putative accuracies of 7 meters that are
demonstrably off by 15 meters. Where extreme accuracy is required, be sure
of the capabilities of your GPS under the prevailing conditions when the
coordinates are recorded. For retrospective uncertainty estimates where
detailed information is not available, 30 meters is a reasonable,
conservative estimate of GPS accuracy in the absence of Selective
Availability.
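A minimal sketch of how the guidance in this excerpt might be applied when choosing a GPS accuracy value to enter in the calculator. The function and constant names are hypothetical, and using 100 meters as the default for readings taken while Selective Availability was active is an assumption based on the "about 100 meters" figure above, not a documented MaNIS rule.

```python
from datetime import date

# Selective Availability ended at 8 PM EST on 1 May 2000 (from the excerpt).
# Treating 2 May 2000 as the first SA-free day is a simplification.
SA_FREE_FROM = date(2000, 5, 2)
SA_ERA_ACCURACY_M = 100.0   # assumption: "about 100 meters" used as a default
DEFAULT_ACCURACY_M = 30.0   # conservative estimate suggested in the excerpt


def gps_accuracy_m(reading_date, recorded_accuracy_m=None):
    """Pick an accuracy (meters) to use as the extent of a GPS reading.

    A value recorded at the time of the reading takes priority; otherwise
    fall back to a default that depends on whether Selective Availability
    was still in effect on the date of the reading.
    """
    if recorded_accuracy_m is not None:
        return float(recorded_accuracy_m)
    if reading_date < SA_FREE_FROM:
        return SA_ERA_ACCURACY_M
    return DEFAULT_ACCURACY_M


print(gps_accuracy_m(date(2003, 3, 21)))       # no recorded accuracy, after SA
print(gps_accuracy_m(date(1999, 7, 17)))       # no recorded accuracy, during SA
print(gps_accuracy_m(date(2003, 3, 21), 7))    # accuracy recorded with reading
```

Because receivers can report optimistic accuracies, as the eTrex example above shows, a recorded value may still deserve a sanity check before it is trusted.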
>>> Posting number 442, dated 24 Mar 2003 16:56:35
Date: Mon, 24 Mar 2003 16:56:35 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Canadian Geographic Names Service
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
I have just discovered a Canadian corollary to the USGS GNIS, the
Geographical Names of Canada, at the following URL:
http://geonames.nrcan.gc.ca/index_e.php
My apologies to those of you who might already have known of this, but if
you did, why haven't you told the rest of us? :)
John
>>> Posting number 443, dated 24 Mar 2003 20:13:00
Date: Mon, 24 Mar 2003 20:13:00 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Canadian Geographic Names Service
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
John and All: I assumed everyone knew about it. It is on the US GNIS
website under Links>Useful Geographic Names links. I guess to differentiate
these from nonuseful links. Back in the Idaho time trials posting I
mentioned BCGNIS and downloading placenames and lat/longs using the bounding
box. I ended up with 42K+ placenames for BC for use in the Excel
calculator. The problem then was in the datum. I have since learned (will
forward the email when I locate it) that almost all lat/longs are NAD27.
In BCGNIS the lat/longs are to the nearest minute, leading to my current
conundrum: is it worth bothering with other components of error (especially
extent) when the lat/longs are to the nearest minute? More as I think about
the problem.
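To put a number on the nearest-minute question, here is a rough sketch of the largest ground distance between a true position and coordinates rounded to the nearest minute. It assumes a spherical Earth with one minute of latitude equal to one nautical mile (1.852 km), and it measures half the diagonal of the one-minute cell. The function name is hypothetical, and this is not the formula used by the Georeferencing Calculator.

```python
import math

KM_PER_MINUTE_OF_LATITUDE = 1.852  # one nautical mile; spherical approximation


def rounding_error_km(latitude_deg, precision_minutes=1.0):
    """Worst-case distance (km) introduced by rounding lat and long.

    Rounding moves each coordinate by at most half the precision, so the
    worst case is the half-diagonal of the precision cell. A minute of
    longitude shrinks with the cosine of the latitude.
    """
    half_cell_lat_km = 0.5 * precision_minutes * KM_PER_MINUTE_OF_LATITUDE
    half_cell_lon_km = half_cell_lat_km * math.cos(math.radians(latitude_deg))
    return math.hypot(half_cell_lat_km, half_cell_lon_km)


for lat in (49.0, 54.0, 60.0):  # roughly the southern to northern edge of BC
    print(lat, round(rounding_error_km(lat), 2))
```

At BC latitudes this comes out a little over 1 km, in line with the "less than 1.5 km" figure in the forwarded BCGNIS reply, which is small next to the extent of a large named place but not next to a well-specified locality.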
----- Original Message -----
From: "John Wieczorek" <tuco@SOCRATES.BERKELEY.EDU>
To: <MAMMAL-Z-NET@USOBI.ORG>
Sent: Monday, March 24, 2003 4:56 PM
Subject: [MANIS] Canadian Geographic Names Service
> Dear All,
>
> I have just discovered a Canadian corollary to the USGS GNIS, the
> Geographical Names of Canada, at the following URL:
>
> http://geonames.nrcan.gc.ca/index_e.php
>
> My apologies to those of you who might already have known of this, but if
> you did, why haven't you told the rest of us? :)
>
> John
>
>>> Posting number 444, dated 24 Mar 2003 20:14:39
Date: Mon, 24 Mar 2003 20:14:39 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Fw: BCGNIS datum
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
----- Original Message -----
From: "Mason, Janet SRM:EX" <Janet.Mason@gems4.gov.bc.ca>
To:
Sent: Thursday, March 20, 2003 8:09 AM
Subject: RE: BCGNIS datum
> The bounding box is certainly one way to pull all BC names. Note that
> applications using static gazetteer data are discouraged, because place
> names can and do change.
>
> I'm familiar with the Topo USA 4.0 product, but you might get some value
> from the Internet Mapping Framework (IMF) utility being developed through
> the Land Information BC initiative in our ministry, Sustainable Resource
> Management. I've been finalizing protocols with IMF designers so that their
> 'search by name' utility will hit a live view of BCGNIS, hence obviate
> static retention of place name data.
>
> http://maps.gov.bc.ca/ --> "Provincial Basemap".
> Select "Find Location" on toolbar, then 'Place Name'. Note that the utility
> zooms to the coordinates held in BCGNIS (i.e. MOUTH of rivers, CENTRE of
> areal features, SUMMIT of elevated features), so you'll still need to pan if
> your site is near headwaters, etc. Merging with orthophotos might be
> useful for you. The amount & quality of toponymy on base maps is abysmal,
> but a markup tool allows you to customize maps.
>
> Cheers,
>
> Janet Mason
> Provincial Toponymist
> Base Mapping & Geomatic Services Branch
> Ministry of Sustainable Resource Management
> PO Box 9355 STN Prov Govt
> VICTORIA BC V8W 9M2
> *(250) 387-9328 fax (250) 356-7831
> * janet.mason@gems4.gov.bc.ca
> BC Geographical Names Information System:
> http://srmwww.gov.bc.ca/bcnames
>
>
>
> -----Original Message-----
> From:
> Sent: Wednesday, March 19, 2003 6:00 AM
> To: Mason, Janet SRM:EX
> Subject: Re: BCGNIS datum
>
>
> Janet: Your response was very helpful. From your web site I downloaded
> about 42K localities using your bounding box search. The bounding box was
> impressive. These localities then went into an Excel workbook for lookup of
> BC placenames.
>
> Do you know of any electronic (interactive) maps similar to Topo USA 4.0
> (from DeLorme) that encompass BC? I have to determine extents for
> placenames (encompassing radius) and do a number of measurements and don't
> have the funds to purchase all the hard copy maps.
>
>
> ----- Original Message -----
> From: "Mason, Janet SRM:EX" <Janet.Mason@gems4.gov.bc.ca>
> To:
> Sent: Tuesday, March 18, 2003 10:25 AM
> Subject: RE: BCGNIS datum
>
>
> > Hi XXXX.
> >
> > The VAST majority of coordinates in BCGNIS are NAD 27; they have been
> > hand-scaled off federal 1:50k or 1:250k lithographed sheets over the past 5
> > decades, and - as you will have noticed - are usually rounded to the nearest
> > minute. At worst, rounding represents less than 1.5 km on the ground at our
> > latitudes. Names adopted in the last 10 years (a few hundred, max) are
> > located to the nearest 5-second, but still scaled off the available litho
> > sheets, hence primarily NAD27.
> >
> > We're working to repopulate these values from locations on the 1:20 000
> > provincial base, TRIM (NAD 83), but I'm guessing we're a year or more away.
> > At that time, datum will be specified, and UTM and decimal-degrees will also
> > be displayed, or calculated on the fly.
> >
> > I hope this helps,
> >
> > Cheers,
> >
> > Janet Mason
> > Provincial Toponymist
> > Base Mapping & Geomatic Services Branch
> > Ministry of Sustainable Resource Management
> > PO Box 9355 STN Prov Govt
> > VICTORIA BC V8W 9M2
> > *(250) 387-9328 fax (250) 356-7831
> > * janet.mason@gems4.gov.bc.ca
> > BC Geographical Names Information System:
> > http://srmwww.gov.bc.ca/bcnames
> >
> > -----Original Message-----
> > From:
> > Sent: Friday, March 14, 2003 11:46 AM
> > To: Mason, Janet SRM:EX
> > Subject: BCGNIS datum
> >
> >
> >
> > Dear Janet: I am working on a georeferencing project through Berkeley
> > Museum of Vertebrate Zoology ( http://dlp.cs.berkeley.edu/manis/ ). I used
> > BCGNIS to get lat/longs for mammal specimens from British Columbia. One
> > component of error we are using in determining lat/long is the datum (known
> > or unknown). I'm just wondering if the datum (NAD27, NAD83, WGS84) is
> > specified for BCGNIS localities on the original maps or elsewhere.
> >
> >
> >
> >
> > XXXXXXXXXXXXXXXXXXX
> >
>>> Posting number 445, dated 25 Mar 2003 11:29:14
>>> Posting number 446, dated 25 Mar 2003 12:06:07
>>> Posting number 447, dated 25 Mar 2003 09:22:23
Date: Tue, 25 Mar 2003 09:22:23 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Canadian Geographic Names Service
In-Reply-To: <BAY1-DAV18SXI3acH0x0002cc62@hotmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear XXXX, and all,
For the sake of consistency all georeferencing should follow the MaNIS
guidelines. If an example of a reason not to neglect the extent
determinations would be helpful, here's one:
Vancouver Island
John
At 08:13 PM 3/24/03 -0800, you wrote:
>John and All: I assumed everyone knew about it. It is on the US GNIS
>website under Links>Useful Geographic Names links. I guess to differentiate
>these from nonuseful links. Back in the Idaho time trials posting I
>mentioned BCGNIS and downloading placenames and lat/longs using the bounding
>box. I ended up with 42K+ BC placename for BC for use in the Excel
>calculator. The problem then was in the datum. I have since learned (will
>forward the email when I locate it) that the almost all lat/longs are NAD27.
>
>In BCGNIS the lat/long are to the nearest minute leading to my current
>conundrum: Is it worth bothering with other components of error (especially
>extent) when the lat/longs are to the nearest minute. More as I think about
>the problem.
>
>----- Original Message -----
>From: "John Wieczorek" <tuco@SOCRATES.BERKELEY.EDU>
>To: <MAMMAL-Z-NET@USOBI.ORG>
>Sent: Monday, March 24, 2003 4:56 PM
>Subject: [MANIS] Canadian Geographic Names Service
>
>
> > Dear All,
> >
> > I have just discovered a Canadian corollary to the USGS GNIS, the
> > Geographical Names of Canada, at the following URL:
> >
> > http://geonames.nrcan.gc.ca/index_e.php
> >
> > My apologies to those of you who might already have known of this, but if
> > you did, why haven't you told the rest of us? :)
> >
> > John
> >
>>> Posting number 448, dated 25 Mar 2003 09:53:09
Date: Tue, 25 Mar 2003 09:53:09 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Canadian Geographic Names Service
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
If it was just Vancouver Island, I would put it down as "extent too large,
find more precise reference" with the idea that it would be a waste of time
to fiddle and annotate when the owning institution most likely had more
data. However, I can georeference it with a 210 km extent and annotate it
as above. In this case, the extent component of error swamps the other
components, esp coordinate (im)precision due to the minute bounding box.
Vancouver Is is the size of western Washington or Oregon which I did not
georeference because of "extent too large" or "ambiguous reference" problem.
----- Original Message -----
From: "John Wieczorek" <tuco@SOCRATES.BERKELEY.EDU>
To: <MAMMAL-Z-NET@USOBI.ORG>
Sent: Tuesday, March 25, 2003 9:22 AM
Subject: Re: [MANIS] Canadian Geographic Names Service
> Dear XXXX, and all,
>
> For the sake of consistency all georeferencing should follow the MaNIS
> guidelines. If an example of a reason not to neglect the extent
> determinations would be helpful, here's one:
>
> Vancouver Island
>
> John
>
>
>>> Posting number 449, dated 25 Mar 2003 10:42:56
Date: Tue, 25 Mar 2003 10:42:56 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Canadian Geographic Names Service
In-Reply-To: <BAY1-DAV717eG5yhCSp00030258@hotmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Granted that my example is extreme, but I wanted to illustrate
unequivocally that the extent matters. Extents will be in a variety of
sizes. At times they will be larger than other sources of uncertainty, and
at other times they will be trivial in comparison.
At 09:53 AM 3/25/03 -0800, you wrote:
>If it was just Vancouver Island, I would put it down as "extent too large,
>find more precise reference" with the idea that it would be a waste of time
>to fiddle and annotate when the owning institution most likely had more
>data. However, I can georeference it with a 210 km extent and annotate it
>as above. In this case, the extent component of error swamps the the other
>components, esp coordinate (in)precision due the minute bounding box.
>Vancouver Is is the size of western Washington or Oregon which I did not
>georeference because of "extent too large" or "ambiguous reference" problem.
>
>----- Original Message -----
>From: "John Wieczorek" <tuco@SOCRATES.BERKELEY.EDU>
>To: <MAMMAL-Z-NET@USOBI.ORG>
>Sent: Tuesday, March 25, 2003 9:22 AM
>Subject: Re: [MANIS] Canadian Geographic Names Service
>
>
> > Dear XXXX, and all,
> >
> > For the sake of consistency all georeferencing should follow the MaNIS
> > guidelines. If an example of a reason not to neglect the extent
> > determinations would be helpful, here's one:
> >
> > Vancouver Island
> >
> > John
> >
> >
>>> Posting number 450, dated 25 Mar 2003 11:28:58
Date: Tue, 25 Mar 2003 11:28:58 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: plotting different datums
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
John: Just wondering how the plotting software is going to deal with the
lat/longs from different datums (or data)? BC will have about 55% WGS84
(NIMA) and 35% NAD27 (NAD27)? I can envision an overlay for each datum or
conversions prior to plotting? The difference isn't that great for North
America at least, so depending on scale/magnification it might not matter?
>>> Posting number 451, dated 25 Mar 2003 11:41:41
Date: Tue, 25 Mar 2003 11:41:41 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: plotting different datums
In-Reply-To: <BAY1-DAV56eACbzCWfb00030af3@hotmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
If, by plotting software, you mean applications running "on top of" MaNIS,
using our georeferenced data, then we envision a datum transformation layer
that can take our original data (in whatever datum) and transform it to the
datum of choice for the purpose of visualization or analysis. Coordinates
for which the datum is "not recorded" will not be transformed, hence the
need for the unknown datum uncertainty in the calculations.
In BC the uncertainty from not knowing whether the source is NAD27 or WGS84
ranges between about 70 m in the extreme SE to about 110 m in the extreme
NW. If we didn't do datum transformations, then those localities that are
well specified (uncertainties of about the same scale as the datum
transformation distance) would be quite obviously displaced from their
correct positions.
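A rough sketch of how an application might carry the unknown-datum term, in Python. The function name, its parameters, and the choice to combine the terms by simple addition are assumptions for illustration; only the 70 m and 110 m figures come from the message above.

```python
def total_uncertainty_m(other_uncertainty_m, datum, unknown_datum_uncertainty_m):
    """Return an overall uncertainty in metres for one coordinate pair.

    Assumption: the datum term is simply added to the other uncertainty.
    The caller supplies the datum term for the region (for BC the message
    above gives roughly 70 m in the SE up to roughly 110 m in the NW).
    """
    datum_unknown = datum is None or datum.strip().lower() == "not recorded"
    if datum_unknown:
        # The coordinates cannot be transformed, so the possible
        # displacement between datums is carried as extra uncertainty.
        return other_uncertainty_m + unknown_datum_uncertainty_m
    # A recorded datum can be transformed; no extra term is needed.
    return other_uncertainty_m


print(total_uncertainty_m(100, "not recorded", 110))
print(total_uncertainty_m(100, "WGS84", 110))
```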
At 11:28 AM 3/25/03 -0800, you wrote:
>John: Just wondering how the plotting software is going to deal with the
>lat/longs from different datums (or data)? BC will have about 55% WGS84
>(NIMA) and 35% NAD27 (NAD27)? I can envision an overlay for each datum or
>conversions prior to plotting? The difference isn't that great for North
>America at least, so depending on scale/magnification it might not matter?
>>> Posting number 452, dated 25 Mar 2003 14:33:57
Date: Tue, 25 Mar 2003 14:33:57 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: How to deal with non-standard characters in Locality description?
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
While it is not strictly related to "georeferencing"
rules, I recently discovered that special
characters in languages other than English pose a
problem. I also discovered that various
collections have different rules for how to deal with
these characters. For example, many European
languages use additional characters (think about
the German umlaut, or accent marks on certain
vowels). Some collections tried to use these
characters, while others used the closest
equivalent (instead of an umlaut a, an a is used).
Unfortunately, there are problems with both
approaches.
1. Due to changes in how these special characters are
mapped in computers (DOS-->Win3.1-->Win98 etc.),
characters can be lost. During my georeferencing
work, I found examples where characters were lost
(discarded) because they were special, non-standard
English characters.
2. One character can make a big difference:
certain locality names in Europe differ only by
one or two of these special characters. Some
locality names were ambiguous in Hungary because
those special characters were transcribed to an
English equivalent.
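The second problem can be illustrated with a small Python sketch. It strips diacritics the way a lossy transcription would and reports which names collapse onto the same plain spelling. The function names are hypothetical, and the sample strings in the usage line are illustrative only, not verified gazetteer entries.

```python
import unicodedata


def strip_diacritics(name):
    """Return the name with combining marks removed (e.g. accents, umlauts)."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def ascii_collisions(names):
    """Group names that become identical once diacritics are stripped.

    Returns a dict mapping each stripped spelling to the list of distinct
    original spellings, keeping only spellings shared by two or more names.
    """
    groups = {}
    for name in names:
        groups.setdefault(strip_diacritics(name), []).append(name)
    return {plain: originals for plain, originals in groups.items()
            if len(originals) > 1}


# Illustrative strings only, not real locality records.
print(ascii_collisions(["Ker\u00e9k", "Kerek", "Sz\u00e1r"]))
```

Any name reported here can no longer be told apart from its neighbours once the special characters are gone, which is the ambiguity described above.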
The reason why I am mentioning this problem is
that many people are starting to georeference non-US
localities and this kind of problem will become more
common. When I georeferenced
Hungarian localities, I tried to resolve some
pretty distorted locality names. When I found
these distorted names (some of them were transcribed to
the English alphabet and had typos in them), I
noted the correct spelling of the place name in
the "NamedPlace" field. However, I am not sure how
long these data will survive after some transfer
and conversion between various databases and
operating systems. At the time when I was working
on the Hungarian localities I felt that it was
important to make these notes because they explain
what assumptions I made when I resolved the
location of each locality.
Anyway, do we have rules for how to deal with
names that contain non-standard characters? Is it
worth "correcting" the spelling of locality names, or
shall we just find the geographic coordinates for
the records?
Sincerely,
XXXXX (spelled correctly with an accent
mark on the a)
>>> Posting number 453, dated 26 Mar 2003 10:06:18
Date: Wed, 26 Mar 2003 10:06:18 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: How to deal with non-standard characters in Locality
description?
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Nobody wants to say the "U word <http://www.unicode.org/> ."
XXXXXX
>>> Posting number 454, dated 26 Mar 2003 13:52:39
>>> Posting number 455, dated 27 Mar 2003 13:41:00
Date: Thu, 27 Mar 2003 00:13:41 -0600
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: How to deal with non-standard characters in Locality
description?
In-Reply-To: <01KTYKRN651Q9KO309@TTACS.TTU.EDU>
MIME-version: 1.0
Content-type: text/plain; charset=us-ascii; format=flowed
With regard to XXXXX's (with an accent mark on the a) suggestion and
question, I am wondering if the "LocalityAnnotation" field rather than
"NamedPlace" should be used for a note when I find an odd
transliteration of a specific geographic name from a non-alphabetic
script into the English alphabet and make a reasonable assumption based
on the apparently incorrect spelling. Every few lines, I came across
this sort of problem in georeferencing Japanese localities.
Sincerely,
XXXXXXXXXXXXX
>>> Posting number 456, dated 27 Mar 2003 09:16:43
Date: Thu, 27 Mar 2003 09:16:43 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: How to deal with non-standard characters in Locality
description?
In-Reply-To: <p05111b00baa834ed4cbe@[129.118.175.4]>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; format=flowed
Content-Transfer-Encoding: quoted-printable
XXXXX, XXXXX, and All,
The LocalityAnnotation field is meant to alert people at the source
institution that there is something amiss in their locality description
that they should investigate. Misspellings fall into this category, as do
internal inconsistencies in the description. The NamedPlace field should
contain the name as used in the source for the coordinates.
I understand that not everyone has a Unicode-capable database. For those of
us who don't, making some of these spelling updates will not be possible
right now. Nevertheless, it is well worth providing this information during
georeferencing, because some institutions will be able to take advantage of
it.
John
>>> Posting number 457, dated 31 Mar 2003 22:04:51
Date: Mon, 31 Mar 2003 22:04:51 -0800
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: more on diacritics and Unicode
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
John and all: I ran across this link while looking for some maps.
Check the "Diacritics, Special Characters and their Codes" link on the
NIMA - GNS page at
http://www.nima.mil/gns/html/diacritic.html
>>> Posting number 458, dated 1 Apr 2003 12:43:08
>>> Posting number 459, dated 8 Apr 2003 16:31:21
>>> Posting number 460, dated 11 Apr 2003 14:08:36
>>> Posting number 461, dated 12 Apr 2003 10:42:18
>>> Posting number 462, dated 15 Apr 2003 13:33:18
Date: Tue, 15 Apr 2003 13:33:18 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: GMT and the required DateLastModified field
In-Reply-To: <5.1.1.5.2.20030315123544.016f79f8@socrates.berkeley.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
Dear All: The manner that the required DateLastModified field
description is written seems to require the date last modified to
seconds.
DateLastModified
ISO 8601 date and time in UTC(GMT) when the record was last modified.
Example: "November 5, 1994, 8:15:30 am, US Eastern Standard Time"
would be "1994-11-05T13:15:30Z".
(http://dlp.cs.berkeley.edu/manis/darwin2ConceptInfo030315jrw.htm) .
However there appear to be six levels of date-time precision
(granularity) and it might be easier to just go with the third of
YYYY-MM-DD?
Formats (from http://www.w3.org/TR/NOTE-datetime).
Different standards may need different levels of granularity in the
date and time, so this profile defines six levels. Standards that
reference this profile should specify one or more of these
granularities. If a given standard allows more than one granularity,
it should specify the meaning of the dates and times with reduced
precision, for example, the result of comparing two dates with
different precisions.
The formats are as follows. Exactly the components shown here must be
present, with exactly this punctuation. Note that the "T" appears
literally in the string, to indicate the beginning of the time
element, as specified in ISO 8601.
Year:
YYYY (eg 1997)
Year and month:
YYYY-MM (eg 1997-07)
Complete date:
YYYY-MM-DD (eg 1997-07-16)
Complete date plus hours and minutes:
YYYY-MM-DDThh:mmTZD (eg 1997-07-16T19:20+01:00)
Complete date plus hours, minutes and seconds:
YYYY-MM-DDThh:mm:ssTZD (eg 1997-07-16T19:20:30+01:00)
Complete date plus hours, minutes, seconds and a decimal fraction of a
second
YYYY-MM-DDThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45+01:00)
For Washington State the time offset from GMT is -8 hr (-7 hr
daylight saving time).
To figure the offset in hr from GMT, here is a useful site:
http://greenwichmeantime.com/local/usa/.
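The example in the field description can be reproduced with a few lines of Python. This is a minimal sketch; the function name and the fixed-offset approach are assumptions, and the offset must be supplied by the caller (for example -5 for US Eastern Standard Time, -8 for Pacific Standard Time).

```python
from datetime import datetime, timedelta, timezone


def to_utc_iso8601(local_dt, utc_offset_hours):
    """Convert a naive local date-time to an ISO 8601 UTC string.

    local_dt is a datetime without time zone information;
    utc_offset_hours is the local offset from UTC (GMT) in hours.
    """
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    aware = local_dt.replace(tzinfo=local_zone)
    return aware.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# "November 5, 1994, 8:15:30 am, US Eastern Standard Time"
print(to_utc_iso8601(datetime(1994, 11, 5, 8, 15, 30), -5))
# prints 1994-11-05T13:15:30Z
```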
>>> Posting number 463, dated 15 Apr 2003 16:06:32
Date: Tue, 15 Apr 2003 16:06:32 -0500
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: Dave Vieglais <vieglais@KU.EDU>
Subject: Re: GMT and the required DateLastModified field
In-Reply-To: <p05100301bac216189669@[207.207.104.113]>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Hi All,
Perhaps one of the important things to realize here is the use of
this field. It is intended to provide a timestamp indicating that the
record data is guaranteed not to have changed since this time.
Providing such a timestamp enables one to quickly determine if a copy of
a record needs to be updated from the source.
The actual data for this field should be stored as a DATETIME type in
the database, and will retain the precision native to that type in the
database. This will generally be in the hundredths of a second ballpark.
The representation of this field in queries and in response records will
be one of the ISO8601 formats, most likely that used in the description.
Note that the representation of date/time in the request or response
records is completely independent of the representation that is visible
to the user through either the database management interface or through
portal applications retrieving this information.
XXXXXXXXXX wrote:
> Dear All: The manner that the required DateLastModified field
> description is written seems to require the date last modified to
> seconds.
Yes, but that does not mean that the data entry person needs to capture
this information to the nearest second. Indeed, this field should
generally be determined by the database system, and not be manually
entered at all.
>
> DateLastModified
>
> ISO 8601 date and time in UTC(GMT) when the record was last modified.
> Example: "November 5, 1994, 8:15:30 am, US Eastern Standard Time"
> would be "1994-11-05T13:15:30Z".
> (http://dlp.cs.berkeley.edu/manis/darwin2ConceptInfo030315jrw.htm) .
>
> However there appear to be six levels of date-time precision
> (granularity) and it might be easier to just go with the third of
> YYYY-MM-DD?
>
> Formats (from http://www.w3.org/TR/NOTE-datetime).
>
> Different standards may need different levels of granularity in the
> date and time, so this profile defines six levels. Standards that
> reference this profile should specify one or more of these
> granularities. If a given standard allows more than one granularity,
> it should specify the meaning of the dates and times with reduced
> precision, for example, the result of comparing two dates with
> different precisions.
>
> The formats are as follows. Exactly the components shown here must be
> present, with exactly this punctuation. Note that the "T" appears
> literally in the string, to indicate the beginning of the time
> element, as specified in ISO 8601.
>
>
> Year:
> YYYY (eg 1997)
> Year and month:
> YYYY-MM (eg 1997-07)
> Complete date:
> YYYY-MM-DD (eg 1997-07-16)
> Complete date plus hours and minutes:
> YYYY-MM-DDThh:mmTZD (eg 1997-07-16T19:20+01:00)
> Complete date plus hours, minutes and seconds:
> YYYY-MM-DDThh:mm:ssTZD (eg 1997-07-16T19:20:30+01:00)
> Complete date plus hours, minutes, seconds and a decimal fraction of a
> second
> YYYY-MM-DDThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45+01:00)
>
> For Washington State the time offset from GMT is -8 hr (-7 hr
> daylight saving time).
> To figure the offset in hr from GMT, here is a useful site:
> http://greenwichmeantime.com/local/usa/.
It is important that the correct timezone information is set on the
database server. This is because in most cases, dates and times stored
in databases use local time, and generally do not capture the offset
from GMT. Interface software (such as DiGIR) uses the system locale
information to determine the timezone of the system, and adjust incoming
requests to reflect local time so that the values can be compared with
the database entries. Similarly for outgoing data.
So:
1. Do not try to manually enter timestamps. These values should be
computed by the database.
2. Make sure that the timezone information is correctly set on the
machine serving the data, and that all mirrors of the data use the same
timezone.
3. Make sure that the clocks on your system(s) are up to date. Use some
time synching tools to make this happen automatically.
cheers,
Dave V.
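As an illustration of point 1 above, here is a minimal sketch of a database stamping its own records, using SQLite through Python's sqlite3 module. The table, column, and trigger names are hypothetical, and SQLite stands in for whatever database a collection actually runs. Note that SQLite's 'now' is in UTC, which sidesteps the local-time issue for this example only.

```python
import sqlite3

SCHEMA = """
CREATE TABLE specimen (
    id INTEGER PRIMARY KEY,
    locality TEXT,
    -- Set by the database when the row is created (UTC, ISO 8601).
    DateLastModified TEXT NOT NULL
        DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now'))
);

-- Refresh the stamp whenever the locality of a row is edited.
CREATE TRIGGER specimen_touch AFTER UPDATE OF locality ON specimen
BEGIN
    UPDATE specimen
    SET DateLastModified = strftime('%Y-%m-%dT%H:%M:%SZ', 'now')
    WHERE id = NEW.id;
END;
"""


def make_db():
    """Create an in-memory database whose rows are stamped automatically."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


conn = make_db()
conn.execute("INSERT INTO specimen (locality) VALUES ('Vancouver Island')")
conn.execute("UPDATE specimen SET locality = 'Vancouver Is.' WHERE id = 1")
print(conn.execute("SELECT DateLastModified FROM specimen").fetchone()[0])
```

Nobody types the timestamp; the data entry person only edits the locality.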
>>> Posting number 464, dated 15 Apr 2003 16:14:52
>>> Posting number 465, dated 15 Apr 2003 16:23:50
>>> Posting number 466, dated 15 Apr 2003 16:09:20
Date: Tue, 15 Apr 2003 16:09:20 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: GMT and the required DateLastModified field
In-Reply-To: <3E9C7458.9050305@ku.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
Dave: You lost me. I'm now thinking that this field has nothing to
do with the actual curatorial functions on the inhouse databases?
Here we have FileMaker Pro (inhouse) write data to the MaNIS server.
>Hi All,
>Perhaps one of the important things to realize here is the use of
>this field. It is intended to provide a timestamp indicating that the
>record data is guaranteed not to have changed since this time.
>Providing such a timestamp enables one to quickly determine if a copy of
>a record needs to be updated from the source.
Seems obvious, but the source is what? The MaNIS server or the
inhouse database?
........
>So:
>1. Do not try to manually enter timestamps.
Meaning do not have auto enter date and time fields (and Datetime) in
the inhouse database stamp records?
>These values should be
>computed by the database.
Meaning the server software on the MaNIS server is going to stamp the
record as it is written to the server from the inhouse database?
Thus the inhouse database (FMP here) does not need the time stamp?
>2. Make sure that the timezone information is correctly set on the
>machine serving the data, and that all mirrors of the data use the same
>timezone.
Roger
>3. Make sure that the clocks on your system(s) are up to date. Use some
>time synching tools to make this happen automatically.
Roger, roger
>
>cheers,
> Dave V.
>
>>> Posting number 467, dated 15 Apr 2003 17:08:17
Date: Tue, 15 Apr 2003 17:08:17 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: Stan Blum <sblum@CALACADEMY.ORG>
Subject: Re: GMT and the required DateLastModified field
In-Reply-To: <p05100304bac23e0efa12@[207.207.104.113]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
At 04:09 PM 4/15/03 -0700, you wrote:
>Dave: You lost me. I'm now thinking that this field has nothing to
>do with the actual curatorial functions on the inhouse databases?
>Here we have FileMaker Pro (inhouse) write data to the MaNIS server.
>
>>Hi All,
>>Perhaps one of the important things to realize here is the use of
>>this field. It is intended to provide a timestamp indicating that the
>>record data is guaranteed not to have changed since this time.
>>Providing such a timestamp enables one to quickly determine if a copy of
>>a record needs to be updated from the source.
>
> Seems obvious, but the source is what? The MaNIS server or the
>inhouse database?
XXXX et al.,
My understanding of this field is that its purpose is to lighten the load
of (ro)bots or spiders that will be going around to collection databases
and indexing their contents. For example, the folks at ITIS Canada built
something that indexed collections for names, and so when someone searched
for a particular taxon, they could respond with links to databases holding
relevant specimen records. Spiders like that don't want to have to copy
ALL the relevant data every time they visit, but only data that have
changed since they last visited. They can keep track of when they last
visited; we just need to be able to respond to a query for everything "with
a LastModifiedDate newer than X".
Like most of you, we at CAS have our real databases separated from the
copies that are being queried from the Web. Unfortunately, our original
databases don't have time-stamps to record when individual records were
last edited. That means every time I update our web version, or
"DiGIR-resource", I export the whole thing. I can, however, make life
easier for the spiders, by adding a time-stamp field to the web version and
setting it equal to the date-time of the upload. In other words, I do the
upload to the web version (wiping out everything that was there before),
and then run an update query that sets "DateLastModified" to Now. Only a
small portion of our records may be new, but I can't tell the spiders which
ones, so they have to "get them all". I can tell them, when they come back
next, that nothing has been updated since they visited; they have the most
recent version that's available.
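The kind of query a spider would make can be sketched in Python as a filter over records. The record layout and function names are assumptions for illustration; a real provider would answer the same question with a database query.

```python
from datetime import datetime, timezone

STAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_stamp(stamp):
    """Parse an ISO 8601 UTC string such as 2003-04-15T23:08:00Z."""
    return datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)


def modified_since(records, last_visit):
    """Return the records whose DateLastModified is newer than last_visit."""
    cutoff = parse_stamp(last_visit)
    return [rec for rec in records
            if parse_stamp(rec["DateLastModified"]) > cutoff]


# Hypothetical records: after a full re-upload every record carries the
# upload time, so a spider that visited earlier must fetch them all, and
# a spider that visited later gets nothing.
records = [
    {"CatalogNumberText": "1", "DateLastModified": "2003-04-15T23:08:00Z"},
    {"CatalogNumberText": "2", "DateLastModified": "2003-04-15T23:08:00Z"},
]
print(len(modified_since(records, "2003-04-01T00:00:00Z")))
print(len(modified_since(records, "2003-04-16T00:00:00Z")))
```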
By the way, our DiGIR resource is in Microsoft SQL Server and the data type
I'm using for the DateLastModified field is smalldatetime. Its precision
is only to the nearest minute, so when I ask it to spit out date time in
the (its version of the) ISO format, it pads the seconds with zeros. I
think that's sufficient for our purposes.
Right now (daylight savings time) I have to add 8 hours to NOW to get
the correct value. What I don't have yet is the right set up for dealing
with daylight savings time automatically...
This stuff is always so complicated.
-Stan
>>> Posting number 468, dated 17 Apr 2003 00:54:19
Date: Thu, 17 Apr 2003 00:54:19 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Providing the DateLastModified field
In-Reply-To: <5.2.0.9.0.20030415161716.04e6c6e0@mail.calacademy.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
For those of you whose data migrations I design and implement (i.e., those
conforming to my view of the world - heheh), the DateLastModified will be
constructed during the migration process. Each time the data get migrated,
I will compare the contents of each newly migrated specimen record to the
contents of the corresponding specimen record from the previous migration.
If the record has changed since the last migration, or if it is new since
the last migration, I will set the DateLastModified to the date when the
migration is made. If the record is unchanged, I will retain the previous
DateLastModified. I'm considering recording the nature of the changes and
making that resource available as well, but I haven't fully thought through
the implications or utility of creating such a resource. Comments are welcome.
For those subscribers who are interested, my proposed implementation for
determining the DateLastModified will work properly only for those records
that have a unique identifier (i.e., a unique combination of
InstitutionCode, CollectionCode, and CatalogNumberText, within a resource).
Records that have a duplicate identifier within the resource will all have
the DateLastModified set to the most recent migration date.
John
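The procedure described above can be sketched in Python as follows. This is only an illustration under stated assumptions: records are plain dictionaries, the function names are hypothetical, and the actual migration code is not shown in this message.

```python
from collections import Counter

KEY_FIELDS = ("InstitutionCode", "CollectionCode", "CatalogNumberText")


def record_key(rec):
    """Identifier of a specimen record within a resource."""
    return tuple(rec[field] for field in KEY_FIELDS)


def without_stamp(rec):
    """Record content with the DateLastModified field left out."""
    return {k: v for k, v in rec.items() if k != "DateLastModified"}


def assign_date_last_modified(new_records, previous_by_key, migration_date):
    """Stamp each newly migrated record.

    previous_by_key maps identifiers to the records of the previous
    migration (including their DateLastModified).
    """
    counts = Counter(record_key(rec) for rec in new_records)
    stamped = []
    for rec in new_records:
        key = record_key(rec)
        old = previous_by_key.get(key)
        out = without_stamp(rec)
        if counts[key] > 1 or old is None:
            # Duplicate identifier, or new since the last migration.
            out["DateLastModified"] = migration_date
        elif without_stamp(old) != out:
            # Content changed since the last migration.
            out["DateLastModified"] = migration_date
        else:
            # Unchanged: retain the previous stamp.
            out["DateLastModified"] = old["DateLastModified"]
        stamped.append(out)
    return stamped
```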
>>> Posting number 469, dated 17 Apr 2003 01:49:19
Date: Thu, 17 Apr 2003 01:49:19 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: two questiones
In-Reply-To: <31320.1050426708@www21.gmx.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
XXX has some good questions, the discussion of which may be of benefit to
many of you. I've interspersed my commentary with the original message.
>I encountered a few more cases new to me :)) and would highly appreciate your
>suggestions on how to deal with them. Compared to Kenya, a lot of records in
>Guatemala have very precise distances, and a lot of records represent one
>of the cases listed below.
>
>1 What value of the "Distance precision" shall I enter if the distances
>given along orthogonal directions are of different precision, like 2 km E
>and 6.5 km S San Marcos? Is the "distance precision" value 1 km or 0.5 km in
>this case? In other records it is like 2.0 km E and 6.5 km S. That seems to
>be clearer. In a few other records they can differ even more, like 7 km E
>and 1.25 km N. I am not sure how to proceed in these cases.
Use the more precise measurement as the gauge. The premise is that the one
who recorded the data was cognizant of that level of precision and executed
the measurement with equal precision even though the record does not
reflect it. So, for your example "7 km E and 1.25 km N" the distance
precision would be 1/4 km.
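A small Python sketch of this rule follows. The list of candidate precision units (1, 1/2, 1/4, 1/10, 1/100) and the function names are assumptions made for the example; whole numbers are all treated as precise to one unit, which is a simplification.

```python
from fractions import Fraction

# Candidate precision units, coarsest first (an assumption for this sketch).
UNITS = [Fraction(1), Fraction(1, 2), Fraction(1, 4),
         Fraction(1, 10), Fraction(1, 100)]


def implied_precision(distance_text):
    """Return the coarsest unit of which the written distance is a multiple."""
    value = Fraction(distance_text)
    for unit in UNITS:
        if (value / unit).denominator == 1:
            return unit
    raise ValueError("no candidate unit fits " + distance_text)


def distance_precision(*distance_texts):
    """Use the more precise (smaller) of the implied precisions."""
    return min(implied_precision(text) for text in distance_texts)


print(distance_precision("7", "1.25"))   # prints 1/4
print(distance_precision("2", "6.5"))    # prints 1/2
```

Distances are passed as written in the locality so that the decimal digits are read exactly. Under this sketch "2.0" is treated the same as "2".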
>2 A lot of distances are given along the road with the direction indicated,
>like 15 km S (by road) Santa Anna. We also have the same distance to the same
>place measured by air. Shall I try to follow/measure the road distance?
Here's my "official" stance on the subject - an excerpt from a paper I've
written for the International Journal of GIS with Qinghua Guo and Robert
Hijmans, both at UC Berkeley.
"3.4 Using offsets
Offsets generally consist of combinations of distances and directions from
a named place. Some locality descriptions explicitly state the path to
follow when measuring the offset (e.g. 'by road', 'by river', 'by air', 'up
the valley', etc.). In this case the georeferencer should follow the path
designated in the description using a map with the largest available scale
to find the coordinates of the offset from the named place. The smaller the
scale of the map used, the more the measured distance on the map is likely
to overshoot the intended target.
It is sometimes possible to infer the offset path from additional
supporting evidence in the locality description. For example, the
locality '58 km NW of Haines Junction, Kluane Lake' supports a measurement
by road since the final coordinates by that path are nearer to the lake
than going 58 km NW in a straight line. By convention, localities
containing two offsets in orthogonal directions (e.g. '10 km S and 5 km W
of Bikini Atoll') are always linear measurements.
Sometimes the environmental constraints of the collected specimen can imply
the method of measurement. For example, '30 km W of Boonville' if taken as
a linear measurement, would lie off the coast of northern California in the
Pacific Ocean. If the locality refers to the collection of a terrestrial
mammal, it is likely that the collector followed the road heading west out
of Boonville, winding toward the coast, in which case the animal was
collected on land.
If neither of the above methods distinguishes the offset method, it
may be necessary to refer to more detailed supplementary sources, such as
field notes or itineraries, to determine this information. Supplementary
sources often do not exist, or do not contain additional information,
making it difficult to distinguish between offsets meant to be along a path
and those meant to be along a straight line. A particularly conservative
approach would be to not georeference localities that fall into this
category and instead record a comment explaining the reasoning. However,
value can still be derived from georeferencing localities that suffer from
the ambiguity described above. One solution is to determine the coordinates
based on one or the other of the offset paths. Another solution is to use the
midpoint between all possible paths. There may be discipline-specific
reasons to choose one solution over another, but the georeferencer should
always document the choice and accommodate the ambiguity in the uncertainty
calculations. "
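For the straight-line case with two orthogonal offsets mentioned in the excerpt, the coordinates can be estimated with a short Python sketch. It assumes a spherical Earth with a mean radius of 6371 km and small offsets, so it is an approximation and not the method of the paper. The function name and the starting coordinates in the usage line are arbitrary placeholders.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation


def offset_point(lat, lon, north_km=0.0, east_km=0.0):
    """Return (lat, lon) in decimal degrees after linear offsets.

    Negative north_km means south; negative east_km means west.
    Reasonable only for offsets that are small relative to the Earth
    and for starting points away from the poles.
    """
    dlat = math.degrees(north_km / EARTH_RADIUS_KM)
    # A degree of longitude shrinks with the cosine of the latitude.
    parallel_radius_km = EARTH_RADIUS_KM * math.cos(math.radians(lat))
    dlon = math.degrees(east_km / parallel_radius_km)
    return lat + dlat, lon + dlon


# '10 km S and 5 km W' of a named place at placeholder coordinates.
print(offset_point(10.0, 20.0, north_km=-10.0, east_km=-5.0))
```

An offset 'by road' cannot be computed this way; it has to be measured along the path on a map, as the excerpt says.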
>3 A lot of records in Guatemala refer to farms (called "fincas"). Most of
>them have their own coordinates. In the records, in addition to the farm
>name, distances from a larger named place are given. The problem is that
>the distances from the same named place are different. I assume that the
>animals were captured in different areas of the farm. The differences in
>the distances usually are not very big. The question is: shall I refer to
>the farm as such, or calculate coordinates from the distances to the
>named place?
>Here are two examples:
>Finca Santa Julia, 1.5 km E and 0.75 km S San Rafael Pie de La Cuesta
>and
>Finca Santa Julia, 1.5 km SE (by air) San Rafael Pie de La Cuesta
>as well as
>Finca Santa Julia, ca. 1.25 km E, 0.75 km S (by air) San Rafael Pie de La
>Cuesta
This one is a very interesting problem. One is left to wonder, from such
descriptions, if the collector meant "all of Finca Santa Julia, which can
be found at 1.5 km E and 0.75 km S San Rafael Pie de La Cuesta" or "1.5 km
E and 0.75 km S San Rafael Pie de La Cuesta, which happens to be in Finca
Santa Julia."
The first question for me as a georeferencer is, "Can I find Finca Santa
Julia in a resource that tells me how big the Finca is?" If I cannot, then
I will ignore that part of the locality and use the alternative description
and I'll note "can't find Finca Santa Julia, georeferenced based on
offsets." If I can find Finca Santa Julia and it has a smaller extent than
the precision of the offsets, I will ignore the offset information and use
the location and size of the Finca. If I can find Finca Santa Julia and it
has a larger extent than the precision of the offsets, I will use the
intersection of the Finca with the bounding box or circle defined by the
offsets and their precision as my named place and calculate my uncertainty
based on the rules for a locality of the type "Named Place Only."
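The three cases can be written down as a small decision function. This is a Python sketch with hypothetical names; the handling of the case where the extent exactly equals the precision is an assumption, since the text above does not cover it.

```python
def choose_strategy(place_extent_km, offset_precision_km):
    """Pick a georeferencing approach for 'named place plus offsets'.

    place_extent_km is None when the named place (e.g. the finca)
    cannot be found in any resource that gives its size.
    """
    if place_extent_km is None:
        # Cannot find the place: georeference from the offsets and note it.
        return "offsets only"
    if place_extent_km < offset_precision_km:
        # The place pins the locality down better than the offsets do.
        return "named place only"
    # Larger extent (or, by assumption, equal): intersect the place with
    # the region defined by the offsets and their precision.
    return "intersection of named place and offset region"


print(choose_strategy(None, 0.25))
print(choose_strategy(0.1, 0.25))
print(choose_strategy(2.0, 0.25))
```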
>4 The last one. How to deal with records where animals were captured at a
>particular place (coordinates given) but died in captivity? Shall I reference
>them like the others, or shall I add some remarks to the "CaptiveFlag"?
This is another interesting question. We're not concerned about where the
animal died per se. Instead, we're concerned about whence it was taken from
the wild. It will be difficult, when georeferencing without reference to
the specimens, to know if the specimens collected at that locality were
captive at the time or not, unless the locality itself says something
about captivity. Here are two fictitious examples:
"captive from fox farm 2 km E of Kluane Lake"
"fox farm 2 km E of Kluane Lake"
The first example is trivial - the CaptiveFlag should be "yes." The second
locality, however, could equally well refer to a captive fox from the fox
farm or to an unlucky itinerant ground squirrel that was collected there.
The bottom line is that the CaptiveFlag is not really a proper attribute of
the locality itself, but of the collecting event that associates the
locality with a specimen. I put it in the gazetteer to identify those
locality records that are known to refer only to collections of captive
animals.
>sorry for disturbance
Not at all. These were very good questions.
John
>Thanks
>
>XXX
>>> Posting number 470, dated 18 Apr 2003 15:55:53
>>> Posting number 471, dated 20 Apr 2003 07:47:23
Date: Sun, 20 Apr 2003 07:47:23 -1000
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Re: Providing the DateLastModified field
In-Reply-To: <5.1.1.5.2.20030417003736.0176ba08@socrates.berkeley.edu>
MIME-version: 1.0
Content-type: text/plain; charset=us-ascii
Content-transfer-encoding: 7bit
Hi John (and others),
> I'm considering recording the nature of the changes and
> making that resource available as well, but I haven't fully
> thought through
> the implications or utility of creating such a resource. Comments
> are welcome.
For what it's worth, I'm setting up our Access databases such that every
data edit is logged at the time of edit (similar to transaction logs of more
robust DBMS apps). The table name is "EditLog", and the fields are:
Field Name     Type        Description
------------------------------------------------------------------------
EditLogID      AutoNumb*   Unique Primary Key
TableName      Text (255)  Name of table in which record was edited
FieldName      Text (255)  Name of field in which record was edited
PKID           Long Int.   Unique Primary Key of record that was edited
PreviousValue  Memo        Previous value of Field in record that was edited
EditorID       Long Int.   ID number of the Person who edited the record
TimeStamp      Date/Time   Date and time when record was edited
*AutoNumber field automatically assigns a unique random long integer value
to each new record.
Thus, with a single table, I can track all record transactions. This
structure works best if each table has a Long Integer as its primary key
(usually surrogate), but with some alterations to the PKID field(s) you
could probably accommodate multiple-field and other "natural" primary key
values. Only one transaction is logged for each new record addition
(FieldName="*"; PreviousValue="ADDED"). When records are deleted, I log the
value of all non-null fields for that record, and log an additional
transaction record with FieldName="*"; PreviousValue="DELETED".
Whenever a record is updated, Code is triggered to interrogate the record
for each field whose value has changed, so transaction logs are created for
only those fields (except for a DELETE transaction, in which case all
non-null values are logged). Although I apply this logging Code at the time
of record edit, the same code could be modified to work at the time of data
migration.
Note that only the previous values are logged. Current values are obtained
from the active data tables. The Code currently skips fields whose previous
value is NULL; the assumption being that if no edit log record exists for
the field, its previous value was NULL. This approach is flawed for the
case where a record had a value, then was changed to NULL, then was later
changed to another value. The first and second values would be logged, but
no log record would exist of the intermediate NULL state (and thus no way to
know that the field was NULL for a period of time in-between). I'm
considering several alternative solutions to this problem.
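For readers who want to experiment with this scheme outside Access, the
logging described above can be sketched roughly as follows in Python with
SQLite. This is only an illustration under assumptions, not the poster's
actual code: the function names create_edit_log and logged_update are
hypothetical, the field names are taken from the table above, and table and
field names are assumed to come from trusted code rather than user input.
Unlike the Access code described above, this sketch also writes a log row
when the previous value was NULL, which is one conceivable way around the
"intermediate NULL" gap and not necessarily one of the alternatives the
poster has in mind.

```python
import sqlite3
from datetime import datetime, timezone


def create_edit_log(conn):
    """Create an EditLog table with the fields listed in the posting."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS EditLog (
               EditLogID     INTEGER PRIMARY KEY AUTOINCREMENT,
               TableName     TEXT NOT NULL,
               FieldName     TEXT NOT NULL,
               PKID          INTEGER NOT NULL,
               PreviousValue TEXT,
               EditorID      INTEGER NOT NULL,
               TimeStamp     TEXT NOT NULL
           )"""
    )


def logged_update(conn, table, pk_field, pkid, new_values, editor_id):
    """Update one record, logging the previous value of each changed field.

    Assumption: `table`, `pk_field` and the keys of `new_values` are trusted
    identifiers, because they are interpolated into the SQL text.
    """
    fields = list(new_values)
    row = conn.execute(
        f"SELECT {', '.join(fields)} FROM {table} WHERE {pk_field} = ?",
        (pkid,),
    ).fetchone()
    if row is None:
        raise LookupError(f"no record {pkid} in {table}")
    now = datetime.now(timezone.utc).isoformat()
    for field, old in zip(fields, row):
        new = new_values[field]
        if old == new:
            continue  # unchanged fields produce no log row
        # A NULL previous value is logged too (stored as NULL), so a later
        # reader can tell that the field was empty before this edit.
        conn.execute(
            "INSERT INTO EditLog (TableName, FieldName, PKID, PreviousValue,"
            " EditorID, TimeStamp) VALUES (?, ?, ?, ?, ?, ?)",
            (table, field, pkid, old, editor_id, now),
        )
        conn.execute(
            f"UPDATE {table} SET {field} = ? WHERE {pk_field} = ?",
            (new, pkid),
        )
    conn.commit()
```

Current values still live only in the data table; the log holds previous
values, as in the original design.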
In any case, I've found it to be an extremely useful generic feature to add
to databases to maintain an edit history of each field.
Anyone interested in more details, please feel free to contact me.
XXXXX
>>> Posting number 472, dated 20 Apr 2003 22:23:26
Date: Sun, 20 Apr 2003 22:23:26 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: A caution using NIMA and other gazetteers that are incomplete.
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Dear All:
I downloaded the BC lat/longs from NIMA and did an automated then a
semi-automated filtered pass through the BC records. There were 2,515
records in the NIMA dataset and I should have suspected that it was
incomplete. Initially I was pleased as I got about 60% apparently good
hits. But then I downloaded the BCGNIS gazetteer of 43,690 records and
started looking up records not found in NIMA. It wasn't long until I
realized that many (more than half) of the apparently singular and
apparently unambiguous hits from NIMA were not singular and hence ambiguous.
An example:
There are two Vernons in BCGNIS, but only one in NIMA:
Vernon, City 50.2583 -119.2667
Vernon, Community 50.0333 -126.3500
There were about 45 Vernon records in the MaNIS download and 37 had only
Vernon in SpecLoc.
Most are not this bad but only involve a few records with a possible city or
community named after a geographical feature (lake, creek, bay, mount).
In the BCGNIS records, there are 3,198 placenames that occur more than once
(city, community, locale, lake, etc.). A frequency count of the multiples
(I was curious):
Occurrences of   Number of such
the same name    names in BCGNIS
      2              2174
      3               568
      4               209
      5                96
      6                59
      7                30
      8                16
      9                18
     10                 7
     11                 4
     12                 4
     13                 3
     14                 3
     16                 2
     17                 2
     18                 1   Summit Lake (1 community, 15 lakes, 2 localities)
     19                 1   Bear Creek (17 Creeks, 1 locality, 1 railway point)
     20                 1   Long Lake (all Lakes)
So, use GNIS for Canada and avoid NIMA.
The situation in the US is somewhat better with a county to assist in
filtering. However, it is circular to use county to verify the SpecLoc. We
assume county was entered with the locality but this might not be the case
and could have been added at any time to the computerized record.
Both NIMA and BCGNIS are to the nearest minute so there are no differences
in precision.
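The check described in this posting (finding out which names occur more than
once in a gazetteer before trusting an apparently singular hit) can be
sketched in Python as below. This is a minimal illustration, not the script
actually used: the function names are hypothetical, and each gazetteer row is
assumed to be a tuple whose first element is the place name, for example
("Vernon", "City", 50.2583, -119.2667).

```python
from collections import Counter


def name_multiplicity(gazetteer_rows):
    """Count how many gazetteer entries share each (case-folded) name."""
    return Counter(name.strip().lower() for name, *_ in gazetteer_rows)


def ambiguous_names(gazetteer_rows):
    """Names that occur more than once, with their number of occurrences."""
    counts = name_multiplicity(gazetteer_rows)
    return {name: n for name, n in counts.items() if n > 1}


def frequency_of_multiples(gazetteer_rows):
    """How many names occur 2 times, 3 times, and so on."""
    counts = name_multiplicity(gazetteer_rows)
    return Counter(n for n in counts.values() if n > 1)
```

A locality that matches a name returned by ambiguous_names would then be
set aside for manual review rather than accepted automatically.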
>>> Posting number 473, dated 23 Apr 2003 13:05:16
>>> Posting number 474, dated 27 Apr 2003 17:29:04
Date: Sun, 27 Apr 2003 17:29:04 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Old GeorefCalculator made obsolete
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear All,
Having received no bug reports on the new GeorefCalculator since its
release on 14 Mar 2003, I decided to remove all references to the old
georeferencing calculator from the MaNIS web pages. All links to the
calculator are now supposed to point to the following URL:
http://elib.cs.berkeley.edu/manis/gc.html
Thanks for not breaking it!
John
>>> Posting number 475, dated 30 Apr 2003 16:02:52
>>> Posting number 476, dated 1 May 2003 13:59:55
>>> Posting number 477, dated 5 May 2003 14:18:11
>>> Posting number 478, dated 9 May 2003 13:11:22
>>> Posting number 479, dated 13 May 2003 10:02:58
>>> Posting number 480, dated 13 May 2003 16:29:35
>>> Posting number 481, dated 13 May 2003 16:50:29
>>> Posting number 482, dated 16 May 2003 12:52:08
>>> Posting number 483, dated 23 May 2003 13:10:51
>>> Posting number 484, dated 29 May 2003 14:30:19
>>> Posting number 485, dated 29 May 2003 16:34:24
>>> Posting number 486, dated 29 May 2003 11:48:00
>>> Posting number 487, dated 29 May 2003 15:23:01
Date: Thu, 29 May 2003 15:23:01 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Martinique and Guadeloupe
In-Reply-To: <5.2.1.1.0.20030529163213.00ab4888@packrat.musm.ttu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXXX, and all,
This mix up on my part is important to understand as we start claiming "the
dregs" of localities. The problems encountered are especially likely to
occur with islands, and even more likely for islands that are possessions
of another country.
>Do you mean that there were some Guadeloupe localities that showed up when
>you downloaded France? TTU has already finished the regular Guadeloupe
>localities. Please contact me.
Yes, there were. If you look on the country list you'll find that France
and Guadeloupe appear independently. When they downloaded based on
country=France, there were 83 records with Guadeloupe in the StateProvince
field. These are completely distinct from the records that would be
downloaded by a query on country=Guadeloupe, of which there are 17 records.
As it turns out, there are 3 Guadeloupe records that aren't located by
either of these "country=" queries because the word Guadeloupe is in some
other field. The only sure way to get them all is to query on Higher
Geography contains Guadeloupe, which returns a total of 103 records.
So, from here on out it is a good idea to put the region you are trying to
match into the Higher Geography field. Please, also let me know the number
of localities that match your claim, so that I can check that nothing is
being "orphaned." Such is the price of heterogeneous data structures. This
is exactly the kind of thing that makes people fanatic about
standardization. (I'm not one of those people, for the record).
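The difference between the two kinds of query can be illustrated with a
small Python sketch. It assumes, hypothetically, that each record is a
dictionary with LocalityID, Country, StateProvince and HigherGeography keys,
and that HigherGeography holds the concatenated higher geography text; the
real MaNIS download structure may differ, and the function names are made up
for this example.

```python
def claim_by_country(records, region):
    """Records whose Country field equals the region exactly."""
    return [r for r in records if r.get("Country") == region]


def claim_by_higher_geography(records, region):
    """Records whose HigherGeography text contains the region anywhere."""
    needle = region.lower()
    return [
        r for r in records
        if needle in (r.get("HigherGeography") or "").lower()
    ]


def orphaned(records, region):
    """Records found by the 'contains' search but missed by Country=region."""
    found = {r["LocalityID"] for r in claim_by_country(records, region)}
    return [
        r for r in claim_by_higher_geography(records, region)
        if r["LocalityID"] not in found
    ]
```

Comparing the length of the list from claim_by_higher_geography with the
number of localities in a claim is one way to spot orphans early.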
Sorry for the mix up,
John
>>> Posting number 488, dated 30 May 2003 14:46:26
>>> Posting number 489, dated 31 May 2003 17:09:52
>>> Posting number 490, dated 2 Jun 2003 06:30:41
>>> Posting number 491, dated 2 Jun 2003 10:05:19
>>> Posting number 492, dated 2 Jun 2003 12:21:30
>>> Posting number 493, dated 2 Jun 2003 13:16:08
>>> Posting number 494, dated 2 Jun 2003 14:04:52
>>> Posting number 495, dated 2 Jun 2003 22:38:22
>>> Posting number 496, dated 3 Jun 2003 10:34:20
Date: Tue, 3 Jun 2003 10:34:20 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: Last doc
Comments: To: Juan Carlos Hernández Barrios
<jhernan@xolo.conabio.gob.mx>
In-Reply-To: <5.2.0.9.0.20030602160747.00b4bb30@xolo.conabio.gob.mx>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; format=flowed
Content-Transfer-Encoding: quoted-printable
Juan Carlos,
I have confirmed that the file has been received, and that all of the files
for all of the states that you claimed for georeferencing for MaNIS are
archived at MVZ.
I'd like to extend sincere appreciation for your efforts from all of the
MaNIS participants and from those who will benefit in perpetuity from your
contribution to this project.
Thank you,
John
At 08:45 AM 6/3/03 -0500, you wrote:
> Hello John
> After several months, we've finished the Mexican records that we had
> selected from the whole country; originally we had selected 12,098 records
> out of about 30 thousand.
> In this document: CONABIO-Mexico-2003-6-2.txt,
> we send you the records from Sonora and Sinaloa,
>
> They sum to a total of 3,079:
> 2,482 georeferenced and
> 597 not georeferenced
>
> This is our last delivery; let us know if you have any more
> questions about the files that we have sent to you.
>
> Cheers.
> Juan Carlos Hernández Barrios
> CONABIO
>>> Posting number 497, dated 3 Jun 2003 15:33:12
Date: Tue, 3 Jun 2003 15:33:12 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: problems with Microsoft Excel
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Hello fellow georeferencers,
I have been working on the MaNIS project for the past year and am
currently georeferencing Baja California, Mexico. I usually work in
Microsoft Excel. A few weeks ago I made the unfortunate discovery that
at some point during my georeferencing process I sorted my Baja Excel
file while a few columns were hidden. In many versions of Excel (listed
below) a sort while columns are hidden results in the visible columns
being sorted without the hidden columns, thus scrambling the data. In
my situation I had hidden the LocalityID and the CollectionCode columns
so that after the sort (or multiple sorts) these columns were no longer
associated with the locality information and my georeferenced results.
I was able to fix the problem by re-associating the LocalityID and
CollectionCode from original MaNIS downloaded files with the locality
description in my completed georeferenced data. I am grateful however
that I discovered the issue before I sent the finished data to John with
mismatched data. John informed me that they are not checking for this
type of problem as the data comes in, but that their data checking
methods would reveal such a problem later on in the process, at which
point he could re-associate localities with their LocalityIDs, though
with a great deal of effort.
I am sure all of you are aware of the sorting issues involved when
working in Excel and have not made the same mistake that I did. If you
are not already doing so, I suggest that each georeferencer spot check
their data against the original files before submitting them to John. I
could easily see how the problem I encountered would have gone unnoticed
if I hadn't referred back to the original file to check my work.
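A spot check of this kind can also be automated outside Excel. The sketch
below is only an illustration, with hypothetical function and variable
names: it assumes both files have been read into lists of dictionaries (for
example with Python's csv.DictReader) and that the columns named in
KEY_COLUMNS exist in both. It reports every finished row whose combination
of identifying columns never occurred in the original download, which is
what a scrambled sort produces.

```python
# Columns assumed to identify a locality record; adjust to the real headers.
KEY_COLUMNS = ("LocalityID", "CollectionCode", "HigherGeog", "SpecLocality")


def mismatched_rows(original_rows, final_rows, key_columns=KEY_COLUMNS):
    """Return final rows whose key-column values do not occur together
    in any row of the original download."""
    original_keys = {
        tuple(row[col] for col in key_columns) for row in original_rows
    }
    return [
        row for row in final_rows
        if tuple(row[col] for col in key_columns) not in original_keys
    ]
```

An empty result means every LocalityID is still attached to the locality
text it was downloaded with; extra columns added during georeferencing are
ignored.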
If you are using Excel XP this problem is no longer an issue as it was
revised so that now hidden columns are sorted with other columns. In
the versions of Microsoft Excel listed below, the sorting feature does
not sort hidden rows or columns.
Microsoft Excel 2000
Microsoft Excel 2002
Microsoft Excel 97 for Windows
Microsoft Excel for Windows 95 7.0
Microsoft Excel for Windows 95 7.0a
Microsoft Excel for Windows 5.0
Microsoft Excel for Windows 5.0c
Microsoft Excel 98 Macintosh Edition
Microsoft Excel for the Macintosh 5.0
Microsoft Excel for the Macintosh 5.0a
Feel free to contact me if you want/need more specific information about
this problem.
XXXXXXXXXX
>>> Posting number 498, dated 4 Jun 2003 06:53:17
Date: Wed, 4 Jun 2003 06:53:17 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From:
Subject: Excel features
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
XXX et al: Select the entire worksheet or row to avoid the partial sort
problem. Avoid selecting cells for sorts. Guess you know that, but others
might not. Not sure how far back this works, but it does in Excel 2000.
Another nifty comparison via Excel is a cell-by-cell comparison. Prior to
submitting data to JW, I sort a copy of the original download and the final
copy the same way. Then I enter an IF statement in cell 1 of an unused
column of a worksheet:
=if(Final!A:A=OriginalDwnld!A:A,"", 1)
where I click col A in Final and col A in OriginalDwnld. Dragging to the right
will do the same for cols B-D. Then a return does the comparison, and a
fill down of the entire col will compare entire cols. If the values are
the same, the cell will be blank; if different, a 1 will show. Text can also
be substituted for the result values blank and 1. For example, "same" and
"different" (quotes needed) give the result in words. This allows me to check
that I have not made any changes to the original MaNIS data (MaNIS site,
collection, HigherGeog, SpecLocality) prior to submitting.
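For anyone not working in Excel, the same cell-by-cell idea can be sketched
in Python. This is a rough equivalent of the IF formula above, not a
replacement for it: the function names are hypothetical, each table is
assumed to be a list of rows (lists of cell values) restricted to the
original MaNIS columns, and both tables are sorted the same way first.

```python
def sorted_copy(table, key_index=0):
    """Return a copy of the table sorted on one column (default: first)."""
    return sorted(table, key=lambda row: row[key_index])


def compare_cells(original, final, same="", different=1):
    """Compare two tables cell by cell, like the IF formula filled down.

    Each output cell holds `same` where the values match and `different`
    where they do not. Rows are compared position by position, so both
    tables must already be in the same order; if two rows differ in
    length, only the cells they have in common are compared.
    """
    if len(original) != len(final):
        raise ValueError("tables have different numbers of rows")
    return [
        [same if a == b else different for a, b in zip(orig_row, final_row)]
        for orig_row, final_row in zip(original, final)
    ]
```

Passing same="same" and different="different" gives the result in words,
as in the Excel variant described above.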
Cell, column, row or range locking can also be done via the
Format>Cell>Protection tab, then Tools>Protection>Document. Unlocked cells
can be changed, but locked cells cannot be altered.
>>> Posting number 499, dated 4 Jun 2003 16:50:58
>>> Posting number 500, dated 4 Jun 2003 17:45:42
>>> Posting number 501, dated 6 Jun 2003 13:35:48
>>> Posting number 502, dated 6 Jun 2003 13:44:35
>>> Posting number 503, dated 6 Jun 2003 13:50:31
>>> Posting number 504, dated 9 Jun 2003 16:38:32
>>> Posting number 505, dated 10 Jun 2003 12:31:56
>>> Posting number 506, dated 10 Jun 2003 14:02:22
Date: Tue, 10 Jun 2003 14:02:22 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Re: [HERPNET] Distances by road
Comments: To: HERPNET@USOBI.ORG
In-Reply-To: <p0510031ebb0ad3cd7747@[207.207.103.216]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
XXXXXX, and all,
Please excuse the cross-posting, but I think this topic is critical for all
of our efforts.
I fully understand that it is, at times, possible to do better than
prescribed in the MaNIS Guidelines. For anyone interested, I also have a
pdf file of a manuscript that is currently in review for the International
Journal of Geographical Information Science. The manuscript goes into more
depth than the currently "published" guidelines
(http://dlp.cs.berkeley.edu/manis/GeorefGuide.html) and addresses just such
questions as we are now discussing. Please write to me (don't reply to this
message, as your reply will go to the HerpNet list) if you would like to
see that paper.
The Georeferencing Guidelines have a threefold purpose. First, and
foremost, they are intended to help educate the georeferencer with respect
to the complexities involved in taking a descriptive locality and making a
spatially explicit determination from it. It is not unlike making a
specimen identification - an opinion is rendered based on available
information. Second, they are intended to provide consistency across a
large-scale operation such as HerpNet, whether it is done collaboratively
or not. Third, the product resulting from the application of the guidelines
is intended to be maximally useful. It is the combination of these goals,
along with the limitations on resources, that should shape the course of
action.
OK, I think I'm done waxing philosophic. I'll try to address some specifics
interspersed in the original message.
At 06:13 PM 6/9/03 -0700, you wrote:
>John, I've spent a lot of time collecting in tropical countries,
>where there may be little villages and long stretches of road
>between them, and if I said I collected something in San Pedro, I
>would mean San Pedro, not something with an error figure "within 50
>km of San Pedro" just because the nearest village was 100 km away.
I don't disagree with you at all. However, if I don't know about you, or
your methods, how can I know what you mean by San Pedro? Do you mean the
center of San Pedro? Or the edge of San Pedro as you drive out of town?
This is where the real problem comes in - where *is* the edge of town? We
have done quite a bit of work to determine if one could estimate the size
of a town from its population, that being one piece of information that is
generally available. Whereas there is a correlation between these two
attributes, it is not consistent enough to provide a simple rule. So, we
try to figure out the sizes (extents) of towns and other features whenever
possible. In fact, we record that information, along with the source from
which we got the information, so that others can take advantage of it in
the future.
Back to your example. Clearly we are losing specificity by following the
rule in this case, and we're compromising the third goal, that of
maximizing the usefulness of the data. There are ways to do better, and
georeferencers are free to do so *if* they document their assumptions in
the LatLongRemarks field provided for that purpose. One way to do better
for this example is to find maps of a scale large enough to show the extent
of the named place. This really is the ideal, and suggests that whoever has
the best resources for a given geographic area should claim and
georeference it. Resources may include good maps, or extensive knowledge,
or supplemental material (such as field notes) for many of the localities
in a geographic region. Even so, there will be cases where an extent cannot
be determined. For such cases, a simple rule - no, guideline - is needed. I
chose a guideline that allowed the locality to be georeferenced, albeit
with a liberal margin of uncertainty. It is perfectly reasonable to
establish an alternative guideline that says "don't georeference a locality
for which the extent of the named place cannot be determined." No matter
which guideline is adopted, the determination (the georeference) is never
beyond revision. Suppose the "half-way" guideline is followed. One could
later discover candidate localities for refined georeferencing by searching
on the CoordinateUncertaintyInMeters (see
http://dlp.cs.berkeley.edu/manis/darwin2ConceptInfo030315jrw.htm). For
example, I could search for all records where the collector was "John
Wieczorek" and the CoordinateUncertaintyInMeters is greater than 5000 m,
because I know where all of my collecting localities are to within that
level of uncertainty. I could look over the results and change the
georeference to reflect my knowledge of the collecting events. After doing
so, I could even update the metadata about the determination to say that it
has a VerificationStatus of "collector-verified," whereas before I
revisited it the VerificationStatus was "unverified." In the alternate
scenario, in which the guideline says that "no locality shall be
georeferenced without knowing (and recording) the extent of the named
place," I would fill in a NoGeorefBecause field to say "extent not found."
In this scenario I could later search for all localities where the
collector was "John Wieczorek" and the NoGeorefBecause field was not null.
I could look through those records to see if there were any that I could
georeference because of my special knowledge of the events. Either method
works. Remember, a georeference is just an opinion. It's essential to know
how the opinion was formed if you intend to use it for anything important.
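The "half-way" guideline mentioned above (when the extent of a named place
cannot be determined, use half the distance to the nearest named place of
the same type) can be sketched in Python as follows. This is an
illustration under assumptions, not the Georeferencing Calculator's code:
the function names are hypothetical, each place is assumed to be a
dictionary with "type", "lat" and "lon" keys in decimal degrees, and
distances are computed on a sphere with the haversine formula, which is an
approximation.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def half_distance_extent_m(place, gazetteer):
    """Half the distance from `place` to the nearest other gazetteer entry
    of the same type, or None if no such entry exists."""
    distances = [
        haversine_m(place["lat"], place["lon"], other["lat"], other["lon"])
        for other in gazetteer
        if other is not place and other["type"] == place["type"]
    ]
    if not distances:
        return None
    return min(distances) / 2
```

The value returned is only the extent estimate; it would still have to be
combined with the other sources of uncertainty, and the assumption recorded
in LatLongRemarks.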
>I realize that some collectors in the past were much more casual about
>localities, but for the last 50 years, in my experience, specimens
>have been more accurately allocated, and I would argue for a little
>less liberal convention.
I have found that this is not a universal truth. Some of the finest
collectors today remain incapable of recording a reasonable locality
description. The problem isn't so much that collectors don't follow a
particular convention; rather, the problem is that they are almost never
specific enough. Interestingly, GPS has not solved this problem; it has
compounded it. All of that aside, the basic problem in georeferencing
based on the locality description, which is all we have to go on in some
cases, is that there remains a gap in our knowledge about the extent of the
named place.
> To me using such an error figure would
>severely compromise the value of the specimen record. For example,
>San Pedro might be on the slope of the Andes, and 50 km in any
>direction might involve an altitudinal change of 3000 meters and
>passing through 3 or 4 major habitat types.
I wholeheartedly agree. The smaller the CoordinateUncertainty, the greater
the number of questions for which a locality can be useful.
> I would have to ask what
>purpose is served by such a convention. Either the record is
>believable (in which case, why calculate an error figure?), or, if
>not (i.e., with a huge error figure), is the point of such a
>convention merely to cast doubt on records?
It is a mistake to confuse the CoordinateUncertainty with "believability."
It is actually a measure of "specificity." The Georeferencing Guidelines
are intended to provide data that are all equally believable, and that are
all explicitly measured with respect to their specificity.
>This seems a not very constructive convention, and I wonder where it
>comes from. It's certainly not anything I would have thought of if I
>were making up "extent rules."
It came from me, for the reasons described above. Remember, the rule is
flexible if you have additional information and you document it. Just in
case, I've offered another alternative above - to not georeference if the
extent cannot be determined, and to say so. Do you have another
alternative? My recommendation is that it be simple and universally applicable.
>No offense meant to those who constructed the MaNIS rules,
That was me. No offense taken. The rules themselves were built by being
continually challenged. It *has* to be that way, and I welcome it.
>but it
>might be worth a group of active museum herpetologists and
>ornithologists thinking about exactly what specimen locality records
>are used for
Though I don't doubt the value of the exercise, I think it is a mistake to
presume that you will come up with all of the possible, or even likely,
uses of the data. Why limit the scope a priori?
> and how much value is added to that use by the
>complicated and time-consuming (up to 50% of time spent
>georeferencing, according to Gary Shugart who has done many MaNIS
>records) calculation of errors (which come largely from extents).
>I'm concerned that the time and money spent calculating errors in the
>relatively easily handled localities from the USA will in fact take
>away from the basic georeferencing of many specimens from other
>countries, and those are perhaps the specimens that need it most!
Some additional information may be of use here. Funding was based on
documenting coordinates and errors as set out in the MaNIS guidelines,
using the Georeferencing Calculator, and the following collaborative
paradigm. The georeferencing rates for the USA were different from the
rates for Canada and Mexico, and these in turn were different from the rate
for everywhere else. All of these rates were determined empirically, and
they were all conservative. In other words, the funds for georeferencing
were based on the known level of effort required (number of localities by
geographic category by institution), not for the effort based solely on the
rates for the USA. Since those rates were determined, additional means of
semi-automating the process have been developed, most notably by Gary
Shugart at PSM, and these make the georeferencing rates even greater for most
geographic regions. I highly recommend that Gary's tools be fully developed
and documented so that everyone can take advantage of them. By doing so,
you will gain that extra time that can be spent to increase the specificity
of georeferences.
John W
>XXXXXX
>
>>There are situations, especially using small-scale (large area) maps, where
>>you cannot tell how big a feature, such as a town, really is. The
>>convention in such cases, in the absence of better references (maps, remote
>>sensing data, etc.) is to use one-half the distance from the coordinates of
>>the named place (e.g., a town) to the nearest named place of the same type
>>(e.g., nearest town) as the extent. Automated georeferencing that is done
>>based on gazetteers that don't have extent information will also have to
>>rely on this convention for (liberally) estimating the extents. I believe
>>this helps answer the first question, below.
>>
>>John Wieczorek
>
>--
>>> Posting number 507, dated 10 Jun 2003 17:34:33
>>> Posting number 508, dated 11 Jun 2003 10:39:00
Date: Wed, 11 Jun 2003 10:39:00 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Fwd: Re: [HERPNET] Distances by road
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
I'm cross-posting this response from XXXXXX.
>From:
>Subject: Re: [HERPNET] Distances by road
>To: HERPNET@USOBI.ORG
>
>Much thanks to John Wieczorek for the thoughtful and lengthy
>response. I can see that the answer is in having good maps, and such
>maps should be available for most countries. Then there won't be
>problems of the sort I envisioned.
>
>I still have a residual question about the value of coordinate
>uncertainties and exactly how they will be used. I know circles can
>be drawn around coordinates showing those CUs, but with enough dots
>(records) and enough circles of varying sizes around them, a plot of
>locality records will be cluttered to say the least, perhaps to the
>point of being undecipherable. If I wanted to plot the range of a
>certain lizard from specimen records for a generic revision or a book
>on Costa Rican herps, my inclination would be to leave the circles
>off for the sake of the user. In the past, after laboriously finding
>each locality on a map, we would have applied those dots, made a
>subjective judgment that we thought one or more of them were suspect
>or just plain false, and gone back and done further research on the
>specimen/locality in question. We would then have either informed
>the reader of our suspicions (or even made the record a different
>symbol) or just struck the offending dot from the map.
>
>To me, as an end user with great interest in the geographic range of
>species, the incalculable value in MaNIS and HerpNet and BirdNet (and
>OdoNet, in my dreams) will be our having been able to plot the
>coordinates for all of our thousands of specimens to make order from
>chaos, not the coordinate uncertainties, which strike me as more
>along the line of producing chaos from order. The thought of all the
>time/money/energy spent calculating the uncertainties will continue
>to discomfit me, even though I have heard those calculations are
>among the factors that made the proposals attractive to the granting
>agency.
>
>Happy georeferencing,
>
>XXXXXX
>--
>>> Posting number 509, dated 11 Jun 2003 10:40:04
Date: Wed, 11 Jun 2003 10:40:04 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Fwd: Re: [HERPNET] Distances by road
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
I'm cross-posting this one too.
>From: Stan Blum <sblum@CALACADEMY.ORG>
>Subject: Re: [HERPNET] Distances by road
>To: HERPNET@USOBI.ORG
>
>I would also like to thank John for the thorough description of the
>rationale behind his/our approach to geo-referencing. Just so John doesn't
>feel so lonely, I'd like to add that a number of people who have been
>thinking hard about geo-referencing have settled on similar "best
>practices" -- some of them through discussion with John and his colleagues,
>and some of them independently. This is not to say that there isn't room
>for further discussion.
>
>
>At 04:40 PM 6/10/2003 -0700, XXXXXXXXX wrote:
>>I still have a residual question about the value of coordinate
>>uncertainties and exactly how they will be used.
>
>While I agree with John that we shouldn't limit ourselves to specific uses
>-- our goal is to create the most useful and accurate data for the long
>haul -- I will describe what I think the uses of coordinate uncertainty (or
>more broadly, locality uncertainty) are likely to be in the near
>future. Over the last few years, we have seen the availability and
>precision of environmental layers (Digital Elevation Models, for example)
>grow and grow. There is no reason to assume we have hit some sort of limit
>with environmental data. With more and better data, we should be able to
>create better and better models of how environmental characteristics
>influence species' distributions. The greater the uncertainty we have in
>our occurrence data, the more noise we are going to have in our
>models. Having uncertainty expressed as a scalar (distance or area) will
>allow analysts to filter the data they feed into these models (e.g.,
>everything with an uncertainty less than 100 m). We can't presuppose the
>dividing line between acceptable and unacceptable for any particular use,
>so we are recording uncertainty as a continuous measure and leaving it up
>to the future analyst. So, in my view, data without any sort of associated
>uncertainty will be much less useful in the long run.
>
>Another thing we can take heart from is that the cost of geo-referencing
>(even with an uncertainty measure) is marginal compared to the cost of
>collecting and preparing the specimens in the first place and then keeping
>them in collections for all these years. And the geo-reference makes the
>data so much more useful.
>
>These projects are the best thing that's come down the road for NH
>collections in a long, long time.
>
>-Stan
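The filtering use described in the forwarded message (keeping, for example,
only records with an uncertainty of less than 100 m) amounts to a one-line
test once the uncertainty is stored as a number. A minimal Python sketch,
assuming hypothetically that each record is a dictionary with a numeric
CoordinateUncertaintyInMeters value, or None where no georeference exists:

```python
def filter_by_uncertainty(records, max_uncertainty_m):
    """Keep records whose coordinate uncertainty is known and is less
    than the analyst's chosen threshold, in meters."""
    return [
        r for r in records
        if r.get("CoordinateUncertaintyInMeters") is not None
        and r["CoordinateUncertaintyInMeters"] < max_uncertainty_m
    ]
```

The threshold is left to the caller, in keeping with the point that the
dividing line between acceptable and unacceptable depends on the use.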
>>> Posting number 510, dated 11 Jun 2003 16:00:57
>>> Posting number 511, dated 13 Jun 2003 12:49:40
>>> Posting number 512, dated 17 Jun 2003 08:58:07
>>> Posting number 513, dated 17 Jun 2003 13:46:43
>>> Posting number 514, dated 17 Jun 2003 15:16:59
>>> Posting number 515, dated 17 Jun 2003 19:07:26
>>> Posting number 516, dated 18 Jun 2003 16:35:10
>>> Posting number 517, dated 26 Jun 2003 15:18:48
>>> Posting number 518, dated 27 Jun 2003 17:27:27
>>> Posting number 519, dated 2 Jul 2003 10:31:05
>>> Posting number 520, dated 2 Jul 2003 12:19:01
>>> Posting number 521, dated 3 Jul 2003 14:05:09
>>> Posting number 522, dated 7 Jul 2003 10:28:39
>>> Posting number 523, dated 7 Jul 2003 13:30:13
>>> Posting number 524, dated 7 Jul 2003 18:29:30
>>> Posting number 525, dated 7 Jul 2003 18:44:46
>>> Posting number 526, dated 8 Jul 2003 08:23:58
>>> Posting number 527, dated 8 Jul 2003 07:54:05
>>> Posting number 528, dated 9 Jul 2003 12:02:37
>>> Posting number 529, dated 11 Jul 2003 10:24:59
>>> Posting number 530, dated 11 Jul 2003 12:56:25
>>> Posting number 531, dated 11 Jul 2003 13:57:30
>>> Posting number 532, dated 12 Jul 2003 12:07:54
Date: Sat, 12 Jul 2003 12:07:54 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Fwd: georeferencing issue
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear XXX, and all,
The subject of data verification is an important one, so I'm including the
original message and my reply on the MaNIS list for the benefit of all.
I'll intersperse my comments with the original message.
>Date: Fri, 11 Jul 2003 16:09:42 -0700
>From:
>Subject: georeferencing issue
>
>John-
>
>This is probably one of those issues that is an inherent problem with
>georeferencing that there is no real solution for, but I wanted to know
>how host institutions should deal with these situations.
>
>Here is the example in question: today our other Curatorial Assistant,
>XXXXXX, was dealing with a marine mammal specimen and its locality and
>called me for verification (as a person who lived in Bolinas and
>georeferences for a living). The speclocality read "Bolinas, Stinson
>Beach". Two completely different towns. I thought about it and decided
>as a georeferencer I would find the geographic center between the two
>and then assign an error that extends to the furthest extent of the two.
I have no problem with the method you would have chosen (given Bolinas and
Stinson Beach as two populated places) as long as the LatLongRemark
documented your choice.
Another way to georeference this locality would be to put the point in the
middle of the town of Stinson Beach and have the extent cover Bolinas. Or,
put the point in Bolinas and have the extent cover Stinson Beach.
Yet another reasonable alternative would have been to not georeference the
locality and say "internal inconsistency" in the NoGeorefBecause field
since there's no way to be in both towns at the same time.
There's a slight complication even beyond what you've mentioned, though,
which is that Stinson Beach is also a beach on the south side of the town
of Stinson Beach, with its own entry in the GNIS database. Given this extra
information, the "mid-point" that you would choose for your method of
georeferencing might be in a little different place than if you didn't
consider the beach.
For the second method mentioned above the point and extent would be
unaffected since the beach is closer to the town of Stinson Beach than
Bolinas is.
The third option mentioned above is also still (maybe even more so) a
reasonable choice.
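The first two methods above can be sketched in Python as follows. This is only an illustration under stated assumptions: the coordinates and extents are made-up stand-ins rather than gazetteer values, the function names are hypothetical, and the midpoint is a simple average of latitudes and longitudes, which is reasonable only for places a few kilometers apart.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def midpoint_georeference(place_a, place_b):
    """Method 1: point midway between the places, radius reaching the
    far edge of whichever place extends furthest from that point.

    Each place is (lat, lon, extent_km), where extent_km is the assumed
    distance from the place's center to its furthest edge.
    """
    lat_a, lon_a, ext_a = place_a
    lat_b, lon_b, ext_b = place_b
    mid_lat = (lat_a + lat_b) / 2  # simple average, fine for nearby points
    mid_lon = (lon_a + lon_b) / 2
    radius = max(
        haversine_km(mid_lat, mid_lon, lat_a, lon_a) + ext_a,
        haversine_km(mid_lat, mid_lon, lat_b, lon_b) + ext_b,
    )
    return mid_lat, mid_lon, radius


def anchored_georeference(anchor, other):
    """Method 2: point at the center of one place, radius covering the other."""
    lat_a, lon_a, ext_a = anchor
    lat_o, lon_o, ext_o = other
    radius = max(ext_a, haversine_km(lat_a, lon_a, lat_o, lon_o) + ext_o)
    return lat_a, lon_a, radius


# Illustrative values only, not authoritative coordinates for either town.
place_1 = (37.91, -122.69, 0.5)
place_2 = (37.90, -122.64, 0.5)

print(midpoint_georeference(place_1, place_2))
print(anchored_georeference(place_2, place_1))
```

Under these assumptions the anchored method always yields the larger radius, since the whole distance between the two places is added to the extent rather than roughly half of it. Either way, the choice made belongs in the remarks for the georeference.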
>I then checked the MaNIS data to see how the georeferencer had handled
>this problem. The info in the MaNIS file (CAS early delivery) was
>lat/long as Stinson Beach ppl with an error of .7 mi. I assume the
>georeferencer was unfamiliar with the area and assumed Stinson Beach was
>a more specific locality than Bolinas instead of two separate towns.
I agree that this is the assumption that must have been made - and it
wasn't documented in the LatLongRemarks.
>In my georeferencing results the error would have been almost 2 miles vs .7
>mi. Knowing the geography of the area I know that the MaNIS
>georeferenced data is not at all accurate to where the specimen was most
>likely collected, in fact the error does not even encompass the most
>probable true locality.
This brings up the most critical point of all, which is that our
georeferencing efforts are providing determinations (opinions) based on the
locality descriptions - not on the specimens. Without knowledge of the
specimens that are associated with the locality we are not able to make the
kind of judgement to which XXX is referring above. In other words, we
didn't know a priori that the specimens from that locality are marine mammals.
>Is this just an example of the host institution needing to verify the
>information pre (i.e. CAS should have noticed the conflict before
>submission) and post MaNIS?
I argue that it isn't efficient to pre-verify (or standardize) locality
data before georeferencing. It actually ends up taking more time overall
that way. My basic reasoning here is two-fold. First, in a large-scale
collaborative effort we would never have even begun georeferencing if we
waited for the pre-verification to take place. Second, the georeferencing
itself increases the efficiency with which we are able to isolate
problematic localities. We get to see a whole bunch of localities in a
context of other localities from the same general area and we get to see
patterns in recording techniques and formats. Eventually, we'll also be
able to group locality information by species with environmental
data. What this means for us is that every georeference is unverified at
this stage. Think of them as the opinions of the georeferencers given the
information at hand.
The next logical phase is to validate the georeferences, which is my
responsibility. The first part of validation consists of checking that the
data provided by georeferencers are complete (e.g., NoGeorefBecause filled
out when there are no LatLongs, DeterminationRefs are provided, etc.). The
second part of validation is to make sure that the georeferences are
consistent with the higher geographic information (e.g., that records
putatively from Marin County actually lie within Marin County, or that the
LocalityAnnotation says that the county must have been wrong).
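A minimal Python sketch of the first part of validation, the completeness check. NoGeorefBecause and DeterminationRefs are the field names used in this discussion; the coordinate field names DecimalLatitude and DecimalLongitude, the dictionary layout, and the rule that DeterminationRefs is required only for georeferenced records are assumptions made for illustration.

```python
def completeness_problems(record):
    """Return a list of completeness problems for one georeference record.

    `record` is a dict keyed by field name. An empty list means the
    record passes this (assumed) completeness check.
    """
    problems = []
    has_coords = (record.get("DecimalLatitude") is not None
                  and record.get("DecimalLongitude") is not None)
    if has_coords:
        # A georeferenced record should cite the sources used.
        if not record.get("DeterminationRefs"):
            problems.append("DeterminationRefs missing")
    else:
        # A record without coordinates should say why it has none.
        if not record.get("NoGeorefBecause"):
            problems.append("NoGeorefBecause missing")
    return problems


# Made-up example records.
complete = {"DecimalLatitude": 37.90, "DecimalLongitude": -122.64,
            "DeterminationRefs": "topographic map"}
no_refs = {"DecimalLatitude": 37.90, "DecimalLongitude": -122.64}
skipped = {"NoGeorefBecause": "internal inconsistency"}
unexplained = {}

for rec in (complete, no_refs, skipped, unexplained):
    print(completeness_problems(rec))
```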
That is the limit of the validation that we can do without reference to the
rest of the specimen record, so it will be at this stage, when all
validation is finished for all localities for all institutions, that the
georeferenced locality information will be returned to the source
institutions to be re-associated with the specimens. Once that task is
accomplished the institutions will have the custodianship of the data and
the responsibility for verification in perpetuity. By verification I mean
that the specimens are checked to see that they were collected in the
locations described by the georeferences. That's going to be an ongoing
task, for which it is my hope that we'll be able to provide valuable tools
to all participants in MaNIS. Funding for this purpose is being sought in
the context of the ORNIS project, which will be the ornithological sister
of MaNIS, based on all of the same principles and technology. We envision
niche modelling tools to help isolate environmental outliers as well as
tools for following itineraries and mechanisms for users of the networks to
provide feedback to the source institutions for their verification. The
more different ways we have of looking at the data, the more data problems
will be exposed and fixed.
>If so, how do you propose the host
>institutions proof the data now that it is finished?
I hope I've convinced you that we're not finished yet. In fact, those
pre-release data to CAS came specifically with the disclaimer that they
hadn't yet been validated by us, so don't put them in your database. You'll
get the whole batch again after validation.
>If Andrea hadn't
>been working with this specimen we probably wouldn't have noticed the
>error just by mapping the points from MaNIS. I am curious as to your
>thoughts on this issue.
In addition to what I've said above, I propose that we track the
VerificationStatus of individual specimen or locality records, depending on
your database structure. Specifically, when the data come back from MaNIS,
they will have VerificationStatus = "unverified" and GeorefMethod = "MaNIS
Georeferencing Guidelines". At MVZ it is our intention to have other
possible values of verification status, such as "MVZ verified" which will
mean that staff of the MVZ checked the specimens against the locality and
found no inconsistency. The highest level of verification will be
"collector verified" which will mean that the collector actually looked at
a plot of the specimens based on the coordinates and errors and said "Yes,
all of those specimens came from within that circle and the circle is of
the correct size to describe the locality for all of them." It doesn't get
better than that. In order to engage the collectors, however, I think we'll
have to make some fun tools that we all can play with. That's our goal anyway.
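One way such a status field might be modelled, sketched in Python. The three status values and the GeorefMethod string are the ones named above; the numeric ranking, the LocalityID field, and the helper function are hypothetical.

```python
from enum import IntEnum


class VerificationStatus(IntEnum):
    """Ordered so that a higher value means stronger verification."""
    UNVERIFIED = 0
    INSTITUTION_VERIFIED = 1  # e.g. "MVZ verified"
    COLLECTOR_VERIFIED = 2


# Labels as they might appear in a database column.
LABELS = {
    VerificationStatus.UNVERIFIED: "unverified",
    VerificationStatus.INSTITUTION_VERIFIED: "MVZ verified",
    VerificationStatus.COLLECTOR_VERIFIED: "collector verified",
}


def returned_from_manis(locality_id):
    """Build a record as it would come back from MaNIS: unverified, with
    the georeferencing method recorded. LocalityID is a made-up field."""
    return {
        "LocalityID": locality_id,
        "VerificationStatus": VerificationStatus.UNVERIFIED,
        "GeorefMethod": "MaNIS Georeferencing Guidelines",
    }


record = returned_from_manis("loc-1")
print(LABELS[record["VerificationStatus"]])

# Later, after staff check the specimens against the locality:
record["VerificationStatus"] = VerificationStatus.INSTITUTION_VERIFIED
print(LABELS[record["VerificationStatus"]])
```

Using an ordered type makes it easy to select, for example, every record at or above a given level of verification.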
Thanks for asking the tough and timely questions.
John
>>> Posting number 533, dated 14 Jul 2003 14:50:47
>>> Posting number 534, dated 16 Jul 2003 10:22:52
>>> Posting number 535, dated 16 Jul 2003 11:47:15
>>> Posting number 536, dated 17 Jul 2003 14:32:11
>>> Posting number 537, dated 18 Jul 2003 18:28:33
>>> Posting number 538, dated 23 Jul 2003 18:28:59
>>> Posting number 539, dated 23 Jul 2003 18:33:04
>>> Posting number 540, dated 24 Jul 2003 13:17:56
>>> Posting number 541, dated 25 Jul 2003 19:12:38
>>> Posting number 542, dated 25 Jul 2003 19:14:29
>>> Posting number 543, dated 25 Jul 2003 19:24:30
>>> Posting number 544, dated 29 Jul 2003 13:52:29
>>> Posting number 545, dated 31 Jul 2003 14:25:49
>>> Posting number 546, dated 31 Jul 2003 21:15:20
>>> Posting number 547, dated 1 Aug 2003 15:24:12
>>> Posting number 548, dated 1 Aug 2003 17:01:45
>>> Posting number 549, dated 2 Aug 2003 19:13:23
>>> Posting number 550, dated 4 Aug 2003 10:04:46
>>> Posting number 551, dated 4 Aug 2003 15:01:02
>>> Posting number 552, dated 4 Aug 2003 15:15:59
>>> Posting number 553, dated 5 Aug 2003 15:10:11
>>> Posting number 554, dated 6 Aug 2003 08:11:44
>>> Posting number 555, dated 6 Aug 2003 18:36:56
>>> Posting number 556, dated 11 Aug 2003 09:43:41
>>> Posting number 557, dated 15 Aug 2003 09:31:00
>>> Posting number 558, dated 15 Aug 2003 10:21:12
>>> Posting number 559, dated 19 Aug 2003 16:28:06
Date: Tue, 19 Aug 2003 16:28:06 -0700
Reply-To: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
Sender: Mammal Networked Information System <MAMMAL-Z-NET@USOBI.ORG>
From: John Wieczorek <tuco@SOCRATES.BERKELEY.EDU>
Subject: Last claim: Russia
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Whoohoo. This is an historic day. The MVZ would like to claim the last of
the known universe for georeferencing.
Having come to this milestone, I would like to thank everyone for their
participation in this grand experiment. I don't mean to imply that we're
done yet, but at least the Checklist map is all filled in with either green
(denoting geographic regions for which georeferences have been completed)
or red (denoting regions in progress). This makes Robert Hijmans very
happy. I'm pretty happy too, but I'll be even happier when the whole map is
green, so here's my reminder to keep up the good work and get those
outstanding georeferences to me as soon as you can.
We've already begun the first phase of data validation and standardization
on the files that have been returned already. Our hope is to be caught up
with this process as the last files come in so that we can do three final
important steps, 1) determine if there are localities that we missed, 2)
use GIS to do spatial validations on the georeferences against
administrative boundary layers, and 3) prepare the georeferences to be
returned to the source databases.
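Step 2, the spatial validation, amounts to a point-in-polygon test of each georeference against the boundary it is supposed to fall within. A minimal Python sketch using ray casting is given here; it is not the GIS workflow actually used, the rectangle standing in for a county boundary is invented, and the test ignores the uncertainty radius and treats coordinates as planar.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: is (lon, lat) inside the polygon?

    `polygon` is a list of (lon, lat) vertices in order, without the
    first vertex repeated at the end. Points exactly on an edge may be
    classified either way.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Invented rectangle standing in for an administrative boundary.
boundary = [(-123.0, 37.8), (-122.4, 37.8), (-122.4, 38.3), (-123.0, 38.3)]

print(point_in_polygon(-122.64, 37.90, boundary))
print(point_in_polygon(-122.00, 37.90, boundary))
```

A georeference that fails the test is not necessarily wrong; the stated county may be the error, which is why such cases need to be flagged for review rather than discarded.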
In preparation for the remainder of this part of the MaNIS project, it
would be helpful if participants could do two things at your earliest
convenience:
1) Look at the Georeferencing Checklist
(http://elib.cs.berkeley.edu/manis/Checklist.html) to see if my records of
outstanding claims are correct, and
2) Send me an estimate of when you expect to finish georeferencing the
regions for which there are claims outstanding.
Thanks to all,
John
>>> Posting number 560, dated 21 Aug 2003 12:04:31
>>> Posting number 561, dated 8 Sep 2003 10:32:10
>>> Posting number 562, dated 8 Sep 2003 18:53:07
>>> Posting number 563, dated 7 Oct 2003 19:47:47
>>> Posting number 564, dated 10 Dec 2003 13:03:43
>>> Posting number 565, dated 12 Dec 2003 09:43:00
>>> Posting number 566, dated 12 Dec 2003 13:53:59
>>> Posting number 567, dated 21 Jan 2004 11:19:04
>>> Posting number 568, dated 21 Jan 2004 13:00:12
>>> Posting number 569, dated 21 Jan 2004 15:16:04
>>> Posting number 570, dated 21 Jan 2004 15:18:13
>>> Posting number 571, dated 22 Jan 2004 12:46:18
>>> Posting number 572, dated 21 Jan 2004 15:11:33
>>> Posting number 573, dated 22 Jan 2004 14:21:24