I'm looking for a formula to divide up the earths surface into
equidistant cells such that given a lat/long pair I can retrieve its
cell and all neighboring cells.
So, for example, if I have something like the following:

 1,1  1,2  1,3  1,4 

 2,1  2,2  2,3  2,4 

 3,1  3,2  3,3  3,4 

 4,1  4,2  4,3  4,4 

 5,1  5,2  5,3  5,4 

... and I specify a lat/long [that falls in (4,2)], I need the formula
to return (3,1), (3,2), (3,3), (4,1), (4,2), (4,3), (5,1), (5,2),
(5,3).
I need to be able to specify a distance (i.e. 10mi, 20km, ...), with all
cells being equal in endtoend distance (either from topleft to
bottomright, or from toptobottom/lefttoright), and accounting for
the curvature of the earth.
I'm looking for a formula, algorithm, or program to do this
(preferably a Java program). 
Request for Question Clarification by
maniacga
on
31 Aug 2005 17:47 PDT
Hello Jim_p,
Can you explain if the solution can impose some constraints on the
region of the earth that it operates in (e.g., within N 60 degree
parallel and S 60 degree parallel)?
Is there a preferred method you wish to use to divide the world up
into "equidistant" cells?
(e.g., township & range)
By "equidistant", I assume you mean that the distance north / south
and east / west is uniform on the cells (with perhaps some slight
distortion).
The general problem for dividing up cells is illustrated at
http://www.ksls.com/assets/PLSS/6a.jpg
The specific illustration I refer to shows how the US aligned the
township and range relative to a know point. For example, in Kansas
http://www.ksls.com/about_surveys.htm
the reference point was in this case the N 40 degree parallel and the
"6th principal meridian". There is other good material in this
reference, referring to the difficulty the surveyors had in performing
the initial survey.
Maniac

Clarification of Question by
jim_pga
on
31 Aug 2005 18:53 PDT
Definately S 60 ... N 60, I suppose. What are the constraints?
You're correct on what I meant by equidistant.
I'm not sure I understand what you mean by preferred method though.
Range I think. Driving distance. I need to be able to define multiple
grids with variable distances (i.e. A grid with cells that are 10mi
wide, A grid with cells 25mi wide, a grid with cells 10km wide,
etc...) Is that what you were asking?
jim

Request for Question Clarification by
maniacga
on
31 Aug 2005 20:02 PDT
Hello Jim,
To clarify why a constraint (e.g., 60 N / 60 S) is helpful, look at a
Mercator map illustrated at
http://www.colorado.edu/geography/gcraft/notes/mapproj/gif/mercator.gif
In this map, the areas near the poles are distorted. If you chopped
this map into squares, the squares north of the equator represent an
area on earth with a wider area to the south than the north. The
difference is not much for the center region of the map but the
difference is quite significant near the poles (top and bottom of the
map).
This type of method (chopping up a map projection) would be a straight
forward solution and there are plenty of sites with source code  for
example
http://www.dmap.co.uk/ll2tm.htm
has a form to convert between lat/long and meters (or an Excel
spreadsheet at a small fee for bulk conversions). This could be
adapted to determine the cells (and adjoining ones) in a straight
forward manner. Note that the square cells (on the map) are "almost"
square (on the earth), with the distortion relatively minor until you
get near the poles.
Would a solution like this be suitable?
Maniac

Clarification of Question by
jim_pga
on
31 Aug 2005 20:34 PDT
Sorry, I misunderstood your earlier post. The constraint is that the
formula wouldn't work in areas N of 60 or S of 60. I suppose that
would be fine, if it keeps things simple.
What's important is that the cells are all equal in size and that I
can alter this size (in miles and kilometers) to increase or decrease
the granularity of the grid.
Thanks,
jim

Clarification of Question by
jim_pga
on
31 Aug 2005 20:37 PDT
Out of curiousity, would N of 70 instead of 60 complicate things much?
Not that it's all that important, but I'd be nice to include a bit
more of Canada and Alaska.
jim

Request for Question Clarification by
maniacga
on
01 Sep 2005 17:56 PDT
Hello Jim,
Hmm. I read the comments and I believe I understand the application
(quickly looking up nearby locations from a lat/long with a specific radius)
and not necessarily dividing up the world into squares. The problem
with squares was described by myself (distortion, different sizes) and
the commenter myoarin (alignment of cells)  if they are not
necessary, that should simplify the solution quite a bit.
Based on what you stated (and some searching on the web), it may be
possible to solve that problem more directly using latitude and
longitude. Let me outline a solution:
 each degree of latitude covers about 69 miles. So on the north /
south axis, you have a direct conversion between the difference of
latitude and distance.
 each degree of longitude covers about 69 miles at the equator but
zero miles at the poles. The distance per longitude can be calcuated
based on the latitude as illustrated at
http://teacherlink.org/content/math/activities/gpsarea/guide.html
[scroll to "Part 2", (7)]
Using the 100 W longitude and 45 N latitude as an example, the 20 mile
x 20 mile area centered would be roughly:
 100 degrees + /  (10 miles / (cosine(45 degrees) x 69
miles/degree)) or 100W +/ 0.205
 45 degress + /  (10 miles / 69 miles/degree) or 45N +/ 0.145
[using decimal degrees, you can convert to degrees / minutes / seconds if needed]
Using these ranges of values, you could then search your data [I
assume in a database] for values within those two ranges to do the
range based lookup. Would this be satisfactory or do you really need
the squares (which won't quite be uniform in size if they line up or
won't line up well if they are of uniform size)?
Maniac

Clarification of Question by
jim_pga
on
01 Sep 2005 21:30 PDT
No, actually, that's the problem, I can't use ranges. Imagine if you
weren't allowed to use database 'less than' or 'greater than'
operators. How would you solve this problem then?
One approach is to calculate the distance of every object in the
database, but that's just not feasible. Another approach is to
catagorize your data, which is essentially what I'm trying to do.
If I had 3 fixed distances (5mi, 15mi, 35mi), I could create a grid
with cells that are 5mi x 5mi, another grid that's 15mi by 15mi,
etc... Then, I could create columns in my database for 5MI_CELL,
15MI_CELL, and 35MI_CELL, and use the formula to add the cell labels
for each entry. Now if I want to find locations within a 15mi radius
of a lat/long, I just search for all entries where 15MI_CELL='(3,1)'
or 15MI_CELL='(3,2)' or 15MI_CELL='(3,3)'... or whatever the labeling
scheme is...
Make any sense? Is there a better way to do this?
jim

Request for Question Clarification by
maniacga
on
02 Sep 2005 14:51 PDT
Hello Jim,
It boggles the mind a little bit that your database does not allow you
to extract values based on range selections. However, I can think of
at least three ways to work around that.
[1] Fetch more data / reduce as a second step. Encode the data with
the lat / long as whole numbers as well as fractions. Extract all the
values that match the whole numbered lat / long of the central point
(plus up to three adjoining lat / long values if you are within 10
miles of a whole lat / long position  for a 10 mile radius). For the
values fetched, do the distance algorithm (such as suggested by
myoarin) to throw out the values that are not within the radius.
[though I can understand if the fetching of "too many" values is not
desired]
[2] If you need to cover a large part of the world using cells, it
would probably be best to use a grid system that is standardized. For
example, the "Military Grid Reference System" (MGRS) described in a
tutorial at
http://www.rotc.monroe.army.mil/jrotc/documents/Curriculum/Unit_5/U5C2L3_txt.pdf
allows you to specify a location on the earth to any resolution (e.g.,
1000 or 100 meters). You could encode the locations with the MGRS
location and source code for conversions between lat / long and MGRS
(and several others) is available at
http://earthinfo.nga.mil/GandG/geotrans/index.htm
It is pretty straight forward to compute the MGRS values of adjoining
areas  though if you used 1000 meter resolution and had a 15 mile
radius, there would over a thousand MGRS values to look up. Using
multiple ranges (e.g., 1000 and 10000 meter) for the encoding would be
one way to reduce that problem.
[3] Get a better database. If the effort needed to work around the
limitations of the database is more than the costs of switching to a
new database, it may be work it. I can certainly refer you to several
good ones [though if your hadware is limited  I can see this would be
a problem as well]
Please let me know if one of those solutions would be acceptable or if
not, additional constraints on the solution.
Maniac

Clarification of Question by
jim_pga
on
06 Sep 2005 12:04 PDT
If you think of directories or search engines that typically build
indexes on keywords and are optimized for keyword lookups, then this
problem makes a little more sense.
I had thought about your suggestion, basically using an existing grid
system like UTM or MGRS [2], and then reducing as a second step [1],
but reducing as a second step is inefficient because I'd have to apply
distance calculations on every entry returned, and that's going to
slow things down significantly. So, the grid system being used would
have to be fairly granular to minimize the number of invalid entries
returned. Then, I figured that if grid systems like UTM and MGRS
exisit, then it must be possible to encode a custom grid system. Hence
my asking here. I would prefer a grid system with cells that are the
size of the search area radius, but if that's not possible, then if
the grid system is fairly granular, I guess I could use something
close to my search area radius and reduce as a second step.
So, maybe MGRS would work. Could you elaborate on it some more. How do
I define my grid size? What is the max size for grid cells and what is
the min size that I could have for grid cells under MGRS?
Thanks,
jim

Request for Question Clarification by
maniacga
on
06 Sep 2005 16:49 PDT
Hello Jim,
Well, MGRS  as explained briefly at
http://www.ncgia.ucsb.edu/education/curricula/giscc/units/u013/u013.html
(and in some more detail in the tutorial already referenced)  can
represent locations in a variety of resolutions. As you add more
digits  you get better resolution like the following:
 2 digits  10 km resolution
 4 digits  1000 meter (1 km)
 6 digits  100 meter
 8 digits  10 meter
 10 digits  1 meter resolution
So, you have a wide range of resolutions available for the fixed sized
cells. Of course, at higher resolutions, you have a LOT more possible
values to extract from the database within the selected range of a
location.
You say the distance calculation is "expensive"  perhaps on a hand
held device (or another with limited computational resources). If that
is the case, there are work arounds like:
 eliminate the square root calcuation from the distance calculation
(you are "within the range" if dx*dx + dy*dy < range*range)
 if you have several values on the same "row" or "column" (I use
those terms loosely), determine the maximum "within range" cell 
throw out all the ones above that point (and keep the ones below)
 use fixed point arithmetic (if floating point is expensive) if you
can do double precision integer arithmetic
A combination of these methods could be used as well.
Let me know if you want to follow up on one of these more fully.
Maniac

Request for Question Clarification by
maniacga
on
09 Sep 2005 14:23 PDT
Hello Jim,
Let me recap a previous clarification request [removed with the
answer] and then address it.

[reference to Google Local removed]
Note on the results page how there are fixed proximities (Search
within: 1mi, 5mi, 15mi, 45mi).
Suppose I wanted to implement something similar to this. Then my cell
sizes would need to be, approximately:
1 mi = ~1600 m
5 mi = ~8 km
15 mi = ~24 km
45 mi = ~70 km
Using MGRS, could I define my grid resolutions to these values? If so,
then this is probably closer to the answer I'm looking for.

I took a look at the source code at
http://earthinfo.nga.mil/GandG/geotrans/index.htm
(a large package doing a variety of coordinate conversions)
and after unpacking, looked at the MGRS source code in
geotrans2.2.6/dt_cc/mgrs/mgrs.c
it looks like once the UTM cell is identified (the first part of the
MGRS value), the "northing" and "easting" values are then manipulated
to compute the offset (modulo the resolution) to compute the remaining
digits. There are some hard coded values in the code (e.g., 100000.0
in Make_MGRS_String) but the code could be modified to use a different
scaling factor.
For example, if you used 5 mile squares (8046.72 meters each), you
could use that value to divide the easting / northing values with a
fixed precision (e.g., 2 digits) and compute a reference like this
UTM cell (e.g., 18SWD)
offset East (two digits)
offset North (two digits)
Or something like
18SWD0215
with references for that cell and adjacent ones laid out like this
18SWD0116, 18SWD0216, 18SWD0316
18SWD0115, 18SWD0115, 18SWD0315
18SWD0114, 18SWD0214, 18SWD0314
[north to south is top to bottom, west to east is left to right]
Of course, if you are near the top / bottom and/or left / right of an
UTM cell, computing the adjacent cells would have to take that into
account (and pull cells from the adjoining UTM cell).
Maniac

Clarification of Question by
jim_pga
on
12 Sep 2005 23:46 PDT
Ok, I think that's what I'm looking for. Why are the offsets only two digits?
jim
