Monday 3 March 2008

Branch-and-Bound algorithms for Nearest Neighbour queries

This paper by Roussopoulos et al is a fine expositions of how to use a Branch-and-Bound algorithm in conjunction with an R-tree index to efficiently perform Nearest-Neighbour queries on a spatial database.


Nearest-neighbour is a pretty standard query for spatial database. Oracle Spatial offers this capability natively. It would be great if PostGIS did too. If the GIST index API offers appropriate access methods, it seems like it might not be too hard to implement the Roussopoulos algorithm. (How about it, Paul, now that you're dedicating your life to becoming a PostGIS coding wizard...)

I'm also thinking that this approach might produce an efficient algorithm for computing distance between Geometrys for JTS. It's always bugged me that the JTS distance algorithm is just plain ol' O(N^2) brute-force. It seems like there has to be a better way, but as usual there seems to be precious little prior art out there in web-land. This should also work well in the "Prepared" paradigm, which means that it will provide an efficient implementation for PostGIS as well. Soon, hopefully...

2 comments:

hobu said...

I think Marios Hadjieleftheriou's SpatialIndex library implements this algorithm for RTree, MVRTree, and TPRTree.

See the code
here

Dr JTS said...

Great, thanks for the link. Nice to have some code to look at...

One difference between nearest-neighbour and distance is that NN is a case of comparing 1 object to N other objects, whereas distance (in general) is a case of comparing N objects to M objects. This makes the B-and-B R-tree traversal a bit more complex. I think I have this figured out, though - I'm looking forward to implementing it and seeing how efficient it is in practice.