R-tree bounds, spatial relations and envelope

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
Report Content as Inappropriate

R-tree bounds, spatial relations and envelope

Adam Wulkiewicz

Recently I've run into an issue that is related for instance with the R-tree. It's caused by the fact that the if the floating-point coordinates are used, spatial relations of OGC-Geometries like Point or Segment are checked WRT machine epsilon but in the case of Boxes the coordinates are compared without taking epsilon into account. This of course makes the Boxes checks faster. However it also means that if a Box is a "minimum bounding box/rectangle", i.e. min/max values are exactly the min/max coordinates of the corresponding bounded geometries, the spatial relation may give unexpected results. For instance:

point p(0, 0);
point p_eps(-eps/2, -eps/2);
box b(point(0, 0), point(1, 1));

bg::equals(p, p_eps) == true;
bg::intersects(b, p) == true;
// but
bg::intersects(b, p_eps) == false;

It looks more or less like this:

And this is true for any non-Box geometry, e.g. for two Polygons and their bounding boxes the results may be similar to the ones showed above. It could be observed in the R-tree or any other space partitioning data structure or algorithm using Boxes. In particular, it means that in some very specific cases R-tree spatial queries may give different results than spatial relations checks done without the spatial index.

Have anyone noticed this problem?
Do you think that it is a problem that needs to be fixed?
Now that I've described this problem, do you find some specific case in your own code that could be affected by it?
I'm asking because it may be acceptible to prune slightly more nodes/values (so not interesting space) even if the result wouldn't be prefectly "correct".

To improve/fix it we could check the spatial relations for Boxes WRT epsilon. However this slows down many algorithms in the library, e.g. buffer() by 25% and the rtree spatial queries 2x or more which I think is unacceptable.

Another thing that could be done is to enlarge the bounding boxes WRT epsilon, in a way consistent with the spatial equality checks used in the library. This solves the problem and there is no direct performance loss. I'm saying direct because this would cause that slightly greater number of nodes could be traversed and values returned simply because this would be the correct behavior. Though the enlargement is so small that this shouldn't be noticeable, at least I haven't noticed any change. So it seems that this is the way to go. The catch is that the rtree bounds (bounding box) would no longer have the min/max coordinates exactly the same as the min/max coordinates of represented Points or Segments stored in the rtree.

Btw, there is no need to enlarge the nodes' Boxes if value Boxes are stored because the spatial relations are checked the same way for both. If the value Boxes represented some non-Box Geometries like Polygons they could probably be enlarged before storing them in the index. Then the R-tree would simply use those enlarged Boxes.

Do you think that the slightly enlarged R-tree bounds for Point and Segments could cause some other problems?
Would you prefer if it was possible to enable or disable this feature?
Would you prefer to have it enabled or disabled by default?


Geometry mailing list
[hidden email]