Hi,
I've got some thoughts about the query interface I want to share with you. We previously talked about using Boost.Range syntax in the spatial queries: rtree  filter1(...)  filter2(...)  ... 1. Geometrical relationship filters Some filters may be taken directly from ggl algorithms: within(g) covered_by(g) intersects(g) or intersected_by or intersecting overlaps(g) or overlapped_by or overlapping t  within(g)  get values within some geometry t  intersects(g)  get values intersecting some geometry Note that some filters may refer to the geometry which is passed or to the values stored in the rtree. Someone may want to get values within some geometry or to get values containing some geometry. He may just use previously described filters and connect them (described further) but then e.g. 2 operations must be done instead of 1. And for some queries this couldn't be possible (with operator only). So there could be:  additional filters: t  outside(g)  filters generated by some operator: t  ~within(g)  confusing  object passed before in the sequence: t  inv  within(g)  even more confusing  object passed with filter: t  inv(within(g)) In addition to this every filter can have it's negation generated by operator! !within(g) == not_within(g) !intersects(g) == not_intersects(g) Eventually there could be additional filters e.g. "get objects with 0 coordinate greather than some other object" coords<0, more>(g) etc. It would be nice to have e.g. "get points on the right of the linestring" 2. K nearest neighbour filters nearest(g)  get nearest value to the geometry g nearest(g, 5)  get nearest 5 values to the geometry g Folowing filters describes geometrical relationship but they'd probably be used with NN filter. further(g, comp_dist1) closer(g, comp_dist2) range(g, comp_dist1, comp_dist2) In addition there could be k furthest neighbor filter: furthest(g) == !nearest(g) 3. Filters connection 3.1. operator "get values within geometry g1 AND not intersecting geometry g2" t  within(g1)  !intersects(g2) "get 5 nearest to g2 values overlapping g1" which means "get values overlapping g1 AND 5 nearest to g2" t  overlaps(g1)  nearest(g2, 5) Using this operator filters may be processed sequentially from left to right. Like in streams there may be manipulators used e.g. "get objects which centroids are nearest to the geometry g1" could look like this: t  centroid  nearest(g1, 5) instead of e.g. t  nearest<centroid_tag>(g1, 5) or t  nearest(g1, 5, centroid) or t  nearest(g1, 5, some_strategy) but t  centroid::nearest(g1, 5) looks nice. But, queries with logical alternatives can't be described this way e.g. "get objects within g1 OR intersecting g2". This kind of query performed by one tree traversal could be faster than two separate queries. Moreover one probably will have to check if there are repeated values in both results. Btw pipe operator is used here as a conjunction and this is c++ binary alternative. 3.2. expressions By using more operators we lose sequential nature of the querry but we get more powerful expressions which may be passed to the query e.g. "get objects within g1 AND not intersecting g2" t[within(g1) & !intersects(g2)] "get objects within g1 OR intersecting g2" t[within(g1)  intersects(g2)] "get points closest to the linestring in the areas intersecting geometry g1 or within geometry g2" t[(intersects(g1)  within(g2)) & nearest(ls, 10)] "get boxes which centroids are closest to the g3 in the areas intersecting geometry g1 or within geometry g2" e.g. t[(intersects(g1)  within(g2)) & centroid(nearest(g3, 10))] or t[(intersects(g1)  within(g2)) & centroid::nearest(g3, 10)] But I don't know if all future expresions would be easy to implement e.g. expressions with some number of KNN filters alternatives would probably gather results in appropriate number of containers because objects must be sorted by distance for each KNN filter. Then results would probably be connected (checks for repeating objects). But this is probably possible. Or there could be only one KNN search per query but this isn't nice. Thoughts? Regards, Adam _______________________________________________ ggl mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/ggl 
Hi Adam,
Thanks for your proposal. On 3072011 13:45, Adam Wulkiewicz wrote: > > 1. Geometrical relationship filters > > Some filters may be taken directly from ggl algorithms: > > within(g) > covered_by(g) > intersects(g) or intersected_by or intersecting > overlaps(g) or overlapped_by or overlapping > > t  within(g)  get values within some geometry > t  intersects(g)  get values intersecting some geometry Basically I like the clean syntax. These range adaptors, having the same name, should be in another namespace and if users hoist both namespaces with the usingclause, it will be ambiguous. So namespace will be necessary here (or in the normal function), making it a bit longer. What if we don't do this and just use the "filtered" adaptor defined by Boost.Range? t  filtered(within(g)) The advantage is that AND, OR and NOT are much easier to handle (if people write their own predicate  but maybe Phoenix or Lambda can be of help here, didn't check this now). However, the basic requirement is that queries like this are using the spatial index. So I don't know how the filtered would use that index. I assume that "within" would do that (did you already did some work here?). So we indeed need "something which is a Boost.Range adaptor, uses the spatial index, and is capable to do userdefined queries at compiletime". We might consider adding a "spatially_filtered" adaptor which acts like this. > > Note that some filters may refer to the geometry which is passed or to > the values stored in the rtree. Someone may want to get values within > some geometry or to get values containing some geometry. He may just > use previously described filters and connect them (described further) > but then e.g. 2 operations must be done instead of 1. And for some > queries this couldn't be possible (with operator only). So there > could be: > >  additional filters: t  outside(g) Outside is actually not (always) the same as ! within, because if a geometry partly intersects another, it is ! within and also ! disjoint. Outside might be renamed to "disjoint" here. >  filters generated by some operator: t  ~within(g)  confusing >  object passed before in the sequence: t  inv  within(g)  even > more confusing >  object passed with filter: t  inv(within(g)) > > In addition to this every filter can have it's negation generated by > operator! > > !within(g) == not_within(g) > !intersects(g) == not_intersects(g) > > Eventually there could be additional filters e.g. "get objects with 0 > coordinate greather than some other object" coords<0, more>(g) etc. It > would be nice to have e.g. "get points on the right of the linestring" Basically I like the idea of ! within(g), it is very readable. But it is, if I'm right, different from Boost.Range adaptors which cannot be negated like this. The Boost.Range adaptor overloads operator to get its effect, you additionally need to overload operator! for a range adaptor as well. I didn't try but don't know if this is possible (probably it is). Besides that we have to be careful what is exactly the negation, see comment above. > > 2. K nearest neighbour filters > > nearest(g)  get nearest value to the geometry g > nearest(g, 5)  get nearest 5 values to the geometry g What will the first line return? All values in order of distance? > > Folowing filters describes geometrical relationship but they'd > probably be used with NN filter. > > further(g, comp_dist1) > closer(g, comp_dist2) > range(g, comp_dist1, comp_dist2) > > In addition there could be k furthest neighbor filter: > furthest(g) == !nearest(g) The last line I don't understand... What if we have only 4 elements and we want the nearest(g, 5). It will return 4 values. The furthest will also return 4 values, so it is not the negation. > > 3. Filters connection > > 3.1. operator > > "get values within geometry g1 AND not intersecting geometry g2" > t  within(g1)  !intersects(g2) Concatenation is possible in Boost.Range adaptors and I like it. So it will work automatically (besides the ! syntax). It might not be the most efficient in all cases because they are filtered twice, so the case you describe (g1 AND not g2) is implemented differently (though the effect is the same). You mention that also below. And how do both make use of the spatial index? Therefore, the filtered predicate might be better, being able to do this in one step. > > "get 5 nearest to g2 values overlapping g1" which means I don't understand this example, because something overlapping something else does have a distance of 0.0, so nearest is more or less undertermined here. With the centroid below it is more clear. > "get values overlapping g1 AND 5 nearest to g2" > t  overlaps(g1)  nearest(g2, 5) > > Using this operator filters may be processed sequentially from left to > right. Like in streams there may be manipulators used e.g. "get > objects which centroids are nearest to the geometry g1" could look > like this: > > t  centroid  nearest(g1, 5) Do I understand here that the centroid is a range adaptor, which returns the centroids from all geometries present in the spatial index? In that case I agree with this syntax. > > instead of e.g. > > t  nearest<centroid_tag>(g1, 5) or > t  nearest(g1, 5, centroid) or > t  nearest(g1, 5, some_strategy) > > but > > t  centroid::nearest(g1, 5) > > looks nice. Also possible but might be confusing  it mixes namespace with behaviour. In this case I would prefer centroid_nearest (which is IMO looking good) > > But, queries with logical alternatives can't be described this way > e.g. "get objects within g1 OR intersecting g2". This kind of query > performed by one tree traversal could be faster than two separate > queries. Moreover one probably will have to check if there are > repeated values in both results. So here the spatially_filtered adaptor will be of great help. > > Btw pipe operator is used here as a conjunction and this is c++ > binary alternative. > > 3.2. expressions > > By using more operators we lose sequential nature of the querry but we > get more powerful expressions which may be passed to the query e.g. > > "get objects within g1 AND not intersecting g2" > t[within(g1) & !intersects(g2)] > > "get objects within g1 OR intersecting g2" > t[within(g1)  intersects(g2)] > > "get points closest to the linestring in the areas intersecting > geometry g1 or within geometry g2" > t[(intersects(g1)  within(g2)) & nearest(ls, 10)] > > "get boxes which centroids are closest to the g3 in the areas > intersecting geometry g1 or within geometry g2" e.g. > t[(intersects(g1)  within(g2)) & centroid(nearest(g3, 10))] or > t[(intersects(g1)  within(g2)) & centroid::nearest(g3, 10)] Yes, it is looking nice, and I would suggest to verify if we can use Boost.Range filtered in combination with Phoenix to get this effect. Further the & should be &&, and the  should be , IMO. So something like t  filtered(within(_1) && ! intersects(_1)) (fictitious) > > But I don't know if all future expresions would be easy to implement > e.g. expressions with some number of KNN filters alternatives would > probably gather results in appropriate number of containers because > objects must be sorted by distance for each KNN filter. Then results > would probably be connected (checks for repeating objects). But this > is probably possible. > > Or there could be only one KNN search per query but this isn't nice. So, in general, I like the idea of querying spatial indexes. I would suggest to see how Phoenix helps us by defining spatial query combinations, while making use of the index as well. It is very interesting and it would be great if it would all work clear and efficient like that. Regards, Barend _______________________________________________ ggl mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/ggl 
Barend Gehrels wrote:
>> t  within(g)  get values within some geometry >> t  intersects(g)  get values intersecting some geometry > > > Basically I like the clean syntax. These range adaptors, having the same > name, should be in another namespace and if users hoist both namespaces > with the usingclause, it will be ambiguous. So namespace will be > necessary here (or in the normal function), making it a bit longer. Of course you're right. I assumed that everything is in the geometry::index namespace. So some algorithms from geometry corresponds to the filters in geometry::index. > What if we don't do this and just use the "filtered" adaptor defined by > Boost.Range? > t  filtered(within(g)) It's ok for me. There also may be some standard filters implemented: index::within_filtered, index::intersects_filtered, index::nearest_filtered. intersects_filtered is implemented already as spatially_filtered. Btw note that geometry::index filters should be the first ones in the sequence of filters aplied to the container. t  some_not_index_filter()  index::filtered(...) won't work. > > The advantage is that AND, OR and NOT are much easier to handle (if > people write their own predicate  but maybe Phoenix or Lambda can be of > help here, didn't check this now). Or Proto if it couldn't be done with Phoenix. > However, the basic requirement is that queries like this are using the > spatial index. So I don't know how the filtered would use that index. I > assume that "within" would do that (did you already did some work > here?). So we indeed need "something which is a Boost.Range adaptor, > uses the spatial index, and is capable to do userdefined queries at > compiletime". > > We might consider adding a "spatially_filtered" adaptor which acts like > this. spatial_filter<...> is implemented as mentioned before (intersects). It's just generated by operator(index::rtree<...> const&, index::filters::spatially_filtered<...> const&) Currently the querry is performed in the spatial_filter<...>'s constructor by calling rtree<...>::find(...) method. >> Note that some filters may refer to the geometry which is passed or to >> the values stored in the rtree. Someone may want to get values within >> some geometry or to get values containing some geometry. He may just >> use previously described filters and connect them (described further) >> but then e.g. 2 operations must be done instead of 1. And for some >> queries this couldn't be possible (with operator only). So there >> could be: >> >>  additional filters: t  outside(g) > > > Outside is actually not (always) the same as ! within, because if a > geometry partly intersects another, it is ! within and also ! disjoint. > Outside might be renamed to "disjoint" here. Got it. I forgot about disjoint. Any other algorithms which could be used? >>  filters generated by some operator: t  ~within(g)  confusing >>  object passed before in the sequence: t  inv  within(g)  even >> more confusing >>  object passed with filter: t  inv(within(g)) >> >> In addition to this every filter can have it's negation generated by >> operator! >> >> !within(g) == not_within(g) >> !intersects(g) == not_intersects(g) >> >> Eventually there could be additional filters e.g. "get objects with 0 >> coordinate greather than some other object" coords<0, more>(g) etc. It >> would be nice to have e.g. "get points on the right of the linestring" > > > Basically I like the idea of ! within(g), it is very readable. > > But it is, if I'm right, different from Boost.Range adaptors which > cannot be negated like this. The Boost.Range adaptor overloads operator > to get its effect, you additionally need to overload operator! for a > range adaptor as well. I didn't try but don't know if this is possible > (probably it is). > > Besides that we have to be careful what is exactly the negation, see > comment above. In the case of using ranges, just another type of range would be generated by operator! e.g. index::ranges::not_within. And the meaning should be exactly as the meaning of !geometry::within(). Just like in the case of using the loop, checking all elements and returning values meeting !within(some_geometry) requirement. >> >> 2. K nearest neighbour filters >> >> nearest(g)  get nearest value to the geometry g >> nearest(g, 5)  get nearest 5 values to the geometry g > > What will the first line return? All values in order of distance? One value nearest to the geometry. If we have container of points and the geometry is a point it would be the closest point. Internally there would be algorithms from geometry used so if there was comparable_distance()/within()/intersects()/etc. implemented for Indexable and Geometry the query would compile, otherwise it wouldn't. >> Folowing filters describes geometrical relationship but they'd >> probably be used with NN filter. >> >> further(g, comp_dist1) >> closer(g, comp_dist2) >> range(g, comp_dist1, comp_dist2) >> >> In addition there could be k furthest neighbor filter: >> furthest(g) == !nearest(g) > > The last line I don't understand... What if we have only 4 elements and > we want the nearest(g, 5). It will return 4 values. The furthest will > also return 4 values, so it is not the negation. operator! was a suggestion. If we want to return sorted ranges for nearest() and furthest() then nearest() will generate range sorted from closest object and furthest() from furthest. >> 3. Filters connection >> >> 3.1. operator >> >> "get values within geometry g1 AND not intersecting geometry g2" >> t  within(g1)  !intersects(g2) > > Concatenation is possible in Boost.Range adaptors and I like it. So it > will work automatically (besides the ! syntax). It might not be the most > efficient in all cases because they are filtered twice, so the case you > describe (g1 AND not g2) is implemented differently (though the effect > is the same). You mention that also below. And how do both make use of > the spatial index? > Therefore, the filtered predicate might be better, being able to do this > in one step. As you noticed, in order to work efficiently, filters from g::index shouldn't be processed exactly like filters in Boost.Range  transparently. They should be used only with the rtree object or other filter from g::index. operators  should generate one compound object (or an expression if you like) which would be passed to the rtree::find() e.g. when the begin() method of this object (which is also a range) is called. >> >> "get 5 nearest to g2 values overlapping g1" which means > > I don't understand this example, because something overlapping something > else does have a distance of 0.0, so nearest is more or less > undertermined here. With the centroid below it is more clear. I assumed that nearest just uses comparable_distance() so it will work if it is implemented for geometries. Moreover I assumed that comparable_distance() returns closest distance between e.g. geometries' perimeters or perimeter and some point etc. If we have a river(linestring) in a country (polygon) and rtree containing some indexables(e.g. POI) on the entire continent this query would be something like: "find 5 indexables nearest to the river inside some country". Btw distances between boxes and other geometries could be useful in the future. >> "get values overlapping g1 AND 5 nearest to g2" >> t  overlaps(g1)  nearest(g2, 5) >> >> Using this operator filters may be processed sequentially from left to >> right. Like in streams there may be manipulators used e.g. "get >> objects which centroids are nearest to the geometry g1" could look >> like this: >> >> t  centroid  nearest(g1, 5) > > Do I understand here that the centroid is a range adaptor, which returns > the centroids from all geometries present in the spatial index? In that > case I agree with this syntax. It would be a part of the expression but this syntax probably won't be used. >> instead of e.g. >> >> t  nearest<centroid_tag>(g1, 5) or >> t  nearest(g1, 5, centroid) or >> t  nearest(g1, 5, some_strategy) >> >> but >> >> t  centroid::nearest(g1, 5) >> >> looks nice. > > Also possible but might be confusing  it mixes namespace with > behaviour. In this case I would prefer centroid_nearest (which is IMO > looking good) Ok. >> >> But, queries with logical alternatives can't be described this way >> e.g. "get objects within g1 OR intersecting g2". This kind of query >> performed by one tree traversal could be faster than two separate >> queries. Moreover one probably will have to check if there are >> repeated values in both results. > > So here the spatially_filtered adaptor will be of great help. > The expression may be passed to some function or operator t  index::filtered(Expr)  some_filter() t  some_filter()  index::filtered(Expr) // Error t  (Expr)  some_filter() t  some_filter()  (Expr) // Error t[Expr]  some_filter() index::filter(t, Expr)  some_filter() index::find(t, Expr)  some_filter() >> >> Btw pipe operator is used here as a conjunction and this is c++ >> binary alternative. >> >> 3.2. expressions >> >> By using more operators we lose sequential nature of the querry but we >> get more powerful expressions which may be passed to the query e.g. >> >> "get objects within g1 AND not intersecting g2" >> t[within(g1) & !intersects(g2)] >> >> "get objects within g1 OR intersecting g2" >> t[within(g1)  intersects(g2)] >> >> "get points closest to the linestring in the areas intersecting >> geometry g1 or within geometry g2" >> t[(intersects(g1)  within(g2)) & nearest(ls, 10)] >> >> "get boxes which centroids are closest to the g3 in the areas >> intersecting geometry g1 or within geometry g2" e.g. >> t[(intersects(g1)  within(g2)) & centroid(nearest(g3, 10))] or >> t[(intersects(g1)  within(g2)) & centroid::nearest(g3, 10)] > > Yes, it is looking nice, and I would suggest to verify if we can use > Boost.Range filtered in combination with Phoenix to get this effect. > Further the & should be &&, and the  should be , IMO. > > So something like t  filtered(within(_1) && ! intersects(_1)) > (fictitious) Have you had in mind something like this t  filtered(within(_1, g1) && ! intersects(_1, g2)) ? It would be nice because one might write within(_1, g1) or within(g1, _1). There would be no additional names. Nodes and values must be handled differently while traversing. There would be different checks performed for each but it should be doable. >> >> But I don't know if all future expresions would be easy to implement >> e.g. expressions with some number of KNN filters alternatives would >> probably gather results in appropriate number of containers because >> objects must be sorted by distance for each KNN filter. Then results >> would probably be connected (checks for repeating objects). But this >> is probably possible. >> >> Or there could be only one KNN search per query but this isn't nice. > > So, in general, I like the idea of querying spatial indexes. I would > suggest to see how Phoenix helps us by defining spatial query > combinations, while making use of the index as well. It is very > interesting and it would be great if it would all work clear and > efficient like that. I agree. Do you think that some simple queries are needed too? E.g. t  intersects_filtered(g) t  within_filtered(g) t  nearest_filtered(p, 5) What other types of queries besides geometryrelated and knn would you like to have? Regards, Adam _______________________________________________ ggl mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/ggl 
hi:
1. IIRC, ggl still miss functor interfaces for predicates to use with boost.range. something like: template<class G1> struct Predicator { G1 const& g1; Predicator(G1 const& g) : g1(g) {} template<class G2> bool operator()(G2 const& g2) const { return predicate(g1, g2); } }; such functor may be used with both boost.range and rtee; 2. rtree may be consider as either an index structure or a boost.range: while using with some predicates(e.g. equals/intersects/overlaps/within), rtree works faster as an index structure; while using with other predicates, rtree may be just taken as a boost.range. so: "t  index::filtered(within(g))" and "t  filtered(within(g))" give the same result but the previous one could be much more faster. as for chained case, t  pred1()  pred2() ...  predN() Will it return a new rtree for index::filtered()? Well, I think this could be less efficient for most use case, and not work well with those predicates that rtree not good at. I'd prefer simply return a sequential range at the first place. so : only the first "t  index::filtered(pred(g))" makes use of index facility. The followed adaptors will just deal with normal sequential range; Adam Wulkiewicz wrote: > Barend Gehrels wrote: > >>> t  within(g)  get values within some geometry >>> t  intersects(g)  get values intersecting some geometry >> >> >> Basically I like the clean syntax. These range adaptors, having the same >> name, should be in another namespace and if users hoist both namespaces >> with the usingclause, it will be ambiguous. So namespace will be >> necessary here (or in the normal function), making it a bit longer. > > Of course you're right. I assumed that everything is in the > geometry::index namespace. So some algorithms from geometry corresponds to > the filters in geometry::index. > >> What if we don't do this and just use the "filtered" adaptor defined by >> Boost.Range? >> t  filtered(within(g)) > > It's ok for me. There also may be some standard filters implemented: > index::within_filtered, index::intersects_filtered, > index::nearest_filtered. intersects_filtered is implemented already as > spatially_filtered. > > Btw note that geometry::index filters should be the first ones in the > sequence of filters aplied to the container. > > t  some_not_index_filter()  index::filtered(...) > > won't work. > >> >> The advantage is that AND, OR and NOT are much easier to handle (if >> people write their own predicate  but maybe Phoenix or Lambda can be of >> help here, didn't check this now). > > Or Proto if it couldn't be done with Phoenix. > >> However, the basic requirement is that queries like this are using the >> spatial index. So I don't know how the filtered would use that index. I >> assume that "within" would do that (did you already did some work >> here?). So we indeed need "something which is a Boost.Range adaptor, >> uses the spatial index, and is capable to do userdefined queries at >> compiletime". >> >> We might consider adding a "spatially_filtered" adaptor which acts like >> this. > > spatial_filter<...> is implemented as mentioned before (intersects). It's > just generated by > > operator(index::rtree<...> const&, > index::filters::spatially_filtered<...> const&) > > Currently the querry is performed in the spatial_filter<...>'s constructor > by calling rtree<...>::find(...) method. > >>> Note that some filters may refer to the geometry which is passed or to >>> the values stored in the rtree. Someone may want to get values within >>> some geometry or to get values containing some geometry. He may just >>> use previously described filters and connect them (described further) >>> but then e.g. 2 operations must be done instead of 1. And for some >>> queries this couldn't be possible (with operator only). So there >>> could be: >>> >>>  additional filters: t  outside(g) >> >> >> Outside is actually not (always) the same as ! within, because if a >> geometry partly intersects another, it is ! within and also ! disjoint. >> Outside might be renamed to "disjoint" here. > > Got it. I forgot about disjoint. Any other algorithms which could be used? > >>>  filters generated by some operator: t  ~within(g)  confusing >>>  object passed before in the sequence: t  inv  within(g)  even >>> more confusing >>>  object passed with filter: t  inv(within(g)) >>> >>> In addition to this every filter can have it's negation generated by >>> operator! >>> >>> !within(g) == not_within(g) >>> !intersects(g) == not_intersects(g) >>> >>> Eventually there could be additional filters e.g. "get objects with 0 >>> coordinate greather than some other object" coords<0, more>(g) etc. It >>> would be nice to have e.g. "get points on the right of the linestring" >> >> >> Basically I like the idea of ! within(g), it is very readable. >> >> But it is, if I'm right, different from Boost.Range adaptors which >> cannot be negated like this. The Boost.Range adaptor overloads operator >> to get its effect, you additionally need to overload operator! for a >> range adaptor as well. I didn't try but don't know if this is possible >> (probably it is). >> >> Besides that we have to be careful what is exactly the negation, see >> comment above. > > In the case of using ranges, just another type of range would be generated > by operator! e.g. index::ranges::not_within. And the meaning should be > exactly as the meaning of !geometry::within(). Just like in the case of > using the loop, checking all elements and returning values meeting > !within(some_geometry) requirement. > >>> >>> 2. K nearest neighbour filters >>> >>> nearest(g)  get nearest value to the geometry g >>> nearest(g, 5)  get nearest 5 values to the geometry g >> >> What will the first line return? All values in order of distance? > > One value nearest to the geometry. If we have container of points and the > geometry is a point it would be the closest point. Internally there would > be algorithms from geometry used so if there was > comparable_distance()/within()/intersects()/etc. implemented for Indexable > and Geometry the query would compile, otherwise it wouldn't. > >>> Folowing filters describes geometrical relationship but they'd >>> probably be used with NN filter. >>> >>> further(g, comp_dist1) >>> closer(g, comp_dist2) >>> range(g, comp_dist1, comp_dist2) >>> >>> In addition there could be k furthest neighbor filter: >>> furthest(g) == !nearest(g) >> >> The last line I don't understand... What if we have only 4 elements and >> we want the nearest(g, 5). It will return 4 values. The furthest will >> also return 4 values, so it is not the negation. > > operator! was a suggestion. If we want to return sorted ranges for > nearest() and furthest() then nearest() will generate range sorted from > closest object and furthest() from furthest. > >>> 3. Filters connection >>> >>> 3.1. operator >>> >>> "get values within geometry g1 AND not intersecting geometry g2" >>> t  within(g1)  !intersects(g2) >> >> Concatenation is possible in Boost.Range adaptors and I like it. So it >> will work automatically (besides the ! syntax). It might not be the most >> efficient in all cases because they are filtered twice, so the case you >> describe (g1 AND not g2) is implemented differently (though the effect >> is the same). You mention that also below. And how do both make use of >> the spatial index? >> Therefore, the filtered predicate might be better, being able to do this >> in one step. > > As you noticed, in order to work efficiently, filters from g::index > shouldn't be processed exactly like filters in Boost.Range  > transparently. They should be used only with the rtree object or other > filter from g::index. operators  should generate one compound object (or > an expression if you like) which would be passed to the rtree::find() e.g. > when the begin() method of this object (which is also a range) is called. > >>> >>> "get 5 nearest to g2 values overlapping g1" which means >> >> I don't understand this example, because something overlapping something >> else does have a distance of 0.0, so nearest is more or less >> undertermined here. With the centroid below it is more clear. > > I assumed that nearest just uses comparable_distance() so it will work if > it is implemented for geometries. Moreover I assumed that > comparable_distance() returns closest distance between e.g. geometries' > perimeters or perimeter and some point etc. > > If we have a river(linestring) in a country (polygon) and rtree containing > some indexables(e.g. POI) on the entire continent this query would be > something like: "find 5 indexables nearest to the river inside some > country". > > Btw distances between boxes and other geometries could be useful in the > future. > >>> "get values overlapping g1 AND 5 nearest to g2" >>> t  overlaps(g1)  nearest(g2, 5) >>> >>> Using this operator filters may be processed sequentially from left to >>> right. Like in streams there may be manipulators used e.g. "get >>> objects which centroids are nearest to the geometry g1" could look >>> like this: >>> >>> t  centroid  nearest(g1, 5) >> >> Do I understand here that the centroid is a range adaptor, which returns >> the centroids from all geometries present in the spatial index? In that >> case I agree with this syntax. > > It would be a part of the expression but this syntax probably won't be > used. > >>> instead of e.g. >>> >>> t  nearest<centroid_tag>(g1, 5) or >>> t  nearest(g1, 5, centroid) or >>> t  nearest(g1, 5, some_strategy) >>> >>> but >>> >>> t  centroid::nearest(g1, 5) >>> >>> looks nice. >> >> Also possible but might be confusing  it mixes namespace with >> behaviour. In this case I would prefer centroid_nearest (which is IMO >> looking good) > > Ok. > >>> >>> But, queries with logical alternatives can't be described this way >>> e.g. "get objects within g1 OR intersecting g2". This kind of query >>> performed by one tree traversal could be faster than two separate >>> queries. Moreover one probably will have to check if there are >>> repeated values in both results. >> >> So here the spatially_filtered adaptor will be of great help. >> > > The expression may be passed to some function or operator > > t  index::filtered(Expr)  some_filter() > t  some_filter()  index::filtered(Expr) // Error > > t  (Expr)  some_filter() > t  some_filter()  (Expr) // Error > > t[Expr]  some_filter() > > index::filter(t, Expr)  some_filter() > index::find(t, Expr)  some_filter() > >>> >>> Btw pipe operator is used here as a conjunction and this is c++ >>> binary alternative. >>> >>> 3.2. expressions >>> >>> By using more operators we lose sequential nature of the querry but we >>> get more powerful expressions which may be passed to the query e.g. >>> >>> "get objects within g1 AND not intersecting g2" >>> t[within(g1) & !intersects(g2)] >>> >>> "get objects within g1 OR intersecting g2" >>> t[within(g1)  intersects(g2)] >>> >>> "get points closest to the linestring in the areas intersecting >>> geometry g1 or within geometry g2" >>> t[(intersects(g1)  within(g2)) & nearest(ls, 10)] >>> >>> "get boxes which centroids are closest to the g3 in the areas >>> intersecting geometry g1 or within geometry g2" e.g. >>> t[(intersects(g1)  within(g2)) & centroid(nearest(g3, 10))] or >>> t[(intersects(g1)  within(g2)) & centroid::nearest(g3, 10)] >> >> Yes, it is looking nice, and I would suggest to verify if we can use >> Boost.Range filtered in combination with Phoenix to get this effect. >> Further the & should be &&, and the  should be , IMO. >> >> So something like t  filtered(within(_1) && ! intersects(_1)) >> (fictitious) > > Have you had in mind something like this > > t  filtered(within(_1, g1) && ! intersects(_1, g2)) ? > > It would be nice because one might write within(_1, g1) or within(g1, _1). > There would be no additional names. > > Nodes and values must be handled differently while traversing. There would > be different checks performed for each but it should be doable. > >>> >>> But I don't know if all future expresions would be easy to implement >>> e.g. expressions with some number of KNN filters alternatives would >>> probably gather results in appropriate number of containers because >>> objects must be sorted by distance for each KNN filter. Then results >>> would probably be connected (checks for repeating objects). But this >>> is probably possible. >>> >>> Or there could be only one KNN search per query but this isn't nice. >> >> So, in general, I like the idea of querying spatial indexes. I would >> suggest to see how Phoenix helps us by defining spatial query >> combinations, while making use of the index as well. It is very >> interesting and it would be great if it would all work clear and >> efficient like that. > > I agree. > > Do you think that some simple queries are needed too? E.g. > > t  intersects_filtered(g) > t  within_filtered(g) > t  nearest_filtered(p, 5) > > What other types of queries besides geometryrelated and knn would you > like to have? > > Regards, > Adam > _______________________________________________ > ggl mailing list > [hidden email] > http://lists.osgeo.org/mailman/listinfo/ggl > > Regards, zsj _______________________________________________ ggl mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/ggl 
feverzsj wrote:
> hi: > > 1. IIRC, ggl still miss functor interfaces for predicates to use with > boost.range. > something like: > > template<class G1> > struct Predicator > { > G1 const& g1; > > Predicator(G1 const& g) : g1(g) {} > > template<class G2> > bool operator()(G2 const& g2) const > { > return predicate(g1, g2); > } > }; > > such functor may be used with both boost.range and rtee; > > > 2. rtree may be consider as either an index structure or a boost.range: > while using with some predicates(e.g. > equals/intersects/overlaps/within), rtree works faster as an index > structure; > while using with other predicates, rtree may be just taken as a > boost.range. > so: "t  index::filtered(within(g))" and "t  filtered(within(g))" give > the same result but the previous one could be much more faster. > > as for chained case, > t  pred1()  pred2() ...  predN() > > Will it return a new rtree for index::filtered()? > Well, I think this could be less efficient for most use case, and not > work well with those predicates that rtree not good at. > I'd prefer simply return a sequential range at the first place. > so : > only the first "t  index::filtered(pred(g))" makes use of index > facility. The followed adaptors will just deal with normal sequential > range; First of all I'd like to clarify. I'll be writing about the original design of the query which probably will be changed. I wanted to use only the syntax of Boost.Range to build a Query/Expression object containing all predicates. This object would be passed to the rtree. So the line tree  index::pred1()  index::pred2()  some_namespace::pred() would compile and produce a range of objects returned by query filtered by some_namespace::pred(). But tree  some_namespace::pred()  index::pred1() wouldn't even compile. I now realize that this may be confusing since everybody thinks that predicates works exactly like filters from Boost.Range but they shouldn't. They should work more like Expressions. About the new design... Using filtered() seems reasonable but I don't know if we need 2 of them. Type of a conteiner on the left of operator should define the type of query. filtered() should take an Basic Expression as a parameter and return it inside a simple wrapper. Next, for different containers/ranges this Basic Expression should be transformed to the right form. E.g. for Geometry.Index container the Basic Expression should be passed to the container and transformed by the container. Rtree will transform it to the different form than Kdtree. For stllike containers Basic Expression will be transformed to the different form, probably even wrapped inside a iterator of the resulting range. Still, this may be confusing because filters will work differently for different sequences. t  filtered(Expr)  something() will produce optimized query but t  something()  filtered(Expr) won't and this is probably not how the user thinks about ranges. I don't even know if spatial index should expose iterators, begin(), end() etc. and compile in the second case. Because everybody thinks: "Oh! Boost.Ranges. I know how it works!" maby it's better to use something else to distinguish between Geometry query and ranges. Function or operator different than. index::filtered(t, Expr)  other_filter() index::query(t, Expr)  other_filter() t[Expr]  other_filter() t(Expr)  other_filter() t >> query(Expr)  other_filter() Still, there is a problem with operators  they may be defined already. E.g. vect[Expr]  other_filter(). Thoughts? About the Expression in the next email. Regards, Adam _______________________________________________ ggl mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/ggl 
Hi Adam,
> > I now realize that this may be confusing since everybody thinks that > predicates works exactly like filters from Boost.Range but they > shouldn't. They should work more like Expressions. Yes, actually I did as well. What is exactly the difference? > (...) > Because everybody thinks: "Oh! Boost.Ranges. I know how it works!" > maby it's better to use something else to distinguish between Geometry > query and ranges. Function or operator different than. Yes, if they are different, I agree. Regards, Barend _______________________________________________ ggl mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/ggl 
Barend Gehrels wrote:
> Hi Adam, > >> >> I now realize that this may be confusing since everybody thinks that >> predicates works exactly like filters from Boost.Range but they >> shouldn't. They should work more like Expressions. > > Yes, actually I did as well. What is exactly the difference? Well it isn't commutative and transparent like filters which just operates on iterators. index::filter (or a sequence of index:: predicates) must be passed just after the tree object and it doesn't operate on iterators but calls some searching routine, stores elements inside some container (e.g. std::deque) and exposes this container as a range. Think of it like if it was std::map. You may call find() method and find a value in a fast way. But you may treat it like a range, get some iterator, iterate over all elements and filter some of them. This isn't effective. >> (...) >> Because everybody thinks: "Oh! Boost.Ranges. I know how it works!" >> maby it's better to use something else to distinguish between Geometry >> query and ranges. Function or operator different than. > > Yes, if they are different, I agree. It is a matter of choice. How the user should pass the expression or sequence of predicates or wathever to the spatial index? There are some examples in the previous email. Regards, Adam _______________________________________________ ggl mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/ggl 
Hi Adam,
On 2282011 23:39, Adam Wulkiewicz wrote: > Barend Gehrels wrote: >> Hi Adam, >> >>> >>> I now realize that this may be confusing since everybody thinks that >>> predicates works exactly like filters from Boost.Range but they >>> shouldn't. They should work more like Expressions. >> >> Yes, actually I did as well. What is exactly the difference? > > Well it isn't commutative and transparent like filters which just > operates on iterators. index::filter (or a sequence of index:: > predicates) must be passed just after the tree object and it doesn't > operate on iterators but calls some searching routine, stores elements > inside some container (e.g. std::deque) and exposes this container as > a range. Is this not what we want? The filter should be passed just after the tree object, seems OK to me, and exposes the output as a range. I understand that after (and before) this there were proposals to pass other range adaptoprs after, trying to combine them, which makes the implementation (and interface) probably much more troublesome. We must be careful to keep it simple. I agree with Luke's remark about lets not create a DSEL, and see also the answer of feverzsj, (982011 15:57) . I will reply on that now. Regards, Barend _______________________________________________ ggl mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/ggl 
In reply to this post by feverzsj
Hi,
On 982011 15:57, feverzsj wrote: > hi: > > 1. IIRC, ggl still miss functor interfaces for predicates to use with > boost.range. > something like: > > template<class G1> > struct Predicator > { > G1 const& g1; > > Predicator(G1 const& g) : g1(g) {} > > template<class G2> > bool operator()(G2 const& g2) const > { > return predicate(g1, g2); > } > }; > > such functor may be used with both boost.range and rtee; Yes, that is right. They are not yet there. We've now defined most part of the interface in the form of free functions, which which this can be built but it is not there. > > > 2. rtree may be consider as either an index structure or a boost.range: > while using with some predicates(e.g. > equals/intersects/overlaps/within), rtree works faster as an index > structure; > while using with other predicates, rtree may be just taken as a > boost.range. > so: "t  index::filtered(within(g))" and "t  filtered(within(g))" > give the same result but the previous one could be much more faster. > > as for chained case, > t  pred1()  pred2() ...  predN() > > Will it return a new rtree for index::filtered()? > Well, I think this could be less efficient for most use case, and not > work well with those predicates that rtree not good at. > I'd prefer simply return a sequential range at the first place. > so : > only the first "t  index::filtered(pred(g))" makes use of index > facility. The followed adaptors will just deal with normal sequential > range; I agree with this. Regards, Barend _______________________________________________ ggl mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/ggl 
In reply to this post by Barend
Barend Gehrels wrote:
> What if we don't do this and just use the "filtered" adaptor defined by > Boost.Range? > t  filtered(within(g)) > > The advantage is that AND, OR and NOT are much easier to handle (if > people write their own predicate  but maybe Phoenix or Lambda can be of > help here, didn't check this now). > > However, the basic requirement is that queries like this are using the > spatial index. So I don't know how the filtered would use that index. I > assume that "within" would do that (did you already did some work > here?). So we indeed need "something which is a Boost.Range adaptor, > uses the spatial index, and is capable to do userdefined queries at > compiletime". > > We might consider adding a "spatially_filtered" adaptor which acts like > this. In order to be handled efficiently these predicates must be passed into the query algorithm. Then, different things are done for nodes and values (in the rtree). Different things will be done for rtree and for kdtree. But we may of course add our own adaptors and in fact this is what I've done. > Outside is actually not (always) the same as ! within, because if a > geometry partly intersects another, it is ! within and also ! disjoint. > Outside might be renamed to "disjoint" here. ... > Basically I like the idea of ! within(g), it is very readable. > > But it is, if I'm right, different from Boost.Range adaptors which > cannot be negated like this. The Boost.Range adaptor overloads operator > to get its effect, you additionally need to overload operator! for a > range adaptor as well. I didn't try but don't know if this is possible > (probably it is). > > Besides that we have to be careful what is exactly the negation, see > comment above. The user is able to pass some number of predicates so will be able to describe what he want using predicates corresponding to ggl algorithms. For now it's done by passing std::pair<Pred1, Pred2> or boost::tuple<Pred1, Pred2, ...> instead of single predicate to the range adaptor. Currently there are 2 ways of inserting/removing elements to/from the rtree and 3 ways of querying: namespace bgi = boost::geometry::index; typedef std::pair<Box, int> val_t; bgi::rtree<val_t, bgi::linear<32, 8> > tree; Native interface: // insert/remove tree.insert(std::make_pair(Box(...), 0)); tree.remove(std::make_pair(Box(...), 0)); std::vector<val_t> result; // multiple results val_t result_val; // one result size_t f = 0; // number of results // query // this uses intersects() f = tree.query(Box(...), std::back_inserter(result)); f = tree.query(bgi::intersects(Box()), std::back_inserter(result)); f = tree.query(bgi::within(Box()), std::back_inserter(result)); f = tree.query(bgi::overlaps(Box()), std::back_inserter(result)); f = tree.query(bgi::covered_by(Box()), std::back_inserter(result)); f = tree.query( boost::make_tuple( bgi::covered_by(Box()) , bgi::intersects(Box()) , bgi::overlaps(Box()) ) , std::back_inserter(result)); // nearest // nearest neighbor f = tree.nearest(Point(...), result); // nearest neighbor with predicates f = tree.nearest(Point(...), bgi::within(Box()), result); // k nearest neighbors f = tree.nearest(Point(...), 10, std::back_inserter(result)); // k nearest neighbors with predicates f = tree.nearest(Point(...), 10 , bgi::within(Box()), std::back_inserter(result)); Function interface: bgi::insert(tree, std::make_pair(Box(...), 0)); bgi::remove(tree, std::make_pair(Box(...), 0)); f = bgi::query(tree, Box(...), std::back_inserter(result)); //etc. f = bgi::nearest(tree, Point(...), 10, std::back_inserter(result)); //etc. Ranges interface: tree  bgi::query_filtered(Box(...)) tree  bgi::query_filtered(bgi::within(Box())) tree  bgi::nearest_filtered(Point(), 10, bgi::within(Box())) tree  bgi::nearest_filtered(Point(), 10) In the case of ranges, operator returns bgi::query_filter<Index> or bgi::nearest_filter<Index> which wraps std::vector<val_t> and performs the query in the constructor. Thoughts? Source code in https://svn.boost.org/svn/boost/sandboxbranches/geometry/index Some tests may not work since I changed some names. additional_sizes_and_times.cpp and additional_glut_vis.cpp works for sure. In order to use additional_sizes_and_times.cpp you should have config.txt file with at least 3 numbers e.g. 1000000 100000 100000 Regards, Adam Wulkiewicz _______________________________________________ ggl mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/ggl 
Adam Wulkiewicz wrote:
... > // nearest > > // nearest neighbor > f = tree.nearest(Point(...), result); > // nearest neighbor with predicates > f = tree.nearest(Point(...), bgi::within(Box()), result); > // k nearest neighbors > f = tree.nearest(Point(...), 10, std::back_inserter(result)); > // k nearest neighbors with predicates > f = tree.nearest(Point(...), 10 > , bgi::within(Box()), std::back_inserter(result)); ... Small improvement... Now instead of just passing Point as the first parameter one may pass DistancesPredicates. The user may describe which points of the box will be used in distances calculation. Distances used in searching and testing for being inside some range may be defined separately. namespace bgi = boost::geometry::index; tree.nearest( bgi::bounded( bgi::centroid(Point(...)), bgi::far(10), bgi::near(20)), 10, result); This query will return 10 boxes which centroids are closest to the Point(...). The distance to the furthest box's point will be bigger than 10 and the distance to the closest box's point will be smaller than 20. So this will return 10 boxes with smallest distance between the Point and boxes' centroids intersecting a ring with inner radius 10 and outer 20. Other distances predicates may be or are generated by: PointRelation  default bgi::unbounded(PointRelation) bgi::min_bounded(PointRelation, MinRelation) bgi::max_bounded(PointRelation, MaxRelation) bgi::bounded(PointRelation, MinRelation, MaxRelation) Relations may be or are generated by: Point/Number  depends on relation type, near relation is used by default bgi::near(Point/Number) bgi::far(Point/Number) bgi::centroid(Point/Number) I hope it isn't too complicated. Regards, Adam _______________________________________________ ggl mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/ggl 
In reply to this post by Adam Wulkiewicz
Adam Wulkiewicz wrote:
... > // query > > // this uses intersects() > f = tree.query(Box(...), std::back_inserter(result)); > f = tree.query(bgi::intersects(Box()), std::back_inserter(result)); > f = tree.query(bgi::within(Box()), std::back_inserter(result)); > f = tree.query(bgi::overlaps(Box()), std::back_inserter(result)); > f = tree.query(bgi::covered_by(Box()), std::back_inserter(result)); > > f = tree.query( > boost::make_tuple( > bgi::covered_by(Box()) > , bgi::intersects(Box()) > , bgi::overlaps(Box()) > ) > , std::back_inserter(result)); ... Another small improvement. User may now pass a value predicate and add values to the result only if they pass the test. It takes less time than e.g. searching for all values in some region and then testing them. bool user_pred(Value const& v) { /* some condition */ } std::vector<Value> result; t.query( std::make_pair( Box(...), bgi::value(user_pred) ), std::back_inserter(result) ); This will return all values intersecting the Box and meeting user_pred. Regards, Adam _______________________________________________ ggl mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/ggl 
Administrator

On 28/09/11 12:16, Adam Wulkiewicz wrote:
> Another small improvement. User may now pass a value predicate and add > values to the result only if they pass the test. It takes less time than > e.g. searching for all values in some region and then testing them. > > bool user_pred(Value const& v) { /* some condition */ } > > std::vector<Value> result; > > t.query( > std::make_pair( > Box(...), > bgi::value(user_pred) > ), > std::back_inserter(result) > ); > > This will return all values intersecting the Box and meeting user_pred. Adam, It looks you are continuously pushing the development forward. Great effort! Shame on my, I haven't monitored this subject closely, so here comes basic question: where is the code and how can I start testing it? Best regards,  Mateusz Loskot, http://mateusz.loskot.net Charter Member of OSGeo, http://osgeo.org Member of ACCU, http://accu.org _______________________________________________ ggl mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/ggl 
Mateusz Loskot wrote:
> On 28/09/11 12:16, Adam Wulkiewicz wrote: >> Another small improvement. User may now pass a value predicate and add >> values to the result only if they pass the test. It takes less time than >> e.g. searching for all values in some region and then testing them. >> >> bool user_pred(Value const& v) { /* some condition */ } >> >> std::vector<Value> result; >> >> t.query( >> std::make_pair( >> Box(...), >> bgi::value(user_pred) >> ), >> std::back_inserter(result) >> ); >> >> This will return all values intersecting the Box and meeting user_pred. > > Adam, > > It looks you are continuously pushing the development forward. > Great effort! > > Shame on my, I haven't monitored this subject closely, so here comes > basic question: where is the code and how can I start testing it? Hi, you may find it here: https://svn.boost.org/svn/boost/sandboxbranches/geometry/index. There are 2 folders: /boost  code in boostlike directory structure /tests  unit tests (main.cpp, *.hpp) and additional ones (times of execution and visualization using GLUT). If you want to run additional_sizes_and_times.cpp (which shows times only btw) you should have config file config.txt with at least 3 values: values_count remove_count queries_count Try e.g. 1000000 100000 100000 Some definitions: Value  type that is stored in the rtree (or if you like, in rtree's leafs) Indexable  type that is used by the rtree's algorithms (some Point or Box) Translator  type of an object translating from Value to Indexable Geometry.Index is in /boost/geometry/extensions/index which contain: ./algorithms  various algorithms working on Indexables. ./filters  indexes filters facade. ./rtree  rtree's code. ./translator  some translators. ./indexable.hpp  indexablesrelated traits, functions etc. ./distance_predicates.hpp  default knn queries' distance predicates. ./predicates.hpp  default spatial queries' and values' predicates. ./*.hpp  other tools. In the /boost/geometry/extensions/index/rtree there are: ./linear  specializations and algorithms used by linear insert ./node  various rtree's nodes (build on Boost.Variant and polymorphic nodes, with elements in std::vectors and in boost::array wrapper), visitors, allocators, creating and destroying of nodes. ./quadratic  specializations and algorithms used by quadratic insert. ./rstar  specializations and algorithms used by R*tree insert. ./visitors  various visitors doing most of the things in the rtree. ./distance_predicates.hpp  specializations of knn queries' distance predicates for rtree's nodes. ./filters.hpp  filters specializations. ./options.hpp  definition of compiletime options passed to the rtree (e.g. which algorithm, min, max elements, etc.). ./distance_predicates.hpp  specializations of spatial queries' and values' predicates for rtree's nodes. ./rtree.hpp  the rtree. Regards, Adam _______________________________________________ ggl mailing list [hidden email] http://lists.osgeo.org/mailman/listinfo/ggl 
Powered by Nabble  Edit this page 