problem with find_within_range

Paul Harris paulharris at computer.org
Wed Nov 18 03:19:52 UTC 2009


Hi Jason,

2009/11/18 Jason Remillard <remillard.jason at gmail.com>

> Hi,
>
> I am using 0.7.0 with visual studio 2005. I usually don't do a "use
> namespace std" until I am in my C++ files. This patch adds in the
> std:: namespace, and fixes a warning about the compiler not knowing if
> it should call the float, double, or long of sqrt. I included it
> because I hope that it is not related to the problem I am having with
> find_within_range.
>
>
Thank you for that, I had forgotten that if you #include <cmath> it puts all
the math functions (like sqrt) into the std namespace.  I'll make the
changes on the git HEAD.


> The included test program makes a set or random 3D points, sets up the
> kdtree on them, then picks a random search point with a random range.
> It then calculates the points that should be returned by
> find_within_range, nd compares them to what actually comes back. I am
> seeing both extra points and missing points.  Bug, or I am doing
> something wrong?
>
>
I'm afraid that you are technically doing something wrong.
Only find_nearest bothers to sqrt() the distances.  find_within_range
converts the 'limit' into a square region (or bounding box) with sides=limit
and then searches and returns all points within that box.

This is how it has always been - all of the _within_range() functions work
this way.  I don't think this is ideal, and it should certainly be more
obvious.   I myself had forgotten that it does this.

I can't really change the way it works as people possibly rely on this
behaviour.  Personally I'd prefer it sqrt'd to find the distance, but the
change is not as simple as just wacking in a sqrt somewhere.  Plus people
don't always want to use euclidean distances so its not really a step in the
right direction.

I've added your test (with corrections and comments) to git-head, and also
added extra comments to kdtree.hpp, so people won't be caught out as easily
when 0.7.2 comes out.



There are a few things you could do:

a) adjust what you expect from kdtree.  see below, i've adjusted your test
so that it runs without failing.

b) use visit_within_range() to test the distance properly before adding it
to the vector, or double-check the distances in the result vector.

c) try out my git branch paulharris/region ... i've templated the Region
that you can search on, and so you could implement a Region of your own that
does the sqrt() test in the encloses() method.   this could eventually be
incorporated into the standard kdtree so that we can then call eg
tree.find_within_range_accurate() or whatever.


cheers,
Paul




> Thanks, Jason.
>
> #include <kdtree++/kdtree.hpp>
> #include <vector>
> #include <map>
>
#include <set>

> #include <stdio.h>
>
> using namespace std;
>
> struct kdtreeNode
> {
>  typedef double value_type;
>
>  double xyz[3];
>  size_t index;
>
>  value_type operator[](size_t n) const
>  {
>    return xyz[n];
>  }
>
>  double distance( const kdtreeNode &node)
>  {
>    double x = xyz[0] - node.xyz[0];
>    double y = xyz[1] - node.xyz[1];
>    double z = xyz[2] - node.xyz[2];
>

// this is not correct   return sqrt( x*x+y*y+z*z);

// this is what kdtree checks with find_within_range()
// the "manhattan distance" from the search point.
// effectively, distance is the maximum distance in any one dimension.
   return max(fabs(x),max(fabs(y),fabs(z)));


 }
> };
>
> int main(int argc,char *argv[])
> {
>  vector<kdtreeNode> pts;
>
>  typedef KDTree::KDTree<3,kdtreeNode> treeType;
>
>  treeType tree;
>
>  // make random 3d points
>  for ( size_t n = 0; n < 10000; ++n)
>  {
>    kdtreeNode node;
>    node.xyz[0] = double(rand())/RAND_MAX;
>    node.xyz[1] = double(rand())/RAND_MAX;
>    node.xyz[2] = double(rand())/RAND_MAX;
>    node.index = n;
>
>    tree.insert( node);
>    pts.push_back( node);
>  }
>
>  for (size_t r = 0; r < 1000; ++r)
>  {
>    kdtreeNode refNode;
>    refNode.xyz[0] = double(rand())/RAND_MAX;
>    refNode.xyz[1] = double(rand())/RAND_MAX;
>    refNode.xyz[2] = double(rand())/RAND_MAX;
>
>    double limit = double(rand())/RAND_MAX;
>
>    // find the correct return list by checking every single point
>    set<size_t> correctCloseList;
>
>    for ( size_t i= 0; i < pts.size(); ++i)
>    {
>      double dist = refNode.distance( pts[i]);
>      if ( dist < limit)
>        correctCloseList.insert( i );
>    }
>
>    // now do the same with the kdtree.
>    vector<kdtreeNode> howClose;
>
>  tree.find_within_range(refNode,limit,back_insert_iterator<vector<kdtreeNode>
> >(howClose));
>
>    // make sure no extra points are returned, and the return has no
> missing points.
>    for ( size_t i = 0; i < howClose.size(); ++i)
>    {
>      set<size_t>::iterator hit = correctCloseList.find( howClose[i].index);
>
>      if ( hit != correctCloseList.end())
>      {
>        correctCloseList.erase(hit);
>      }
>      else
>      {
>        // point that is too far away - fail!
>        assert(false);
>        printf("fail, extra points.\n");
>      }
>    }
>
>    // fail, not all of the close enough points returned.
>    assert( correctCloseList.size() == 0);
>    if ( correctCloseList.size() > 0)
>    {
>      printf("fail, missing points.\n");
>    }
>  }
> }
>
> _______________________________________________
> libkdtree-devel mailing list
> libkdtree-devel at lists.alioth.debian.org
> http://lists.alioth.debian.org/mailman/listinfo/libkdtree-devel
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.alioth.debian.org/pipermail/libkdtree-devel/attachments/20091118/260a5cd2/attachment.htm>


More information about the libkdtree-devel mailing list