Python bindings for libkdtree: FIX of find_within_range...

Willi Richert w.richert at gmx.net
Fri Mar 20 17:29:55 UTC 2009


Hi,

nice to hear. So stay tuned. I have just finished my template based swig 
system that creates dim \in [1,20] for (int, float). I think that nobody will 
use NN for more than 20 dimensions. Nevertheless the stuff then gets quite 
big:

-rw-r--r-- 1 wr wr     756 2009-03-20 13:03 CMakeLists.txt
-rwxr-xr-x 1 wr wr    8754 2009-03-20 18:27 gen-swig-hpp.py
-rw-r--r-- 1 wr wr   54260 2009-03-20 18:27 kdtree.py
-rw-r--r-- 1 wr wr    4822 2009-03-20 18:27 kdtree.pyc
-rwxr-xr-x 1 wr wr 2119152 2009-03-20 18:28 _kdtree.so
-rw-r--r-- 1 wr wr    1328 2009-03-20 18:20 Makefile
-rw-r--r-- 1 wr wr   51603 2009-03-20 18:27 py-kdtree.hpp
-rw-r--r-- 1 wr wr    4054 2009-03-20 18:09 py-kdtree.hpp.tmpl
-rw-r--r-- 1 wr wr  130817 2009-03-20 18:27 py-kdtree.i
-rw-r--r-- 1 wr wr     380 2009-03-20 16:54 py-kdtree.i.tmpl
-rw-r--r-- 1 wr wr    2808 2009-03-20 13:03 py-kdtree_test.cpp
-rw-r--r-- 1 wr wr   15353 2009-03-20 18:13 py-kdtree_test.py
-rw-r--r-- 1 wr wr   18796 2009-03-20 18:17 py-kdtree_test.pyc
-rw-r--r-- 1 wr wr 1146062 2009-03-20 18:27 py-kdtree_wrap.cxx
-rw-r--r-- 1 wr wr 2386768 2009-03-20 18:28 py-kdtree_wrap.o



Regards,
wr

On Freitag 20 März 2009 18:24:16 Michal J. Gajda wrote:
> Hi,
>
> Indeed KD-tree data structure seems designed to use 'taxi-cab' metric
> instead of euclidean metric. That's why I used the same distance type as
> COORD_T in my patch.
> I work around this in my code by filtering points with euclidean distance
> afterwards.
>
> On the other hand I would love to beta test code for other dimensions than
> 1f, 3f, and 6f. For now I've got surprising result of 3f projection being
> faster for 20k points than 6f one. It may be because runtime grows as
> O(2^dim*log(n)). It may be that nobody has enough points to usefully
> exploit anything beyond 10D? --
>   Best regards
>     Michal
>
> -original message-
> Subject: Re: Python bindings for libkdtree: FIX of find_within_range...
> From: Paul Harris <paulharris at computer.org>
> Date: 20-03-2009 15:15
>
> Hrm.
>
> It may be the same reason that find_within_range() would return 4 as well.
>
> See email in archives, subject "Accuracy of libkdtree", 29 Jan 2009.
>
> This will need to be confirmed, I'll put it on my list to try and do asap.
> My theory is that it may only be checking in each dimension individually (a
> Manhattan distance check), and not doing a final  Euclidean distance check.
> That would be my initial guess.   I'm not sure what the logic would be
> behind that, maybe just speed.  Anyway enough guess-work, next step is to
> confirm.
>
> If anyone wants to step up and figure out what is going on, and to have
> some input into what should be done about it, be my guest :)
>
> cheers,
> Paul
>
>
> 2009/3/20 Willi Richert <w.richert at gmx.net>
>
> > Hi,
> >
> > alright, I can do that.
> >
> > However, I just found out that the range measurement is somehow
> > "illogic":
> >
> >    def test_count_within_range(self):
> >        nn = KDTree_2Int()
> >
> >        for p in [(0,0), (1,0), (0,1), (1,1)]:
> >             nn.add((p, id(p)))
> >
> >         res = nn.count_within_range((0,0), 1.0)
> >        self.assertEqual(3, res, "Counted %i points instead of %i"%(res,
> > 3))
> >
> >         res = nn.count_within_range((0,0), 1.9)
> >         self.assertEqual(4, res, "Counted %i points instead of %i"%(res,
> > 4))
> >
> >
> > libkdtree_new/python-bindings> python -m unittest
> > py-kdtree_test.KDTree_2IntTestCase.test_count_within_range
> > F
> > ======================================================================
> > FAIL: test_count_within_range (py-kdtree_test.KDTree_2IntTestCase)
> > ----------------------------------------------------------------------
> > Traceback (most recent call last):
> >  File "py-kdtree_test.py", line 110, in test_count_within_range
> >     self.assertEqual(3, res, "Counted %i points instead of %i"%(res, 3))
> > AssertionError: Counted 4 points instead of 3
> >
> > ----------------------------------------------------------------------
> > Ran 1 test in 0.001s
> >
> >
> > Any suggestion why that is the case? The point (1,1) should not be
> > counted as its distance to (0,0) is sqrt(2)>1.0
> >
> > wr
> >
> > On Freitag 20 März 2009 11:56:16 Paul Harris wrote:
> > > Hi,
> > >
> > > 2009/3/20 Willi Richert <w.richert at gmx.net>
> > >
> > > > One more thing: The current state of having one explicit template
> > > > realization
> > > > in the python/swig part annoys me a little bit. Some weeks ago there
> >
> > was
> >
> > > > a discussion on the ML about whether or not dynamically supporting
> > > > the specification of the dimension and type. How is the status here,
> > > > Paul?
> >
> > If
> >
> > > > you
> > > > have decided against it, I will provide a python file that generates
> >
> > all
> >
> > > > the
> > > > stuff for float, double and int for dim \in [1, .., 20]. That will
> > > > ease the maintenance a lot.  In the other case the situation for the
> > > > python bindings will be much smoother. But the performance will maybe
> > > > suffer. Can you shed some light on this decision, Paul?
> > >
> > > Its still on the list, but I haven't had the time to do anything on
> >
> > kdtree
> >
> > > for a while now, so i'm just focusing pushing through bug fixes to
> > > ensure that the results are correct.
> > >
> > > I would guess it'll be at least a month before I can do anything,
> > > unless
> >
> > I
> >
> > > manage to find a small random window of time where I can attack that
> >
> > issue.
> >
> > > Would it be too much to ask if you write the python file that generates
> >
> > all
> >
> > > the variants, and then we can do the dynamic version later?  I don't
> > > know how much work that would involve, if its not too much then I would
> > > prefer if you set yourself up with a low-maintenance solution first.
> > >
> > > cheers,
> > > Paul




More information about the libkdtree-devel mailing list