Performance of find methods, comparison with linear search

Sylvain Bougerel sylvain.bougerel.devel at gmail.com
Fri Nov 23 11:07:48 UTC 2007


(Sorry previous mail too big, got bounced so sent again...)

On Nov 23, 2007 6:46 PM, Sylvain Bougerel
<sylvain.bougerel.devel at gmail.com> wrote:
> On Nov 21, 2007 10:23 AM, Paul Harris <paulharris at computer.org> wrote:
> > On 11/21/07, Sylvain Bougerel <sylvain.bougerel.devel at gmail.com> wrote:
> >
> > not sure about hypersphere tho... one of the efficiencies comes from
> > the use of the hyperrect.  that way, there is only 1 dimension
> > compared at a time, nice and fast.  almost no multiple dimension
> > distance calculations required.   doesn't sphere intersection require
> > multiplications, additions to calculate distance from the center, and
> > comparison to the radius?  much more work than a single floating point
> > comparison at each tree branch.
> >
>
> Hi,
>
> To insert the point, you simply walk down the tree using the
> comparison, until you reach the a NULL node. You insert the point
> there.
>
> Going back to find_nearest():
> let's suppose you have already found the place where you would normaly
> insert the point in the tree by intersecting inner nodes with the
> hyperrectangle formed by the value to probe and the root of the tree.
> From that point, you gonna walk up the tree now and go into any single
> node that lies in the hypersphere of center "probe" and range "best
> distance found so far". You do not need to look out of the hypersphere
> cause you know that nothing closer than "best distance found so far"
> can turn up.
> The calculation that makes the different between the hypersphere and
> the hyperrectangle is:
> "calculate the distance between the probe point and the plane at the
> node you are looking at"
> If the result of that calculation is bigger that the "best distance
> found so far" you do not need to visit the other side of the node. So
> big parts of the tree are pruned with that method.
>
>
> So I have implemented the algorithm. It's in the function name
> "find_nearest_interactive()" in "kdtree.hpp". I used the same test
> "perf_kdtree.cpp", and before launching the test, I check that both
> find_nearest() and find_nearest_interactive(), give the same results.
> If they don't, I list down which is the best result.  The function is
> completely unrolled, which makes it difficult to read; for I wanted to
> control all parts of the algorithm. (Both file are in attachement)
>
>
> On my machine (a modest laptop) find_nearest_iterative() runs about
> 30x faster with "-g", 40x faster with no optimization flags, and 100x
> faster with "-O4". To my greatest surprise, it often gives better
> results that the original find_nearest(). However, "perf_kdtree.cpp"
> is not such a good test, for the points are completely random, while
> it is best to have a "cloud of points" like in the test provided by
> Renaud. Could anyone try with Renaud's test? (I could do it too, but
> I'm lazy :-) )
>
> So here is an exemple of the results "perf_kdtree.cpp" gives:
>
> ==== with no optimization flag
> For 100000 triplets and 10000 tries:
> For triplet (13560,15072,12899) iterative finds (13807,14942,12820) at
> 84150 recursive finds (13607,14821,12761) at 84254 and best is
> ITERATIVE
> For triplet (10808,33,13648) iterative finds (10749,311,13534) at
> 93761 recursive finds (10502,40,13669) at 94126 and best is ITERATIVE
> For triplet (15712,7798,54) iterative finds (15860,7778,325) at 95745
> recursive finds (15847,7525,109) at 95779 and best is ITERATIVE
> For triplet (2233,454,13198) iterative finds (2269,559,13344) at 33637
> recursive finds (2203,484,13019) at 33841 and best is ITERATIVE
> For triplet (10190,8136,15340) iterative finds (10300,7886,15265) at
> 80225 recursive finds (10247,7933,15530) at 80558 and best is
> ITERATIVE
> For triplet (14635,3095,13573) iterative finds (14439,3053,13538) at
> 41405 recursive finds (14440,3154,13582) at 41587 and best is
> ITERATIVE
> For triplet (9311,4033,5989) iterative finds (9549,4079,6248) at
> 125841 recursive finds (9349,3861,5681) at 125892 and best is
> ITERATIVE
> For triplet (2773,6869,12984) iterative finds (2792,7098,13103) at
> 66963 recursive finds (2835,6940,12743) at 66966 and best is ITERATIVE
> ---- snip ----
> find_nearest()      takes (iterative 0.28, recursive 10.03)
> find_within_range() takes 225.98
> optimize()          takes 1.59
> find_nearest()      takes (iterative 0.35, recursive 26.25)
> find_within_range() takes 230.86
>
>
> ==== with optimization "-O4"
> ---- snip ----
> find_nearest()      takes (iterative 0.07, recursive 6.58)
> find_within_range() takes 66.65
> optimize()          takes 0.28
> find_nearest()      takes (iterative 0.1, recursive 15.31)
> find_within_range() takes 66.91
>
>
> Note that:
> 1.    Optimize() makes the work harder for 2 very different
> algorithms. It can't be blamed on the recursive implementation
> anymore. Rather, there could be some issue with optimize().
> 2.    It's not commited, because find_nearest_iterative() requires the
> user to define operator*() for the type _Tp. It's not a good thing
> IMHO: I belive operator*() doesn't make any sense for Tp=strings for
> example. But it seems to be inherent to the design of KDTree, for
> find_nearest() already seems to have such a requirement (I have to
> investigate more on this). So while find_nearest_iterative() is an
> improvement, there are design issues with it.
>
> But you can have fun playing with it at the moment,
>
> Sylvain
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: perf_kdtree.cpp.bz2
Type: application/x-bzip2
Size: 1965 bytes
Desc: not available
Url : http://lists.alioth.debian.org/pipermail/libkdtree-devel/attachments/20071123/cd672cb3/attachment.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kdtree.hpp.bz2
Type: application/x-bzip2
Size: 9102 bytes
Desc: not available
Url : http://lists.alioth.debian.org/pipermail/libkdtree-devel/attachments/20071123/cd672cb3/attachment-0001.bin 


More information about the libkdtree-devel mailing list