New find() interface, more flexible, with more functionalities...

Sylvain Bougerel sylvain.bougerel.devel at gmail.com
Sun Nov 25 02:54:38 UTC 2007


Hi everyone,

I would like to propose a new interface that unify find(),
find_exact(), find_nearest(), find_within_range(),
count_within_range(), and visit_within_range() together, and offers
many new possibilities. The new interface does not impose operator*(),
operator+(), and operator-() to be defined for the type subvalue_type
-- the type returned by the accessor --.

The new interface is just called find(), and provides a default
functor to compute a distance between 2 subvalues provided by the
accessor of the KDtree. The return type _Dist_type, must have
operator<(), operator+() defined:

      template <typename _Tp
                      typename _Dist_type>
      struct scalar_distance
      {
	_Dist_Type
	operator() (const _Tp& __a, const _Tp& __b);
      };

      // return a distance_iterator value with the lowest distance
      // from __val
      template<typename _Distance>
      distance_iterator
      find (const_reference __val, _Distance& __dist=_Distance());

      // same with target value provided by an iterator.
      template<typename _Distance>
      distance_iterator
      find (const_iterator __iter, _Distance& __dist=_Distance());

      // return a distance_iterator on the value with the lowest
      // distance from __val and at maximum __max_dist
      template<typename _Distance>
      distance_iterator
      find (const_reference __val, const subvalue_type& __max_dist,
_Distance& __dist=_Distance());

      // same with target value provided by an iterator
      template<typename _Distance>
      distance_iterator
      find (const_iterator __iter, const subvalue_type& __max_dist,
_Distance& __dist=_Distance());

      // return a range_iterator on a value within the region [__a, __b)
      template<typename _Distance>
      range_iterator
      find (const_reference __a, const_reference __b, _Distance&
__dist=_Distance());

      // same with range provided by an iterator
      template<typename _Distance>
      range_iterator
      find (const_iterator __a, const_iterator __b, _Distance&
__dist=_Distance());

Two of these function return a distance iterator. The distance
iterator is an forward_iterator. At each call to
distance_iterator::operator++() it will find the next value whose
distance from the target is >= to the previous value. The
range_iterator is also a forward_iterator. At each call of
range_iterator::operator++, it will find the next value within the
region [__a, __b]. Below is a list of example that shows how to use
these functions and the function is the default fashion to emulate all
functionality you need.

Emulation of find(__val) uses 2nd function of the interface; it find
the closest value to the target at a maximum distance of 0:
distance_iterator iter=kdtree.find(target_val, 0,
scalar_distance<kdtree::subtype, kdtree::subtype>());

Emulation of find_exact(__val) uses the same as above, but the post
processing is different; it iterates through all the values that are
at the maximum distance of 0, to find the desired one:
distance_iterator iter=kdtree.find(target_val, 0,
scalar_distance<kdtree::subtype, kdtree::subtype>());
while (iter!=kdtree.end_distance())
  if(*iter==target_val) break;

Emulation of find_nearest(__val) with maximum range; it find the
closest value to the target in the tree:
distance_iterator iter=kdtree.find(target_val,
scalar_distance<kdtree::subtype, kdtree::subtype>());

Emulation of find_within_range(KDTree::Region(__a, __b)) is, with a
little of post processing:
range_iterator iter=kdtree.find(val_one, val_two,
scalar_distance<kdtree::subtype, kdtree::subtype>());
vector<value_type> v;
while (iter!=kdtree.end_range())
  v.push_back(++iter);

Emulation of count_within_range(KDTree::Region(__a, __b)) is similar
to the above:
int count=0;
for(range_iterator iter=kdtree.find(val_one, val_two,
scalar_distance<kdtree::subtype, kdtree::subtype>());
    iter!=kdtree.end_range(); ++iter)
  ++count;

Emulation of visit_within_range(KDTree::Region(__a, __b)) is like the above:
int count=0;
for(range_iterator iter=kdtree.find(val_one, val_two,
scalar_distance<kdtree::subtype, kdtree::subtype>());
    iter!=kdtree.end_range(); ++iter)
  my_post_processing_function(*iter);

Providing with k nearest value to the target, here k=100:
distance_iterator iter=kdtree.find(target_val, 0,
scalar_distance<kdtree::subtype, kdtree::subtype>());
for(int i=0; i<100 && iter!=kdtree.end_distance(); ++i, ++iter);

Providing a search for the farther away point in the tree:
template <typename _Tp
               typename _Dist_type>
struct negative_scalar_distance
{
  _Dist_Type
  operator() (const _Tp& __a, const _Tp& __b);
};
distance_iterator iter=kdtree.find(target_val,
negative_scalar_distance<kdtree::subtype, kdtree::subtype>());

And even more: suppose your kdtree is used to map strings (like in a
book database: author for the first dimension, title for the second
dimension, ...), and suppose that you define a function that
calculates the similarity between 2 strings, this function could be
used as a distance to find the closest match (author, title) when a
user is performing a search.
struct string_similarity
{
  float
  operator() (const string& __a, const string& __b);
}
distance_iterator iter=kdtree.find(target_val, string_similarity());


So the advantage of this interface is mainly that is very flexible,
and it doesn't impose the type distance to be the same as the type
returned by the accessor. Which can make sense in some situations. The
problem with this interface is that you have to specify the functor
for the distance each time you call find. Finally, all the functions
above should be presented with their find_if() equivalent. I find the
interface more "natural" personally, but that's just my impression. If
I could find a generic way to specify a default functor for the
distance, that would be great...

What do you think about it? Would you want it to make it's way into the repo?

- Sylvain



More information about the libkdtree-devel mailing list