[Debtags-commits] [svn] r1354 - tagcoll/trunk
Enrico Zini
enrico at costa.debian.org
Thu Sep 15 09:26:09 UTC 2005
Author: enrico
Date: Thu Sep 15 09:26:08 2005
New Revision: 1354
Modified:
tagcoll/trunk/ (props changed)
tagcoll/trunk/README
Log:
r5335 at viaza: enrico | 2005-09-15 11:25:58 +0200
Added new idea to the README
Modified: tagcoll/trunk/README
==============================================================================
--- tagcoll/trunk/README (original)
+++ tagcoll/trunk/README Thu Sep 15 09:26:08 2005
@@ -160,6 +160,33 @@
These are the TODO-list items currently being worked on::
+ - New grouping algorithm:
+ - Define a group cardinality minimum threshold (kind of around 7) and
+ maximum threshold (kind of around 14)
+ - Identify all tagsets with cardinality > maximum threshold, and consider
+ them immutable
+ - Identify all tagsets with cardinality < minimum threshold, and merge them
+ with the nearest tagsets so that the cardinality of the resulting set is
+ still < minimum threshold. Merge could happen only among tagsets at
+ distance 1, or one could have hints to give weight to tags, and compute a
+ weighted distance that considers the relevance to the user of the various
+ different tags (for some users, a change in implemented-in::* could mean
+ next to nothing).
+ - Use a collection of Hints for having a preference for the tags of the
+ resulting set (for example, implemented-in::* could be less important
+ than use::*) or just use the merge of all tags in the merged tagsets as
+ the resulting tagset, or use the intersection, or handle merged tagsets
+ specially. The intersection is probably better, especially if weighted
+ distance is used before and then the tags cut out by the intersection
+ would be the less relevant ones already.
+ - If small groups remain which can't be merged because all the nearby
+ groups are big, merge all of those that are smaller than an extra minimum
+ (say 2 or 3) anyway, using the nearest set regardless of its cardinality
+ - Try to run a smart hierarchy on the results
+ - Hints could be a map of Expression -> weight, so that multiple tags can
+ be assigned the same weight (like use::* -> 10, implemented-in::*->1)
+ This should normalise the 'special' items somehow.
+
- Merge ItemGrouper and TDBIndexer
- Add example code
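
The merge step of the new grouping idea above can be sketched roughly as follows. This is only an illustration in Python, not tagcoll code: the function names, the dict-of-sets data layout, the use of shell-style glob patterns for the hint expressions, and the `cap` and `max_distance` parameters are all assumptions made for the example. A tagset's cardinality is taken to be the number of items carrying exactly that tagset, the distance is the (weighted) count of tags present in only one of the two tagsets, and the merged tagset is the intersection, as the README note suggests. The "extra minimum" fallback and the smart hierarchy pass are left out.

```python
from fnmatch import fnmatchcase


def tag_weight(tag, hints, default=1):
    """Weight of the first hint pattern matching the tag, else the default."""
    for pattern, weight in hints:
        if fnmatchcase(tag, pattern):
            return weight
    return default


def weighted_distance(a, b, hints=()):
    """Sum of the weights of the tags found in exactly one of the tagsets.

    With no hints every tag weighs 1, which gives the plain distance.
    """
    return sum(tag_weight(tag, hints) for tag in a ^ b)


def merge_small_groups(groups, hints=(), min_card=7, max_card=14,
                       cap=None, max_distance=None):
    """Repeatedly merge the closest pair involving a small group.

    groups maps frozenset(tags) -> set(items). Groups above max_card are
    treated as immutable. A merge is allowed only if the merged cardinality
    stays below cap (assumed here to default to min_card, following the
    README wording) and the distance does not exceed max_distance.
    """
    cap = min_card if cap is None else cap
    groups = {frozenset(tags): set(items) for tags, items in groups.items()}
    while True:
        best = None
        for src in groups:
            if len(groups[src]) >= min_card:
                continue  # only small groups look for a merge partner
            for dst in groups:
                if dst == src or len(groups[dst]) > max_card:
                    continue  # skip itself and immutable big groups
                if len(groups[src] | groups[dst]) >= cap:
                    continue  # merged group would grow too large
                dist = weighted_distance(src, dst, hints)
                if max_distance is not None and dist > max_distance:
                    continue
                target = src & dst  # resulting tagset: the intersection
                if target in groups and target not in (src, dst):
                    continue  # do not silently absorb a third group
                key = (dist, sorted(src), sorted(dst))
                if best is None or key < best[0]:
                    best = (key, src, dst, target)
        if best is None:
            return groups
        _, src, dst, target = best
        groups[target] = groups.pop(src) | groups.pop(dst)


# Hypothetical sample data, for illustration only.
hints = [("use::*", 10), ("implemented-in::*", 1)]
editors_c = frozenset({"use::editing", "implemented-in::c"})
editors_py = frozenset({"use::editing", "implemented-in::python"})
browsers_c = frozenset({"use::browsing", "implemented-in::c"})
groups = {
    editors_c: {"vim", "nano"},
    editors_py: {"leo"},
    browsers_c: {"lynx"},
}
result = merge_small_groups(groups, hints, max_distance=2)
```

With these hints, the two editor tagsets differ only in implemented-in::* (weighted distance 2) and are merged into the tagset {use::editing}, while the browser tagset is kept apart because it differs in use::* (weighted distance 20 from the C editors). Without hints both pairs would be at the same plain distance of 2, which is the case the weighting is meant to separate.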