Further ideas for Debtags AI

Erich Schubert erich.schubert at gmail.com
Tue Jun 13 19:20:04 UTC 2006


Hi Alex,
Independent of your schedule and such (btw, one of my exams was pushed
back by a week, so I'll be busy for one more week), here are some ideas
I'd love to see you research, too:
- using itemset mining for Debtags (might be useful in the tagger, too)
- second-order naive Bayes, i.e. see whether the results improve when
  you don't do a plain naive Bayes but take the "just as naive" approach
  of also including "has word A and word B" in the equations
- k-modes based clustering of packages
- outlier detection (for detecting badly tagged packages)
- other data mining algorithms ;-)
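
To make the second-order naive Bayes idea a bit more concrete, here is a
minimal sketch in Python. It is only an illustration under stated
assumptions, not the tagger's actual code: the class name PairNaiveBayes,
the helper features(), the toy descriptions and the tags "tag-x" and
"tag-y" are all made up, and the model is simplified to score only the
features that are present in a description.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math


def features(words, second_order=True):
    """Single-word features plus, optionally, unordered word pairs."""
    vocab = sorted(set(words))
    feats = set(vocab)
    if second_order:
        # "has word A and word B" features, one per unordered pair
        feats.update(combinations(vocab, 2))
    return feats


class PairNaiveBayes:
    """Hypothetical toy classifier; only features present in the input are scored."""

    def __init__(self, second_order=True, alpha=1.0):
        self.second_order = second_order
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        # label -> feature -> number of documents containing that feature
        self.feat_counts = defaultdict(Counter)
        self.known = set()
        for words, label in zip(docs, labels):
            feats = features(words, self.second_order)
            self.feat_counts[label].update(feats)
            self.known |= feats
        return self

    def log_scores(self, words):
        present = features(words, self.second_order) & self.known
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / total)  # log prior
            for feat in present:
                count = self.feat_counts[label][feat]
                score += math.log((count + self.alpha) / (n + 2 * self.alpha))
            scores[label] = score
        return scores

    def predict(self, words):
        scores = self.log_scores(words)
        return max(scores, key=scores.get)


# Made-up data: every single word occurs equally often under both tags,
# so only the word pairs carry any information.
docs = [["mail", "client"], ["web", "server"],
        ["mail", "server"], ["web", "client"]]
tags = ["tag-x", "tag-x", "tag-y", "tag-y"]

first = PairNaiveBayes(second_order=False).fit(docs, tags)
second = PairNaiveBayes(second_order=True).fit(docs, tags)

print(first.log_scores(["mail", "server"]))
print(second.predict(["mail", "server"]))
```

On this toy data the word-only model gives both tags the same score,
while the pair features break the tie. The price is that the number of
features grows quadratically with the vocabulary, so on real package
descriptions the pairs would probably need pruning (e.g. keeping only
frequent ones).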

best regards,
Erich Schubert
--
    erich@(mucl.de|debian.org)      --      GPG Key ID: 4B3A135C    (o_
  To understand recursion you first need to understand recursion.   //\
  Where friendly paths converge, the whole world looks like home    V_/_
        for an hour. --- Hermann Hesse
