Trove vocabulary

Erich Schubert erich@debian.org
Tue, 9 Dec 2003 00:25:54 +0100


Hello,

> you already have this lying around. How about dropping the current
> vocabulary for the trove categories? Would be a pity of the work
> already done, but having one 'standard' would be clearer...

But we also have "debram", which has all packages in woody categorized.

The vocabulary and data currently in debtags has always been a
proof-of-concept, with the goal of learning "design patterns" for the
final vocabulary. That is why we have a "tag task force" on our todo
list.
I haven't investigated the "trove" vocabulary enough to know its weak
points. That would take far more time browsing freshmeat and
sourceforge than I have (working half-time, researching half-time and
studying...).
But I believe that "trove" is unbalanced. Certain categories are so full
of entries that you can't find anything in them. But that might just
reflect the nature of the data.

The biggest thing I learned:
you cannot easily add new tags later on. You have to get the vocabulary
right at the start, or any change will reduce the quality instead of
improving it, because it takes a long time until a change has been
applied to all the data.

Greetings,
Erich Schubert
-- 
   erich@(vitavonni.de|debian.org)    --    GPG Key ID: 4B3A135C    (o_
    Go away or i'll replace you with a very small shell script.     //\
                It is better to have loved and lost                 V_/_
                  than never to have loved at all.