Refactoring the vocabulary

Enrico Zini enrico at enricozini.org
Thu Apr 8 00:00:19 UTC 2010


On Wed, Mar 24, 2010 at 01:52:42AM -0300, Tássia Camões wrote:

> I've been thinking about the group of facets and tags defined in the
> vocabulary and I would like to open a discussion on this topic.
>
> It doesn't seem reasonable to me that, for example, facets like
> "implemented-in" and "biology" are at the same level of hierarchy in
> the categorization. While some facets can really be defined as
> "different points of view from which to look at the package archive",
> some others seem only to express a relation with a specific field (as
> happens with "biology": most packages will never be categorized under
> this facet).

I see your point. Similar thoughts go through my mind when I go through
the vocabulary.

> Using this approach we would be much closer to the magic number 7 plus
> or minus 2 mentioned by [1].
> 
> As a first draft, the ones I would consider as "first level" facets are:
> 
> culture, field, implemented-in, hardware, interface, made-of, role,
> scope, network, suite, uitoolkit, use, works-with, works-with-format
> 
> The others seem to be "second level" ones ( they could be nested,
> maybe under a "section" facet):
> 
> accessibility, admin, biology, devel, filetransfer, game, junior,
> mail, office, protocol, security, sound, web, x11
> 
> I don't know all the implications of doing such a change, but I'd be
> happy to make you think about this.

My default answer has always been that this is possibly a non-problem,
and that it's best to have a flat list of facets and then have smart
interfaces that only present what is relevant to the user.

For example, goplay makes its own selection of what facets are important
and disregards the rest, while here are examples of suggesting tags,
searching tags or computing tag clouds in a smart way that would work
alongside a more traditional keyword search:

  http://www.enricozini.org/2007/debtags/axi-query-expand/
  http://www.enricozini.org/2007/debtags/axi-query-tags/
  http://www.enricozini.org/2008/debtags/axi-searchcloud/

However, my default answer might have been a bit too visionary: I've
basically been requiring people to make new-concept interfaces to
debtags or not make any interface at all. Therefore, this approach kind
of killed any attempt at producing more ordinary interfaces that use
tags.

More ordinary interfaces may be precisely what some nontrivial groups of
users could be more familiar with, so I'm afraid this might have been a
barrier for adoption. It may be time to address the proposal you raise
in a more constructive way.

I'll assume we're looking at building some kind of interface where the
user is presented with a list of tags to choose from, and by choosing
tags eventually narrows things down to a short-ish list of packages.

First thing, I agree by gut feeling with your first level/second level
division. "works-with-format" and "uitoolkit" could possibly be second
level, as they look like specialisations of "works-with" and
"interface". "network" could be somehow subordinate to "use" so it could
be second level, too.

In fact, if we try to limit the number of first-level facets as much as
possible, we can get by with as few as: field, interface, role, use,
works-with.
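
As a rough sketch of that split, assuming tags are written in the usual
"facet::tag" form: the facet names come from the vocabulary, while
FIRST_LEVEL, SUBORDINATE, facet_of and level_of are made-up names for
illustration. SUBORDINATE only records the three specialisations
suggested above; where the remaining facets would go is left open.

```python
# Hypothetical names; only the facet names come from the vocabulary.
FIRST_LEVEL = ("field", "interface", "role", "use", "works-with")

# Second-level facets suggested above as specialisations of a
# first-level facet. Other second-level facets are deliberately absent.
SUBORDINATE = {
    "works-with-format": "works-with",
    "uitoolkit": "interface",
    "network": "use",
}


def facet_of(tag):
    """Return the facet part of a 'facet::tag' name."""
    return tag.split("::", 1)[0]


def level_of(tag):
    """Return 1 for tags in a first-level facet, 2 for everything else."""
    return 1 if facet_of(tag) in FIRST_LEVEL else 2
```

Keeping the vocabulary itself flat and deriving the level from a small
table like this means an application can swap in its own first-level
choice without touching the tags.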

The question is how to display the second level facets. I don't like the
idea of building a big tree of tags because we'd have a tree which is
both very deep and branches a lot. Usable decision trees either branch a
lot but are shallow (for example, pick cone type and flavour of ice
cream: many decisions but only a depth of two), or are deep but branch
very little (say, many easy yes/no questions: deep, but minimal
branching).

I would experiment with something like only showing the second level
facets when at least one tag has been selected. For example:

 1. Display the tags in field, interface, role, use, works-with
 2. Once one tag is chosen, display also the relevant second level
    facets (a tag cloud?)

There could be some discussion about how to define "relevant" here, but
it could be along the lines of "the tags that remain when we throw away
all packages that don't match the tags selected so far".
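
A minimal sketch of the two steps above with that definition of
"relevant", assuming the package data is available as a plain mapping
from package name to a set of "facet::tag" strings. The package names in
PACKAGES are made up, and relevant_tags and tags_to_display are
hypothetical helpers, not part of any existing debtags API.

```python
from collections import Counter

# Assumed first-level facets, as discussed above.
FIRST_LEVEL = {"field", "interface", "role", "use", "works-with"}

# Made-up toy data: package name -> set of tags.
PACKAGES = {
    "pkg-a": {"role::program", "use::editing", "uitoolkit::gtk"},
    "pkg-b": {"role::program", "use::gameplaying", "game::arcade"},
    "pkg-c": {"role::shared-lib", "devel::library"},
}


def facet_of(tag):
    """Return the facet part of a 'facet::tag' name."""
    return tag.split("::", 1)[0]


def relevant_tags(packages, selected):
    """Count the tags that remain once packages not matching every
    selected tag are thrown away. Selected tags are not counted."""
    selected = set(selected)
    counts = Counter()
    for tags in packages.values():
        if selected <= tags:  # package carries all selected tags
            counts.update(tags - selected)
    return counts


def tags_to_display(packages, selected):
    """Step 1: with nothing selected, show first-level tags only.
    Step 2: once a tag is chosen, also show second-level tags."""
    counts = relevant_tags(packages, selected)
    if not selected:
        return {t: n for t, n in counts.items()
                if facet_of(t) in FIRST_LEVEL}
    return dict(counts)
```

The counts could serve as weights for a tag cloud; with the toy data,
selecting "use::gameplaying" leaves only pkg-b, so "game::arcade"
becomes visible while "devel::library" does not.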

Most further details are application-dependent, though. One may want to
tweak their "first level" choice (say, goplay), or to define "relevant"
in a different way.


Ciao,

Enrico

-- 
GPG key: 4096R/E7AD5568 2009-05-08 Enrico Zini <enrico at enricozini.org>


More information about the Debtags-devel mailing list