vocabulary structure [coarseness]
Enrico Zini
enrico at enricozini.org
Wed Jun 28 11:21:28 UTC 2006
On Tue, Jun 27, 2006 at 09:57:39AM +0200, Peter Rockai wrote:
Reply to the second issue I found.
> As for rest of made-of, there's only data:*, so made-of::data and format::
> (File Format) facet would be probably a good idea again. There are tags like
> role::content:data and i would vote for
> works-with::{audio,video,text,image,database,archive,font,...} all of which
> would hint the package could be tagged with a format:: tag as well (not
> always, but it would often make sense).
Here, as I understand it, the problem you raise is the level of
coarseness of tags in a facet: some facets have coarse tags, some facets
have very detailed tags, some facets have a mix of both.
Example:
Enrico wants an image viewer, and there are as many as 6 different
tags that are relevant to such a simple question:
    works-with::image
    works-with::image:raster
    works-with::image:raster:jpg
    works-with::image:raster:png
    works-with::image:vector
    works-with::image:vector:svg
6 is close to the cognitive limit of 7 +/- 2, and Enrico gets confused.
This is a nasty but important point. Should tags in a facet be
homogeneous with regard to the level of coarseness? If yes, how do we
separate fine-grained from coarse-grained tags? And how do we handle
the in-between cases?
I think we're reasoning interface-wise rather than classification-wise:
once a tag is well-defined, it doesn't matter too much where it is filed.
Of course, filing a tag under the right facet contributes to the
well-definedness of the tag.
The current approach is to use grouping to represent different levels of
detail. If I run:
    debtags tagsearch works-with | grep -v '[a-z]:[a-z]'
Then I get a coarse classification, while if I run:
    debtags tagsearch works-with::image
then I get a more detailed group of tags related to images.
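To make the two views above concrete, here is a minimal Python sketch of the same split. It works on a hard-coded list of tag names taken from this thread rather than on real debtags output, and the helper names coarse() and detailed() are made up for illustration. The prefix match in detailed() is only my assumption about what the second command selects.

```python
import re

# Illustrative tag names taken from this thread, not real debtags output.
TAGS = [
    "works-with::audio",
    "works-with::image",
    "works-with::image:raster",
    "works-with::image:raster:jpg",
    "works-with::image:raster:png",
    "works-with::image:vector",
    "works-with::image:vector:svg",
]

# Same pattern as the grep above: a single ':' between two letters marks
# a sub-level inside the tag name. The '::' facet separator never matches.
SUBLEVEL = re.compile(r"[a-z]:[a-z]")


def coarse(tags):
    """Keep only the tags without sub-levels (the grep -v view)."""
    return [t for t in tags if not SUBLEVEL.search(t)]


def detailed(tags, prefix):
    """Keep the tags under one group (assumed here to be a prefix match)."""
    return [t for t in tags if t.startswith(prefix)]


print(coarse(TAGS))
print(detailed(TAGS, "works-with::image"))
```

On this sample, coarse() keeps the two top-level tags, and detailed() returns the six image tags from the example earlier in this mail.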
I could see ways of making this distinction automatically at an
interface level using the current vocabulary structure. I also wouldn't
mind restructuring the vocabulary using a different approach to
coarseness like:
    Facet: works-with
    Coarseness: broad

    Facet: kind-of-image-format
    Coarseness: detailed
and then come up with algorithms to hide tags from more detailed facets
unless they become relevant to the current search.
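One possible shape for such an algorithm, as a rough Python sketch. Everything in it is an assumption made for illustration: the package names and their tags are invented, the Coarseness values are kept in a plain dict, and "relevant" is taken to mean "the user has already selected something, and the detailed tag appears on at least one package that matches the selection". Other definitions of relevance are certainly possible.

```python
# Hypothetical vocabulary: facet name -> coarseness, as in the stanzas above.
FACETS = {
    "works-with": "broad",
    "kind-of-image-format": "detailed",
}

# Hypothetical packages and tags, invented for this example.
PACKAGES = {
    "viewer-a": {"works-with::image",
                 "kind-of-image-format::jpg",
                 "kind-of-image-format::png"},
    "editor-b": {"works-with::image", "kind-of-image-format::svg"},
    "player-c": {"works-with::audio"},
}


def facet_of(tag):
    """Return the facet part of a 'facet::tag' name."""
    return tag.split("::", 1)[0]


def visible_tags(selected):
    """Tags to offer next, given the set of tags already selected."""
    # Packages that carry every selected tag.
    matches = [tags for tags in PACKAGES.values() if selected <= tags]
    # Tags that could still narrow the search further.
    candidates = set().union(*matches) - selected
    visible = set()
    for tag in candidates:
        if FACETS[facet_of(tag)] == "broad":
            visible.add(tag)   # broad tags are always offered
        elif selected:
            visible.add(tag)   # detailed tags only after a selection
    return sorted(visible)


print(visible_tags(set()))
print(visible_tags({"works-with::image"}))
print(visible_tags({"works-with::audio"}))
```

With nothing selected, only the works-with tags are offered. After selecting works-with::image, the jpg, png and svg tags show up. After selecting works-with::audio they stay hidden, because no matching package carries them.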
I don't know which of the two ways is easier, though :)
Ciao,
Enrico
--
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico at debian.org>