[Debtags-devel] Proposed Debtags goals for Etch

Enrico Zini enrico at enricozini.org
Sun Jul 17 07:37:50 UTC 2005


On Sat, Jul 16, 2005 at 09:25:22PM +0200, Benjamin Mesing wrote:

> >  - now that we have Debtags data in the Packages file, we should find a
> >    way to keep it updated.  This afternoon I'll talk with aj about it.
> >    The idea we had at lunch was to have the archive pull the (possibly
> >    signed) data from some URL.
> I don't know much about the package file format - but does this mean
> that the Tags: field can be set by the maintainer now? If so, the
> primary goal must be to mostly stabilize the vocabulary.

No, that's not the case: it's an external override file that I send to
aj and aj installs on the archive.  This means that we're still free to
refactor as we see fit, although it'd be a good idea to declare some
facets 'stable' so that applications can ship with filters that use
them.


> The other goal I consider to be of huge importance is to make the
> tagging as complete as possible. Incomplete data is of little use for
> the users or may even be of negative benefit.

Sure, but we don't need to be in a hurry with that.  I think some
searches are already useful enough to let people try it out, and as
usage grows, the quality of the tags will improve.


> This leads to another important point. A comprehensive explanation of
> the vocabulary and a guideline for the selection of tags must be written
> and published, so that people know what tags are appropriate.
> Points that should be mentioned are e.g.:
>       * every package should have one or more tags from role::
>       * every application should have one or more tags from interface::
>       * each application should have one or more tags from
>         implemented-in::

Yes.  But this need not be an etch goal: if we can have it in time
for etch, then that's great.  By the way, you just wrote the first
version of the guide, so we might even be almost done with this.

I added your points to:

  svn://svn.debian.org/debtags/debtags/trunk/doc/tagging-guideline.rst
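
The three rules quoted above lend themselves to a mechanical check.  What
follows is only a minimal sketch in Python, not part of Debtags itself: the
function name `missing_facets` and the `is_application` flag are
hypothetical, and it assumes tags are plain `facet::tag` strings.

```python
def missing_facets(tags, is_application):
    """Return the required facets that have no tag in `tags`.

    Hypothetical helper: every package needs a role:: tag; applications
    also need interface:: and implemented-in:: tags.
    """
    # Collect the facet part (before the first '::') of every tag.
    facets = {t.split("::", 1)[0] for t in tags if "::" in t}
    required = ["role"]
    if is_application:
        required += ["interface", "implemented-in"]
    return [f for f in required if f not in facets]


if __name__ == "__main__":
    print(missing_facets({"role::program"}, is_application=True))
```

A tagging front end could use a check like this to remind the user which
facets are still empty before a submission is accepted.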


>               * Btw. perhaps it would be good to add an
>                 implemented-in::other tag? (However this leads back to
>                 the discussion of whether we should try to avoid an
>                 explosion of the tagset or not...)

For a while I've been considering adding a TODO tag to every facet, to
mark cases where a tag from that facet is needed but missing.

uhm... I've been considering it for a while, and it's about time to do
it: I've done it, let's see.


>               * What about documentation packages? Would
>                 implemented-in make sense for them too (e.g.
>                 implemented-in::html, implemented-in::sgml,...)?

That sounds more like a file format than an implementation.  I guess
this means that non-software packages won't have implemented-in:: tags.
Added to the tagging guide.

>               * I've added C# to the implemented-in facet (I've named it
>                 c-sharp as I don't know if a # is allowed in the
>                 vocabulary.) Enrico, could you please change this to c#
>                 if it is allowed as this is more appropriate?

Debtags won't break if you use #; however I feel like being conservative
in case applications with weaker parsers want to grok tags, so I'd say
let's go with c-sharp.


>       * ....
> Probably it would be great, if the AI-Tagger could be used to assist the
> developers in the tagging process. However it is not yet in production
> state, and I am not too happy with the overall design. Perl is not the
> first choice language for user friendly applications. However I know no
> other language with the same string processing capabilities, and I've
> already invested a lot of work. I hope I will find some time to improve
> the tagger and make it usable. But this should be discussed later.

Talking about the AI-Tagger, I've met Hanna Wallach here and she's
interested in joining the work on it.  That's exactly the field of her
PhD; I hope she finds some time for it.


> >  - if data goes automatically to the Packages file, we should have a
> >    stricter control on what goes in.  A simple and good option is to
> >    have the central database reside on the svn repository, and
> >    manually commit the patches from the web interface.
> Why do we need stricter control for this? What is the difference in
> automatically changing the package file and automatically changing the
> debtags database (or is this not done automatically now?). Is one or the
> other more likely to be "attacked"? 

You don't want someone to add all tags to all packages and increase the
size of the Packages file by some megs; and in case someone implements
filters in aptitude, you don't want to make mozilla disappear just
because someone tagged it role::aux-shlib.

I want to have a level of human-driven sanity check before things move
from the world-writable website and the debtags-edit mails into what's
actually used.
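
As an illustration only, an automated pre-screen could sit in front of that
human check.  The sketch below is in Python and rests on assumptions: the
function name `flag_suspicious`, the patch layout (package name mapped to
the set of tags to add) and the `max_added` threshold are all hypothetical
and are not how Debtags stores or reviews patches.

```python
def flag_suspicious(patch, vocabulary, max_added=10):
    """Return {package: [reasons]} for patch entries a human should review.

    patch:      dict mapping package name -> set of tags to add (assumed layout)
    vocabulary: set of known tags
    max_added:  hypothetical limit on tags added to one package in one patch
    """
    flagged = {}
    for package, added in patch.items():
        reasons = []
        # Tags that are not in the vocabulary at all.
        unknown = sorted(t for t in added if t not in vocabulary)
        if unknown:
            reasons.append("unknown tags: " + ", ".join(unknown))
        # Mass tagging that would bloat the Packages file.
        if len(added) > max_added:
            reasons.append(f"adds {len(added)} tags (limit {max_added})")
        if reasons:
            flagged[package] = reasons
    return flagged
```

A check like this would catch mass tagging, but not a single plausible yet
wrong tag such as role::aux-shlib on mozilla, which is why the human review
step still matters.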


> >  - implement simple pre-cooked tag-expression-based package filters into
> >    aptitude (something like:
> >      (!(role::aux-data || role::aux-dummy || role::aux-shlib))
> >    or
> >      (!culture::* || culture::italian) )
> What do you mean here?

The idea is adding one or two options in aptitude that are implemented
via tag filters.  For example, you can have an option such as 'hide all
automatic dependencies' that is implemented using this expression:

   !(role::aux-data || role::aux-dummy || role::aux-shlib)

or an option such as 'hide culture-specific packages that are not
interesting to me' implemented using:

   !culture::* || culture::italian

or also an option such as 'mark all dummy packages for removal'.
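
To make the semantics of the two expressions concrete, here is a minimal
sketch in Python.  It is not aptitude or Debtags code: the function names
are hypothetical, a package is assumed to be represented by its set of tag
strings, and the `*` wildcard is matched with `fnmatch`.  Each function
returns True when the package should stay visible.

```python
from fnmatch import fnmatchcase


def has_tag(tags, pattern):
    """True if any tag in `tags` matches `pattern` ('*' is a wildcard)."""
    return any(fnmatchcase(tag, pattern) for tag in tags)


def keep_non_auxiliary(tags):
    # !(role::aux-data || role::aux-dummy || role::aux-shlib)
    return not (
        has_tag(tags, "role::aux-data")
        or has_tag(tags, "role::aux-dummy")
        or has_tag(tags, "role::aux-shlib")
    )


def keep_for_culture(tags, wanted="culture::italian"):
    # !culture::* || culture::italian
    return (not has_tag(tags, "culture::*")) or has_tag(tags, wanted)
```

Note that the culture filter keeps packages with no culture:: tag at all,
so only packages tagged for other cultures exclusively are hidden.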


> The other goals sound sensible to me.

Great!  There's quite a bit of excitement going on here at Debconf5
about Debtags.  I'm quite optimistic about the near future!


Ciao,

Enrico

--
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico at debian.org>