[Debtags-devel] spellchecking debian-packages

Enrico Zini enrico@enricozini.org
Mon, 6 Jun 2005 16:29:23 +0200


On Mon, Jun 06, 2005 at 11:24:47AM +0100, Justin B Rye wrote:

> Hi; I'm interested in helping.  I am not a programmer (indeed, I'm
> Google's top hit for "IANAP"), but I've been a Debian user and
> sysadmin for many years, and a quick look at the "debian-packages"
> file shows one obvious thing I can do - spelling patch attached.

Wow, thanks!  I applied both patches.

I had a look at your page (mentioning IANAP and Google hits was a
deliberate strategy to lure us there: admit it ;) and wow!  Librarian,
linguist...  Finally someone who may actually know something about
categorization has landed here! :)

Have you had a look at http://debtags.alioth.debian.org/paper-debtags.html ?


> I'm also strongly tempted to provide a patch imposing a standard
> capitalisation policy, and there are some other corrections I'd
> argue for (eg: there's no such thing as "X-Windows"), but I'm
> starting with something simple.

That's fine with me.  I think the vocabulary entries as a whole haven't
had much proofreading at all, mainly because we are not sure about many
of them: if you look at the "Status:" of the facets, 3 of them are
marked 'complete' (although among them, 'culture' could be quite
debatable); 9 are 'needing-review'; 13 are 'draft' and 4 are
'controversial'.
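
A tally like the one above can be recomputed rather than counted by
hand.  The following is only a minimal sketch: it assumes the vocabulary
is a series of blank-line-separated stanzas with "Facet:" and "Status:"
fields, and the function name and the sample data are made up for
illustration.

```python
from collections import Counter


def facet_status_counts(vocabulary_text):
    """Count facets per Status value in a stanza-formatted vocabulary.

    Assumption: stanzas are separated by blank lines and facet stanzas
    carry 'Facet:' and 'Status:' fields.
    """
    counts = Counter()
    for stanza in vocabulary_text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Skip continuation lines and anything without a field name.
            if line[:1].isspace() or ":" not in line:
                continue
            name, value = line.split(":", 1)
            fields[name.strip()] = value.strip()
        # Only facet stanzas are counted; tag stanzas are ignored.
        if "Facet" in fields:
            counts[fields.get("Status", "unknown")] += 1
    return counts


# Made-up sample data, not the real vocabulary.
sample = """\
Facet: culture
Status: complete
Description: Culture

Facet: use
Status: draft
Description: Purpose

Tag: use::editing
Description: Editing

Facet: suite
Status: controversial
"""

for status, count in sorted(facet_status_counts(sample).items()):
    print(status, count)
```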

We are facing questions such as:

 - the 'use::' facet is damn useful, but how do we define it really?
   What should go in it and what should not?
 - how do we categorise technologies?  For now I have split them into
   several facets (format, protocol, dbtech, hwtech, filetransfer), but
   that's questionable: isn't filetransfer the same as protocol?  Aren't
   all of these just the same aspect of a package, that is, the
   technology it uses, so that they should go in a single 'technology'
   facet?
 - what is a 'suite'?  It's clearly useful to categorise applications
   by the bigger whole they are part of, but is 'apache' really a
   suite?  And what applications are really part of gnome?  What goes in
   the suite 'debian'?  Don't we have a thousand more (perl, GNU R, GCC
   and its various compilers...)?
 - how do we handle facets that allow categorization with lots of tags?
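
To make the 'technology' question concrete, here is a rough sketch of
what collapsing the five facets into one would do to a package's tags.
The facet names come from the list above; the individual tag names and
the function are hypothetical, chosen only to show the effect.

```python
# Facets named in the question above; merging them is only one option.
TECH_FACETS = {"format", "protocol", "dbtech", "hwtech", "filetransfer"}


def merge_technology(tags):
    """Rewrite tags from the technology-like facets into 'technology::'."""
    merged = set()
    for tag in tags:
        facet, sep, name = tag.partition("::")
        if sep and facet in TECH_FACETS:
            merged.add("technology::" + name)
        else:
            # Tags from other facets are kept unchanged.
            merged.add(tag)
    return merged


# Hypothetical tags for an imaginary FTP client.
before = {"protocol::ftp", "filetransfer::ftp", "use::downloading"}
print(sorted(merge_technology(before)))
```

With these made-up tags, the 'protocol' and 'filetransfer' entries
collapse into a single tag, which is the overlap the question is about.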


> Incidentally, I don't see any mention in the archives of
> /usr/lib/menu or /usr/share/doc-base files, which each implement
> "section" hierarchies slightly divergent from the old system of
> repository sections.  I hope they aren't being overlooked.

Uhm, well, aehm, they were in fact overlooked, but now that you've
mentioned them they aren't overlooked anymore ;)

How do we handle them?  I see two possibilities (there may be more):
 - Map them directly onto some of our tags
 - Use them as heuristic data and implement some strategy in autodebtag
   to deduce some tags from them.
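
The two possibilities can be sketched side by side.  This is not how
autodebtag works: the section names, the tags and the function below
are all assumptions made up for the sake of the example.

```python
# Strategy 1: sections we trust enough to map straight onto tags.
# Both the sections and the tags are hypothetical examples.
DIRECT_MAP = {
    "Apps/Editors": {"use::editing"},
    "Games/Arcade": {"game::arcade"},
}

# Strategy 2: weaker hints keyed on the top-level section only.
HEURISTIC_HINTS = {
    "Games": {"use::gameplaying"},
}


def tags_from_section(section):
    """Return (tags, how) where how is 'direct', 'heuristic' or 'none'."""
    if section in DIRECT_MAP:
        return set(DIRECT_MAP[section]), "direct"
    top_level = section.split("/", 1)[0]
    if top_level in HEURISTIC_HINTS:
        # A hint is only a suggestion and would still need review.
        return set(HEURISTIC_HINTS[top_level]), "heuristic"
    return set(), "none"


for section in ("Apps/Editors", "Games/Puzzles", "Apps/Math"):
    print(section, tags_from_section(section))
```

Keeping the 'how' next to the tags would let a reviewer accept direct
mappings as they are and look only at the heuristic suggestions.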

Justin, welcome aboard!


Ciao,

Enrico

--
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico@enricozini.org>
