[Dict-common-dev] Default wordlist selection by locale
agustin.martin at hispalinux.es
Fri Sep 2 14:47:54 UTC 2005
On Fri, Sep 02, 2005 at 01:41:36PM +0100, Colin Watson wrote:
> I've got a hairy problem with installation of dictionaries as part of
> the Ubuntu installation process.
> Part of our policy in the Ubuntu installer is to ask as few questions as
> possible, and in particular to avoid questions in the second stage
> (base-config). In most cases the second stage needs to ask no questions
> at all, although occasionally X can't figure out the screen resolution
> and has to ask for that. However, since we upgraded to the new
> dictionaries-common (currently 0.49.2) and dictionaries as part of the
> aspell 0.60 transition, installations have been asking which dictionary
> should be the default.
Strange, the code handling that should be mostly the same in sarge as in
0.49.2.
> We install some set of dictionaries as dependencies of the
> language-support-* packages, depending on the selected locale; for
> example, language-support-en depends on wamerican and wbritish. In fact,
> language-support-en is always installed in addition to any other
> locale-specific language support package, so there are always multiple
> dictionaries and a question will always be asked.
That is true if you install the language-support package separately, but it
should not happen on a first install from scratch (or at the base-config
stage) along with the dictionaries-common package, unless there is not even
a fallback match.
> Now, there are various ways I could get around this:
> * In the short term I'm just going to drop the priority of the
> wordlist question to medium in Ubuntu; unfortunately, that leaves
> /etc/dictionaries-common/words (and thus /usr/share/dict/words) as a
> dangling symlink, which is obviously bad.
> * I could change dc-debconf-select.pl to select a wordlist arbitrarily
> in the event that none was explicitly selected. That doesn't produce
> very good results, though, especially in case wamerican (say)
> manages to sort before the appropriate wordlist for the primary
> language the user selected.
> * We often have better information about the default wordlist than
> just the language part of the locale; if the user selected en_US,
> then wamerican should really be the default wordlist, but if they
> selected en_GB then it should be wbritish. I could have an enormous
> lookup table in localechooser or something that selects a default
> wordlist.
> * Putting this all in localechooser is pretty nasty, though; the set
> of available wordlist packages could change at any time, and I don't
> want to have to keep up with it. How about having each wordlist
> package declare some kind of a priority for various locales (e.g.
> wamerican could be en_US:10, en_*:5, wbritish could be en_GB:10,
> en_ZA:9, en_*:5, etc.)? Then something in dictionaries-common could
> select a good default in case the user didn't explicitly select one,
> and all the information would reside in individual packages rather
> than in the installer, which is generally a good plan.
> Does this make any kind of sense to any dictionary maintainers, or am I
> missing something that lets me get good results already?
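The per-package priority idea in the quoted proposal can be illustrated with a minimal Python sketch. The scores are the example numbers from the quoted text; the `DECLARED` table, the `pick_default` function and the tie-breaking rule (alphabetical package order) are assumptions made for illustration, not anything that exists in dictionaries-common.

```python
from fnmatch import fnmatchcase

# Hypothetical per-package declarations, using the example numbers from
# the quoted proposal (pattern -> priority).
DECLARED = {
    "wamerican": {"en_US": 10, "en_*": 5},
    "wbritish": {"en_GB": 10, "en_ZA": 9, "en_*": 5},
}


def pick_default(locale, declared=DECLARED):
    """Return the package with the highest priority for locale, or None.

    Ties are broken by alphabetical package name (an assumption; the
    proposal does not say how ties should be resolved).
    """
    best_score = None
    best_package = None
    for package in sorted(declared):
        # Collect the priority of every pattern that matches the locale.
        scores = [
            score
            for pattern, score in declared[package].items()
            if fnmatchcase(locale, pattern)
        ]
        if not scores:
            continue
        score = max(scores)
        if best_score is None or score > best_score:
            best_score = score
            best_package = package
    return best_package
```

With these declarations, en_US selects wamerican and both en_GB and en_ZA select wbritish, while a locale that matches no pattern (es_ES, say) selects nothing and would still need a question or some other fallback.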
dictionaries-common.config should already give those results at the base
installation stage. That is where the pre-seeding is done, based on the
"debian-installer/language" and "debian-installer/country" debconf values.
If a reasonable value is found it is pre-seeded, and the question priority
is set to low, for control maniacs.
If something is going wrong there, that is the place to fix it.
(The installed dictionaries-common.config is really the concatenation of
dictionaries-common.config and dc-debconf-select.pl.) At the base
installation stage, when only the config scripts are run and packages are
not yet installed, that is the script that will be run (those in the
dicts/wordlists will do nothing because the dc-debconf-select.pl script is
not yet installed).
The code there should try to guess the default ispell dictionary/wordlist
from the debian-installer settings, or from the previous symlinks if
upgrading from woody, with different priorities depending on the quality of
the match:
a) Try exact match. If found
-> set debconf value, question priority low
b) Try a reasonable fallback (e.g., en_GB is selected and no British dict
is installed, but an American one is)
-> set value with question priority medium
c) Try an English variant
-> set value with question priority medium
d) None of the above
-> ask question with priority critical
Note that while the priorities in cases (b) to (d) look high, in practice
they should not result in a debconf question being prompted for any but
very special setups, since most setups will have a single ispell
dictionary/wordlist installed. Also, if values are previously set, nothing
will be changed.
This should only leave the question pending at maximal priority if, e.g.,
the language is es_ES and no Spanish or English dict is to be installed,
but German and French ones are, or in similar setups.
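The guessing steps (a) to (d) above can be sketched roughly as follows. This is only an illustration in Python, not the actual Perl code in dictionaries-common.config: the `WORDLIST_LOCALES` table, the `guess_default` function and the locale coverage assigned to each package are assumptions, and the two arguments stand in for the "debian-installer/language" and "debian-installer/country" values.

```python
# Illustrative table only: which locales each wordlist package serves.
WORDLIST_LOCALES = {
    "wamerican": ["en_US"],
    "wbritish": ["en_GB"],
    "wspanish": ["es_ES"],
    "wngerman": ["de_DE"],
    "wfrench": ["fr_FR"],
}


def guess_default(language, country, installed, previous=None):
    """Return (package, question_priority) following steps (a) to (d).

    If a value was previously set, it is returned unchanged with a
    priority of None, since nothing is modified in that case.
    """
    if previous is not None:
        return previous, None

    locale = f"{language}_{country}"

    def first_match(predicate):
        # First installed package with at least one locale satisfying predicate.
        for package in installed:
            locales = WORDLIST_LOCALES.get(package, [])
            if any(predicate(loc) for loc in locales):
                return package
        return None

    # a) exact match -> question priority low
    package = first_match(lambda loc: loc == locale)
    if package:
        return package, "low"

    # b) reasonable fallback: same language, other country -> medium
    package = first_match(lambda loc: loc.split("_")[0] == language)
    if package:
        return package, "medium"

    # c) any English variant -> medium
    package = first_match(lambda loc: loc.startswith("en_"))
    if package:
        return package, "medium"

    # d) nothing suitable -> the question is asked at priority critical
    return None, "critical"
```

For example, with language "es", country "ES" and only wngerman and wfrench installed, the sketch returns no package and priority critical, which corresponds to the one case where the question is expected to remain pending.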
Removing dictionaries-common along with any other package depending on it,
and reinstalling them all together with something like
# DICT_COMMON_DEBUG="yes" apt-get install language-support-en
should give a lot of information about the guessing process. As a matter of
fact,
# dpkg --purge --force-depends dictionaries-common
# DICT_COMMON_DEBUG="yes" apt-get install dictionaries-common
from an already installed system should also give relevant information.
With that info, we can try guessing what is going wrong, and look for a fix.