[Dict-common-dev] Re: Bug#385403: aspell-ru: hash file is not created

Agustin Martin agmartin at debian.org
Sun Sep 24 20:37:27 UTC 2006


On Sat, Sep 23, 2006 at 09:12:28PM +0300, Martin-Éric Racine wrote:
> Hello Brian,
> 
> It seems that a recent release of the Russian wordlist (source:
> rus-ispell) contains words that aspell interprets as illegal, which makes
> the hash generation break, leaving users with an empty hash file (Bug
> #385403) and making this an RC bug. Both words definitely exist in the
> Russian language, so I am unsure how to solve this issue. Any ideas?

The error message

====postinst log===
????????????? ????? aspell-ru (0.99g3-1) ...
aspell-autobuildhash: processing: ru [ru]
??????: /usr/lib/aspell//ru_affix.dat:1246: The condition "???" does not
guarantee that "????" can always be stripped.
aspell-autobuildhash: processing: ru [ru]
??????: /usr/lib/aspell//ru_affix.dat:1246: The condition "???" does not
guarantee that "????" can always be stripped.
===================

suggests that the rule on line 1246 of ru_affix.dat is wrong and cannot
always be applied to the words that match its condition. Indeed, looking at
that rule, I see something that, transliterated to 7-bit chars, looks like

SFX L estn as stn

That means: if the word ends in 'stn', strip 'estn' (buggy!) and replace it
with 'as'. But a word ending in something like 'astn' matches the condition,
yet 'estn' cannot be stripped from it, hence the error.
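
To make the failing check concrete, here is a minimal Python sketch of the
kind of test involved. It is not aspell's actual code; the function names are
made up for illustration, and it assumes the field order "SFX flag strip add
condition" as read off the rule above, with only plain characters, '.',
'[...]' and '[^...]' in the condition.

```python
def parse_condition(cond):
    """Split an ispell-style condition into (negated, chars) tokens."""
    tokens = []
    i = 0
    while i < len(cond):
        if cond[i] == "[":
            end = cond.index("]", i)
            body = cond[i + 1:end]
            negated = body.startswith("^")
            tokens.append((negated, set(body[1:] if negated else body)))
            i = end + 1
        elif cond[i] == ".":
            # '.' matches any character: "anything except the empty set"
            tokens.append((True, set()))
            i += 1
        else:
            tokens.append((False, {cond[i]}))
            i += 1
    return tokens


def strip_is_guaranteed(strip, cond):
    """True if every word matching `cond` is certain to end in `strip`."""
    tokens = parse_condition(cond)
    if len(tokens) < len(strip):
        # The condition says nothing about the leading characters of `strip`.
        return False
    tail = tokens[len(tokens) - len(strip):]
    # Each trailing position must pin down exactly the character to strip.
    return all(not neg and chars == {ch}
               for (neg, chars), ch in zip(tail, strip))


# Hypothetical 7-bit stand-ins for the rules discussed in this message.
print(strip_is_guaranteed("estn", "stn"))      # the rule on line 1246
print(strip_is_guaranteed("estn", "estn"))     # the dummy fix in the patch
print(strip_is_guaranteed("sq", "[^aiqx]sq"))  # a rule with a longer condition
```

The first call reports that the strip string is not guaranteed, because the
condition is shorter than the string to be stripped; the other two conditions
both end in the full strip string.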

I blindly changed that line into something that can be applied, and I then
get a different set of errors and warnings,

---------------------
Warning: The word "???" is invalid. The total length is larger than 240
characters. Skipping word.
...
Error: The word "?????????" is invalid. The total word length, with
soundslike data, is larger than 240 characters.
---------------------

which disappear if I use the .wl file instead of the .cwl for building the
hash (I also removed the offending number from the first line of the
wordlist). I tested this with a personal sarge backport of aspell, so I
cannot tell whether this only means that my backport is buggy or whether it
is a general problem. I hope to try that tomorrow with a current aspell.

Regarding the original problem, the right ru_affix.dat fix is definitely
something for rus-ispell upstream, or at least for somebody fluent in
Russian.

I am attaching a patch with the changes I used here to test that the rule
was failing. Do not consider it a real patch; I do not speak Russian, so it
is just a dummy test.

Hope this helps

-- 
Agustin
-------------- next part --------------
--- ru_affix.dat.orig	2006-09-24 22:05:38.000000000 +0200
+++ ru_affix.dat	2006-09-24 22:05:55.000000000 +0200
@@ -1244,7 +1244,7 @@
 SFX L   ÓÑ   ÌÁÓØ         [^ÁÉÑØ]ÓÑ
 SFX L   ÅÞØÓÑ £ËÓÑ         ÅÞØÓÑ
 SFX L   ÅÞØ  £Ë           ÅÞØ
-SFX L   ÅÚÔÉ £Ú           ÚÔÉ
+SFX L   ÅÚÔÉ £Ú           ÅÚÔÉ
 SFX L   ÅÓÔØ ÌÏ           ÞÅÓÔØ
 SFX L   ÅÓÔØ ÌÉ           ÞÅÓÔØ
 SFX L   ÅÓÔØ ÌÁ           ÞÅÓÔØ

