[Dict-common-dev] aspell-en problems with aspell-autobuildhash

Agustin Martin agustin.martin@hispalinux.es
Wed, 13 Jul 2005 01:17:46 +0200


--jI8keyz6grp/JLjh
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Tue, Jul 12, 2005 at 10:51:00AM -0700, Brian Nelson wrote:
> 
> Installed a /var/lib/aspell/en.compat and running aspell-autobuildhash
> fails because it looks for /usr/share/aspell/en.*wl.gz and fails to find
> any matching wordlists.
> 
> Adding separate .compat files for each of these lists (instead of a
> single ) doesn't produce the correct results either, because it runs
> stuff like "aspell --lang=en-common" with looks for a corresponding
> en-common.dat, which of course doesn't exist.
> 
> I think the correction is to, instead of looking for one of
> /usr/share/aspell/$lang.*wl.gz, look at /usr/share/aspell/$lang*.*wl.gz
> and run $build_hash for each one.

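To see what the proposed pattern would pick up, here is a minimal sketch run
against a scratch directory. Everything in it is an assumption for
illustration: matching_wordlists is a hypothetical helper, not part of
aspell-autobuildhash, and the empty wordlist files are made up. In the real
script the directory would be /usr/share/aspell.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Hypothetical helper: names of the wordlists matched by the proposed
# pattern $dir/$lang*.*wl.gz, sorted alphabetically.
sub matching_wordlists {
    my ($dir, $lang) = @_;
    my @names  = map { basename($_) } bsd_glob("$dir/$lang*.*wl.gz");
    my @sorted = sort @names;
    return @sorted;
}

# Demo with made-up, empty files in a temporary directory.
my $demo_dir = tempdir(CLEANUP => 1);
for my $name (qw(en-common.cwl.gz en-variant_0.cwl.gz de.mwl.gz de-old.mwl.gz)) {
    open(my $fh, '>', "$demo_dir/$name") or die "Cannot create $name: $!";
    close $fh;
}
print "en: ", join(' ', matching_wordlists($demo_dir, 'en')), "\n";
print "de: ", join(' ', matching_wordlists($demo_dir, 'de')), "\n";
```

With these files, the pattern for 'en' matches both en-* wordlists, and the
pattern for 'de' matches de.mwl.gz as well as de-old.mwl.gz.
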
That is not a problem for English, because everything is in a single package
where you have full control, but I am not sure whether it could be a problem
for other languages that have variants packaged separately, as 'de' and
'de-old' might be. I think it is more robust to use a file in
/usr/share/aspell that lists the subdicts to be processed. For English I am
thinking of something like one of

/usr/share/aspell/en.{data,wordlists,subdicts}

or any better suggestion (I do not particularly like any of the above).
That file would contain something like

en-common
en-variant_0
en-variant_1
en-variant_2
en_CA-w_accents-only
en_CA-wo_a...
[...]

That file would be parsed and $build_hash run for each entry, with a common
$lang for all of them. If the file does not exist, $build_hash would be run
only for $lang.
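
The parsing step could be sketched roughly as follows. This is only an
illustration under stated assumptions: read_sublangs is a hypothetical helper
name, the file is taken to be /usr/share/aspell/$lang.data, and blank lines
and lines starting with '#' are skipped, as in the attached patch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of aspell-autobuildhash): return the list of
# subdicts for $lang, read from $langsfile if it exists, else just ($lang).
sub read_sublangs {
    my ($lang, $langsfile) = @_;
    return ($lang) unless -e $langsfile;

    open(my $fh, '<', $langsfile)
        or die "Could not open $langsfile for reading: $!";
    my @sublangs;
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/^\s+//;        # strip leading whitespace
        $line =~ s/\s+$//;        # strip trailing whitespace
        next if $line eq '';      # skip blank lines
        next if $line =~ /^#/;    # skip comment lines
        push @sublangs, $line;
    }
    close $fh;
    return @sublangs;
}

# Each entry would be handed to $build_hash together with the common $lang;
# here the pairs are only printed.
my $lang = @ARGV ? $ARGV[0] : 'en';
for my $sublang (read_sublangs($lang, "/usr/share/aspell/$lang.data")) {
    print "would build: lang=$lang subdict=$sublang\n";
}
```

If the file is missing, the loop runs once with the subdict equal to $lang,
which matches the fallback described above.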

> 
> Does that sound correct?  If so, I can come up with a patch ...
>

The only difference is how the list of subdicts is obtained. I have quickly
written something using that file (as $lang.data), but have not tested it at
all yet. I am attaching the patch so you can have a look at it, but note that
it is completely untested, so treat it with a lot of care. I hope to test it
tomorrow, at least with langs having a single subdict named after $lang.

-- 
Agustin

--jI8keyz6grp/JLjh
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="aspell-autobuildhash.diff"

Index: aspell-autobuildhash
===================================================================
RCS file: /cvsroot/dict-common/dictionaries-common/scripts/system/aspell-autobuildhash,v
retrieving revision 1.2
diff -u -r1.2 aspell-autobuildhash
--- aspell-autobuildhash	7 Jul 2005 10:57:30 -0000	1.2
+++ aspell-autobuildhash	12 Jul 2005 22:55:35 -0000
@@ -64,39 +64,60 @@
 # ---------------------------------------------------------------------
 
 sub autorebuild {
-    my $lang = shift ||                          # The dictionary name
+    my $lang      = shift ||                          # The dictionary name
 	myerror "No argument passed to function autorebuild";
-    my $base = "/usr/share/aspell/$lang";        # the wordlist basename
-    my $hash = "/var/lib/aspell/$lang.rws";      # the hash file
-    my $data = "/usr/lib/aspell";                # The data/lib dir
-    my $msg  ='';
-    my $unpack = '';
-    my $options = "--dont-validate-affixes" unless $debug;
+    my $data      = "/usr/lib/aspell";                # The data/lib dir
+    my $langsfile = "/usr/share/aspell/$lang.data";   # The subdicts file
+    my $options   = "--dont-validate-affixes" unless $debug;
+    my @sublangs  = ();
     
-    print STDERR "aspell-autobuildhash: processing lang: $lang\n";
-
     myerror "aspell data dir $data does not exist" unless ( -d $data );
-
-    if ( -e "$base.mwl.gz" ){
-	$unpack = "zcat $base.mwl.gz";
-    } elsif ( -e "$base.wl.gz") {
-	$unpack = "zcat $base.wl.gz";
-    } elsif ( -e "$base.cwl.gz") {
-	$unpack = "zcat $base.cwl.gz | precat";
+    
+    if ( -e $langsfile ){
+	open (LANGSFILE, "< $langsfile") || die "Could not open $langsfile for reading";
+	@sublangs = <LANGSFILE>;
+	close LANGSFILE;
     } else {
-	mymessage "Could not find any of $base.{mwl,wl,cwl}.gz";
-	return 0;
+	push @sublangs, $lang;
     }
-
-    #$unpack = "$unpack | aspell clean strict";
-    system ("$unpack | aspell $options --local-data-dir=$data --lang=$lang create master $hash") == 0
-	or $msg = "Could not build the hash file for $lang" ;
     
-    if ( $msg ){             # Do not break postinst if hash cannot be built
-	mymessage ($msg);    # Just inform about that
-	return 0;
-    }  
- 
+    foreach ( @sublangs ){
+	
+	next if m/^[\t\s]*$/;
+	chomp;
+	s/^[\s\t]*//;
+	s/[\s\t]*$//;
+	next if m/^\#/;
+	
+	my $sublang = $_;
+	my $base    = "/usr/share/aspell/$sublang";     # the wordlist basename
+	my $hash    = "/var/lib/aspell/$sublang.rws";   # the hash file
+	my $msg     ='';
+	my $unpack  = '';
+	
+	print STDERR "aspell-autobuildhash: processing: $lang [$sublang]\n";
+	
+
+	if ( -e "$base.mwl.gz" ){
+	    $unpack = "zcat $base.mwl.gz";
+	} elsif ( -e "$base.wl.gz") {
+	    $unpack = "zcat $base.wl.gz";
+	} elsif ( -e "$base.cwl.gz") {
+	    $unpack = "zcat $base.cwl.gz | precat";
+	} else {
+	    mymessage "Could not find any of $base.{mwl,wl,cwl}.gz";
+	    return 0;
+	}
+	
+	#$unpack = "$unpack | aspell clean strict";
+	system ("$unpack | aspell $options --local-data-dir=$data --lang=$lang create master $hash") == 0
+	    or $msg = "Could not build the hash file for $sublang" ;
+	
+	if ( $msg ){             # Do not break postinst if hash cannot be built
+	    mymessage ($msg);    # Just inform about that
+	    return 0;
+	}
+    }
     return 1;
 }
 

--jI8keyz6grp/JLjh--