r4639 - in /packages/liblingua-ispell-perl: ./ branches/ branches/upstream/ branches/upstream/current/ branches/upstream/current/lib/ branches/upstream/current/lib/Lingua/ tags/

gwolf at users.alioth.debian.org gwolf at users.alioth.debian.org
Thu Dec 21 05:39:38 CET 2006


Author: gwolf
Date: Thu Dec 21 05:39:37 2006
New Revision: 4639

URL: http://svn.debian.org/wsvn/pkg-perl/?sc=1&rev=4639
Log:
[svn-inject] Installing original source of liblingua-ispell-perl

Added:
    packages/liblingua-ispell-perl/
    packages/liblingua-ispell-perl/branches/
    packages/liblingua-ispell-perl/branches/upstream/
    packages/liblingua-ispell-perl/branches/upstream/current/
    packages/liblingua-ispell-perl/branches/upstream/current/Changes
    packages/liblingua-ispell-perl/branches/upstream/current/MANIFEST
    packages/liblingua-ispell-perl/branches/upstream/current/Makefile.PL
    packages/liblingua-ispell-perl/branches/upstream/current/README
    packages/liblingua-ispell-perl/branches/upstream/current/lib/
    packages/liblingua-ispell-perl/branches/upstream/current/lib/Lingua/
    packages/liblingua-ispell-perl/branches/upstream/current/lib/Lingua/Ispell.pm
    packages/liblingua-ispell-perl/branches/upstream/current/spellcheck   (with props)
    packages/liblingua-ispell-perl/branches/upstream/current/test.pl   (with props)
    packages/liblingua-ispell-perl/tags/

Added: packages/liblingua-ispell-perl/branches/upstream/current/Changes
URL: http://svn.debian.org/wsvn/pkg-perl/packages/liblingua-ispell-perl/branches/upstream/current/Changes?rev=4639&op=file
==============================================================================
--- packages/liblingua-ispell-perl/branches/upstream/current/Changes (added)
+++ packages/liblingua-ispell-perl/branches/upstream/current/Changes Thu Dec 21 05:39:37 2006
@@ -1,0 +1,26 @@
+Revision history for Perl extension Lingua::Ispell.
+
+0.07  Tue Apr 18 11:09:59 EDT 2000
+	- removed support for non-terse mode; this saves me from having to try to
+	  parse the input line the same as ispell does.  The downside is that user
+	  programs don't get reports on correctly spelled words.  Hopefully this
+	  fixes a lot of bugs.
+	- "*_dictionaries" subs renamed as "*_dictionary"
+
+0.06  Fri Apr 14 16:51:44 EDT 2000
+	- misses/guesses now returned as arrays rather than space-separated strings.
+
+0.05  Fri Oct 22 10:53:38 EDT 1999
+	- renamed from Text::Ispell
+
+0.04  Tue Oct 12 11:22:06 EDT 1999
+	- added support for additional options
+
+0.03  Mon Aug 30 15:37:42 EDT 1999
+	- Documentation upgrades.
+
+0.02  [Skipped]
+
+0.01  Fri Aug 27 15:48:07 1999
+	- original version
+

Added: packages/liblingua-ispell-perl/branches/upstream/current/MANIFEST
URL: http://svn.debian.org/wsvn/pkg-perl/packages/liblingua-ispell-perl/branches/upstream/current/MANIFEST?rev=4639&op=file
==============================================================================
--- packages/liblingua-ispell-perl/branches/upstream/current/MANIFEST (added)
+++ packages/liblingua-ispell-perl/branches/upstream/current/MANIFEST Thu Dec 21 05:39:37 2006
@@ -1,0 +1,7 @@
+MANIFEST
+Makefile.PL
+test.pl
+Changes
+README
+lib/Lingua/Ispell.pm
+spellcheck

Added: packages/liblingua-ispell-perl/branches/upstream/current/Makefile.PL
URL: http://svn.debian.org/wsvn/pkg-perl/packages/liblingua-ispell-perl/branches/upstream/current/Makefile.PL?rev=4639&op=file
==============================================================================
--- packages/liblingua-ispell-perl/branches/upstream/current/Makefile.PL (added)
+++ packages/liblingua-ispell-perl/branches/upstream/current/Makefile.PL Thu Dec 21 05:39:37 2006
@@ -1,0 +1,7 @@
+use ExtUtils::MakeMaker;
+# See lib/ExtUtils/MakeMaker.pm for details of how to influence
+# the contents of the Makefile that is written.
+WriteMakefile(
+    'NAME'	=> 'Lingua::Ispell',
+    'VERSION_FROM' => 'lib/Lingua/Ispell.pm', # finds $VERSION
+);

Added: packages/liblingua-ispell-perl/branches/upstream/current/README
URL: http://svn.debian.org/wsvn/pkg-perl/packages/liblingua-ispell-perl/branches/upstream/current/README?rev=4639&op=file
==============================================================================
--- packages/liblingua-ispell-perl/branches/upstream/current/README (added)
+++ packages/liblingua-ispell-perl/branches/upstream/current/README Thu Dec 21 05:39:37 2006
@@ -1,0 +1,316 @@
+
+Note:
+A simple "spellchecking" program is included in this distribution.
+It is a perl program named "spellcheck".  It simply prints the
+analysis of the input text; it provides no way to modify the text. 
+It is simply given as a demonstration of the module.  Type 
+    spellcheck -h
+for a usage summary.  If no input files are specified, it will
+read from stdin.  After each line of input, it will print the
+analysis of the terms.  By default, it only gives output for
+terms which are "incorrect".  Give it the -v option to have it
+report on the "correct" terms as well.
+
+Tests:
+'make test' currently does nothing.  To test the installation,
+try out the "spellcheck" program provided. 
+
+__POD__
+
+NAME
+     Lingua::Ispell.pm - a module encapsulating access to the
+     Ispell program.
+
+     Note: this module was previously known as Text::Ispell; if
+     you have Text::Ispell installed on your system, it is now
+     obsolete and should be replaced by Lingua::Ispell.
+
+NOTA BENE
+     ispell, when reporting on misspelled words, indicates the
+     string it was unable to verify, as well as its starting
+     offset in the input line. No such information is returned
+     for words which are deemed to be correctly spelled.  For
+     example, in a line like "Can't buy a thrill", ispell simply
+     reports that the line contained four correctly spelled
+     words.
+
+     Lingua::Ispell would like to identify which substrings of
+     the input line are words -- correctly spelled or otherwise.
+     It used to attempt to split the input line into words
+     according to the same rules ispell uses; but that has proven
+     to be very difficult, resulting in both slow and error-prone
+     code.
+
+     Consequences
+
+     Lingua::Ispell now operates only in "terse" mode. In this
+     mode, only misspelled words are reported. Words which
+     ispell verifies as correctly spelled are silently accepted.
+
+     In the report structures returned by spellcheck(), the
+     'term' member is now always identical to the 'original'
+     member; of the two, you should probably use the 'term'
+     member.  (Also consider the 'offset' member.)  ispell does
+     not report this information for correctly spelled words; if
+     at some point in the future this capability is added to
+     ispell, Lingua::Ispell will be updated to take advantage of
+     it.
+
+     Use of the $word_chars variable has been removed; setting it
+     no longer has any effect.
+
+     terse_mode() now does nothing.
+
+SYNOPSIS
+      # Brief:
+      use Lingua::Ispell;
+      Lingua::Ispell::spellcheck( $string );
+      # or
+      use Lingua::Ispell qw( spellcheck ); # import the function
+      spellcheck( $string );
+
+      # Useful:
+      use Lingua::Ispell qw( :all );  # import all symbols
+      for my $r ( spellcheck( "hello hacking perl shrdlu 42" ) ) {
+ print "$r->{'type'}: $r->{'term'}\n";
+      }
+
+
+DESCRIPTION
+     Lingua::Ispell::spellcheck() takes one argument.  It must be
+     a string, and it should contain only printable characters.
+     One allowable exception is a terminal newline, which will be
+     chomped off anyway.  The line is fed to a coprocess running
+     ispell for analysis.  ispell parses the line into "terms"
+     according to the language-specific rules in effect.
+
+     The result of ispell's analysis of each term is a
+     categorization of the term into one of six types: ok,
+     compound, root, miss, none, and guess.  Some of these carry
+     additional information.  The first three types are
+     "correctly" spelled terms, and the last three are for
+     "incorrectly" spelled terms.
+
+     Lingua::Ispell::spellcheck returns a list of objects, each
+     corresponding to a term in the spellchecked string.  Each
+     object is a hash (hash-ref) with at least two entries:
+     'term' and 'type'.  The former contains the term ispell is
+     reporting on, and the latter is ispell's determination of
+     that term's type (see above).  For types 'ok' and 'none',
+     that is all the information there is.  For the type 'root',
+     an additional hash entry is present: 'root'.  Its value is
+     the word which ispell identified in the dictionary as being
+     the likely root of the current term.  For the type 'miss',
+     an additional hash entry is present: 'misses'.  Its value is
+     an ref to an array of words which ispell identified as being
+     "near-misses" of the current term, when scanning the
+     dictionary.
+
+     NOTE
+
+     As mentioned above, Lingua::Ispell::spellcheck() currently
+     only reports on misspelled terms.
+
+     EXAMPLE
+
+      use Lingua::Ispell qw( spellcheck );
+      Lingua::Ispell::allow_compounds(1);
+      for my $r ( spellcheck( "hello hacking perl salmoning fruithammer shrdlu 42" ) ) {
+ if ( $r->{'type'} eq 'ok' ) {
+   # as in the case of 'hello'
+   print "'$r->{'term'}' was found in the dictionary.\n";
+ }
+ elsif ( $r->{'type'} eq 'root' ) {
+   # as in the case of 'hacking'
+   print "'$r->{'term'}' can be formed from root '$r->{'root'}'\n";
+ }
+ elsif ( $r->{'type'} eq 'miss' ) {
+   # as in the case of 'perl'
+   print "'$r->{'term'}' was not found in the dictionary;\n";
+   print "Near misses: @{$r->{'misses'}}\n";
+ }
+ elsif ( $r->{'type'} eq 'guess' ) {
+   # as in the case of 'salmoning'
+   print "'$r->{'term'}' was not found in the dictionary;\n";
+   print "Root/affix Guesses: @{$r->{'guesses'}}\n";
+ }
+ elsif ( $r->{'type'} eq 'compound' ) {
+   # as in the case of 'fruithammer'
+   print "'$r->{'term'}' is a valid compound word.\n";
+ }
+ elsif ( $r->{'type'} eq 'none' ) {
+   # as in the case of 'shrdlu'
+   print "No match for term '$r->{'term'}'\n";
+ }
+ # and numbers are skipped entirely, as in the case of 42.
+      }
+
+
+     ERRORS
+
+     Lingua::Ispell::spellcheck() starts the ispell coprocess if
+     the coprocess seems not to exist. Ordinarily this is simply
+     the first time it's called.
+
+     ispell is spawned via the Open2::open2() function, which
+     throws an exception (i.e. dies) if the spawn fails.  The
+     caller should be prepared to catch this exception -- unless,
+     of course, the default behavior of die is acceptable.
+
+     Nota Bene
+
+     The full location of the ispell executable is stored in the
+     variable $Lingua::Ispell::path.  The default value is
+     /usr/local/bin/ispell.  If your ispell executable has some
+     name other than this, then you must set
+     $Lingua::Ispell::path accordingly before you call
+     Lingua::Ispell::spellcheck() (or any other function in the
+     module) for the first time!
+
+AUX FUNCTIONS
+     add_word(word)
+
+     Adds a word to the personal dictionary.  Be careful of
+     capitalization.  If you want the word to be added "case-
+     insensitively", you should call add_word_lc()
+
+     add_word_lc(word)
+
+     Adds a word to the personal dictionary, in lower-case form.
+     This allows ispell to match it in a case-insensitive manner.
+
+     accept_word(word)
+
+     Similar to adding a word to the dictionary, in that it
+     causes ispell to accept the word as valid, but it does not
+     actually add it to the dictionary.  Presumably the effects
+     of this only last for the current ispell session, which will
+     mysteriously end if any of the coprocess-restarting
+     functions are called...
+
+     parse_according_to(formatter)
+
+     Causes ispell to parse subsequent input lines according to
+     the specified formatter.  As of ispell v. 3.1.20, only 'tex'
+     and 'nroff' are supported.
+
+     set_params_by_language(language)
+
+     Causes ispell to set its internal operational parameters
+     according to the given language.  Legal arguments to this
+     function, and its effects, are currently unknown by the
+     author of Lingua::Ispell.
+
+     save_dictionary()
+
+     Causes ispell to save the current state of the dictionary to
+     its disk file.  Presumably ispell would ordinarily only do
+     this upon exit.
+
+     terse_mode(bool:terse)
+
+     NOTE: This function has been disabled! Lingua::Ispell now
+     always operates in terse mode.
+
+     In terse mode, ispell will not produce reports for "correct"
+     words.  This means that the calling program will not receive
+     results of the types 'ok', 'root', and 'compound'.
+
+
+FUNCTIONS THAT RESTART ISPELL
+     The following functions cause the current ispell coprocess,
+     if any, to terminate. This means that all the changes to the
+     state of ispell made by the above functions will be lost,
+     and their respective values reset to their defaults.  The
+     only function above whose effect is persistent is
+     save_dictionary().
+
+     Perhaps in the future we will figure out a good way to make
+     this state information carry over from one instantiation of
+     the coprocess to the next.
+
+     allow_compounds(bool)
+
+     When this value is set to True, compound words are accepted
+     as legal -- as long as both words are found in the
+     dictionary; more than two words are always illegal.  When
+     this value is set to False, run-together words are
+     considered spelling errors.
+
+     The default value of this setting is dictionary-dependent,
+     so the caller should set it explicitly if it really matters.
+
+     make_wild_guesses(bool)
+
+     This setting controls when ispell makes "wild" guesses.
+
+     If False, ispell only makes "sane" guesses, i.e.  possible
+     root/affix combinations that match the current dictionary;
+     only if it can find none will it make "wild" guesses, which
+     don't match the dictionary, and might in fact be illegal
+     words.
+
+     If True, wild guesses are always made, along with any "sane"
+     guesses. This feature can be useful if the dictionary has a
+     limited word list, or a word list with few suffixes.
+
+     The default value of this setting is dictionary-dependent,
+     so the caller should set it explicitly if it really matters.
+
+     use_dictionary([dictionary])
+
+     Specifies what dictionary to use instead of the default.
+     Dictionary names are actually file names, and are searched
+     for according to the following rule: if the name does not
+     contain a slash, it is looked for in the directory
+     containing the default dictionary, typically /usr/local/lib.
+     Otherwise, it is used as is: if it does not begin with a
+     slash, it is construed from the current directory.
+
+     If no argument is given, the default dictionary will be
+     used.
+
+     use_personal_dictionary([dictionary])
+
+     Specifies what personal dictionary to use instead of the
+     default.
+
+     Dictionary names are actually file names, and are searched
+     for according to the following rule:  if the name begins
+     with a slash, it is used as is (i.e. it is an absolute path
+     name). Otherwise, it is construed as relative to the user's
+     home directory ($HOME).
+
+     If no argument is given, the default personal dictionary
+     will be used.
+
+FUTURE ENHANCEMENTS
+     ispell options:
+
+       -w chars
+       Specify additional characters that can be part of a word.
+
+
+DEPENDENCIES
+     Lingua::Ispell uses the external program ispell, which is
+     the "International Ispell", available at
+
+       http://fmg-www.cs.ucla.edu/geoff/ispell.html
+
+     as well as various archives and mirrors, such as
+
+       ftp://ftp.math.orst.edu/pub/ispell-3.1/
+
+     This is a very popular program, and may already be installed
+     on your system.
+
+     Lingua::Ispell also uses the standard perl modules
+     FileHandle, IPC::Open2, and Carp.
+
+AUTHOR
+     jdporter at min.net (John Porter)
+
+COPYRIGHT
+     This module is free software; you may redistribute it and/or
+     modify it under the same terms as Perl itself.
+

Added: packages/liblingua-ispell-perl/branches/upstream/current/lib/Lingua/Ispell.pm
URL: http://svn.debian.org/wsvn/pkg-perl/packages/liblingua-ispell-perl/branches/upstream/current/lib/Lingua/Ispell.pm?rev=4639&op=file
==============================================================================
--- packages/liblingua-ispell-perl/branches/upstream/current/lib/Lingua/Ispell.pm (added)
+++ packages/liblingua-ispell-perl/branches/upstream/current/lib/Lingua/Ispell.pm Thu Dec 21 05:39:37 2006
@@ -1,0 +1,587 @@
+
+#(@) Lingua::Ispell.pm - a module encapsulating access to the Ispell program.
+
+=head1 NAME
+
+Lingua::Ispell.pm - a module encapsulating access to the Ispell program.
+
+Note: this module was previously known as Text::Ispell; if you have
+Text::Ispell installed on your system, it is now obsolete and should be
+replaced by Lingua::Ispell.
+
+=head1 NOTA BENE
+
+ispell, when reporting on misspelled words, indicates the string it was unable
+to verify, as well as its starting offset in the input line.
+No such information is returned for words which are deemed to be correctly spelled.
+For example, in a line like "Can't buy a thrill", ispell simply reports that the
+line contained four correctly spelled words.  
+
+Lingua::Ispell would like to identify which substrings of the input
+line are words -- correctly spelled or otherwise.  It used to attempt to split
+the input line into words according to the same rules ispell uses; but that has
+proven to be very difficult, resulting in both slow and error-prone code.
+
+=head2 Consequences
+
+Lingua::Ispell now operates only in "terse" mode.
+In this mode, only misspelled words are reported.
+Words which ispell verifies as correctly spelled are silently accepted.
+
+In the report structures returned by C<spellcheck()>, the C<'term'> member
+is now always identical to the C<'original'> member; of the two, you should 
+probably use the C<'term'> member.  (Also consider the C<'offset'> member.)
+ispell does not report this information for correctly spelled words; if at
+some point in the future this capability is added to ispell, Lingua::Ispell
+will be updated to take advantage of it.
+
+Use of the C<$word_chars> variable has been removed; setting it no longer
+has any effect.
+
+C<terse_mode()> now does nothing.
+
+=cut
+
+
+package Lingua::Ispell;
+use Exporter;
+ at Lingua::Ispell::ISA = qw(Exporter);
+ at Lingua::Ispell::EXPORT_OK = qw(
+  spellcheck
+  add_word
+  add_word_lc
+  accept_word
+  parse_according_to
+  set_params_by_language
+  save_dictionary
+  allow_compounds
+  make_wild_guesses
+  use_dictionary
+  use_personal_dictionary
+);
+%Lingua::Ispell::EXPORT_TAGS = (
+  'all' => \@Lingua::Ispell::EXPORT_OK,
+);
+
+
+use FileHandle;
+use IPC::Open2;
+use Carp;
+
+use strict;
+
+use vars qw( $VERSION );
+$VERSION = '0.07';
+
+
+=head1 SYNOPSIS
+
+ # Brief:
+ use Lingua::Ispell;
+ Lingua::Ispell::spellcheck( $string );
+ # or
+ use Lingua::Ispell qw( spellcheck ); # import the function
+ spellcheck( $string );
+
+ # Useful:
+ use Lingua::Ispell qw( :all );  # import all symbols
+ for my $r ( spellcheck( "hello hacking perl shrdlu 42" ) ) {
+   print "$r->{'type'}: $r->{'term'}\n";
+ }
+
+
+=head1 DESCRIPTION
+
+Lingua::Ispell::spellcheck() takes one argument.  It must be a
+string, and it should contain only printable characters.
+One allowable exception is a terminal newline, which will be
+chomped off anyway.  The line is fed to a coprocess running
+ispell for analysis.  ispell parses the line into "terms"
+according to the language-specific rules in effect.
+
+The result of ispell's analysis of each term is a categorization
+of the term into one of six types: ok, compound, root, miss, none,
+and guess.  Some of these carry additional information.
+The first three types are "correctly" spelled terms, and the last
+three are for "incorrectly" spelled terms.
+
+Lingua::Ispell::spellcheck returns a list of objects, each
+corresponding to a term in the spellchecked string.  Each object
+is a hash (hash-ref) with at least two entries: 'term' and 'type'.
+The former contains the term ispell is reporting on, and the latter
+is ispell's determination of that term's type (see above).
+For types 'ok' and 'none', that is all the information there is.
+For the type 'root', an additional hash entry is present: 'root'.
+Its value is the word which ispell identified in the dictionary
+as being the likely root of the current term.
+For the type 'miss', an additional hash entry is present: 'misses'.
+Its value is an ref to an array of words which ispell
+identified as being "near-misses" of the current term, when
+scanning the dictionary.
+
+=head2 NOTE
+
+As mentioned above, C<Lingua::Ispell::spellcheck()> currently only reports on misspelled terms.
+
+=head2 EXAMPLE
+
+ use Lingua::Ispell qw( spellcheck );
+ Lingua::Ispell::allow_compounds(1);
+ for my $r ( spellcheck( "hello hacking perl salmoning fruithammer shrdlu 42" ) ) {
+   if ( $r->{'type'} eq 'ok' ) {
+     # as in the case of 'hello'
+     print "'$r->{'term'}' was found in the dictionary.\n";
+   }
+   elsif ( $r->{'type'} eq 'root' ) {
+     # as in the case of 'hacking'
+     print "'$r->{'term'}' can be formed from root '$r->{'root'}'\n";
+   }
+   elsif ( $r->{'type'} eq 'miss' ) {
+     # as in the case of 'perl'
+     print "'$r->{'term'}' was not found in the dictionary;\n";
+     print "Near misses: @{$r->{'misses'}}\n";
+   }
+   elsif ( $r->{'type'} eq 'guess' ) {
+     # as in the case of 'salmoning'
+     print "'$r->{'term'}' was not found in the dictionary;\n";
+     print "Root/affix Guesses: @{$r->{'guesses'}}\n";
+   }
+   elsif ( $r->{'type'} eq 'compound' ) {
+     # as in the case of 'fruithammer'
+     print "'$r->{'term'}' is a valid compound word.\n";
+   }
+   elsif ( $r->{'type'} eq 'none' ) {
+     # as in the case of 'shrdlu'
+     print "No match for term '$r->{'term'}'\n";
+   }
+   # and numbers are skipped entirely, as in the case of 42.
+ }
+
+
+=head2 ERRORS
+
+C<Lingua::Ispell::spellcheck()> starts the ispell coprocess 
+if the coprocess seems not to exist.  Ordinarily this is simply
+the first time it's called.
+
+ispell is spawned via the C<Open2::open2()> function, which
+throws an exception (i.e. dies) if the spawn fails.  The caller
+should be prepared to catch this exception -- unless, of course,
+the default behavior of die is acceptable.
+
+=head2 Nota Bene
+
+The full location of the ispell executable is stored
+in the variable C<$Lingua::Ispell::path>.  The default
+value is F</usr/local/bin/ispell>.
+If your ispell executable has some name other than
+this, then you must set C<$Lingua::Ispell::path> accordingly
+before you call C<Lingua::Ispell::spellcheck()> (or any other function
+in the module) for the first time!
+
+=cut
+
+
+sub _init {
+  unless ( $Lingua::Ispell::pid ) {
+    my @options;
+    while ( my( $k, $ar ) = each %Lingua::Ispell::options ) {
+      if ( @$ar ) {
+        for ( @$ar ) {
+          #push @options, "$k $_";
+          push @options, $k, $_;
+        }
+      }
+      else {
+        push @options, $k;
+      }
+    }
+
+    $Lingua::Ispell::path ||= '/usr/local/bin/ispell';
+
+    $Lingua::Ispell::pid = undef; # so that it's still undef if open2 fails.
+    $Lingua::Ispell::pid = open2( # if open2 fails, it throws, but doesn't return.
+      *Reader,
+      *Writer,
+      $Lingua::Ispell::path,
+      '-a', '-S',
+      @options,
+    );
+
+    my $hdr = scalar(<Reader>);
+
+    # must be the same as ispell:
+    $Lingua::Ispell::terse = 0;
+    {
+      # set up permanent terse mode:
+      local $/ = "\n";
+      local $\ = '';
+      print Writer "!\n";
+      $Lingua::Ispell::terse = 1;
+    }
+  }
+
+  $Lingua::Ispell::pid
+}
+
+sub _exit {
+  if ( $Lingua::Ispell::pid ) {
+    close Reader;
+    close Writer;
+    kill $Lingua::Ispell::pid;
+    $Lingua::Ispell::pid = undef;
+  }
+}
+
+
+sub spellcheck {
+  _init() or return();  # caller should really catch the exception from a failed open2.
+  my $line = shift;
+  local $/ = "\n"; local $\ = '';
+  chomp $line;
+  $line =~ s/\r//g; # kill the hate
+  $line =~ /\n/ and croak "newlines not allowed in arguments to Lingua::Ispell::spellcheck!";
+  print Writer "^$line\n";
+  my @commentary;
+  local $_;
+  while ( <Reader> ) {
+    chomp;
+    last unless $_ gt '';
+    push @commentary, $_;
+  }
+
+  my %types = (
+    # correct words:
+    '*' => 'ok',
+    '-' => 'compound',
+    '+' => 'root',
+
+    # misspelled words:
+    '#' => 'none',
+    '&' => 'miss',
+    '?' => 'guess',
+  );
+  # and there's one more type, unknown, which is
+  # used when the first char is not in the above set.
+
+  my %modisp = (
+      'root' => sub {
+        my $h = shift;
+        $h->{'root'} = shift;
+      },
+      'none' => sub {
+        my $h = shift;
+        $h->{'original'} = shift;
+        $h->{'offset'} = shift;
+      },
+      'miss' => sub { # also used for 'guess'
+        my $h = shift;
+        $h->{'original'} = shift;
+        $h->{'count'} = shift; # count will always be 0, when $c eq '?'.
+        $h->{'offset'} = shift;
+
+        my @misses  = splice @_, 0, $h->{'count'};
+        my @guesses = @_;
+
+        $h->{'misses'}  = \@misses;
+        $h->{'guesses'} = \@guesses;
+      },
+  );
+  $modisp{'guess'} = $modisp{'miss'}; # same handler.
+
+  my @results;
+  for my $i ( 0 .. $#commentary ) {
+    my %h = (
+      'commentary' => $commentary[$i],
+    );
+
+    my @tail; # will get stuff after a colon, if any.
+
+    if ( $h{'commentary'} =~ s/:\s+(.*)// ) {
+      my $tail = $1;
+      @tail = split /, /, $tail;
+    }
+
+    my( $c, @args ) = split ' ', $h{'commentary'};
+  
+    my $type = $types{$c} || 'unknown';
+
+    $modisp{$type} and $modisp{$type}->( \%h, @args, @tail );
+
+    $h{'type'} = $type;
+    $h{'term'} = $h{'original'};
+
+    push @results, \%h;
+  }
+
+  @results
+}
+
+sub _send_command($$) {
+  my( $cmd, $arg ) = @_;
+  defined $arg or $arg = '';
+  local $/ = "\n"; local $\ = '';
+  chomp $arg;
+  _init();
+  print Writer "$cmd$arg\n";
+}
+
+
+=head1 AUX FUNCTIONS
+
+=head2 add_word(word)
+
+Adds a word to the personal dictionary.  Be careful of capitalization.
+If you want the word to be added "case-insensitively", you should
+call C<add_word_lc()>
+
+=cut
+
+sub add_word($) {
+  _send_command "\*", $_[0];
+}
+
+=head2 add_word_lc(word)
+
+Adds a word to the personal dictionary, in lower-case form. 
+This allows ispell to match it in a case-insensitive manner.
+
+=cut
+
+sub add_word_lc($) {
+  _send_command "\&", $_[0];
+}
+
+=head2 accept_word(word)
+
+Similar to adding a word to the dictionary, in that it causes
+ispell to accept the word as valid, but it does not actually
+add it to the dictionary.  Presumably the effects of this only
+last for the current ispell session, which will mysteriously
+end if any of the coprocess-restarting functions are called...
+
+=cut
+
+sub accept_word($) {
+  _send_command "\@", $_[0];
+}
+
+=head2 parse_according_to(formatter)
+
+Causes ispell to parse subsequent input lines according to
+the specified formatter.  As of ispell v. 3.1.20, only
+'tex' and 'nroff' are supported.
+
+=cut
+
+sub parse_according_to($) {
+  # must be one of 'tex' or 'nroff'
+  _send_command "\-", $_[0];
+}
+
+=head2 set_params_by_language(language) 
+
+Causes ispell to set its internal operational parameters
+according to the given language.  Legal arguments to this
+function, and its effects, are currently unknown by the
+author of Lingua::Ispell.
+
+=cut
+
+sub set_params_by_language($) {
+  _send_command "\~", $_[0];
+}
+
+=head2 save_dictionary() 
+
+Causes ispell to save the current state of the dictionary
+to its disk file.  Presumably ispell would ordinarily
+only do this upon exit.
+
+=cut
+
+sub save_dictionary() {
+  _send_command "\#", '';
+}
+
+=head2 terse_mode(bool:terse)
+
+I<B<NOTE:> This function has been disabled! 
+Lingua::Ispell now always operates in terse mode.>
+
+In terse mode, ispell will not produce reports for "correct" words.
+This means that the calling program will not receive results of the
+types 'ok', 'root', and 'compound'.
+
+=cut
+
+sub terse_mode($) {
+#  my $bool = shift;
+#  my $cmd = $bool ?  "\!" : "\%";
+#  _send_command $cmd, '';
+#  $Lingua::Ispell::terse = $bool;
+}
+
+
+=head1 FUNCTIONS THAT RESTART ISPELL
+
+The following functions cause the current ispell coprocess, if any, to terminate. 
+This means that all the changes to the state of ispell made by the above
+functions will be lost, and their respective values reset to their defaults.
+The only function above whose effect is persistent is C<save_dictionary()>.
+
+Perhaps in the future we will figure out a good way to make this
+state information carry over from one instantiation of the coprocess
+to the next.
+
+=head2 allow_compounds(bool)
+
+When this value is set to True, compound words are
+accepted as legal -- as long as both words are found in the
+dictionary; more than two words are always illegal.
+When this value is set to False, run-together words are
+considered spelling errors.
+
+The default value of this setting is dictionary-dependent,
+so the caller should set it explicitly if it really matters.
+
+=cut
+
+sub allow_compounds {
+  my $bool = shift;
+  _exit();
+  if ( $bool ) {
+    $Lingua::Ispell::options{'-C'} = [];
+    delete $Lingua::Ispell::options{'-B'};
+  }
+  else {
+    $Lingua::Ispell::options{'-B'} = [];
+    delete $Lingua::Ispell::options{'-C'};
+  }
+}
+
+=head2 make_wild_guesses(bool)
+
+This setting controls when ispell makes "wild" guesses.
+
+If False, ispell only makes "sane" guesses, i.e.  possible
+root/affix combinations that match the current dictionary;
+only if it can find none will it make "wild" guesses,
+which don't match the dictionary, and might in fact
+be illegal words.
+
+If True, wild guesses are always made, along with any "sane" guesses. 
+This feature can be useful if the dictionary has a limited word list,
+or a word list with few suffixes. 
+
+The default value of this setting is dictionary-dependent,
+so the caller should set it explicitly if it really matters.
+
+=cut
+
+sub make_wild_guesses {
+  my $bool = shift;
+  _exit();
+  if ( $bool ) {
+    $Lingua::Ispell::options{'-m'} = [];
+    delete $Lingua::Ispell::options{'-P'};
+  }
+  else {
+    $Lingua::Ispell::options{'-P'} = [];
+    delete $Lingua::Ispell::options{'-m'};
+  }
+}
+
+=head2 use_dictionary([dictionary])
+
+Specifies what dictionary to use instead of the
+default.  Dictionary names are actually file
+names, and are searched for according to the
+following rule: if the name does not contain a slash,
+it is looked for in the directory containing the
+default dictionary, typically /usr/local/lib.
+Otherwise, it is used as is: if it does not begin
+with a slash, it is construed from the current
+directory.
+
+If no argument is given, the default dictionary will be used.
+
+=cut
+
+sub use_dictionary {
+  _exit();
+  if ( @_ ) {
+    $Lingua::Ispell::options{'-d'} = [ @_ ];
+  }
+  else {
+    delete $Lingua::Ispell::options{'-d'};
+  }
+}
+
+=head2 use_personal_dictionary([dictionary])
+
+Specifies what personal dictionary to use
+instead of the default.
+
+Dictionary names are actually file names, and are
+searched for according to the following rule:
+if the name begins with a slash, it is used as
+is (i.e. it is an absolute path name). Otherwise,
+it is construed as relative to the user's home
+directory ($HOME).
+
+If no argument is given, the default personal
+dictionary will be used.
+
+=cut
+
+sub use_personal_dictionary {
+  _exit();
+  if ( @_ ) {
+    $Lingua::Ispell::options{'-p'} = [ @_ ];
+  }
+  else {
+    delete $Lingua::Ispell::options{'-p'};
+  }
+}
+
+
+
+1;
+
+
+=head1 FUTURE ENHANCEMENTS
+
+ispell options:
+
+  -w chars
+         Specify additional characters that can be part of a word.
+
+=head1 DEPENDENCIES
+
+Lingua::Ispell uses the external program ispell, which is
+the "International Ispell", available at
+
+  http://fmg-www.cs.ucla.edu/geoff/ispell.html
+
+as well as various archives and mirrors, such as 
+
+  ftp://ftp.math.orst.edu/pub/ispell-3.1/
+
+This is a very popular program, and may already be
+installed on your system.
+
+Lingua::Ispell also uses the standard perl modules FileHandle,
+IPC::Open2, and Carp.
+
+=head1 AUTHOR
+
+jdporter at min.net (John Porter)
+
+=head1 COPYRIGHT
+
+This module is free software; you may redistribute it and/or
+modify it under the same terms as Perl itself.
+
+=cut
+

Added: packages/liblingua-ispell-perl/branches/upstream/current/spellcheck
URL: http://svn.debian.org/wsvn/pkg-perl/packages/liblingua-ispell-perl/branches/upstream/current/spellcheck?rev=4639&op=file
==============================================================================
--- packages/liblingua-ispell-perl/branches/upstream/current/spellcheck (added)
+++ packages/liblingua-ispell-perl/branches/upstream/current/spellcheck Thu Dec 21 05:39:37 2006
@@ -1,0 +1,76 @@
+#!/usr/local/bin/perl -w
+
+$| = 1;
+
+use lib qw( lib );
+use Lingua::Ispell qw( :all );
+use strict;
+
+while ( <> ) {
+  chomp;
+  my $line = $_;
+
+  if ( s/^-C\s*// ) { allow_compounds(1); next; }
+  if ( s/^-m\s*// ) { infer_root_affix_combos(1); next; }
+  if ( s/^-d\s*// ) { use_dictionary(split); next; }
+  if ( s/^-p\s*// ) { use_personal_dictionary(split); next; }
+
+
+  for my $r ( spellcheck( $line ) ) {
+
+    {
+      'ok' =>
+        sub { print "ok: $r->{'term'}\n"; },
+
+      'compound' =>
+        sub { print "ok: $r->{'term'}\n"; },
+
+      'root' =>
+        sub { print "ok: '$r->{'term'}' can be formed from root '$r->{'root'}'\n"; },
+
+      'none' =>
+        sub {
+          my $indent = ' ' x $r->{'offset'};
+          print <<EOF;
+No match found for term "$r->{'term'}" in:
+"$line"
+$indent^
+
+EOF
+        },
+
+      'miss' =>
+        sub {
+          my $indent = ' ' x $r->{'offset'};
+	  local $" = "\n\t";
+          print <<EOF;
+Near miss on term "$r->{'term'}" in:
+"$line"
+$indent^
+missed terms:
+	@{$r->{'misses'}}
+
+EOF
+        },
+
+      'guess' =>
+        sub {
+          my $indent = ' ' x $r->{'offset'};
+	  local $" = "\n\t";
+          print <<EOF;
+Guess on term "$r->{'term'}" in:
+"$line"
+$indent^
+missed terms:
+	@{$r->{'misses'}}
+guesses:
+	@{$r->{'guesses'}}
+
+EOF
+        },
+
+    }->{ $r->{'type'} }->();
+  }
+}
+
+

Propchange: packages/liblingua-ispell-perl/branches/upstream/current/spellcheck
------------------------------------------------------------------------------
    svn:executable = 

Added: packages/liblingua-ispell-perl/branches/upstream/current/test.pl
URL: http://svn.debian.org/wsvn/pkg-perl/packages/liblingua-ispell-perl/branches/upstream/current/test.pl?rev=4639&op=file
==============================================================================
--- packages/liblingua-ispell-perl/branches/upstream/current/test.pl (added)
+++ packages/liblingua-ispell-perl/branches/upstream/current/test.pl Thu Dec 21 05:39:37 2006
@@ -1,0 +1,30 @@
+#!/usr/local/bin/perl -w
+# Before `make install' is performed this script should be runnable with
+# `make test'. After `make install' it should work as `perl test.pl'
+
+######################### We start with some black magic to print on failure.
+
+# Change 1..1 below to 1..last_test_to_print .
+# (It may become useful if the test is moved to ./t subdirectory.)
+
+use lib 'lib';
+BEGIN { $| = 1; print "1..1\n"; }
+END {print "not ok 1\n" unless $loaded;
+#for ( sort keys %INC ) { print "$_: $INC{$_}\n"; }
+}
+use Lingua::Ispell;
+$loaded = 1;
+report(1);
+
+######################### End of black magic.
+
+# Insert your test code below (better if it prints "ok 13"
+# (correspondingly "not ok 13") depending on the success of chunk 13
+# of the test code):
+
+sub report {
+	$TEST_NUM++;
+	print ( $_[0] ? "ok $TEST_NUM\n" : "not ok $TEST_NUM\n" );
+}
+
+

Propchange: packages/liblingua-ispell-perl/branches/upstream/current/test.pl
------------------------------------------------------------------------------
    svn:executable = 




More information about the Pkg-perl-cvs-commits mailing list