UTF-8 and ispell

G. Milde milde at users.sourceforge.net
Thu Sep 20 09:31:33 UTC 2007


On 19.09.07, Rafael Laboissiere wrote:

> I do not know what I was doing before, but it is working now.

By default, jed loads the broken jed-ispell-dicts.sl file. See ispell_init::

  % This will set up the dictionaries on your system, if you are a Debian Unstable user.
  custom_variable("Ispell_Cache_File", "/var/cache/dictionaries-common/jed-ispell-dicts.sl");
  if (1 == file_status (Ispell_Cache_File))
    () = evalfile (Ispell_Cache_File);

To use a different file, set Ispell_Cache_File in jed.rc to the path of your
alternative.
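
A minimal sketch of such a jed.rc entry. The path is a hypothetical
placeholder, and this assumes jed.rc is read before ispell_init is evaluated,
so that custom_variable() finds the variable already defined and keeps its
value::

  % jed.rc: point jed at an alternative dictionary setup file.
  % (hypothetical path, replace it with your own file)
  variable Ispell_Cache_File = "/path/to/my-ispell-dicts.sl";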


> Well, the
> file evals ok but spelling does not work for buffers in UTF-8 encoding. Do
> you think ispell.sl could be adapted for that?

I suppose that dictionaries-common needs to be refined:

Example: the iogerman package declares the following string types (encodings)
in odeutsch.aff::

  altstringtype "tex" "TeX" ".tex" ".bib"
  altstringtype "plaintex" "TeX" ".tex"
  altstringtype "latin1" "TeX" ".latin1" ".txt" ".tex" ".bib"
  altstringtype "utf8" "TeX" ".txt"
  altstringtype "ascii" "nroff" ".ascii" ".txt"
  altstringtype "pc" "TeX" ".pc" ".txt" ".tex"
  altstringtype "HTML" "TeX" ".html" ".htm" ".sgml" ".xml"

i.e. it is prepared for UTF-8 input, provided the right argument is given to
ispell.

From these, dictionaries-common extracts the following into
ispell-dicts-list.txt::

  ...
  deutsch (Old German -tex mode-)
  deutsch (Old German 8 bit)
  ...
  
i.e. only two of the seven encodings provided by the German dictionaries.

It should look for utf8 in the aff files and add a line like::

  deutsch (Old German UTF-8)

to ispell-dicts-list.txt for every dictionary providing 'altstringtype "utf8"'.
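
As a rough illustration of the check only (dictionaries-common need not do it
in S-Lang, and the function name aff_has_utf8 is hypothetical), testing a
single affix file for the utf8 string type might look like::

  % Return 1 if the affix file declares an 'altstringtype "utf8"', else 0.
  define aff_has_utf8 (file)
  {
     variable fp = fopen (file, "r");
     if (fp == NULL)
       return 0;                  % unreadable file: treat as "no utf8"

     variable line, found = 0;
     while (-1 != fgets (&line, fp))
       {
          % match lines starting with: altstringtype "utf8"
          if (string_match (line, "^altstringtype[ \t]+\"utf8\"", 1))
            {
               found = 1;
               break;
            }
       }
     () = fclose (fp);
     return found;
  }

  % Usage (the path is an assumption, adjust it to where the aff file lives):
  %   if (aff_has_utf8 ("/path/to/odeutsch.aff")) ...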

jed-ispell-dicts.sl should then contain something like ::

  ispell_add_dictionary (
    "german-old-tex",
    "ogerman",
    "\"",
    "[']",
    "~tex",
    "-C -d ogerman");
  
  % _slang_utf8_ok is non-zero when S-Lang runs in UTF-8 mode
  if (_slang_utf8_ok) {
    ispell_add_dictionary (
      "german-old-utf8",
      "ogerman",
      "ÄÖÜäößü",
      "[']",
      "~utf8",
      "-C -d ogerman");
  } else {
    ispell_add_dictionary (
      "german-old8",
      "ogerman",
      "ÄÖÜäößü",
      "[']",
      "~latin1",
      "-C -d ogerman");
  }
  
so that the correct argument is passed to ispell.

This now works in both UTF-8 and Latin-1 enabled jed.

(I have not checked how this could be implemented or how it fits into the
dictionaries-common policy.)


Günter




More information about the Pkg-jed-devel mailing list