[Debian-in-workers] Bug#651768: ttf-indic-fonts: Most configurations starting with "90-" for Indic scripts are faulty

K. Sethu skhome at gmail.com
Mon Dec 12 02:07:02 UTC 2011


Package: ttf-indic-fonts
Version: 1:0.5.11
Severity: important

Introduction
=========
The configuration file 90-ttf-tamil-fonts.conf, and the analogous files for
most (but not all) Indic scripts, introduce bugs into the XML configuration
tree that Fontconfig walks when matching fonts.

The "90-ttf-...conf" configurations in the source of "ttf-indic-fonts"
version 1:0.5.11, which is current in Debian (Stable), cover nine scripts:
Telugu, Tamil, Punjabi, Oriya, Malayalam, Kannada, Gujarati, Devanagari and
Bengali.

Of these nine scripts, seven are single-language scripts (the script and the
language share the same name). Of the remaining two, the Devanagari script is
used by multiple languages (Sindhi, Sanskrit, Kashmiri, Konkani, Nepali,
Maithili, Marathi and Hindi) and the Bengali script is used by two languages
(Assamese and Bengali).

The later version ttf-indic-fonts_1:0.5.12, which is current in Debian
(Testing) and Debian (Unstable), adds to the files for the nine scripts above
a configuration of the same type for the Sourashtra (saz_IN) script, shipped
in the "fonts-pagul" (version 1.0-1) package.

The "ttf-indic-fonts_1:0.5.12" meta package also depends on
"fonts-pagul_1.0-1". So with version 1:0.5.12 of ttf-indic-fonts installed,
there can be a total of 10 "90-....conf" symlinks and corresponding files, one
for each of the 10 Indic scripts, which together are used by 18 languages.

A "90-....conf" configuration file for an Indic script gets installed in Debian
and Debian-based distros either when the individual ttf fonts package for that
script is installed, or when the meta package "ttf-indic-fonts" is installed,
which pulls in the ttf fonts packages for all 10 Indic scripts.

Two types of bugs are presented in this report. However, not all
configurations cause both types.

Both types of bugs are found with seven configurations. The symlinks and
configuration files for those seven are as follows:

/etc/fonts/conf.d/90-ttf-tamil-fonts.conf ->
    /etc/fonts/conf.avail/90-ttf-tamil-fonts.conf
/etc/fonts/conf.d/90-ttf-punjabi-fonts.conf ->
    /etc/fonts/conf.avail/90-ttf-punjabi-fonts.conf
/etc/fonts/conf.d/90-ttf-oriya-fonts.conf ->
    /etc/fonts/conf.avail/90-ttf-oriya-fonts.conf
/etc/fonts/conf.d/90-ttf-gujarati-fonts.conf ->
    /etc/fonts/conf.avail/90-ttf-gujarati-fonts.conf
/etc/fonts/conf.d/90-ttf-devanagari-fonts.conf ->
    /etc/fonts/conf.avail/90-ttf-devanagari-fonts.conf
/etc/fonts/conf.d/90-ttf-bengali-fonts.conf ->
    /etc/fonts/conf.avail/90-ttf-bengali-fonts.conf
/etc/fonts/conf.d/90-fonts-pagul.conf ->
    /etc/fonts/conf.avail/90-fonts-pagul.conf

The following two symlinks and configuration files, for Telugu and Kannada,
have only the first of the two types of bugs.

/etc/fonts/conf.d/90-ttf-telugu-fonts.conf ->
    /etc/fonts/conf.avail/90-ttf-telugu-fonts.conf
/etc/fonts/conf.d/90-ttf-kannada-fonts.conf ->
    /etc/fonts/conf.avail/90-ttf-kannada-fonts.conf

The configuration for the Malayalam script does not cause either of the two
types of bugs.

I have found that the older version 1:0.5.4, which is current in Debian
(oldstable), does not have these priority-90 configuration files for Indic
scripts and so is free of the bugs presented here. Looking at debian/changelog
in the source, it is evident that these configuration files (either some of
them or most of them) were first included in ttf-indic-fonts version 1:0.5.5.

Note the following debian/changelog entry for version 1:0.5.5:

//[Praveen Arimbrathodiyil]
  * Added priotity of 90 to fontconfig configuration files//


Two types of Bugs:
==============
* Font matching for the sans-serif generic: for the nine scripts other than
Malayalam, the "90-...conf" configuration has no effect.

* Font matching for the serif generic: the match that a "90-...conf"
configuration file sets for Serif for one of the seven scripts (those other
than Telugu, Kannada and Malayalam) is in fact imposed on all other languages
as well. Which configuration dominates is determined by priority among the
installed configurations of the seven scripts.

Now more details of the bugs:

Type 1 Bug- Font matching of sans-serif for scripts of all languages except
Malayalam
=====================================================================

Upstream Fontconfig ships with a number of configuration symlinks; for
non-Latin scripts these include "/etc/fonts/conf.d/65-nonlatin.conf ->
/etc/fonts/conf.avail/65-nonlatin.conf".

Taking Tamil as the example, this configuration sets the following matches for
the generic fonts Sans-Serif, Serif and Mono:

Sans-Serif : TSCu_Paranar, Lohit Tamil (in descending order of priority)
Serif      : Lohit Tamil
Mono       : Lohit Tamil
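
For readers unfamiliar with that file, the preference order above corresponds
roughly to alias entries of the following shape. This is an abridged sketch
reconstructed from the matches listed above, not a copy of 65-nonlatin.conf;
the real file lists many more families for other scripts inside each <prefer>
block.

```xml
<!-- Abridged illustration only: Tamil-related entries as described above.
     The actual 65-nonlatin.conf contains many other families. -->
<alias>
  <family>sans-serif</family>
  <prefer>
    <family>TSCu_Paranar</family>  <!-- listed first, so preferred -->
    <family>Lohit Tamil</family>
  </prefer>
</alias>
<alias>
  <family>serif</family>
  <prefer>
    <family>Lohit Tamil</family>
  </prefer>
</alias>
<alias>
  <family>monospace</family>
  <prefer>
    <family>Lohit Tamil</family>
  </prefer>
</alias>
```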

So for Sans-Serif the preference for Lohit Tamil is superseded by
TSCu_Paranar, but only if TSCu_Paranar "regular" or "bold" (or both) is
installed.

Now in the config file /etc/fonts/conf.avail/90-ttf-tamil-fonts.conf, the
purpose of the first <match ....> </match> block  appears to be setting the
match for Sans-Serif to "Lohit Tamil" with an append mode configuration
statement : <edit name="family" mode="append" binding="same">

It is also apparent that the purpose of the final block <rejectfont> ...
</rejectfont>  is to not allow the picking of any of the legacy encoded TSCu
and TAMu fonts in font matching.
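
As an illustration of the mechanism only, a font rejection in Fontconfig
generally takes the form below. This is a sketch, not the content of
90-ttf-tamil-fonts.conf: the single family shown is simply the one discussed
in this report, and the packaged file may list different or additional
patterns.

```xml
<!-- Sketch of a rejection rule; <rejectfont> has to sit inside <selectfont>.
     The family name is taken from this report, the rest is illustrative. -->
<selectfont>
  <rejectfont>
    <pattern>
      <patelt name="family">
        <string>TSCu_Paranar</string>
      </patelt>
    </pattern>
  </rejectfont>
</selectfont>
```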

Despite the above two blocks in 90-ttf-tamil-fonts.conf, when the symlink
/etc/fonts/conf.d/90-ttf-tamil-fonts.conf ->
/etc/fonts/conf.avail/90-ttf-tamil-fonts.conf is present and the TSCu_Paranar
font (regular, bold or both) is also installed, the generic font Sans-Serif
still gets matched to TSCu_Paranar. This can be verified with the following
command:

fc-match Sans:lang=ta

for which the result is:
TSCu_Paranar.ttf: "TSCu_Paranar" "Regular"

So if the purpose of 90-ttf-tamil-fonts.conf is to reject TSCu_Paranar and
match "Lohit Tamil" to Sans-Serif, that purpose is not realized. In fact the
"90-ttf-tamil-fonts.conf" file does not change the Sans-Serif font matching
set up by "65-nonlatin.conf" at all.

A user who wants "Lohit Tamil" rather than TSCu_Paranar matched to Sans-Serif
has the following workarounds available:

a. Remove the TSCu_Paranar regular and bold fonts.

b. Keep the TSCu_Paranar fonts but drop their preferred status by removing the
entry from /etc/fonts/conf.avail/65-nonlatin.conf. This requires superuser
rights.

c. Keep the TSCu_Paranar fonts and leave
/etc/fonts/conf.avail/65-nonlatin.conf unmodified. For this, the user can
include the following <match> ... </match> block in their own .fonts.conf file
(in the user's home directory):

#####################
<match target="font">
  <test compare="contains" name="lang">
    <string>ta</string>
  </test>
  <alias>
    <family>sans-serif</family>
    <prefer>
      <family>Lohit Tamil</family>
    </prefer>
  </alias>
</match>
#####################

(Note for readers who have not made a .fonts.conf before: the three mandatory
header lines and the single tail line of .fonts.conf are shown at
http://www.freedesktop.org/software/fontconfig/fontconfig-user.html under the
section titled "Configuration File Format".)
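
Putting the header and tail lines together, a complete minimal ~/.fonts.conf
might look like the sketch below. Note that it does not reuse the block above;
it expresses the same preference as a <match target="pattern"> rule containing
only <test> and <edit> elements, which is the form used by the packaged
"90-...conf" files. The "prepend" mode and "strong" binding are assumptions
about what should place Lohit Tamil ahead of TSCu_Paranar, and this variant has
not been verified against the setup described in this report.

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <!-- Sketch: for Tamil requests of the generic sans-serif family, put
       Lohit Tamil at the front of the family list.
       Assumption: mode="prepend" with binding="strong" is enough to
       outrank the preference set by 65-nonlatin.conf. -->
  <match target="pattern">
    <test name="lang" compare="contains">
      <string>ta</string>
    </test>
    <test qual="any" name="family">
      <string>sans-serif</string>
    </test>
    <edit name="family" mode="prepend" binding="strong">
      <string>Lohit Tamil</string>
    </edit>
  </match>
</fontconfig>
```

After saving the file, "fc-match Sans:lang=ta" can be run again to check
whether the match has changed.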

Looking at the other scripts: just as in the Tamil example above, for every
language that uses one of the nine scripts other than Malayalam, the font
matching for Sans-Serif does not follow the respective "90-...conf"
configuration.

Type 2 Bug- Font matching of serif for all scripts except Telugu, Malayalam and
Kannada
=====================================================================

This type of bug is seen in matching for Serif when the configurations for some
or all of the seven scripts listed earlier are present with
ttf-indic-fonts_1:0.5.12 (or some or all of the six that remain after excluding
fonts-pagul, with ttf-indic-fonts_1:0.5.11).

The bug is that the match for Serif for one language script also gets imposed
on the match for Serif of every other script and language.

To demonstrate this, start with fonts and configurations for all Indic scripts
being installed. Then find the match for Serif in case of any language using
the following command:

fc-match Serif:lang=ll

where ll denotes the language code: ta for Tamil, en for English, si for
Sinhala, etc.

When all Indic fonts and their 90-ttf-*.conf symlinks (plus 90-fonts-pagul.conf
in the case of version 1:0.5.12) are installed, the match for Serif is "Lohit
Tamil" for all languages, Indic or otherwise, although it is supposed to be the
preferred match for Tamil only.

Further, in this scenario I have not found any way of changing the dominating
match for Serif through the user's .fonts.conf, for Tamil or for any other
language.

As an example of the hindrance this dominance poses for another language
script, consider Sinhala. The command "fc-match Serif:lang=si" under the above
circumstances shows "Lohit Tamil" as the match for Serif even for Sinhala.
Since Lohit Tamil does not have Sinhala glyphs, fontconfig then falls back to
the best available font, which is "LKLUG", set as preferred for each of
Sans-Serif, Serif and Monospace by 65-nonlatin.conf. But if the user wants to
set another font as Serif for Sinhala, this buggy domination from
90-ttf-tamil-fonts.conf prevents the user's override from taking effect.

I found two workarounds that remove the above hindrances arising from
90-ttf-tamil-fonts.conf:

i) Remove the symlink /etc/fonts/conf.d/90-ttf-tamil-fonts.conf ->
/etc/fonts/conf.avail/90-ttf-tamil-fonts.conf

ii) Keep the symlink but edit /etc/fonts/conf.avail/90-ttf-tamil-fonts.conf
and change the binding from "same" to "weak" in the following block, which
sets the match for "serif":

<match target="pattern">
        <test name="lang" compare="contains">
                <string>ta</string>
        </test>
        <test qual="any" name="family">
                <string>serif</string>
        </test>
        <edit name="family" mode="append" binding="same">
                <string>Lohit Tamil</string>
        </edit>
</match>
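
For clarity, with workaround ii) applied the same block would read as follows;
the only change is the value of the binding attribute on the <edit> element.

```xml
<match target="pattern">
        <test name="lang" compare="contains">
                <string>ta</string>
        </test>
        <test qual="any" name="family">
                <string>serif</string>
        </test>
        <!-- binding changed from "same" to "weak" -->
        <edit name="family" mode="append" binding="weak">
                <string>Lohit Tamil</string>
        </edit>
</match>
```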

With either of the above two workarounds, the buggy dominance is taken over by
"Lohit Punjabi", arising from 90-ttf-punjabi-fonts.conf, the next in the list
of seven configurations given earlier, which is in reverse alphabetical order.
If we then remove or edit the Punjabi conf file as well (changing "binding"
from "same" to "weak"), the next conf in the list, "90-ttf-oriya-fonts.conf",
dominates the match for Serif for all languages.

This continues ad nauseam for the rest of the scripts. The net result is that
we must either remove the "90-..conf" symlinks of all the Indic scripts (other
than those of Telugu, Malayalam and Kannada) or keep them and change the
"binding" of the append-mode edits for Serif to "weak". (With the latter
approach, for the conf files of the multilingual scripts, Devanagari and
Bengali, the "binding" must be modified for each of the languages covered by
these scripts.)

It is necessary to review the purposes for which these configurations for Indic
scripts were introduced in the first place and to find an alternative solution
that does not cause the bugs I have presented here. Whatever solution is
reached, it should not prevent users from having their own settings via
.fonts.conf in their home directory.



-- System Information:
Debian Release: 6.0.3
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 2.6.32-5-686 (SMP w/2 CPU cores)
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages ttf-indic-fonts depends on:
ii  ttf-bengali-fonts             1:0.5.11   Free TrueType fonts for the Bengal
ii  ttf-devanagari-fonts          1:0.5.11   Free TrueType fonts for languages 
ii  ttf-gujarati-fonts            1:0.5.11   Free TrueType fonts for the Gujara
ii  ttf-kannada-fonts             1:0.5.11   Free TrueType fonts for the Kannad
ii  ttf-malayalam-fonts           1:0.5.11   Free TrueType fonts for the Malaya
ii  ttf-oriya-fonts               1:0.5.11   Free TrueType fonts for the Oriya 
ii  ttf-punjabi-fonts             1:0.5.11   Free TrueType fonts for the Punjab
ii  ttf-tamil-fonts               1:0.5.11   Free TrueType fonts for the Tamil 
ii  ttf-telugu-fonts              1:0.5.11   Free TrueType fonts for the Telugu

ttf-indic-fonts recommends no packages.

ttf-indic-fonts suggests no packages.

-- no debconf information




