[SCM] WebKit Debian packaging branch, debian/unstable, updated. debian/1.1.15-1-40151-g37bb677

darin darin at 268f45cc-cd09-0410-ab3c-d52691b4dbfc
Sat Sep 26 06:31:05 UTC 2009


The following commit has been merged in the debian/unstable branch:
commit 1583039238bf98ff07f98ec75d56712689247ab2
Author: darin <darin at 268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Date:   Tue Aug 13 23:49:19 2002 +0000

    	Character set handling improvements. In total, this adds 92 new character encoding
    	names to the list we know how to handle (299, up from 207), so it probably makes
    	some pages work that didn't work before. It probably also adds character encoding
    	names that are never used in practice.
    
            * kwq/character-sets.txt: Took out all but one of our changes. We now handle aliases
    	that are not in this file by adding them to mac-encodings.txt.
            * kwq/mac-encodings.txt: Added. Lists CFStringEncoding values and IANA names for them.
    	We keep this file small by using the aliasing in character-sets.txt, and we also continue
    	to get MIB numbers from character-sets.txt.
            * kwq/make-charset-table.pl: Rewrote to read the new-format mac-encodings.txt file and
    	to check for new kinds of errors.
    
            * kwq/.cvsignore: Don't ignore the make-mac-encodings files any more, since we
    	don't compile that any more.
            * kwq/Makefile.am: Remove rules for compiling and running make-mac-encodings.
    	* kwq/make-mac-encodings.c: Removed.
    
    
    git-svn-id: http://svn.webkit.org/repository/webkit/trunk@1809 268f45cc-cd09-0410-ab3c-d52691b4dbfc
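
For orientation, here is a minimal sketch of the kind of table the rewritten
make-charset-table.pl prints into KWQCharsetData.c, and of how a caller might
search it. The rows are a hand-picked excerpt derived from mac-encodings.txt
and character-sets.txt in this patch. The CharsetEntry field names, the
find_charset helper, and the numeric stand-ins for the CoreFoundation
constants are assumptions made so the example compiles on its own; the patch
shows none of them.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins so the sketch builds without CoreFoundation. A real build would
   take the type and constants from the CoreFoundation headers instead. */
typedef unsigned int CFStringEncoding;
enum {
    kCFStringEncodingDOSThai = 0x041D,
    kCFStringEncodingBig5 = 0x0A03
};
#define kCFStringEncodingInvalidId 0xffffffffU

/* Assumed layout: the script emits { name, MIB number, encoding } rows,
   but the struct definition itself is not part of this patch. */
typedef struct {
    const char *name;
    int mib;
    CFStringEncoding encoding;
} CharsetEntry;

/* Excerpt only. Names are lowercase because the script lowercases them.
   All DOSThai aliases share TIS-620's MIB number (2259); an entry with no
   MIB number gets -1. The last row is the sentinel the script prints. */
static const CharsetEntry table[] = {
    { "big5", 2026, kCFStringEncodingBig5 },
    { "cp874", 2259, kCFStringEncodingDOSThai },
    { "japanese-autodetect", -1, 0xAFE },
    { "tis-620", 2259, kCFStringEncodingDOSThai },
    { "windows-874", 2259, kCFStringEncodingDOSThai },
    { NULL, -1, kCFStringEncodingInvalidId }
};

/* Hypothetical lookup: linear scan up to the NULL-name sentinel.
   The caller is expected to pass an already-lowercased name. */
static const CharsetEntry *find_charset(const char *name)
{
    const CharsetEntry *entry;
    for (entry = table; entry->name != NULL; entry++) {
        if (strcmp(entry->name, name) == 0)
            return entry;
    }
    return NULL;
}
```

With this excerpt, find_charset("cp874") and find_charset("tis-620") return
entries with the same encoding, and an unknown name returns NULL.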

diff --git a/WebCore/ChangeLog-2002-12-03 b/WebCore/ChangeLog-2002-12-03
index 4193f54..a2a011b 100644
--- a/WebCore/ChangeLog-2002-12-03
+++ b/WebCore/ChangeLog-2002-12-03
@@ -1,5 +1,25 @@
 2002-08-13  Darin Adler  <darin at apple.com>
 
+	Character set handling improvements. In total, this adds 92 new character encoding
+	names to the list we know how to handle (299, up from 207), so it probably makes
+	some pages work that didn't work before. It probably also adds character encoding
+	names that are never used in practice.
+
+        * kwq/character-sets.txt: Took out all but one of our changes. We now handle aliases
+	that are not in this file by adding them to mac-encodings.txt.
+        * kwq/mac-encodings.txt: Added. Lists CFStringEncoding values and IANA names for them.
+	We keep this file small by using the aliasing in character-sets.txt, and we also continue
+	to get MIB numbers from character-sets.txt.
+        * kwq/make-charset-table.pl: Rewrote to read from new format mac-encodings.txt file, and
+	check for new kinds of errors.
+
+        * kwq/.cvsignore: Don't ignore the make-mac-encodings files any more, since we
+	don't compile that any more.
+        * kwq/Makefile.am: Remove rules for compiling and running make-mac-encodings.
+	* kwq/make-mac-encodings.c: Removed.
+
+2002-08-13  Darin Adler  <darin at apple.com>
+
 	- fixed 3023439 -- support for windows-874 charset for thai
 
         * kwq/character-sets.txt: Added cp874 and windows-874.
diff --git a/WebCore/ChangeLog-2003-10-25 b/WebCore/ChangeLog-2003-10-25
index 4193f54..a2a011b 100644
--- a/WebCore/ChangeLog-2003-10-25
+++ b/WebCore/ChangeLog-2003-10-25
@@ -1,5 +1,25 @@
 2002-08-13  Darin Adler  <darin at apple.com>
 
+	Character set handling improvements. In total, this adds 92 new character encoding
+	names to the list we know how to handle (299, up from 207), so it probably makes
+	some pages work that didn't work before. It probably also adds character encoding
+	names that are never used in practice.
+
+        * kwq/character-sets.txt: Took out all but one of our changes. We now handle aliases
+	that are not in this file by adding them to mac-encodings.txt.
+        * kwq/mac-encodings.txt: Added. Lists CFStringEncoding values and IANA names for them.
+	We keep this file small by using the aliasing in character-sets.txt, and we also continue
+	to get MIB numbers from character-sets.txt.
+        * kwq/make-charset-table.pl: Rewrote to read from new format mac-encodings.txt file, and
+	check for new kinds of errors.
+
+        * kwq/.cvsignore: Don't ignore the make-mac-encodings files any more, since we
+	don't compile that any more.
+        * kwq/Makefile.am: Remove rules for compiling and running make-mac-encodings.
+	* kwq/make-mac-encodings.c: Removed.
+
+2002-08-13  Darin Adler  <darin at apple.com>
+
 	- fixed 3023439 -- support for windows-874 charset for thai
 
         * kwq/character-sets.txt: Added cp874 and windows-874.
diff --git a/WebCore/ChangeLog-2005-08-23 b/WebCore/ChangeLog-2005-08-23
index 4193f54..a2a011b 100644
--- a/WebCore/ChangeLog-2005-08-23
+++ b/WebCore/ChangeLog-2005-08-23
@@ -1,5 +1,25 @@
 2002-08-13  Darin Adler  <darin at apple.com>
 
+	Character set handling improvements. In total, this adds 92 new character encoding
+	names to the list we know how to handle (299, up from 207), so it probably makes
+	some pages work that didn't work before. It probably also adds character encoding
+	names that are never used in practice.
+
+        * kwq/character-sets.txt: Took out all but one of our changes. We now handle aliases
+	that are not in this file by adding them to mac-encodings.txt.
+        * kwq/mac-encodings.txt: Added. Lists CFStringEncoding values and IANA names for them.
+	We keep this file small by using the aliasing in character-sets.txt, and we also continue
+	to get MIB numbers from character-sets.txt.
+        * kwq/make-charset-table.pl: Rewrote to read from new format mac-encodings.txt file, and
+	check for new kinds of errors.
+
+        * kwq/.cvsignore: Don't ignore the make-mac-encodings files any more, since we
+	don't compile that any more.
+        * kwq/Makefile.am: Remove rules for compiling and running make-mac-encodings.
+	* kwq/make-mac-encodings.c: Removed.
+
+2002-08-13  Darin Adler  <darin at apple.com>
+
 	- fixed 3023439 -- support for windows-874 charset for thai
 
         * kwq/character-sets.txt: Added cp874 and windows-874.
diff --git a/WebCore/WebCore.pbproj/project.pbxproj b/WebCore/WebCore.pbproj/project.pbxproj
index 341f31f..af9022c 100644
--- a/WebCore/WebCore.pbproj/project.pbxproj
+++ b/WebCore/WebCore.pbproj/project.pbxproj
@@ -907,6 +907,10 @@
 			children = (
 				F58784D802DE375901EA4122,
 				F58784D902DE375901EA4122,
+				F5BFAAC10309CDF6018635CE,
+				F550D70B02E13281018635CA,
+				F550D70902E13281018635CA,
+				F550D70C02E13281018635CA,
 				F58784CC02DE375901EA4122,
 				F58784CD02DE375901EA4122,
 				F58784D302DE375901EA4122,
@@ -3681,11 +3685,6 @@
 			path = "make-charset-table.pl";
 			refType = 4;
 		};
-		F550D70E02E13281018635CA = {
-			isa = PBXFileReference;
-			path = "make-mac-encodings.c";
-			refType = 4;
-		};
 		F550D71002E132BB018635CA = {
 			fileRef = F550D70A02E13281018635CA;
 			isa = PBXBuildFile;
@@ -5715,10 +5714,6 @@
 				F58785F002DE382001EA4122,
 				F58784C402DE375801EA4122,
 				F58785F102DE382001EA4122,
-				F550D70902E13281018635CA,
-				F550D70B02E13281018635CA,
-				F550D70C02E13281018635CA,
-				F550D70E02E13281018635CA,
 				F58784EB02DE375901EA4122,
 				F58785F202DE382001EA4122,
 				F58785F302DE382001EA4122,
@@ -7536,6 +7531,11 @@
 			settings = {
 			};
 		};
+		F5BFAAC10309CDF6018635CE = {
+			isa = PBXFileReference;
+			path = "character-sets.txt";
+			refType = 4;
+		};
 		F5C2869302846DCD018635CA = {
 			isa = PBXFrameworkReference;
 			name = ApplicationServices.framework;
diff --git a/WebCore/kwq/.cvsignore b/WebCore/kwq/.cvsignore
index 0a40284..5297efc 100644
--- a/WebCore/kwq/.cvsignore
+++ b/WebCore/kwq/.cvsignore
@@ -1,6 +1,4 @@
 Makefile.in
 Makefile
 KWQCharsetData.c
-mac-encodings.txt
-make-mac-encodings
 .deps
diff --git a/WebCore/kwq/Makefile.am b/WebCore/kwq/Makefile.am
index 9a44228..9382809 100644
--- a/WebCore/kwq/Makefile.am
+++ b/WebCore/kwq/Makefile.am
@@ -1,19 +1,5 @@
-NULL =
-
-noinst_PROGRAMS = make-mac-encodings
-make_mac_encodings_SOURCES = make-mac-encodings.c
-make_mac_encodings_LDFLAGS = -framework CoreFoundation
-
-mac-encodings.txt: make-mac-encodings
-	$(<D)/$(<F) $@
-
 KWQCharsetData.c: make-charset-table.pl character-sets.txt mac-encodings.txt
-	perl $^ $@
-
-BUILT_SOURCES = \
-	make-mac-encodings \
-	mac-encodings.txt \
-	KWQCharsetData.c \
-	$(NULL)
+	perl $^ > $@
 
+BUILT_SOURCES = KWQCharsetData.c
 CLEANFILES = $(BUILT_SOURCES)
diff --git a/WebCore/kwq/character-sets.txt b/WebCore/kwq/character-sets.txt
index 7758112..7511042 100644
--- a/WebCore/kwq/character-sets.txt
+++ b/WebCore/kwq/character-sets.txt
@@ -3,10 +3,7 @@
 CHARACTER SETS
 
 (last updated 2001 August 23)
-(Apple Changes: added x-sjis alias for Shift JIS, 2002 July 10) 
 (Apple Changes: added MIBenum: 1004 for ISO-10646-J-1, 2002 July 26) 
-(Apple Changes: added euc-cn alias for GB2312, 2002 August 1) 
-(Apple Changes: added cp874 and windows-874 aliases for TIS-620, 2002 August 13) 
 
 These are the official names for character sets that may be used in
 the Internet and may be referred to in Internet documentation.  These
@@ -1368,7 +1365,6 @@ Source: This charset is an extension of csHalfWidthKatakana by
         This charset can be used for the top-level media type "text".
 Alias: MS_Kanji 
 Alias: csShiftJIS
-Alias: x-sjis
 
 Name: Extended_UNIX_Code_Packed_Format_for_Japanese
 MIBenum: 18
@@ -1582,7 +1578,6 @@ Source: Chinese for People's Republic of China (PRC) mixed one byte,
         See GB 2312-80 
         PCL Symbol Set Id: 18C
 Alias: csGB2312
-Alias: euc-cn
 
 Name: Big5  (preferred MIME name)
 MIBenum: 2026
@@ -1638,8 +1633,6 @@ Alias: None
 Name: TIS-620
 MIBenum: 2259
 Source: Thai Industrial Standards Institute (TISI)	     [Tantsetthi]
-Alias: cp874
-Alias: windows-874
 
 Name: HZ-GB-2312
 MIBenum: 2085
diff --git a/WebCore/kwq/mac-encodings.txt b/WebCore/kwq/mac-encodings.txt
new file mode 100644
index 0000000..09349a2
--- /dev/null
+++ b/WebCore/kwq/mac-encodings.txt
@@ -0,0 +1,140 @@
+MacRoman: macintosh
+WindowsLatin1: windows-1252, x-ansi
+ISOLatin1: iso-8859-1, iso8859-1
+NextStepLatin: x-nextstep
+ASCII: us-ascii, iso-ir-6us
+Unicode: utf-16be, unicodeFFFE, unicode, utf-16
+UTF8: utf-8, unicode-1-1-utf-8, unicode-2-0-utf-8, x-unicode-2-0-utf-8
+NonLossyASCII
+
+MacJapanese: x-mac-japanese
+MacChineseTrad: x-mac-trad-chinese, x-mac-chinesetrad
+MacKorean: x-mac-korean
+MacArabic: x-mac-arabic
+MacHebrew: x-mac-hebrew
+MacGreek: x-mac-greek
+MacCyrillic: x-mac-cyrillic
+MacDevanagari: x-mac-devanagari
+MacGurmukhi: x-mac-gurmukhi
+MacGujarati: x-mac-gujarati
+MacOriya
+MacBengali
+MacTamil
+MacTelugu
+MacKannada
+MacMalayalam
+MacSinhalese
+MacBurmese
+MacKhmer
+MacThai: x-mac-thai
+MacLaotian
+MacGeorgian
+MacArmenian
+MacChineseSimp: x-mac-simp-chinese, x-mac-chinesesimp
+MacTibetan: x-mac-tibetan
+MacMongolian
+MacEthiopic
+MacCentralEurRoman: x-mac-centraleurroman, x-mac-ce
+MacVietnamese
+MacExtArabic
+
+MacSymbol: x-mac-symbol
+MacDingbats: x-mac-dingbats
+MacTurkish: x-mac-turkish
+MacCroatian: x-mac-croatian
+MacIcelandic: x-mac-icelandic
+MacRomanian: x-mac-romanian
+MacCeltic
+MacGaelic
+
+MacFarsi: x-mac-farsi
+
+MacUkrainian: x-mac-ukrainian
+
+MacInuit
+MacVT100: x-mac-vt100
+
+ISOLatin2: iso-8859-2, iso8859-2
+ISOLatin3: iso-8859-3
+ISOLatin4: iso-8859-4
+ISOLatinCyrillic: iso-8859-5
+ISOLatinArabic: iso-8859-6
+ISOLatinGreek: iso-8859-7
+ISOLatinHebrew: iso-8859-8, iso-8859-8-i, iso-8859-8-e, DOS-862, logical, visual
+ISOLatin5: iso-8859-9
+ISOLatin6: iso-8859-10
+ISOLatinThai: iso-8859-11
+ISOLatin7: iso-8859-13
+ISOLatin8: iso-8859-14
+ISOLatin9: iso-8859-15, l9, latin9, csISOLatin9
+
+DOSLatinUS: cp437
+DOSGreek: cp737, ibm737
+DOSBalticRim: cp775, cp500
+DOSLatin1: cp850
+DOSGreek1
+DOSLatin2: cp852
+DOSCyrillic
+DOSTurkish: cp857
+DOSPortuguese
+DOSIcelandic: cp861
+DOSHebrew
+DOSCanadianFrench
+DOSArabic: cp864, dos-720
+DOSNordic
+DOSRussian: cp866
+DOSGreek2: ibm869
+DOSThai: tis-620, cp874, windows-874, dos-874
+DOSJapanese: cp932
+DOSChineseSimplif: cp936
+DOSKorean: cp949
+DOSChineseTrad: cp950
+WindowsLatin2: windows-1250, x-cp1250
+WindowsCyrillic: windows-1251, x-cp1251
+WindowsGreek: windows-1253
+WindowsLatin5: windows-1254
+WindowsHebrew: windows-1255
+WindowsArabic: windows-1256, cp1256
+WindowsBalticRim: windows-1257
+WindowsKoreanJohab: johab
+WindowsVietnamese: windows-1258
+
+JIS_X0201_76: JIS_X0201
+JIS_X0208_83: JIS_X0208-1983
+JIS_X0208_90: JIS_X0208-1990
+JIS_X0212_90: JIS_X0212-1990
+JIS_C6226_78: JIS_C6226-1978
+ShiftJIS_X0213_00
+GB_2312_80: gb_2312-80, csGB231280, gb2312-80, gb231280
+GBK_95: x-gbk
+GB_18030_2000
+KSC_5601_87: KS_C_5601-1987, ks_c_5601_1987, ks_c_5601, ksc5601
+KSC_5601_92_Johab
+CNS_11643_92_P1
+CNS_11643_92_P2
+CNS_11643_92_P3
+
+ISO_2022_JP: iso-2022-jp
+ISO_2022_JP_2: iso-2022-jp-2
+ISO_2022_JP_1: iso-2022-jp-1
+ISO_2022_JP_3: iso-2022-jp-3
+ISO_2022_CN: iso-2022-cn
+ISO_2022_CN_EXT: iso-2022-cn-ext
+ISO_2022_KR: iso-2022-kr
+
+EUC_JP: euc-jp, x-euc, x-euc-jp
+EUC_CN: euc-cn, gb2312, cn-gb, gbk, x-euc-cn
+EUC_TW: euc-tw
+EUC_KR: euc-kr
+
+ShiftJIS: shift_jis, x-sjis, csWindows31J, shift-jis, x-ms-cp932
+KOI8_R: koi8-r, koi, koi8, koi8r
+Big5: big5, cn-big5, x-x-big5
+MacRomanLatin1: x-mac-roman-latin1
+HZ_GB_2312: hz-gb-2312
+Big5_HKSCS_1999: big5-hkscs
+
+EBCDIC_US
+EBCDIC_CP037: cp037
+
+0xAFE: japanese-autodetect
diff --git a/WebCore/kwq/make-charset-table.pl b/WebCore/kwq/make-charset-table.pl
index ef110cd..a7b99b9 100755
--- a/WebCore/kwq/make-charset-table.pl
+++ b/WebCore/kwq/make-charset-table.pl
@@ -5,77 +5,139 @@ use strict;
 
 my $MAC_SUPPORTED_ONLY = 1;
 
-my $canonical_name;
-my $mib_enum;
-my @aliases;
-my %name_to_mac_encoding;
-my %used_mac_encodings;
-
-my $already_wrote_one = 0;
+my %MIBNumberFromCharsetsFile;
+my %aliasesFromCharsetsFile;
+my %namesWritten;
 
 my $invalid_encoding = "kCFStringEncodingInvalidId";
 
-sub emit_prefix
-{
-    print TABLE "static const CharsetEntry table[] = {\n";
-}
+my $output = "";
 
-sub emit_suffix
+my $error = 0;
+
+sub error ($)
 {
-    print TABLE ",\n    {NULL,\n     -1,\n     $invalid_encoding}\n};\n";
+    print STDERR @_, "\n";
+    $error = 1;
 }
 
 sub emit_line
 {
     my ($name, $mibNum, $encodingNum) = @_;
-    print TABLE ",\n" if ($already_wrote_one);
-    print TABLE '    {"' . $name . '",' . "\n";
-    print TABLE "     " . $mibNum . ",\n";
-    print TABLE "     " . $encodingNum . "}";
-    $already_wrote_one = 1;
+ 
+    error "$name shows up twice in output" if $namesWritten{$name};
+    $namesWritten{$name} = 1;
+        
+    $encodingNum = "kCFStringEncoding" . $encodingNum if $encodingNum !~ /^[0-9]/;
+    $mibNum = -1 if !$mibNum;
+    $output .= "    { \"$name\", $mibNum, $encodingNum },\n";
 }
 
-sub emit_output 
+sub process_mac_encodings
 {
-    my ($canonical_name, $mib_enum, @aliases) = @_;
+    my ($filename) = @_;
+    
+    my %seenMacNames;
+    my %seenIANANames;
+    
+    open MAC_ENCODINGS, $filename or die;
     
-    my $mac_string_encoding = $invalid_encoding;
-
-    foreach my $name ($canonical_name, @aliases) {
-	$name = lc $name;
-	if ($name_to_mac_encoding{$name}) {
-	    $mac_string_encoding = $name_to_mac_encoding{$name};
-	    $used_mac_encodings{$name} = $name;
-	}
-    }
-
-    unless ($MAC_SUPPORTED_ONLY && $mac_string_encoding eq $invalid_encoding) {
-	foreach my $name ($canonical_name, @aliases) {
-	    emit_line($name, $mib_enum, $mac_string_encoding);
-        }
-    }
-}
-
-
-sub process_mac_encodings {
     while (<MAC_ENCODINGS>) {
-	chomp;
-	if (my ($id, $name) = /([0-9]*):(.*)/) {
-	    $name_to_mac_encoding{lc $name} = $id;
-	}
+        chomp;
+	if (my ($MacName, $IANANames) = /(.*): (.*)/) {
+            my %aliases;
+            
+            error "CFString encoding name $MacName is mentioned twice in mac-encodings.txt" if $seenMacNames{$MacName};
+            $seenMacNames{$MacName} = 1;
+
+            # Build the aliases list.
+            # Also check that no two names are part of the same entry in the charsets file.
+	    my @IANANames = sort split ", ", lc $IANANames;
+            for my $name (@IANANames) {
+                if ($name !~ /^[-a-z0-9_]+$/) {
+                    error "$name, in mac-encodings.txt, has illegal characters in it";
+                    next;
+                }
+                
+                error "$name is mentioned twice in mac-encodings.txt" if $seenIANANames{$name};
+                $seenIANANames{$name} = 1;
+                
+                $aliases{$name} = 1;
+                next if !$aliasesFromCharsetsFile{$name};
+                for my $alias (@{$aliasesFromCharsetsFile{$name}}) {
+                    $aliases{$alias} = 1;
+                }
+                for my $otherName (@IANANames) {
+                    next if $name eq $otherName;
+                    if ($aliasesFromCharsetsFile{$otherName}
+                        && $aliasesFromCharsetsFile{$name} eq $aliasesFromCharsetsFile{$otherName}
+                        && $name le $otherName) {
+                        error "mac-encodings.txt lists both $name and $otherName under $MacName, but that aliasing is already specified in character-sets.txt";
+                    }
+                }
+            }
+            
+            # write out
+            my $MIBNumber;
+            my @aliases = sort keys %aliases;
+            for my $alias (@aliases) {
+                $MIBNumber = $MIBNumberFromCharsetsFile{$alias} if $MIBNumberFromCharsetsFile{$alias};
+            }
+            
+            for my $alias (@aliases) {
+                emit_line($alias, $MIBNumber, $MacName);
+            }
+	} elsif (/./) {
+            my $MacName = $_;
+            
+            error "CFString encoding name $MacName is mentioned twice in mac-encodings.txt" if $seenMacNames{$MacName};
+            $seenMacNames{$MacName} = 1;
+        }
     }
     
     # Hack, treat -E and -I same as non-suffix case.
     # Not sure if this does the right thing or not.
-    $name_to_mac_encoding{"iso-8859-8-e"} = $name_to_mac_encoding{"iso-8859-8"};
-    $name_to_mac_encoding{"iso-8859-8-i"} = $name_to_mac_encoding{"iso-8859-8"};
+    #$name_to_mac_encoding{"iso-8859-8-e"} = $name_to_mac_encoding{"iso-8859-8"};
+    #$name_to_mac_encoding{"iso-8859-8-i"} = $name_to_mac_encoding{"iso-8859-8"};
+    
+    close MAC_ENCODINGS;
 }
 
-sub process_iana_charsets {
+sub process_iana_charset 
+{
+    my ($canonical_name, $mib_enum, @aliases) = @_;
+    
+    return if !$canonical_name;
+    
+    my @names = sort $canonical_name, @aliases;
+    
+    for my $name (@names) {
+        $MIBNumberFromCharsetsFile{$name} = $mib_enum if $mib_enum;
+        $aliasesFromCharsetsFile{$name} = \@names;
+    }
+}
+
+sub process_iana_charsets
+{
+    my ($filename) = @_;
+    
+    open CHARSETS, $filename or die;
+    
+    my %seen;
+    
+    my $canonical_name;
+    my $mib_enum;
+    my @aliases;
+    
     while (<CHARSETS>) {
-	chomp;
+        chomp;
 	if ((my $new_canonical_name) = /Name: ([^ \t]*).*/) {
-	    emit_output $canonical_name, $mib_enum, @aliases if ($canonical_name);
+            $new_canonical_name = lc $new_canonical_name;
+            
+            error "saw $new_canonical_name twice in character-sets.txt", if $seen{$new_canonical_name};
+            $seen{$new_canonical_name} = 1;
+            
+	    process_iana_charset $canonical_name, $mib_enum, @aliases;
 	    
 	    $canonical_name = $new_canonical_name;
 	    $mib_enum = "";
@@ -83,30 +145,29 @@ sub process_iana_charsets {
 	} elsif ((my $new_mib_enum) = /MIBenum: ([^ \t]*).*/) {
 	    $mib_enum = $new_mib_enum;
 	} elsif ((my $new_alias) = /Alias: ([^ \t]*).*/) {
-	    push @aliases, $new_alias unless ($new_alias eq "None");
-	}
-    }
-}
-
-sub emit_unused_mac_encodings {
-    foreach my $name (keys %name_to_mac_encoding) {
-	if (! $used_mac_encodings{$name}) {
-	    emit_line($name, -1, $name_to_mac_encoding{$name});
+            next if $new_alias eq "None";
+            
+            $new_alias = lc $new_alias;
+            
+            error "saw $new_alias twice in character-sets.txt", if $seen{$new_alias};
+            $seen{$new_alias} = 1;
+            
+            push @aliases, $new_alias;
 	}
     }
+    
+    process_iana_charset $canonical_name, $mib_enum, @aliases;
+    
+    close CHARSETS;
 }
 
 # Program body
 
-open CHARSETS, "<" . $ARGV[0];
-open MAC_ENCODINGS, "<" . $ARGV[1];
-open TABLE, ">" . $ARGV[2];
+process_iana_charsets($ARGV[0]);
+process_mac_encodings($ARGV[1]);
 
-emit_prefix;
-process_mac_encodings;
-process_iana_charsets;
-emit_unused_mac_encodings;
-emit_line("japanese-autodetect", -1, "0xAFE"); # hard-code japanese autodetect
-emit_suffix;
+exit 1 if $error;
 
-close TABLE;
+print "static const CharsetEntry table[] = {\n";
+print $output;
+print "    { NULL, -1, $invalid_encoding }\n};\n";
diff --git a/WebCore/kwq/make-mac-encodings.c b/WebCore/kwq/make-mac-encodings.c
deleted file mode 100644
index 1c421f0..0000000
--- a/WebCore/kwq/make-mac-encodings.c
+++ /dev/null
@@ -1,76 +0,0 @@
-/*
- * Copyright (C) 2001, 2002 Apple Computer, Inc.  All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- * 1. Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in the
- *    documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY APPLE COMPUTER, INC. ``AS IS'' AND ANY
- * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
- * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE COMPUTER, INC. OR
- * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
- * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
- * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
- * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
- * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
- */
-
-#include <CoreFoundation/CoreFoundation.h>
-
-static void
-usage(const char *program)
-{
-  printf("Usage: %s OUTFILE\n", program);
-  exit(1);
-}
-
-int
-main (int argc, char **argv)
-{
-  const CFStringEncoding *all_encodings;
-  const CFStringEncoding *p;
-  CFStringRef name;
-  char cname[2048];
-  FILE *output;
-
-  if (argc != 2) {
-    usage(argv[0]);
-  }
-  
-  output = fopen (argv[1], "w");
-
-  if (output == NULL) {
-    printf("Cannot open file \"%s\"\n", argv[1]);
-    exit(1);
-  }
-
-  all_encodings = CFStringGetListOfAvailableEncodings();
-
-  for (p = all_encodings; *p != kCFStringEncodingInvalidId; p++) {
-    name = CFStringConvertEncodingToIANACharSetName(*p);
-    /* All IANA encoding names must be US-ASCII */
-    if (name != NULL) {
-      CFStringGetCString(name, cname, 2048, kCFStringEncodingASCII);
-      fprintf(output, "%ld:%s\n", *p, cname);
-    } else {
-      switch (*p) {
-        case 41:
-        case kCFStringEncodingShiftJIS_X0213_00:
-        case kCFStringEncodingGB_18030_2000:
-        case 0xBFF:
-          break;
-        default:
-          printf("Warning: encoding %ld does not have an IANA chararacter set name\n", *p);
-      }
-    }
-  }
-  return 0;
-}

-- 
WebKit Debian packaging


