[Evolution] Bug#506373: complement on the subject line & body

Cyrille Chépélov cyrille at chepelov.org
Thu Nov 20 23:31:53 UTC 2008


retitle 506373 Evolution recklessly ignores the charset on text/html
email fragments and causes glib's death by ana-utf8-phylactic shock
thanks

Although the subject line is (correctly) encoded in windows-1252 and
appears to contain the offending string, it does not appear to be the
cause of the trouble.
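For anyone who wants to check a subject line like this by hand, here is a
minimal Python sketch. The encoded-word below is a hypothetical
reconstruction (the real Subject header is not reproduced in this report);
only the charset name and the visible text are taken from the message.

```python
from email.header import decode_header

# Hypothetical RFC 2047 encoded-word; the actual header may be laid out
# differently (for example base64 instead of quoted-printable).
subject = "=?windows-1252?Q?Concert_Paris-Novembre_=28R=E9xx_Vyyyy=E9=29?="

# decode_header() returns (bytes, charset) pairs; the charset is lowercased.
raw, charset = decode_header(subject)[0]
text = raw.decode(charset)

print(charset)
print(text)
```

A subject encoded this way carries its own charset label, which would be
consistent with it not being the cause of the trouble.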

The offending string can be found in the scrap of HTML sent by Google as
the first MIME part of the message body; quoting the relevant bit:


        <div style="width:370px; background:#D2E6D2; border-style:solid;
        border-color:#ccc; border-width:1px 1px 0 1px; padding:15px 15px
        5px 15px;       margin:0 auto"><p
        style="margin:0;color:#0">cyrille at chepelov.org,
        vous êtes invité(e) à participer à</p>
        <h2 style="margin:5px 0; font-size:18px;
        line-height:1.4;color:#0">Concert Paris-Novembre (Réxx
        Vyyyyé)</h2>
        

(Here, gedit automatically converted the text from ISO-8859-15 to UTF-8,
hence none of the diacritics appear mutilated.) Hexdumping the MIME part
does confirm the ISO-8859-15 encoding:

000001c0  20 73 74 79 6c 65 3d 22  6d 61 72 67 69 6e 3a 30  | style="margin:0|
000001d0  3b 63 6f 6c 6f 72 3a 23  30 22 3e 63 79 72 69 6c  |;color:#0">cyril|
000001e0  6c 65 40 63 68 65 70 65  6c 6f 76 2e 6f 72 67 2c  |le@chepelov.org,|
000001f0  0a 76 6f 75 73 20 ea 74  65 73 20 69 6e 76 69 74  |.vous .tes invit|
00000200  e9 28 65 29 20 e0 20 70  61 72 74 69 63 69 70 65  |.(e) . participe|
00000210  72 20 e0 3c 2f 70 3e 0a  3c 68 32 20 73 74 79 6c  |r .</p>.<h2 styl|
00000220  65 3d 22 6d 61 72 67 69  6e 3a 35 70 78 20 30 3b  |e="margin:5px 0;|
00000230  20 66 6f 6e 74 2d 73 69  7a 65 3a 31 38 70 78 3b  | font-size:18px;|
00000240  20 6c 69 6e 65 2d 68 65  69 67 68 74 3a 31 2e 34  | line-height:1.4|
00000250  3b 63 6f 6c 6f 72 3a 23  30 22 3e 43 6f 6e 63 65  |;color:#0">Conce|
00000260  72 74 20 50 61 72 69 73  2d 4e 6f 76 65 6d 62 72  |rt Paris-Novembr|
00000270  65 20 28 52 e9 78 78 20  56 79 79 79 79 e9 29 3c  |e (R.xx Vyyyy.)<|
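
As a quick sanity check on those bytes, here is a minimal Python sketch (it
is not taken from Evolution or glib; the helper name is_valid_utf8 is made
up for illustration). It takes the bytes at offsets 0x1f0 to 0x212 of the
dump, shows that they are not valid UTF-8, and decodes them as a
single-byte charset. Note that 0xea, 0xe9 and 0xe0 map to the same
characters in windows-1252 and ISO-8859-15, so both decodings agree here.

```python
# Bytes copied from offsets 0x1f0-0x212 of the hexdump.
fragment = bytes.fromhex(
    "0a 76 6f 75 73 20 ea 74 65 73 20 69 6e 76 69 74"
    "e9 28 65 29 20 e0 20 70 61 72 74 69 63 69 70 65"
    "72 20 e0"
)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


as_cp1252 = fragment.decode("windows-1252")
as_latin9 = fragment.decode("iso-8859-15")

print(is_valid_utf8(fragment))                     # raw bytes as received
print(as_cp1252 == as_latin9)                      # do both charsets agree?
print(is_valid_utf8(as_cp1252.encode("utf-8")))    # after proper conversion
print(as_cp1252.strip())
```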

Inspecting the raw RFC-2822 message shows that the HTML part does carry
the header Content-Type: text/html; charset=windows-1252.
While I regret that Google did not include redundant metadata within the
text/html part itself, not only was there proper warning that this was
not UTF-8, but the default encoding was also set to ISO-8859-15. Therefore,
what happened is that Evolution failed to convert this fragment into
proper UTF-8 before handing it over to glib (and in any case, it
definitely should have sanitized it so as not to pass an invalid UTF-8
fragment down to the HTML renderer). The blame lies with Evolution for
sure.
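
To make the expected behaviour concrete, here is a rough Python sketch of
the conversion step; it illustrates the idea and is not how Evolution or
glib are implemented. The sample part is a hypothetical stand-in: only the
Content-Type value and the accented bytes come from the real message, the
8bit transfer encoding is an assumption, and part_to_text is a made-up
helper name. It honours the declared charset, falls back to a configured
default, and as a last resort substitutes U+FFFD so the renderer never
sees invalid UTF-8.

```python
from email import message_from_bytes

# Hypothetical stand-in for the offending MIME part (8bit is an assumption).
RAW_PART = (
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/html; charset=windows-1252\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"<p>vous \xeates invit\xe9(e) \xe0 participer \xe0</p>\r\n"
)


def part_to_text(raw: bytes, fallback: str = "iso-8859-15") -> str:
    """Decode a MIME part using its declared charset, then a fallback."""
    part = message_from_bytes(raw)
    payload = part.get_payload(decode=True)   # raw body bytes
    declared = part.get_content_charset()     # e.g. "windows-1252", or None
    for codec in (declared, fallback):
        if not codec:
            continue
        try:
            return payload.decode(codec)
        except (LookupError, UnicodeDecodeError):
            continue                          # unknown codec or bad bytes
    # Last resort: never hand invalid UTF-8 to the renderer.
    return payload.decode("utf-8", errors="replace")


html = part_to_text(RAW_PART)
utf8_bytes = html.encode("utf-8")             # safe to pass downstream
print(html.strip())
```

With an unknown charset label the same function falls through to the
fallback codec instead of passing the raw bytes along.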

I will gladly provide the raw RFC-2822 offending message, but on a
non-disclosure basis.

Thanks in advance.

    -- Cyrille
