Bug#288936: marked as done (vim: outputs parts of utf-8 multibyte characters at end of line)

Debian Bug Tracking System owner@bugs.debian.org
Wed, 16 Mar 2005 12:48:37 -0800


Your message dated Wed, 16 Mar 2005 21:34:00 +0100
with message-id <42389838.6000500@cacholong.nl>
and subject line vim: outputs parts of utf-8 multibyte characters at end of line
has caused the attached Bug report to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere.  Please contact me immediately.)

Debian bug tracking system administrator
(administrator, Debian Bugs database)

--------------------------------------
Received: (at submit) by bugs.debian.org; 6 Jan 2005 13:21:03 +0000
>From brl@pcpool00.mathematik.uni-freiburg.de Thu Jan 06 05:21:03 2005
Return-path: <brl@pcpool00.mathematik.uni-freiburg.de>
Received: from pcpool00.mathematik.uni-freiburg.de [132.230.30.150] 
	by spohr.debian.org with esmtp (Exim 3.35 1 (Debian))
	id 1CmXZU-0002WB-00; Thu, 06 Jan 2005 05:21:00 -0800
Received: from pcpool09.mathematik.uni-freiburg.de ([132.230.30.159])
	by pcpool00.mathematik.uni-freiburg.de with asmtp (Exim 3.35 #1 (Debian))
	id 1CmXZU-0003vK-00
	for <submit@bugs.debian.org>; Thu, 06 Jan 2005 14:21:00 +0100
Received: from brl by pcpool09.mathematik.uni-freiburg.de with local (Exim 3.35 #1 (Debian))
	id 1CmXZU-0008HZ-00
	for <submit@bugs.debian.org>; Thu, 06 Jan 2005 14:21:00 +0100
Date: Thu, 6 Jan 2005 14:21:00 +0100
From: "Bernhard R. Link" <brlink@debian.org>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: vim: outputs parts of utf-8 multibyte characters at end of line
Message-ID: <20050106132100.GA31829@pcpool00.mathematik.uni-freiburg.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.3.28i
X-Reportbug-Version: 3.5
X-Debbugs-CC: zeisberg@informatik.uni-freiburg.de
Sender: "Bernhard R. Link" <brl@pcpool09.mathematik.uni-freiburg.de>
Delivered-To: submit@bugs.debian.org
X-Spam-Checker-Version: SpamAssassin 2.60-bugs.debian.org_2005_01_02 
	(1.212-2003-09-23-exp) on spohr.debian.org
X-Spam-Status: No, hits=-11.0 required=4.0 tests=BAYES_00,HAS_PACKAGE,
	X_DEBBUGS_CC autolearn=ham version=2.60-bugs.debian.org_2005_01_02
X-Spam-Level: 

Package: vim
Version: 1:6.3-046+1
Severity: minor

If the last character before a line wrap is a multi-byte UTF-8 character,
vim prints the first byte of the character again. This produces an
invalid byte sequence that confuses uxterm. (Other terminals seem to
recover from this.)

To reproduce (a UTF-8 locale is needed):
Generate a file with a multi-byte sequence at the end of the physical
line, for example by writing more than a screen line's worth of German
a-umlauts (a-diaeresis, bytes 303 244 in octal):
for n in $(seq 1000) ; do echo -n -e '\303\244' ; done > bla

Then run vim inside a script(1) session (quit vim before typing exit):
script output
vim bla
exit
Afterwards, running od -c on the file "output" shows:
0000660 244 303 244 303 033   [   2   ;   1   H 303 244 303 244 303 244

Note the spurious 303 before the escape sequence \033[2;1H. 
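
For anyone who wants to check a capture mechanically rather than by eye,
here is a minimal sketch of a scanner that counts UTF-8 lead bytes not
followed by enough continuation bytes, which is the pattern of the stray
303 above. The names utf8_expected_tail and count_truncated_utf8 are
made up for this illustration and are not part of vim; the check only
looks at sequence length, not at every rule of UTF-8 validity.

```c
#include <assert.h>
#include <stddef.h>

/* Number of continuation bytes (10xxxxxx) that must follow lead byte b,
 * or -1 if b cannot start a multi-byte UTF-8 sequence. */
static int utf8_expected_tail(unsigned char b)
{
    if (b >= 0xC2 && b <= 0xDF) return 1;   /* 2-byte sequence */
    if (b >= 0xE0 && b <= 0xEF) return 2;   /* 3-byte sequence */
    if (b >= 0xF0 && b <= 0xF4) return 3;   /* 4-byte sequence */
    return -1;
}

/* Count lead bytes that are not followed by the required number of
 * continuation bytes, e.g. a 0xC3 (octal 303) directly before ESC.
 * Hypothetical helper for illustration only. */
size_t count_truncated_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0, bad = 0;

    assert(buf != NULL || len == 0);
    while (i < len) {
        int need = utf8_expected_tail(buf[i]);
        size_t j = i + 1;

        if (need < 0) {     /* ASCII or a byte that cannot lead: skip it */
            i++;
            continue;
        }
        /* Consume as many continuation bytes as the lead byte asks for. */
        while (need > 0 && j < len && (buf[j] & 0xC0) == 0x80) {
            j++;
            need--;
        }
        if (need > 0)
            bad++;          /* sequence was cut short */
        i = j;
    }
    return bad;
}
```

Called on the bytes of the typescript file "output" (read with fread,
for instance), a non-zero result would point at a truncated sequence
like the one in front of \033[2;1H.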

I think this bug is caused by the following code in
vim-6.3/vim63/src/screen.c +4171:

#ifdef FEAT_MBYTE
    /* When there is a multi-byte character, just output a
     * space to keep it simple. */
    if (has_mbyte && mb_off2cells(LineOffset[screen_row - 1]
                                        + (unsigned)Columns - 1) != 1 )
        out_char(' ');
    else
#endif
    out_char(ScreenLines[LineOffset[screen_row - 1]
                                        + (Columns - 1)]);
Changing this to:
    if (has_mbyte /* && mb_off2cells(LineOffset[screen_row - 1]
                                         + (unsigned)Columns - 1) != 1 */)
removes the bug, so I guess that is indeed the cause.
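
To make the difference between the two conditions concrete, here is a
small model of the decision, not vim's actual code. It assumes that the
cell array holds only the lead byte of a multi-byte character (which is
what the od capture suggests, but the excerpt alone does not prove it).
struct cell, last_col_original and last_col_patched are hypothetical
names; has_mbyte and the "!= 1" width test are taken from the excerpt.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for one screen cell (assumption, see above). */
struct cell {
    unsigned char lead;   /* first (or only) byte of the character */
    int width;            /* display cells occupied: 1 or 2 */
};

/* Original condition: a space is substituted only when the character
 * in the last column is not exactly one cell wide. */
unsigned char last_col_original(int has_mbyte, const struct cell *c)
{
    assert(c != NULL);
    if (has_mbyte && c->width != 1)
        return ' ';
    return c->lead;       /* single-width multi-byte: lead byte alone */
}

/* Condition from the report: a space is substituted whenever
 * multi-byte support is active, regardless of width. */
unsigned char last_col_patched(int has_mbyte, const struct cell *c)
{
    assert(c != NULL);
    if (has_mbyte)
        return ' ';
    return c->lead;
}
```

In this model an a-umlaut (lead byte 0xC3, one cell wide) falls through
the original test and comes out as the bare byte 0xC3, while the changed
test emits a space instead. A double-width character gets a space either
way.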

-- System Information:
Debian Release: 3.1
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.7-1-686
Locale: LANG=C, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)

Versions of packages vim depends on:
ii  dpkg                        1.10.25      Package maintenance system for Deb
ii  libc6                       2.3.2.ds1-20 GNU C Library: Shared libraries an
ii  libgpmg1                    1.19.6-19    General Purpose Mouse - shared lib
ii  libncurses5                 5.4-4        Shared libraries for terminal hand
ii  vim-common                  1:6.3-046+1  Vi IMproved - Common files

-- no debconf information

---------------------------------------
Received: (at 288936-done) by bugs.debian.org; 16 Mar 2005 20:34:07 +0000
>From matthijs@cacholong.nl Wed Mar 16 12:34:07 2005
Return-path: <matthijs@cacholong.nl>
Received: from a80-126-52-161.adsl.xs4all.nl (server.cacholong.nl) [80.126.52.161] 
	by spohr.debian.org with esmtp (Exim 3.35 1 (Debian))
	id 1DBfDS-00036R-00; Wed, 16 Mar 2005 12:34:06 -0800
Received: from localhost (localhost [127.0.0.1])
	by server.cacholong.nl (Postfix) with ESMTP id 6C40480CB9D;
	Wed, 16 Mar 2005 21:34:04 +0100 (CET)
Received: from server.cacholong.nl ([127.0.0.1])
	by localhost (server.cacholong.nl [127.0.0.1]) (amavisd-new, port 10024)
	with LMTP id 01301-03; Wed, 16 Mar 2005 21:34:02 +0100 (CET)
Received: from [192.168.20.17] (openbsd.xs4all.nl [80.126.240.96])
	by server.cacholong.nl (Postfix) with ESMTP id 77D7280CB1A;
	Wed, 16 Mar 2005 21:34:02 +0100 (CET)
Message-ID: <42389838.6000500@cacholong.nl>
Date: Wed, 16 Mar 2005 21:34:00 +0100
From: Matthijs Mohlmann <matthijs@cacholong.nl>
User-Agent: Debian Thunderbird 1.0 (X11/20050116)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: 288936@bugs.debian.org
Cc: 288936-done@bugs.debian.org
Subject: Re: vim: outputs parts of utf-8 multibyte characters at end of line
X-Enigmail-Version: 0.90.0.0
X-Enigmail-Supports: pgp-inline, pgp-mime
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature";
 boundary="------------enigC1D391381589D17B6E585128"
X-Virus-Scanned: by amavisd-new-20030616-p10 (Debian) at cacholong.nl
Delivered-To: 288936-done@bugs.debian.org
X-Spam-Checker-Version: SpamAssassin 2.60-bugs.debian.org_2005_01_02 
	(1.212-2003-09-23-exp) on spohr.debian.org
X-Spam-Status: No, hits=-3.0 required=4.0 tests=BAYES_00 autolearn=no 
	version=2.60-bugs.debian.org_2005_01_02
X-Spam-Level: 
X-CrossAssassin-Score: 2

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enigC1D391381589D17B6E585128
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

The current vim in sid isn't affected by this bug; I couldn't reproduce
it.

Matthijs

--------------enigC1D391381589D17B6E585128
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCOJg72n1ROIkXqbARAiRWAJ9J6uKJnm9em/HbJPDIE+DTBS/DfQCfVByY
v0Ou0MijE9bM+WG71T9CATI=
=MVnm
-----END PGP SIGNATURE-----

--------------enigC1D391381589D17B6E585128--