NetBSD Problem Report #45221
From dholland@macaran.localdomain Sun Aug 7 05:38:40 2011
Return-Path: <dholland@macaran.localdomain>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id 6EB9E63CB22
for <gnats-bugs@gnats.NetBSD.org>; Sun, 7 Aug 2011 05:38:40 +0000 (UTC)
Message-Id: <20110807042127.5AF566E1D8@macaran.localdomain>
Date: Sun, 7 Aug 2011 00:21:27 -0400 (EDT)
From: dholland@eecs.harvard.edu
Reply-To: dholland@NetBSD.org
To: gnats-bugs@gnats.NetBSD.org
Subject: xterm utf-8 mode is (partially) a one-way trip
X-Send-Pr-Version: 3.95
>Number: 45221
>Category: xsrc
>Synopsis: xterm utf-8 mode is (partially) a one-way trip
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: xsrc-manager
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Aug 07 05:40:00 +0000 2011
>Last-Modified: Tue Jan 31 22:34:31 +0000 2023
>Originator: David A. Holland
>Release: NetBSD 5.99.49 (pkgsrc from 20110730)
>Organization:
>Environment:
System: NetBSD macaran 5.99.49 NetBSD 5.99.49 (MACARAN) #8: Mon Apr 11 19:54:18 EDT 2011 dholland@macaran:/usr/src/sys/arch/amd64/compile/MACARAN amd64
Architecture: x86_64
Machine: amd64
>Description:
xterm-259 from pkgsrc X has the following problem:
If you
- start it in non-utf-8 mode
- switch to utf-8 mode with the ctrl-rightmouse menu
- do some stuff
- switch back to normal mode with the ctrl-rightmouse menu
then only output handling and not input handling switches back away
from utf-8. That is, if you print characters 128-255 from programs,
they're displayed as the matching iso-latin-1 glyphs; but if you paste
characters 128-255 into the xterm they are converted to utf-8 on the
way in and programs read them as multibyte sequences. This conversion
is apparently enabled when utf-8 mode is switched on, but not disabled
again when it's switched off.
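To make the byte-level difference concrete, here is a minimal Python sketch (an illustration added to this report, not taken from xterm) showing that every character in the range 128-255 is a single byte in iso-latin-1 but a two-byte sequence in utf-8. The helper name `encodings_for` is hypothetical.

```python
def encodings_for(codepoint):
    """Return (iso-latin-1 bytes, utf-8 bytes) for a code point in 128-255."""
    if not 128 <= codepoint <= 255:
        raise ValueError("only code points 128-255 are of interest here")
    ch = chr(codepoint)
    return ch.encode("latin-1"), ch.encode("utf-8")


if __name__ == "__main__":
    # Print both encodings for a few sample characters.
    for cp in (0xD8, 0xE9, 0xFF):
        latin1, utf8 = encodings_for(cp)
        print(f"U+{cp:04X}: latin-1={latin1.hex()} utf-8={utf8.hex()}")
```

A program that expects iso-latin-1 input and receives the two-byte form will treat it as two separate characters, which is the symptom described above.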
>How-To-Repeat:
Put a character in the 128-255 range into your cut buffer (the
transcripts below use the one encoded as 0xd8 in iso-latin-1), open a
fresh xterm, and paste into cat as follows:
% cat | hexdump -C
<correct glyph>
00000000 d8 0a |..|
00000002
Switch on utf-8 mode and paste again:
% cat | hexdump -C
<correct glyph>
00000000 c3 98 0a |...|
00000003
Switch off utf-8 mode and paste again:
% cat | hexdump -C
<wrong glyph>
00000000 c3 98 0a |...|
00000003
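The three hexdumps are consistent with the input-side conversion being enabled together with utf-8 mode and never disabled again. The following toy model illustrates that hypothesis only. It is not xterm's code, and the names `ToyTerminal`, `output_utf8`, `input_utf8` and `set_utf8_mode` are invented for this sketch.

```python
class ToyTerminal:
    """Toy model of the behavior described in this report (hypothetical)."""

    def __init__(self):
        self.output_utf8 = False  # how bytes written by programs are displayed
        self.input_utf8 = False   # how pasted text is sent to programs

    def set_utf8_mode(self, enabled):
        self.output_utf8 = enabled
        if enabled:
            self.input_utf8 = True
        # Modeled bug: nothing resets input_utf8 when enabled is False.

    def paste(self, text):
        """Return the bytes a program would read when text is pasted."""
        return text.encode("utf-8" if self.input_utf8 else "latin-1")


if __name__ == "__main__":
    term = ToyTerminal()
    pasted = "\u00d8\n"  # the character that is 0xd8 in iso-latin-1, plus newline
    for label, mode in (("fresh", None), ("utf-8 on", True), ("utf-8 off", False)):
        if mode is not None:
            term.set_utf8_mode(mode)
        data = term.paste(pasted)
        print(label, data.hex())
```

In this model the third paste still yields the utf-8 bytes even though `output_utf8` is false again, matching the transcript above.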
>Fix:
dunno, probably obvious if you know where to look, which I don't.
>Release-Note:
>Audit-Trail:
From: David Holland <dholland-pbugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: pkg/45221: xterm utf-8 mode is (partially) a one-way trip
Date: Sun, 18 Jul 2021 10:02:34 +0000
On Sun, Aug 07, 2011 at 05:40:00AM +0000, dholland@eecs.harvard.edu wrote:
> If you
> - start it in non-utf-8 mode
> - switch to utf-8 mode with the ctrl-rightmouse menu
> - do some stuff
> - switch back to normal mode with the ctrl-rightmouse menu
>
> then only output handling and not input handling switches back away
> from utf-8.
This still happens 10 years later with the xterm in base X, not sure
if the one in pkgsrc is different but I doubt it.
--
David A. Holland
dholland@netbsd.org
From: David Holland <dholland-pbugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: pkg/45221: xterm utf-8 mode is (partially) a one-way trip
Date: Tue, 31 Jan 2023 22:26:13 +0000
On Sun, Jul 18, 2021 at 10:05:01AM +0000, David Holland wrote:
> This still happens 10 years later with the xterm in base X, not sure
> if the one in pkgsrc is different but I doubt it.
And it still happens, though the behavior might have changed a little.
Open three xterms, starting them in non-utf-8 mode.
In xterm 1, run
% echo foo | awk '{ printf "%c\n", 216 }'
In xterm 2, switch to utf-8 mode with the right mouse menu
("UTF-8 Encoding") and run
% echo foo | awk '{ printf "%c%c\n", 195, 152 }'
These should print the same glyph.
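As a cross-check of the claim that the two commands print the same glyph, this short Python sketch (an added illustration, not part of the original test) decodes byte 216 as iso-latin-1 and bytes 195, 152 as utf-8 and compares the results.

```python
import unicodedata

# Byte written in xterm 1, interpreted as iso-latin-1.
latin1_glyph = bytes([216]).decode("latin-1")
# Bytes written in xterm 2, interpreted as utf-8.
utf8_glyph = bytes([195, 152]).decode("utf-8")

same_glyph = latin1_glyph == utf8_glyph
glyph_name = unicodedata.name(latin1_glyph)

if __name__ == "__main__":
    print(same_glyph, f"U+{ord(latin1_glyph):04X}", glyph_name)
```

Both decode to U+00D8, so any difference in what gets pasted later comes from the receiving xterm, not from the source text.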
Then in xterm 3:
% cat | hexdump -C
- select the glyph from xterm 1 (non-utf-8), paste
- it'll echo the correct glyph
- and you'll get "d8 0a" (the iso-latin-1 for the glyph
  and a newline)
- hit ^D
% cat | hexdump -C
- select the glyph from xterm 2 (utf-8), paste
- it'll echo the correct glyph
- and you'll get "d8 0a" (the iso-latin-1 for the glyph
  and a newline)
- hit ^D
- now switch this xterm to utf-8 mode with the right mouse menu
% cat | hexdump -C
- select the glyph from xterm 1 (non-utf-8), paste
- it'll echo the correct glyph
- and you'll get "c3 98 0a" (the utf-8 for the glyph
  and a newline)
- hit ^D
% cat | hexdump -C
- select the glyph from xterm 2 (utf-8), paste
- it'll echo the correct glyph
- and you'll get "c3 98 0a" (the utf-8 for the glyph
  and a newline)
- hit ^D
- now switch this xterm back out of utf-8 mode
% cat | hexdump -C
- select the glyph from xterm 1 (non-utf-8), paste
- it'll echo some other glyph
- and you'll get "c3 0a" (the wrong iso-latin-1
  and a newline)
- hit ^D
% cat | hexdump -C
- select the glyph from xterm 2 (utf-8), paste
- it'll echo the correct glyph
- and you'll get "c3 0a" (the wrong iso-latin-1
  and a newline)
- hit ^D
- if you switch back to utf-8 mode it'll paste correctly again
Additional weirdness happens if you try to paste from the same xterm,
which is possibly a different bug.
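The six paste results above can be summarized by comparing the reported bytes with what the current mode of xterm 3 would predict. In the sketch below, the `observed` table is transcribed from the steps above. The `expected_bytes` helper, and the premise that a paste should be encoded according to the current mode, are assumptions of this illustration.

```python
GLYPH = "\u00d8"  # the character that byte 216 (0xd8) denotes in iso-latin-1


def expected_bytes(utf8_mode):
    """Bytes cat should read, assuming the paste follows the current mode."""
    return (GLYPH + "\n").encode("utf-8" if utf8_mode else "latin-1")


# (state of xterm 3, utf-8 mode on?, source xterm, bytes reported above)
observed = [
    ("initial", False, 1, "d8 0a"),
    ("initial", False, 2, "d8 0a"),
    ("utf-8 on", True, 1, "c3 98 0a"),
    ("utf-8 on", True, 2, "c3 98 0a"),
    ("utf-8 off", False, 1, "c3 0a"),
    ("utf-8 off", False, 2, "c3 0a"),
]


def mismatches(rows):
    """Return the rows whose reported bytes differ from the expectation."""
    return [r for r in rows if bytes.fromhex(r[3]) != expected_bytes(r[1])]


if __name__ == "__main__":
    for state, utf8_mode, source, reported in mismatches(observed):
        want = expected_bytes(utf8_mode).hex()
        print(f"{state}: paste from xterm {source}: got {reported}, expected {want}")
```

Only the two pastes made after switching back out of utf-8 mode differ from the expectation, and the reported "c3 0a" is neither the iso-latin-1 encoding nor the complete utf-8 encoding of the glyph.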
% xterm -version
XTerm(370)
I'm going to change the PR from pkgsrc to xsrc since it is happening
there and we possibly care more that way.
--
David A. Holland
dholland@netbsd.org
Responsible-Changed-From-To: pkg-manager->xsrc-manager
Responsible-Changed-By: dholland@NetBSD.org
Responsible-Changed-When: Tue, 31 Jan 2023 22:34:31 +0000
Responsible-Changed-Why:
problem (also?) affects xsrc
>Unformatted: