NetBSD Problem Report #38511
From neil@daikokuya.co.uk Sat Apr 26 01:20:29 2008
Return-Path: <neil@daikokuya.co.uk>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by narn.NetBSD.org (Postfix) with ESMTP id 807B263B8BC
for <gnats-bugs@gnats.NetBSD.org>; Sat, 26 Apr 2008 01:20:29 +0000 (UTC)
Message-Id: <20080426012028.12110.qmail@daikokuya.co.uk>
Date: 26 Apr 2008 01:20:28 -0000
From: neil@daikokuya.co.uk
Reply-To: neil@daikokuya.co.uk
To: gnats-bugs@gnats.NetBSD.org
Subject: Invalid error recovery in rxvt-unicode
X-Send-Pr-Version: 3.95
>Number: 38511
>Category: pkg
>Synopsis: Invalid error recovery in rxvt-unicode
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: pkg-manager
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Apr 26 01:25:00 +0000 2008
>Closed-Date: Sun May 01 18:16:43 +0000 2016
>Last-Modified: Sun May 01 18:16:43 +0000 2016
>Originator: neil@daikokuya.co.uk
>Release: NetBSD 4.99.55
>Organization:
>Environment:
System: NetBSD duron.akihabara.co.uk 4.99.55 NetBSD 4.99.55 (GENERIC.NB) #0: Sun Mar 2 10:43:52 JST 2008 root@duron.akihabara.co.uk:/usr/obj/usr/src/sys/arch/i386/compile/GENERIC.NB i386
Architecture: i386
Machine: i386
>Description:
rxvt-unicode does not reset the conversion state after a failed
multibyte conversion. The C standard leaves the conversion state
unspecified after an encoding error, so an explicit call to reset the
state is required. This matters particularly on NetBSD - our C library
leaves the conversion state such that subsequent conversions do not work.
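To illustrate the decoding side, here is a minimal sketch of a loop that
resets the state after an encoding error. It is not rxvt-unicode code:
decode_lossy and its parameters are hypothetical names chosen for this
example. It resets with memset, relying on the C standard's guarantee that
a zero-valued mbstate_t describes an initial conversion state; the patch
in this report makes an mbrtowc call for the same purpose.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not rxvt-unicode code: decode up to n bytes from
   src into dst (capacity cap wide characters). Bytes that fail to decode
   are passed through one at a time, like the latin1 fallback in
   next_char (). Returns the number of wide characters written. */
size_t
decode_lossy (const char *src, size_t n, wchar_t *dst, size_t cap,
              mbstate_t *st)
{
  size_t out = 0;

  while (n > 0 && out < cap)
    {
      wchar_t wc;
      size_t len = mbrtowc (&wc, src, n, st);

      if (len == (size_t)-2)
        break;  /* incomplete sequence; the bytes are held in *st */

      if (len == (size_t)-1)
        {
          /* The state is unspecified after an encoding error, so put it
             back into the initial conversion state before continuing. */
          memset (st, 0, sizeof *st);
          wc = (unsigned char)*src;
          len = 1;
        }
      else if (len == 0)
        len = 1;  /* decoded a null byte; assumes it is one byte long */

      dst[out++] = wc;
      src += len;
      n -= len;
    }

  return out;
}
```

The caller owns the mbstate_t and must zero it before the first call, so
that an incomplete sequence at the end of one buffer can be completed by
the next.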
>How-To-Repeat:
LC_CTYPE=ja_JP.UTF-8 urxvt
<cat a UTF-8 file; all is cool>
<cat an EUC-JP file; it comes out garbage>
<cat the same UTF-8 file again; all is garbage>
>Fix:
Apply the following patches; I'm passing them upstream:
$NetBSD$
--- src/command.C.orig 2008-04-26 10:10:05.000000000 +0900
+++ src/command.C
@@ -2380,13 +2380,19 @@ rxvt_term::next_char () NOTHROW
if (len == (size_t)-2)
{
+ // Reset to initial conversion state from undefined
+ mbrtowc (0, 0, 0, mbstate);
// the mbstate stores incomplete sequences. didn't know this :/
cmdbuf_ptr = cmdbuf_endp;
break;
}
if (len == (size_t)-1)
+ {
+ // Reset to initial conversion state from undefined
+ mbrtowc (0, 0, 0, mbstate);
return (unsigned char)*cmdbuf_ptr++; // the _occasional_ latin1 character is allowed to slip through
+ }
// assume wchar == unicode
cmdbuf_ptr += len;
$NetBSD$
--- src/misc.C.orig 2008-04-26 10:10:56.000000000 +0900
+++ src/misc.C
@@ -40,7 +40,11 @@ rxvt_wcstombs (const wchar_t *str, int l
ssize_t l = wcrtomb (dst, *str++, mbs);
if (l < 0)
+ {
+ // Reset to initial conversion state from undefined
+ wcrtomb (0, 0, mbs);
*dst++ = '?';
+ }
else
dst += l;
}
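For the encoding direction, the same idea can be sketched as a standalone
function. This is an illustration under assumptions, not the body of
rxvt_wcstombs: encode_lossy and its buffer contract are hypothetical. It
compares the wcrtomb result against (size_t)-1 because wcrtomb returns
size_t, and it resets the state with memset instead of a wcrtomb call.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not rxvt-unicode code: encode len wide characters
   from str into dst, writing '?' for any character the current locale
   cannot represent. dst must have room for len * MB_CUR_MAX + 1 bytes.
   Returns the number of bytes written, excluding the terminating null. */
size_t
encode_lossy (const wchar_t *str, size_t len, char *dst)
{
  mbstate_t mbs;
  char *p = dst;

  memset (&mbs, 0, sizeof mbs);  /* start in the initial state */

  while (len--)
    {
      size_t l = wcrtomb (p, *str++, &mbs);

      if (l == (size_t)-1)
        {
          /* The state is unspecified after an encoding error, so reset
             it before converting the next character. */
          memset (&mbs, 0, sizeof mbs);
          *p++ = '?';
        }
      else
        p += l;
    }

  *p = '\0';
  return (size_t)(p - dst);
}
```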
>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 01 May 2016 18:16:43 +0000
State-Changed-Why:
coypu says this was fixed upstream.
>Unformatted: