NetBSD Problem Report #38511

From neil@daikokuya.co.uk  Sat Apr 26 01:20:29 2008
Return-Path: <neil@daikokuya.co.uk>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id 807B263B8BC
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 26 Apr 2008 01:20:29 +0000 (UTC)
Message-Id: <20080426012028.12110.qmail@daikokuya.co.uk>
Date: 26 Apr 2008 01:20:28 -0000
From: neil@daikokuya.co.uk
Reply-To: neil@daikokuya.co.uk
To: gnats-bugs@gnats.NetBSD.org
Subject: Invalid error recovery in rxvt-unicode
X-Send-Pr-Version: 3.95

>Number:         38511
>Category:       pkg
>Synopsis:       Invalid error recovery in rxvt-unicode
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    pkg-manager
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Apr 26 01:25:00 +0000 2008
>Originator:     neil@daikokuya.co.uk
>Release:        NetBSD 4.99.55
>Organization:
>Environment:


System: NetBSD duron.akihabara.co.uk 4.99.55 NetBSD 4.99.55 (GENERIC.NB) #0: Sun Mar 2 10:43:52 JST 2008 root@duron.akihabara.co.uk:/usr/obj/usr/src/sys/arch/i386/compile/GENERIC.NB i386
Architecture: i386
Machine: i386
>Description:
  rxvt-unicode does not reset the conversion state after a failed
multibyte conversion.  The C standard leaves the conversion state
unspecified once mbrtowc() or wcrtomb() reports an encoding error, so
an explicit call to reset the state is required.  This matters in
practice on NetBSD: our C library leaves the state such that
subsequent conversions fail.
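
To illustrate the recovery described above, here is a minimal sketch in
plain C.  It is not rxvt-unicode code: decode_lossy() and its parameters
are hypothetical names chosen for the example.  It assumes that zeroing an
mbstate_t yields the initial conversion state, which the C standard
permits, and it lets an undecodable byte through unchanged, much as
next_char() does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not part of rxvt-unicode): decode up to n bytes of
   src into dst (capacity cap) and return the number of wide characters
   stored. */
size_t decode_lossy(const char *src, size_t n, wchar_t *dst, size_t cap)
{
    mbstate_t st;
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    memset(&st, 0, sizeof st);          /* start in the initial state */

    while (n > 0 && out < cap) {
        wchar_t wc;
        size_t len = mbrtowc(&wc, src, n, &st);

        if (len == (size_t)-2)          /* incomplete sequence: stop here */
            break;

        if (len == (size_t)-1) {
            /* Encoding error: the state is now unspecified, so put it
               back to the initial state before converting anything else. */
            memset(&st, 0, sizeof st);
            wc = (unsigned char)*src;   /* let the raw byte slip through */
            len = 1;
        } else if (len == 0) {
            len = 1;                    /* decoded L'\0'; step over the NUL byte */
        }

        dst[out++] = wc;
        src += len;
        n -= len;
    }
    return out;
}
```

Without the memset() in the error branch, whether the bytes after a bad
sequence decode correctly depends on the C library.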

>How-To-Repeat:

 LC_CTYPE=ja_JP.UTF-8 urxvt
 <cat a UTF-8 file; all is cool>
 <cat an EUC-JP file; it comes out garbage>
 <cat the same UTF-8 file again; all is garbage>
>Fix:
Apply the following patches; I'm passing them upstream:

$NetBSD$

--- src/command.C.orig	2008-04-26 10:10:05.000000000 +0900
+++ src/command.C
@@ -2380,13 +2380,19 @@ rxvt_term::next_char () NOTHROW

       if (len == (size_t)-2)
         {
+	  // Reset to initial conversion state from undefined
+	  mbrtowc (0, 0, 0, mbstate);
           // the mbstate stores incomplete sequences. didn't know this :/
           cmdbuf_ptr = cmdbuf_endp;
           break;
         }

       if (len == (size_t)-1)
+	{
+	  // Reset to initial conversion state from undefined
+	  mbrtowc (0, 0, 0, mbstate);
         return (unsigned char)*cmdbuf_ptr++; // the _occasional_ latin1 character is allowed to slip through
+	}

       // assume wchar == unicode
       cmdbuf_ptr += len;



$NetBSD$

--- src/misc.C.orig	2008-04-26 10:10:56.000000000 +0900
+++ src/misc.C
@@ -40,7 +40,11 @@ rxvt_wcstombs (const wchar_t *str, int l
       ssize_t l = wcrtomb (dst, *str++, mbs);

       if (l < 0)
+      {
+	// Reset to initial conversion state from undefined
+	wcrtomb (0, 0, mbs);
         *dst++ = '?';
+      }
       else
         dst += l;
     }
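
For the wide-to-multibyte direction, the same idea can be sketched as a
standalone C function.  This is an illustration only, not the code in
misc.C: encode_lossy() and its parameters are hypothetical, and the state
is reset by zeroing the mbstate_t rather than by calling wcrtomb() with a
null buffer as the patch does.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not part of rxvt-unicode): encode n wide characters
   from src into dst (capacity cap) and return the number of bytes written.
   Characters the locale cannot represent are replaced with '?'. */
size_t encode_lossy(const wchar_t *src, size_t n, char *dst, size_t cap)
{
    mbstate_t st;
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    memset(&st, 0, sizeof st);          /* start in the initial state */

    for (size_t i = 0; i < n; i++) {
        char buf[MB_LEN_MAX];           /* large enough for any one character */
        size_t len = wcrtomb(buf, src[i], &st);

        if (len == (size_t)-1) {
            /* Encoding error: the state is unspecified, so reset it
               before substituting a placeholder and carrying on. */
            memset(&st, 0, sizeof st);
            buf[0] = '?';
            len = 1;
        }

        if (len > cap - out)            /* not enough room left in dst */
            break;
        memcpy(dst + out, buf, len);
        out += len;
    }
    return out;
}
```

Note that wcrtomb() returns a size_t, so the error case is tested against
(size_t)-1 rather than with a signed comparison.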

>Unformatted:
