NetBSD Problem Report #52671

From www@NetBSD.org  Sun Oct 29 15:19:38 2017
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id B91577A1C8
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 29 Oct 2017 15:19:38 +0000 (UTC)
Message-Id: <20171029151937.D6D967A1EC@mollari.NetBSD.org>
Date: Sun, 29 Oct 2017 15:19:37 +0000 (UTC)
From: gralph@post-ist-da.de
Reply-To: gralph@post-ist-da.de
To: gnats-bugs@NetBSD.org
Subject: The ignorecase option is not handled correctly in vi for Unicode characters
X-Send-Pr-Version: www-1.0

>Number:         52671
>Category:       bin
>Synopsis:       The ignorecase option is not handled correctly in vi for Unicode characters
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    bin-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Oct 29 15:20:00 +0000 2017
>Closed-Date:    Mon Oct 30 07:58:54 +0000 2017
>Last-Modified:  Sun Nov 12 16:35:01 +0000 2017
>Originator:     Ralph Geier
>Release:        7.1
>Organization:
>Environment:
NetBSD madeira NetBSD 7.1 (GENERIC.201703111743Z) i386
>Description:
With the 'ignorecase' option set, searches in vi do not seem to ignore
case for non-ASCII (Unicode) characters.
>How-To-Repeat:
Open a new file with vi, set the 'ignorecase' option and try to search
for German 'umlauts', e.g. in:

während Ähren 

>Fix:
The following patch seems to fix it. I did not investigate very
deeply. I replaced some character classification/conversion functions
with macros that use the multibyte-aware versions where necessary, as
was done in a similar way by the NetBSD 'local changes' to the same
source file.

Note that I had to introduce an ISALPHA2 macro, because the ISALPHA
implemented by the NetBSD local changes is restricted to the ASCII
character set and is not suitable here.
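
As a rough illustration of the substitution described above, the sketch
below computes a case counterpart with the wide-character <wctype.h>
functions instead of the narrow <ctype.h> ones. The narrow functions only
accept values representable as unsigned char (or EOF), so they cannot
classify a wide code point. The helper name wide_othercase is
hypothetical and is not part of nvi; it only mirrors the shape of
othercase() in the patch.

```c
#include <assert.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not the nvi code): return the case counterpart
 * of a wide character using the wide-character <wctype.h> functions,
 * or ch itself when there is none.
 */
static wint_t
wide_othercase(wint_t ch)
{
	assert(ch != WEOF);	/* WEOF is not a character */
	if (iswupper(ch))
		return towlower(ch);
	if (iswlower(ch))
		return towupper(ch);
	return ch;		/* no case counterpart */
}
```

For ASCII input this behaves like the narrow version. For characters
such as the umlauts in the example, the result depends on the LC_CTYPE
locale in effect (for instance after setlocale(LC_CTYPE, "") in a UTF-8
locale) and on the locale data installed.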

Index: multibyte.h
===================================================================             
RCS file: /cvsroot/src/external/bsd/nvi/dist/common/multibyte.h,v               
retrieving revision 1.2
diff -u -r1.2 multibyte.h
--- multibyte.h 22 Nov 2013 15:52:05 -0000      1.2
+++ multibyte.h 29 Oct 2017 13:09:49 -0000                                      
@@ -53,6 +53,7 @@   
 #define ISCNTRL                iswcntrl                                        
 #define ISGRAPH                iswgraph
 #define ISLOWER                iswlower                                        
+#define ISALPHA2       iswalpha        
 #define ISPUNCT                iswpunct
 #define ISSPACE                iswspace
 #define ISUPPER                iswupper
@@ -86,6 +87,7 @@
 #define ISCNTRL                iscntrl
 #define ISGRAPH                isgraph
 #define ISLOWER                islower
+#define ISALPHA2       isalpha
 #define ISPUNCT                ispunct
 #define ISSPACE                isspace
 #define ISUPPER                isupper

Index: regcomp.c
===================================================================
RCS file: /cvsroot/src/external/bsd/nvi/dist/regex/regcomp.c,v
retrieving revision 1.5
diff -u -r1.5 regcomp.c
--- regcomp.c   26 Jan 2014 21:47:00 -0000      1.5
+++ regcomp.c   29 Oct 2017 13:10:01 -0000
@@ -752,7 +752,7 @@
                int ci;

                for (i = p->g->csetsize - 1; i >= 0; i--)
-                       if (CHIN(cs, i) && isalpha(i)) {
+                       if (CHIN(cs, i) && ISALPHA2(i)) {
                                ci = othercase(i);
                                if (ci != i)
                                        CHadd(cs, ci);
@@ -860,7 +860,7 @@
        const char *u;
        char c;

-       while (MORE() && isalpha(PEEK()))
+       while (MORE() && ISALPHA2(PEEK()))
                NEXT();
        len = p->next - sp;
        for (cp = cclasses; cp->name != NULL; cp++)
@@ -949,11 +949,11 @@
 static char                    /* if no counterpart, return ch */
 othercase(int ch)
 {
-       assert(isalpha(ch));
-       if (isupper(ch))
-               return(tolower(ch));
-       else if (islower(ch))
-               return(toupper(ch));
+       assert(ISALPHA2(ch));
+       if (ISUPPER(ch))
+               return(TOLOWER(ch));
+       else if (ISLOWER(ch))
+               return(TOUPPER(ch));
        else                    /* peculiar, but could happen */
                return(ch);
 }
@@ -994,7 +994,7 @@
        cat_t *cap = p->g->categories;
 */

-       if ((p->g->cflags&REG_ICASE) && isalpha(ch) && othercase(ch) != ch)
+       if ((p->g->cflags&REG_ICASE) && ISALPHA2(ch) && othercase(ch) != ch)
                bothcases(p, ch);
        else {
                EMIT(OCHAR, (UCHAR_T)ch);

>Release-Note:

>Audit-Trail:
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52671 CVS commit: src/external/bsd/nvi/dist
Date: Sun, 29 Oct 2017 11:29:34 -0400

 Module Name:	src
 Committed By:	christos
 Date:		Sun Oct 29 15:29:34 UTC 2017

 Modified Files:
 	src/external/bsd/nvi/dist/common: multibyte.h
 	src/external/bsd/nvi/dist/regex: regcomp.c

 Log Message:
 PR/52671: Ralph Geier: The ignorecase option is not handled correctly in vi
 for Unicode characters


 To generate a diff of this commit:
 cvs rdiff -u -r1.2 -r1.3 src/external/bsd/nvi/dist/common/multibyte.h
 cvs rdiff -u -r1.5 -r1.6 src/external/bsd/nvi/dist/regex/regcomp.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: wiz@NetBSD.org
State-Changed-When: Mon, 30 Oct 2017 07:58:54 +0000
State-Changed-Why:
Committed by christos, thanks!


From: Ralph Geier <gralph@post-ist-da.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/52671
Date: Sun, 12 Nov 2017 16:11:52 +0100

 Hi,

 my solution was not mature. It has problems with (multibyte)
 characters that have no upper-/lowercase counterpart.

 E.g. search for the German 'sharp s' (ß) with 'ignorecase' set.
 The following patch should fix this.

 regards
 Ralph


 Index: regcomp.c
 ===================================================================
 RCS file: /cvsroot/src/external/bsd/nvi/dist/regex/regcomp.c,v
 retrieving revision 1.6
 diff -u -r1.6 regcomp.c
 --- regcomp.c	29 Oct 2017 15:29:34 -0000	1.6
 +++ regcomp.c	12 Nov 2017 14:45:44 -0000
 @@ -98,7 +98,7 @@
  static void p_b_eclass __P((struct parse *p, cset *cs));
  static char p_b_symbol __P((struct parse *p));
  static char p_b_coll_elem __P((struct parse *p, int endc));
 -static char othercase __P((int ch));
 +static RCHAR_T othercase __P((int ch));
  static void bothcases __P((struct parse *p, int ch));
  static void ordinary __P((struct parse *p, int ch));
  static void nonnewline __P((struct parse *p));
 @@ -946,7 +946,7 @@
   - othercase - return the case counterpart of an alphabetic
   == static char othercase(int ch);
   */
 -static char			/* if no counterpart, return ch */
 +static RCHAR_T			/* if no counterpart, return ch */
  othercase(int ch)
  {
  	assert(ISALPHA2(ch));
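
To illustrate why the return type matters, here is a minimal sketch of
what can happen when a wide character is returned through a plain char.
It assumes that RCHAR_T is a wide type in wide-character builds; wide_t
below is a hypothetical stand-in for it, and the helper names are
invented for this example. The code point 0x4E2D is used only because it
exceeds 0xFF, which makes the effect independent of whether char is
signed.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical stand-in for a wide RCHAR_T; the real typedef may differ. */
typedef wchar_t wide_t;

/* Old shape: the unchanged character is returned through a plain char. */
static char
keep_narrow(int ch)
{
	/* Values outside the range of char do not survive this conversion. */
	return (char)ch;
}

/* New shape: the unchanged character keeps its full width. */
static wide_t
keep_wide(int ch)
{
	return (wide_t)ch;
}

/* Mirrors the "othercase(ch) != ch" comparison seen in the first patch. */
static int
looks_cased_narrow(int ch)
{
	return keep_narrow(ch) != ch;
}

static int
looks_cased_wide(int ch)
{
	return keep_wide(ch) != ch;
}
```

With the narrow return type, a character that has no case counterpart
can still compare unequal to its own "other case", so it would be
treated as if it had one. With the wide return type the comparison
behaves as intended.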

From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52671 CVS commit: src/external/bsd/nvi/dist/regex
Date: Sun, 12 Nov 2017 11:33:31 -0500

 Module Name:	src
 Committed By:	christos
 Date:		Sun Nov 12 16:33:31 UTC 2017

 Modified Files:
 	src/external/bsd/nvi/dist/regex: regcomp.c

 Log Message:
 PR/52671: Ralph Geier, return the wide character when changing case (because
 it could be wide).


 To generate a diff of this commit:
 cvs rdiff -u -r1.6 -r1.7 src/external/bsd/nvi/dist/regex/regcomp.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

>Unformatted:

Copyright © 1994-2014 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.