NetBSD Problem Report #52671
From www@NetBSD.org Sun Oct 29 15:19:38 2017
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id B91577A1C8
for <gnats-bugs@gnats.NetBSD.org>; Sun, 29 Oct 2017 15:19:38 +0000 (UTC)
Message-Id: <20171029151937.D6D967A1EC@mollari.NetBSD.org>
Date: Sun, 29 Oct 2017 15:19:37 +0000 (UTC)
From: gralph@post-ist-da.de
Reply-To: gralph@post-ist-da.de
To: gnats-bugs@NetBSD.org
Subject: The ignorecase option is not handled correctly in vi for Unicode characters
X-Send-Pr-Version: www-1.0
>Number: 52671
>Category: bin
>Synopsis: The ignorecase option is not handled correctly in vi for Unicode characters
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: bin-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Oct 29 15:20:00 +0000 2017
>Closed-Date: Mon Oct 30 07:58:54 +0000 2017
>Last-Modified: Sun Nov 12 16:35:01 +0000 2017
>Originator: Ralph Geier
>Release: 7.1
>Organization:
>Environment:
NetBSD madeira NetBSD 7.1 (GENERIC.201703111743Z) i386
>Description:
With the 'ignorecase' option set, vi searches do not match non-ASCII
(Unicode) characters case-insensitively.
>How-To-Repeat:
Open a new file with vi, set the 'ignorecase' option, and search for
German umlauts. For example, a search for 'ä' should match both words in:
während Ähren
>Fix:
The following patch seems to fix it, although I did not investigate very
deeply. It replaces some character classification and conversion
functions with macros that use the multibyte-aware versions where
necessary, similar to what the NetBSD 'local changes' already do in the
same source file.
Note that I had to introduce an ISALPHA2 macro, because the ISALPHA
implemented by the NetBSD local changes is restricted to the ASCII
character set and is not suitable here.
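For readers who want to try the idea outside of nvi, here is a minimal
sketch of what othercase() amounts to once the macros resolve to the
<wctype.h> functions in a wide-character build. The name demo_othercase
is hypothetical and not part of nvi. That TOLOWER and TOUPPER map to
towlower and towupper is an assumption, since the hunks below only show
the classification macros.

```c
#include <assert.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical demo_othercase(): same shape as othercase() in the patch,
 * with ISALPHA2/ISUPPER/ISLOWER written out as their wide-character
 * targets. TOLOWER/TOUPPER are assumed to be towlower/towupper.
 * If ch has no case counterpart, ch is returned unchanged.
 */
static wint_t
demo_othercase(wint_t ch)
{
	assert(iswalpha(ch));
	if (iswupper(ch))
		return towlower(ch);
	else if (iswlower(ch))
		return towupper(ch);
	else			/* peculiar, but could happen */
		return ch;
}
```

The classification of non-ASCII characters depends on the active locale,
so a test program would need to call setlocale(LC_CTYPE, "") under a
suitable locale before passing a character such as L'ä' to this function.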
Index: multibyte.h
===================================================================
RCS file: /cvsroot/src/external/bsd/nvi/dist/common/multibyte.h,v
retrieving revision 1.2
diff -u -r1.2 multibyte.h
--- multibyte.h 22 Nov 2013 15:52:05 -0000 1.2
+++ multibyte.h 29 Oct 2017 13:09:49 -0000
@@ -53,6 +53,7 @@
#define ISCNTRL iswcntrl
#define ISGRAPH iswgraph
#define ISLOWER iswlower
+#define ISALPHA2 iswalpha
#define ISPUNCT iswpunct
#define ISSPACE iswspace
#define ISUPPER iswupper
@@ -86,6 +87,7 @@
#define ISCNTRL iscntrl
#define ISGRAPH isgraph
#define ISLOWER islower
+#define ISALPHA2 isalpha
#define ISPUNCT ispunct
#define ISSPACE isspace
#define ISUPPER isupper
Index: regcomp.c
===================================================================
RCS file: /cvsroot/src/external/bsd/nvi/dist/regex/regcomp.c,v
retrieving revision 1.5
diff -u -r1.5 regcomp.c
--- regcomp.c 26 Jan 2014 21:47:00 -0000 1.5
+++ regcomp.c 29 Oct 2017 13:10:01 -0000
@@ -752,7 +752,7 @@
int ci;
for (i = p->g->csetsize - 1; i >= 0; i--)
- if (CHIN(cs, i) && isalpha(i)) {
+ if (CHIN(cs, i) && ISALPHA2(i)) {
ci = othercase(i);
if (ci != i)
CHadd(cs, ci);
@@ -860,7 +860,7 @@
const char *u;
char c;
- while (MORE() && isalpha(PEEK()))
+ while (MORE() && ISALPHA2(PEEK()))
NEXT();
len = p->next - sp;
for (cp = cclasses; cp->name != NULL; cp++)
@@ -949,11 +949,11 @@
static char /* if no counterpart, return ch */
othercase(int ch)
{
- assert(isalpha(ch));
- if (isupper(ch))
- return(tolower(ch));
- else if (islower(ch))
- return(toupper(ch));
+ assert(ISALPHA2(ch));
+ if (ISUPPER(ch))
+ return(TOLOWER(ch));
+ else if (ISLOWER(ch))
+ return(TOUPPER(ch));
else /* peculiar, but could happen */
return(ch);
}
@@ -994,7 +994,7 @@
cat_t *cap = p->g->categories;
*/
- if ((p->g->cflags&REG_ICASE) && isalpha(ch) && othercase(ch) != ch)
+ if ((p->g->cflags&REG_ICASE) && ISALPHA2(ch) && othercase(ch) != ch)
bothcases(p, ch);
else {
EMIT(OCHAR, (UCHAR_T)ch);
>Release-Note:
>Audit-Trail:
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/52671 CVS commit: src/external/bsd/nvi/dist
Date: Sun, 29 Oct 2017 11:29:34 -0400
Module Name: src
Committed By: christos
Date: Sun Oct 29 15:29:34 UTC 2017
Modified Files:
src/external/bsd/nvi/dist/common: multibyte.h
src/external/bsd/nvi/dist/regex: regcomp.c
Log Message:
PR/52671: Ralph Geier: The ignorecase option is not handled correctly in vi
for Unicode characters
To generate a diff of this commit:
cvs rdiff -u -r1.2 -r1.3 src/external/bsd/nvi/dist/common/multibyte.h
cvs rdiff -u -r1.5 -r1.6 src/external/bsd/nvi/dist/regex/regcomp.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: wiz@NetBSD.org
State-Changed-When: Mon, 30 Oct 2017 07:58:54 +0000
State-Changed-Why:
Committed by christos, thanks!
From: Ralph Geier <gralph@post-ist-da.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/52671
Date: Sun, 12 Nov 2017 16:11:52 +0100
Hi,
my solution was not mature. It has problems with (multibyte)
characters that have no upper-/lowercase counterpart.
E.g. search for the German 'sharp s' (ß) with 'ignorecase' set.
The following patch should fix this.
regards
Ralph
Index: regcomp.c
===================================================================
RCS file: /cvsroot/src/external/bsd/nvi/dist/regex/regcomp.c,v
retrieving revision 1.6
diff -u -r1.6 regcomp.c
--- regcomp.c 29 Oct 2017 15:29:34 -0000 1.6
+++ regcomp.c 12 Nov 2017 14:45:44 -0000
@@ -98,7 +98,7 @@
static void p_b_eclass __P((struct parse *p, cset *cs));
static char p_b_symbol __P((struct parse *p));
static char p_b_coll_elem __P((struct parse *p, int endc));
-static char othercase __P((int ch));
+static RCHAR_T othercase __P((int ch));
static void bothcases __P((struct parse *p, int ch));
static void ordinary __P((struct parse *p, int ch));
static void nonnewline __P((struct parse *p));
@@ -946,7 +946,7 @@
- othercase - return the case counterpart of an alphabetic
== static char othercase(int ch);
*/
-static char /* if no counterpart, return ch */
+static RCHAR_T /* if no counterpart, return ch */
othercase(int ch)
{
assert(ISALPHA2(ch));
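The effect of the return-type change can be reproduced in isolation. The
sketch below is not nvi code. The demo_* names are hypothetical, and
wchar_t stands in for RCHAR_T on the assumption of a wide-character
build. It shows that a value passed through a char return type no longer
compares equal to the original once it does not fit in a char, so a test
like othercase(ch) != ch can be true even when no case conversion took
place.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical stand-in for RCHAR_T in a wide-character build. */
typedef wchar_t demo_rchar_t;

/* Old shape: the unchanged character is narrowed to char on return. */
static char
demo_identity_narrow(int ch)
{
	return (char)ch;
}

/* New shape: the unchanged character keeps its full width. */
static demo_rchar_t
demo_identity_wide(int ch)
{
	return (demo_rchar_t)ch;
}

/* Mirrors the caller's check "othercase(ch) != ch" for the old shape. */
static int
demo_looks_changed_narrow(int ch)
{
	assert(ch >= 0);
	return demo_identity_narrow(ch) != ch;
}

/* The same check for the new shape. */
static int
demo_looks_changed_wide(int ch)
{
	assert(ch >= 0);
	return (int)demo_identity_wide(ch) != ch;
}
```

With these helpers, demo_looks_changed_narrow(0x0101) reports a change
although nothing was converted, while demo_looks_changed_wide(0x0101)
does not. On platforms where char is signed, 0xDF (the sharp s) would
presumably be affected in the same way.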
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/52671 CVS commit: src/external/bsd/nvi/dist/regex
Date: Sun, 12 Nov 2017 11:33:31 -0500
Module Name: src
Committed By: christos
Date: Sun Nov 12 16:33:31 UTC 2017
Modified Files:
src/external/bsd/nvi/dist/regex: regcomp.c
Log Message:
PR/52671: Ralph Geier, return the wide character when changing case (because
it could be wide).
To generate a diff of this commit:
cvs rdiff -u -r1.6 -r1.7 src/external/bsd/nvi/dist/regex/regcomp.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
>Unformatted: