NetBSD Problem Report #57530
From www@netbsd.org Tue Jul 18 03:43:29 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id E95E31A923D
for <gnats-bugs@gnats.NetBSD.org>; Tue, 18 Jul 2023 03:43:28 +0000 (UTC)
Message-Id: <20230718034328.0F8CA1A923E@mollari.NetBSD.org>
Date: Tue, 18 Jul 2023 03:43:28 +0000 (UTC)
From: rokuyama.rk@gmail.com
Reply-To: rokuyama.rk@gmail.com
To: gnats-bugs@NetBSD.org
Subject: regex(3): REG_BADRPT raised by { for REG_EXTENDED
X-Send-Pr-Version: www-1.0
>Number: 57530
>Category: lib
>Synopsis: regex(3): REG_BADRPT raised by { for REG_EXTENDED
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: lib-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Jul 18 03:45:00 +0000 2023
>Last-Modified: Tue Jul 18 09:30:02 +0000 2023
>Originator: Rin Okuyama
>Release: 10.99.3
>Organization:
Internet Initiative Japan Inc.
>Environment:
NetBSD netbsd 10.99.3 NetBSD 10.99.3 (AMD64_NET_MPSAFE) #1: Thu May 25 19:17:12 JST 2023 rin@latipes:/build/src/sys/arch/amd64/compile/AMD64_NET_MPSAFE amd64
>Description:
re_format(7) says:
> Extended regular expressions
...
> A ‘{’ followed by a character other than a digit is an
> ordinary character, not the beginning of a bound
However, REG_BADRPT is raised for a stray {, e.g.:
----
$ /bin/echo '{' | /usr/bin/sed -E '/{/p'
sed: 1: "/{/p": RE error: repetition-operator operand invalid
----
This is due to the recent sync of regex(3) with FreeBSD. At least,
the above example works on netbsd-8. This FreeBSD commit probably
introduced the change:
https://github.com/freebsd/freebsd-src/commit/a4a801688c909ef39cbcbc3488bc4fdbabd69d66
Thanks to yamaguchi@ for pointing out this problem.
>How-To-Repeat:
/bin/echo '{' | /usr/bin/sed -E '/{/p'
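A slightly more general check can be sketched as below. It assumes that the local grep -E is built on the same regex(3) implementation as sed, which may not hold everywhere. The helper name probe_stray_brace is hypothetical and not part of NetBSD.

```shell
#!/bin/sh
# probe_stray_brace: hypothetical helper, not part of any NetBSD tool.
# Feeds a single "{" line to grep -E '{' and classifies the exit status.
probe_stray_brace() {
    printf '{\n' | grep -E '{' >/dev/null 2>&1
    case $? in
        0) echo "literal" ;;   # pattern was accepted and matched the brace
        1) echo "no-match" ;;  # pattern was accepted but selected no line
        *) echo "rejected" ;;  # grep reported an error (exit status > 1)
    esac
}

probe_stray_brace
```

The result depends on the regex library in use, so no expected output is given here.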
>Fix:
Fix regcomp() or the manpages [regex(3) and re_format(7)].
Even if we choose the latter, I am not sure whether REG_BADRPT is
appropriate here.
>Audit-Trail:
From: RVP <rvp@SDF.ORG>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: lib/57530: regex(3): REG_BADRPT raised by { for REG_EXTENDED
Date: Tue, 18 Jul 2023 07:44:58 +0000 (UTC)
On Tue, 18 Jul 2023, rokuyama.rk@gmail.com wrote:
>> How-To-Repeat:
> /bin/echo '{' | /usr/bin/sed -E '/{/p'
>> Fix:
> Fix regcomp() or manpages [regex(3) and re_format(7)]
>
> Even if we choose the latter, I am not sure whether REG_BADRPT is
> appropriate here, or not.
>
I think so. SUSv4 (2018) says of this:
```
9.4 Extended Regular Expressions
*+?{
The <asterisk>, <plus-sign>, <question-mark>, and <left-brace>
shall be special except when used in a bracket expression (see RE
Bracket Expression). Any of the following uses produce undefined
results:
* If these characters appear first in an ERE, or immediately
following an unescaped <vertical-line>, <circumflex>,
<dollar-sign>, or <left-parenthesis>
* If a <left-brace> is not part of a valid interval expression
(see EREs Matching Multiple Characters)
```
I.e., this is exactly the same as:
```
$ grep -E '*'
grep: repetition-operator operand invalid
$
```
I think we should amend the man-page.
-RVP
From: RVP <rvp@SDF.ORG>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: lib/57530: regex(3): REG_BADRPT raised by { for REG_EXTENDED
Date: Tue, 18 Jul 2023 08:08:12 +0000 (UTC)
On Tue, 18 Jul 2023, RVP wrote:
> I.e., this is exactly the same as:
>
> ```
> $ grep -E '*'
> grep: repetition-operator operand invalid
> $
> ```
>
> I think we should amend the man-page.
>
I should've mentioned that the grep(1) used above was the one in FreeBSD-13.2
(which uses the libc regex(3) as its engine).
GNU grep 3.11 does:
```
$ /opt/gnu/bin/grep --version
grep (GNU grep) 3.11
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Mike Haertel and others; see
<https://git.savannah.gnu.org/cgit/grep.git/tree/AUTHORS>.
grep -P uses PCRE2 10.42 2022-12-11
$ env POSIXLY_CORRECT=true /opt/gnu/bin/grep -E '*' </dev/null
grep: warning: * at start of expression
$ env POSIXLY_CORRECT=true /opt/gnu/bin/grep -E '+' </dev/null
grep: warning: + at start of expression
$ env POSIXLY_CORRECT=true /opt/gnu/bin/grep -E '?' </dev/null
grep: warning: ? at start of expression
$
```
except:
```
$ env POSIXLY_CORRECT=true /opt/gnu/bin/grep -E '{' </dev/null
$
```
which behaves differently for some reason...
-RVP
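Whichever way this is resolved, the SUSv4 text quoted above exempts bracket expressions from the special treatment of <left-brace>, and an escaped special character is an ordinary character in an ERE. A minimal sketch of the two portable spellings follows; the function names match_brace_bracket and match_brace_escaped are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Select lines containing a literal "{" without depending on how the
# regex library treats a stray brace in an ERE.
# Both function names are hypothetical, for illustration only.

# Inside a bracket expression, "{" is not special.
match_brace_bracket() {
    sed -n -E '/[{]/p'
}

# A backslash-escaped "{" is an ordinary character in an ERE.
match_brace_escaped() {
    grep -E '\{'
}

printf '{\nplain\n' | match_brace_bracket
printf '{\nplain\n' | match_brace_escaped
```

Each pipeline should print only the line consisting of the brace, with both the pre-sync and post-sync regex(3) behaviour.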