NetBSD Problem Report #57530

From www@netbsd.org  Tue Jul 18 03:43:29 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id E95E31A923D
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 18 Jul 2023 03:43:28 +0000 (UTC)
Message-Id: <20230718034328.0F8CA1A923E@mollari.NetBSD.org>
Date: Tue, 18 Jul 2023 03:43:28 +0000 (UTC)
From: rokuyama.rk@gmail.com
Reply-To: rokuyama.rk@gmail.com
To: gnats-bugs@NetBSD.org
Subject: regex(3): REG_BADRPT raised by { for REG_EXTENDED
X-Send-Pr-Version: www-1.0

>Number:         57530
>Category:       lib
>Synopsis:       regex(3): REG_BADRPT raised by { for REG_EXTENDED
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    lib-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Jul 18 03:45:00 +0000 2023
>Last-Modified:  Tue Jul 18 09:30:02 +0000 2023
>Originator:     Rin Okuyama
>Release:        10.99.3
>Organization:
Internet Initiative Japan Inc.
>Environment:
NetBSD netbsd 10.99.3 NetBSD 10.99.3 (AMD64_NET_MPSAFE) #1: Thu May 25 19:17:12 JST 2023  rin@latipes:/build/src/sys/arch/amd64/compile/AMD64_NET_MPSAFE amd64
>Description:
re_format(7) says:

> Extended regular expressions
...
> A ‘{’ followed by a character other than a digit is an
> ordinary character, not the beginning of a bound

However, REG_BADRPT is raised for a stray {, e.g.:

----
$ /bin/echo '{' | /usr/bin/sed -E '/{/p'
sed: 1: "/{/p": RE error: repetition-operator operand invalid
----

This is due to the recent sync of regex(3) with FreeBSD; the above
example works on netbsd-8, at least. This FreeBSD commit is probably
what changed the behavior:

https://github.com/freebsd/freebsd-src/commit/a4a801688c909ef39cbcbc3488bc4fdbabd69d66

Thanks yamaguchi@ for pointing out this problem.
>How-To-Repeat:
/bin/echo '{' | /usr/bin/sed -E '/{/p'
>Fix:
Fix either regcomp() or the man pages [regex(3) and re_format(7)].

Even if we choose the latter, I am not sure whether REG_BADRPT is
appropriate here, or not.
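
A possible interim workaround, independent of which fix is chosen: in an
ERE, { is not special inside a bracket expression, and a backslash-escaped
\{ is an ordinary character, so both spellings should match a literal {
without relying on how a stray { is treated. The commands below are only a
sketch; they assume a sed(1) that accepts -E and have not been verified on
10.99.3. printf(1) is used in place of /bin/echo purely for portability.

```shell
# Bracket expression: '{' is not special inside [...]
printf '{\n' | sed -E -n '/[{]/p'

# Backslash escape: '\{' is an ordinary character in an ERE
printf '{\n' | sed -E -n '/\{/p'

# A '{' that begins a valid bound is unaffected
printf 'aa\n' | sed -E -n '/^a{2}$/p'
```

The first two commands are expected to print the input line {, and the
third the input line aa.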

>Audit-Trail:
From: RVP <rvp@SDF.ORG>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: lib/57530: regex(3): REG_BADRPT raised by { for REG_EXTENDED
Date: Tue, 18 Jul 2023 07:44:58 +0000 (UTC)

 On Tue, 18 Jul 2023, rokuyama.rk@gmail.com wrote:

 >> How-To-Repeat:
 > /bin/echo '{' | /usr/bin/sed -E '/{/p'
 >> Fix:
 > Fix regcomp() or manpages [regex(3) and re_format(7)]
 >
 > Even if we choose the latter, I am not sure whether REG_BADRPT is
 > appropriate here, or not.
 >

 I think so. SUSv4 (2018) says of this:

 ```
 9.4 Extended Regular Expressions
     *+?{
  	The <asterisk>, <plus-sign>, <question-mark>, and <left-brace>
  	shall be special except when used in a bracket expression (see RE
  	Bracket Expression). Any of the following uses produce undefined
  	results:

  	  * If these characters appear first in an ERE, or immediately
  	    following an unescaped <vertical-line>, <circumflex>,
  	    <dollar-sign>, or <left-parenthesis>

  	  * If a <left-brace> is not part of a valid interval expression
  	    (see EREs Matching Multiple Characters)
 ```

 I.e., this is exactly the same as:

 ```
 $ grep -E '*'
 grep: repetition-operator operand invalid
 $
 ```

 I think we should amend the man-page.

 -RVP

From: RVP <rvp@SDF.ORG>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: lib/57530: regex(3): REG_BADRPT raised by { for REG_EXTENDED
Date: Tue, 18 Jul 2023 08:08:12 +0000 (UTC)

 On Tue, 18 Jul 2023, RVP wrote:

 > I.e., this is exactly the same as:
 >
 > ```
 > $ grep -E '*'
 > grep: repetition-operator operand invalid
 > $
 > ```
 >
 > I think we should amend the man-page.
 >

 I should've mentioned that the grep(1) used above was the one in FreeBSD-13.2
 (which uses the libc regex(3) as its engine).

 GNU grep 3.11 does:

 ```
 $ /opt/gnu/bin/grep --version
 grep (GNU grep) 3.11
 Copyright (C) 2023 Free Software Foundation, Inc.
 License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
 This is free software: you are free to change and redistribute it.
 There is NO WARRANTY, to the extent permitted by law.

 Written by Mike Haertel and others; see
 <https://git.savannah.gnu.org/cgit/grep.git/tree/AUTHORS>.

 grep -P uses PCRE2 10.42 2022-12-11

 $ env POSIXLY_CORRECT=true /opt/gnu/bin/grep -E '*' </dev/null
 grep: warning: * at start of expression

 $ env POSIXLY_CORRECT=true /opt/gnu/bin/grep -E '+' </dev/null
 grep: warning: + at start of expression

 $ env POSIXLY_CORRECT=true /opt/gnu/bin/grep -E '?' </dev/null
 grep: warning: ? at start of expression

 $
 ```

 except:

 ```
 $ env POSIXLY_CORRECT=true /opt/gnu/bin/grep -E '{' </dev/null
 $
 ```

 which behaves differently for some reason...

 -RVP
