NetBSD Problem Report #43896
From www@NetBSD.org Wed Sep 22 15:53:18 2010
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id 2945E63B995
for <gnats-bugs@gnats.NetBSD.org>; Wed, 22 Sep 2010 15:53:18 +0000 (UTC)
Message-Id: <20100922155317.E825763B97A@www.NetBSD.org>
Date: Wed, 22 Sep 2010 15:53:17 +0000 (UTC)
From: peter@kerwien.homeip.net
Reply-To: peter@kerwien.homeip.net
To: gnats-bugs@NetBSD.org
Subject: grep -o match problem
X-Send-Pr-Version: www-1.0
>Number: 43896
>Category: bin
>Synopsis: grep -o match problem
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: bin-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Sep 22 15:55:00 +0000 2010
>Closed-Date: Tue Sep 28 01:28:09 +0000 2010
>Last-Modified: Tue Sep 28 01:28:09 +0000 2010
>Originator: Peter Kerwien
>Release: NetBSD 5.99.39 (amd64)
>Organization:
N/A
>Environment:
NetBSD pc3 5.99.39 NetBSD 5.99.39 (GENERIC) #1: Wed Sep 22 05:57:37 UTC 2010 root@pc3:/usr/obj/sys/arch/amd64/compile/GENERIC amd64
>Description:
The following command fails to match properly:

  echo VERSION=10 | grep -o '[0-9]*'

The result is empty. The correct result should be 10.
>How-To-Repeat:
See description.
>Fix:
>Release-Note:
>Audit-Trail:
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: bin/43896: grep -o match problem
Date: Mon, 27 Sep 2010 02:17:47 +0000
On Wed, Sep 22, 2010 at 03:55:00PM +0000, peter@kerwien.homeip.net wrote:
> The following command fails to match properly:
>
> echo VERSION=10 | grep -o '[0-9]*'
>
> The result is empty. The correct result should be 10.
This result is, though perhaps not useful, correct. You can see what's
going on if you try sed:

   % echo VERSION=10 | sed 's/[0-9]*/wibble/'
   wibbleVERSION=10
   %

Because [0-9]* matches the empty string, grep is matching the empty
string at the beginning of the line and printing that.

To get the result you're looking for, try grep -o '[0-9][0-9]*' or
egrep -o '[0-9]+'.
--
David A. Holland
dholland@netbsd.org
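
As a rough illustration of the explanation above, the following C sketch
asks the POSIX regex(3) routines where the leftmost match of a pattern
falls. The helper name first_match and its parameters are invented for
this example and are not part of grep; regcomp, regexec and regfree are
the standard POSIX calls, and grep's own matcher may differ in detail.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from grep): locate the leftmost match of
 * `pattern` in `line`.  `cflags` is passed to regcomp(), so 0 selects
 * basic regular expressions and REG_EXTENDED selects extended ones.
 * On success returns 0 and stores the match offset and length;
 * returns nonzero if the pattern does not compile or does not match.
 */
int first_match(const char *pattern, int cflags, const char *line,
                int *offset, int *length)
{
    regex_t re;
    regmatch_t m;
    int rc;

    if (regcomp(&re, pattern, cflags) != 0)
        return -1;

    /* Ask for one match slot: the overall match. */
    rc = regexec(&re, line, 1, &m, 0);
    if (rc == 0) {
        *offset = (int)m.rm_so;
        *length = (int)(m.rm_eo - m.rm_so);
    }
    regfree(&re);
    return rc;
}
```

Under these assumptions, first_match("[0-9]*", 0, "VERSION=10", ...)
reports offset 0 and length 0, the empty match at the start of the line,
whereas "[0-9][0-9]*" (or "[0-9]+" with REG_EXTENDED) reports offset 8
and length 2, which is the "10".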
State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Mon, 27 Sep 2010 02:22:44 +0000
State-Changed-Why:
Submitter hit one of the pitfalls in regexp matching...
State-Changed-From-To: closed->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Mon, 27 Sep 2010 08:29:53 +0000
State-Changed-Why:
A bug does exist here, however. grep -o apparently prints each distinct
match for a given input line separately:

   % echo ' 1 2 3 4 ' | grep -o '[0-9]'
   1
   2
   3
   4
   % echo ' the quick brown fox ' | grep -o '[a-z][a-z]*'
   the
   quick
   brown
   fox
   %

Therefore, the original example, which can match the empty string, should
print all the possible empty strings it can match and also the nonempty
match, and not just stop with the first empty match at the beginning of the
line.

Reportedly, updating grep will fix the problem.
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/43896 (grep -o match problem)
Date: Mon, 27 Sep 2010 08:37:00 +0000
On Mon, Sep 27, 2010 at 08:29:54AM +0000, dholland@NetBSD.org wrote:
> A bug does exist here, however. grep -o apparently prints each distinct
> match for a given input line separately:
>
> [...]
>
> Reportedly, updating grep will fix the problem.
As does this patch:
Index: src/grep.c
===================================================================
RCS file: /cvsroot/src/gnu/dist/grep/src/grep.c,v
retrieving revision 1.12
diff -u -p -r1.12 grep.c
--- src/grep.c 28 Aug 2008 03:59:06 -0000 1.12
+++ src/grep.c 27 Sep 2010 08:35:31 -0000
@@ -542,7 +542,10 @@ prline (char const *beg, char const *lim
 	  if (b == lim)
 	    break;
 	  if (match_size == 0)
-	    break;
+	    {
+	      beg++;
+	      continue;
+	    }
 	  if(color_option)
 	    printf("\33[%sm", grep_color);
 	  fwrite(b, sizeof (char), match_size, stdout);
--
David A. Holland
dholland@netbsd.org
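
The idea behind the patch, stepping past an empty match instead of
giving up on the line, can be approximated outside of grep with a short
C sketch. This is not grep's prline(): the function only_matching, its
output buffer, and the use of POSIX regex(3) in place of grep's own
matcher are all assumptions made for illustration.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not grep's prline): copy every nonempty match of
 * the basic regular expression `pattern` found in `line` into `out`,
 * one match per output line, roughly as grep -o prints them.
 * Returns the number of matches written, or -1 on a bad pattern or a
 * zero-sized buffer.  Output is silently truncated if `out` fills up.
 */
int only_matching(const char *pattern, const char *line,
                  char *out, size_t outsize)
{
    regex_t re;
    regmatch_t m;
    const char *beg = line;
    const char *lim = line + strlen(line);
    size_t used = 0;
    int count = 0;

    if (outsize == 0 || regcomp(&re, pattern, 0) != 0)
        return -1;
    out[0] = '\0';

    while (beg < lim) {
        /* Past the first position, '^' must no longer match. */
        int flags = (beg == line) ? 0 : REG_NOTBOL;
        size_t len;

        if (regexec(&re, beg, 1, &m, flags) != 0)
            break;                      /* no further matches */

        len = (size_t)(m.rm_eo - m.rm_so);
        if (len == 0) {
            /* Empty match: advance one character and retry,
             * instead of abandoning the rest of the line. */
            beg++;
            continue;
        }

        /* Need room for the match, a newline, and the terminator. */
        if (used + len + 2 > outsize)
            break;
        memcpy(out + used, beg + m.rm_so, len);
        used += len;
        out[used++] = '\n';
        out[used] = '\0';
        count++;

        beg += m.rm_eo;                 /* resume after this match */
    }

    regfree(&re);
    return count;
}
```

With this loop, the pattern "[0-9]*" applied to "VERSION=10" skips the
empty matches at the first eight positions and then records "10". If the
empty-match branch used break instead, as in the code before the patch,
nothing would be recorded for that line.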
From: "David A. Holland" <dholland@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/43896 CVS commit: src/gnu/dist/grep/src
Date: Tue, 28 Sep 2010 00:54:05 +0000
Module Name: src
Committed By: dholland
Date: Tue Sep 28 00:54:04 UTC 2010
Modified Files:
src/gnu/dist/grep/src: grep.c
Log Message:
Fix -o behavior with patterns that match the empty string, as per PR 43896.
To generate a diff of this commit:
cvs rdiff -u -r1.12 -r1.13 src/gnu/dist/grep/src/grep.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Tue, 28 Sep 2010 01:28:09 +0000
State-Changed-Why:
Fixed (properly now), thanks.
>Unformatted: