NetBSD Problem Report #42320

From www@NetBSD.org  Sun Nov 15 11:19:03 2009
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id E5F1B63B8CD
	for <gnats-bugs@gnats.netbsd.org>; Sun, 15 Nov 2009 11:19:03 +0000 (UTC)
Message-Id: <20091115111903.3712C63B844@www.NetBSD.org>
Date: Sun, 15 Nov 2009 11:19:03 +0000 (UTC)
From: alnsn@yandex.ru
Reply-To: alnsn@yandex.ru
To: gnats-bugs@NetBSD.org
Subject: LC_NUMERIC in awk is not POSIX  compliant
X-Send-Pr-Version: www-1.0

>Number:         42320
>Category:       bin
>Synopsis:       LC_NUMERIC in awk is not POSIX  compliant
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    bin-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Nov 15 11:20:00 +0000 2009
>Closed-Date:    Sun Mar 27 22:11:23 +0000 2016
>Last-Modified:  Sun Mar 27 22:11:23 +0000 2016
>Originator:     Alexander Nasonov
>Release:        NetBSD 5.99.22
>Organization:
>Environment:
$ uname -a
NetBSD aa1nb.lan 5.99.22 NetBSD 5.99.22 (MONOLITHIC) #0: Sat Nov 14 16:49:42 GMT 2009  root@aa1nb.lan:/home/alnsn/src/netbsd-current/src/sys/arch/i386/compile/obj/MONOLITHIC i386

>Description:
awk doesn't recognise the period character in program text if ${LC_NUMERIC} names a locale whose decimal-point character is a comma. The POSIX specification has a special case for this situation:

http://www.opengroup.org/onlinepubs/7990989775/xcu/awk.html

LC_NUMERIC
    Determine the radix character used when interpreting numeric input, performing conversions between numeric and string values and formatting numeric output. Regardless of locale, the period character (the decimal-point character of the POSIX locale) is the decimal-point character recognised in processing awk programs (including assignments in command-line arguments). 
>How-To-Repeat:
$ LC_NUMERIC=ru_RU.KOI8-R /usr/bin/awk '{print 0.01}'
/usr/bin/awk: syntax error at source line 1
 context is
        {print >>>  0.01 <<< 
/usr/bin/awk: illegal statement at source line 1

>Fix:

>Release-Note:

>Audit-Trail:
From: Christos Zoulas <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/42320 CVS commit: src/dist/nawk
Date: Sun, 15 Nov 2009 16:56:06 -0500

 Module Name:	src
 Committed By:	christos
 Date:		Sun Nov 15 21:56:06 UTC 2009

 Modified Files:
 	src/dist/nawk: main.c

 Log Message:
 PR/42320: Alexander Nasonov: According to:
 http://www.opengroup.org/onlinepubs/7990989775/xcu/awk.html
 the LC_NUMERIC decimal point recognized is always period. Make it so.


 To generate a diff of this commit:
 cvs rdiff -u -r1.9 -r1.10 src/dist/nawk/main.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: David Laight <david@l8s.co.uk>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, alnsn@yandex.ru
Subject: Re: PR/42320 CVS commit: src/dist/nawk
Date: Sun, 15 Nov 2009 22:11:27 +0000

 On Sun, Nov 15, 2009 at 10:00:06PM +0000, Christos Zoulas wrote:
 > The following reply was made to PR bin/42320; it has been noted by GNATS.
 ...
 >  Modified Files:
 >  	src/dist/nawk: main.c
 >  
 >  Log Message:
 >  PR/42320: Alexander Nasonov: According to:
 >  http://www.opengroup.org/onlinepubs/7990989775/xcu/awk.html
 >  the LC_NUMERIC decimal point recognized is always period. Make it so.

 I'm not sure this DTRT, the above only refers to LC_NUMERIC in
 awk program/script files, not in the data files being processed.

 Also you can't force the decimal point to be '.' without ensuring
 that the 1000s separator isn't also '.'.

 	David

 -- 
 David Laight: david@l8s.co.uk

From: christos@zoulas.com (Christos Zoulas)
To: David Laight <david@l8s.co.uk>, gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, alnsn@yandex.ru
Subject: Re: PR/42320 CVS commit: src/dist/nawk
Date: Sun, 15 Nov 2009 17:22:54 -0500

 On Nov 15, 10:11pm, david@l8s.co.uk (David Laight) wrote:
 -- Subject: Re: PR/42320 CVS commit: src/dist/nawk

 | On Sun, Nov 15, 2009 at 10:00:06PM +0000, Christos Zoulas wrote:
 | > The following reply was made to PR bin/42320; it has been noted by GNATS.
 | ...
 | >  Modified Files:
 | >  	src/dist/nawk: main.c
 | >  
 | >  Log Message:
 | >  PR/42320: Alexander Nasonov: According to:
 | >  http://www.opengroup.org/onlinepubs/7990989775/xcu/awk.html
 | >  the LC_NUMERIC decimal point recognized is always period. Make it so.
 | 
 | I'm not sure this DTRT, the above only refers to LC_NUMERIC in
 | awk program/script files, not in the data files being processed.
 | 
 | Also you can't force the decimal point to be '.' without ensuring
 | that the 1000s separator isn't also '.'.

 So what should we do? Reset it after the program is parsed? What does
 gawk do?

 christos

From: Alan Barrett <apb@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/42320 CVS commit: src
Date: Tue, 17 Nov 2009 20:49:34 +0000

 Module Name:	src
 Committed By:	apb
 Date:		Tue Nov 17 20:49:34 UTC 2009

 Modified Files:
 	src: build.sh

 Log Message:
 Set LC_ALL=C before we try to parse the output from any command.
 This will ensure that awk is not invoked in a way that tickles
 the bug described in PR 42320.


 To generate a diff of this commit:
 cvs rdiff -u -r1.217 -r1.218 src/build.sh

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Takehiko NOZAKI <takehiko.nozaki@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: PR/42320 CVS commit: src/dist/nawk
Date: Sun, 22 Nov 2009 02:26:35 +0900

 hi, all

 >  So what should we do? Reset it after the program is parsed? What does
 >  gawk do?
 >
 >  christos
 >
 >

 this bug was introduced by merging the nawk-20030729 branch.

 in main.c rev1.3, setlocale(LC_ALL, "") at line 116 was added as a
 NetBSD local change.

 but nawk-20030729 had its own setlocale(LC_NUMERIC, "C") call at line 107
 (this makes the decimal point '.').

 after merging nawk-20030729 in main.c rev1.4, the duplicated
 setlocale(LC_ALL, "") call at line 116 overwrites LC_NUMERIC, so the
 decimal point is set to the locale-specific character.

 i think the following patch fixes this PR.

 Index: main.c
 ===================================================================
 RCS file: /cvsroot/src/dist/nawk/main.c,v
 retrieving revision 1.10
 diff -u -r1.10 main.c
 --- main.c	15 Nov 2009 21:56:06 -0000	1.10
 +++ main.c	21 Nov 2009 15:11:34 -0000
 @@ -101,9 +101,8 @@
  int main(int argc, char *argv[])
  {
  	const char *fs = NULL;
 -	struct lconv *lconv;

 -	setlocale(LC_CTYPE, "");
 +	setlocale(LC_ALL, "");
  	setlocale(LC_NUMERIC, "C"); /* for parsing cmdline & prog */
  	cmdname = argv[0];
  	if (argc == 1) {
 @@ -112,12 +111,6 @@
  		  cmdname);
  		exit(1);
  	}
 -
 -	(void) setlocale(LC_ALL, "");
 -	lconv =3D localeconv();
 -	lconv->decimal_point =3D ".";
 -
 -
  #ifdef SA_SIGINFO
  	{
  		struct sigaction sa;



 very truly yours.
 -- 
 Takehiko NOZAKI<takehiko.nozaki@gmail.com>

From: Christos Zoulas <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/42320 CVS commit: src/dist/nawk
Date: Sat, 21 Nov 2009 12:57:09 -0500

 Module Name:	src
 Committed By:	christos
 Date:		Sat Nov 21 17:57:09 UTC 2009

 Modified Files:
 	src/dist/nawk: main.c

 Log Message:
 Better fix for PR/42320 by Takehiko NOZAKI.


 To generate a diff of this commit:
 cvs rdiff -u -r1.10 -r1.11 src/dist/nawk/main.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Takehiko NOZAKI <takehiko.nozaki@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: PR/42320 CVS commit: src/dist/nawk
Date: Sun, 22 Nov 2009 02:33:42 +0900

 ...and one more thing: don't modify the struct lconv returned by localeconv(3).

 see RETURN VALUE section:

 http://www.opengroup.org/onlinepubs/000095399/functions/localeconv.html

 The localeconv() function shall return a pointer to the filled-in object.
 The application shall not modify the structure pointed to by the return value
 which may be overwritten by a subsequent call to localeconv(). In addition,
 calls to setlocale() with the categories LC_ALL , LC_MONETARY , or LC_NUMERIC
 may overwrite the contents of the structure.


 --
 Takehiko NOZAKI<takehiko.nozaki@gmail.com>

From: Alexander Nasonov <alnsn@yandex.ru>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/42320: LC_NUMERIC in awk is not POSIX  compliant
Date: Wed, 22 Sep 2010 01:48:15 +0100

 Please add the atf test for this issue:

 $ cat tests/util/awk/period.awk
 {print $1 + 0.01}
 $ cat tests/util/awk/period.in
 0,02
 $ cat tests/util/awk/period.out
 0,03
 $ cvs diff tests/util/awk/t_awk.sh
 Index: tests/util/awk/t_awk.sh
 ===================================================================
 RCS file: /cvsroot/src/tests/util/awk/t_awk.sh,v
 retrieving revision 1.2
 diff -u -r1.2 t_awk.sh
 --- tests/util/awk/t_awk.sh	4 Jun 2010 08:39:41 -0000	1.2
 +++ tests/util/awk/t_awk.sh	22 Sep 2010 00:42:16 -0000
 @@ -86,10 +86,25 @@
  	h_check toupper
  }

 +atf_test_case period
 +period_head()
 +{
 +	atf_set "descr" "Checks that the period character is recognised" \
 +	                "in awk program regardless of locale (bin/42320)"
 +	atf_set "use.fs" "true"
 +}
 +period_body()
 +{
 +	export LANG=ru_RU.KOI8-R
 +
 +	h_check period
 +}
 +
  atf_init_test_cases()
  {
  	atf_add_test_case big_regexp
  	atf_add_test_case end
  	atf_add_test_case string1
  	atf_add_test_case multibyte
 +	atf_add_test_case period
  }

From: Alexander Nasonov <alnsn@yandex.ru>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, alnsn@yandex.ru
Subject: Re: bin/42320: LC_NUMERIC in awk is not POSIX  compliant
Date: Fri, 24 Sep 2010 23:59:38 +0100

 >  Please add the atf test for this issue:

 I'd like to enhance the test and check the handling of assignments in
 command-line arguments:

 $ cat tests/util/awk/period.awk
 {print x + $1 + 0.01}
 $ cat tests/util/awk/period.in
 0,02
 $ cat tests/util/awk/period.out
 0,04
 $ cvs diff tests/util/awk/t_awk.sh
 Index: tests/util/awk/t_awk.sh
 ===================================================================
 RCS file: /cvsroot/src/tests/util/awk/t_awk.sh,v
 retrieving revision 1.2
 diff -u -r1.2 t_awk.sh
 --- tests/util/awk/t_awk.sh	4 Jun 2010 08:39:41 -0000	1.2
 +++ tests/util/awk/t_awk.sh	24 Sep 2010 22:51:30 -0000
 @@ -27,10 +27,12 @@

  h_check()
  {
 +	local fname=d_$1
  	for sfx in in out awk; do
 -		cp -r $(atf_get_srcdir)/d_$1.$sfx .
 +		cp -r $(atf_get_srcdir)/$fname.$sfx .
  	done
 -	atf_check -o file:d_$1.out -x "awk -f d_$1.awk < d_$1.in"
 +	shift 1
 +	atf_check -o file:$fname.out -x "awk $@ -f $fname.awk < $fname.in"
  }

  atf_test_case big_regexp
 @@ -86,10 +88,25 @@
  	h_check toupper
  }

 +atf_test_case period
 +period_head()
 +{
 +	atf_set "descr" "Checks that the period character is recognised" \
 +	                "in awk program regardless of locale (bin/42320)"
 +	atf_set "use.fs" "true"
 +}
 +period_body()
 +{
 +	export LANG=ru_RU.KOI8-R
 +
 +	h_check period -v x=0.01
 +}
 +
  atf_init_test_cases()
  {
  	atf_add_test_case big_regexp
  	atf_add_test_case end
  	atf_add_test_case string1
  	atf_add_test_case multibyte
 +	atf_add_test_case period
  }

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: bin/42320: LC_NUMERIC in awk is not POSIX  compliant
Date: Wed, 3 Nov 2010 04:56:04 +0000

 On Sun, Nov 15, 2009 at 11:20:00AM +0000, alnsn@yandex.ru wrote:
  > [...]

 As noted in 44013 (which I just closed as a duplicate) this problem
 also affects -5, so the fixes should be pulled up.

 -- 
 David A. Holland
 dholland@netbsd.org

From: "Alexander Nasonov" <alnsn@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/42320 CVS commit: src/tests/util/awk
Date: Thu, 28 Apr 2011 23:28:23 +0000

 Module Name:	src
 Committed By:	alnsn
 Date:		Thu Apr 28 23:28:23 UTC 2011

 Modified Files:
 	src/tests/util/awk: t_awk.sh
 Added Files:
 	src/tests/util/awk: d_period.awk d_period.in d_period.out

 Log Message:
 Test for PR bin/42320.


 To generate a diff of this commit:
 cvs rdiff -u -r0 -r1.1 src/tests/util/awk/d_period.awk \
     src/tests/util/awk/d_period.in src/tests/util/awk/d_period.out
 cvs rdiff -u -r1.3 -r1.4 src/tests/util/awk/t_awk.sh

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Alexander Nasonov" <alnsn@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/42320 CVS commit: src/tests/util/awk
Date: Sat, 30 Apr 2011 01:10:08 +0000

 Module Name:	src
 Committed By:	alnsn
 Date:		Sat Apr 30 01:10:08 UTC 2011

 Modified Files:
 	src/tests/util/awk: t_awk.sh

 Log Message:
 PR/42320 doesn't seem to be fixed, mark the failure as expected.


 To generate a diff of this commit:
 cvs rdiff -u -r1.4 -r1.5 src/tests/util/awk/t_awk.sh

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: alnsn@NetBSD.org
State-Changed-When: Sun, 27 Mar 2016 22:11:23 +0000
State-Changed-Why:
Fixed a long time ago.


>Unformatted:
