NetBSD Problem Report #42320
From www@NetBSD.org Sun Nov 15 11:19:03 2009
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id E5F1B63B8CD
for <gnats-bugs@gnats.netbsd.org>; Sun, 15 Nov 2009 11:19:03 +0000 (UTC)
Message-Id: <20091115111903.3712C63B844@www.NetBSD.org>
Date: Sun, 15 Nov 2009 11:19:03 +0000 (UTC)
From: alnsn@yandex.ru
Reply-To: alnsn@yandex.ru
To: gnats-bugs@NetBSD.org
Subject: LC_NUMERIC in awk is not POSIX compliant
X-Send-Pr-Version: www-1.0
>Number: 42320
>Category: bin
>Synopsis: LC_NUMERIC in awk is not POSIX compliant
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: bin-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Nov 15 11:20:00 +0000 2009
>Closed-Date: Sun Mar 27 22:11:23 +0000 2016
>Last-Modified: Sun Mar 27 22:11:23 +0000 2016
>Originator: Alexander Nasonov
>Release: NetBSD 5.99.22
>Organization:
>Environment:
$ uname -a
NetBSD aa1nb.lan 5.99.22 NetBSD 5.99.22 (MONOLITHIC) #0: Sat Nov 14 16:49:42 GMT 2009 root@aa1nb.lan:/home/alnsn/src/netbsd-current/src/sys/arch/i386/compile/obj/MONOLITHIC i386
>Description:
awk doesn't recognise the period character if ${LC_NUMERIC} is set to a locale whose decimal-point character is a comma. The POSIX spec has a special case for this situation:
http://www.opengroup.org/onlinepubs/7990989775/xcu/awk.html
LC_NUMERIC
Determine the radix character used when interpreting numeric input, performing conversions between numeric and string values and formatting numeric output. Regardless of locale, the period character (the decimal-point character of the POSIX locale) is the decimal-point character recognised in processing awk programs (including assignments in command-line arguments).
>How-To-Repeat:
$ LC_NUMERIC=ru_RU.KOI8-R /usr/bin/awk '{print 0.01}'
/usr/bin/awk: syntax error at source line 1
context is
{print >>> 0.01 <<<
/usr/bin/awk: illegal statement at source line 1
>Fix:
>Release-Note:
>Audit-Trail:
From: Christos Zoulas <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/42320 CVS commit: src/dist/nawk
Date: Sun, 15 Nov 2009 16:56:06 -0500
Module Name: src
Committed By: christos
Date: Sun Nov 15 21:56:06 UTC 2009
Modified Files:
src/dist/nawk: main.c
Log Message:
PR/42320: Alexander Nasonov: According to:
http://www.opengroup.org/onlinepubs/7990989775/xcu/awk.html
the LC_NUMERIC decimal point recognized is always period. Make it so.
To generate a diff of this commit:
cvs rdiff -u -r1.9 -r1.10 src/dist/nawk/main.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: David Laight <david@l8s.co.uk>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, alnsn@yandex.ru
Subject: Re: PR/42320 CVS commit: src/dist/nawk
Date: Sun, 15 Nov 2009 22:11:27 +0000
On Sun, Nov 15, 2009 at 10:00:06PM +0000, Christos Zoulas wrote:
> The following reply was made to PR bin/42320; it has been noted by GNATS.
...
> Modified Files:
> src/dist/nawk: main.c
>
> Log Message:
> PR/42320: Alexander Nasonov: According to:
> http://www.opengroup.org/onlinepubs/7990989775/xcu/awk.html
> the LC_NUMERIC decimal point recognized is always period. Make it so.
I'm not sure this DTRT, the above only refers to LC_NUMERIC in
awk program/script files, not in the data files being processed.
Also you can't force the decimal point to be '.' without ensuring
that the 1000s separator isn't also '.'.
David
--
David Laight: david@l8s.co.uk
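The second concern above (a forced '.' radix colliding with a locale whose thousands separator is also '.') can be illustrated with a small check. This is only a sketch: radix_clashes_with_grouping() is a hypothetical helper, not something taken from nawk, and it assumes the caller passes in the struct lconv it wants inspected.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/*
 * Hypothetical helper (not from nawk): returns 1 if treating `radix` as
 * the decimal point would clash with the locale's thousands separator,
 * 0 otherwise. An empty thousands_sep means the locale does no grouping,
 * so there is nothing to clash with.
 */
int
radix_clashes_with_grouping(const struct lconv *lc, const char *radix)
{
	assert(lc != NULL && radix != NULL);
	if (lc->thousands_sep == NULL || lc->thousands_sep[0] == '\0')
		return 0;
	return strcmp(lc->thousands_sep, radix) == 0;
}
```

A caller would typically pass localeconv() and "."; when LC_NUMERIC is "C" the thousands separator is empty, so the check reports no clash.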
From: christos@zoulas.com (Christos Zoulas)
To: David Laight <david@l8s.co.uk>, gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, alnsn@yandex.ru
Subject: Re: PR/42320 CVS commit: src/dist/nawk
Date: Sun, 15 Nov 2009 17:22:54 -0500
On Nov 15, 10:11pm, david@l8s.co.uk (David Laight) wrote:
-- Subject: Re: PR/42320 CVS commit: src/dist/nawk
| On Sun, Nov 15, 2009 at 10:00:06PM +0000, Christos Zoulas wrote:
| > The following reply was made to PR bin/42320; it has been noted by GNATS.
| ...
| > Modified Files:
| > src/dist/nawk: main.c
| >
| > Log Message:
| > PR/42320: Alexander Nasonov: According to:
| > http://www.opengroup.org/onlinepubs/7990989775/xcu/awk.html
| > the LC_NUMERIC decimal point recognized is always period. Make it so.
|
| I'm not sure this DTRT, the above only refers to LC_NUMERIC in
| awk program/script files, not in the data files being processed.
|
| Also you can't force the decimal point to be '.' without ensuring
| that the 1000s separator isn't also '.'.
So what should we do? Reset it after the program is parsed? What does
gawk do?
christos
From: Alan Barrett <apb@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/42320 CVS commit: src
Date: Tue, 17 Nov 2009 20:49:34 +0000
Module Name: src
Committed By: apb
Date: Tue Nov 17 20:49:34 UTC 2009
Modified Files:
src: build.sh
Log Message:
Set LC_ALL=C before we try to parse the output from any command.
This will ensure that awk is not invoked in a way that tickles
the bug described in PR 42320.
To generate a diff of this commit:
cvs rdiff -u -r1.217 -r1.218 src/build.sh
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Takehiko NOZAKI <takehiko.nozaki@gmail.com>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: PR/42320 CVS commit: src/dist/nawk
Date: Sun, 22 Nov 2009 02:26:35 +0900
hi, all
> So what should we do? Reset it after the program is parsed? What does
> gawk do?
>
> christos
>
>
This bug was introduced by merging the nawk-20030729 branch.
In main.c rev 1.3, setlocale(LC_ALL, "") at line 116 was added as a
NetBSD local change.
But nawk-20030729 had its own setlocale(LC_NUMERIC, "C") call at line 107
(this makes the decimal point '.').
After merging nawk-20030729 in main.c rev 1.4, the duplicated
setlocale(LC_ALL, "") call at line 116 overwrites LC_NUMERIC, so the
decimal point becomes the locale-specific character.
I think the following patch fixes this PR.
Index: main.c
===================================================================
RCS file: /cvsroot/src/dist/nawk/main.c,v
retrieving revision 1.10
diff -u -r1.10 main.c
--- main.c 15 Nov 2009 21:56:06 -0000 1.10
+++ main.c 21 Nov 2009 15:11:34 -0000
@@ -101,9 +101,8 @@
int main(int argc, char *argv[])
{
const char *fs = NULL;
- struct lconv *lconv;
- setlocale(LC_CTYPE, "");
+ setlocale(LC_ALL, "");
setlocale(LC_NUMERIC, "C"); /* for parsing cmdline & prog */
cmdname = argv[0];
if (argc == 1) {
@@ -112,12 +111,6 @@
cmdname);
exit(1);
}
-
- (void) setlocale(LC_ALL, "");
- lconv =3D localeconv();
- lconv->decimal_point =3D ".";
-
-
#ifdef SA_SIGINFO
{
struct sigaction sa;
very truly yours.
--
Takehiko NOZAKI<takehiko.nozaki@gmail.com>
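The call ordering the patch above relies on can be sketched in isolation. This is a minimal illustration, not the nawk source: init_locale_for_parsing() is a hypothetical name, and the only point being shown is that the LC_NUMERIC call has to come after the LC_ALL call, because setlocale(LC_ALL, "") resets every category including LC_NUMERIC.

```c
#include <assert.h>
#include <locale.h>

/*
 * Hypothetical helper (not part of nawk): initialise the locale in the
 * order used by the patch and report the resulting decimal point.
 */
char
init_locale_for_parsing(void)
{
	const struct lconv *lc;

	/* Adopt the user's environment for every category first... */
	(void)setlocale(LC_ALL, "");
	/* ...then pin LC_NUMERIC; this call must come last to take effect. */
	(void)setlocale(LC_NUMERIC, "C");

	lc = localeconv();
	assert(lc != NULL && lc->decimal_point != NULL);
	return lc->decimal_point[0];
}
```

With this ordering the function should return '.' whatever LC_NUMERIC or LANG is set to in the environment; swapping the two setlocale() calls would let the environment's radix character win again.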
From: Christos Zoulas <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/42320 CVS commit: src/dist/nawk
Date: Sat, 21 Nov 2009 12:57:09 -0500
Module Name: src
Committed By: christos
Date: Sat Nov 21 17:57:09 UTC 2009
Modified Files:
src/dist/nawk: main.c
Log Message:
Better fix for PR/42320 by Takehiko NOZAKI.
To generate a diff of this commit:
cvs rdiff -u -r1.10 -r1.11 src/dist/nawk/main.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Takehiko NOZAKI <takehiko.nozaki@gmail.com>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: PR/42320 CVS commit: src/dist/nawk
Date: Sun, 22 Nov 2009 02:33:42 +0900
...and one more, don't modify the struct lconv returned by localeconv(3).
see RETURN VALUE section:
http://www.opengroup.org/onlinepubs/000095399/functions/localeconv.html
The localeconv() function shall return a pointer to the filled-in object.
The application shall not modify the structure pointed to by the return value
which may be overwritten by a subsequent call to localeconv(). In addition,
calls to setlocale() with the categories LC_ALL , LC_MONETARY , or LC_NUMERIC
may overwrite the contents of the structure.
--
Takehiko NOZAKI<takehiko.nozaki@gmail.com>
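One way to respect that rule, if a program needs to keep the radix character around, is to copy it out of the returned structure instead of writing into it. The following is a sketch only; copy_decimal_point() is a hypothetical helper and not something nawk is known to contain.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the current radix character (which may be a
 * multi-byte string) into a caller-owned buffer, leaving the structure
 * returned by localeconv() untouched.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int
copy_decimal_point(char *buf, size_t buflen)
{
	const struct lconv *lc = localeconv();
	size_t len;

	assert(buf != NULL);
	len = strlen(lc->decimal_point);
	if (len + 1 > buflen)
		return -1;
	memcpy(buf, lc->decimal_point, len + 1);	/* include the NUL */
	return 0;
}
```

The copy stays valid across later setlocale() or localeconv() calls, which the quoted text says may overwrite the original structure.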
From: Alexander Nasonov <alnsn@yandex.ru>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/42320: LC_NUMERIC in awk is not POSIX compliant
Date: Wed, 22 Sep 2010 01:48:15 +0100
Please add the atf test for this issue:
$ cat tests/util/awk/period.awk
{print $1 + 0.01}
$ cat tests/util/awk/period.in
0,02
$ cat tests/util/awk/period.out
0,03
$ cvs diff tests/util/awk/t_awk.sh
Index: tests/util/awk/t_awk.sh
===================================================================
RCS file: /cvsroot/src/tests/util/awk/t_awk.sh,v
retrieving revision 1.2
diff -u -r1.2 t_awk.sh
--- tests/util/awk/t_awk.sh 4 Jun 2010 08:39:41 -0000 1.2
+++ tests/util/awk/t_awk.sh 22 Sep 2010 00:42:16 -0000
@@ -86,10 +86,25 @@
h_check toupper
}
+atf_test_case period
+period_head()
+{
+ atf_set "descr" "Checks that the period character is recognised" \
+ "in awk program regardless of locale (bin/42320)"
+ atf_set "use.fs" "true"
+}
+period_body()
+{
+ export LANG=ru_RU.KOI8-R
+
+ h_check period
+}
+
atf_init_test_cases()
{
atf_add_test_case big_regexp
atf_add_test_case end
atf_add_test_case string1
atf_add_test_case multibyte
+ atf_add_test_case period
}
From: Alexander Nasonov <alnsn@yandex.ru>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, alnsn@yandex.ru
Subject: Re: bin/42320: LC_NUMERIC in awk is not POSIX compliant
Date: Fri, 24 Sep 2010 23:59:38 +0100
> Please add the atf test for this issue:
I'd like to enhance the test and check the handling of assignments in
command-line arguments:
$ cat tests/util/awk/period.awk
{print x + $1 + 0.01}
$ cat tests/util/awk/period.in
0,02
$ cat tests/util/awk/period.out
0,04
$ cvs diff tests/util/awk/t_awk.sh
Index: tests/util/awk/t_awk.sh
===================================================================
RCS file: /cvsroot/src/tests/util/awk/t_awk.sh,v
retrieving revision 1.2
diff -u -r1.2 t_awk.sh
--- tests/util/awk/t_awk.sh 4 Jun 2010 08:39:41 -0000 1.2
+++ tests/util/awk/t_awk.sh 24 Sep 2010 22:51:30 -0000
@@ -27,10 +27,12 @@
h_check()
{
+ local fname=d_$1
for sfx in in out awk; do
- cp -r $(atf_get_srcdir)/d_$1.$sfx .
+ cp -r $(atf_get_srcdir)/$fname.$sfx .
done
- atf_check -o file:d_$1.out -x "awk -f d_$1.awk < d_$1.in"
+ shift 1
+ atf_check -o file:$fname.out -x "awk $@ -f $fname.awk < $fname.in"
}
atf_test_case big_regexp
@@ -86,10 +88,25 @@
h_check toupper
}
+atf_test_case period
+period_head()
+{
+ atf_set "descr" "Checks that the period character is recognised" \
+ "in awk program regardless of locale (bin/42320)"
+ atf_set "use.fs" "true"
+}
+period_body()
+{
+ export LANG=ru_RU.KOI8-R
+
+ h_check period -v x=0.01
+}
+
atf_init_test_cases()
{
atf_add_test_case big_regexp
atf_add_test_case end
atf_add_test_case string1
atf_add_test_case multibyte
+ atf_add_test_case period
}
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: bin/42320: LC_NUMERIC in awk is not POSIX compliant
Date: Wed, 3 Nov 2010 04:56:04 +0000
On Sun, Nov 15, 2009 at 11:20:00AM +0000, alnsn@yandex.ru wrote:
> [...]
As noted in 44013 (which I just closed as a duplicate) this problem
also affects -5, so the fixes should be pulled up.
--
David A. Holland
dholland@netbsd.org
From: "Alexander Nasonov" <alnsn@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/42320 CVS commit: src/tests/util/awk
Date: Thu, 28 Apr 2011 23:28:23 +0000
Module Name: src
Committed By: alnsn
Date: Thu Apr 28 23:28:23 UTC 2011
Modified Files:
src/tests/util/awk: t_awk.sh
Added Files:
src/tests/util/awk: d_period.awk d_period.in d_period.out
Log Message:
Test for PR bin/42320.
To generate a diff of this commit:
cvs rdiff -u -r0 -r1.1 src/tests/util/awk/d_period.awk \
src/tests/util/awk/d_period.in src/tests/util/awk/d_period.out
cvs rdiff -u -r1.3 -r1.4 src/tests/util/awk/t_awk.sh
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Alexander Nasonov" <alnsn@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/42320 CVS commit: src/tests/util/awk
Date: Sat, 30 Apr 2011 01:10:08 +0000
Module Name: src
Committed By: alnsn
Date: Sat Apr 30 01:10:08 UTC 2011
Modified Files:
src/tests/util/awk: t_awk.sh
Log Message:
PR/42320 doesn't seem to be fixed, mark the failure as expected.
To generate a diff of this commit:
cvs rdiff -u -r1.4 -r1.5 src/tests/util/awk/t_awk.sh
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: alnsn@NetBSD.org
State-Changed-When: Sun, 27 Mar 2016 22:11:23 +0000
State-Changed-Why:
Fixed a long time ago.
>Unformatted: