NetBSD Problem Report #59332
From dholland@netbsd.org Sun Apr 20 03:41:46 2025
Return-Path: <dholland@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
client-signature RSA-PSS (2048 bits) client-digest SHA256)
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 58E6B1A9239
for <gnats-bugs@gnats.NetBSD.org>; Sun, 20 Apr 2025 03:41:46 +0000 (UTC)
Message-Id: <20250420034145.AB64B8557B@mail.netbsd.org>
Date: Sun, 20 Apr 2025 03:41:45 +0000 (UTC)
From: dholland@NetBSD.org
Reply-To: dholland@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: awk handling of NaN comparisons
X-Send-Pr-Version: 3.95
>Number: 59332
>Category: standards
>Synopsis: awk handling of NaN comparisons
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: standards-manager
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Apr 20 03:45:00 +0000 2025
>Last-Modified: Mon Apr 21 07:20:00 +0000 2025
>Originator: David A. Holland
>Release: NetBSD 10.99.10 (20240219)
>Organization:
crying on the mountainside
>Environment:
System: NetBSD valkyrie 10.99.10 NetBSD 10.99.10 (VALKYRIE2) #0: Mon Feb 19 02:27:48 EST 2024 dholland@valkyrie:/y/objects/usrobj/sys/arch/amd64/compile/VALKYRIE2 amd64
Architecture: x86_64
Machine: amd64
>Description:
awk recognizes "NaN":
% echo foo | awk '{ x = "NaN" + 1; print x; }'
nan
but NaNs are supposed to be unequal to themselves:
% echo foo | awk '{ x = "NaN" + 1; print (x == x); }'
1
% echo foo | awk '{ x = "NaN" + 1; print (x != x); }'
0
% echo foo | awk '{ x = "NaN" + 1; print (x <= x); }'
1
>How-To-Repeat:
As above
>Fix:
probably avoid premature optimization
>Audit-Trail:
From: RVP <rvp@SDF.ORG>
To: gnats-bugs@netbsd.org
Cc: kre@netbsd.org
Subject: Re: standards/59332: awk handling of NaN comparisons
Date: Mon, 21 Apr 2025 07:19:01 +0000 (UTC)
On Sun, 20 Apr 2025, dholland@NetBSD.org wrote:
> awk recognizes "NaN":
>
> % echo foo | awk '{ x = "NaN" + 1; print x; }'
> nan
>
You have an old awk. Christos updated it last year:
```
$ fgrep awk /usr/src/doc/CHANGES
awk(1): Import 20240817 [christos 20240817]
$
```
which now gives:
```
$ awk --version
awk version 20240728
;; note that this must be exactly one of: `+nan', `-nan', `+inf' or `-inf'
;; (case-insensitive) to be recognized.
;;
$ awk 'BEGIN { x = "+NaN" + 0; print x; print (x == x); print (x != x); print (x <= x); }'
+nan
0
1
0
$
```
> but NaNs are supposed to be unequal to themselves:
>
> % echo foo | awk '{ x = "NaN" + 1; print (x == x); }'
> 1
> % echo foo | awk '{ x = "NaN" + 1; print (x != x); }'
> 0
> % echo foo | awk '{ x = "NaN" + 1; print (x <= x); }'
> 1
>
I'm not so sure that's invalid, going by what POSIX[1] seems to say (CC: kre@):
```
Historical implementations of awk did not support floating-point
infinities and NaNs in numeric strings; e.g., "-INF" and "NaN".
However, implementations that use the atof() or strtod() functions
to do the conversion picked up support for these values if they
used a ISO/IEC 9899:1999 standard version of the function instead
of a ISO/IEC 9899:1990 standard version. Due to an oversight, the
2001 through 2004 editions of this standard did not allow support
for infinities and NaNs, but in this revision support is allowed
(but not required). This is a silent change to the behavior of awk
programs; for example, in the POSIX locale the expression:
("-INF" + 0 < 0)
formerly had the value 0 because "-INF" converted to 0, but now it
may have the value 0 or 1.
```
[1]: https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/utilities/awk.html
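The expression from that POSIX text can be run directly against whichever awk is installed. A minimal sketch, assuming a POSIX sh and an `awk` on the PATH; the function name `inf_probe` is made up here for illustration and is not part of any awk distribution:

```shell
# Hypothetical helper: evaluate the expression from the POSIX rationale
# with the given awk binary (default: whatever "awk" resolves to).
inf_probe() {
    "${1:-awk}" 'BEGIN { print ("-INF" + 0 < 0) }'
}

# Prints 0 if "-INF" converts to 0, or 1 if it converts to negative infinity.
# POSIX permits either result.
inf_probe
```

Passing a path as the first argument (for example one of the locally built binaries used in this message) runs the same check against that implementation.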
Anyway, the latest GAWK prints:
```
$ /tmp/G/bin/gawk --version
GNU Awk 5.3.2, API 4.0
Copyright (C) 1989, 1991-2025 Free Software Foundation.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/.
;; GAWK accepts plain `NaN' only if you pass `--posix'.
;;
$ /tmp/G/bin/gawk 'BEGIN { x = "+NaN" + 0; print x; print (x == x); print (x != x); print (x <= x); }'
+nan
0
1
0
$
```
while the latest MAWK prints:
```
$ /tmp/M/bin/mawk --version
mawk 1.3.4 20250131
Copyright 2008-2024,2025, Thomas E. Dickey
Copyright 1991-1996,2014, Michael D. Brennan
random-funcs: srandom/random
regex-funcs: internal
compiled limits:
sprintf buffer 8192
maximum-integer 9223372036854775808
;; MAWK needs `-Wposix' to recognize any NaNs, then coerces them all to +NaN,
;; and then gets it all wrong anyway. (It also accepts `nancy' as +NaN.)
;;
$ /tmp/M/bin/mawk -Wposix 'BEGIN { x = "+NaN" + 0; print x; print (x == x); print (x != x); print (x <= x); }'
+nan
1
0
1
$
```
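The three transcripts above run the same one-liner by hand. The check could be wrapped up roughly as follows; this is only a sketch, assuming a POSIX sh, and the name `nan_probe` and its labels (`no-nan`, `ieee`, `equal`, `mixed`) are invented for illustration:

```shell
# Hypothetical helper: classify how an awk binary compares a NaN with itself.
# Usage: nan_probe [awk-binary [single-extra-option]]
nan_probe() {
    "${1:-awk}" ${2:+"$2"} 'BEGIN {
        x = "+nan" + 0
        # Detect NaN from the formatted value rather than with x == 0,
        # because that comparison is exactly what may be implemented wrongly.
        if (tolower(sprintf("%f", x)) !~ /nan/) { print "no-nan"; exit }
        eq = (x == x); ne = (x != x); le = (x <= x)
        if (!eq && ne && !le)      print "ieee"    # IEEE 754 behaviour
        else if (eq && !ne && le)  print "equal"   # NaN treated as equal to itself
        else                       print "mixed"   # some other combination
    }'
}

nan_probe    # classify the default awk
```

Going by the output shown above, `nan_probe /tmp/G/bin/gawk` should report `ieee` and `nan_probe /tmp/M/bin/mawk -Wposix` should report `equal`, but that is inferred from the transcripts and has not been run against those builds.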
-RVP
Copyright © 1994-2025
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.