NetBSD Problem Report #56148

From gson@gson.org  Thu May  6 07:26:18 2021
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 3834F1A9294
	for <gnats-bugs@gnats.NetBSD.org>; Thu,  6 May 2021 07:26:18 +0000 (UTC)
Message-Id: <20210506072609.2B6592541D3@guava.gson.org>
Date: Thu,  6 May 2021 10:26:09 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: lib/libc/stdio/t_printf:snprintf_float test randomly fails
X-Send-Pr-Version: 3.95

>Number:         56148
>Category:       lib
>Synopsis:       lib/libc/stdio/t_printf:snprintf_float test randomly fails
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    lib-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu May 06 07:30:00 +0000 2021
>Last-Modified:  Sun May 09 18:15:01 +0000 2021
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current, also -9
>Organization:

>Environment:
System: NetBSD
Architecture: x86_64
Machine: amd64
>Description:

The snprintf_float test case of the lib/libc/stdio/t_printf test
program randomly fails with a SIGSEGV once in a few thousand runs.

One such failure was recorded by the new NetBSD-9/amd64 testbed:

  http://releng.netbsd.org/b5reports/amd64-9/2021/2021.04.30.13.54.00/test.html#lib_libc_stdio_t_printf_snprintf_float

I have reproduced this in -current and on real amd64 hardware.

>How-To-Repeat:

cd /usr/tests/lib/libc/stdio
while ./t_printf snprintf_float; do true; done >log 2>&1
tail log

This typically takes more than a minute but less than an hour to fail.

>Fix:

>Audit-Trail:
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56148 CVS commit: src/lib/libc/gdtoa
Date: Thu, 6 May 2021 12:15:33 -0400

 Module Name:	src
 Committed By:	christos
 Date:		Thu May  6 16:15:33 UTC 2021

 Modified Files:
 	src/lib/libc/gdtoa: dtoa.c gdtoa.c strtoIg.c strtod.c strtodg.c

 Log Message:
 PR/56148: Andreas Gustafsson: lib/libc/stdio/t_printf:snprintf_float test
 randomly fails.
 Add checks to all places where lshift is called because it can return NULL


 To generate a diff of this commit:
 cvs rdiff -u -r1.10 -r1.11 src/lib/libc/gdtoa/dtoa.c
 cvs rdiff -u -r1.7 -r1.8 src/lib/libc/gdtoa/gdtoa.c
 cvs rdiff -u -r1.4 -r1.5 src/lib/libc/gdtoa/strtoIg.c
 cvs rdiff -u -r1.17 -r1.18 src/lib/libc/gdtoa/strtod.c
 cvs rdiff -u -r1.12 -r1.13 src/lib/libc/gdtoa/strtodg.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
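
The pattern described in the log message can be sketched as follows. This is a minimal illustration, not the gdtoa code: struct big, big_from_u32(), big_lshift(), shifted_low_word() and the injectable allocator are all hypothetical names introduced here. The point is only that a shift routine which allocates its result can return NULL, so every caller has to check before dereferencing.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef void *(*alloc_fn)(size_t);   /* hypothetical: injectable allocator */

struct big {                         /* hypothetical bignum, not gdtoa's Bigint */
    size_t len;                      /* number of 32-bit words in use */
    uint32_t w[];                    /* least significant word first */
};

static struct big *big_from_u32(uint32_t v, alloc_fn alloc)
{
    struct big *b = alloc(sizeof *b + sizeof b->w[0]);
    if (b == NULL)
        return NULL;
    b->len = 1;
    b->w[0] = v;
    return b;
}

/* Return a newly allocated copy of b shifted left by k bits,
 * or NULL if the allocation fails. */
static struct big *big_lshift(const struct big *b, unsigned k, alloc_fn alloc)
{
    size_t words = k / 32;
    unsigned bits = k % 32;
    size_t len = b->len + words + 1;
    struct big *r = alloc(sizeof *r + len * sizeof r->w[0]);
    if (r == NULL)
        return NULL;
    memset(r->w, 0, len * sizeof r->w[0]);
    for (size_t i = 0; i < b->len; i++) {
        uint64_t t = (uint64_t)b->w[i] << bits;
        r->w[i + words] |= (uint32_t)t;
        r->w[i + words + 1] |= (uint32_t)(t >> 32);
    }
    while (len > 1 && r->w[len - 1] == 0)
        len--;                       /* drop leading zero words */
    r->len = len;
    return r;
}

/* Caller that checks the result of the shift before using it.
 * Stores the low word of v << k in *out and returns 0, or returns -1
 * if any allocation failed. */
static int shifted_low_word(uint32_t v, unsigned k, alloc_fn alloc,
    uint32_t *out)
{
    struct big *b = big_from_u32(v, alloc);
    if (b == NULL)
        return -1;
    struct big *r = big_lshift(b, k, alloc);
    free(b);
    if (r == NULL)                   /* without this check, r->w[0] would crash */
        return -1;
    *out = r->w[0];
    free(r);
    return 0;
}

/* Test allocator: succeeds except on the second call. */
static int alloc_calls;
static void *second_alloc_fails(size_t n)
{
    return ++alloc_calls == 2 ? NULL : malloc(n);
}
```

Passing malloc as the allocator gives the normal path; passing second_alloc_fails makes the shift itself fail, which is the case the caller has to survive.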

From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@netbsd.org
Cc: christos@NetBSD.org
Subject: Re: lib/56148: lib/libc/stdio/t_printf:snprintf_float test randomly fails
Date: Sun, 9 May 2021 19:37:33 +0300

 There are multiple layers of bugs here, and some of them are canceling
 each other out.

 First, there's the bug Christos already fixed that caused snprintf()
 to sometimes segfault when malloc() failed.

 Second, for that to happen, malloc() had to fail in the first place.
 The intent of the test seems to be to detect memory leaks by running
 out of memory only when there is a leak, but it's actually running out
 of memory many times in each run, often even in the first call to
 snprintf() when nothing has had a chance to leak yet.  So either the
 test is buggy in that the limits it sets for itself using setrlimit()
 are unreasonably low, or if you think the limits are reasonable, there
 is some other bug causing malloc() to fail regardless.
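
One way to probe the second point in isolation is sketched below. This is not the t_printf code: run_limited() is a hypothetical helper, and using RLIMIT_AS with a caller-supplied limit is an assumption made for illustration. It runs the snprintf() calls in a child process so that the limit and any crash stay contained, and reports whether the calls succeeded, failed, or died on a signal.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `iterations` floating-point snprintf() calls
 * in a child whose address space is capped at `limit_bytes`.
 * Returns 0 if every call succeeded, 1 if any call returned a negative
 * value, -1 if the child was killed by a signal (e.g. SIGSEGV), and
 * -2 if the helper itself could not set things up.
 */
static int run_limited(rlim_t limit_bytes, int iterations)
{
    pid_t pid = fork();
    if (pid < 0)
        return -2;
    if (pid == 0) {
        struct rlimit rl;
        /* Call through a volatile pointer so the compiler cannot
         * assume anything about the return value. */
        int (*volatile fmt)(char *, size_t, const char *, ...) = snprintf;
        char buf[64];
        int failed = 0;

        if (getrlimit(RLIMIT_AS, &rl) != 0)
            _exit(2);
        /* The soft limit may not exceed the hard limit. */
        rl.rlim_cur = limit_bytes < rl.rlim_max ? limit_bytes : rl.rlim_max;
        if (setrlimit(RLIMIT_AS, &rl) != 0)
            _exit(2);
        for (int i = 0; i < iterations; i++) {
            if (fmt(buf, sizeof buf, "%.3f", i * 1.5 + 0.25) < 0)
                failed = 1;
        }
        _exit(failed);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -2;
    if (WIFSIGNALED(status))
        return -1;
    return WEXITSTATUS(status) == 2 ? -2 : WEXITSTATUS(status);
}
```

Calling run_limited() with progressively smaller limits would show at which point, if any, the conversions start to fail on a given system; a result of 1 or -1 on the very first iteration would match the "runs out of memory before anything can leak" observation.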

 Third, since the test is always running out of memory and is designed
 to fail when that happens, one might think it would always fail, but
 in actuality, it now never fails, at least not when compiled with the
 default optimization setting of -O2.  It does, however, fail if
 compiled with -O1.

 I believe this is because the test assumes that snprintf() will return
 -1 when malloc() fails, and the NetBSD implementation does behave that
 way, but this behavior is not allowed by C99 (the N1256 draft; I don't
 have a copy of the final standard), nor by POSIX; neither shows "out
 of memory" as a possible error for snprintf().  The gcc
 -fprintf-return-value optimization (which is enabled by default) then
 takes advantage of the knowledge that the call can't fail, computes
 the return value of snprintf() at compile time, and optimizes away the
 entire ATF_CHECK() call that would otherwise fail.
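
The effect can be illustrated with a small sketch; len_direct() and len_indirect() are hypothetical names, and whether the compiler actually folds the first call depends on the GCC version and flags. Both functions return the same value at run time; the difference is only in what the compiler is allowed to assume.

```c
#include <stddef.h>
#include <stdio.h>

/* Direct call with a constant format and argument.  With GCC's
 * -fprintf-return-value, the compiler may compute the result at
 * compile time, and a check such as "n < 0" on it may then be
 * removed as unreachable. */
static int len_direct(void)
{
    char buf[16];
    return snprintf(buf, sizeof buf, "%.1f", 0.5);
}

/* Same call through a volatile function pointer.  The compiler can no
 * longer tell which function is being called, so the return value has
 * to come from the library at run time. */
static int len_indirect(void)
{
    int (*volatile fmt)(char *, size_t, const char *, ...) = snprintf;
    char buf[16];
    return fmt(buf, sizeof buf, "%.1f", 0.5);
}
```

Building with -fno-printf-return-value is another way to keep the run-time check, which is consistent with the -O1 versus -O2 difference described above.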

 Here, I think the bug is that snprintf() is using malloc() at all.
 For reference, there is a similar bug in glibc:

   https://bugzilla.redhat.com/show_bug.cgi?id=441945

 -- 
 Andreas Gustafsson, gson@gson.org

From: Christos Zoulas <christos@zoulas.com>
To: gnats-bugs@netbsd.org
Cc: lib-bug-people@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org,
 Andreas Gustafsson <gson@gson.org>
Subject: Re: lib/56148: lib/libc/stdio/t_printf:snprintf_float test randomly
 fails
Date: Sun, 9 May 2021 13:27:39 -0400

 I am not aware of any libc implementations that don't allocate memory to
 do double-to-ascii conversions.
 From what I've seen Solaris does the best here calling malloc only in
 __big_float_times_power().
 I don't see us replacing gdtoa anytime soon, so we should probably
 document this in the snprintf() man pages.
 We could also file a bug report with opengroup/posix to make ENOMEM
 legal for floating point conversions.

 christos


From: Joerg Sonnenberger <joerg@bec.de>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@netbsd.org, lib-bug-people@netbsd.org,
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
	Andreas Gustafsson <gson@gson.org>
Subject: Re: lib/56148: lib/libc/stdio/t_printf:snprintf_float test randomly
 fails
Date: Sun, 9 May 2021 19:58:48 +0200

 On Sun, May 09, 2021 at 01:27:39PM -0400, Christos Zoulas wrote:
 > I am not aware of any libc implementations that don't allocate memory to do double-to-ascii conversions.
 > From what I've seen Solaris does the best here calling malloc only in __big_float_times_power().
 > I don't see us replacing gdtoa anytime soon, so we should probably document this in the snprintf() man pages.
 > We could also file a bug report with opengroup/posix to make ENOMEM legal for floating point conversions.

 It isn't even a question of replacing gdtoa, the alternatives will very,
 very likely run into the same problem that the power-of-five tables are
 huge if pre-computed or require large stack space or allocations if
 computed dynamically. That's the nature of the problem.

 Joerg
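
To put a rough size on the powers of five mentioned above: the smallest positive double, 2^-1074, equals 5^1074 / 10^1074, so printing it exactly involves 5^1074. The sketch below computes how many bits 5^n occupies by repeated multiplication in a small word array. pow5_bits() and its fixed 4096-bit capacity are assumptions made for illustration and have nothing to do with how gdtoa stores its bignums.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the number of bits needed to store 5^n,
 * or 0 if the value does not fit in the fixed 4096-bit scratch array.
 */
static unsigned pow5_bits(unsigned n)
{
    enum { WORDS = 128 };            /* 128 * 32 = 4096 bits of scratch */
    uint32_t w[WORDS] = { 1 };       /* least significant word first */
    size_t len = 1;

    for (unsigned i = 0; i < n; i++) {
        uint64_t carry = 0;
        for (size_t j = 0; j < len; j++) {
            uint64_t t = (uint64_t)w[j] * 5 + carry;
            w[j] = (uint32_t)t;
            carry = t >> 32;
        }
        if (carry != 0) {
            if (len == WORDS)
                return 0;            /* out of scratch space */
            w[len++] = (uint32_t)carry;
        }
    }

    /* Full words below the top one, plus the bits used in the top word. */
    unsigned bits = (unsigned)(len - 1) * 32;
    for (uint32_t top = w[len - 1]; top != 0; top >>= 1)
        bits++;
    return bits;
}
```

pow5_bits(1074) returns 2494, so a single value of that size takes about 312 bytes; a table of every power up to that exponent, or the same exercise for wider floating-point formats, grows accordingly.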

From: Christos Zoulas <christos@zoulas.com>
To: Joerg Sonnenberger <joerg@bec.de>
Cc: gnats-bugs@netbsd.org,
 lib-bug-people@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org,
 Andreas Gustafsson <gson@gson.org>
Subject: Re: lib/56148: lib/libc/stdio/t_printf:snprintf_float test randomly
 fails
Date: Sun, 9 May 2021 14:11:11 -0400

 > On May 9, 2021, at 1:58 PM, Joerg Sonnenberger <joerg@bec.de> wrote:

 > It isn't even a question of replacing gdtoa, the alternatives will very,
 > very likely run into the same problem that the power-of-five tables are
 > huge if pre-computed or require large stack space or allocations if
 > computed dynamically. That's the nature of the problem.

 I know; I mentioned gdtoa explicitly because it is using malloc for much
 more than that (its bignums, strings etc. -- grep alloc
 /usr/src/lib/libc/gdtoa/*.c).

 christos



Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.