NetBSD Problem Report #58453

From christos@astron.com  Mon Jul 22 15:53:53 2024
Return-Path: <christos@astron.com>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
	 client-signature RSA-PSS (2048 bits) client-digest SHA256)
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 894E51A9239
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 22 Jul 2024 15:53:53 +0000 (UTC)
Message-Id: <20240722155350.EE45A6C29A@quasar.astron.com>
Date: Mon, 22 Jul 2024 15:53:50 +0000 (UTC)
From: christos@astron.com
Reply-To: christos@astron.com
To: gnats-bugs@NetBSD.org
Subject: endptr can be unitialized if an invalid base is passed to strto*(3)
X-Send-Pr-Version: 3.95

>Number:         58453
>Category:       lib
>Synopsis:       endptr can be unitialized if an invalid base is passed to strto*(3)
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    kre
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Jul 22 15:55:00 +0000 2024
>Closed-Date:    Wed Jul 24 11:26:47 +0000 2024
>Last-Modified:  Wed Jul 24 11:26:47 +0000 2024
>Originator:     Christos Zoulas
>Release:        NetBSD 10.99.11
>Organization:
	Bogus Plans, Inc.
>Environment:
System: NetBSD quasar.astron.com 10.99.11 NetBSD 10.99.11 (QUASAR) #6: Wed Jul 10 03:34:01 EDT 2024 christos@quasar.astron.com:/usr/src/sys/arch/amd64/compile/QUASAR amd64
Architecture: x86_64
Machine: amd64
>Description:
	Bug report from Alejandro Colomar:
	https://lore.kernel.org/all/20240720190321.550121-2-alx@kernel.org/T/
>How-To-Repeat:

#include <sys/cdefs.h>
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include <limits.h>
#include <errno.h>

#ifndef __UNCONST
#define __UNCONST(a) (void *)(intptr_t)(a)
#endif

int
main(int argc, char *argv[])
{
	char *endp = __UNCONST("boo");
	errno = 0;
	intmax_t im = strtoimax(argv[1], &endp, -2);
	printf("%jd [%s] %d\n", im, endp, errno);
	return 0;
}

Run on linux, observe it print "0 [boo] 22\n"

>Fix:
https://mail-index.netbsd.org/source-changes/2024/07/21/msg152491.html

>Release-Note:

>Audit-Trail:
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org, christos@astron.com, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: lib/58453: endptr can be unitialized if an invalid base is passed to strto*(3)
Date: Tue, 23 Jul 2024 04:52:30 +0700

 What exactly is this bug report about?

 The referenced message from Alejandro Colomar is about about
 just strtoi() (which used strtoimax() incorrectly) and which
 you fixed.

 That the other strto*() functions don't leave any defined value
 in *endptr when an invalid base is provided (which is really just
 a bug in the application) isn't unexpected, as Alejandro's 2nd
 message indicates, and POSIX makes expicit:

 APPLICATION USAGE
             Since the value of *endptr is unspecified if the value of base
 	    is not supported, applications should either ensure that base
             has a supported value (0 or between 2 and 36) before the call,
 	    or check for an [EINVAL] error before examining *endptr.

 The test program provided in this PR is incorrect, as it fails to make that
 check (especially as it is explicitly providing an invalid base).

 kre

 ps: Note that this doesn't mean that we couldn't put something specific
 in *endptr in the case of an invalid base, but anything which ends up
 relying upon that would be broken.  Perhaps we could set it to NULL or
 (void *)-1 so any attempt to dereference it would fault, though technically
 simply referencing it is UB.


From: Christos Zoulas <christos@zoulas.com>
To: Robert Elz <kre@munnari.OZ.AU>,
 Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@netbsd.org,
 Christos Zoulas <christos@astron.com>,
 netbsd-bugs@netbsd.org
Subject: Re: lib/58453: endptr can be unitialized if an invalid base is passed
 to strto*(3)
Date: Mon, 22 Jul 2024 17:58:14 -0400

 > On Jul 22, 2024, at 5:52=E2=80=AFPM, Robert Elz <kre@munnari.OZ.AU> =
 wrote:
 >=20
 > What exactly is this bug report about?
 >=20
 > The referenced message from Alejandro Colomar is about about
 > just strtoi() (which used strtoimax() incorrectly) and which
 > you fixed.
 >=20
 > That the other strto*() functions don't leave any defined value
 > in *endptr when an invalid base is provided (which is really just
 > a bug in the application) isn't unexpected, as Alejandro's 2nd
 > message indicates, and POSIX makes expicit:
 >=20
 > APPLICATION USAGE
 >            Since the value of *endptr is unspecified if the value of =
 base
 > 	    is not supported, applications should either ensure that =
 base
 >            has a supported value (0 or between 2 and 36) before the =
 call,
 > 	    or check for an [EINVAL] error before examining *endptr.
 >=20
 > The test program provided in this PR is incorrect, as it fails to make =
 that
 > check (especially as it is explicitly providing an invalid base).
 >=20
 > kre
 >=20
 > ps: Note that this doesn't mean that we couldn't put something =
 specific
 > in *endptr in the case of an invalid base, but anything which ends up
 > relying upon that would be broken.  Perhaps we could set it to NULL or
 > (void *)-1 so any attempt to dereference it would fault, though =
 technically
 > simply referencing it is UB.

 Our implementation sets it to the beginning of the string. I know it is =
 allowed
 to UB when the base is invalid, but since out implementation initializes =
 it,
 the test works. I think that there is a disadvantage to adding the test =
 in that
 it enforces a behavior which is strictly not required. So if the tests =
 were used
 on Linux for example, they would fail.

 christos

From: Robert Elz <kre@munnari.OZ.AU>
To: Christos Zoulas <christos@zoulas.com>
Cc: Taylor R Campbell <riastradh@NetBSD.org>, gnats-bugs@netbsd.org,
        Christos Zoulas <christos@astron.com>, netbsd-bugs@netbsd.org
Subject: Re: lib/58453: endptr can be unitialized if an invalid base is passed to strto*(3)
Date: Tue, 23 Jul 2024 05:55:33 +0700

     Date:        Mon, 22 Jul 2024 17:58:14 -0400
     From:        Christos Zoulas <christos@zoulas.com>
     Message-ID:  <910773C7-98CA-4CB9-9114-33AE517E74AB@zoulas.com>

   | I think that there is a disadvantage to adding the test in that
   | it enforces a behavior which is strictly not required. So if the
   | tests were used on Linux for example, they would fail.

 I remain confused.   What test is that?   Or tests that would be used on
 linux?

 If this is about some ATF test (the PR doesn't even suggest that though)
 then it would depend upon the test, some of them are explicitly testing
 NetBSD behaviour, to ensure it isn't accidentally altered (breaking backwards
 compat) and others test required behaviour, to make sure our implementation
 doesn't have design/impl bugs.

 Ideally the former kind would skip if the test isn't being run on NetBSD,
 but I doubt that many of them do.

 kre

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Robert Elz <kre@munnari.OZ.AU>
Cc: Christos Zoulas <christos@zoulas.com>, gnats-bugs@NetBSD.org,
	Christos Zoulas <christos@astron.com>, netbsd-bugs@NetBSD.org
Subject: Re: lib/58453: endptr can be unitialized if an invalid base is passed to strto*(3)
Date: Tue, 23 Jul 2024 00:03:48 +0000

 > Date: Tue, 23 Jul 2024 05:55:33 +0700
 > From: Robert Elz <kre@munnari.OZ.AU>
 >=20
 > I remain confused.   What test is that?   Or tests that would be used on
 > linux?
 >=20
 > If this is about some ATF test (the PR doesn't even suggest that though)
 > then it would depend upon the test, some of them are explicitly testing
 > NetBSD behaviour, to ensure it isn't accidentally altered (breaking backw=
 ards
 > compat) and others test required behaviour, to make sure our implementati=
 on
 > doesn't have design/impl bugs.
 >=20
 > Ideally the former kind would skip if the test isn't being run on NetBSD,
 > but I doubt that many of them do.

 I asked christos@ to file this PR and commit an ATF test to explain
 the otherwise baffling commit

 https://mail-index.netbsd.org/source-changes/2024/07/21/msg152491.html

 in order to verify we have actually fixed anything, or to prevent us
 from regressing later, and to track pullups in case it's needed.

 I still haven't looked into the code because I'm busy dealing with
 other things right now, but my understanding from what christos@ told
 me was:

 1. strtoimax and strtoumax may leave *endptr uninitialized on some
    platforms.

    (a) NetBSD's strtoimax and strtoumax always initialize *endptr.

    (b) glibc's(?) strtoimax and strtoumax sometimes leave *endptr
        uninitialized.

 2. Our strtoi is defined in terms of strtoimax and strtoumax, and the
    current implementation -- before christos@'s commit -- rely on
    strtoimax and strtoumax to always initialize *endptr.

 3. christos@'s commit lifts this assumption so that the strtoi code we
    use works in terms of either NetBSD's or glibc's
    strtoimax/strtoumax.

 I asked christos@ to commit an ATF test for strtoi that exercises a
 path that, _under glibc's implementation_ of strtoimax/strtoumax,
 would use uninitialized memory.  That way, we have a chance -- e.g.,
 via ubsan, or just by initializing it to some garbage pointer into
 unmapped oblivion -- of detecting the nonportable assumption in strtoi
 in case we ever change our strtoimax/strtoumax implementation.

From: Christos Zoulas <christos@zoulas.com>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: Robert Elz <kre@munnari.OZ.AU>,
 "gnats-bugs@netbsd.org" <gnats-bugs@NetBSD.org>,
 Christos Zoulas <christos@astron.com>,
 "netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>
Subject: Re: lib/58453: endptr can be unitialized if an invalid base is passed
 to strto*(3)
Date: Mon, 22 Jul 2024 20:13:51 -0400

 > 
 > I asked christos@ to file this PR and commit an ATF test to explain
 > the otherwise baffling commit
 > 
 > https://mail-index.netbsd.org/source-changes/2024/07/21/msg152491.html
 > 
 > in order to verify we have actually fixed anything, or to prevent us
 > from regressing later, and to track pullups in case it's needed.
 > 
 > I still haven't looked into the code because I'm busy dealing with
 > other things right now, but my understanding from what christos@ told
 > me was:
 > 
 > 1. strtoimax and strtoumax may leave *endptr uninitialized on some
 >   platforms.
 > 
 >   (a) NetBSD's strtoimax and strtoumax always initialize *endptr.
 > 
 >   (b) glibc's(?) strtoimax and strtoumax sometimes leave *endptr
 >       uninitialized.

 Yes, when there is an invalid base. strtoi did not handle this properly
 before the patch.

 > 
 > 2. Our strtoi is defined in terms of strtoimax and strtoumax, and the
 >   current implementation -- before christos@'s commit -- rely on
 >   strtoimax and strtoumax to always initialize *endptr.

 correct

 > 3. christos@'s commit lifts this assumption so that the strtoi code we
 >   use works in terms of either NetBSD's or glibc's
 >   strtoimax/strtoumax.

 correct
 > 
 > I asked christos@ to commit an ATF test for strtoi that exercises a
 > path that, _under glibc's implementation_ of strtoimax/strtoumax,
 > would use uninitialized memory.  That way, we have a chance -- e.g.,
 > via ubsan, or just by initializing it to some garbage pointer into
 > unmapped oblivion -- of detecting the nonportable assumption in strtoi
 > in case we ever change our strtoimax/strtoumax implementation.

 I just added a test that checks that strtoimax returns an initialized pointer
 (I should change from strcmp to == str).

 christos

From: Robert Elz <kre@munnari.OZ.AU>
To: Christos Zoulas <christos@zoulas.com>
Cc: Taylor R Campbell <riastradh@NetBSD.org>,
        "gnats-bugs@netbsd.org" <gnats-bugs@NetBSD.org>,
        Christos Zoulas <christos@astron.com>,
        "netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>
Subject: Re: lib/58453: endptr can be unitialized if an invalid base is passed to strto*(3)
Date: Tue, 23 Jul 2024 08:32:32 +0700

     Date:        Mon, 22 Jul 2024 20:13:51 -0400
     From:        Christos Zoulas <christos@zoulas.com>
     Message-ID:  <2984FE66-0D2E-41ED-A70E-0755EE1D1F46@zoulas.com>

   | Yes, when there is an invalid base. strtoi did not handle this properly
   | before the patch.

 Actually, as I understand it, unless you're looking at the POSIX
 definition of strtoimax(), there was no problem there on NetBSD,
 because of how our strtoimax() behaves.

 The actual problem that was fixed in the change cited related to
 distinguishing between the function being cancelled and it simply
 doing nothing (no number present I think might have been the alternative),
 which I guess we could have a test for, but I certainly have no idea how
 to write, and it isn't clear we ever had an actual issue there either.

   | > 3. christos@'s commit lifts this assumption so that the strtoi code we
   | >   use works in terms of either NetBSD's or glibc's
   | >   strtoimax/strtoumax.

 Yes, though there's not a lot of reason for that, other than strict
 correctness (no objections to dealing with that of course) - NetBSD's
 strtoi() is very unlikely to ever call a glibc strtoimax().   Someone
 else's copy of our strtoi() might of course.

 If we wanted to be absolutely correct (not just our libc vs glibc)
 and deal with other possible bizarre implementations of this issue,
 it would need to be more complex, or coded differently.

   | > I asked christos@ to commit an ATF test for strtoi that exercises a
   | > path that, _under glibc's implementation_ of strtoimax/strtoumax,
   | > would use uninitialized memory.  That way, we have a chance -- e.g.,
   | > via ubsan, or just by initializing it to some garbage pointer into
   | > unmapped oblivion -- of detecting the nonportable assumption in strtoi
   | > in case we ever change our strtoimax/strtoumax implementation.

 Since that assumption also got fixed, I believe, I doubt that a test
 for that would be easy to create, and while this

   | I just added a test that checks that strtoimax returns an initialized
   | pointer

 is reasonable, and harmless (verifying our implementation doesn't change),
 it isn't what was requested.

 Further, the actual undefined (unspecified) behaviour is what *entptr
 gets set to (if anything at all) - not whether it points to valid memory.
 Simply accessing *endptr is UB according to the POSIX (and I presume C)
 definition of strtoimax() (and the other basic strtoX() functions).

 That is, the implementation (of the strtoimax() etc functions) is permitted
 to place absolutely anything in the *endptr passed to it, if the base is
 invalid, including a pointer that will cause a trap if referenced (not
 just dereferenced).   I'm not aware of any of our supported ports where
 such a pointer value exists though.  ubsan (about which I know < zero)
 might possibly be able to detect such an undefined use however.

 I thought the changes to strtoi() were such that it dealt correctly
 with this possibility (even though it never occurs on NetBSD) but on
 a closer examination I'm not sure that's true.

 A simpler change to strtoi() to deal with all of this UB stuff would be to
 validate the base passed to it, rather than relying upon strtoimax()
 doing that - either by duplicating the code, or by all relevant functions
 calling an internal (ie _xxx()) function to do it, so they all remain
 consistent - that could be a static inline function from some #ifdef _LIBC
 section of some (internal) header file, so that the function could be
 defined just once, but avoid the function call overhead for the
 relatively trivial test that is needed.

 At least now, I think, there's enough information in this PR that
 some sense can be made of it, though I'm not sure that there's anything
 much required to handle it, and I doubt the ECANCELED change is really
 important enough to bother with a pullup, the other change certainly
 isn't.   After all, Alejandro's message containing the patch did say:

 alx@kernel.org said:
   | I'm not sure if this patch is wanted in NetBSD.  It doesn't fix any bugs
   | there. 

 Fixing it in HEAD makes sense however:

 alx@kernel.org said:
   | for those pasting this code in other systems (which is currently done by
   | libbsd).

 but we don't need it pulled up for that.

 kre

Responsible-Changed-From-To: lib-bug-people->kre
Responsible-Changed-By: kre@NetBSD.org
Responsible-Changed-When: Wed, 24 Jul 2024 03:16:05 +0000
Responsible-Changed-Why:
This PR is (kind of) intended to be the same as PR lib/58461,
they will both be fixed together.


From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/58453 CVS commit: src
Date: Wed, 24 Jul 2024 08:55:09 +0000

 Module Name:	src
 Committed By:	kre
 Date:		Wed Jul 24 08:55:09 UTC 2024

 Modified Files:
 	src/distrib/sets/lists/comp: mi
 	src/lib/libc/stdlib: Makefile.inc strtoi.3

 Log Message:
 Improve the man for strtoi() (and consequently strtou())

 PR lib/58461  PR lib/58453

 Improve the wording so it is clearer exactly what strtoi() is
 intended to do in various cases.

 While here, add, in the most minimalist way possible, the
 strtoi_l(), and so also strtou_l(), functions, which seem to
 have been previously undocumented.

 Do some linguistic gymnastics so that the conversion of the
 page from strtoi(3) -> strtou(3) will not generate "an unitmax_t"
 which is incorrect, we need "a unitmax_t" - one of those was
 easy to fix in the Makefile sed script, the others would have
 been more difficult, so reword instead.


 To generate a diff of this commit:
 cvs rdiff -u -r1.2465 -r1.2466 src/distrib/sets/lists/comp/mi
 cvs rdiff -u -r1.99 -r1.100 src/lib/libc/stdlib/Makefile.inc
 cvs rdiff -u -r1.10 -r1.11 src/lib/libc/stdlib/strtoi.3

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/58453 CVS commit: src/common/lib/libc/stdlib
Date: Wed, 24 Jul 2024 09:11:28 +0000

 Module Name:	src
 Committed By:	kre
 Date:		Wed Jul 24 09:11:28 UTC 2024

 Modified Files:
 	src/common/lib/libc/stdlib: _strtoi.h

 Log Message:
 PR lib/58461  PR lib/58453 portability fixes

 Revert previous (1.4) and make the changes properly.

 If base is invalid, what gets left in *endptr by strtoimax()
 is unspecified (it is not guaranteed to be either nptr or unaltered)
 and so cannot (in that case) be used in any way at all.

 Since it is hard to determine from some implementations of
 strtoimax() whether this happened or not, simply duplicate
 the validity test for base here, so we know that error (EINVAL
 because base is invalid) cannot occur from strtoimax().   In that
 case, if we get an EINVAL from strtoimax we can simply ignore it,
 as all it can mean is the (optional in POSIX) case where no
 conversion occurred (where strtoi() sets the status to ECANCELED).

 Since NetBSD never did that, this all changes nothing here, but
 makes strtoi() portable to other environments using a different
 version of strtoimax().

 NFCI.

 No pullups required, nothing has really changed, there never was
 a NetBSD bug to fix.


 To generate a diff of this commit:
 cvs rdiff -u -r1.4 -r1.5 src/common/lib/libc/stdlib/_strtoi.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/58453 CVS commit: src/tests/lib/libc/stdlib
Date: Wed, 24 Jul 2024 09:26:06 +0000

 Module Name:	src
 Committed By:	kre
 Date:		Wed Jul 24 09:26:06 UTC 2024

 Modified Files:
 	src/tests/lib/libc/stdlib: t_strtoi.c

 Log Message:
 Add some test cases to tests/lib/libc/stdlib/t_strtoi

 PR lib/58461  PR lib/58453

 Apologies for the previous commit message - I managed to
 forget to include the message filename after commit -F
 and so used the file being committed (the only changed file
 in the directory) as the log message.   (Fortunately that
 meant that the log didn't get appended to the PR).

 For the PR, the command to check what actually changed is

 	cvs rdiff -u -r1.3 -r1.4 src/tests/lib/libc/stdlib/t_strtoi.c

 rather than what will be reported below.

 This commit message really belongs to the previous commit,
 (1.5) there are no changes at all in this one.

 No pullups required.


 To generate a diff of this commit:
 cvs rdiff -u -r1.4 -r1.5 src/tests/lib/libc/stdlib/t_strtoi.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: kre@NetBSD.org
State-Changed-When: Wed, 24 Jul 2024 11:26:47 +0000
State-Changed-Why:
There's nothing needed to be done with this PR.


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2024 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.