NetBSD Problem Report #58453
From christos@astron.com Mon Jul 22 15:53:53 2024
Return-Path: <christos@astron.com>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
client-signature RSA-PSS (2048 bits) client-digest SHA256)
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 894E51A9239
for <gnats-bugs@gnats.NetBSD.org>; Mon, 22 Jul 2024 15:53:53 +0000 (UTC)
Message-Id: <20240722155350.EE45A6C29A@quasar.astron.com>
Date: Mon, 22 Jul 2024 15:53:50 +0000 (UTC)
From: christos@astron.com
Reply-To: christos@astron.com
To: gnats-bugs@NetBSD.org
Subject: endptr can be unitialized if an invalid base is passed to strto*(3)
X-Send-Pr-Version: 3.95
>Number: 58453
>Category: lib
>Synopsis: endptr can be unitialized if an invalid base is passed to strto*(3)
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: kre
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Jul 22 15:55:00 +0000 2024
>Closed-Date: Wed Jul 24 11:26:47 +0000 2024
>Last-Modified: Wed Jul 24 11:26:47 +0000 2024
>Originator: Christos Zoulas
>Release: NetBSD 10.99.11
>Organization:
Bogus Plans, Inc.
>Environment:
System: NetBSD quasar.astron.com 10.99.11 NetBSD 10.99.11 (QUASAR) #6: Wed Jul 10 03:34:01 EDT 2024 christos@quasar.astron.com:/usr/src/sys/arch/amd64/compile/QUASAR amd64
Architecture: x86_64
Machine: amd64
>Description:
Bug report from Alejandro Colomar:
https://lore.kernel.org/all/20240720190321.550121-2-alx@kernel.org/T/
>How-To-Repeat:
#include <sys/cdefs.h>
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include <limits.h>
#include <errno.h>
#ifndef __UNCONST
#define __UNCONST(a) (void *)(intptr_t)(a)
#endif
int
main(int argc, char *argv[])
{
char *endp = __UNCONST("boo");
errno = 0;
intmax_t im = strtoimax(argv[1], &endp, -2);
printf("%jd [%s] %d\n", im, endp, errno);
return 0;
}
Run on linux, observe it print "0 [boo] 22\n"
>Fix:
https://mail-index.netbsd.org/source-changes/2024/07/21/msg152491.html
>Release-Note:
>Audit-Trail:
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org, christos@astron.com, netbsd-bugs@netbsd.org
Cc:
Subject: Re: lib/58453: endptr can be unitialized if an invalid base is passed to strto*(3)
Date: Tue, 23 Jul 2024 04:52:30 +0700
What exactly is this bug report about?
The referenced message from Alejandro Colomar is about about
just strtoi() (which used strtoimax() incorrectly) and which
you fixed.
That the other strto*() functions don't leave any defined value
in *endptr when an invalid base is provided (which is really just
a bug in the application) isn't unexpected, as Alejandro's 2nd
message indicates, and POSIX makes expicit:
APPLICATION USAGE
Since the value of *endptr is unspecified if the value of base
is not supported, applications should either ensure that base
has a supported value (0 or between 2 and 36) before the call,
or check for an [EINVAL] error before examining *endptr.
The test program provided in this PR is incorrect, as it fails to make that
check (especially as it is explicitly providing an invalid base).
kre
ps: Note that this doesn't mean that we couldn't put something specific
in *endptr in the case of an invalid base, but anything which ends up
relying upon that would be broken. Perhaps we could set it to NULL or
(void *)-1 so any attempt to dereference it would fault, though technically
simply referencing it is UB.
From: Christos Zoulas <christos@zoulas.com>
To: Robert Elz <kre@munnari.OZ.AU>,
Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@netbsd.org,
Christos Zoulas <christos@astron.com>,
netbsd-bugs@netbsd.org
Subject: Re: lib/58453: endptr can be unitialized if an invalid base is passed
to strto*(3)
Date: Mon, 22 Jul 2024 17:58:14 -0400
> On Jul 22, 2024, at 5:52=E2=80=AFPM, Robert Elz <kre@munnari.OZ.AU> =
wrote:
>=20
> What exactly is this bug report about?
>=20
> The referenced message from Alejandro Colomar is about about
> just strtoi() (which used strtoimax() incorrectly) and which
> you fixed.
>=20
> That the other strto*() functions don't leave any defined value
> in *endptr when an invalid base is provided (which is really just
> a bug in the application) isn't unexpected, as Alejandro's 2nd
> message indicates, and POSIX makes expicit:
>=20
> APPLICATION USAGE
> Since the value of *endptr is unspecified if the value of =
base
> is not supported, applications should either ensure that =
base
> has a supported value (0 or between 2 and 36) before the =
call,
> or check for an [EINVAL] error before examining *endptr.
>=20
> The test program provided in this PR is incorrect, as it fails to make =
that
> check (especially as it is explicitly providing an invalid base).
>=20
> kre
>=20
> ps: Note that this doesn't mean that we couldn't put something =
specific
> in *endptr in the case of an invalid base, but anything which ends up
> relying upon that would be broken. Perhaps we could set it to NULL or
> (void *)-1 so any attempt to dereference it would fault, though =
technically
> simply referencing it is UB.
Our implementation sets it to the beginning of the string. I know it is =
allowed
to UB when the base is invalid, but since out implementation initializes =
it,
the test works. I think that there is a disadvantage to adding the test =
in that
it enforces a behavior which is strictly not required. So if the tests =
were used
on Linux for example, they would fail.
christos
From: Robert Elz <kre@munnari.OZ.AU>
To: Christos Zoulas <christos@zoulas.com>
Cc: Taylor R Campbell <riastradh@NetBSD.org>, gnats-bugs@netbsd.org,
Christos Zoulas <christos@astron.com>, netbsd-bugs@netbsd.org
Subject: Re: lib/58453: endptr can be unitialized if an invalid base is passed to strto*(3)
Date: Tue, 23 Jul 2024 05:55:33 +0700
Date: Mon, 22 Jul 2024 17:58:14 -0400
From: Christos Zoulas <christos@zoulas.com>
Message-ID: <910773C7-98CA-4CB9-9114-33AE517E74AB@zoulas.com>
| I think that there is a disadvantage to adding the test in that
| it enforces a behavior which is strictly not required. So if the
| tests were used on Linux for example, they would fail.
I remain confused. What test is that? Or tests that would be used on
linux?
If this is about some ATF test (the PR doesn't even suggest that though)
then it would depend upon the test, some of them are explicitly testing
NetBSD behaviour, to ensure it isn't accidentally altered (breaking backwards
compat) and others test required behaviour, to make sure our implementation
doesn't have design/impl bugs.
Ideally the former kind would skip if the test isn't being run on NetBSD,
but I doubt that many of them do.
kre
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Robert Elz <kre@munnari.OZ.AU>
Cc: Christos Zoulas <christos@zoulas.com>, gnats-bugs@NetBSD.org,
Christos Zoulas <christos@astron.com>, netbsd-bugs@NetBSD.org
Subject: Re: lib/58453: endptr can be unitialized if an invalid base is passed to strto*(3)
Date: Tue, 23 Jul 2024 00:03:48 +0000
> Date: Tue, 23 Jul 2024 05:55:33 +0700
> From: Robert Elz <kre@munnari.OZ.AU>
>=20
> I remain confused. What test is that? Or tests that would be used on
> linux?
>=20
> If this is about some ATF test (the PR doesn't even suggest that though)
> then it would depend upon the test, some of them are explicitly testing
> NetBSD behaviour, to ensure it isn't accidentally altered (breaking backw=
ards
> compat) and others test required behaviour, to make sure our implementati=
on
> doesn't have design/impl bugs.
>=20
> Ideally the former kind would skip if the test isn't being run on NetBSD,
> but I doubt that many of them do.
I asked christos@ to file this PR and commit an ATF test to explain
the otherwise baffling commit
https://mail-index.netbsd.org/source-changes/2024/07/21/msg152491.html
in order to verify we have actually fixed anything, or to prevent us
from regressing later, and to track pullups in case it's needed.
I still haven't looked into the code because I'm busy dealing with
other things right now, but my understanding from what christos@ told
me was:
1. strtoimax and strtoumax may leave *endptr uninitialized on some
platforms.
(a) NetBSD's strtoimax and strtoumax always initialize *endptr.
(b) glibc's(?) strtoimax and strtoumax sometimes leave *endptr
uninitialized.
2. Our strtoi is defined in terms of strtoimax and strtoumax, and the
current implementation -- before christos@'s commit -- rely on
strtoimax and strtoumax to always initialize *endptr.
3. christos@'s commit lifts this assumption so that the strtoi code we
use works in terms of either NetBSD's or glibc's
strtoimax/strtoumax.
I asked christos@ to commit an ATF test for strtoi that exercises a
path that, _under glibc's implementation_ of strtoimax/strtoumax,
would use uninitialized memory. That way, we have a chance -- e.g.,
via ubsan, or just by initializing it to some garbage pointer into
unmapped oblivion -- of detecting the nonportable assumption in strtoi
in case we ever change our strtoimax/strtoumax implementation.
From: Christos Zoulas <christos@zoulas.com>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: Robert Elz <kre@munnari.OZ.AU>,
"gnats-bugs@netbsd.org" <gnats-bugs@NetBSD.org>,
Christos Zoulas <christos@astron.com>,
"netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>
Subject: Re: lib/58453: endptr can be unitialized if an invalid base is passed
to strto*(3)
Date: Mon, 22 Jul 2024 20:13:51 -0400
>
> I asked christos@ to file this PR and commit an ATF test to explain
> the otherwise baffling commit
>
> https://mail-index.netbsd.org/source-changes/2024/07/21/msg152491.html
>
> in order to verify we have actually fixed anything, or to prevent us
> from regressing later, and to track pullups in case it's needed.
>
> I still haven't looked into the code because I'm busy dealing with
> other things right now, but my understanding from what christos@ told
> me was:
>
> 1. strtoimax and strtoumax may leave *endptr uninitialized on some
> platforms.
>
> (a) NetBSD's strtoimax and strtoumax always initialize *endptr.
>
> (b) glibc's(?) strtoimax and strtoumax sometimes leave *endptr
> uninitialized.
Yes, when there is an invalid base. strtoi did not handle this properly
before the patch.
>
> 2. Our strtoi is defined in terms of strtoimax and strtoumax, and the
> current implementation -- before christos@'s commit -- rely on
> strtoimax and strtoumax to always initialize *endptr.
correct
> 3. christos@'s commit lifts this assumption so that the strtoi code we
> use works in terms of either NetBSD's or glibc's
> strtoimax/strtoumax.
correct
>
> I asked christos@ to commit an ATF test for strtoi that exercises a
> path that, _under glibc's implementation_ of strtoimax/strtoumax,
> would use uninitialized memory. That way, we have a chance -- e.g.,
> via ubsan, or just by initializing it to some garbage pointer into
> unmapped oblivion -- of detecting the nonportable assumption in strtoi
> in case we ever change our strtoimax/strtoumax implementation.
I just added a test that checks that strtoimax returns an initialized pointer
(I should change from strcmp to == str).
christos
From: Robert Elz <kre@munnari.OZ.AU>
To: Christos Zoulas <christos@zoulas.com>
Cc: Taylor R Campbell <riastradh@NetBSD.org>,
"gnats-bugs@netbsd.org" <gnats-bugs@NetBSD.org>,
Christos Zoulas <christos@astron.com>,
"netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>
Subject: Re: lib/58453: endptr can be unitialized if an invalid base is passed to strto*(3)
Date: Tue, 23 Jul 2024 08:32:32 +0700
Date: Mon, 22 Jul 2024 20:13:51 -0400
From: Christos Zoulas <christos@zoulas.com>
Message-ID: <2984FE66-0D2E-41ED-A70E-0755EE1D1F46@zoulas.com>
| Yes, when there is an invalid base. strtoi did not handle this properly
| before the patch.
Actually, as I understand it, unless you're looking at the POSIX
definition of strtoimax(), there was no problem there on NetBSD,
because of how our strtoimax() behaves.
The actual problem that was fixed in the change cited related to
distinguishing between the function being cancelled and it simply
doing nothing (no number present I think might have been the alternative),
which I guess we could have a test for, but I certainly have no idea how
to write, and it isn't clear we ever had an actual issue there either.
| > 3. christos@'s commit lifts this assumption so that the strtoi code we
| > use works in terms of either NetBSD's or glibc's
| > strtoimax/strtoumax.
Yes, though there's not a lot of reason for that, other than strict
correctness (no objections to dealing with that of course) - NetBSD's
strtoi() is very unlikely to ever call a glibc strtoimax(). Someone
else's copy of our strtoi() might of course.
If we wanted to be absolutely correct (not just our libc vs glibc)
and deal with other possible bizarre implementations of this issue,
it would need to be more complex, or coded differently.
| > I asked christos@ to commit an ATF test for strtoi that exercises a
| > path that, _under glibc's implementation_ of strtoimax/strtoumax,
| > would use uninitialized memory. That way, we have a chance -- e.g.,
| > via ubsan, or just by initializing it to some garbage pointer into
| > unmapped oblivion -- of detecting the nonportable assumption in strtoi
| > in case we ever change our strtoimax/strtoumax implementation.
Since that assumption also got fixed, I believe, I doubt that a test
for that would be easy to create, and while this
| I just added a test that checks that strtoimax returns an initialized
| pointer
is reasonable, and harmless (verifying our implementation doesn't change),
it isn't what was requested.
Further, the actual undefined (unspecified) behaviour is what *entptr
gets set to (if anything at all) - not whether it points to valid memory.
Simply accessing *endptr is UB according to the POSIX (and I presume C)
definition of strtoimax() (and the other basic strtoX() functions).
That is, the implementation (of the strtoimax() etc functions) is permitted
to place absolutely anything in the *endptr passed to it, if the base is
invalid, including a pointer that will cause a trap if referenced (not
just dereferenced). I'm not aware of any of our supported ports where
such a pointer value exists though. ubsan (about which I know < zero)
might possibly be able to detect such an undefined use however.
I thought the changes to strtoi() were such that it dealt correctly
with this possibility (even though it never occurs on NetBSD) but on
a closer examination I'm not sure that's true.
A simpler change to strtoi() to deal with all of this UB stuff would be to
validate the base passed to it, rather than relying upon strtoimax()
doing that - either by duplicating the code, or by all relevant functions
calling an internal (ie _xxx()) function to do it, so they all remain
consistent - that could be a static inline function from some #ifdef _LIBC
section of some (internal) header file, so that the function could be
defined just once, but avoid the function call overhead for the
relatively trivial test that is needed.
At least now, I think, there's enough information in this PR that
some sense can be made of it, though I'm not sure that there's anything
much required to handle it, and I doubt the ECANCELED change is really
important enough to bother with a pullup, the other change certainly
isn't. After all, Alejandro's message containing the patch did say:
alx@kernel.org said:
| I'm not sure if this patch is wanted in NetBSD. It doesn't fix any bugs
| there.
Fixing it in HEAD makes sense however:
alx@kernel.org said:
| for those pasting this code in other systems (which is currently done by
| libbsd).
but we don't need it pulled up for that.
kre
Responsible-Changed-From-To: lib-bug-people->kre
Responsible-Changed-By: kre@NetBSD.org
Responsible-Changed-When: Wed, 24 Jul 2024 03:16:05 +0000
Responsible-Changed-Why:
This PR is (kind of) intended to be the same as PR lib/58461,
they will both be fixed together.
From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/58453 CVS commit: src
Date: Wed, 24 Jul 2024 08:55:09 +0000
Module Name: src
Committed By: kre
Date: Wed Jul 24 08:55:09 UTC 2024
Modified Files:
src/distrib/sets/lists/comp: mi
src/lib/libc/stdlib: Makefile.inc strtoi.3
Log Message:
Improve the man for strtoi() (and consequently strtou())
PR lib/58461 PR lib/58453
Improve the wording so it is clearer exactly what strtoi() is
intended to do in various cases.
While here, add, in the most minimalist way possible, the
strtoi_l(), and so also strtou_l(), functions, which seem to
have been previously undocumented.
Do some linguistic gymnastics so that the conversion of the
page from strtoi(3) -> strtou(3) will not generate "an unitmax_t"
which is incorrect, we need "a unitmax_t" - one of those was
easy to fix in the Makefile sed script, the others would have
been more difficult, so reword instead.
To generate a diff of this commit:
cvs rdiff -u -r1.2465 -r1.2466 src/distrib/sets/lists/comp/mi
cvs rdiff -u -r1.99 -r1.100 src/lib/libc/stdlib/Makefile.inc
cvs rdiff -u -r1.10 -r1.11 src/lib/libc/stdlib/strtoi.3
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/58453 CVS commit: src/common/lib/libc/stdlib
Date: Wed, 24 Jul 2024 09:11:28 +0000
Module Name: src
Committed By: kre
Date: Wed Jul 24 09:11:28 UTC 2024
Modified Files:
src/common/lib/libc/stdlib: _strtoi.h
Log Message:
PR lib/58461 PR lib/58453 portability fixes
Revert previous (1.4) and make the changes properly.
If base is invalid, what gets left in *endptr by strtoimax()
is unspecified (it is not guaranteed to be either nptr or unaltered)
and so cannot (in that case) be used in any way at all.
Since it is hard to determine from some implementations of
strtoimax() whether this happened or not, simply duplicate
the validity test for base here, so we know that error (EINVAL
because base is invalid) cannot occur from strtoimax(). In that
case, if we get an EINVAL from strtoimax we can simply ignore it,
as all it can mean is the (optional in POSIX) case where no
conversion occurred (where strtoi() sets the status to ECANCELED).
Since NetBSD never did that, this all changes nothing here, but
makes strtoi() portable to other environments using a different
version of strtoimax().
NFCI.
No pullups required, nothing has really changed, there never was
a NetBSD bug to fix.
To generate a diff of this commit:
cvs rdiff -u -r1.4 -r1.5 src/common/lib/libc/stdlib/_strtoi.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/58453 CVS commit: src/tests/lib/libc/stdlib
Date: Wed, 24 Jul 2024 09:26:06 +0000
Module Name: src
Committed By: kre
Date: Wed Jul 24 09:26:06 UTC 2024
Modified Files:
src/tests/lib/libc/stdlib: t_strtoi.c
Log Message:
Add some test cases to tests/lib/libc/stdlib/t_strtoi
PR lib/58461 PR lib/58453
Apologies for the previous commit message - I managed to
forget to include the message filename after commit -F
and so used the file being committed (the only changed file
in the directory) as the log message. (Fortunately that
meant that the log didn't get appended to the PR).
For the PR, the command to check what actually changed is
cvs rdiff -u -r1.3 -r1.4 src/tests/lib/libc/stdlib/t_strtoi.c
rather than what will be reported below.
This commit message really belongs to the previous commit,
(1.5) there are no changes at all in this one.
No pullups required.
To generate a diff of this commit:
cvs rdiff -u -r1.4 -r1.5 src/tests/lib/libc/stdlib/t_strtoi.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: kre@NetBSD.org
State-Changed-When: Wed, 24 Jul 2024 11:26:47 +0000
State-Changed-Why:
There's nothing needed to be done with this PR.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2024
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.