NetBSD Problem Report #56780
From gson@gson.org Tue Apr 5 13:22:18 2022
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 95C1D1A921F
for <gnats-bugs@gnats.NetBSD.org>; Tue, 5 Apr 2022 13:22:18 +0000 (UTC)
Message-Id: <20220405132209.20C7F2546F8@guava.gson.org>
Date: Tue, 5 Apr 2022 16:22:09 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: lib/libc/sys/t_mmap:mmap_err test case fails randomly
X-Send-Pr-Version: 3.95
>Number: 56780
>Category: kern
>Synopsis: lib/libc/sys/t_mmap:mmap_err test case fails randomly
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Apr 05 13:25:00 +0000 2022
>Closed-Date: Sat Aug 13 10:22:31 +0000 2022
>Last-Modified: Sat Aug 13 10:22:31 +0000 2022
>Originator: Andreas Gustafsson
>Release: NetBSD-current
>Organization:
>Environment:
System: NetBSD
Architecture: i386
Machine: i386
>Description:
The TNF i386 testbed testbed just reported a failure of the
lib/libc/sys/t_mmap:mmap_err test case:
http://releng.netbsd.org/b5reports/i386/2022/2022.04.02.22.15.57/test.html#lib_libc_sys_t_mmap_mmap_err
It fails with the message
FAILED: /tmp/build/2022.04.02.22.15.57-i386/src/tests/lib/libc/sys/t_mmap.c:231: errno == EINVAL not met
I committed a change to make the test print the actual errno value at
the time of failure, and it's 9 (EBADF).
The testbed logs show sporadic failures of this test case going back
at least to 2018.
>How-To-Repeat:
# cd /usr/tests/lib/libc/sys
# while ./t_mmap map_err; do true; done
>Fix:
>Release-Note:
>Audit-Trail:
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56780: lib/libc/sys/t_mmap:mmap_err test case fails randomly
Date: Tue, 5 Apr 2022 16:54:41 +0300
Correction: in the How-To-Repeat recipe, it should say mmap_err, not
map_err:
# cd /usr/tests/lib/libc/sys
# while ./t_mmap mmap_err; do true; done
--
Andreas Gustafsson, gson@gson.org
From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56780: lib/libc/sys/t_mmap:mmap_err test case fails randomly
Date: Tue, 5 Apr 2022 14:09:25 -0000 (UTC)
gson@gson.org (Andreas Gustafsson) writes:
> FAILED: /tmp/build/2022.04.02.22.15.57-i386/src/tests/lib/libc/sys/t_mmap.c:231: errno == EINVAL not met
>I committed a change to make the test print the actual errno value at
>the time of failure, and it's 9 (EBADF).
The test does something funny and uses the address of the stack
variable 'addr' as as the address hint instead of its value.
size_t addr = SIZE_MAX;
errno = 0;
map = mmap(&addr, page, PROT_READ, MAP_FIXED|MAP_PRIVATE, -1, 0);
It's possible with enough address randomization that the variable
ends up page aligned, and then the alignment check won't fail
with EINVAL, but the descriptor check fails with EBADF.
From: Andreas Gustafsson <gson@gson.org>
To: mlelstv@serpens.de, gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56780: lib/libc/sys/t_mmap:mmap_err test case fails randomly
Date: Tue, 5 Apr 2022 17:44:49 +0300
Michael van Elst wrote:
> The test does something funny and uses the address of the stack
> variable 'addr' as as the address hint instead of its value.
Well spotted! I'll fix the test.
--
Andreas Gustafsson, gson@gson.org
From: "Andreas Gustafsson" <gson@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56780 CVS commit: src/tests/lib/libc/sys
Date: Tue, 5 Apr 2022 15:59:22 +0000
Module Name: src
Committed By: gson
Date: Tue Apr 5 15:59:22 UTC 2022
Modified Files:
src/tests/lib/libc/sys: t_mmap.c
Log Message:
In the mmap_err test case, mmap the address, not the address of the address.
Should fix PR kern/56780.
To generate a diff of this commit:
cvs rdiff -u -r1.15 -r1.16 src/tests/lib/libc/sys/t_mmap.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: PR/56780 CVS commit: src/tests/lib/libc/sys
Date: Tue, 5 Apr 2022 18:19:31 +0200
On Tue, Apr 05, 2022 at 04:00:04PM +0000, Andreas Gustafsson wrote:
> Log Message:
> In the mmap_err test case, mmap the address, not the address of the address.
> Should fix PR kern/56780.
I don't think this is the best possible fix.
The test could have been meant to test for two things, and IMHO should
test for both:
- mmap with MAP_FIXED for a not page aligned value (that is what you
have made it to do in most cases, but not very portable), but I think
this is not what it was originally about.
If we want to make it test for that, the fixed address should be picked
better and get a comment (i.e. it should be within the VM space limits
for the architecture, and if we do below maybe it should be made sure
that the address is inside an unmapped region)
- mmap with MAP_FIXED and a address that is already mapped (this was my
initial reading and what the test did test most of the time before we
started to randomize stack addresses)
Martin
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: PR/56780 CVS commit: src/tests/lib/libc/sys
Date: Tue, 5 Apr 2022 20:17:18 +0200
On Tue, Apr 05, 2022 at 06:19:31PM +0200, Martin Husemann wrote:
> - mmap with MAP_FIXED and a address that is already mapped (this was my
> initial reading and what the test did test most of the time before we
> started to randomize stack addresses)
One of the two statements in there is wrong, but maybe we should still do
both variants of the test, and imrove the selection of the invalid adress
(but maybe it doesn't really matter why it is invalid from a userland POV).
Martin
From: Andreas Gustafsson <gson@gson.org>
To: Martin Husemann <martin@duskware.de>, gnats-bugs@netbsd.org
Cc:
Subject: Re: PR/56780 CVS commit: src/tests/lib/libc/sys
Date: Tue, 5 Apr 2022 22:11:24 +0300
Martin Husemann wrote:
> I don't think this is the best possible fix.
>
> The test could have been meant to test for two things, and IMHO should
> test for both:
>
> - mmap with MAP_FIXED for a not page aligned value (that is what you
> have made it to do in most cases, but not very portable), but I think
> this is not what it was originally about.
> If we want to make it test for that, the fixed address should be picked
> better and get a comment (i.e. it should be within the VM space limits
> for the architecture, and if we do below maybe it should be made sure
> that the address is inside an unmapped region)
Whatever the original intent of the test, in practice it was testing
the case of non-page-aligned address (except in the rare cases where it
randomly failed because the address happened to be aligned), and that
is what it still does (only without the random failures).
> - mmap with MAP_FIXED and a address that is already mapped (this was my
> initial reading and what the test did test most of the time before we
> started to randomize stack addresses)
I'm sorry, but I don't follow. How exactly does randomizing stack
addresses change things? Wouldn't the "addr" variable almost always
be located at a non-page-aligned address whether or not randomization
is in use? Also, how would this case result in an errno of EINVAL?
--
Andreas Gustafsson, gson@gson.org
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: PR/56780 CVS commit: src/tests/lib/libc/sys
Date: Wed, 06 Apr 2022 03:59:06 +0700
Date: Tue, 5 Apr 2022 16:20:02 +0000 (UTC)
From: Martin Husemann <martin@duskware.de>
Message-ID: <20220405162002.8DBFC1A923B@mollari.NetBSD.org>
| - mmap with MAP_FIXED for a not page aligned value (that is what you
| have made it to do in most cases, but not very portable), but I think
| this is not what it was originally about.
My guess is that it probably was, if all that was wanted was an allocated
addr, there'd have been no reason to init it to SIZE_MAX. The only reason
for doing that is because the value is intended to be used.
| If we want to make it test for that, the fixed address should be picked
| better
Perhaps, or probably, but that's very hard.
| and get a comment
but that, certainly. Tests generally should have comments to indicate
what is supposed to be happening, and why, - since they often involve bizarre
interactions that no-one would normally ever do.
| (i.e. it should be within the VM space limits
| for the architecture,
doing that portably is hard.
| and if we do below maybe it should be made sure
| that the address is inside an unmapped region)
and that's probably even harder.
I suspect that the test was written with knowledge of how the
kernel actually checks for errors, and knows that the alignment
test is done first.
Since SIZE_MAX (biggest possible value in a size_t) must be 2^n - 1
for some n, we know it cannot be aligned, unless n == 0 (which is not an
interesting architecture, nothing much would work at all on that).
So, the test knows it has an unaligned value, knows that the kernel will
test for that before it bothers looking at other characteristics of
the addr (much cheaper that way, and also allows the subsequent checks
to be simpler), even before the fd is checked (also more expensive test)
so it knows that the correct thing should have happened.
(If only it hadn't been for that pesky &).
Of course there's a lot of knowledge about the way the tested object
(kernel mmap() interface here) is implemented embedded here - unfortunately
that's in common with a lot of the tests.
A test for an attempt to mmap() an allocated addr would be a useful
thing to add though. But just &stack_var isn't going to cut it,
not even &anything_else -- a suitably aligned data struct would need
to be hand crafted first, and probably avoid using anything malloc()'d
as that might be mmap'd, and that might be confusing. So most likely
static char array[4 * PAGE_SIZE];
intptr_t a = array;
a += ALIGNMENT;
a &= ~(ALIGNMENT - 1);
void *addr = a;
/* etc, try to mmap(addr, ...) 1 or 2 pages */
kre
From: Andreas Gustafsson <gson@gson.org>
To: Robert Elz <kre@munnari.OZ.AU>, gnats-bugs@netbsd.org
Cc:
Subject: Re: PR/56780 CVS commit: src/tests/lib/libc/sys
Date: Wed, 6 Apr 2022 09:57:02 +0300
Robert Elz wrote:
> A test for an attempt to mmap() an allocated addr would be a useful
> thing to add though. But just &stack_var isn't going to cut it,
> not even &anything_else -- a suitably aligned data struct would need
> to be hand crafted first, and probably avoid using anything malloc()'d
> as that might be mmap'd, and that might be confusing. So most likely
>
> static char array[4 * PAGE_SIZE];
> intptr_t a = array;
> a += ALIGNMENT;
> a &= ~(ALIGNMENT - 1);
> void *addr = a;
> /* etc, try to mmap(addr, ...) 1 or 2 pages */
I'm all for improving the test coverage of mmap(), but I consider it
outside the scope of this PR which is about a specific case of false
positives, not about deficiencies in test coverage.
That said, if we were to test mmapping an address in static data
storage or the stack with MAP_FIXED, what would the expected result
be? Since you say "attempt", it sounds like you think it should be an
error, but the man page doesn't really say, and what will actually
happen is that the call succeeds (assuming the arguments to mmap are
otherwise valid), both on NetBSD and two other operating systems
I tried it on.
See also: https://lwn.net/Articles/741369/
--
Andreas Gustafsson, gson@gson.org
State-Changed-From-To: open->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Sat, 13 Aug 2022 10:22:31 +0000
State-Changed-Why:
he specific issue reported in the PR is fixed. Possible deficiencies
in the test coverage should get their own test cases or PRs.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.46 2020/01/03 16:35:01 leot Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.