NetBSD Problem Report #57291
From www@netbsd.org Fri Mar 24 21:59:12 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id CCA2B1A9239
for <gnats-bugs@gnats.NetBSD.org>; Fri, 24 Mar 2023 21:59:11 +0000 (UTC)
Message-Id: <20230324215910.3D5111A923C@mollari.NetBSD.org>
Date: Fri, 24 Mar 2023 21:59:10 +0000 (UTC)
From: jspath55@gmail.com
Reply-To: jspath55@gmail.com
To: gnats-bugs@NetBSD.org
Subject: Unit test for lib/libc/regex/t_exhaust fails in ATF Tests suite with signal 9
X-Send-Pr-Version: www-1.0
>Number: 57291
>Category: misc
>Synopsis: Unit test for lib/libc/regex/t_exhaust fails in ATF Tests suite with signal 9
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: misc-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Mar 24 22:00:01 +0000 2023
>Last-Modified: Tue May 02 18:30:02 +0000 2023
>Originator: Jim Spath
>Release: 10.0_BETA
>Organization:
>Environment:
NetBSD pi.r.zero 10.0_BETA NetBSD 10.0_BETA (GENERIC) #0: Fri Jan 13 19:15:32 UTC 2023 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/evbarm/compile/GENERIC evbarm
>Description:
Ran tests under /usr/tests and found a failure, limited to Raspberry Pi Zero 2W systems.
$ atf-run lib/libc/regex/t_exhaust
Content-Type: application/X-atf-tps; version="3"
info: atf.version, Automated Testing Framework 0.20 (atf-0.20)
info: tests.root, /usr/tests
info: time.start, Fri Mar 24 21:35:01 UTC 2023
[...]
tps-count: 1
tp-start: 1679693702.112045, lib/libc/regex/t_exhaust, 1
tc-start: 1679693702.112338, regcomp_too_big
tc-end: 1679693715.914422, regcomp_too_big, failed, Test program received signal 9
tp-end: 1679693715.922852, lib/libc/regex/t_exhaust
info: time.end, Fri Mar 24 21:35:15 UTC 2023
>How-To-Repeat:
The error occurs on 2 installations of the evbarm image.
$ file lib/libc/regex/t_exhaust
lib/libc/regex/t_exhaust: ELF 32-bit LSB pie executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /usr/libexec/ld.elf_so, for NetBSD 10.0, compiled for: earmv7hf, not stripped
>Fix:
Unknown.
>Audit-Trail:
From: Jim Spath <jspath55@gmail.com>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: misc/57291
Date: Sun, 26 Mar 2023 09:09:23 -0400
When I found signal 9 errors on NetBSD 10.0 BETA with an evbarm
installation, I had not noticed console messages (they were not shown
in /var/log/messages) about swap. I transcribe them here:
[ 22555.4016464] UVM: pid 19688 (t_exhaust), uid 0 killed: out of swap
[ 53830.5781595] UVM: pid 17972 (t_exhaust), uid 1000 killed: out of swap
[ x656x.9309205] UVM: pid 24498 (t_exhaust), uid 1000 killed: out of swap
[ 85716.6718455] UVM: pid 28714 (t_exhaust), uid 1000 killed: out of swap
The install has no swap defined:
$ swapctl -q
swapctl: no swap or dump devices in /etc/fstab
I added a swap file and the test succeeded, once. On a second try,
though, the test failed again.
(1)
bash-5.1$ swapctl -l
Device 512-blocks Used Avail Capacity Priority
/var/swapdir/swap1 204800 29128 175672 14% 1
[...]
tps-count: 1
tp-start: 1679782699.209293, lib/libc/regex/t_exhaust, 1
tc-start: 1679782699.209346, regcomp_too_big
tc-end: 1679782810.396271, regcomp_too_big, passed
tp-end: 1679782810.436980, lib/libc/regex/t_exhaust
info: time.end, Sat Mar 25 22:20:10 UTC 2023
(2)
$ swapctl -l
Device 512-blocks Used Avail Capacity Priority
/var/swapdir/swap1 204800 0 204800 0% 1
tps-count: 1
tp-start: 1679835833.883805, lib/libc/regex/t_exhaust, 1
tc-start: 1679835833.884209, regcomp_too_big
tc-end: 1679835858.317031, regcomp_too_big, failed, Test program received signal 9
tp-end: 1679835858.337171, lib/libc/regex/t_exhaust
info: time.end, Sun Mar 26 13:04:18 UTC 2023
$ swapctl -l
Device 512-blocks Used Avail Capacity Priority
/var/swapdir/swap1 204800 39456 165344 19% 1
I will run more tests.
Jim
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: misc/57291
Date: Sat, 1 Apr 2023 20:13:08 +0000
On Sun, Mar 26, 2023 at 01:10:01PM +0000, Jim Spath wrote:
> When I found signal 9 errors on NetBSD 10.0 BETA with an evbarm
> installation, I had not noticed console messages (they were not shown
> in /var/log/messages) about swap. I transcribe them here:
>
> [ 22555.4016464] UVM: pid 19688 (t_exhaust), uid 0 killed: out of swap
> [ 53830.5781595] UVM: pid 17972 (t_exhaust), uid 1000 killed: out of swap
> [ x656x.9309205] UVM: pid 24498 (t_exhaust), uid 1000 killed: out of swap
> [ 85716.6718455] UVM: pid 28714 (t_exhaust), uid 1000 killed: out of swap
It seems like a bug that these don't make it to /var/log/messages
under the default syslog config. (Assuming that's what happened. One
of the things that I've frequently seen in the past when OOMing is
that syslog goes to log the message, trips on the OOM condition while
doing so, and also gets killed. But if that had happened you'd also
see it on the console.)
Also, it's not clear that we should have tests that are so large they
won't run on platforms we might reasonably want to test on.
--
David A. Holland
dholland@netbsd.org
From: Jim Spath <jspath55@gmail.com>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: misc/57291
Date: Sun, 30 Apr 2023 18:17:16 -0400
> I had not noticed console messages (they were not shown
> in /var/log/messages) about swap. I transcribe them here: [...]
> I will run more tests.
Looking again, I found the system messages with /sbin/dmesg:
[ 1498596.274645] UVM: pid 25241 (t_exhaust), uid 1000 killed: out of swap
I will add this to my local test harness of things to note before/after runs.
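A minimal sketch of such a before/after check, assuming the kernel log is piped in from /sbin/dmesg. The function name count_uvm_kills is hypothetical, and the pattern is based only on the message format quoted above:

```shell
# Count kernel log lines that record a UVM out-of-swap kill.
# Reads log text on stdin, so it can be fed from /sbin/dmesg or a saved file.
count_uvm_kills()
{
	awk '/UVM: pid .* killed: out of swap/ { n++ } END { print n + 0 }'
}

# Hypothetical harness usage: compare the count before and after a run.
# before=$(/sbin/dmesg | count_uvm_kills)
# ... run the test ...
# after=$(/sbin/dmesg | count_uvm_kills)
```

A higher count after the run than before it would suggest the test process was killed for lack of swap during that run.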
From: Jim Spath <jspath55@gmail.com>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: misc/57291
Date: Tue, 2 May 2023 14:26:07 -0400
I have run more tests, as promised, and reviewed the code and some history.
> I will run more tests.
David Holland noted:
> it's not clear that we should have tests that are so large they won't run on platforms we might reasonably want to test on.
The regex exhaust test has a hard-coded size limit of 9,999 runs (or
"REGEX_MAXSIZE" factors), and it declares a 256MB memory requirement and a
600-second (10-minute) timeout:
> atf_tc_set_md_var(tc, "timeout", "600");
> atf_tc_set_md_var(tc, "require.memory", "256M");
The Raspberry Pi Zero 2W has 512MB of memory, though the available space is
closer to 400MB. My testing showed that different runs require varying
amounts of memory, so even a 400MB system with no swap will sometimes
complete the test successfully.
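To make the arithmetic concrete, here is a rough sketch that converts a require.memory style value to bytes. The helper name mem_to_bytes is made up for illustration, and it assumes the K, M and G suffixes are powers of 1024, which may not match how ATF parses the value in every case:

```shell
# Convert a size such as "256M" to bytes (assumed: K/M/G are powers of 1024).
mem_to_bytes()
{
	value=$1
	number=${value%[KMG]}	# strip one trailing suffix letter, if present
	case $value in
	*K) echo $((number * 1024)) ;;
	*M) echo $((number * 1024 * 1024)) ;;
	*G) echo $((number * 1024 * 1024 * 1024)) ;;
	*)  echo "$number" ;;
	esac
}

mem_to_bytes 256M
```

On this reading, 256M comes to 268435456 bytes, which is well over half of the roughly 400MB available on the Pi Zero 2W.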
Earlier versions of the t_exhaust.c code had a lower memory
requirement. I found this 2011-era code archived:
http://web.mit.edu/freebsd/head/contrib/netbsd-tests/lib/libc/regex/t_exhaust.c
> atf_tc_set_md_var(tc, "timeout", "600");
> #if defined(__FreeBSD__)
> atf_tc_set_md_var(tc, "require.memory", "64M");
> #else
> atf_tc_set_md_var(tc, "require.memory", "120M");
> #endif
Meanwhile, the FreeBSD version of this test had a "skip i386" switch
added, for similar reasons of running out of memory. See:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237450
and
https://mail-archive.freebsd.org/cgi/getmsg.cgi?fetch=8900+0+archive/2021/freebsd-testing/20211203.freebsd-testing
My test summary (architecture/memory):
i386       3GB    - 100% pass
pi4:arm64  8GB    - 100% pass
pi3:arm64  1GB    -  90% pass
pi0:arm32  512MB  -  10% pass (no swap)
pi0:arm32  512MB  -  90% pass (small swap file)
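Pass rates like these can be gathered with a simple repeat loop. The following is only a sketch, not the harness actually used for the numbers above. The name pass_count is hypothetical, and it assumes the exit status of the command reflects the test result:

```shell
# Run a command N times and print "passed/total", counting exit status 0 as a pass.
pass_count()
{
	runs=$1
	shift
	passed=0
	i=0
	while [ "$i" -lt "$runs" ]; do
		if "$@" >/dev/null 2>&1; then
			passed=$((passed + 1))
		fi
		i=$((i + 1))
	done
	echo "$passed/$runs"
}

# Hypothetical usage, run from /usr/tests:
# pass_count 10 atf-run lib/libc/regex/t_exhaust
```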
In my view, intermittent failures of this test are okay given the
variable memory requirements. I would rather see what happens when
memory is exhausted for one process than not look. I don't think the
test iterations/size should be altered given the history of test runs
with these set conditions.
Given the large capacity but poor write performance of SD cards, it
makes sense not to put a swap file or temp space on SD storage.
I would suggest adding comments to the test code with real-world
examples of run time and space requirements for some common
small-footprint systems.
I would be okay skipping this test on systems with no swap (yes, some
valid systems might be ignored but this would prevent false negatives
of a sort). I can't find a good C example, so here is a shell
equivalent:
skip_if_noswap()
{
	# Assumes swapctl -l exits non-zero when no swap devices are configured.
	if ! /sbin/swapctl -l >/dev/null 2>&1
	then
		atf_skip "current platform shows no swap configured"
	fi
}
Copyright © 1994-2023
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.