NetBSD Problem Report #57291

From www@netbsd.org  Fri Mar 24 21:59:12 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id CCA2B1A9239
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 24 Mar 2023 21:59:11 +0000 (UTC)
Message-Id: <20230324215910.3D5111A923C@mollari.NetBSD.org>
Date: Fri, 24 Mar 2023 21:59:10 +0000 (UTC)
From: jspath55@gmail.com
Reply-To: jspath55@gmail.com
To: gnats-bugs@NetBSD.org
Subject: Unit test for lib/libc/regex/t_exhaust fails in ATF Tests suite with signal 9
X-Send-Pr-Version: www-1.0

>Number:         57291
>Category:       misc
>Synopsis:       Unit test for lib/libc/regex/t_exhaust fails in ATF Tests suite with signal 9
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    misc-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Mar 24 22:00:01 +0000 2023
>Last-Modified:  Tue May 02 18:30:02 +0000 2023
>Originator:     Jim Spath
>Release:        10.0_BETA
>Organization:
>Environment:
NetBSD pi.r.zero 10.0_BETA NetBSD 10.0_BETA (GENERIC) #0: Fri Jan 13 19:15:32 UTC 2023  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/evbarm/compile/GENERIC evbarm

>Description:
Ran tests under /usr/tests and found a failure, limited to Raspberry Pi Zero 2W systems.

$ atf-run lib/libc/regex/t_exhaust
Content-Type: application/X-atf-tps; version="3"

info: atf.version, Automated Testing Framework 0.20 (atf-0.20)
info: tests.root, /usr/tests
info: time.start, Fri Mar 24 21:35:01 UTC 2023
[...]
tps-count: 1
tp-start: 1679693702.112045, lib/libc/regex/t_exhaust, 1
tc-start: 1679693702.112338, regcomp_too_big
tc-end: 1679693715.914422, regcomp_too_big, failed, Test program received signal 9
tp-end: 1679693715.922852, lib/libc/regex/t_exhaust
info: time.end, Fri Mar 24 21:35:15 UTC 2023

>How-To-Repeat:
The error occurs on 2 installations of the evbarm image.

$ file lib/libc/regex/t_exhaust
lib/libc/regex/t_exhaust: ELF 32-bit LSB pie executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /usr/libexec/ld.elf_so, for NetBSD 10.0, compiled for: earmv7hf, not stripped

>Fix:
Unknown.

>Audit-Trail:
From: Jim Spath <jspath55@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: misc/57291
Date: Sun, 26 Mar 2023 09:09:23 -0400

 When I found signal 9 errors on NetBSD 10.0 BETA with an evbarm
 installation, I had not noticed console messages (they were not shown
 in /var/log/messages) about swap. I transcribe them here:

 [ 22555.40164641 UVM: pid 19688 (t_exhaust), uid 0 killed: out of swap
 [ 53830.57815951 UVM: pid 17972 (t_exhaust), uid 1000 killed: out of swap
 [ x656x.93092051 UVM: pid 24498 (t_exhaust), uid 1000 killed: out of swap
 [ 85716.67184551 UVM: pid 28714 (t_exhaust), uid 1000 killed: out of swap

 The install has no swap defined:

 $ swapctl -q
 swapctl: no swap or dump devices in /etc/fstab

 I added a swap file and the test succeeded, once. On a second try,
 though, the test failed again.

 (1)
 bash-5.1$ swapctl -l
 Device             512-blocks     Used    Avail Capacity  Priority
 /var/swapdir/swap1     204800    29128   175672    14%    1
 [...]
 tps-count: 1
 tp-start: 1679782699.209293, lib/libc/regex/t_exhaust, 1
 tc-start: 1679782699.209346, regcomp_too_big
 tc-end: 1679782810.396271, regcomp_too_big, passed
 tp-end: 1679782810.436980, lib/libc/regex/t_exhaust
 info: time.end, Sat Mar 25 22:20:10 UTC 2023

 (2)
 $ swapctl -l
 Device             512-blocks     Used    Avail Capacity  Priority
 /var/swapdir/swap1     204800        0   204800     0%    1

 tps-count: 1
 tp-start: 1679835833.883805, lib/libc/regex/t_exhaust, 1
 tc-start: 1679835833.884209, regcomp_too_big
 tc-end: 1679835858.317031, regcomp_too_big, failed, Test program
 received signal 9
 tp-end: 1679835858.337171, lib/libc/regex/t_exhaust
 info: time.end, Sun Mar 26 13:04:18 UTC 2023

 $ swapctl -l
 Device             512-blocks     Used    Avail Capacity  Priority
 /var/swapdir/swap1     204800    39456   165344    19%    1
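
 The swapctl columns above count 512-byte blocks, so the sizes are
 easier to compare with the test's memory figures once converted. A
 minimal sketch of that arithmetic follows; the helper names are
 hypothetical and are not part of swapctl or the test suite.

```c
#include <assert.h>
#include <stdint.h>

/* swapctl -l reports sizes in 512-byte blocks (see the column header). */
#define SWAP_BLOCK_BYTES 512ULL

/* Hypothetical helper: convert a 512-block count to bytes. */
static uint64_t
blocks_to_bytes(uint64_t blocks)
{
	/* Guard against overflow of the multiplication. */
	assert(blocks <= UINT64_MAX / SWAP_BLOCK_BYTES);
	return blocks * SWAP_BLOCK_BYTES;
}

/* Hypothetical helper: convert a 512-block count to whole MiB, rounded down. */
static uint64_t
blocks_to_mib(uint64_t blocks)
{
	return blocks_to_bytes(blocks) / (1024ULL * 1024ULL);
}
```

 With these helpers, the 204800-block swap file is 104,857,600 bytes
 (100 MiB), and the 39456 blocks in use after the failed run come to
 20,201,472 bytes (about 19 MiB).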

 I will run more tests.

 Jim

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: misc/57291
Date: Sat, 1 Apr 2023 20:13:08 +0000

 On Sun, Mar 26, 2023 at 01:10:01PM +0000, Jim Spath wrote:
  >  When I found signal 9 errors on NetBSD 10.0 BETA with an evbarm
  >  installation, I had not noticed console messages (they were not shown
  >  in /var/log/messages) about swap. I transcribe them here:
  >  
  >  [ 22555.40164641 UVM: pid 19688 (t_exhaust), uid 0 killed: out of swap
  >  [ 53830.57815951 UVM: pid 17972 (t_exhaust), uid 1000 killed: out of swap
  >  [ x656x.93092051 UVM: pid 24498 (t_exhaust), uid 1000 killed: out of swap
  >  [ 85716.67184551 UVM: pid 28714 (t_exhaust), uid 1000 killed: out of swap

 It seems like a bug that these don't make it to /var/log/messages
 under the default syslog config. (Assuming that's what happened. One
 of the things that I've frequently seen in the past when OOMing is
 that syslog goes to log the message, trips on the OOM condition while
 doing so, and also gets killed. But if that had happened you'd also
 see it on the console.)

 Also, it's not clear that we should have tests that are so large they
 won't run on platforms we might reasonably want to test on.

 -- 
 David A. Holland
 dholland@netbsd.org

From: Jim Spath <jspath55@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: misc/57291
Date: Sun, 30 Apr 2023 18:17:16 -0400

 > I had not noticed console messages (they were not shown
 > in /var/log/messages) about swap. I transcribe them here: [...]
 > I will run more tests.

 Looking again, I found the system messages with /sbin/dmesg:

 [ 1498596.274645] UVM: pid 25241 (t_exhaust), uid 1000 killed: out of swap

 I will add this to my local test harness of things to note before/after runs.

From: Jim Spath <jspath55@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: misc/57291
Date: Tue, 2 May 2023 14:26:07 -0400

 > I will run more tests.

 I have run more tests, as promised, and reviewed the code and some history.

 David Holland noted:
 >  it's not clear that we should have tests that are so large they won't run on platforms we might reasonably want to test on.

 The regex exhaust test has hard-coded size limits of 9,999 runs (or
 "REGEX_MAXSIZE" factors), and it declares a 256MB memory requirement
 and a 600-second (10-minute) timeout:

 >        atf_tc_set_md_var(tc, "timeout", "600");
 >        atf_tc_set_md_var(tc, "require.memory", "256M");

 The Raspberry Pi Zero 2 W has 512MB of memory, though the space
 actually available is closer to 400MB. My testing showed that
 different runs require varying amounts of memory, so even a 400MB
 system with no swap will sometimes complete the test successfully.

 Earlier versions of the t_exhaust.c code had a lower memory
 requirement. I found this 2011-era code archived:

 http://web.mit.edu/freebsd/head/contrib/netbsd-tests/lib/libc/regex/t_exhaust.c

 >      atf_tc_set_md_var(tc, "timeout", "600");
 > #if defined(__FreeBSD__)
 >     atf_tc_set_md_var(tc, "require.memory", "64M");
 > #else
 >     atf_tc_set_md_var(tc, "require.memory", "120M");
 > #endif

 Meanwhile, the FreeBSD version of this test had a "skip i386" switch
 added, for similar reasons of running out of memory. See:
 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237450
 and
 https://mail-archive.freebsd.org/cgi/getmsg.cgi?fetch=8900+0+archive/2021/freebsd-testing/20211203.freebsd-testing

 My test summary (architecture/memory):
 i386       3GB    - 100% pass
 pi4:arm64  8GB    - 100% pass
 pi3:arm64  1GB    - 90% pass
 pi0:arm32  512MB  - 10% pass (no swap)
 pi0:arm32  512MB  - 90% pass (small swap file)
 In my view, intermittent failures of this test are okay given the
 variable memory requirements. I would rather see what happens when
 memory is exhausted for one process than not look. I don't think the
 test iterations/size should be altered given the history of test runs
 with these set conditions.

 SD cards have ample capacity but poor write performance, so it makes
 sense not to put a swap file or temp space on SD storage. I would
 suggest adding comments to the test code with real-world examples of
 run time and space requirements for some common small-footprint
 systems.

 I would be okay with skipping this test on systems with no swap (yes,
 some valid systems might be ignored, but this would prevent spurious
 failures of a sort). I can't find a good C example, so here is a shell
 equivalent:

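 skip_if_noswap()
 {
         # Assumption: "swapctl -l" exits non-zero when no swap devices
         # are configured, so skip only when the command fails.
         if ! /sbin/swapctl -l >/dev/null 2>&1
         then
                 atf_skip "current platform shows no swap configured"
         fi
 }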
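
 For comparison, the same check might look roughly like this in C. This
 is only a sketch under stated assumptions: it relies on the NetBSD
 swapctl(2) call with SWAP_NSWAP to count swap devices, the two helper
 names are hypothetical, and on any other platform the count is simply
 reported as unknown (-1) so that the test is not skipped by mistake.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#if defined(__NetBSD__)
#include <unistd.h>
#include <sys/swap.h>
#endif

/*
 * Hypothetical helper: return the number of configured swap devices,
 * or -1 if it cannot be determined.
 */
static int
count_swap_devices(void)
{
#if defined(__NetBSD__)
	/* swapctl(2) with SWAP_NSWAP returns the swap device count. */
	return swapctl(SWAP_NSWAP, NULL, 0);
#else
	/* Not attempted on other platforms in this sketch. */
	return -1;
#endif
}

/*
 * Hypothetical policy helper: skip only when we positively know that
 * no swap is configured; an unknown count (-1) does not cause a skip.
 */
static bool
should_skip_for_swap(int nswap)
{
	assert(nswap >= -1);
	return nswap == 0;
}
```

 In an atf-c test body this could be used at the start of the test as
 if (should_skip_for_swap(count_swap_devices()))
 atf_tc_skip("no swap configured"); whether skipping is the right
 policy is a separate question from how to detect the condition.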

Copyright © 1994-2023 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.