NetBSD Problem Report #56239

From gson@gson.org  Mon Jun  7 07:37:58 2021
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id B6E391A921F
	for <gnats-bugs@gnats.NetBSD.org>; Mon,  7 Jun 2021 07:37:58 +0000 (UTC)
Message-Id: <20210607073749.F0FBA2541D3@guava.gson.org>
Date: Mon,  7 Jun 2021 10:37:49 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: lib/libc/regex/t_exhaust:regcomp_too_big test fails
X-Send-Pr-Version: 3.95

>Number:         56239
>Category:       lib
>Synopsis:       lib/libc/regex/t_exhaust:regcomp_too_big test fails
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    lib-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Jun 07 07:40:00 +0000 2021
>Last-Modified:  Tue Feb 28 13:45:01 +0000 2023
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current, source date >= 2021.02.23.22.14.59
>Organization:

>Environment:
System: NetBSD
Architecture: aarch64
Machine: evbarm
>Description:

The regcomp_too_big test case of the lib/libc/regex/t_exhaust test
program is currently failing on multiple testbeds.   It usually fails
like this:

  lib/libc/regex/t_exhaust (221/890): 1 test cases
      regcomp_too_big: [5.293623s] Failed: /tmp/build/2021.02.23.22.14.59-evbarm-aarch64/src/tests/lib/libc/regex/t_exhaust.c:72: p != NULL not met

but occasionally like this:

  lib/libc/regex/t_exhaust (221/891): 1 test cases
      regcomp_too_big: [ 6537.6276466] UVM: pid 17844 (t_exhaust), uid 0 killed: out of swap
  [46.502225s] Failed: Test program received signal 9

and occasionally it passes.

This happens on:

 - the TNF testbed (qemu, 512 MB virtual RAM)
 - my testbed (qemu, 1 GB virtual RAM)
 - martin's testbed (real hardware, 3 GB RAM)

The problem started with this commit:

  2021.02.23.22.14.59 christos src/lib/libc/regex/Attic/cclass.h 1.8
  2021.02.23.22.14.59 christos src/lib/libc/regex/cname.h 1.8
  2021.02.23.22.14.59 christos src/lib/libc/regex/engine.c 1.25
  2021.02.23.22.14.59 christos src/lib/libc/regex/re_format.7 1.13
  2021.02.23.22.14.59 christos src/lib/libc/regex/regcomp.c 1.39
  2021.02.23.22.14.59 christos src/lib/libc/regex/regerror.c 1.24
  2021.02.23.22.14.59 christos src/lib/libc/regex/regex.3 1.27
  2021.02.23.22.14.59 christos src/lib/libc/regex/regex2.h 1.14
  2021.02.23.22.14.59 christos src/lib/libc/regex/regexec.c 1.23
  2021.02.23.22.14.59 christos src/lib/libc/regex/regfree.c 1.16
  2021.02.23.22.14.59 christos src/lib/libc/regex/utils.h 1.7

Logs:

  https://www.gson.org/netbsd/bugs/build/evbarm-aarch64/commits-2021.02.html#2021.02.23.22.14.59

  http://www.netbsd.org/~martin/aarch64-atf/47_atf.html#lib_libc_regex_t_exhaust_regcomp_too_big

Additional logs (less helpful because of unrelated panics):

  http://releng.netbsd.org/b5reports/evbarm-aarch64/commits-2021.02.html#2021.02.23.22.14.59

>How-To-Repeat:

Run the ATF tests on NetBSD/evbarm-aarch64.

>Fix:

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: port-arm-maintainer->lib-bug-people
Responsible-Changed-By: gson@NetBSD.org
Responsible-Changed-When: Tue, 08 Jun 2021 09:35:04 +0000
Responsible-Changed-Why:
Not arm specific after all.


From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-arm/56239: lib/libc/regex/t_exhaust:regcomp_too_big test fails on aarch64
Date: Tue, 8 Jun 2021 12:31:24 +0300

 Looks like the issue is not aarch64 specific after all, at least not
 the "p != NULL" failure mode, because it also happens randomly on amd64.
 To reproduce:

   cd /usr/tests/lib/libc/regex
   while ./t_exhaust regcomp_too_big; do true; done

 On my 512 MB qemu/NVMM amd64 VM, this failed after a few dozen
 iterations.
 -- 
 Andreas Gustafsson, gson@gson.org

From: Andreas Gustafsson <gson@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: christos@NetBSD.org
Subject: Re: lib/56239 (lib/libc/regex/t_exhaust:regcomp_too_big test fails)
Date: Wed, 9 Jun 2021 09:36:43 +0300

 I tried running the test case in a loop:

   cd /usr/tests/lib/libc/regex
   i=0; while ./t_exhaust regcomp_too_big; do i=$(expr $i + 1); echo $i; done

 using sources from immediately before and after the commit identified
 in the original PR, in a qemu amd64 VM with 512 MB RAM.  Results:

   source date 2021.02.23.21.59.31: no failure after more than 300,000 runs

   source date 2021.02.23.22.14.59: failed after 21 runs

 Also, the successful runs took 5-10x longer in the latter case, up from
 about 100 ms per run to 500-1000 ms per run.
 -- 
 Andreas Gustafsson, gson@NetBSD.org
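
The per-run times quoted above can be estimated with a loop like the one below. This is a rough sketch, not the procedure actually used for the figures in this PR: time_runs is a made-up helper name, N must be a positive integer, and the sketch assumes a date(1) that supports +%s (as on NetBSD). Since that only has one-second resolution, the average is meaningful only over many runs.

```sh
# time_runs CMD N
# Hypothetical helper: run CMD up to N times, stopping at the first
# failure, and print the approximate average wall-clock time per run.
time_runs() {
    cmd=$1
    n=$2
    i=0
    start=$(date +%s)
    while [ "$i" -lt "$n" ]; do
        # $cmd is unquoted on purpose: CMD may carry arguments.
        if ! $cmd >/dev/null 2>&1; then
            echo "failed after $i successful runs"
            return 1
        fi
        i=$((i + 1))
    done
    end=$(date +%s)
    echo "$n runs, about $(( (end - start) * 1000 / n )) ms per run"
}
```

On a system with the tests installed, it might be used like this:

  cd /usr/tests/lib/libc/regex
  time_runs "./t_exhaust regcomp_too_big" 100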

From: Tom Lane <tgl@sss.pgh.pa.us>
To: gnats-bugs@netbsd.org
Cc: christos@NetBSD.org
Subject: Re: lib/56239 (lib/libc/regex/t_exhaust:regcomp_too_big test fails)
Date: Sun, 05 Jun 2022 18:59:05 -0400

 I have found that regcomp_too_big also fails for me on an HPPA machine,
 using HEAD/202206030100Z sources.  The symptoms are a bit different:

 tc-start: 1654467008.118185, regcomp_too_big
 tc-se:Test program crashed; attempting to get stack trace
 tc-se:[New process 11163]
 tc-se:Core was generated by `t_exhaust'.
 tc-se:Program terminated with signal SIGSEGV, Segmentation fault.
 tc-se:#0  0xaf85bea4 in wgetnext (p=0xb0001e88) at /home/tgl/netbsd-H-202206030100Z/usr/src/lib/libc/regex/regcomp.c:1658
 tc-se:#0  0xaf85bea4 in wgetnext (p=0xb0001e88) at /home/tgl/netbsd-H-202206030100Z/usr/src/lib/libc/regex/regcomp.c:1658
 tc-se:#1  0xaf85f19c in p_ere_exp (p=0xb0001e88, bc=<optimized out>) at /home/tgl/netbsd-H-202206030100Z/usr/src/lib/libc/regex/regcomp.c:527
 tc-se:#2  0xaf85ccf8 in p_re (p=0xb0001e88, end1=41, end2=-130) at /home/tgl/netbsd-H-202206030100Z/usr/src/lib/libc/regex/regcomp.c:853
 tc-se:#3  0xaf85f06c in p_ere_exp (p=0xb0001e88, bc=<optimized out>) at /home/tgl/netbsd-H-202206030100Z/usr/src/lib/libc/regex/regcomp.c:476
 tc-se:#4  0xaf85ccf8 in p_re (p=0xb0001e88, end1=41, end2=-130) at /home/tgl/netbsd-H-202206030100Z/usr/src/lib/libc/regex/regcomp.c:853
 tc-se:#5  0xaf85f06c in p_ere_exp (p=0xb0001e88, bc=<optimized out>) at /home/tgl/netbsd-H-202206030100Z/usr/src/lib/libc/regex/regcomp.c:476
 tc-se:#6  0xaf85ccf8 in p_re (p=0xb0001e88, end1=41, end2=-130) at /home/tgl/netbsd-H-202206030100Z/usr/src/lib/libc/regex/regcomp.c:853
 ...

 It looks like an infinite recursion, but it turns out not to be:
 if I increase the ulimit -s setting from the platform's default 2048
 to 4096, the test passes!

 This may or may not be directly related to the issue seen on aarch64.
 It'd be interesting to check if messing with "ulimit -s" changes the
 behavior on other platforms.

 			regards, tom lane
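
One way to carry out the "ulimit -s" experiment suggested above is to run the test case once under each of several stack limits and compare the exit statuses. The following is only a sketch: try_stack_limits is a made-up helper name, the status 125 is this sketch's own marker for a failed ulimit call, and it assumes a shell whose ulimit builtin accepts -S and -s with a size in kilobytes (as sh on NetBSD and bash do).

```sh
# try_stack_limits CMD LIMIT_KB...
# Hypothetical helper: run CMD once under each soft stack limit and
# print its exit status.  A status above 128 normally means CMD was
# killed by a signal (139 = 128 + SIGSEGV on most systems).
try_stack_limits() {
    cmd=$1
    shift
    for kb in "$@"; do
        # Use a subshell so the changed limit does not affect the caller.
        (
            # Exit with 125 if the soft limit cannot be set, for
            # example when it would exceed the hard limit.
            ulimit -S -s "$kb" 2>/dev/null || exit 125
            # $cmd is unquoted on purpose: CMD may carry arguments.
            $cmd >/dev/null 2>&1
        )
        echo "stack=${kb}kB status=$?"
    done
}
```

On a system with the tests installed, it might be used like this (2048 and 4096 are the values mentioned above; 8192 is an arbitrary extra data point):

  cd /usr/tests/lib/libc/regex
  try_stack_limits "./t_exhaust regcomp_too_big" 2048 4096 8192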

From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-arm/56239: lib/libc/regex/t_exhaust:regcomp_too_big test fails on aarch64
Date: Tue, 28 Feb 2023 15:40:22 +0200

 > The regcomp_too_big test case of the lib/libc/regex/t_exhaust test
 > program is currently failing on multiple testbeds.   It usually fails
 > like this:
 > 
 >   lib/libc/regex/t_exhaust (221/890): 1 test cases
 >       regcomp_too_big: [5.293623s] Failed: /tmp/build/2021.02.23.22.14.59-evbarm-aarch64/src/tests/lib/libc/regex/t_exhaust.c:72: p != NULL not met
 > 
 > but occasionally like this:
 > 
 >   lib/libc/regex/t_exhaust (221/891): 1 test cases
 >       regcomp_too_big: [ 6537.6276466] UVM: pid 17844 (t_exhaust), uid 0 killed: out of swap
 >   [46.502225s] Failed: Test program received signal 9

 For the record, according to the logs on lyta.netbsd.org, the "p != NULL"
 failures are no longer happening, presumably fixed by Christos' commit
 of t_exhaust.c on 2021-06-09.

 However, the random "out of swap" failures are still happening.  The
 ATF HTML report from the most recent one is here:

   http://releng.netbsd.org/b5reports/evbarm-aarch64/2023/2023.02.19.21.35.07/test.html#lib_libc_regex_t_exhaust_regcomp_too_big

 The "out of swap" message does not appear in the HTML output, only in
 the console log, which can be found here:

   http://releng.netbsd.org/b5reports/evbarm-aarch64/2023/2023.02.19.21.35.07/test.log

 -- 
 Andreas Gustafsson, gson@gson.org

>Unformatted:
