NetBSD Problem Report #56506
From gson@gson.org Wed Nov 17 20:24:11 2021
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 404811A9239
for <gnats-bugs@gnats.NetBSD.org>; Wed, 17 Nov 2021 20:24:11 +0000 (UTC)
Message-Id: <20211117202356.6853D254286@guava.gson.org>
Date: Wed, 17 Nov 2021 22:23:56 +0200 (EET)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: sys/rc/t_rc_d_cli tests randomly fail
X-Send-Pr-Version: 3.95
>Number: 56506
>Category: bin
>Synopsis: sys/rc/t_rc_d_cli tests randomly fail
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Nov 17 20:25:01 +0000 2021
>Last-Modified: Sat Feb 26 16:25:01 +0000 2022
>Originator: Andreas Gustafsson
>Release: NetBSD-current
>Organization:
>Environment:
System: NetBSD
Architecture: i386
Machine: i386
>Description:
On one of my testbeds, a physical i386 laptop, various test cases of
the sys/rc/t_rc_d_cli test program fail randomly. The log output from
a typical failure is here:
https://www.gson.org/netbsd/bugs/build/i386-laptop/2021/2021.11.14.18.36.13/test.html#sys_rc_t_rc_d_cli_default_stop_no_args
In this case, the default_restart_no_args test case failed with the
error message "h_simple not running?".
This looks like a race condition in rc.subr, which in some cases
checks whether a service is running by examining the output of ps(1).
When ps runs, the process running a newly started service will have
forked, but it may not yet have completed an exec(), and if so, it
will not show up in the ps output under the expected name.
To test this theory, I modified rc.subr to save the ps output to a
file using tee(1), and found that when the test fails, the ps output
shows a process with the name "(sh)" in place of the expected
"h_simple".
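
The window described above can be observed outside rc.subr. The following
is a minimal sketch, not code from rc.subr or the test suite. It assumes a
POSIX ps(1) that supports "-o comm=" and "-p", and it uses a shell that
lingers before exec'ing sleep(1) as a stand-in for a service that has
forked but not yet exec'd. The helper names comm_of and wait_for_comm are
hypothetical.

```shell
#!/bin/sh
# Hypothetical illustration of the fork/exec window; not part of rc.subr.

# Print the command name ps(1) reports for a pid (empty if the pid is gone).
comm_of() {
	ps -o comm= -p "$1" 2>/dev/null | awk '{ print $1; exit }'
}

# Poll until ps reports the wanted name for the pid, or give up after
# the given number of one-second tries.
wait_for_comm() {
	_pid=$1 _want=$2 _tries=$3
	while [ "$_tries" -gt 0 ]; do
		[ "$(comm_of "$_pid")" = "$_want" ] && return 0
		_tries=$((_tries - 1))
		sleep 1
	done
	return 1
}

# Stand-in for a service: a shell that delays before exec'ing the real program.
sh -c 'sleep 2; exec sleep 30' &
pid=$!

sleep 1                         # let the child shell start; it has not exec'd yet
early=$(comm_of "$pid")         # sampled inside the window

late=""
wait_for_comm "$pid" sleep 10 && late=$(comm_of "$pid")

kill "$pid"
wait "$pid" 2>/dev/null || true # reap the stand-in process

echo "early=$early late=$late"
```

Sampled inside the window, ps should report the shell's name rather than
the name of the program that is eventually exec'd. A check that looks at
ps output only once can therefore conclude that a service is not running
when it is in fact still starting.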
>How-To-Repeat:
cd /usr/tests/sys/rc
while atf-run t_rc_d_cli:default_stop_no_args; do true; done
The :default_stop_no_args part is only supported on -current;
omit it if testing on a release. Repeat on different machines
until you find one that happens to have the right timing for
the test to fail.
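
Because the failure is timing dependent, it can help to know how many
passing runs preceded the first failure and to keep the output of the
failing run. The following is a sketch of one way to do that. The function
name repeat_until_fail and the log file name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly until it fails.
# The output of the most recent run is kept in the file named by $1,
# and the number of runs that passed is printed on stdout.
repeat_until_fail() {
	_log=$1
	shift
	_passed=0
	while "$@" >"$_log" 2>&1; do
		_passed=$((_passed + 1))
	done
	echo "$_passed"
}

# Usage on the affected machine (not run here):
#   cd /usr/tests/sys/rc
#   repeat_until_fail /tmp/t_rc_d_cli.log atf-run t_rc_d_cli:default_stop_no_args
```

When the loop ends, the log file holds the output of the failing run only,
since each run overwrites the previous one.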
>Fix:
>Audit-Trail:
From: "Andreas Gustafsson" <gson@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56506 CVS commit: src/tests/sys/rc
Date: Sat, 26 Feb 2022 16:21:59 +0000
Module Name: src
Committed By: gson
Date: Sat Feb 26 16:21:59 UTC 2022
Modified Files:
src/tests/sys/rc: t_rc_d_cli.sh
Log Message:
Mark randomly failing test cases as expected failures with a reference
to PR bin/56506.
To generate a diff of this commit:
cvs rdiff -u -r1.4 -r1.5 src/tests/sys/rc/t_rc_d_cli.sh
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.