NetBSD Problem Report #48892
From martin@duskware.de Wed Jun 11 08:22:59 2014
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 5B09BA64F0
for <gnats-bugs@gnats.NetBSD.org>; Wed, 11 Jun 2014 08:22:59 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: some tests will not clean up rump server processes
X-Send-Pr-Version: 3.95
>Number: 48892
>Notify-List: riastradh@NetBSD.org
>Category: bin
>Synopsis: some tests will not clean up rump server processes
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jun 11 08:25:00 +0000 2014
>Closed-Date:
>Last-Modified: Sat Apr 26 02:20:03 +0000 2025
>Originator: Martin Husemann
>Release: NetBSD 6.99.43
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD unpluged.duskware.de 6.99.43 NetBSD 6.99.43 (UNPLUGED) #5: Tue Jun 10 20:13:06 CEST 2014 martin@seven-days-to-the-wolves.aprisoft.de:/usr/src/sys/arch/evbarm/compile/UNPLUGED evbarm
Architecture: earm
Machine: evbarm
>Description:
Some tests, like for example /usr/tests/fs/nfs/t_rquotad, will create
background server processes via rump_server or similar. There seems to be
no proper atf cleanup path used in these tests (or it does not work).
Now consider this fragment of t_rquotad:
#now try a quota(8) call
export RUMPHIJACK='blanket=/mnt,socket=all,path=/rump,vfs=getvfsstat'
for q in ${expect} ; do
local id=$(id -${q})
atf_check -s exit:0 \
-o "match:/mnt 0 10 40960 1 20 51200 $" \
-o "match:Disk quotas for .*: $" \
quota -${q} -v
done
and note the cleanup code after it:
unset LD_PRELOAD
rump_quota_shutdown
Unfortunately this means that if the check fails (the -o match: disagrees with
the output) the test will abort and rump_quota_shutdown will not be invoked.
Worse: imagine you are using a (limited) tmpfs on /tmp. The leftover
rump_server (or rump_* mounts) will keep the file system image open, even
after atf has automatically removed the working directory. So your tmpfs
fills up and more tests fail...
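The effect follows from ordinary Unix unlink semantics: removing a name
does not free the storage while some process still holds the file open.
A minimal sketch in plain sh, where the shell itself plays the leftover
server and fs.img is a hypothetical stand-in for the file system image:

```shell
#!/bin/sh
# Sketch: a removed file stays alive while a process holds it open.
workdir=$(mktemp -d)
image="$workdir/fs.img"        # hypothetical stand-in for the file system image
printf 'image data\n' > "$image"

exec 3< "$image"               # this shell plays the leftover server: keep it open
rm -rf "$workdir"              # the work directory is removed afterwards

# The name is gone, but the data is still reachable through descriptor 3.
if [ -e "$image" ]; then name_state=present; else name_state=gone; fi
read -r still_readable <&3
exec 3<&-                      # only after the close can the space be reclaimed

echo "name: $name_state, contents via fd: $still_readable"
```

With a real rump_server the descriptor is only closed when the server
exits, which is why the space stays allocated until the process is gone.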
>How-To-Repeat:
Currently a diskless system with / on nfs and /tmp on tmpfs is enough to
trigger bogus output from quota(1), so this will trigger the issue. Once
that is fixed: just modify the match expression to mismatch and run the
test, then pgrep for rump_server.
>Fix:
Use atf's explicit cleanup handling to shut down the rump servers.
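In atf-sh, a test case declared with "atf_test_case <name> cleanup" gets a
<name>_cleanup routine that is run whether or not the body passed. The
sketch below imitates that split in plain sh so that it runs without atf.
The small driver and the names demo_body, demo_cleanup and marker are
illustrative assumptions, not atf code:

```shell
#!/bin/sh
# Sketch only: a tiny stand-in for the test runner, NOT atf itself.
marker=$(mktemp)               # hypothetical stand-in for a running rump_server

demo_body() {
    false || exit 1            # a failed check aborts the body here ...
    rm -f "$marker"            # ... so this in-body shutdown is never reached
}

demo_cleanup() {
    rm -f "$marker"            # the shutdown lives in the cleanup routine instead
}

# The body runs in its own subshell, as a test case would.
body_status=0
( demo_body ) || body_status=$?
if [ -e "$marker" ]; then after_body=leftover; else after_body=clean; fi

# The runner calls the cleanup routine regardless of body_status.
( demo_cleanup )
if [ -e "$marker" ]; then after_cleanup=leftover; else after_cleanup=clean; fi

echo "body=$body_status after_body=$after_body after_cleanup=$after_cleanup"
```

In a real test the cleanup routine would halt the rump servers instead of
removing a marker file.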
>Release-Note:
>Audit-Trail:
From: "Andreas Gustafsson" <gson@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/48892 CVS commit: src/tests/fs/nfs
Date: Thu, 20 Aug 2020 07:32:40 +0000
Module Name: src
Committed By: gson
Date: Thu Aug 20 07:32:40 UTC 2020
Modified Files:
src/tests/fs/nfs: t_rquotad.sh
Log Message:
Add cleanup of possible leftover rump processes, replacing the
non-working cleanup code just removed from ffs_common.sh. Fixes
PR bin/48892 with respect to the t_rquotad test.
To generate a diff of this commit:
cvs rdiff -u -r1.7 -r1.8 src/tests/fs/nfs/t_rquotad.sh
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Wed, 21 Jul 2021 05:03:30 +0000
State-Changed-Why:
is this fixed?
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/48892 (some tests will not clean up rump server processes)
Date: Wed, 21 Jul 2021 10:22:35 +0200
Not sure how to describe this properly - for the concrete test mentioned
in the PR: yes.
But we have disabled *lots* of tests in the main test runs because they
leave rump_server processes around (and there are likely a few more PRs
for some individual cases).
Not sure it is worth to have this blanket PR open or how to systematically
collect all issues best.
Martin
State-Changed-From-To: feedback->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sat, 24 Jul 2021 03:27:22 +0000
State-Changed-Why:
feed is back
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/48892 (some tests will not clean up rump server processes)
Date: Sat, 24 Jul 2021 03:27:04 +0000
On Wed, Jul 21, 2021 at 08:25:01AM +0000, Martin Husemann wrote:
> Not sure how to describe this properly - for the concrete test mentioned
> in the PR: yes.
>
> But we have disabled *lots* of tests in the main test runs because they
> leave rump_server processes around (and there are likely a few more PRs
> for some individual cases).
>
> Not sure it is worth to have this blanket PR open or how to systematically
> collect all issues best.
Bleh. Ok, I'm going to suggest the following: Someone(TM) gather a
list of tests that are explicitly disabled, and file them in a new PR.
We'll leave this one open until that eventually happens.
We could also file a PR on atf noting that it ought to attend to this
automatically (given that a good chunk of its justification for
existence is that it "sandboxes" and "cleans up after" tests) but I
doubt that will accomplish anything. :-|
--
David A. Holland
dholland@netbsd.org
From: Andreas Gustafsson <gson@gson.org>
To: dholland-bugs@netbsd.org, martin@NetBSD.org
Cc: gnats-bugs@netbsd.org
Subject: Re: bin/48892 (some tests will not clean up rump server processes)
Date: Sat, 24 Jul 2021 16:11:09 +0300
David Holland wrote:
> Bleh. Ok, I'm going to suggest the following: Someone(TM) gather a
> list of tests that are explicitly disabled, and file them in a new PR.
> We'll leave this one open until that eventually happens.
Gathering a list is fine, but I don't see what would be accomplished
by replacing the present PR with a new one reporting the same problem,
other than losing history showing how past instances of the problem
were fixed.
As for the list, I could only find three test cases skipped due to
leftover rump_server processes, all in the same test:
./rump/rumpkern/t_sp.sh:test_case_skip stress_long kern/50350 "leftover rump_server"
./rump/rumpkern/t_sp.sh:test_case_skip stress_killer kern/55356 "leftover rump_server"
./rump/rumpkern/t_sp.sh:test_case_skip reconnect kern/55304 "leftover rump_server"
Perhaps martin can find more?
--
Andreas Gustafsson, gson@gson.org
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/48892 (some tests will not clean up rump server processes)
Date: Sun, 25 Jul 2021 01:18:03 +0000
On Sat, Jul 24, 2021 at 01:15:02PM +0000, Andreas Gustafsson wrote:
> Gathering a list is fine, but I don't see what would be accomplished
> by replacing the present PR with a new one reporting the same problem,
> other than losing history showing how past instances of the problem
> were fixed.
Maybe none, but sometimes it's helpful to not have the first chunk of
the PR make it look like the problem's been fixed.
> As for the list, I could only find three test cases skipped due to
> leftover rump_server processes, all in the same test:
>
> ./rump/rumpkern/t_sp.sh:test_case_skip stress_long kern/50350 "leftover rump_server"
> ./rump/rumpkern/t_sp.sh:test_case_skip stress_killer kern/55356 "leftover rump_server"
> ./rump/rumpkern/t_sp.sh:test_case_skip reconnect kern/55304 "leftover rump_server"
Thanks.
I don't understand how any of these scripts manage to work, but
hopefully someone does.
--
David A. Holland
dholland@netbsd.org
From: Taylor R Campbell <riastradh@NetBSD.org>
To: martin@NetBSD.org
Cc: gson@NetBSD.org, dholland-bugs@NetBSD.org, uwe@NetBSD.org
Subject: Re: bin/48892: some tests will not clean up rump server processes
Date: Sat, 26 Apr 2025 02:16:07 +0000
This is a multi-part message in MIME format.
--=_QJyEfqlAfncoM8fB56hLDQdX5vxg3caH
The attached patch teaches rump_server to respect an environment
variable RUMPDAEMON_KEEPSESSION so that it does not do setsid (just
setting the variable is enough, any value will do).
That way, when atf kills the test process's process group (which it
already does), it will kill any rump_server processes spawned by the
test.
I haven't patched all of the several hundred uses of rump_server
throughout src/tests to set it. But, until that is done, you could
test this patch by running the tests with
RUMPDAEMON_KEEPSESSION= atf-run
and see if the troublesome stress-killers still leave rump_servers
around.
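For illustration, a small sh sketch of why the empty assignment is
sufficient: the variable is defined in the child's environment even
though its value is empty, which is all a getenv() != NULL test looks
at. The probe command is a hypothetical stand-in for the server's check:

```shell
#!/bin/sh
# ${VAR+set} expands to "set" whenever VAR is defined, even if it is empty,
# which mirrors a getenv("VAR") != NULL check in C.
probe='if [ -n "${RUMPDAEMON_KEEPSESSION+set}" ]; then echo defined; else echo undefined; fi'

# Empty assignment on the command line: defined, with an empty value.
with_empty=$(RUMPDAEMON_KEEPSESSION= sh -c "$probe")

# Explicitly unset in a subshell: not present in the child's environment.
without=$(unset RUMPDAEMON_KEEPSESSION; sh -c "$probe")

echo "empty assignment: $with_empty, unset: $without"
```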
Another alternative -- kludgier but perhaps more effective since it
covers more than just rump_server -- would be to LD_PRELOAD a library
that overrides:
pid_t setsid(void) { return getpid(); }
--=_QJyEfqlAfncoM8fB56hLDQdX5vxg3caH
Content-Type: text/plain; charset="ISO-8859-1"; name="pr48892-rumpsuppresssetsid"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="pr48892-rumpsuppresssetsid.patch"
# HG changeset patch
# User Taylor R Campbell <riastradh@NetBSD.org>
# Date 1745629371 0
# Sat Apr 26 01:02:51 2025 +0000
# Branch trunk
# Node ID 0651c236b598cad1b91312f3fce68600bdec9095
# Parent fa66a8de28195fb9d8985a1c997f6a4b3fd8da4d
# EXP-Topic riastradh-pr48892-atfservers
rump: New environment variable RUMPDAEMON_KEEPSESSION.
If defined, the server will remain in the same session and process
group as the caller when it daemonizes.
This way, we can define it during test runs so that all the
rump_server processes are in the same process group as the atf test
itself -- and so even if the cleanups fail, when atf kills the
process group with killpg, the servers should terminate more
reliably.
PR bin/48892: some tests will not clean up rump server processes
diff -r fa66a8de2819 -r 0651c236b598 lib/librumpuser/rumpuser_daemonize.c
--- a/lib/librumpuser/rumpuser_daemonize.c Thu Apr 24 18:37:59 2025 +0000
+++ b/lib/librumpuser/rumpuser_daemonize.c Sat Apr 26 01:02:51 2025 +0000
@@ -112,7 +112,8 @@ rumpuser_daemonize_begin(void)

switch (fork()) {
case 0:
- if (setsid() == -1) {
+ if (getenv("RUMPDAEMON_KEEPSESSION") == NULL &&
+ _setsid() == -1) {
rumpuser_daemonize_done(errno);
}
rv = 0;
diff -r fa66a8de2819 -r 0651c236b598 usr.bin/rump_allserver/rump_allserver.1
--- a/usr.bin/rump_allserver/rump_allserver.1 Thu Apr 24 18:37:59 2025 +0000
+++ b/usr.bin/rump_allserver/rump_allserver.1 Sat Apr 26 01:02:51 2025 +0000
@@ -211,6 +211,19 @@ After use,
.Nm
can be made to exit using
.Xr rump.halt 1 .
+.Sh ENVIRONMENT
+The following environment variables affect
+.Nm rump_server
+and
+.Nm rump_allserver :
+.Bl -tag -width Ev
+.It Ev RUMPDAEMON_KEEPSESSION
+If defined, the server will remain in the same session and process
+group as the caller when it daemonizes.
+By default, when the server daemonizes it will enter a new session and
+process group as with
+.Xr setsid 2 .
+.El
.Sh EXAMPLES
Start a server and load the tmpfs file system module, and halt the
server immediately afterwards:
--=_QJyEfqlAfncoM8fB56hLDQdX5vxg3caH--
>Unformatted:
$NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2025
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.