NetBSD Problem Report #48892
From martin@duskware.de Wed Jun 11 08:22:59 2014
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 5B09BA64F0
for <gnats-bugs@gnats.NetBSD.org>; Wed, 11 Jun 2014 08:22:59 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: some tests will not clean up rump server processes
X-Send-Pr-Version: 3.95
>Number: 48892
>Category: bin
>Synopsis: some tests will not clean up rump server processes
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jun 11 08:25:00 +0000 2014
>Closed-Date:
>Last-Modified: Sun Jul 25 01:20:01 +0000 2021
>Originator: Martin Husemann
>Release: NetBSD 6.99.43
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD unpluged.duskware.de 6.99.43 NetBSD 6.99.43 (UNPLUGED) #5: Tue Jun 10 20:13:06 CEST 2014 martin@seven-days-to-the-wolves.aprisoft.de:/usr/src/sys/arch/evbarm/compile/UNPLUGED evbarm
Architecture: earm
Machine: evbarm
>Description:
Some tests, like for example /usr/tests/fs/nfs/t_rquotad, will create
background server processes via rump_server or similar. There seems to be
no proper atf cleanup path used in these tests (or it does not work).
Now consider this fragment of t_rquotad:
#now try a quota(8) call
export RUMPHIJACK='blanket=/mnt,socket=all,path=/rump,vfs=getvfsstat'
for q in ${expect} ; do
local id=$(id -${q})
atf_check -s exit:0 \
-o "match:/mnt 0 10 40960 1 20 51200 $
-o "match:Disk quotas for .*: $" \
quota -${q} -v
done
and note the cleanup code after it:
unset LD_PRELOAD
rump_quota_shutdown
Unfortunately this means that if the check fails (the -o match: disagrees with
the output) the test will abort and rump_quota_shutdown will not be invoked.
Worse: imagine you are using a (limited) tmpfs on /tmp. The leftover
rump_server (or rump_* mounts) will keep the filesystem image open, even
after atf automatically removed the working directory. So your tmpfs runs
full and more tests fail...
>How-To-Repeat:
Currently a diskless system with / on nfs and /tmp on tmpfs is enough to
trigger bogus output from quota(1), so this will trigger the issue. Once
that is fixed: just modify the match expression to mismatch and run the
test, then pgrep for rump_server.
>Fix:
Use atf's explicit cleanup handling to shut down the rump servers.
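A minimal sketch of what such a cleanup part could look like in an atf-sh
test. This is an illustration under assumptions, not the code that was later
committed: the test case name "demo", the file name "sockets.list", and the
helper halt_rump_servers are hypothetical. rump_server(1), rump.halt(1), and
the atf-sh functions are the real tools. The body records each server socket
in the work directory, because the cleanup part runs in a separate process
and cannot see shell variables set by the body.

```shell
# Sketch only: "demo", "sockets.list" and halt_rump_servers are hypothetical.

# Halt every rump server whose URL was recorded in the work directory.
# Always returns 0, so it is safe whether or not the body got that far.
halt_rump_servers()
{
	[ -f sockets.list ] || return 0
	while read -r url; do
		# rump.halt contacts the server named by RUMP_SERVER.
		env RUMP_SERVER="${url}" rump.halt </dev/null 2>/dev/null || true
	done < sockets.list
	return 0
}

# Declare the test case with a cleanup part. The guard only exists so the
# file can also be sourced outside atf-sh, e.g. to try the helper alone.
command -v atf_test_case >/dev/null 2>&1 && atf_test_case demo cleanup

demo_head()
{
	atf_set "descr" "sketch: rump server is halted even if a check fails"
}

demo_body()
{
	url="unix://demo.sock"
	# Record the socket before starting the server, so that cleanup
	# knows about it even if a later step aborts the body.
	echo "${url}" >> sockets.list
	atf_check -s exit:0 rump_server "${url}"
	# Stand-in for a failing atf_check: the body stops here ...
	atf_fail "simulated failing check"
}

# ... but atf still runs the cleanup part afterwards, in the same
# work directory.
demo_cleanup()
{
	halt_rump_servers
}

atf_init_test_cases()
{
	atf_add_test_case demo
}
```

Run under atf-sh, the body fails on purpose, and a subsequent pgrep for
rump_server should come up empty if the cleanup part did its job. Keeping the
helper free of atf_check calls means a server that is already gone does not
turn the cleanup itself into a failure.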
>Release-Note:
>Audit-Trail:
From: "Andreas Gustafsson" <gson@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/48892 CVS commit: src/tests/fs/nfs
Date: Thu, 20 Aug 2020 07:32:40 +0000
Module Name: src
Committed By: gson
Date: Thu Aug 20 07:32:40 UTC 2020
Modified Files:
src/tests/fs/nfs: t_rquotad.sh
Log Message:
Add cleanup of possible leftover rump processes, replacing the
non-working cleanup code just removed from ffs_common.sh. Fixes
PR bin/48892 with respect to the t_rquotad test.
To generate a diff of this commit:
cvs rdiff -u -r1.7 -r1.8 src/tests/fs/nfs/t_rquotad.sh
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Wed, 21 Jul 2021 05:03:30 +0000
State-Changed-Why:
is this fixed?
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/48892 (some tests will not clean up rump server processes)
Date: Wed, 21 Jul 2021 10:22:35 +0200
Not sure how to describe this properly - for the concrete test mentioned
in the PR: yes.
But we have disabled *lots* of tests in the main test runs because they
leave rump_server processes around (and there are likely a few more PRs
for some individual cases).
Not sure it is worth to have this blanket PR open or how to systematically
collect all issues best.
Martin
State-Changed-From-To: feedback->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sat, 24 Jul 2021 03:27:22 +0000
State-Changed-Why:
feed is back
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/48892 (some tests will not clean up rump server processes)
Date: Sat, 24 Jul 2021 03:27:04 +0000
On Wed, Jul 21, 2021 at 08:25:01AM +0000, Martin Husemann wrote:
> Not sure how to describe this properly - for the concrete test mentioned
> in the PR: yes.
>
> But we have disabled *lots* of tests in the main test runs because they
> leave rump_server processes around (and there are likely a few more PRs
> for some individual cases).
>
> Not sure it is worth to have this blanket PR open or how to systematically
> collect all issues best.
Bleh. Ok, I'm going to suggest the following: Someone(TM) gather a
list of tests that are explicitly disabled, and file them in a new PR.
We'll leave this one open until that eventually happens.
We could also file a PR on atf noting that it ought to attend to this
automatically (given that a good chunk of its justification for
existence is that it "sandboxes" and "cleans up after" tests) but I
doubt that will accomplish anything. :-|
--
David A. Holland
dholland@netbsd.org
From: Andreas Gustafsson <gson@gson.org>
To: dholland-bugs@netbsd.org, martin@NetBSD.org
Cc: gnats-bugs@netbsd.org
Subject: Re: bin/48892 (some tests will not clean up rump server processes)
Date: Sat, 24 Jul 2021 16:11:09 +0300
David Holland wrote:
> Bleh. Ok, I'm going to suggest the following: Someone(TM) gather a
> list of tests that are explicitly disabled, and file them in a new PR.
> We'll leave this one open until that eventually happens.
Gathering a list is fine, but I don't see what would be accomplished
by replacing the present PR with a new one reporting the same problem,
other than losing history showing how past instances of the problem
were fixed.
As for the list, I could only find three test cases skipped due to
leftover rump_server processes, all in the same test:
./rump/rumpkern/t_sp.sh:test_case_skip stress_long kern/50350 "leftover rump_server"
./rump/rumpkern/t_sp.sh:test_case_skip stress_killer kern/55356 "leftover rump_server"
./rump/rumpkern/t_sp.sh:test_case_skip reconnect kern/55304 "leftover rump_server"
Perhaps martin can find more?
--
Andreas Gustafsson, gson@gson.org
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/48892 (some tests will not clean up rump server processes)
Date: Sun, 25 Jul 2021 01:18:03 +0000
On Sat, Jul 24, 2021 at 01:15:02PM +0000, Andreas Gustafsson wrote:
> Gathering a list is fine, but I don't see what would be accomplished
> by replacing the present PR with a new one reporting the same problem,
> other than losing history showing how past instances of the problem
> were fixed.
Maybe none, but sometimes it's helpful to not have the first chunk of
the PR make it look like the problem's been fixed.
> As for the list, I could only find three test cases skipped due to
> leftover rump_server processes, all in the same test:
>
> ./rump/rumpkern/t_sp.sh:test_case_skip stress_long kern/50350 "leftover rump_server"
> ./rump/rumpkern/t_sp.sh:test_case_skip stress_killer kern/55356 "leftover rump_server"
> ./rump/rumpkern/t_sp.sh:test_case_skip reconnect kern/55304 "leftover rump_server"
Thanks.
I don't understand how any of these scripts manage to work, but
hopefully someone does.
--
David A. Holland
dholland@netbsd.org
>Unformatted:
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.