NetBSD Problem Report #57286
From www@netbsd.org Fri Mar 24 00:09:26 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 378441A9239
for <gnats-bugs@gnats.NetBSD.org>; Fri, 24 Mar 2023 00:09:26 +0000 (UTC)
Message-Id: <20230324000924.C381B1A923C@mollari.NetBSD.org>
Date: Fri, 24 Mar 2023 00:09:24 +0000 (UTC)
From: jspath55@gmail.com
Reply-To: jspath55@gmail.com
To: gnats-bugs@NetBSD.org
Subject: Unit test fs/tmpfs/t_vnode_leak fails in ATF Tests suite
X-Send-Pr-Version: www-1.0
>Number: 57286
>Category: misc
>Synopsis: Unit test fs/tmpfs/t_vnode_leak fails in ATF Tests suite
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: misc-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Mar 24 00:10:00 +0000 2023
>Last-Modified: Sun May 14 01:45:01 +0000 2023
>Originator: Jim Spath
>Release: 10.0_BETA
>Organization:
>Environment:
NetBSD am.d64 10.0_BETA NetBSD 10.0_BETA (GENERIC) #0: Sun Feb 12 12:39:37 UTC 2023 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64
>Description:
When I ran the ATF test suite, one error was in the fs/tmpfs/t_vnode_leak section, which showed:
tc-so:Lowering kern.maxvnodes to 2000
tc-so:Executing command [ sysctl -w kern.maxvnodes=2000 ]
tc-se:Fail: incorrect exit status: 1, expected: 0
Running the same command standalone shows a failure:
~ sysctl -w kern.maxvnodes=2000
sysctl: kern.maxvnodes: Device busy
>How-To-Repeat:
The same results appear on other architectures such as arm64, but did not recur on an i386 install.
>Fix:
Unknown.
>Audit-Trail:
From: Jim Spath <jspath55@gmail.com>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: misc/57286
Date: Thu, 6 Apr 2023 19:53:46 -0400
I ran additional shell commands to determine why this occurs:
# sysctl -w kern.maxvnodes=2000
sysctl: kern.maxvnodes: Device busy
On one system the current maxvnode count is:
# sysctl -n kern.maxvnodes
402626
Higher values than 2,000 succeed:
# sysctl -w kern.maxvnodes=402625
kern.maxvnodes: 402626 -> 402625
# sysctl -w kern.maxvnodes=100000
kern.maxvnodes: 402625 -> 100000
# sysctl -w kern.maxvnodes=50000
kern.maxvnodes: 100000 -> 50000
# sysctl -w kern.maxvnodes=10000
kern.maxvnodes: 50000 -> 10000
The lower limit for this system is around 4,000.
# sysctl -w kern.maxvnodes=3000
sysctl: kern.maxvnodes: Device busy
# sysctl -w kern.maxvnodes=4000
sysctl: kern.maxvnodes: Device busy
# sysctl -w kern.maxvnodes=4500
kern.maxvnodes: 5000 -> 4500
Could the test determine a valid number of nodes before trying to
lower the value?
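The manual probing above can be scripted. This is a minimal sketch and not part of the test suite. The function name `probe_maxvnodes` and the `SYSCTL` override variable are hypothetical. It relies only on `sysctl -w kern.maxvnodes=N` exiting non-zero when the kernel refuses the value, as the transcripts show. It needs root and changes the live setting, so the original value should be restored afterwards.

```shell
#!/bin/sh
# Hypothetical helper: walk down a list of candidate limits and report
# the last value the kernel accepted and the first one it refused.
# SYSCTL can be overridden for dry runs; it defaults to sysctl(8).
SYSCTL=${SYSCTL:-sysctl}

probe_maxvnodes() {
	accepted=
	for candidate in "$@"; do
		# Any non-zero exit (for example "Device busy") counts as refused.
		if "$SYSCTL" -w "kern.maxvnodes=${candidate}" >/dev/null 2>&1; then
			accepted=$candidate
		else
			echo "accepted=${accepted:-none} refused=${candidate}"
			return 1
		fi
	done
	echo "accepted=${accepted:-none} refused=none"
	return 0
}
```

For example, `probe_maxvnodes 100000 50000 10000 4500 4000` stops at the first value the kernel refuses. The result is only a snapshot of the system at the time of the run.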
From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: misc/57286
Date: Fri, 7 Apr 2023 10:27:14 -0000 (UTC)
jspath55@gmail.com (Jim Spath) writes:
> # sysctl -w kern.maxvnodes=2000
> sysctl: kern.maxvnodes: Device busy
if (numvnodes >= desiredvnodes)
return EBUSY;
where 'desiredvnodes' is your wanted target number and 'numvnodes'
is the actual number of vnodes in memory.
> Could the test determine a valid number of nodes before trying to
> lower the value?
There is no fixed lower limit. When you reduce the number, the
system attempts _once_ to free enough vnodes. If that succeeds,
it's fine. But since it can only free unreferenced vnodes, it might
fail if your target is lower than the number of referenced (busy)
vnodes and you will get an EBUSY error. In that case the limit is
not changed (not even to the value that could be achieved).
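Given this behavior, one option is for a test to treat a refused change as a reason to skip rather than to fail. A minimal sketch under stated assumptions: `lower_or_skip` and the `SYSCTL` override are hypothetical names, and exit status 77 is an arbitrary "skipped" convention chosen for this sketch, not something ATF defines (an ATF shell test would call `atf_skip` instead). Any non-zero exit from `sysctl -w` is treated as a refusal, since the exit status alone does not distinguish EBUSY from other errors.

```shell
#!/bin/sh
# SYSCTL can be overridden for dry runs; it defaults to sysctl(8).
SYSCTL=${SYSCTL:-sysctl}

# Hypothetical helper: try once to lower kern.maxvnodes to $1.
# Returns 0 if the kernel accepted the value, 77 if it refused.
lower_or_skip() {
	target=$1
	if "$SYSCTL" -w "kern.maxvnodes=${target}" >/dev/null 2>&1; then
		echo "lowered kern.maxvnodes to ${target}"
		return 0
	fi
	# The limit is left unchanged on failure, so nothing needs undoing.
	echo "skipped: could not lower kern.maxvnodes to ${target}"
	return 77
}
```

A caller would run `lower_or_skip 2000` and continue with the rest of the test only when it returns 0.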
From: Jim Spath <jspath55@gmail.com>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: misc/57286
Date: Fri, 7 Apr 2023 11:05:23 -0400
On Fri, Apr 7, 2023 at 6:30 AM Michael van Elst <mlelstv@serpens.de> wrote:
> " it might fail if "
Thank you for reviewing the results and explaining the logic. It seems
I have done as much as possible from a user regression test
perspective. Test failures in this case do not imply a system code
issue.
Whether this test should be updated or skipped, I do not know. I will
continue with other tests meanwhile.
Jim
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: misc/57286
Date: Sun, 14 May 2023 01:42:53 +0000
On Fri, Apr 07, 2023 at 03:10:02PM +0000, Jim Spath wrote:
> > " it might fail if "
>
> Thank you for reviewing the results and explaining the logic. It seems
> I have done as much as possible from a user regression test
> perspective. Test failures in this case do not imply a system code
> issue.
>
> Whether this test should be updated or skipped, I do not know. I will
> continue with other tests meanwhile.
Well, the other question is whether it's actually leaking vnodes.
Insisting that there are ~4000 busy seems slightly suspect, but it
depends of course on what else is running.
It looks like this test does not use rump, which is a little
surprising...
--
David A. Holland
dholland@netbsd.org
Copyright © 1994-2023
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.