NetBSD Problem Report #57286

From www@netbsd.org  Fri Mar 24 00:09:26 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 378441A9239
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 24 Mar 2023 00:09:26 +0000 (UTC)
Message-Id: <20230324000924.C381B1A923C@mollari.NetBSD.org>
Date: Fri, 24 Mar 2023 00:09:24 +0000 (UTC)
From: jspath55@gmail.com
Reply-To: jspath55@gmail.com
To: gnats-bugs@NetBSD.org
Subject: Unit test fs/tmpfs/t_vnode_leak fails in ATF Tests suite
X-Send-Pr-Version: www-1.0

>Number:         57286
>Category:       misc
>Synopsis:       Unit test fs/tmpfs/t_vnode_leak fails in ATF Tests suite
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    misc-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Mar 24 00:10:00 +0000 2023
>Last-Modified:  Sun May 14 01:45:01 +0000 2023
>Originator:     Jim Spath
>Release:        10.0_BETA
>Organization:
>Environment:
NetBSD am.d64 10.0_BETA NetBSD 10.0_BETA (GENERIC) #0: Sun Feb 12 12:39:37 UTC 2023  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64

>Description:
When I ran the ATF test suite, one error was in the fs/tmpfs/t_vnode_leak section, which showed:

tc-so:Lowering kern.maxvnodes to 2000
tc-so:Executing command [ sysctl -w kern.maxvnodes=2000 ]
tc-se:Fail: incorrect exit status: 1, expected: 0

Running the same command standalone shows a failure:


~ sysctl -w kern.maxvnodes=2000
sysctl: kern.maxvnodes: Device busy


>How-To-Repeat:
The same failure appears on other architectures such as arm64, but did not occur on an i386 install.





>Fix:
Unknown.

>Audit-Trail:
From: Jim Spath <jspath55@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: misc/57286
Date: Thu, 6 Apr 2023 19:53:46 -0400

 I ran additional shell commands to determine why this occurs:

 # sysctl -w kern.maxvnodes=2000
 sysctl: kern.maxvnodes: Device busy

 On one system the current maxvnode count is:
 # sysctl -n kern.maxvnodes
 402626

 Higher values than 2,000 succeed:

 # sysctl -w kern.maxvnodes=402625
 kern.maxvnodes: 402626 -> 402625
 # sysctl -w kern.maxvnodes=100000
 kern.maxvnodes: 402625 -> 100000
 # sysctl -w kern.maxvnodes=50000
 kern.maxvnodes: 100000 -> 50000
 # sysctl -w kern.maxvnodes=10000
 kern.maxvnodes: 50000 -> 10000

 On this system the lowest accepted value lies between 4,000 and 4,500:

 # sysctl -w kern.maxvnodes=3000
 sysctl: kern.maxvnodes: Device busy
 # sysctl -w kern.maxvnodes=4000
 sysctl: kern.maxvnodes: Device busy
 # sysctl -w kern.maxvnodes=4500
 kern.maxvnodes: 5000 -> 4500

 Could the test determine a valid number of nodes before trying to
 lower the value?
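
One way a test might probe for an acceptable value is sketched below.
This is a hedged illustration, not what t_vnode_leak does: the helper
name lower_maxvnodes, the step of 500 and the ceiling argument are all
assumptions made for the example. Because the number of busy vnodes
changes over time, a value accepted by one call may be refused by the
next.

```shell
# Hypothetical helper (not part of the NetBSD test suite).
# Try to lower kern.maxvnodes to $1. If the kernel refuses, retry with
# progressively larger targets (step 500, an arbitrary choice) until
# the target would exceed the ceiling given in $2.
# Prints the value that was accepted; returns 1 if none was.
lower_maxvnodes() {
    target=$1
    ceiling=$2
    while [ "$target" -le "$ceiling" ]; do
        if sysctl -w kern.maxvnodes="$target" >/dev/null 2>&1; then
            echo "$target"
            return 0
        fi
        target=$((target + 500))
    done
    return 1
}
```

For example, "lower_maxvnodes 2000 10000" would try 2000, 2500, 3000
and so on, stopping at the first value the kernel accepts.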

From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: misc/57286
Date: Fri, 7 Apr 2023 10:27:14 -0000 (UTC)

 jspath55@gmail.com (Jim Spath) writes:

 > # sysctl -w kern.maxvnodes=2000
 > sysctl: kern.maxvnodes: Device busy

         if (numvnodes >= desiredvnodes)
                 return EBUSY;

 where 'desiredvnodes' is your wanted target number and 'numvnodes'
 is the actual number of vnodes in memory.

 > Could the test determine a valid number of nodes before trying to
 > lower the value?

 There is no fixed lower limit. When you reduce the number, the
 system attempts _once_ to free enough vnodes. If that succeeds,
 it's fine. But since it can only free unreferenced vnodes, it might
 fail if your target is lower than the number of referenced (busy)
 vnodes and you will get an EBUSY error. In that case the limit is
 not changed (not even to the value that could be achieved).
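
The behaviour described above can be restated as a toy model. This is
only a sketch for illustration, not kernel code: the function name
model_set_maxvnodes and its three arguments are assumptions, and the
single reclaim pass is represented simply by the number of vnodes
still in memory once it has finished.

```shell
# Toy model only; not kernel code. Arguments:
#   $1  current limit
#   $2  vnodes still in memory after the one reclaim pass
#   $3  requested new limit
# Prints the resulting limit; returns 1 to stand in for EBUSY.
model_set_maxvnodes() {
    limit=$1
    remaining=$2
    wanted=$3
    # Same comparison as the quoted check: numvnodes >= desiredvnodes.
    if [ "$remaining" -ge "$wanted" ]; then
        echo "$limit"       # refused: the limit stays as it was
        return 1
    fi
    echo "$wanted"          # accepted: the limit becomes the request
    return 0
}
```

With a made-up figure of 4200 vnodes remaining, a request for 4000 is
refused and leaves the limit untouched, while a request for 4500 is
accepted.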

From: Jim Spath <jspath55@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: misc/57286
Date: Fri, 7 Apr 2023 11:05:23 -0400

 On Fri, Apr 7, 2023 at 6:30 AM Michael van Elst <mlelstv@serpens.de> wrote:
 > " it might fail if "

 Thank you for reviewing the results and explaining the logic. It seems
 I have done as much as possible from a user regression test
 perspective. Test failures in this case do not imply a system code
 issue.

 Whether this test should be updated or skipped, I do not know. I will
 continue with other tests meanwhile.

 Jim

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: misc/57286
Date: Sun, 14 May 2023 01:42:53 +0000

 On Fri, Apr 07, 2023 at 03:10:02PM +0000, Jim Spath wrote:
  >  > " it might fail if "
  >  
  >  Thank you for reviewing the results and explaining the logic. It seems
  >  I have done as much as possible from a user regression test
  >  perspective. Test failures in this case do not imply a system code
  >  issue.
  >  
  >  Whether this test should be updated or skipped, I do not know. I will
  >  continue with other tests meanwhile.

 Well, the other question is whether it's actually leaking vnodes.
 Insisting that there are ~4000 busy seems slightly suspect, but it
 depends of course on what else is running.

 It looks like this test does not use rump, which is a little
 surprising...

 -- 
 David A. Holland
 dholland@netbsd.org