NetBSD Problem Report #55333

From martin@aprisoft.de  Mon Jun  1 11:53:24 2020
Return-Path: <martin@aprisoft.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id AE0931A9218
	for <gnats-bugs@gnats.NetBSD.org>; Mon,  1 Jun 2020 11:53:24 +0000 (UTC)
Message-Id: <20200601115315.97DAA5CC864@emmas.aprisoft.de>
Date: Mon,  1 Jun 2020 13:53:15 +0200 (CEST)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: KASSERT "vrefcnt(vp) > 0"
X-Send-Pr-Version: 3.95

>Number:         55333
>Category:       kern
>Synopsis:       KASSERT "vrefcnt(vp) > 0"
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    ad
>State:          analyzed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Jun 01 11:55:00 +0000 2020
>Closed-Date:    
>Last-Modified:  Thu Jun 11 20:45:01 +0000 2020
>Originator:     Martin Husemann
>Release:        NetBSD 9.99.64
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD seven-days-to-the-wolves.aprisoft.de 9.99.64 NetBSD 9.99.64 (GENERIC) #396: Sun May 31 11:06:59 CEST 2020 martin@seven-days-to-the-wolves.aprisoft.de:/work/src/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:

I am trying to set up a debug environment with a read-only root mount and a
tmpfs union above it. /etc/fstab looks like this:

--8<--
# NetBSD /etc/fstab
# See /usr/share/examples/fstab/ for more examples.
/dev/wd0n		/		ffs	ro		 1 1
tmpfs			/		tmpfs	rw,union,-sram%50
/dev/wd0b		none		swap	sw,dp		 0 0
kernfs			/kern		kernfs	rw
ptyfs			/dev/pts	ptyfs	rw
procfs			/proc		procfs	rw
-->8--

This did not work. The system came up without the tmpfs and gave no clear
diagnostic:

Mon Jun  1 11:42:24 UTC 2020
Not checking /: fs_passno = 0 in /etc/fstab
swapctl: setting dump device to /dev/wd0b
swapctl: adding /dev/wd0b as swap device at priority 0
Starting file system checks:
rm: named: Read-only file system
...

So I tried to mount it manually and it crashed:

# mount -a
[  91.8985291] panic: kernel diagnostic assertion "vrefcnt(vp) > 0" failed: file "../../../../kern/vfs_vnode.c", line 981 
[  92.0285217] cpu1: Begin traceback...
[  92.0685219] vpanic() at netbsd:vpanic+0x152
[  92.1185223] __x86_indirect_thunk_rax() at netbsd:__x86_indirect_thunk_rax
[  92.1985229] vref() at netbsd:vref+0x8d
[  92.2485234] lookup_once() at netbsd:lookup_once+0x16a
[  92.3085237] namei_tryemulroot() at netbsd:namei_tryemulroot+0xae4
[  92.3785241] namei() at netbsd:namei+0x29
[  92.4285245] fd_nameiat.isra.2() at netbsd:fd_nameiat.isra.2+0x54
[  92.4985250] do_sys_statat() at netbsd:do_sys_statat+0x87
[  92.5585254] sys___lstat50() at netbsd:sys___lstat50+0x25
[  92.6285259] syscall() at netbsd:syscallKernel lock error: _kernel_lock,244: spinout

[  92.7185265] lock address : 0xffffffff81485540 type     :               spin
[  92.7985270] initialized  : 0xffffffff80da8409
[  92.8585274] shared holds :                  0 exclusive:                  0
[  92.9385281] shares wanted:                  0 exclusive:                  0
[  93.0185287] relevant cpu :                  3 last held:                  0
[  93.1085293] relevant lwp : 0xffff8342ec1b4a00 last held: 000000000000000000
[  93.1885299] last locked  : 0xffffffff80c260d3 unlocked*: 0xffffffff80c260e7
[  93.2685304] +0x283curcpu holds :                  0 wanted by: 0xffff8342ec1b4a00


[  93.3685311] --- syscall (number 441) ---
[  93.4085314] netbsd:syscall+0x283:
[  93.4485316] cpu1: End traceback...
[  93.4885320] fatal breakpoint trap in supervisor mode
[  93.5485324] trap type 1 code 0 rip 0xffffffff80221975 cs 0x8 rflags 0x202 cr2 0x700245bea430 ilevel 0 rsp 0xffff8401523f7af0
[  93.6885334] curlwp 0xffff8342ec6608c0 pid 300.300 lowest kstack 0xffff8401523f42c0
Stopped in pid 300.300 (mount) at       netbsd:breakpoint+0x5:  leave
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x152
__x86_indirect_thunk_rax() at netbsd:__x86_indirect_thunk_rax
vref() at netbsd:vref+0x8d
lookup_once() at netbsd:lookup_once+0x16a
namei_tryemulroot() at netbsd:namei_tryemulroot+0xae4
namei() at netbsd:namei+0x29
fd_nameiat.isra.2() at netbsd:fd_nameiat.isra.2+0x54
do_sys_statat() at netbsd:do_sys_statat+0x87
sys___lstat50() at netbsd:sys___lstat50+0x25
syscall() at netbsd:syscall+0x283
--- syscall (number 441) ---
netbsd:syscall+0x283:
ds          7b00
es          7ab0
fs          7af0
gs          10
rdi         0
rsi         2d5
rbp         ffff8401523f7af0
rbx         104
rdx         1



>How-To-Repeat:
See above.

>Fix:
n/a

>Release-Note:

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/55333: KASSERT "vrefcnt(vp) > 0"
Date: Mon, 1 Jun 2020 14:01:16 +0200

 On Mon, Jun 01, 2020 at 11:55:01AM +0000, martin@NetBSD.org wrote:
 > This did not work, system came up w/o the tmpfs and no clear diagnostic:

 OK, it clearly helps to read the fine man page and not project one's wishes
 into the "union" magic flag.

 But the crash remains.

 Martin

Responsible-Changed-From-To: kern-bug-people->ad
Responsible-Changed-By: ad@NetBSD.org
Responsible-Changed-When: Mon, 01 Jun 2020 22:54:07 +0000
Responsible-Changed-Why:
I'll take a look


State-Changed-From-To: open->analyzed
State-Changed-By: ad@NetBSD.org
State-Changed-When: Thu, 11 Jun 2020 20:13:20 +0000
State-Changed-Why:
only occurs when there's a union on rootvnode


From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/55333 (KASSERT "vrefcnt(vp) > 0")
Date: Thu, 11 Jun 2020 20:34:08 +0000

 There are two bugs here by the looks of it and they're both a bit absurd:

 - When the tmpfs is mounted, mp->mnt_vnodecovered gets set = rootvnode,
   which is perfect, except that it's set on the actual root file system and
   not the newly mounted fs.

 - vp is NULL and there was no fault taken when it was dereferenced.
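
The two observations above can be sketched with a small userspace model.
This is not the kernel code: the structures are cut down to the fields
needed here, v_isroot and mnt_union are hypothetical stand-ins for the
VV_ROOT and MNT_UNION flags, and union_fallback(), mount_union_intended()
and mount_union_reported() are names made up for illustration. It assumes
that the union lookup falls back from the root of the upper file system to
mp->mnt_vnodecovered, and that the NULL vp in the second observation is that
pointer, left unset on the new mount. That connection is a reading of the
report, not something it confirms.

```c
#include <assert.h>
#include <stddef.h>

struct mount;

/* Cut-down vnode: only what this model needs. */
struct vnode {
	int		v_usecount;	/* reference count, as reported by vrefcnt() */
	int		v_isroot;	/* hypothetical stand-in for VV_ROOT */
	struct mount	*v_mount;	/* file system this vnode belongs to */
};

/* Cut-down mount: only what this model needs. */
struct mount {
	int		mnt_union;		/* hypothetical stand-in for MNT_UNION */
	struct vnode	*mnt_vnodecovered;	/* vnode this mount sits on top of */
};

/* Intended bookkeeping: the new mount records the vnode it covers. */
void
mount_union_intended(struct mount *newmp, struct vnode *covered)
{
	newmp->mnt_union = 1;
	newmp->mnt_vnodecovered = covered;
}

/*
 * Bookkeeping as described in the first observation: the covered vnode
 * ends up on the root file system's mount, so the new mount has none.
 */
void
mount_union_reported(struct mount *rootmp, struct mount *newmp,
    struct vnode *covered)
{
	newmp->mnt_union = 1;
	rootmp->mnt_vnodecovered = covered;
}

/*
 * Model of the fallback step: a name was not found in the root directory
 * of a union mount, so the search continues in the covered vnode.
 * Returns the vnode to search next, or NULL if there is no fallback.
 */
struct vnode *
union_fallback(struct vnode *searchdir)
{
	struct vnode *lower;

	if (!searchdir->v_isroot || !searchdir->v_mount->mnt_union)
		return NULL;

	lower = searchdir->v_mount->mnt_vnodecovered;
	if (lower == NULL)	/* defensive check added for this model only */
		return NULL;

	assert(lower->v_usecount > 0);	/* models KASSERT(vrefcnt(vp) > 0) */
	lower->v_usecount++;		/* models vref() */
	return lower;
}
```

With mount_union_reported() the tmpfs mount has no covered vnode, so
union_fallback() on the tmpfs root returns NULL in this model; without the
defensive check it would hand a NULL pointer to the reference-count
assertion. With mount_union_intended() the fallback returns rootvnode and
takes a reference on it.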

From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/55333 (KASSERT "vrefcnt(vp) > 0")
Date: Thu, 11 Jun 2020 20:44:48 +0000

 Also, in the UNION readdir case, fp->f_data is swapped out to the root vnode
 of the mounted-on FS halfway through the readdir, which is ... not good.
 Need to think about that one.

>Unformatted:
