NetBSD Problem Report #39307
From cube@leia.cubidou.net Thu Aug 7 07:20:33 2008
Return-Path: <cube@leia.cubidou.net>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by narn.NetBSD.org (Postfix) with ESMTP id C1B3863BB81
for <gnats-bugs@gnats.NetBSD.org>; Thu, 7 Aug 2008 07:20:33 +0000 (UTC)
Message-Id: <20080807070702.5C27A67EBE@leia.cubidou.net>
Date: Thu, 7 Aug 2008 09:07:02 +0200 (CEST)
From: cube@cubidou.net
Reply-To: cube@cubidou.net
To: gnats-bugs@gnats.NetBSD.org
Subject: mfs will sometimes panic at umount time
X-Send-Pr-Version: 3.95
>Number: 39307
>Category: kern
>Synopsis: mfs will sometimes panic at umount time
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Aug 07 07:22:43 +0000 2008
>Closed-Date: Fri Sep 26 08:00:31 +0000 2008
>Last-Modified: Fri Sep 26 08:00:31 +0000 2008
>Originator: Quentin Garnier
>Release: NetBSD 4.99.72
>Organization:
>Environment:
System: NetBSD leia.cubidou.net 4.99.72 NetBSD 4.99.72 (LEIA) #1: Sat Aug 2 21:01:58 CEST 2008 cube@leia.cubidou.net:/home/cube/src/build/obj/home/cube/src/src/sys/arch/i386/compile/LEIA i386
Architecture: i386
Machine: i386
>Description:
Sometimes at shutdown I get a uvm_fault in the mount_mfs process,
in VFS_START. mount_mfs spends most of its life in
VFS_START/mfs_start.
What happens is that there is one call too many to vfs_destroy()
somewhere in the system, which means that when mfs_start() returns,
the struct mount pointer that VFS_START got at mount time is no
longer valid.
Note that I don't know whether it doesn't always happen because the
extra vfs_destroy() call doesn't always happen, or because of pure luck
and a pointer that still points to mapped data. I think the former,
because I recall kmem_free going through careful steps to make freed
data obviously freed, but I might be wrong.
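A minimal userland sketch of the failure mode described above, not the kernel code: a reference-counted object is destroyed when its count reaches zero, so a single unmatched release invalidates the pointer that another holder still expects to use. The names model_mount, model_destroy and model_still_valid are hypothetical, and destruction is recorded in a flag rather than by freeing memory so that the effect can be observed safely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted mount structure. */
struct model_mount {
	int  refcnt;     /* outstanding references */
	bool destroyed;  /* set where a real kernel would free the memory */
};

/* Drop one reference; mark the object destroyed when none remain. */
static void
model_destroy(struct model_mount *mp)
{
	if (--mp->refcnt > 0)
		return;
	mp->destroyed = true;
}

/*
 * Start with `holds` references and apply `releases` release calls
 * while one holder is still using the object.
 * Returns true if that holder could still safely use it.
 */
static bool
model_still_valid(int holds, int releases)
{
	struct model_mount m = { .refcnt = holds, .destroyed = false };

	assert(holds > 0 && releases >= 0);
	for (int i = 0; i < releases; i++)
		model_destroy(&m);
	return !m.destroyed;
}
```

With two holders, one of which is still using the object, a single release is expected and model_still_valid(2, 1) returns true. A second, stray release makes model_still_valid(2, 2) return false while that holder is still relying on the pointer.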
>How-To-Repeat:
Use mfs, duh. In my experience, actually using /tmp for more than the
X11 socket seems to help reproduce the issue, for some reason.
>Fix:
Unknown.
>Release-Note:
>Audit-Trail:
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org,
tsutsui@ceres.dti.ne.jp
Subject: Re: kern/39307: mfs will sometimes panic at umount time
Date: Sat, 13 Sep 2008 02:02:26 +0900
cube@cubidou.net wrote:
> >Synopsis: mfs will sometimes panic at umount time
I see the following reproducible panics on cobalt at shutdown:
---
NetBSD 4.99.72 (GENERIC) #0: Wed Sep 10 04:48:22 PDT 2008
builds@wb35:/home/builds/ab/HEAD/cobalt/200809100002Z-obj/home/builds/ab/HEAD/src/sys/arch/cobalt/compile/GENERIC
:
trap: TLB miss (load or instr. fetch) in kernel mode
status=0xff03, cause=0x8, epc=0x802f9228, vaddr=0xc6474034
pid=125 cmd=mount_mfs usp=0x7fffca00 ksp=0xc6481d10
Stopped in pid 125.1 (mount_mfs) at netbsd:atomic_dec_uint_nv+0x18: lw s0,0(s2)
db> tr
atomic_dec_uint_nv+18 (c6474034,83c13ab4,0,0) ra 80255d34 sz 32
vfs_destroy+18 (c6474034,83c13ab4,0,0) ra 8025cfe8 sz 24
do_sys_mount+818 (83c57800,83c13ab4,0,7fffef94) ra 8025d02c sz 256
sys___mount50+3c (83c57800,83c13ab4,c6481f68,7fffef94) ra 802a2050 sz 48
syscall_plain+130 (83c57800,83c13ab4,c6481f68,7fffef94) ra 8029bbac sz 80
mips3_SystemCall+bc (83c57800,83c13ab4,c6481f68,7fffef94) ra 7de3d7b0 sz 0
PC 0x7de3d7b0: not in kernel space
0+7de3d7b0 (83c57800,83c13ab4,c6481f68,7fffef94) ra 0 sz 0
User-level: pid 125.1
db>
---
Izumi Tsutsui
From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/39307 CVS commit: src/sys/kern
Date: Wed, 24 Sep 2008 09:33:41 +0000 (UTC)
Module Name: src
Committed By: ad
Date: Wed Sep 24 09:33:41 UTC 2008
Modified Files:
src/sys/kern: vfs_subr.c
Log Message:
PR kern/39307 mfs will sometimes panic at umount time
In vfs_destroy, assert that the refcount is not dropping below zero.
To generate a diff of this commit:
cvs rdiff -r1.356 -r1.357 src/sys/kern/vfs_subr.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
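The check described in this log message can be illustrated with a small userland sketch using C11 atomics. It is not the committed diff: sketch_obj, sketch_init and sketch_release are hypothetical names, and where kernel code would trip a diagnostic assertion the sketch returns SKETCH_UNDERFLOW so that the condition can be exercised without aborting.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical reference-counted object. */
struct sketch_obj {
	atomic_uint refcnt;
};

enum sketch_result {
	SKETCH_STILL_REFERENCED,  /* other references remain */
	SKETCH_LAST_REFERENCE,    /* caller dropped the final reference */
	SKETCH_UNDERFLOW          /* release without a matching reference */
};

/* An object starts life with at least one reference. */
static void
sketch_init(struct sketch_obj *o, unsigned int initial)
{
	assert(initial > 0);
	atomic_init(&o->refcnt, initial);
}

/*
 * Drop one reference, refusing to take the count below zero.
 * The compare-and-swap loop only ever decrements a non-zero count.
 */
static enum sketch_result
sketch_release(struct sketch_obj *o)
{
	unsigned int old = atomic_load(&o->refcnt);

	do {
		if (old == 0)
			return SKETCH_UNDERFLOW;
	} while (!atomic_compare_exchange_weak(&o->refcnt, &old, old - 1));

	return old == 1 ? SKETCH_LAST_REFERENCE : SKETCH_STILL_REFERENCED;
}
```

Starting from two references, the first two calls to sketch_release() succeed and the third reports SKETCH_UNDERFLOW, which is the kind of over-release this PR is about.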
From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/39307 CVS commit: src/sys/kern
Date: Wed, 24 Sep 2008 09:44:09 +0000 (UTC)
Module Name: src
Committed By: ad
Date: Wed Sep 24 09:44:09 UTC 2008
Modified Files:
src/sys/kern: vfs_syscalls.c
Log Message:
PR kern/39307 mfs will sometimes panic at umount time
Don't drop reference to the mount if VFS_START() fails - that's for unmount
to do.
To generate a diff of this commit:
cvs rdiff -r1.371 -r1.372 src/sys/kern/vfs_syscalls.c
State-Changed-From-To: open->feedback
State-Changed-By: ad@NetBSD.org
State-Changed-When: Wed, 24 Sep 2008 09:46:18 +0000
State-Changed-Why:
Should be fixed - please verify.
From: Quentin Garnier <cube@cubidou.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/39307 (mfs will sometimes panic at umount time)
Date: Wed, 24 Sep 2008 16:00:21 +0200
On Wed, Sep 24, 2008 at 09:46:20AM +0000, ad@NetBSD.org wrote:
> Synopsis: mfs will sometimes panic at umount time
>
> State-Changed-From-To: open->feedback
> State-Changed-By: ad@NetBSD.org
> State-Changed-When: Wed, 24 Sep 2008 09:46:18 +0000
> State-Changed-Why:
> Should be fixed - please verify.
It made the failure different this morning. However, it was a quick
test so I might have left a few asserts of my own in that kernel. I'll
have more time to test tonight.
Looking at your changes, however, I don't see how they could prevent the
panic. What I think is happening is a race between umount(2) and
mount_mfs(8).
The former will do its job and the necessary references are released.
That will signal mount_mfs(8) somehow, which is still in VFS_START at
that point. It will go ahead with dounmount(), which will do the final
call to vfs_destroy(), which in turn destroys the struct mount.
When mfs_start() returns to VFS_START, the pointer to the struct mount
is dereferenced and that's where it crashes.
It works when mount_mfs(8) gets signaled and gets to run before umount(2)
is finished, so that the end of VFS_START can dereference the struct mount.
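The two orderings described above can be replayed deterministically in a small userland sketch. This is only a model of the suspected race, not kernel code: race_mount, race_release and race_replay are hypothetical names, and the starter_holds_ref option illustrates one common remedy (the blocked caller keeping its own reference), which is an assumption rather than the fix that was committed.

```c
#include <assert.h>
#include <stdbool.h>

enum race_event { EV_UNMOUNT_FINISHES, EV_START_RETURNS };

/* Hypothetical mount object; `freed` stands in for the memory being released. */
struct race_mount {
	int  refcnt;
	bool freed;
};

static void
race_release(struct race_mount *mp)
{
	assert(mp->refcnt > 0);
	if (--mp->refcnt == 0)
		mp->freed = true;
}

/*
 * Replay the two events in the given order.
 * Returns true if the mount is still alive at the moment the start
 * routine returns and its caller dereferences the pointer.
 */
static bool
race_replay(const enum race_event order[2], bool starter_holds_ref)
{
	struct race_mount m = { .refcnt = 1, .freed = false };
	bool alive_at_return = false;

	if (starter_holds_ref)
		m.refcnt++;             /* taken before blocking in start */

	for (int i = 0; i < 2; i++) {
		if (order[i] == EV_UNMOUNT_FINISHES) {
			race_release(&m);   /* final release by unmount */
		} else {
			alive_at_return = !m.freed;
			if (starter_holds_ref)
				race_release(&m);
		}
	}
	return alive_at_return;
}
```

In this model the order "start returns, then unmount finishes" is always safe, while "unmount finishes, then start returns" is only safe when the blocked caller holds its own reference.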
-- 
Quentin Garnier - cube@cubidou.net - cube@NetBSD.org
"See the look on my face from staying too long in one place
[...] every time the morning breaks I know I'm closer to falling"
KT Tunstall, Saving My Face, Drastic Fantastic, 2007.
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, netbsd-bugs@NetBSD.org, gnats-admin@NetBSD.org,
ad@NetBSD.org, cube@cubidou.net, tsutsui@ceres.dti.ne.jp
Subject: Re: kern/39307 (mfs will sometimes panic at umount time)
Date: Thu, 25 Sep 2008 00:32:43 +0900
> Synopsis: mfs will sometimes panic at umount time
:
> Should be fixed - please verify.
A cobalt kernel (in restorecd, which uses mfs for many dirs)
still fails:
---
NetBSD 4.99.72 (GENERIC) #1: Wed Sep 24 23:50:46 JST 2008
tsutsui@mirage:/home/tsutsui/cobalt/restorecd/usr/src/sys/arch/cobalt/compile/obj.cobalt/GENERIC
Cobalt Qube 2
total memory = 65536 KB
avail memory = 59624 KB
mainbus0 (root)
com0 at mainbus0 addr 0x1c800000 level 3: st16650a, working fifo
com0: console
cpu0 at mainbus0: QED RM5200 CPU (0x28a0) Rev. 10.0 with built-in FPU Rev. 10.0
cpu0: 32KB/32B 2-way set-associative L1 Instruction cache, 48 TLB entries
cpu0: 32KB/32B 2-way set-associative write-back L1 Data cache
:
Sep 24 23:39:27 client reboot: rebooted by root
Sep 24 23:39:27 client syslogd: Exiting on signal 15
trap: TLB miss (load or instr. fetch) in kernel mode
status=0xff03, cause=0x8, epc=0x802f99f8, vaddr=0xc6474034
pid=125 cmd=mount_mfs usp=0x7fffca00 ksp=0xc6481d10
Stopped in pid 125.1 (mount_mfs) at netbsd:atomic_dec_uint_nv+0x18: lw s0,0(s2)
db> tr
atomic_dec_uint_nv+18 (c6474034,83c21ab4,0,0) ra 80256184 sz 32
vfs_destroy+18 (c6474034,83c21ab4,0,0) ra 8025d3ec sz 24
do_sys_mount+814 (83c57800,83c21ab4,0,7fffef94) ra 8025d430 sz 256
sys___mount50+3c (83c57800,83c21ab4,c6481f68,7fffef94) ra 802a24a0 sz 48
syscall_plain+130 (83c57800,83c21ab4,c6481f68,7fffef94) ra 8029bffc sz 80
mips3_SystemCall+bc (83c57800,83c21ab4,c6481f68,7fffef94) ra 7de3d7b0 sz 0
PC 0x7de3d7b0: not in kernel space
0+7de3d7b0 (83c57800,83c21ab4,c6481f68,7fffef94) ra 0 sz 0
User-level: pid 125.1
db>
---
Note that running 'umount -a' before reboot(8) (though it fails to
umount mfs due to device busy) seems to prevent the panic.
i386 GENERIC kernel (which is a cdroot server of cobalt restorecd,
also uses mfs heavily) also fails:
---
NetBSD 4.99.72 (GENERIC) #0: Wed Sep 24 23:59:51 JST 2008
tsutsui@mirage:/home/tsutsui/cobalt/restorecd/usr/src/sys/arch/i386/compile/obj.i386/GENERIC
total memory = 767 MB
avail memory = 742 MB
VIA Technologies, Inc. VT8363 ( )
mainbus0 (root)
cpu0 at mainbus0: AMD 686-class, 1300MHz, id 0x671
acpi0 at mainbus0: Intel ACPICA 20080321
:
bootserver# reboot
Sep 24 15:18:15 uvm_fault(0xcadb98f0, 0, 1) -> 0xe
fatal page faultbootserver reboo in supervisor mode
trap type 6 code 0 eip c047c6e9 cs 8 eflags 10206 cr2 8 ilevel 0
kernel: supervisor trap page fault, code=0
Stopped in pid 60.1 (mount_mfs) at netbsd:bt_rembusy+0x9: movl 0x8(%edx),%ecx
db{0}> tr
bt_rembusy(c1ad2800,c1ad2800,cc29dca0,cc1c8000,0,1,cc1ebb8c,c0472fe1,c1ad2800,cc1c8000) at netbsd:bt_rembusy+0x9
vmem_xfree(c1ad2800,cc1c8000,910,c04ac8b6,c0b0c260,cc1c8000,cc1ebbac,c04b21ca,cc1c8000,910) at netbsd:vmem_xfree+0x46
kmem_free(cc1c8000,910,cc286b90,1286a20,cc286a20,cc286a20,cc1ebcbc,c04b8fec,cc1c8000,0) at netbsd:kmem_free+0x21
vfs_destroy(cc1c8000,0,0,cc1ebce0,0,0,cc1ebbfc,cc1ebbf8,1da6f00,c1b9f300) at netbsd:vfs_destroy+0x7a
do_sys_mount(cc29dca0,0,8050db7,bfbfff96,5c,bfbfecc0,0,78,cc1ebd28,cc29dca0) at netbsd:do_sys_mount+0x9ac
sys___mount50(cc29dca0,cc1ebd00,cc1ebd28,8050db7,bfbfff96,5c,bfbfecc0,78,64,8051040) at netbsd:sys___mount50+0x49
syscall(cc1ebd48,b3,ab,bfbf001f,bbbc001f,0,bfbfeec4,bfbfee48,0,bfbfecc0) at netbsd:syscall+0xa0
db{0}>
---
Izumi Tsutsui
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, netbsd-bugs@NetBSD.org, gnats-admin@NetBSD.org,
ad@NetBSD.org, cube@cubidou.net, tsutsui@ceres.dti.ne.jp
Subject: Re: kern/39307 (mfs will sometimes panic at umount time)
Date: Thu, 25 Sep 2008 01:12:22 +0900
> i386 GENERIC kernel (which is a cdroot server of cobalt restorecd,
> also uses mfs heavily) also fails:
GENERIC+DIAGNOSTIC kernel says:
---
bootserver# reboot
Sep 24 16:08:06 bootserver reboopanic: kernel diagnostic assertion "mp->mnt_refcnt == 0" failed: file "/home/tsutsui/cobalt/restorecd/usr/src/sys/kern/vfs_subr.c", line 293
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c056b66c cs 8 eflags 246 cr2 cbc92109 ilevel 0
Stopped in pid 60.1 (mount_mfs) at netbsd:breakpoint+0x4: popl %ebp
db{0}> tr
breakpoint(c0a961cf,cc2f2b68,c0ac0140,c0498c15,1,cc1cc004,1,c03ffd96,0,0) at netbsd:breakpoint+0x4
panic(c0aa40a4,c0a0b690,c0a560c5,c0a567c4,125,1,cc2f2b9c,c04e789a,c0a0b690,c0a567c4) at netbsd:panic+0x1b8
__kernassert(c0a0b690,c0a567c4,125,c0a560c5,cc3667f8,cc3667f8,cc2f2cac,c04ef319,cc257000,0) at netbsd:__kernassert+0x39
vfs_destroy(cc257000,0,0,cc2f2cd0,0,0,0,cc2f2be8,1000000,c1c0b000) at netbsd:vfs_destroy+0xaa
do_sys_mount(cc37dca0,0,8050db7,bfbfff96,5c,bfbfecc0,0,78,cc2f2d28,c0abb250) at netbsd:do_sys_mount+0x9d9
sys___mount50(cc37dca0,cc2f2d00,cc2f2d28,bbb44a70,bbb44000,cae398f0,1,8050db7,bfbfff96,5c) at netbsd:sys___mount50+0x49
syscall(cc2f2d48,b3,ab,bfbf001f,bbbc001f,0,bfbfeec4,bfbfee48,0,bfbfecc0) at netbsd:syscall+0xab
db{0}>
---
Izumi Tsutsui
From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/39307 CVS commit: src/sys
Date: Thu, 25 Sep 2008 14:17:29 +0000 (UTC)
Module Name: src
Committed By: ad
Date: Thu Sep 25 14:17:29 UTC 2008
Modified Files:
src/sys/fs/puffs: puffs_msgif.c
src/sys/kern: vfs_syscalls.c
Log Message:
PR kern/39307 (mfs will sometimes panic at umount time)
Change dounmount() so that it never drops the caller provided reference.
Garbage collecting 'struct mount' is up to the caller.
To generate a diff of this commit:
cvs rdiff -r1.71 -r1.72 src/sys/fs/puffs/puffs_msgif.c
cvs rdiff -r1.373 -r1.374 src/sys/kern/vfs_syscalls.c
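The ownership rule in this log message can be sketched in userland as follows. The names gc_mount, gc_release and gc_unmount are hypothetical, and the assumption that the mounted state itself accounts for one reference is made for illustration only; the actual change is in the diffs listed above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mount object; `destroyed` stands in for freeing it. */
struct gc_mount {
	int  refcnt;
	bool mounted;
	bool destroyed;
};

/* Drop one reference; the last release garbage collects the object. */
static void
gc_release(struct gc_mount *mp)
{
	assert(mp->refcnt > 0);
	if (--mp->refcnt == 0)
		mp->destroyed = true;
}

/*
 * Unmount: drops only the reference that belongs to the mounted state
 * (an assumption of this sketch). The caller's own reference is left
 * alone, so the pointer stays valid until the caller releases it.
 */
static int
gc_unmount(struct gc_mount *mp)
{
	if (!mp->mounted)
		return -1;
	mp->mounted = false;
	gc_release(mp);
	return 0;
}
```

A caller that holds its own reference (a count of two while mounted) can call gc_unmount(), still inspect the object afterwards, and then call gc_release() itself, at which point the object is destroyed.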
State-Changed-From-To: feedback->closed
State-Changed-By: ad@NetBSD.org
State-Changed-When: Fri, 26 Sep 2008 08:00:31 +0000
State-Changed-Why:
confirmed fixed
>Unformatted: