NetBSD Problem Report #39307

From cube@leia.cubidou.net  Thu Aug  7 07:20:33 2008
Return-Path: <cube@leia.cubidou.net>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id C1B3863BB81
	for <gnats-bugs@gnats.NetBSD.org>; Thu,  7 Aug 2008 07:20:33 +0000 (UTC)
Message-Id: <20080807070702.5C27A67EBE@leia.cubidou.net>
Date: Thu,  7 Aug 2008 09:07:02 +0200 (CEST)
From: cube@cubidou.net
Reply-To: cube@cubidou.net
To: gnats-bugs@gnats.NetBSD.org
Subject: mfs will sometimes panic at umount time
X-Send-Pr-Version: 3.95

>Number:         39307
>Category:       kern
>Synopsis:       mfs will sometimes panic at umount time
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Aug 07 07:22:43 +0000 2008
>Closed-Date:    Fri Sep 26 08:00:31 +0000 2008
>Last-Modified:  Fri Sep 26 08:00:31 +0000 2008
>Originator:     Quentin Garnier
>Release:        NetBSD 4.99.72
>Organization:
>Environment:
System: NetBSD leia.cubidou.net 4.99.72 NetBSD 4.99.72 (LEIA) #1: Sat Aug 2 21:01:58 CEST 2008 cube@leia.cubidou.net:/home/cube/src/build/obj/home/cube/src/src/sys/arch/i386/compile/LEIA i386
Architecture: i386
Machine: i386
>Description:
	Sometimes at shutdown I get a uvm_fault in the mount_mfs process,
	in VFS_START.  mount_mfs spends most of its life in
	VFS_START/mfs_start.

	What happens is that vfs_destroy() is called once too often in
	the system, which means that when mfs_start() returns, the
	struct mount pointer that VFS_START got at mount time is no
	longer valid.

	I don't know whether the panic is intermittent because the extra
	vfs_destroy() call doesn't always happen, or because by pure luck
	the stale pointer sometimes still points to mapped data.  I suspect
	the former, because I recall kmem_free going through careful steps
	to make freed data obviously freed, but I might be wrong.
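
The reference-counting failure described above can be illustrated with a minimal userland sketch. It is not NetBSD code: `toy_mount`, `toy_release()` and `stale_after_releases()` are hypothetical names, and the object is only flagged as destroyed rather than actually freed, so the example stays well-defined.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct mount: only the fields the sketch needs. */
struct toy_mount {
	unsigned refcnt;	/* number of outstanding references */
	bool destroyed;		/* set where the kernel would free the structure */
};

/* Drop one reference; the last one destroys the object. */
static void
toy_release(struct toy_mount *mp)
{
	if (--mp->refcnt == 0)
		mp->destroyed = true;
}

/*
 * Two users, A and B, each own one reference.  B releases `b_releases`
 * times while A is still using the object.  Returns true if A is left
 * holding a pointer to a destroyed object.
 */
bool
stale_after_releases(unsigned b_releases)
{
	struct toy_mount m = { .refcnt = 2, .destroyed = false };

	/* Stop once the object is destroyed so the counter never wraps. */
	for (unsigned i = 0; i < b_releases && !m.destroyed; i++)
		toy_release(&m);
	return m.destroyed;
}
```

With a single release from B the object survives. With two, A is left with a stale pointer, which is the position VFS_START is in when mfs_start() returns.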
>How-To-Repeat:
	Use mfs, duh.  In my experience, actually using /tmp for more than the
	X11 socket seems to help reproduce the issue, for some reason.
>Fix:
	Unknown.

>Release-Note:

>Audit-Trail:
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org,
        tsutsui@ceres.dti.ne.jp
Subject: Re: kern/39307: mfs will sometimes panic at umount time
Date: Sat, 13 Sep 2008 02:02:26 +0900

 cube@cubidou.net wrote:

 > >Synopsis:       mfs will sometimes panic at umount time

 I see the following reproducible panics on cobalt at shutdown:

 ---
 NetBSD 4.99.72 (GENERIC) #0: Wed Sep 10 04:48:22 PDT 2008
         builds@wb35:/home/builds/ab/HEAD/cobalt/200809100002Z-obj/home/builds/ab/HEAD/src/sys/arch/cobalt/compile/GENERIC

  :

 trap: TLB miss (load or instr. fetch) in kernel mode
 status=0xff03, cause=0x8, epc=0x802f9228, vaddr=0xc6474034
 pid=125 cmd=mount_mfs usp=0x7fffca00 ksp=0xc6481d10
 Stopped in pid 125.1 (mount_mfs) at     netbsd:atomic_dec_uint_nv+0x18: lw      s0,0(s2)
 db> tr
 atomic_dec_uint_nv+18 (c6474034,83c13ab4,0,0) ra 80255d34 sz 32
 vfs_destroy+18 (c6474034,83c13ab4,0,0) ra 8025cfe8 sz 24
 do_sys_mount+818 (83c57800,83c13ab4,0,7fffef94) ra 8025d02c sz 256
 sys___mount50+3c (83c57800,83c13ab4,c6481f68,7fffef94) ra 802a2050 sz 48
 syscall_plain+130 (83c57800,83c13ab4,c6481f68,7fffef94) ra 8029bbac sz 80
 mips3_SystemCall+bc (83c57800,83c13ab4,c6481f68,7fffef94) ra 7de3d7b0 sz 0
 PC 0x7de3d7b0: not in kernel space
 0+7de3d7b0 (83c57800,83c13ab4,c6481f68,7fffef94) ra 0 sz 0
 User-level: pid 125.1
 db> 

 ---
 Izumi Tsutsui

From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/39307 CVS commit: src/sys/kern
Date: Wed, 24 Sep 2008 09:33:41 +0000 (UTC)

 Module Name:	src
 Committed By:	ad
 Date:		Wed Sep 24 09:33:41 UTC 2008

 Modified Files:
 	src/sys/kern: vfs_subr.c

 Log Message:
 PR kern/39307 mfs will sometimes panic at umount time

 In vfs_destroy, assert that the refcount is not dropping below zero.
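
A rough sketch of what such a check guards against, assuming an unsigned counter (the traces earlier in this report show vfs_destroy() calling atomic_dec_uint_nv()). The names `toy_drop()` and `enum drop_result` are made up for illustration, the counter here is not atomic, and where the kernel would assert, this version returns an error code so the behaviour can be observed.

```c
#include <assert.h>

enum drop_result {
	DROP_STILL_REFERENCED,	/* other references remain */
	DROP_LAST_REFERENCE,	/* caller may destroy the object */
	DROP_UNDERFLOW		/* count was already zero: one release too many */
};

/*
 * Drop one reference from *refcnt.  Decrementing an unsigned zero would
 * wrap around to UINT_MAX and look like a very large positive count,
 * so the zero case is checked before the decrement.
 */
enum drop_result
toy_drop(unsigned *refcnt)
{
	if (*refcnt == 0)
		return DROP_UNDERFLOW;
	if (--*refcnt > 0)
		return DROP_STILL_REFERENCED;
	return DROP_LAST_REFERENCE;
}
```

Starting from a count of 2, the first call reports that references remain, the second reports the last reference, and a third call is flagged as an underflow instead of silently wrapping.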


 To generate a diff of this commit:
 cvs rdiff -r1.356 -r1.357 src/sys/kern/vfs_subr.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/39307 CVS commit: src/sys/kern
Date: Wed, 24 Sep 2008 09:44:09 +0000 (UTC)

 Module Name:	src
 Committed By:	ad
 Date:		Wed Sep 24 09:44:09 UTC 2008

 Modified Files:
 	src/sys/kern: vfs_syscalls.c

 Log Message:
 PR kern/39307 mfs will sometimes panic at umount time

 Don't drop reference to the mount if VFS_START() fails - that's for unmount
 to do.


 To generate a diff of this commit:
 cvs rdiff -r1.371 -r1.372 src/sys/kern/vfs_syscalls.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->feedback
State-Changed-By: ad@NetBSD.org
State-Changed-When: Wed, 24 Sep 2008 09:46:18 +0000
State-Changed-Why:
Should be fixed - please verify.


From: Quentin Garnier <cube@cubidou.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/39307 (mfs will sometimes panic at umount time)
Date: Wed, 24 Sep 2008 16:00:21 +0200

 On Wed, Sep 24, 2008 at 09:46:20AM +0000, ad@NetBSD.org wrote:
 > Synopsis: mfs will sometimes panic at umount time
 >
 > State-Changed-From-To: open->feedback
 > State-Changed-By: ad@NetBSD.org
 > State-Changed-When: Wed, 24 Sep 2008 09:46:18 +0000
 > State-Changed-Why:
 > Should be fixed - please verify.

 It made the failure different this morning.  However, it was a quick
 test so I might have left a few asserts of my own in that kernel.  I'll
 have more time to test tonight.

 Looking at your changes, however, I don't see how they could prevent the
 panic.  What I think is happening is a race between umount(2) and
 mount_mfs(8).

 The former will do its job and the necessary references are released.
 That will signal mount_mfs(8) somehow, which is still in VFS_START at
 that point.  It will go ahead with dounmount() which will do the final
 call to vfs_destroy(), which in turn destroys the struct mount.

 When mfs_start() returns to VFS_START, the pointer to the struct mount
 is dereferenced and that's where it crashes.

 It works when mount_mfs(8) gets signaled and gets to run before umount(2)
 is finished, so that the end of VFS_START can dereference the struct mount.
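
The two outcomes described above can be modelled as two orderings of the same pair of events. This is a deterministic toy model, not a reproduction of the kernel race: `enum step` and `run_interleaving()` are hypothetical, and the struct mount is reduced to a single validity flag.

```c
#include <assert.h>
#include <stdbool.h>

/* The two events whose relative order decides the outcome. */
enum step {
	UNMOUNT_DESTROYS_MOUNT,		/* final vfs_destroy() on the unmount side */
	START_TAIL_TOUCHES_MOUNT	/* end of VFS_START dereferences the mount */
};

/*
 * Replay the two events in the given order.  Returns true if the tail
 * of VFS_START saw a mount that was still valid.
 */
bool
run_interleaving(const enum step order[2])
{
	bool mount_valid = true;
	bool tail_ok = false;

	for (int i = 0; i < 2; i++) {
		switch (order[i]) {
		case UNMOUNT_DESTROYS_MOUNT:
			mount_valid = false;
			break;
		case START_TAIL_TOUCHES_MOUNT:
			tail_ok = mount_valid;
			break;
		}
	}
	return tail_ok;
}
```

Only the ordering in which the tail of VFS_START runs first returns true. The other ordering corresponds to the crash.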

 -- 
 Quentin Garnier - cube@cubidou.net - cube@NetBSD.org
 "See the look on my face from staying too long in one place
 [...] every time the morning breaks I know I'm closer to falling"
 KT Tunstall, Saving My Face, Drastic Fantastic, 2007.

From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, netbsd-bugs@NetBSD.org, gnats-admin@NetBSD.org,
        ad@NetBSD.org, cube@cubidou.net, tsutsui@ceres.dti.ne.jp
Subject: Re: kern/39307 (mfs will sometimes panic at umount time)
Date: Thu, 25 Sep 2008 00:32:43 +0900

 > Synopsis: mfs will sometimes panic at umount time
  :
 > Should be fixed - please verify.

 A cobalt kernel (in restorecd, which uses mfs for many dirs)
 still fails:
 ---
 NetBSD 4.99.72 (GENERIC) #1: Wed Sep 24 23:50:46 JST 2008
         tsutsui@mirage:/home/tsutsui/cobalt/restorecd/usr/src/sys/arch/cobalt/compile/obj.cobalt/GENERIC
 Cobalt Qube 2
 total memory = 65536 KB
 avail memory = 59624 KB
 mainbus0 (root)
 com0 at mainbus0 addr 0x1c800000 level 3: st16650a, working fifo
 com0: console
 cpu0 at mainbus0: QED RM5200 CPU (0x28a0) Rev. 10.0 with built-in FPU Rev. 10.0
 cpu0: 32KB/32B 2-way set-associative L1 Instruction cache, 48 TLB entries
 cpu0: 32KB/32B 2-way set-associative write-back L1 Data cache

  :

 Sep 24 23:39:27 client reboot: rebooted by root
 Sep 24 23:39:27 client syslogd: Exiting on signal 15
 trap: TLB miss (load or instr. fetch) in kernel mode
 status=0xff03, cause=0x8, epc=0x802f99f8, vaddr=0xc6474034
 pid=125 cmd=mount_mfs usp=0x7fffca00 ksp=0xc6481d10
 Stopped in pid 125.1 (mount_mfs) at     netbsd:atomic_dec_uint_nv+0x18: lw      s0,0(s2)
 db> tr
 atomic_dec_uint_nv+18 (c6474034,83c21ab4,0,0) ra 80256184 sz 32
 vfs_destroy+18 (c6474034,83c21ab4,0,0) ra 8025d3ec sz 24
 do_sys_mount+814 (83c57800,83c21ab4,0,7fffef94) ra 8025d430 sz 256
 sys___mount50+3c (83c57800,83c21ab4,c6481f68,7fffef94) ra 802a24a0 sz 48
 syscall_plain+130 (83c57800,83c21ab4,c6481f68,7fffef94) ra 8029bffc sz 80
 mips3_SystemCall+bc (83c57800,83c21ab4,c6481f68,7fffef94) ra 7de3d7b0 sz 0
 PC 0x7de3d7b0: not in kernel space
 0+7de3d7b0 (83c57800,83c21ab4,c6481f68,7fffef94) ra 0 sz 0
 User-level: pid 125.1
 db> 
 ---
 Note that running 'umount -a' before reboot(8) seems to prevent the
 panic, even though it fails to unmount mfs because the device is busy.


 i386 GENERIC kernel (which is a cdroot server of cobalt restorecd,
 also uses mfs heavily) also fails:
 ---
 NetBSD 4.99.72 (GENERIC) #0: Wed Sep 24 23:59:51 JST 2008
         tsutsui@mirage:/home/tsutsui/cobalt/restorecd/usr/src/sys/arch/i386/compile/obj.i386/GENERIC
 total memory = 767 MB
 avail memory = 742 MB
 VIA Technologies, Inc. VT8363 ( )
 mainbus0 (root)
 cpu0 at mainbus0: AMD 686-class, 1300MHz, id 0x671
 acpi0 at mainbus0: Intel ACPICA 20080321

  :

 bootserver# reboot
 Sep 24 15:18:15 uvm_fault(0xcadb98f0, 0, 1) -> 0xe
 fatal page faultbootserver reboo in supervisor mode
 trap type 6 code 0 eip c047c6e9 cs 8 eflags 10206 cr2 8 ilevel 0
 kernel: supervisor trap page fault, code=0
 Stopped in pid 60.1 (mount_mfs) at      netbsd:bt_rembusy+0x9:  movl    0x8(%edx),%ecx
 db{0}> tr
 bt_rembusy(c1ad2800,c1ad2800,cc29dca0,cc1c8000,0,1,cc1ebb8c,c0472fe1,c1ad2800,cc1c8000) at netbsd:bt_rembusy+0x9
 vmem_xfree(c1ad2800,cc1c8000,910,c04ac8b6,c0b0c260,cc1c8000,cc1ebbac,c04b21ca,cc1c8000,910) at netbsd:vmem_xfree+0x46
 kmem_free(cc1c8000,910,cc286b90,1286a20,cc286a20,cc286a20,cc1ebcbc,c04b8fec,cc1c8000,0) at netbsd:kmem_free+0x21
 vfs_destroy(cc1c8000,0,0,cc1ebce0,0,0,cc1ebbfc,cc1ebbf8,1da6f00,c1b9f300) at netbsd:vfs_destroy+0x7a
 do_sys_mount(cc29dca0,0,8050db7,bfbfff96,5c,bfbfecc0,0,78,cc1ebd28,cc29dca0) at netbsd:do_sys_mount+0x9ac
 sys___mount50(cc29dca0,cc1ebd00,cc1ebd28,8050db7,bfbfff96,5c,bfbfecc0,78,64,8051040) at netbsd:sys___mount50+0x49
 syscall(cc1ebd48,b3,ab,bfbf001f,bbbc001f,0,bfbfeec4,bfbfee48,0,bfbfecc0) at netbsd:syscall+0xa0
 db{0}> 
 ---
 Izumi Tsutsui

From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, netbsd-bugs@NetBSD.org, gnats-admin@NetBSD.org,
        ad@NetBSD.org, cube@cubidou.net, tsutsui@ceres.dti.ne.jp
Subject: Re: kern/39307 (mfs will sometimes panic at umount time)
Date: Thu, 25 Sep 2008 01:12:22 +0900

 > i386 GENERIC kernel (which is a cdroot server of cobalt restorecd,
 > also uses mfs heavily) also fails:

 GENERIC+DIAGNOSTIC kernel says:
 ---
 bootserver# reboot
 Sep 24 16:08:06 bootserver reboopanic: kernel diagnostic assertion "mp->mnt_refcnt == 0" failed: file "/home/tsutsui/cobalt/restorecd/usr/src/sys/kern/vfs_subr.c", line 293
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 eip c056b66c cs 8 eflags 246 cr2 cbc92109 ilevel 0
 Stopped in pid 60.1 (mount_mfs) at      netbsd:breakpoint+0x4:  popl    %ebp
 db{0}> tr
 breakpoint(c0a961cf,cc2f2b68,c0ac0140,c0498c15,1,cc1cc004,1,c03ffd96,0,0) at netbsd:breakpoint+0x4
 panic(c0aa40a4,c0a0b690,c0a560c5,c0a567c4,125,1,cc2f2b9c,c04e789a,c0a0b690,c0a567c4) at netbsd:panic+0x1b8
 __kernassert(c0a0b690,c0a567c4,125,c0a560c5,cc3667f8,cc3667f8,cc2f2cac,c04ef319,cc257000,0) at netbsd:__kernassert+0x39
 vfs_destroy(cc257000,0,0,cc2f2cd0,0,0,0,cc2f2be8,1000000,c1c0b000) at netbsd:vfs_destroy+0xaa
 do_sys_mount(cc37dca0,0,8050db7,bfbfff96,5c,bfbfecc0,0,78,cc2f2d28,c0abb250) at netbsd:do_sys_mount+0x9d9
 sys___mount50(cc37dca0,cc2f2d00,cc2f2d28,bbb44a70,bbb44000,cae398f0,1,8050db7,bfbfff96,5c) at netbsd:sys___mount50+0x49
 syscall(cc2f2d48,b3,ab,bfbf001f,bbbc001f,0,bfbfeec4,bfbfee48,0,bfbfecc0) at netbsd:syscall+0xab
 db{0}> 

 ---
 Izumi Tsutsui

From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/39307 CVS commit: src/sys
Date: Thu, 25 Sep 2008 14:17:29 +0000 (UTC)

 Module Name:	src
 Committed By:	ad
 Date:		Thu Sep 25 14:17:29 UTC 2008

 Modified Files:
 	src/sys/fs/puffs: puffs_msgif.c
 	src/sys/kern: vfs_syscalls.c

 Log Message:
 PR kern/39307 (mfs will sometimes panic at umount time)

 Change dounmount() so that it never drops the caller provided reference.
 Garbage collecting 'struct mount' is up to the caller.
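
One way to picture this convention, as a sketch only: the callee gives up the reference that belongs to the mounted state, and the caller keeps its own and drops it later. `toy_fs`, `toy_fs_unmount()` and `toy_fs_release()` are hypothetical names, and the assumption that exactly two references exist (one for being mounted, one for the caller) is a simplification, not a statement about the real dounmount().

```c
#include <assert.h>
#include <stdbool.h>

struct toy_fs {
	unsigned refcnt;	/* outstanding references */
	bool mounted;		/* true while the "mounted" reference exists */
	bool destroyed;		/* set where the kernel would free the structure */
};

/* Drop one reference; the last one destroys the object. */
void
toy_fs_release(struct toy_fs *fs)
{
	assert(fs->refcnt > 0);
	if (--fs->refcnt == 0)
		fs->destroyed = true;
}

/*
 * Unmount: gives up only the reference held on behalf of the mounted
 * state.  The caller's reference is left alone, so `fs` stays valid
 * until the caller releases it.  Returns 0 on success, -1 if not mounted.
 */
int
toy_fs_unmount(struct toy_fs *fs)
{
	if (!fs->mounted)
		return -1;
	fs->mounted = false;
	toy_fs_release(fs);	/* the "mounted" reference */
	return 0;
}
```

A caller would follow toy_fs_unmount() with its own toy_fs_release(), so the structure is still valid while the caller inspects the result.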


 To generate a diff of this commit:
 cvs rdiff -r1.71 -r1.72 src/sys/fs/puffs/puffs_msgif.c
 cvs rdiff -r1.373 -r1.374 src/sys/kern/vfs_syscalls.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: feedback->closed
State-Changed-By: ad@NetBSD.org
State-Changed-When: Fri, 26 Sep 2008 08:00:31 +0000
State-Changed-Why:
confirmed fixed


>Unformatted:
