NetBSD Problem Report #48135

From mlelstv@pussyfoot.1st.de  Sun Aug 18 10:55:16 2013
Return-Path: <mlelstv@pussyfoot.1st.de>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id F1657715F5
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 18 Aug 2013 10:55:15 +0000 (UTC)
Message-Id: <20130818105428.76CF0B13F@pussyfoot.1st.de>
Date: Sun, 18 Aug 2013 12:54:28 +0200 (CEST)
From: mlelstv@serpens.de
Reply-To: mlelstv@serpens.de
To: gnats-bugs@NetBSD.org
Subject: Bad locking for umount
X-Send-Pr-Version: 3.95

>Number:         48135
>Category:       kern
>Synopsis:       Bad locking for umount
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Aug 18 11:00:00 +0000 2013
>Closed-Date:    
>Last-Modified:  Tue Dec 03 10:54:55 +0000 2013
>Originator:     Michael van Elst
>Release:        NetBSD 6.99.23
>Organization:

>Environment:


System: NetBSD pussyfoot 6.99.23 NetBSD 6.99.23 (PUSSYFOOT) #2: Sun Aug 18 12:13:03 CEST 2013 mlelstv@pussyfoot:/home/netbsd-current/obj.amd64/home/netbsd-current/src/sys/arch/amd64/compile/PUSSYFOOT amd64
Architecture: x86_64
Machine: amd64
>Description:
Accessing a filesystem that is being umounted can cause a panic
for a LOCKDEBUG kernel and undefined behaviour otherwise.

There is a race condition between vfs_busy() and vfs_destroy().

When vfs_destroy() decrements the reference count of a mount to
zero, it assumes that no code accesses the mount point.

vfs_busy() starts with the assumption that there is already
a reference to the mount point. However, that reference may go
away at any time.

A LOCKDEBUG kernel panics when vfs_busy() has acquired a mutex
and vfs_destroy() tries to destroy the mutex. Without that check
vfs_destroy() continues to free the data structure that is still
used by vfs_busy().

>How-To-Repeat:
On a LOCKDEBUG system, start automounter with auto_attrcache=1, and
run something like:

while true; do
	ls -ld /amd/watchpoint/.
done

>Fix:

The following diff makes vfs_busy() increment the reference count
to protect the data structure and calls vfs_destroy() in its
failure path to unreference (and possibly free) the data structure.

The last hunk avoids another possible use-after-free case in
vfs_unbusy().

Index: sys/kern/vfs_mount.c
===================================================================
RCS file: /cvsroot/src/sys/kern/vfs_mount.c,v
retrieving revision 1.19
diff -u -r1.19 vfs_mount.c
--- sys/kern/vfs_mount.c	28 Apr 2013 21:34:31 -0000	1.19
+++ sys/kern/vfs_mount.c	18 Aug 2013 10:48:42 -0000
@@ -288,6 +288,7 @@

 	KASSERT(mp->mnt_refcnt > 0);

+	atomic_inc_uint(&mp->mnt_refcnt);
 	mutex_enter(&mp->mnt_unmounting);
 	if (__predict_false((mp->mnt_iflag & IMNT_GONE) != 0)) {
 		mutex_exit(&mp->mnt_unmounting);
@@ -295,6 +296,7 @@
 			KASSERT(mutex_owned(&mountlist_lock));
 			*nextp = CIRCLEQ_NEXT(mp, mnt_list);
 		}
+		vfs_destroy(mp);
 		return ENOENT;
 	}
 	++mp->mnt_busynest;
@@ -303,7 +305,6 @@
 	if (nextp != NULL) {
 		mutex_exit(&mountlist_lock);
 	}
-	atomic_inc_uint(&mp->mnt_refcnt);
 	return 0;
 }

@@ -328,13 +329,13 @@
 	KASSERT(mp->mnt_busynest != 0);
 	mp->mnt_busynest--;
 	mutex_exit(&mp->mnt_unmounting);
-	if (!keepref) {
-		vfs_destroy(mp);
-	}
 	if (nextp != NULL) {
 		KASSERT(mutex_owned(&mountlist_lock));
 		*nextp = CIRCLEQ_NEXT(mp, mnt_list);
 	}
+	if (!keepref) {
+		vfs_destroy(mp);
+	}
 }

 /*

>Release-Note:

>Audit-Trail:
From: "J. Hannken-Illjes" <hannken@eis.cs.tu-bs.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/48135: Bad locking for umount
Date: Wed, 21 Aug 2013 11:13:54 +0200

 This fix looks wrong.

 If it is possible for the (last) reference to go away while vfs_busy() is
 running it is also unsafe to increment the reference counter here as it
 could be zero and be freed from vfs_destroy().

 Do you have stack traces from the threads running vfs_busy()
 and vfs_destroy()?

 --
 J. Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)

From: Michael van Elst <mlelstv@serpens.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/48135: Bad locking for umount
Date: Wed, 21 Aug 2013 23:15:02 +0200

 On Wed, Aug 21, 2013 at 10:35:00AM +0000, J. Hannken-Illjes wrote:
 > 
 >  This fix looks wrong.
 >
 >  If it is possible for the (last) reference to go away while vfs_busy() is
 >  running it is also unsafe to increment the reference counter here as it
 >  could be zero and be freed from vfs_destroy().

 It is supposed to be a precondition to vfs_busy() that the reference count
 is not zero, otherwise it couldn't use the mp pointer for anything.

 There is a race condition where the data structure is freed by
 vfs_destroy and then used. In the current code it is likely
 to happen because vfs_busy will just take the mutex when dounmount
 releases it and calls vfs_destroy.


 >  Do you have stack traces from the threads running vfs_busy()
 >  and vfs_destroy()?

 Only from vfs_destroy() which is basically

 sys_unmount()
 dounmount()
 vfs_destroy()
 -> LOCKDEBUG cries because a mutex is destroyed while owned.

 This probably happens because vfs_busy can take the mutex
 between dounmount releasing it and vfs_destroy destroying it.
 I'm not sure why VFS_UMOUNT can succeed (otherwise vfs_destroy
 wouldn't happen) when another thread is calling vfs_busy.


 Greetings,
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: "J. Hannken-Illjes" <hannken@eis.cs.tu-bs.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/48135: Bad locking for umount
Date: Thu, 22 Aug 2013 12:36:21 +0200

 On Aug 21, 2013, at 11:20 PM, Michael van Elst <mlelstv@serpens.de> =
 wrote:
 > On Wed, Aug 21, 2013 at 10:35:00AM +0000, J. Hannken-Illjes wrote:
 >>=20
 >> This fix looks wrong.
 >>=20
 >> If it is possible for the (last) reference to go away while =
 vfs_busy() is
 >> running it is also unsafe to increment the reference counter here as =
 it
 >> could be zero and be freed from vfs_destroy().
 >=20
 > It is supposed to be a precondition to vfs_busy() that the reference =
 count
 > is not zero, otherwise it couldn't use the mp pointer for anything.

 If this precondition "The caller must hold a pre-existing reference to =
 the mount"
 would be met this error could not happen as vfs_busy() will not =
 decrement
 the reference count.

 And if this precondition is not met even incrementing the reference
 count is prone to races as the reference count is part of the mount =
 struct.

 Btw.: I'm not able to reproduce this error here -- what exactly is
       your setup (amd configuration etc.)

 --
 J. Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)

From: Michael van Elst <mlelstv@serpens.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/48135: Bad locking for umount
Date: Thu, 22 Aug 2013 19:42:37 +0200

 On Thu, Aug 22, 2013 at 10:40:00AM +0000, J. Hannken-Illjes wrote:

 >  Btw.: I'm not able to reproduce this error here -- what exactly is
 >        your setup (amd configuration etc.)

 It's a dual core amd64 machine that mounts an NFS server.

 [global]
 restart_mounts    = yes
 map_type          = file
 search_path       = /etc/amd
 dismount_interval = 3
 cache_duration    = 3
 auto_attrcache    = 1
 log_options       = all

 [ /a ]
 map_name      = pussyfoot

 and the map:

 source \
 	-type:=nfs;opts:=rw,nosuid,nodevs,intr,nfsv3,readdirplus,proto=tcp \
 	rhost:=fud;rfs:=/d/1

 The race condition happens after setting auto_attrcache=1, but it might
 have been triggered before.

 The "fix" prevents this race, but it surely opens others.


 Greetings,
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: Michael van Elst <mlelstv@serpens.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/48135: Bad locking for umount
Date: Thu, 22 Aug 2013 20:05:28 +0200

 On Thu, Aug 22, 2013 at 07:42:37PM +0200, Michael van Elst wrote:

 > >  Btw.: I'm not able to reproduce this error here -- what exactly is
 > >        your setup (amd configuration etc.)

 Here is another outcome of the race:

 panic: lockdebug_lookup: uninitialized lock (lock=0xfffffe81e66d3050,
 from=ffffffff8037e429)

 breakpoint() at netbsd:breakpoint+0x5
 vpanic() at netbsd:vpanic+0x136
 printf_nolog() at netbsd:printf_nolog
 lockdebug_abort() at netbsd:lockdebug_abort+0xb0
 mutex_exit() at netbsd:mutex_exit+0xaa
 vfs_busy() at netbsd:vfs_busy+0xcb
 lookup_once() at netbsd:lookup_once+0x14b
 namei_tryemulroot() at netbsd:namei_tryemulroot+0x457
 namei() at netbsd:namei+0x2b
 fd_nameiat.clone.0() at netbsd:fd_nameiat.clone.0+0x54
 do_sys_statat() at netbsd:do_sys_statat+0x84
 sys___lstat50() at netbsd:sys___lstat50+0x2a
 syscall() at netbsd:syscall+0x94

 a previous doumount completed, called vfs_destroy() and lets vfs_busy()
 continue and access the freed mutex.

 It's obvious that you need a lock outside the mount structure to
 arbitrate this. But I'm still wondering how VFS_UMOUNT can succeed
 when namei uses vnodes that reference the mount when calling vfs_busy.


 Greetings,
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: "Juergen Hannken-Illjes" <hannken@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/48135 CVS commit: src/sys/kern
Date: Fri, 30 Aug 2013 12:58:22 +0000

 Module Name:	src
 Committed By:	hannken
 Date:		Fri Aug 30 12:58:22 UTC 2013

 Modified Files:
 	src/sys/kern: vfs_mount.c

 Log Message:
 Dounmount() violates the locking protocol for member v_mountedhere.
 A vnode lock is required to access or modify this field.

 Lock/unlock the vnode when clearing v_mountedhere.

 Reviewed by: David Holland <dholland@netbsd.org>

 Should fix PR #48135 (Bad locking for umount)


 To generate a diff of this commit:
 cvs rdiff -u -r1.19 -r1.20 src/sys/kern/vfs_mount.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

Responsible-Changed-From-To: kern-bug-people->hannken
Responsible-Changed-By: hannken@NetBSD.org
Responsible-Changed-When: Fri, 30 Aug 2013 13:01:13 +0000
Responsible-Changed-Why:
Committed a fix -- please confirm.


State-Changed-From-To: open->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sat, 31 Aug 2013 19:06:20 +0000
State-Changed-Why:
update PR state to match


State-Changed-From-To: feedback->closed
State-Changed-By: hannken@NetBSD.org
State-Changed-When: Sat, 05 Oct 2013 05:56:28 +0000
State-Changed-Why:
Feedback timed out.


From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/48135 Bad locking for umount
Date: Sun, 13 Oct 2013 21:32:59 +0000

 not sent to gnats-bugs
    ------

 From: Michael van Elst <mlelstv@serpens.de>
 To: gnats@netbsd.org
 Subject: Re: kern/48135 Bad locking for umount
 Date: Sun, 8 Sep 2013 08:13:05 +0200


 Nothing changed, by looping over an amd mount I can still crash
 the system:

 db{1}> bt
 breakpoint() at netbsd:breakpoint+0x5
 vpanic() at netbsd:vpanic+0x136
 printf_nolog() at netbsd:printf_nolog
 lockdebug_more() at netbsd:lockdebug_more
 mutex_destroy() at netbsd:mutex_destroy+0x75
 vfs_destroy() at netbsd:vfs_destroy+0x42
 sys_unmount() at netbsd:sys_unmount+0x15b
 syscall() at netbsd:syscall+0x9a
 --- syscall (number 22) ---
 7f7ff683b86a:
 db{1}> mach cpu 0
 using CPU 0
 db{1}> bt
 spllower() at netbsd:spllower+0xf
 tsleep() at netbsd:tsleep+0x122
 ahci_exec_command() at netbsd:ahci_exec_command+0x10b
 wd_flushcache() at netbsd:wd_flushcache+0xd8
 wdioctl() at netbsd:wdioctl+0x5f7
 bdev_ioctl() at netbsd:bdev_ioctl+0x77
 VOP_IOCTL() at netbsd:VOP_IOCTL+0x3b
 bdev_ioctl() at netbsd:bdev_ioctl+0x77
 VOP_IOCTL() at netbsd:VOP_IOCTL+0x3b
 wapbl_cache_sync() at netbsd:wapbl_cache_sync+0x4d
 wapbl_write_commit() at netbsd:wapbl_write_commit+0x4b
 wapbl_flush() at netbsd:wapbl_flush+0x714
 ffs_sync() at netbsd:ffs_sync+0x2b9
 VFS_SYNC() at netbsd:VFS_SYNC+0x1c
 sync_fsync() at netbsd:sync_fsync+0x78
 VOP_FSYNC() at netbsd:VOP_FSYNC+0x3e
 sched_sync() at netbsd:sched_sync+0xef



 Greetings,
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

State-Changed-From-To: closed->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 13 Oct 2013 21:44:47 +0000
State-Changed-Why:
Turns out there was feedback after all, and, alas, it was negative.


From: "J. Hannken-Illjes" <hannken@eis.cs.tu-bs.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/48135 Bad locking for umount
Date: Mon, 14 Oct 2013 12:36:28 +0200

 Michael,

 are you sure both vfs_destroy() on cpu 1 and VFS_SYNC() on cpu 0
 refer to the same mount?

 VFS-SYNC() calling wapbl_XXX looks strange.

 --
 J. Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)

From: Michael van Elst <mlelstv@serpens.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/48135 Bad locking for umount
Date: Mon, 14 Oct 2013 20:05:23 +0200

 On Mon, Oct 14, 2013 at 10:40:00AM +0000, J. Hannken-Illjes wrote:
 >  Michael,
 >  
 >  are you sure both vfs_destroy() on cpu 1 and VFS_SYNC() on cpu 0
 >  refer to the same mount?

 No, could be something different.

 >  VFS-SYNC() calling wapbl_XXX looks strange.

 int
 ffs_sync(struct mount *mp, int waitfor, kauth_cred_t cred)
 {
 ...

 #ifdef WAPBL
         if (mp->mnt_wapbl) {
                 error = wapbl_flush(mp->mnt_wapbl, 0);
                 if (error)
                         allerror = error;
         }
 #endif

         fstrans_done(mp);
         vnfree(mvp);
         return (allerror);
 }


 Greetings,
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: "J. Hannken-Illjes" <hannken@eis.cs.tu-bs.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/48135 Bad locking for umount
Date: Tue, 15 Oct 2013 11:49:05 +0200

 >> VFS-SYNC() calling wapbl_XXX looks strange.
 > 
 > int
 > ffs_sync(struct mount *mp, int waitfor, kauth_cred_t cred)

 Sure -- but the example was an amd mount using nfs, not ffs.

 Are you able to inspect the lock failing to destroy?
 It would be nice to get its address, its owner (show lock from ddb)
 and the backtrace of the owner.

 --
 J. Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)

Responsible-Changed-From-To: hannken->kern-bug-people
Responsible-Changed-By: hannken@NetBSD.org
Responsible-Changed-When: Tue, 03 Dec 2013 10:54:55 +0000
Responsible-Changed-Why:
Not enough information to resolve -- back to queue.


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.