NetBSD Problem Report #1354

From gnats  Sun Aug 13 20:47:11 1995
Received: from bloom-beacon.MIT.EDU (brnstndkramden.acf.nyu.edu@BLOOM-BEACON.MIT.EDU [18.181.0.26]) by sun-lamp.pc.cs.cmu.edu (8.6.11/8.6.10) with ESMTP id UAA17217 for <gnats-bugs@gnats.netbsd.org>; Sun, 13 Aug 1995 20:47:11 -0400
Message-Id: <199508140042.UAA10906@pattern.arlington.ma.us>
Date: Sun, 13 Aug 1995 20:42:57 -0400
From: John Kohl <jtk@kolvir.arlington.ma.us>
Reply-To: jtk@kolvir.arlington.ma.us
To: gnats-bugs@gnats.netbsd.org, jsp@sequent.com
Subject: union FS can deadlock while vgone()/vclean() is running
X-Send-Pr-Version: 3.2

>Number:         1354
>Category:       kern
>Synopsis:       union FS can deadlock while vgone()/vclean() is running
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Aug 13 20:50:02 +0000 1995
>Closed-Date:    Tue Jun 28 16:23:24 +0000 2016
>Last-Modified:  Tue Jun 28 16:23:24 +0000 2016
>Originator:     John Kohl
>Release:        19950812
>Organization:
NetBSD Kernel Hackers `R` Us
>Environment:

System: NetBSD pattern 1.0A NetBSD 1.0A (PATTERN) #67: Sat Aug 12 23:17:42 EDT 1995 jtk@pattern:/u1/NetBSD-current/src/sys/arch/i386/compile/PATTERN i386


>Description:

I was doing a kernel build (makedepend) and a 'make install' in
/usr/src/include, and got the union file system deadlocked on itself.

/usr/src is a union mount; the top layer is UFS and the bottom layer
is NFS.

Here are kernel stack traces of the deadlocked processes, plus some
analysis and vnode printouts:

The processes (ddb output):

 2585 f882a600 f9e5c000   0     1  2190  004006  3  netbsd  find   f87b8e00
 2577 f880cf00 f9e56000 5509     1  2135  004006  3  netbsd  cpp   vget f8798500

process 2577's trace follows.  It is waiting in vget for VXWANT to
clear, i.e. the vnode to finish being recycled.  Note that in the
printout of the union vnode, 2577 has the lock on the upper layer node
( because it was locked in union_lookup1() ).  2577 was looking up
"../../../../net/radix.h, and was in the process of looking up the "net"
component.

2585 is waiting for the top-layer (UFS) node lock.  
Although not shown here, the union node is recorded as locked by PID
2585, with flags 0x12 == UN_LOCKED|UN_CACHED.

db> trace 0xf9e57bf4
bpendtsleep(f8826600,0,f8807980,0,f9e57c30) at bpendtsleep
bpendtsleep(f8798500,8,f812dd7d,0,f8826600) at bpendtsleep
_vget(f8798500,0) at _vget+0x35
_union_allocvp(f9e57ef4,f877d200,f87bed80,f87c4e80,f9e57f08) at _union_allocvp+
0x120
_union_lookup(f9e57d74,f880ab80,f9e57f08,f9e57ee4,f87a5820) at _union_lookup+0x
493
_lookup(f9e57ee4,0,1d0e0,f9e57ee4,f9e57e14) at _lookup+0x257
_namei(f9e57ee4,0,1d0e0,f8843540,f8847000) at _namei+0x16b
_vn_open(f9e57ee4,1,1a4,f7bfd0a4,f880cf00) at _vn_open+0x17c
_open(f880cf00,f9e57f84,f9e57f7c,0,1d0e0) at _open+0x91
_syscall() at _syscall+0x239
--- syscall (number 5) ---
db> call vprint(0, 0xf8798500)
type VDIR, usecount 0, writecount 0, refcount 0, flags (VXLOCK|VXWANT)
                tag VT_UNION, vp=f8798500, uppervp=f8770700, lowervp=f8807980
uppervp: type VDIR, usecount 2, writecount 0, refcount 0,
        tag VT_UFS, ino 80688, on dev 4, 5 (LOCKED)
        owner pid 2577 waiting pid 2585
lowervp: type VDIR, usecount 2, writecount 0, refcount 0,
        tag VT_NFS, fileid 176651 fsid 0x1b01

process 2585's trace follows. It was doing a lookup in a different
filesystem (/usr) and decided to recycle a union vnode.  Inside
vclean(), there's a call to VOP_LOCK(vp) before vinvalbuf() is called.
This is consistent with the notation in the union node that 2585 owns
the union node lock.  However, 2585 does NOT have the top-layer lock,
because inside union_lock() there's this code:

		if (((un->un_flags & UN_ULOCK) == 0) &&
		    (vp->v_usecount != 0)) {
			VOP_LOCK(un->un_uppervp);
			un->un_flags |= UN_ULOCK;
		}

usecount is indeed 0 in this case--that's why it's on the freelist.

db> trace *0xf9e5c03c
bpendsleep(f87b8e00,f8770700,f9e5db60,f8187626,f87b8e00) at bpendsleep
bpendsleep(f87b8e00,8) at bpendsleep
_ufs_lock(f9e5db70,f8770700,f81d36ec,f8770700,f9e5dbb0) at _ufs_lock+0x22
_union_vptofh(f8826600) at _union_vptofh+0x35    << really union_fixup
_union_fsync(f9e5dbcc,f8798500,f9e5dc24,0,2) at _union_fsync+0x64
_vinvalbuf(f8798500,1,ffffffff,0,0,0) at _vinvalbuf+0x46
_vclean(f8798500,8) at _vclean+0x6d
_vgone(f8798500) at _vgone+0x34
_getnewvnode(1,f876a000,f8726100,f9e5dcc0,f92b702c) at _getnewvnode+0xdb
_ffs_vget(f876a000,1004,f9e5dd44,f87be780,f9e5de60) at _ffs_vget+0x4f
_ufs_lookup(f9e5dda8,f87be780,f9e5de60,f9e5de3c,f87dcc20) at _ufs_lookup+0xbb1
_lookup(f9e5de3c,f7bfd920,f882a600,8,0) at _lookup+0x257
_namei(f9e5de3c,f7bfd920,f882a600,8,14c40) at _namei+0x16b
_lstat(f882a600,f9e5df84,f9e5df7c,0,14c4c) at _lstat+0x4a
_syscall() at _syscall+0x239

The basic scenario is this, I think:

2577 starts looking up a name in the union FS.  It looks up the top
layer name, finds it's a directory, and takes the lock on it.
It then starts looking up the lower name, and blocks (maybe waiting for
the lock on the lowerdvp).

2585 calls getnewvnode(), and starts reclaim on the union directory node
for the name that 2577 is about to use.  It calls VOP_LOCK() which sets
the locked bits in the union layer but doesn't take the uppervp lock.
It then calls vinvalbuf() which calls VOP_FSYNC() which sees the missing
uppervp lock and tries to get the lock.  It blocks because 2577 has the
lock.

2577 then finishes its lookup in the lower file system.  It then calls
union_allocvp() while holding uppervp locked.  union_allocvp() finds the
union node on the hash chains, and tries to vget() the node.  vget
notices the 0xdeadb marker in the vnode free chain, and waits for the
vnode to be finished recycling.

voila, 2-process deadly embrace.

I'm guessing that the (vp->v_usecount != 0) check is inside union_lock()
to avoid a related deadlock--the vclean() process would mark the vnode
for cleaning but then block in VOP_LOCK() waiting for the uppervp lock,
while the union_allocvp() process would be in vget() waiting for the
vnode recycling to be completed.

I think something needs to change in the way union_allocvp() calls
vget.  It really can't hold the uppervp lock while calling vget on the
union node (because that can lead to deadlock as seen in this case), nor
can it comfortably drop/regain the uppervp lock (although it only needs
to do this for the cases where the union node is on the free list).

>How-To-Repeat:
be very unlucky, or just try to run several processes at once doing
builds inside a union FS.

>Fix:

>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->feedback 
State-Changed-By: sommerfeld 
State-Changed-When: Sat Jul 31 17:10:44 PDT 1999 
State-Changed-Why:  
I believe that htis may be fixed by Bill Studenmund's fix to getnewvnode() 
in vfs_subr.c 1.105, which is to not try to clean locked vnodes on the  
freelist.. 
State-Changed-From-To: feedback->open 
State-Changed-By: sommerfeld 
State-Changed-When: Sat Jul 31 17:16:32 PDT 1999 
State-Changed-Why:  
"Never mind".  union_islocked() doesn't DTRT for this to work (yet) 
. 


Responsible-Changed-From-To: kern-bug-people->sommerfeld 
Responsible-Changed-By: sommerfeld 
Responsible-Changed-When: Sat Jul 31 17:16:32 PDT 1999 
Responsible-Changed-Why:  
i'll deal with this. 

Responsible-Changed-From-To: sommerfeld->bin-bug-people
Responsible-Changed-By: spz@NetBSD.org
Responsible-Changed-When: Sun, 10 May 2009 19:12:21 +0000
Responsible-Changed-Why:
sommerfelds accounts are no longer active


From: David Holland <dholland@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/1354 union FS can deadlock
Date: Tue, 28 Jun 2016 03:39:54 +0000

 From: "J. Hannken-Illjes" <hannken@eis.cs.tu-bs.de>
 To: gnats-bugs@NetBSD.org
 Cc: David Holland <dholland@netbsd.org>
 Subject: PR/1354 union FS can deadlock
 Date: Mon, 27 Jun 2016 11:04:32 +0200

 Should be fixed by Rev 1.43 of sys/miscfs/union/union_vnops.c:

 date: 1999-03-22 18:24:22 +0100;  author: sommerfe;  state: Exp;  lines: +17 -4;

 vinvalbuf, called from vclean, could cause a locking-against-self
 deadlock in VOP_FSYNC() if the unreferenced vnode picked for
 reclamation happened to be stacked on top of a vnode the process
 already had locked.  This could happen if the same filesystem was
 accessed both through a union mount and directly; it seemed to happen
 most frequently when the direct access was through NFS.

 Avoid this deadlock by changing vinvalbuf to pass a new FSYNC_RECLAIM
 flag bit to VOP_FSYNC() to indicate that a reclaim is in progress and
 only a `shallow' fsync is necessary.

 Do nothing in *_fsync() in umapfs, nullfs, and unionfs when
 FSYNC_RECLAIM is set; the underlying vnodes will shortly be released
 in *_reclaim and may be reclaimed (and fsync'ed) later.

 --
 J. Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)

State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Tue, 28 Jun 2016 16:23:24 +0000
State-Changed-Why:
Fixed.


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2014 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.