NetBSD Problem Report #24887

Received: (qmail 15876 invoked by uid 605); 23 Mar 2004 07:39:46 -0000
Message-Id: <1080027578.512298.7152.nullmailer@yamt.dyndns.org>
Date: Tue, 23 Mar 2004 16:39:38 +0900
From: yamt@mwd.biglobe.ne.jp
Sender: gnats-bugs-owner@NetBSD.org
Reply-To: yamt@mwd.biglobe.ne.jp
To: gnats-bugs@gnats.netbsd.org
Subject: rename violates vnode locking order
X-Send-Pr-Version: 3.95

>Number:         24887
>Category:       kern
>Synopsis:       rename violates vnode locking order
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    dholland
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Mar 23 07:40:00 +0000 2004
>Closed-Date:    Mon Jul 18 06:48:55 +0000 2011
>Last-Modified:  Mon Jul 18 06:48:55 +0000 2011
>Originator:     YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
>Release:        NetBSD 1.6ZK
>Organization:

>Environment:


System: NetBSD kaeru 1.6ZK NetBSD 1.6ZK (build.kaeru) #1138: Tue Mar 23 09:52:05 JST 2004 takashi@kaeru:/home/takashi/work/kernel/build.kaeru i386
Architecture: i386
Machine: i386
>Description:
	VOP_RENAME of many filesystems including ufs violate
	vnode locking order and sometimes deadlock.

(gdb) p (struct lock *)0xcabc552c
$1 = (struct lock *) 0xcabc552c
(gdb) p $1
$2 = (struct lock *) 0xcabc552c
(gdb) p *$1
$3 = {lk_interlock = {lock_data = 0,
    lock_file = 0xc04f3f60 "/home/takashi/nbsd/sys/kern/kern_lock.c",
    unlock_file = 0xc04f3f60 "/home/takashi/nbsd/sys/kern/kern_lock.c",
    lock_line = 513, unlock_line = 855, list = {tqe_next = 0x0,
      tqe_prev = 0x0}, lock_holder = 4294967295}, lk_flags = 525312,
  lk_sharecount = 0, lk_exclusivecount = 1, lk_recurselevel = 0,
  lk_waitcount = 7, lk_wmesg = 0xc04b1757 "vnlock", lk_un = {lk_un_sleep = {
      lk_sleep_lockholder = 6911, lk_sleep_locklwp = 1, lk_sleep_prio = 20,
      lk_sleep_timo = 0}, lk_un_spin = {lk_spin_cpu = 6911, lk_spin_list = {
        tqe_next = 0x1, tqe_prev = 0x14}}},
  lk_lock_file = 0xc04f9d60 "/home/takashi/nbsd/sys/miscfs/genfs/genfs_vnops.c",
  lk_unlock_file = 0xc04f9d60 "/home/takashi/nbsd/sys/miscfs/genfs/genfs_vnops.c", lk_lock_line = 324, lk_unlock_line = 340}
(gdb) p (struct lock *) 0xc9e457f4
$4 = (struct lock *) 0xc9e457f4
(gdb) p *$4
$5 = {lk_interlock = {lock_data = 0,
    lock_file = 0xc04f3f60 "/home/takashi/nbsd/sys/kern/kern_lock.c",
    unlock_file = 0xc04f6280 "/home/takashi/nbsd/sys/kern/kern_synch.c",
    lock_line = 513, unlock_line = 458, list = {tqe_next = 0x0,
      tqe_prev = 0x0}, lock_holder = 4294967295}, lk_flags = 525312,
  lk_sharecount = 0, lk_exclusivecount = 1, lk_recurselevel = 0,
  lk_waitcount = 1, lk_wmesg = 0xc04b1757 "vnlock", lk_un = {lk_un_sleep = {
      lk_sleep_lockholder = 5453, lk_sleep_locklwp = 1, lk_sleep_prio = 20,
      lk_sleep_timo = 0}, lk_un_spin = {lk_spin_cpu = 5453, lk_spin_list = {
        tqe_next = 0x1, tqe_prev = 0x14}}},
  lk_lock_file = 0xc04f9d60 "/home/takashi/nbsd/sys/miscfs/genfs/genfs_vnops.c",
  lk_unlock_file = 0xc04f9d60 "/home/takashi/nbsd/sys/miscfs/genfs/genfs_vnops.c", lk_lock_line = 324, lk_unlock_line = 340}
(gdb) lwp 0xcad9fc98 1
$6 = (struct lwp *) 0xcaad27c4
(gdb) bt
#0  0xcaad27c4 in ?? ()
#1  0xc02b1843 in ltsleep (ident=0xc9e457f4, priority=0, wmesg=0x0, timo=0,
    interlock=0xc9e457f4) at /home/takashi/nbsd/sys/kern/kern_synch.c:493
#2  0xc02a182e in acquire (lkp=0xc9e457f4, s=0xca18fbac, extflags=0, drain=0,
    wanted=1536) at /home/takashi/nbsd/sys/kern/kern_lock.c:259
#3  0xc02a1eef in _lockmgr (lkp=0xc9e457f4, flags=3390634924,
    interlkp=0xc9e4576c,
    file=0xc04f9d60 "/home/takashi/nbsd/sys/miscfs/genfs/genfs_vnops.c",
    line=324) at /home/takashi/nbsd/sys/kern/kern_lock.c:730
#4  0xc02f0886 in genfs_lock (v=0x0)
    at /home/takashi/nbsd/sys/miscfs/genfs/genfs_vnops.c:324
#5  0xc02ef60d in VOP_LOCK (vp=0x0, flags=0)
    at /home/takashi/nbsd/sys/kern/vnode_if.c:1082
#6  0xc02eeb25 in vn_lock (vp=0xc9e4576c, flags=65538)
    at /home/takashi/nbsd/sys/kern/vfs_vnops.c:782
#7  0xc02e4dc4 in vget (vp=0xc9e4576c, flags=65538)
    at /home/takashi/nbsd/sys/kern/vfs_subr.c:1245
#8  0xc02e06da in cache_lookup (dvp=0xcabc54a4, vpp=0xca18fec4, cnp=0xca18fed8)
    at /home/takashi/nbsd/sys/kern/vfs_cache.c:278
#9  0xc0271c32 in ufs_lookup (v=0xca18fd64)
    at /home/takashi/nbsd/sys/ufs/ufs/ufs_lookup.c:169
#10 0xc02eef3f in VOP_LOOKUP (dvp=0x0, vpp=0x0, cnp=0x0)
    at /home/takashi/nbsd/sys/kern/vnode_if.c:131
---Type <return> to continue, or q <return> to quit---
#11 0xc02e2cfc in lookup (ndp=0xca18feb4)
    at /home/takashi/nbsd/sys/kern/vfs_lookup.c:509
#12 0xc02e277c in namei (ndp=0xca18feb4)
    at /home/takashi/nbsd/sys/kern/vfs_lookup.c:172
#13 0xc02ed0d5 in rename_files (from=0x0, to=0x0, p=0xcad9fc98, retain=0)
    at /home/takashi/nbsd/sys/kern/vfs_syscalls.c:3186
#14 0xc02ed060 in sys_rename (l=0x0, v=0x0, retval=0xca18ff5c)
    at /home/takashi/nbsd/sys/kern/vfs_syscalls.c:3141
#15 0xc0372817 in syscall_plain (frame=0xca18ffa8)
    at /home/takashi/nbsd/sys/arch/i386/i386/syscall.c:156
(gdb) lwp 0xca1bf8f8 1
$7 = (struct lwp *) 0xca1b3df0
(gdb) bt
#0  0xca1b3df0 in ?? ()
#1  0xc02b1843 in ltsleep (ident=0xcabc552c, priority=0, wmesg=0x0, timo=0,
    interlock=0xcabc552c) at /home/takashi/nbsd/sys/kern/kern_synch.c:493
#2  0xc02a182e in acquire (lkp=0xcabc552c, s=0xcbbebbac, extflags=0, drain=0,
    wanted=1536) at /home/takashi/nbsd/sys/kern/kern_lock.c:259
#3  0xc02a1eef in _lockmgr (lkp=0xcabc552c, flags=3418274732,
    interlkp=0xcabc54a4,
    file=0xc04f9d60 "/home/takashi/nbsd/sys/miscfs/genfs/genfs_vnops.c",
    line=324) at /home/takashi/nbsd/sys/kern/kern_lock.c:730
#4  0xc02f0886 in genfs_lock (v=0x0)
    at /home/takashi/nbsd/sys/miscfs/genfs/genfs_vnops.c:324
#5  0xc02ef60d in VOP_LOCK (vp=0x0, flags=0)
    at /home/takashi/nbsd/sys/kern/vnode_if.c:1082
#6  0xc02eeb25 in vn_lock (vp=0xcabc54a4, flags=2)
    at /home/takashi/nbsd/sys/kern/vfs_vnops.c:782
#7  0xc02767ab in ufs_rename (v=0xcbbebdf4)
    at /home/takashi/nbsd/sys/ufs/ufs/ufs_vnops.c:891
#8  0xc02ef415 in VOP_RENAME (fdvp=0x0, fvp=0x0, fcnp=0x0, tdvp=0x0, tvp=0x0,
    tcnp=0x0) at /home/takashi/nbsd/sys/kern/vnode_if.c:798
#9  0xc02ed3a3 in rename_files (from=0x0, to=0x0, p=0xca1bf8f8, retain=0)
    at /home/takashi/nbsd/sys/kern/vfs_syscalls.c:3245
#10 0xc02ed060 in sys_rename (l=0x0, v=0x0, retval=0xcbbebf5c)
    at /home/takashi/nbsd/sys/kern/vfs_syscalls.c:3141
---Type <return> to continue, or q <return> to quit---
#11 0xc0372817 in syscall_plain (frame=0xcbbebfa8)
    at /home/takashi/nbsd/sys/arch/i386/i386/syscall.c:156
(gdb) fr 7
#7  0xc02767ab in ufs_rename (v=0xcbbebdf4)
    at /home/takashi/nbsd/sys/ufs/ufs/ufs_vnops.c:891
891             if ((error = vn_lock(fvp, LK_EXCLUSIVE)) != 0)
(gdb) list
886                             /* relookup blew away fdvp */
887                             return (error);
888                     }
889                     return (VOP_REMOVE(fdvp, fvp, fcnp));
890             }
891             if ((error = vn_lock(fvp, LK_EXCLUSIVE)) != 0)
892                     goto abortit;
893             dp = VTOI(fdvp);
894             ip = VTOI(fvp);
895             if ((nlink_t) ip->i_nlink >= LINK_MAX) {
(gdb) Quit
(gdb) p $7->l_proc->p_pid
$8 = 5453
(gdb) p $6->l_proc->p_pid
$9 = 6911
(gdb)
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:

From: "Darrin B. Jewell" <dbj@netbsd.org>
To: yamt@mwd.biglobe.ne.jp
Cc: gnats-bugs@gnats.netbsd.org
Subject: Re: kern/24887: rename violates vnode locking order
Date: 24 Mar 2004 15:44:21 -0500

 --=-=-=


 yamt@mwd.biglobe.ne.jp writes:
 > 	VOP_RENAME of many filesystems including ufs violate
 > 	vnode locking order and sometimes deadlock.

 As far as I know, this has been a "known bug, won't fix" for a long
 time.  I took a close look at ufs_rename last year, and the following
 notes stem from that.

 This is not going to be particularly easy to fix, but currently anyone
 with a good understanding of the ufs_rename details can force a vnode
 deadlock on demand.  Probably the easiest way is to exploit the fact
 that it locks fvp while holding tdp and tdvp before it does the checks
 on the relationship of fvp to the other two vnodes.

 In order to truly fix this problem, i think we will have to extend the
 vnode locking protocol into a new area where it is currently
 undefined.  Currently, the order for locking multiple vnodes is only
 well defined when you are locking two vnodes which are in a direct
 parent/child relationship two each other.  There is no specified order
 for locking two vnodes which may be siblings or have another more
 complicated relationship and only share a common parent.  Something
 like "lock vnode closest to root first, break ties by locking vnode
 with the lower pointer address first" might be a possible answer, but
 even that would probably require keeping the distance to root around
 for efficiency.

 While working on this, I wanted to take all 4 possible locks on both
 pairs of vnodes before doing any work.  In addition to the locking
 protocol problem above, this also ran into one more issue specific to
 the ufs call.  When a directory lookup is performed, information from
 the most recent lookup call is cached in directory inode in the
 i_count,i_endoff,i_diroff,i_offset and i_reclen fields.  This causes a
 problem if you wish to lookup two different vnodes in the same
 directory at the same time for rename purposes.  I worked around this
 by saving the side effects of a directory lookup around, but I found
 that to be a relatively ugly answer and requires care to avoid
 problems when the directory is changed by the second lookup.

 ufs_rename and friends are not particularly easy functions to read and
 understand.  The following patch introduces a cosmetic-only change
 that I found useful while reading the ufs_rename call.  It separates
 the "xp" local variable into two new, separate variables "txp" and
 "fxp", and separates the "dp" local variable into three new separate
 variables "tdp", "fdp", and "fdp2".  I also whiteboarded up a full
 flow diagram of the rename call and have a photograph of it which I
 can provide, although creating the diagram is a lot more revealing
 than just staring at one someone else made.


 --=-=-=
 Content-Type: text/x-patch
 Content-Disposition: attachment; filename=ufs_vnops.c.rename.diff
 Content-Description: change some variable names in ufs_rename

 Index: src/sys/ufs/ufs/ufs_vnops.c
 ===================================================================
 RCS file: /cvsroot/src/sys/ufs/ufs/ufs_vnops.c,v
 retrieving revision 1.113
 diff -u -u -p -r1.113 ufs_vnops.c
 --- src/sys/ufs/ufs/ufs_vnops.c	26 Jan 2004 10:39:30 -0000	1.113
 +++ src/sys/ufs/ufs/ufs_vnops.c	24 Mar 2004 18:47:20 -0000
 @@ -821,7 +821,7 @@ ufs_rename(void *v)
  	} */ *ap = v;
  	struct vnode		*tvp, *tdvp, *fvp, *fdvp;
  	struct componentname	*tcnp, *fcnp;
 -	struct inode		*ip, *xp, *dp;
 +	struct inode		*ip, *txp, *fxp, *tdp, *fdp, *fdp2;
  	struct direct		newdir;
  	int			doingdirectory, oldparent, newparent, error;

 @@ -890,7 +890,7 @@ ufs_rename(void *v)
  	}
  	if ((error = vn_lock(fvp, LK_EXCLUSIVE)) != 0)
  		goto abortit;
 -	dp = VTOI(fdvp);
 +	fdp = VTOI(fdvp);
  	ip = VTOI(fvp);
  	if ((nlink_t) ip->i_nlink >= LINK_MAX) {
  		VOP_UNLOCK(fvp, 0);
 @@ -898,7 +898,7 @@ ufs_rename(void *v)
  		goto abortit;
  	}
  	if ((ip->i_flags & (IMMUTABLE | APPEND)) ||
 -		(dp->i_flags & APPEND)) {
 +		(fdp->i_flags & APPEND)) {
  		VOP_UNLOCK(fvp, 0);
  		error = EPERM;
  		goto abortit;
 @@ -908,7 +908,7 @@ ufs_rename(void *v)
  		 * Avoid ".", "..", and aliases of "." for obvious reasons.
  		 */
  		if ((fcnp->cn_namelen == 1 && fcnp->cn_nameptr[0] == '.') ||
 -		    dp == ip ||
 +		    fdp == ip ||
  		    (fcnp->cn_flags & ISDOTDOT) ||
  		    (tcnp->cn_flags & ISDOTDOT) ||
  		    (ip->i_flag & IN_RENAME)) {
 @@ -917,7 +917,7 @@ ufs_rename(void *v)
  			goto abortit;
  		}
  		ip->i_flag |= IN_RENAME;
 -		oldparent = dp->i_number;
 +		oldparent = fdp->i_number;
  		doingdirectory = 1;
  	}
  	VN_KNOTE(fdvp, NOTE_WRITE);		/* XXXLUKEM/XXX: right place? */
 @@ -927,10 +927,10 @@ ufs_rename(void *v)
  	 * When the target exists, both the directory
  	 * and target vnodes are returned locked.
  	 */
 -	dp = VTOI(tdvp);
 -	xp = NULL;
 +	tdp = VTOI(tdvp);
 +	txp = NULL;
  	if (tvp)
 -		xp = VTOI(tvp);
 +		txp = VTOI(tvp);

  	/*
  	 * 1) Bump link count while we're moving stuff
 @@ -961,25 +961,25 @@ ufs_rename(void *v)
  	 */
  	error = VOP_ACCESS(fvp, VWRITE, tcnp->cn_cred, tcnp->cn_proc);
  	VOP_UNLOCK(fvp, 0);
 -	if (oldparent != dp->i_number)
 -		newparent = dp->i_number;
 +	if (oldparent != tdp->i_number)
 +		newparent = tdp->i_number;
  	if (doingdirectory && newparent) {
  		if (error)	/* write access check above */
  			goto bad;
 -		if (xp != NULL)
 +		if (txp != NULL)
  			vput(tvp);
  		vref(tdvp);	/* compensate for the ref checkpath looses */
 -		if ((error = ufs_checkpath(ip, dp, tcnp->cn_cred)) != 0) {
 +		if ((error = ufs_checkpath(ip, tdp, tcnp->cn_cred)) != 0) {
  			vrele(tdvp);
  			goto out;
  		}
  		tcnp->cn_flags &= ~SAVESTART;
  		if ((error = relookup(tdvp, &tvp, tcnp)) != 0)
  			goto out;
 -		dp = VTOI(tdvp);
 -		xp = NULL;
 +		tdp = VTOI(tdvp);
 +		txp = NULL;
  		if (tvp)
 -			xp = VTOI(tvp);
 +			txp = VTOI(tvp);
  	}
  	/*
  	 * 2) If target doesn't exist, link the target
 @@ -988,8 +988,8 @@ ufs_rename(void *v)
  	 *    entry to reference the source inode and
  	 *    expunge the original entry's existence.
  	 */
 -	if (xp == NULL) {
 -		if (dp->i_dev != ip->i_dev)
 +	if (txp == NULL) {
 +		if (tdp->i_dev != ip->i_dev)
  			panic("rename: EXDEV");
  		/*
  		 * Account for ".." in new directory.
 @@ -997,24 +997,24 @@ ufs_rename(void *v)
  		 * parent we don't fool with the link count.
  		 */
  		if (doingdirectory && newparent) {
 -			if ((nlink_t)dp->i_nlink >= LINK_MAX) {
 +			if ((nlink_t)tdp->i_nlink >= LINK_MAX) {
  				error = EMLINK;
  				goto bad;
  			}
 -			dp->i_ffs_effnlink++;
 -			dp->i_nlink++;
 -			DIP_ASSIGN(dp, nlink, dp->i_nlink);
 -			dp->i_flag |= IN_CHANGE;
 +			tdp->i_ffs_effnlink++;
 +			tdp->i_nlink++;
 +			DIP_ASSIGN(tdp, nlink, tdp->i_nlink);
 +			tdp->i_flag |= IN_CHANGE;
  			if (DOINGSOFTDEP(tdvp))
 -				softdep_change_linkcnt(dp);
 +				softdep_change_linkcnt(tdp);
  			if ((error = VOP_UPDATE(tdvp, NULL, NULL, 
  			    UPDATE_DIROP)) != 0) {
 -				dp->i_ffs_effnlink--;
 -				dp->i_nlink--;
 -				DIP_ASSIGN(dp, nlink, dp->i_nlink);
 -				dp->i_flag |= IN_CHANGE;
 +				tdp->i_ffs_effnlink--;
 +				tdp->i_nlink--;
 +				DIP_ASSIGN(tdp, nlink, tdp->i_nlink);
 +				tdp->i_flag |= IN_CHANGE;
  				if (DOINGSOFTDEP(tdvp))
 -					softdep_change_linkcnt(dp);
 +					softdep_change_linkcnt(tdp);
  				goto bad;
  			}
  		}
 @@ -1022,12 +1022,12 @@ ufs_rename(void *v)
  		error = ufs_direnter(tdvp, NULL, &newdir, tcnp, NULL);
  		if (error != 0) {
  			if (doingdirectory && newparent) {
 -				dp->i_ffs_effnlink--;
 -				dp->i_nlink--;
 -				DIP_ASSIGN(dp, nlink, dp->i_nlink);
 -				dp->i_flag |= IN_CHANGE;
 +				tdp->i_ffs_effnlink--;
 +				tdp->i_nlink--;
 +				DIP_ASSIGN(tdp, nlink, tdp->i_nlink);
 +				tdp->i_flag |= IN_CHANGE;
  				if (DOINGSOFTDEP(tdvp))
 -					softdep_change_linkcnt(dp);
 +					softdep_change_linkcnt(tdp);
  				(void)VOP_UPDATE(tdvp, NULL, NULL,
  						 UPDATE_WAIT|UPDATE_DIROP);
  			}
 @@ -1036,12 +1036,12 @@ ufs_rename(void *v)
  		VN_KNOTE(tdvp, NOTE_WRITE);
  		vput(tdvp);
  	} else {
 -		if (xp->i_dev != dp->i_dev || xp->i_dev != ip->i_dev)
 +		if (txp->i_dev != tdp->i_dev || txp->i_dev != ip->i_dev)
  			panic("rename: EXDEV");
  		/*
  		 * Short circuit rename(foo, foo).
  		 */
 -		if (xp->i_number == ip->i_number)
 +		if (txp->i_number == ip->i_number)
  			panic("rename: same file");
  		/*
  		 * If the parent directory is "sticky", then the user must
 @@ -1049,9 +1049,9 @@ ufs_rename(void *v)
  		 * otherwise the destination may not be changed (except by
  		 * root). This implements append-only directories.
  		 */
 -		if ((dp->i_mode & S_ISTXT) && tcnp->cn_cred->cr_uid != 0 &&
 -		    tcnp->cn_cred->cr_uid != dp->i_uid &&
 -		    xp->i_uid != tcnp->cn_cred->cr_uid) {
 +		if ((tdp->i_mode & S_ISTXT) && tcnp->cn_cred->cr_uid != 0 &&
 +		    tcnp->cn_cred->cr_uid != tdp->i_uid &&
 +		    txp->i_uid != tcnp->cn_cred->cr_uid) {
  			error = EPERM;
  			goto bad;
  		}
 @@ -1060,9 +1060,9 @@ ufs_rename(void *v)
  		 * to it. Also, ensure source and target are compatible
  		 * (both directories, or both not directories).
  		 */
 -		if ((xp->i_mode & IFMT) == IFDIR) {
 -			if (xp->i_ffs_effnlink > 2 ||
 -			    !ufs_dirempty(xp, dp->i_number, tcnp->cn_cred)) {
 +		if ((txp->i_mode & IFMT) == IFDIR) {
 +			if (txp->i_ffs_effnlink > 2 ||
 +			    !ufs_dirempty(txp, tdp->i_number, tcnp->cn_cred)) {
  				error = ENOTEMPTY;
  				goto bad;
  			}
 @@ -1075,19 +1075,19 @@ ufs_rename(void *v)
  			error = EISDIR;
  			goto bad;
  		}
 -		if ((error = ufs_dirrewrite(dp, xp, ip->i_number, 
 +		if ((error = ufs_dirrewrite(tdp, txp, ip->i_number, 
  		    IFTODT(ip->i_mode), doingdirectory && newparent ?
  		    newparent : doingdirectory, IN_CHANGE|IN_UPDATE)) != 0)
  			goto bad;
  		if (doingdirectory) {
  			if (!newparent) {
 -				dp->i_ffs_effnlink--;
 +				tdp->i_ffs_effnlink--;
  				if (DOINGSOFTDEP(tdvp))
 -					softdep_change_linkcnt(dp);
 +					softdep_change_linkcnt(tdp);
  			}
 -			xp->i_ffs_effnlink--;
 +			txp->i_ffs_effnlink--;
  			if (DOINGSOFTDEP(tvp))
 -				softdep_change_linkcnt(xp);
 +				softdep_change_linkcnt(txp);
  		}
  		if (doingdirectory && !DOINGSOFTDEP(tvp)) {
  			/*
 @@ -1102,13 +1102,13 @@ ufs_rename(void *v)
  			 * them now.
  			 */
  			if (!newparent) {
 -				dp->i_nlink--;
 -				DIP_ASSIGN(dp, nlink, dp->i_nlink);
 -				dp->i_flag |= IN_CHANGE;
 +				tdp->i_nlink--;
 +				DIP_ASSIGN(tdp, nlink, tdp->i_nlink);
 +				tdp->i_flag |= IN_CHANGE;
  			}
 -			xp->i_nlink--;
 -			DIP_ASSIGN(xp, nlink, xp->i_nlink);
 -			xp->i_flag |= IN_CHANGE;
 +			txp->i_nlink--;
 +			DIP_ASSIGN(txp, nlink, txp->i_nlink);
 +			txp->i_flag |= IN_CHANGE;
  			if ((error = VOP_TRUNCATE(tvp, (off_t)0, IO_SYNC,
  			    tcnp->cn_cred, tcnp->cn_proc)))
  				goto bad;
 @@ -1117,7 +1117,7 @@ ufs_rename(void *v)
  		vput(tdvp);
  		VN_KNOTE(tvp, NOTE_DELETE);
  		vput(tvp);
 -		xp = NULL;
 +		txp = NULL;
  	}

  	/*
 @@ -1130,8 +1130,8 @@ ufs_rename(void *v)
  		return (error);
  	}
  	if (fvp != NULL) {
 -		xp = VTOI(fvp);
 -		dp = VTOI(fdvp);
 +		fxp = VTOI(fvp);
 +		fdp2 = VTOI(fdvp);
  	} else {
  		/*
  		 * From name has disappeared.
 @@ -1150,7 +1150,7 @@ ufs_rename(void *v)
  	 * flag ensures that it cannot be moved by another rename or removed
  	 * by a rmdir.
  	 */
 -	if (xp != ip) {
 +	if (fxp != ip) {
  		if (doingdirectory)
  			panic("rename: lost dir entry");
  	} else {
 @@ -1161,26 +1161,26 @@ ufs_rename(void *v)
  		 * and ".." set to point to the new parent.
  		 */
  		if (doingdirectory && newparent) {
 -			xp->i_offset = mastertemplate.dot_reclen;
 -			ufs_dirrewrite(xp, dp, newparent, DT_DIR, 0, IN_CHANGE);
 +			fxp->i_offset = mastertemplate.dot_reclen;
 +			ufs_dirrewrite(fxp, fdp2, newparent, DT_DIR, 0, IN_CHANGE);
  			cache_purge(fdvp);
  		}
 -		error = ufs_dirremove(fdvp, xp, fcnp->cn_flags, 0);
 -		xp->i_flag &= ~IN_RENAME;
 +		error = ufs_dirremove(fdvp, fxp, fcnp->cn_flags, 0);
 +		fxp->i_flag &= ~IN_RENAME;
  	}
  	VN_KNOTE(fvp, NOTE_RENAME);
 -	if (dp)
 +	if (fdp2)
  		vput(fdvp);
 -	if (xp)
 +	if (fxp)
  		vput(fvp);
  	vrele(ap->a_fvp);
  	return (error);

  	/* exit routines from steps 1 & 2 */
   bad:
 -	if (xp)
 -		vput(ITOV(xp));
 -	vput(ITOV(dp));
 +	if (txp)
 +		vput(ITOV(txp));
 +	vput(ITOV(tdp));
   out:
  	if (doingdirectory)
  		ip->i_flag &= ~IN_RENAME;

 --=-=-=


 I hope some of these notes are helpful.
 Darrin

 --=-=-=--

From: Bill Studenmund <wrstuden@netbsd.org>
To: "Darrin B. Jewell" <dbj@netbsd.org>
Cc: yamt@mwd.biglobe.ne.jp, me@netbsd.org, gnats-bugs@gnats.netbsd.org
Subject: Re: kern/24887: rename violates vnode locking order
Date: Mon, 5 Apr 2004 15:07:06 -0700

 --xesSdrSSBC0PokLI
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 Content-Transfer-Encoding: quoted-printable

 Darrin and I have been talking about this, and I agree that it's not an=20
 easy thing to fix.

 Three ideas have been suggested:

 1) figure out how far each of the directories are from root. I'd take this=
 =20
 as how far they are from fs root, since the distance between the mount=20
 point and '/' doesn't matter.

 2) Use trylock. The idea here is after we lock the first vnode, when we=20
 grab the other locks, we don't sleep for them. If we fail to get a lock,=20
 we unlock the locks we have then try again. This behavior will avoid=20
 deadlocks, but may make the rename take a while.

 3) Add a rename serialization lock. Any rename operation will need to grab=
 =20
 this lock before trying to lock the vnodes, then release this lock when=20
 all's done. This will keep rename from stumbling over itself. I think such=
 =20
 a lock can be per-mountpoint.


 In all honesty, I think we need (3) plus one of (1) or (2). I like (1) the=
 =20
 best, however I am concerned that we really have to calculate the distance=
 =20
 to root in the rename call itself. There's no real way we can cache the=20
 value, as a rename anywhere above our location would invalidate the value.=
 =20

 We'd need to do a getcwd-like lookup each rename, which doesn't sound=20
 terribly efficient.

 (2) plus (3) sounds like the way to go.

 I do like Darrin's suggestion of discussing, waiting a while, and only=20
 then changing code. :-)

 Thoughts?

 Darrin, I think the change is ok, though to be honest I've found rename to=
 =20
 be a very twisty maze. :-) As best I can tell, it's correct.

 Take care,

 Bill

 --xesSdrSSBC0PokLI
 Content-Type: application/pgp-signature
 Content-Disposition: inline

 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.2.3 (NetBSD)

 iD8DBQFAcdiKWz+3JHUci9cRAhPPAJoCrJJc75m9lCbnd9+rrNuqOj22SgCghh/5
 N40YaFErq8H/5BZtgT/qNjg=
 =e01A
 -----END PGP SIGNATURE-----

 --xesSdrSSBC0PokLI--

From: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
To: wrstuden@netbsd.org
Cc: dbj@netbsd.org, me@netbsd.org, gnats-bugs@gnats.netbsd.org
Subject: Re: kern/24887: rename violates vnode locking order
Date: Thu, 06 May 2004 07:24:09 +0900

 > Three ideas have been suggested:
 > 
 > 1) figure out how far each of the directories are from root. I'd take this 
 > as how far they are from fs root, since the distance between the mount 
 > point and '/' doesn't matter.
 > 
 > 2) Use trylock. The idea here is after we lock the first vnode, when we 
 > grab the other locks, we don't sleep for them. If we fail to get a lock, 
 > we unlock the locks we have then try again. This behavior will avoid 
 > deadlocks, but may make the rename take a while.
 > 
 > 3) Add a rename serialization lock. Any rename operation will need to grab 
 > this lock before trying to lock the vnodes, then release this lock when 
 > all's done. This will keep rename from stumbling over itself. I think such 
 > a lock can be per-mountpoint.
 > 
 > 
 > In all honesty, I think we need (3) plus one of (1) or (2). I like (1) the 
 > best, however I am concerned that we really have to calculate the distance 
 > to root in the rename call itself. There's no real way we can cache the 
 > value, as a rename anywhere above our location would invalidate the value. 
 > 
 > We'd need to do a getcwd-like lookup each rename, which doesn't sound 
 > terribly efficient.
 > 
 > (2) plus (3) sounds like the way to go.

 why is (3) needed if we do (2) ?

 where do you suppose trylock dances are done, in rename_files or ufs_rename?

 YAMAMOTO Takashi

From: Bill Studenmund <wrstuden@netbsd.org>
To: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
Cc: wrstuden@netbsd.org, dbj@netbsd.org, gnats-bugs@gnats.netbsd.org
Subject: Re: kern/24887: rename violates vnode locking order
Date: Thu, 6 May 2004 06:28:53 -0700

 --rwEMma7ioTxnRzrJ
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 Content-Transfer-Encoding: quoted-printable

 On Thu, May 06, 2004 at 07:24:09AM +0900, YAMAMOTO Takashi wrote:
 > > Three ideas have been suggested:
 > >=20
 > > 1) figure out how far each of the directories are from root. I'd take t=
 his=20
 > > as how far they are from fs root, since the distance between the mount=
 =20
 > > point and '/' doesn't matter.
 > >=20
 > > 2) Use trylock. The idea here is after we lock the first vnode, when we=
 =20
 > > grab the other locks, we don't sleep for them. If we fail to get a lock=
 ,=20
 > > we unlock the locks we have then try again. This behavior will avoid=20
 > > deadlocks, but may make the rename take a while.
 > >=20
 > > 3) Add a rename serialization lock. Any rename operation will need to g=
 rab=20
 > > this lock before trying to lock the vnodes, then release this lock when=
 =20
 > > all's done. This will keep rename from stumbling over itself. I think s=
 uch=20
 > > a lock can be per-mountpoint.
 > >=20
 > >=20
 > > In all honesty, I think we need (3) plus one of (1) or (2). I like (1) =
 the=20
 > > best, however I am concerned that we really have to calculate the dista=
 nce=20
 > > to root in the rename call itself. There's no real way we can cache the=
 =20
 > > value, as a rename anywhere above our location would invalidate the val=
 ue.=20
 > >=20
 > > We'd need to do a getcwd-like lookup each rename, which doesn't sound=
 =20
 > > terribly efficient.
 > >=20
 > > (2) plus (3) sounds like the way to go.
 >=20
 > why is (3) needed if we do (2) ?

 So that we're more controlled about our locking. We know we don't want two=
 =20
 renames running into each other, so let's be explicit about it.

 > where do you suppose trylock dances are done, in rename_files or ufs_rena=
 me?

 I'm not sure right now. Work's eating my brain at the moment... But I=20
 wanted to reply to this email rather than let it linger.

 Take care,

 Bill

 --rwEMma7ioTxnRzrJ
 Content-Type: application/pgp-signature
 Content-Disposition: inline

 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.2.3 (NetBSD)

 iD8DBQFAmj2VWz+3JHUci9cRAqhOAKCENUF9u6yNLPsZMqWFsdSIqYF0WwCghC36
 LXiqzWpyxLfcsXiJePuSozg=
 =pqZS
 -----END PGP SIGNATURE-----

 --rwEMma7ioTxnRzrJ--

From: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
To: wrstuden@netbsd.org
Cc: dbj@netbsd.org, gnats-bugs@gnats.netbsd.org
Subject: Re: kern/24887: rename violates vnode locking order
Date: Wed, 12 May 2004 07:10:37 +0900

 hi,

 > > why is (3) needed if we do (2) ?
 > 
 > So that we're more controlled about our locking. We know we don't want two 
 > renames running into each other, so let's be explicit about it.

 do you mean, to keep the topology for ufs_checkpath?

 YAMAMOTO Takashi

Responsible-Changed-From-To: kern-bug-people->dholland
Responsible-Changed-By: dholland@NetBSD.org
Responsible-Changed-When: Tue, 08 Apr 2008 22:19:33 +0000
Responsible-Changed-Why:
this is mine these days. :-|


From: "David A. Holland" <dholland@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/24887 CVS commit: src/sys/ufs/ufs
Date: Sun, 17 Jul 2011 22:07:59 +0000

 Module Name:	src
 Committed By:	dholland
 Date:		Sun Jul 17 22:07:59 UTC 2011

 Modified Files:
 	src/sys/ufs/ufs: ufs_extern.h ufs_lookup.c ufs_wapbl.c

 Log Message:
 Provide correct locking for ufs_wapbl_rename. Note that this does not
 fix the non-wapbl rename; that will be coming soon. This patch also
 leaves a lot of the older locking-related code around in #if 0 blocks,
 and there's a lot of leftover redundant logic. All that will be going
 away later.

 Relates to at least these PRs:

   PR kern/24887
   PR kern/41417
   PR kern/42093
   PR kern/43626

 and possibly others.


 To generate a diff of this commit:
 cvs rdiff -u -r1.65 -r1.66 src/sys/ufs/ufs/ufs_extern.h
 cvs rdiff -u -r1.110 -r1.111 src/sys/ufs/ufs/ufs_lookup.c
 cvs rdiff -u -r1.16 -r1.17 src/sys/ufs/ufs/ufs_wapbl.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Mon, 18 Jul 2011 06:48:55 +0000
State-Changed-Why:
fixed


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.