NetBSD Problem Report #47040

From campbell@mumble.net  Fri Oct  5 13:58:40 2012
Return-Path: <campbell@mumble.net>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id 0F17363B8A7
	for <gnats-bugs@gnats.NetBSD.org>; Fri,  5 Oct 2012 13:58:40 +0000 (UTC)
Message-Id: <20121005135812.CC1B1604AF@jupiter.mumble.net>
Date: Fri,  5 Oct 2012 13:58:12 +0000 (UTC)
From: Taylor R Campbell <campbell+netbsd@mumble.net>
Reply-To: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@gnats.NetBSD.org
Subject: renaming mount point panics in namei
X-Send-Pr-Version: 3.95

>Number:         47040
>Category:       kern
>Synopsis:       renaming mount point panics in namei
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    dholland
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Oct 05 14:00:08 +0000 2012
>Closed-Date:    Tue Apr 12 04:47:15 +0000 2016
>Last-Modified:  Tue Apr 12 04:47:15 +0000 2016
>Originator:     Taylor R Campbell <campbell+netbsd@mumble.net>
>Release:        NetBSD 6.99.11
>Organization:
>Environment:
Architecture: amd64
Machine: amd64
>Description:

	I had a directory /foo/bar/current.ro, and I wanted to rename
	it to /foo/current.ro, so I ran

		% mv bar/current.ro .

	from /foo, forgetting that /foo/bar/current.ro had a file
	system mounted.  This immediately triggered a panic in namei:

		panic: leaf `current.ro' should be empty

	I am not sure which instance of the panic this is -- whether
	it's the one in lookup_once or the one in relookup -- because I
	can't read a kernel core dump on this machine.

>How-To-Repeat:

	Rename a mount point.

>Fix:

	Yes, please!

	(Real fix: rewrite namei...)

>Release-Note:

>Audit-Trail:
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/47040: renaming mount point panics in namei
Date: Tue, 9 Oct 2012 15:19:56 +0000

 On Fri, Oct 05, 2012 at 02:00:08PM +0000, Taylor R Campbell wrote:
  > 	I had a directory /foo/bar/current.ro, and I wanted to rename
  > 	it to /foo/current.ro, so I ran
  > 
  > 		% mv bar/current.ro .
  > 
  > 	from /foo, forgetting that /foo/bar/current.ro had a file
  > 	system mounted.  This immediately triggered a panic in namei:
  > 
  > 		panic: leaf `current.ro' should be empty
  > 
  > 	I am not sure which instance of the panic this is -- whether
  > 	it's the one in lookup_once or the one in relookup -- because I
  > 	can't read a kernel core dump on this machine.

 It doesn't matter, because either way it's caused by VOP_LOOKUP
 returning both an error and a result, which it's not supposed to do.

 Which fs(es)?

  > 	(Real fix: rewrite namei...)

 Not this time! :-p

 -- 
 David A. Holland
 dholland@netbsd.org

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: kern/47040: renaming mount point panics in namei
Date: Tue, 9 Oct 2012 19:06:23 +0000

    Date: Tue, 9 Oct 2012 15:19:56 +0000
    From: David Holland <dholland-bugs@netbsd.org>

    On Fri, Oct 05, 2012 at 02:00:08PM +0000, Taylor R Campbell wrote:
     > 	I had a directory /foo/bar/current.ro, and I wanted to rename
     > 	it to /foo/current.ro, so I ran
     > 
     > 		% mv bar/current.ro .
     > 
     > 	from /foo, forgetting that /foo/bar/current.ro had a file
     > 	system mounted.  This immediately triggered a panic in namei:
     > 
     > 		panic: leaf `current.ro' should be empty

    Which fs(es)?

 / is ffs sync, /foo is ffs async, and /foo/bar/current.ro is nullfs
 mounted read-only from /foo/bar/current, which is not a mount point.

From: "David A. Holland" <dholland@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/47040 CVS commit: src/sys
Date: Wed, 10 Oct 2012 06:55:26 +0000

 Module Name:	src
 Committed By:	dholland
 Date:		Wed Oct 10 06:55:26 UTC 2012

 Modified Files:
 	src/sys/kern: vfs_lookup.c
 	src/sys/miscfs/genfs: layer_vnops.c

 Log Message:
 In layer_lookup(), clear *vpp before returning EROFS, as otherwise a
 stale value can be returned and this causes a diagnostic panic in
 namei.

 In relookup(), clear *vpp before calling VOP_LOOKUP, as is done in
 lookup_once(), as an additional precautionary measure.

 (in theory both of these fixes are not required together)

 Should fix PR 47040.


 To generate a diff of this commit:
 cvs rdiff -u -r1.194 -r1.195 src/sys/kern/vfs_lookup.c
 cvs rdiff -u -r1.50 -r1.51 src/sys/miscfs/genfs/layer_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Wed, 10 Oct 2012 07:01:44 +0000
State-Changed-Why:
please try again


From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org,
	gnats-admin@netbsd.org, dholland@NetBSD.org
Subject: Re: kern/47040 (renaming mount point panics in namei)
Date: Thu, 11 Oct 2012 00:40:08 +0000

 No longer seems to panic, although EROFS (when the nullfs is mounted
 read-only) and ENOENT (when it is mounted read-write) seem like the
 wrong error codes.  Surely this should give EBUSY or something, no?

State-Changed-From-To: feedback->pending-pullups
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 04 Nov 2012 22:55:13 +0000
State-Changed-Why:
pullup-6 #664
pullup-5 #1814
pullup-4 #1466


From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: Taylor R Campbell <campbell+netbsd@mumble.net>
Subject: Re: kern/47040 (renaming mount point panics in namei)
Date: Mon, 5 Nov 2012 00:01:35 +0000

 On Thu, Oct 11, 2012 at 12:45:03AM +0000, Taylor R Campbell wrote:
  >  No longer seems to panic, although EROFS (when the nullfs is mounted
  >  read-only) and ENOENT (when it is mounted read-write) seem like the
  >  wrong error codes.  Surely this should give EBUSY or something, no?

 Yes, it should.

 There isn't any useful way to circumvent EROFS until we have
 lookparent and vfs-level rename using it. However, the ENOENT is
 baloney and arises as follows:

   1. looking up bar/current.ro looks in bar (ffs) and gets the
      underlying mounted-on directory current.ro.
   2. lookup_once finds that this directory is mounted on, and crosses
      the mount point, to get the visible current.ro vnode, the root of
      the nullfs.
   3. doing this sets searchdir = foundobj, which is bogus.
   4. rename then gets back the nullfs root as both fdvp and fvp.
   5. rename then tries to use the nullfs root to relookup "current.ro",
      and since it isn't there we get ENOENT.

 If it *is* there, we'll probably get EXDEV.

 I thought the searchdir = foundobj was necessary (there is similar
 nonsense elsewhere that *is* necessary), but it turns out to have been
 introduced by yamt when he added the "avoid locking vnodes from two
 filesystems" logic in -r1.190, and it's wrong. The right thing to do
 in that case is to set searchdir = NULL.

 So most of -r1.190 should be reverted; however, clearing searchdir
 there requires some additional null checks (both in lookup_once and in
 the callers) and there's an annoying family of corner cases involving
 mounted volumes whose root vnodes are relative symlinks, so I don't
 think I'm going to get this done right away.

 With those we should get EBUSY again, though.

 -- 
 David A. Holland
 dholland@netbsd.org

From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/47040 CVS commit: [netbsd-5] src/sys
Date: Tue, 6 Nov 2012 20:02:47 +0000

 Module Name:	src
 Committed By:	riz
 Date:		Tue Nov  6 20:02:47 UTC 2012

 Modified Files:
 	src/sys/kern [netbsd-5]: vfs_lookup.c
 	src/sys/miscfs/genfs [netbsd-5]: layer_vnops.c

 Log Message:
 Pull up following revision(s) (requested by dholland in ticket #1814):
 	sys/kern/vfs_lookup.c: revision 1.195
 	sys/miscfs/genfs/layer_vnops.c: revision 1.51
 In layer_lookup(), clear *vpp before returning EROFS, as otherwise a
 stale value can be returned and this causes a diagnostic panic in
 namei.
 In relookup(), clear *vpp before calling VOP_LOOKUP, as is done in
 lookup_once(), as an additional precautionary measure.
 (in theory both of these fixes are not required together)
 Should fix PR 47040.


 To generate a diff of this commit:
 cvs rdiff -u -r1.110.4.1 -r1.110.4.2 src/sys/kern/vfs_lookup.c
 cvs rdiff -u -r1.35 -r1.35.20.1 src/sys/miscfs/genfs/layer_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/47040 CVS commit: [netbsd-4] src/sys
Date: Wed, 14 Nov 2012 20:07:45 +0000

 Module Name:	src
 Committed By:	riz
 Date:		Wed Nov 14 20:07:45 UTC 2012

 Modified Files:
 	src/sys/kern [netbsd-4]: vfs_lookup.c
 	src/sys/miscfs/genfs [netbsd-4]: layer_vnops.c

 Log Message:
 Pull up following revision(s) (requested by dholland in ticket #1466):
 	sys/kern/vfs_lookup.c: revision 1.195
 	sys/miscfs/genfs/layer_vnops.c: revision 1.51
 In layer_lookup(), clear *vpp before returning EROFS, as otherwise a
 stale value can be returned and this causes a diagnostic panic in
 namei.
 In relookup(), clear *vpp before calling VOP_LOOKUP, as is done in
 lookup_once(), as an additional precautionary measure.
 (in theory both of these fixes are not required together)
 Should fix PR 47040.


 To generate a diff of this commit:
 cvs rdiff -u -r1.72.2.3 -r1.72.2.4 src/sys/kern/vfs_lookup.c
 cvs rdiff -u -r1.28.2.2 -r1.28.2.3 src/sys/miscfs/genfs/layer_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/47040 CVS commit: [netbsd-4-0] src/sys
Date: Wed, 14 Nov 2012 20:15:37 +0000

 Module Name:	src
 Committed By:	riz
 Date:		Wed Nov 14 20:15:37 UTC 2012

 Modified Files:
 	src/sys/kern [netbsd-4-0]: vfs_lookup.c
 	src/sys/miscfs/genfs [netbsd-4-0]: layer_vnops.c

 Log Message:
 Pull up following revision(s) (requested by dholland in ticket #1466):
 	sys/kern/vfs_lookup.c: revision 1.195
 	sys/miscfs/genfs/layer_vnops.c: revision 1.51
 In layer_lookup(), clear *vpp before returning EROFS, as otherwise a
 stale value can be returned and this causes a diagnostic panic in
 namei.
 In relookup(), clear *vpp before calling VOP_LOOKUP, as is done in
 lookup_once(), as an additional precautionary measure.
 (in theory both of these fixes are not required together)
 Should fix PR 47040.


 To generate a diff of this commit:
 cvs rdiff -u -r1.72.2.3 -r1.72.2.3.6.1 src/sys/kern/vfs_lookup.c
 cvs rdiff -u -r1.28.2.2 -r1.28.2.2.6.1 src/sys/miscfs/genfs/layer_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "SAITOH Masanobu" <msaitoh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/47040 CVS commit: [netbsd-6] src/sys
Date: Sun, 18 Nov 2012 18:36:59 +0000

 Module Name:	src
 Committed By:	msaitoh
 Date:		Sun Nov 18 18:36:59 UTC 2012

 Modified Files:
 	src/sys/kern [netbsd-6]: vfs_lookup.c
 	src/sys/miscfs/genfs [netbsd-6]: layer_vnops.c

 Log Message:
 Pull up following revision(s) (requested by dholland in ticket #664):
 	sys/kern/vfs_lookup.c: revision 1.195
 	sys/miscfs/genfs/layer_vnops.c: revision 1.51
 In layer_lookup(), clear *vpp before returning EROFS, as otherwise a
 stale value can be returned and this causes a diagnostic panic in
 namei.
 In relookup(), clear *vpp before calling VOP_LOOKUP, as is done in
 lookup_once(), as an additional precautionary measure.
 (in theory both of these fixes are not required together)
 Should fix PR 47040.


 To generate a diff of this commit:
 cvs rdiff -u -r1.192 -r1.192.8.1 src/sys/kern/vfs_lookup.c
 cvs rdiff -u -r1.50 -r1.50.8.1 src/sys/miscfs/genfs/layer_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: pending-pullups->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 18 Nov 2012 18:42:02 +0000
State-Changed-Why:
pullups for the main problem are done; another issue remains
(affects only current and netbsd-6)


Responsible-Changed-From-To: kern-bug-people->dholland
Responsible-Changed-By: dholland@NetBSD.org
Responsible-Changed-When: Sun, 18 Nov 2012 18:42:43 +0000
Responsible-Changed-Why:
I'd better do it


From: David Holland <dholland@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/47040 (renaming mount point panics in namei)
Date: Mon, 11 Apr 2016 00:00:03 +0000

 On Mon, Nov 05, 2012 at 12:01:35AM +0000, David Holland wrote:
  > I don't think I'm going to get this done right away.

 Bit of an understatement there...

 I have now finally managed to get to this. Patch is here:
    http://www.netbsd.org/~dholland/tmp/pr47040.diff

 Currently doing an anita run; if it passes that I'll commit, but any
 further inspection/eyeballing anyone passing by wants to do would be
 welcome.

 -- 
 David A. Holland
 dholland@netbsd.org

From: "David A. Holland" <dholland@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/47040 CVS commit: src/sys/kern
Date: Tue, 12 Apr 2016 04:02:55 +0000

 Module Name:	src
 Committed By:	dholland
 Date:		Tue Apr 12 04:02:55 UTC 2016

 Modified Files:
 	src/sys/kern: vfs_lookup.c

 Log Message:
 Fix (finally) the rest of PR 47040.

 Revert the supporting logic in -r1.190 of vfs_lookup.c, and fix the
 important change to set searchdir = NULL instead of searchdir =
 foundobj. Then supply the necessary new supporting logic to cope with
 some new cases where searchdir can be null.

 This is at the point when lookup_once crosses a mountpoint going down;
 the idea was to avoid coupling locks across filesystems as that has a
 number of potentially negative consequences. At this stage of namei,
 though, it's important to set searchdir to null as this is what is
 used later on to handle other cases arising from crossing mount
 points. If you set it to be the same as foundobj, that instead creates
 the impression that you looked up "/." on the new volume, and that
 causes odd things to happen in corner cases such as the one appearing
 in PR 47040.

 This fix ought to be pulled up to -6 and -7, and it probably could be
 safely, but given the delicacy of this code and the fact that it's
 taken me more than three years to find the combination of time and
 intestinal fortitude to do it, as well as the minor nature of the
 resulting wrong behavior observed so far, I think we'll let that part
 go.

 This change also exposes an annoying corner case: if you cross a mount
 point and the root directory vnode of the new volume is not a
 directory but a symlink, we now have no searchdir to follow the
 symlink relative to. In principle one could hang onto the searchdir
 from before calling lookup_once and use that, or complexify the
 interface of lookup_once to hang onto it as desired for this case.
 Alternatively one could add the necessary null checks to namei_follow
 and allow only absolute symlinks in this case, as for an absolute
 symlink one doesn't need the old searchdir. However, given that only
 broken filesystems have symlinks as their root vnodes, I'm not going
 to bother. Instead if this happens we'll just fail with ENOTDIR.


 To generate a diff of this commit:
 cvs rdiff -u -r1.203 -r1.204 src/sys/kern/vfs_lookup.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Tue, 12 Apr 2016 04:47:15 +0000
State-Changed-Why:
fixed, finally.


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2014 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.