NetBSD Problem Report #47040
From campbell@mumble.net Fri Oct 5 13:58:40 2012
Return-Path: <campbell@mumble.net>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
by www.NetBSD.org (Postfix) with ESMTP id 0F17363B8A7
for <gnats-bugs@gnats.NetBSD.org>; Fri, 5 Oct 2012 13:58:40 +0000 (UTC)
Message-Id: <20121005135812.CC1B1604AF@jupiter.mumble.net>
Date: Fri, 5 Oct 2012 13:58:12 +0000 (UTC)
From: Taylor R Campbell <campbell+netbsd@mumble.net>
Reply-To: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@gnats.NetBSD.org
Subject: renaming mount point panics in namei
X-Send-Pr-Version: 3.95
>Number: 47040
>Category: kern
>Synopsis: renaming mount point panics in namei
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: dholland
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Oct 05 14:00:08 +0000 2012
>Closed-Date: Tue Apr 12 04:47:15 +0000 2016
>Last-Modified: Tue Apr 12 04:47:15 +0000 2016
>Originator: Taylor R Campbell <campbell+netbsd@mumble.net>
>Release: NetBSD 6.99.11
>Organization:
>Environment:
Architecture: amd64
Machine: amd64
>Description:
I had a directory /foo/bar/current.ro, and I wanted to rename
it to /foo/current.ro, so I ran
% mv bar/current.ro .
from /foo, forgetting that /foo/bar/current.ro had a file
system mounted. This immediately triggered a panic in namei:
panic: leaf `current.ro' should be empty
I am not sure which instance of the panic this is -- whether
it's the one in lookup_once or the one in relookup -- because I
can't read a kernel core dump on this machine.
>How-To-Repeat:
Rename a mount point.
>Fix:
Yes, please!
(Real fix: rewrite namei...)
>Release-Note:
>Audit-Trail:
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/47040: renaming mount point panics in namei
Date: Tue, 9 Oct 2012 15:19:56 +0000
On Fri, Oct 05, 2012 at 02:00:08PM +0000, Taylor R Campbell wrote:
> I had a directory /foo/bar/current.ro, and I wanted to rename
> it to /foo/current.ro, so I ran
>
> % mv bar/current.ro .
>
> from /foo, forgetting that /foo/bar/current.ro had a file
> system mounted. This immediately triggered a panic in namei:
>
> panic: leaf `current.ro' should be empty
>
> I am not sure which instance of the panic this is -- whether
> it's the one in lookup_once or the one in relookup -- because I
> can't read a kernel core dump on this machine.
It doesn't matter, because either way it's caused by VOP_LOOKUP
returning both an error and a result, which it's not supposed to do.
Which fs(es)?
> (Real fix: rewrite namei...)
Not this time! :-p
--
David A. Holland
dholland@netbsd.org
From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/47040: renaming mount point panics in namei
Date: Tue, 9 Oct 2012 19:06:23 +0000
Date: Tue, 9 Oct 2012 15:19:56 +0000
From: David Holland <dholland-bugs@netbsd.org>
On Fri, Oct 05, 2012 at 02:00:08PM +0000, Taylor R Campbell wrote:
> I had a directory /foo/bar/current.ro, and I wanted to rename
> it to /foo/current.ro, so I ran
>
> % mv bar/current.ro .
>
> from /foo, forgetting that /foo/bar/current.ro had a file
> system mounted. This immediately triggered a panic in namei:
>
> panic: leaf `current.ro' should be empty
Which fs(es)?
/ is ffs sync, /foo is ffs async, and /foo/bar/current.ro is nullfs
mounted read-only from /foo/bar/current, which is not a mount point.
From: "David A. Holland" <dholland@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47040 CVS commit: src/sys
Date: Wed, 10 Oct 2012 06:55:26 +0000
Module Name: src
Committed By: dholland
Date: Wed Oct 10 06:55:26 UTC 2012
Modified Files:
src/sys/kern: vfs_lookup.c
src/sys/miscfs/genfs: layer_vnops.c
Log Message:
In layer_lookup(), clear *vpp before returning EROFS, as otherwise a
stale value can be returned and this causes a diagnostic panic in
namei.
In relookup(), clear *vpp before calling VOP_LOOKUP, as is done in
lookup_once(), as an additional precautionary measure.
(in theory both of these fixes are not required together)
Should fix PR 47040.
To generate a diff of this commit:
cvs rdiff -u -r1.194 -r1.195 src/sys/kern/vfs_lookup.c
cvs rdiff -u -r1.50 -r1.51 src/sys/miscfs/genfs/layer_vnops.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Wed, 10 Oct 2012 07:01:44 +0000
State-Changed-Why:
please try again
From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org,
gnats-admin@netbsd.org, dholland@NetBSD.org
Subject: Re: kern/47040 (renaming mount point panics in namei)
Date: Thu, 11 Oct 2012 00:40:08 +0000
No longer seems to panic, although EROFS (when the nullfs is mounted
read-only) and ENOENT (when it is mounted read-write) seem like the
wrong error codes. Surely this should give EBUSY or something, no?
State-Changed-From-To: feedback->pending-pullups
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 04 Nov 2012 22:55:13 +0000
State-Changed-Why:
pullup-6 #664
pullup-5 #1814
pullup-4 #1466
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: Taylor R Campbell <campbell+netbsd@mumble.net>
Subject: Re: kern/47040 (renaming mount point panics in namei)
Date: Mon, 5 Nov 2012 00:01:35 +0000
On Thu, Oct 11, 2012 at 12:45:03AM +0000, Taylor R Campbell wrote:
> No longer seems to panic, although EROFS (when the nullfs is mounted
> read-only) and ENOENT (when it is mounted read-write) seem like the
> wrong error codes. Surely this should give EBUSY or something, no?
Yes, it should.
There isn't any useful way to circumvent EROFS until we have
lookparent and vfs-level rename using it. However, the ENOENT is
baloney and arises as follows:
1. looking up bar/current.ro looks in bar (ffs) and gets the
underlying mounted-on directory current.ro.
2. lookup_once finds that this directory is mounted on, and crosses
the mount point, to get the visible current.ro vnode, the root of
the nullfs.
3. doing this sets searchdir = foundobj, which is bogus.
4. rename then gets back the nullfs root as both fdvp and fvp.
5. rename then tries to use the nullfs root to relookup "current.ro",
and since it isn't there we get ENOENT.
If it *is* there, we'll probably get EXDEV.
I thought the searchdir = foundobj was necessary (there is similar
nonsense elsewhere that *is* necessary), but it turns out to have been
introduced by yamt when he added the "avoid locking vnodes from two
filesystems" logic in -r1.190, and it's wrong. The right thing to do
in that case is to set searchdir = NULL.
So most of -r1.190 should be reverted; however, clearing searchdir
there requires some additional null checks (both in lookup_once and in
the callers) and there's an annoying family of corner cases involving
mounted volumes whose root vnodes are relative symlinks, so I don't
think I'm going to get this done right away.
With those we should get EBUSY again, though.
--
David A. Holland
dholland@netbsd.org
From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47040 CVS commit: [netbsd-5] src/sys
Date: Tue, 6 Nov 2012 20:02:47 +0000
Module Name: src
Committed By: riz
Date: Tue Nov 6 20:02:47 UTC 2012
Modified Files:
src/sys/kern [netbsd-5]: vfs_lookup.c
src/sys/miscfs/genfs [netbsd-5]: layer_vnops.c
Log Message:
Pull up following revision(s) (requested by dholland in ticket #1814):
sys/kern/vfs_lookup.c: revision 1.195
sys/miscfs/genfs/layer_vnops.c: revision 1.51
In layer_lookup(), clear *vpp before returning EROFS, as otherwise a
stale value can be returned and this causes a diagnostic panic in
namei.
In relookup(), clear *vpp before calling VOP_LOOKUP, as is done in
lookup_once(), as an additional precautionary measure.
(in theory both of these fixes are not required together)
Should fix PR 47040.
To generate a diff of this commit:
cvs rdiff -u -r1.110.4.1 -r1.110.4.2 src/sys/kern/vfs_lookup.c
cvs rdiff -u -r1.35 -r1.35.20.1 src/sys/miscfs/genfs/layer_vnops.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47040 CVS commit: [netbsd-4] src/sys
Date: Wed, 14 Nov 2012 20:07:45 +0000
Module Name: src
Committed By: riz
Date: Wed Nov 14 20:07:45 UTC 2012
Modified Files:
src/sys/kern [netbsd-4]: vfs_lookup.c
src/sys/miscfs/genfs [netbsd-4]: layer_vnops.c
Log Message:
Pull up following revision(s) (requested by dholland in ticket #1466):
sys/kern/vfs_lookup.c: revision 1.195
sys/miscfs/genfs/layer_vnops.c: revision 1.51
In layer_lookup(), clear *vpp before returning EROFS, as otherwise a
stale value can be returned and this causes a diagnostic panic in
namei.
In relookup(), clear *vpp before calling VOP_LOOKUP, as is done in
lookup_once(), as an additional precautionary measure.
(in theory both of these fixes are not required together)
Should fix PR 47040.
To generate a diff of this commit:
cvs rdiff -u -r1.72.2.3 -r1.72.2.4 src/sys/kern/vfs_lookup.c
cvs rdiff -u -r1.28.2.2 -r1.28.2.3 src/sys/miscfs/genfs/layer_vnops.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47040 CVS commit: [netbsd-4-0] src/sys
Date: Wed, 14 Nov 2012 20:15:37 +0000
Module Name: src
Committed By: riz
Date: Wed Nov 14 20:15:37 UTC 2012
Modified Files:
src/sys/kern [netbsd-4-0]: vfs_lookup.c
src/sys/miscfs/genfs [netbsd-4-0]: layer_vnops.c
Log Message:
Pull up following revision(s) (requested by dholland in ticket #1466):
sys/kern/vfs_lookup.c: revision 1.195
sys/miscfs/genfs/layer_vnops.c: revision 1.51
In layer_lookup(), clear *vpp before returning EROFS, as otherwise a
stale value can be returned and this causes a diagnostic panic in
namei.
In relookup(), clear *vpp before calling VOP_LOOKUP, as is done in
lookup_once(), as an additional precautionary measure.
(in theory both of these fixes are not required together)
Should fix PR 47040.
To generate a diff of this commit:
cvs rdiff -u -r1.72.2.3 -r1.72.2.3.6.1 src/sys/kern/vfs_lookup.c
cvs rdiff -u -r1.28.2.2 -r1.28.2.2.6.1 src/sys/miscfs/genfs/layer_vnops.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "SAITOH Masanobu" <msaitoh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47040 CVS commit: [netbsd-6] src/sys
Date: Sun, 18 Nov 2012 18:36:59 +0000
Module Name: src
Committed By: msaitoh
Date: Sun Nov 18 18:36:59 UTC 2012
Modified Files:
src/sys/kern [netbsd-6]: vfs_lookup.c
src/sys/miscfs/genfs [netbsd-6]: layer_vnops.c
Log Message:
Pull up following revision(s) (requested by dholland in ticket #664):
sys/kern/vfs_lookup.c: revision 1.195
sys/miscfs/genfs/layer_vnops.c: revision 1.51
In layer_lookup(), clear *vpp before returning EROFS, as otherwise a
stale value can be returned and this causes a diagnostic panic in
namei.
In relookup(), clear *vpp before calling VOP_LOOKUP, as is done in
lookup_once(), as an additional precautionary measure.
(in theory both of these fixes are not required together)
Should fix PR 47040.
To generate a diff of this commit:
cvs rdiff -u -r1.192 -r1.192.8.1 src/sys/kern/vfs_lookup.c
cvs rdiff -u -r1.50 -r1.50.8.1 src/sys/miscfs/genfs/layer_vnops.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: pending-pullups->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 18 Nov 2012 18:42:02 +0000
State-Changed-Why:
pullups for the main problem are done; another issue remains
(affects only current and netbsd-6)
Responsible-Changed-From-To: kern-bug-people->dholland
Responsible-Changed-By: dholland@NetBSD.org
Responsible-Changed-When: Sun, 18 Nov 2012 18:42:43 +0000
Responsible-Changed-Why:
I'd better do it
From: David Holland <dholland@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/47040 (renaming mount point panics in namei)
Date: Mon, 11 Apr 2016 00:00:03 +0000
On Mon, Nov 05, 2012 at 12:01:35AM +0000, David Holland wrote:
> I don't think I'm going to get this done right away.
Bit of an understatement there...
I have now finally managed to get to this. Patch is here:
http://www.netbsd.org/~dholland/tmp/pr47040.diff
Currently doing an anita run; if it passes that I'll commit, but any
further inspection/eyeballing anyone passing by wants to do would be
welcome.
--
David A. Holland
dholland@netbsd.org
From: "David A. Holland" <dholland@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47040 CVS commit: src/sys/kern
Date: Tue, 12 Apr 2016 04:02:55 +0000
Module Name: src
Committed By: dholland
Date: Tue Apr 12 04:02:55 UTC 2016
Modified Files:
src/sys/kern: vfs_lookup.c
Log Message:
Fix (finally) the rest of PR 47040.
Revert the supporting logic in -r1.190 of vfs_lookup.c, and fix the
important change to set searchdir = NULL instead of searchdir =
foundobj. Then supply the necessary new supporting logic to cope with
some new cases where searchdir can be null.
This is at the point when lookup_once crosses a mountpoint going down;
the idea was to avoid coupling locks across filesystems as that has a
number of potentially negative consequences. At this stage of namei,
though, it's important to set searchdir to null as this is what is
used later on to handle other cases arising from crossing mount
points. If you set it to be the same as foundobj, that instead creates
the impression that you looked up "/." on the new volume, and that
causes odd things to happen in corner cases such as the one appearing
in PR 47040.
This fix ought to be pulled up to -6 and -7, and it probably could be
safely, but given the delicacy of this code and the fact that it's
taken me more than three years to find the combination of time and
intestinal fortitude to do it, as well as the minor nature of the
resulting wrong behavior observed so far, I think we'll let that part
go.
This change also exposes an annoying corner case: if you cross a mount
point and the root directory vnode of the new volume is not a
directory but a symlink, we now have no searchdir to follow the
symlink relative to. In principle one could hang onto the searchdir
from before calling lookup_once and use that, or complexify the
interface of lookup_once to hang onto it as desired for this case.
Alternatively one could add the necessary null checks to namei_follow
and allow only absolute symlinks in this case, as for an absolute
symlink one doesn't need the old searchdir. However, given that only
broken filesystems have symlinks as their root vnodes, I'm not going
to bother. Instead if this happens we'll just fail with ENOTDIR.
To generate a diff of this commit:
cvs rdiff -u -r1.203 -r1.204 src/sys/kern/vfs_lookup.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Tue, 12 Apr 2016 04:47:15 +0000
State-Changed-Why:
fixed, finally.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2014
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.