NetBSD Problem Report #44288
From jruohone@gmail.com Tue Dec 28 20:19:09 2010
Return-Path: <jruohone@gmail.com>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id F0C5B63B89A
for <gnats-bugs@gnats.netbsd.org>; Tue, 28 Dec 2010 20:19:08 +0000 (UTC)
Message-Id: <20101228201904.2994CF891@marx.bitnet>
Date: Tue, 28 Dec 2010 22:19:03 +0200 (EET)
From: Jukka Ruohonen <jruohonen@iki.fi>
Sender: a b <jruohone@gmail.com>
Reply-To: jruohonen@iki.fi
To: gnats-bugs@gnats.NetBSD.org
Subject: tmpfs_rmdir() panic
X-Send-Pr-Version: 3.95
>Number: 44288
>Category: kern
>Synopsis: tmpfs_rmdir() panic
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: rmind
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Dec 28 20:20:01 +0000 2010
>Closed-Date: Sun May 29 22:40:44 +0000 2011
>Last-Modified: Sun May 29 22:40:44 +0000 2011
>Originator: Jukka Ruohonen
>Release: NetBSD 5.1
>Organization:
-
>Environment:
i386
>Description:
The following panic can be triggered on 5.1/i386:
panic(...
__kernassert(...
tmpfs_rmdir(...
VOP_RMDIR(...
sys_rmdir(...
syscall(...
I will try to append a better trace later.
NB: The same machine was previously running -current with the exact same
setup, so it can be concluded that this has been presumably fixed in HEAD.
>How-To-Repeat:
-
>Fix:
Unknown.
>Release-Note:
>Audit-Trail:
From: Jukka Ruohonen <jruohonen@iki.fi>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/44288: tmpfs_rmdir() panic
Date: Wed, 29 Dec 2010 17:48:12 +0200
Two remarks:
1. This is not fixed in HEAD. Building archivers/gtar-base triggers it
realiably also on AMD64, 5.99.42.
2. The assertion is:
"node->tn_links == 0" failed: file
"/usr/src/sys/fs/tmpfs/tmpfs_vnops.c", line 1108.
The call sequence is the same;
syscall()->sys_rmdir()->VOP_RMDIR()->tmpfs_rmdir()
->kern_assert()->panic().
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/44288: tmpfs_rmdir() panic
Date: Sun, 2 Jan 2011 00:39:18 +0000
On Wed, Dec 29, 2010 at 03:50:03PM +0000, Jukka Ruohonen wrote:
> 2. The assertion is:
>
> "node->tn_links == 0" failed: file
> "/usr/src/sys/fs/tmpfs/tmpfs_vnops.c", line 1108.
Can you slap this in and see if it goes off instead? Probably the
problem is that the direectory is corrupt; asserting earlier rules out
a gaffe somewhere in the rmdir logic itself.
Index: tmpfs_vnops.c
===================================================================
RCS file: /cvsroot/src/sys/fs/tmpfs/tmpfs_vnops.c,v
retrieving revision 1.75
diff -p -U 5 -r1.75 tmpfs_vnops.c
--- tmpfs_vnops.c 30 Nov 2010 10:43:04 -0000 1.75
+++ tmpfs_vnops.c 2 Jan 2011 00:35:25 -0000
@@ -1069,10 +1069,11 @@ tmpfs_rmdir(void *v)
* removed. */
if (node->tn_size > 0) {
error = ENOTEMPTY;
goto out;
}
+ KASSERT(node->tn_links == 2);
/* This invariant holds only if we are not trying to remove "..".
* We checked for that above so this is safe now. */
KASSERT(node->tn_spec.tn_dir.tn_parent == dnode);
--
David A. Holland
dholland@netbsd.org
From: Jukka Ruohonen <jruohonen@iki.fi>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/44288: tmpfs_rmdir() panic
Date: Sun, 2 Jan 2011 21:54:46 +0200
On Sun, Jan 02, 2011 at 12:40:04AM +0000, David Holland wrote:
> Can you slap this in and see if it goes off instead? Probably the
> problem is that the direectory is corrupt; asserting earlier rules out
> a gaffe somewhere in the rmdir logic itself.
...
> + KASSERT(node->tn_links == 2);
The above assertion indeed goes off.
- Jukka.
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/44288: tmpfs_rmdir() panic
Date: Tue, 4 Jan 2011 06:49:23 +0000
On Sun, Jan 02, 2011 at 07:55:01PM +0000, Jukka Ruohonen wrote:
> > + KASSERT(node->tn_links == 2);
>
> The above assertion indeed goes off.
I'm not particularly surprised.
Now I guess we get to figure out what's corrupting the link count...
Maybe try changing the above assertion to
if (node->tn_links != 2) {
printf("tmpfs_rmdir: tn_links is %u, should be 2\n", node->tn_links);
node->tn_links = 2;
}
and running the workload under ktrace to see what it's doing that
causes the problem?
(well, I suppose it's about 98% likely that it's rename)
--
David A. Holland
dholland@netbsd.org
From: Taylor R Campbell <campbell@mumble.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/44288: tmpfs_rmdir() panic
Date: Mon, 28 Feb 2011 16:35:50 +0000
This transcript doesn't look right. `rename' is a simple wrapper for
the rename(2) system call, with none of the bells & whistles of mv(1).
/tmp is mounted on tmpfs.
% cd /tmp
% mkdir test
% cd test
% mkdir foo bar
% ls -lid * .
28570683 drwxr-xr-x 4 riastradh wheel 40 Feb 28 15:56 .
28570657 drwxr-xr-x 2 riastradh wheel 0 Feb 28 15:56 bar
28570662 drwxr-xr-x 2 riastradh wheel 0 Feb 28 15:56 foo
% rename foo bar
% ls -lid * .
28570683 drwxr-xr-x 4 riastradh wheel 20 Feb 28 15:56 .
28570662 drwxr-xr-x 2 riastradh wheel 0 Feb 28 15:56 bar
Note that the link count on . doesn't change. Looks like tmpfs_rename
needs to decrement the link count of tdnode if it renames a directory
over a directory, since tmpfs_dir_detach leaves the caller with the
responsibility of adjusting the directory's link count (see, e.g.,
tmpfs_rmdir). I'll try this in a bit with rump. (For now, uh, I'll
try to make sure nobody rmdirs /tmp/test on my laptop...oops.)
--- tmpfs_vnops.c 2011-01-29 14:52:38.000000000 +0000
+++ tmpfs_vnops.c 2011-02-28 16:34:52.000000000 +0000
@@ -982,6 +982,12 @@
KASSERT(de2->td_node == tnode);
tmpfs_dir_detach(tdvp, de2);
+ /* If we just detached a directory, decrement the
+ * parent's link count, since the to-node's .. will no
+ * longer point to it. */
+ if (tnode->tn_type == VDIR)
+ tdnode->tn_links--;
+
/* Free the directory entry we just deleted. Note that the
* node referred by it will not be removed until the vnode is
* really reclaimed. */
From: Antti Kantee <pooka@cs.hut.fi>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/44288
Date: Mon, 28 Feb 2011 20:07:21 +0200
tests/vfs/t_vnops has plenty of rename-tests, but of course it misses
this one, or at least triggering the bug.
I tested it, and you are correct. Can you propose a patch to vfs/t_vnops
so we can conveniently rule out this bug from all file systems?
log:
$ mkdir test
$ cd test
$ mkdir foo bar
$ ~/a.out foo bar
$ cd ..
$ ls -l
total 0
drwxr-xr-x 3 root wheel 0 Feb 28 19:00 test
$ rmdir test
And meanwhile, in the server window:
pain-rustique:1:~> rump_server -lrumpvfs -lrumpfs_tmpfs -s unix:///tmp/tmpfs
panic: kernel diagnostic assertion "node->tn_links == 0" failed: file "/usr/alls
rc/src/sys/rump/fs/lib/libtmpfs/../../../../fs/tmpfs/tmpfs_vnops.c", line 1116
rump kernel halting...
halted
Abort (core dumped)
pain-rustique:2:~> gdb rump_server rump_server.core
[...]
#0 0xbbac60d7 in _lwp_kill () from /usr/lib/libc.so.12
#1 0xbbac608e in raise () from /usr/lib/libc.so.12
#2 0xbbac5914 in abort () from /usr/lib/libc.so.12
#3 0xbbb1610d in rumpuser_exit (rv=-1) at rumpuser.c:541
#4 0xbbbb69fb in cpu_reboot (howto=4, bootstr=0x0) at rump.c:570
#5 0xbbb80ce0 in panic (
fmt=0xbbbbecec "kernel %sassertion \"%s\" failed: file \"%s\", line %d")
at /sys/rump/librump/rumpkern/../../../kern/subr_prf.c:302
#6 0xbbb64504 in kern_assert (t=0xbb9871bc "diagnostic ",
f=0xbb987170 "/usr/allsrc/src/sys/rump/fs/lib/libtmpfs/../../../../fs/tmpfs/tmpfs_vnops.c", l=1116, e=0xbb987764 "node->tn_links == 0")
at /sys/rump/librump/rumpkern/../../../lib/libkern/kern_assert.c:50
#7 0xbb982570 in tmpfs_rmdir (v=0xb8bff840)
at /usr/allsrc/src/sys/rump/fs/lib/libtmpfs/../../../../fs/tmpfs/tmpfs_vnops.c:1116
#8 0xbbb78202 in VOP_RMDIR (dvp=0xbb8504c8, vp=0xbb8502c4, cnp=0xb8bff8a0)
at /sys/rump/librump/rumpkern/../../../kern/vnode_if.c:874
#9 0xbb9b9b30 in sys_rmdir (l=0xb9ec7c80, uap=0xb8e02098, retval=0xb8bff97c)
at /sys/rump/librump/rumpvfs/../../../kern/vfs_syscalls.c:3708
#10 0xbbbb6f7a in sy_call (sy=0xbbbccb6c, l=0xb9ec7c80, uap=0xb8e02098,
rval=0xb8bff97c) at /sys/rump/librump/rumpkern/../../../sys/syscallvar.h:61
#11 0xbbbb6f3a in rump_proxy_syscall (num=137, arg=0xb8e02098,
retval=0xb8bff97c) at rump.c:753
#12 0xbbb10838 in rumpsyscall (sysnum=137, data=0xb8e02098, retval=0xb8bff97c)
at rumpuser_sp.c:222
#13 0xbbb1134c in serv_handlesyscall (spc=0xbbb17e38, rhdr=0xb8e03194,
data=0xb8e02098 ":÷żż", '˙' <repeats 112 times>, ".") at rumpuser_sp.c:598
#14 0xbbb114b6 in serv_syscallbouncer (arg=0x0) at rumpuser_sp.c:647
#15 0xbbb084c7 in pthread_setcancelstate () from /usr/lib/libpthread.so.0
#16 0xbba39610 in swapcontext () from /usr/lib/libc.so.12
(gdb)
From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/44288: tmpfs_rmdir() panic
Date: Mon, 28 Feb 2011 18:57:47 +0000
I just tested the patch with rump. Without the patch, the link count
was wrong, and rmdiring test/bar and then test made rump_tmpfs crash
on the same KASSERT as jruoho observed. With the patch, the link
count is right, and rmdiring test/bar and then test works dandily.
I haven't run any other tmpfs tests with the patch, though.
From: "Antti Kantee" <pooka@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/44288 CVS commit: src/tests/fs/vfs
Date: Tue, 1 Mar 2011 14:21:47 +0000
Module Name: src
Committed By: pooka
Date: Tue Mar 1 14:21:46 UTC 2011
Modified Files:
src/tests/fs/vfs: t_vnops.c
Log Message:
augment rename test case with the failure from PR kern/44288
To generate a diff of this commit:
cvs rdiff -u -r1.15 -r1.16 src/tests/fs/vfs/t_vnops.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Jukka Ruohonen <jruohonen@iki.fi>
To: gnats-bugs@NetBSD.org
Cc: Taylor R Campbell <campbell@mumble.net>,
David Holland <dholland-bugs@NetBSD.org>,
Antti Kantee <pooka@cs.hut.fi>
Subject: Re: kern/44288: tmpfs_rmdir() panic
Date: Sat, 19 Mar 2011 22:03:37 +0200
On Mon, Feb 28, 2011 at 07:00:07PM +0000, Taylor R Campbell wrote:
> I just tested the patch with rump. Without the patch, the link count
> was wrong, and rmdiring test/bar and then test made rump_tmpfs crash
> on the same KASSERT as jruoho observed. With the patch, the link
> count is right, and rmdiring test/bar and then test works dandily.
I am sorry to inform, but the assertion still goes off with this patch
+ /* If we just detached a directory, decrement the
+ * parent's link count, since the to-node's .. will no
+ * longer point to it. */
+ if (tnode->tn_type == VDIR)
+ tdnode->tn_links--;
+
The frequency of it has also increased in real-world usage, given that it is
now triggered by several other pkgsrc packages as well.
- Jukka.
From: "Mindaugas Rasiukevicius" <rmind@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/44288 CVS commit: src/sys/fs/tmpfs
Date: Sun, 29 May 2011 22:29:08 +0000
Module Name: src
Committed By: rmind
Date: Sun May 29 22:29:07 UTC 2011
Modified Files:
src/sys/fs/tmpfs: tmpfs.h tmpfs_subr.c tmpfs_vfsops.c tmpfs_vnops.c
Log Message:
- Rework and document inode reference counting. Also document inode life
cycle (destruction part). Perform link counting in tmpfs_dir_attach()
and tmpfs_dir_detach(), instead of alloc/free and arbitrary places.
Fixes PR/44285, PR/44288, PR/44657 and likely PR/42484.
- Fix the race between the lookup and inode destruction. Fixes PR/43167
and its duplicates PR/40088, PR/40757.
- Improve tmpfs_rename() locking a little, fix kqueue event notifications
and also fix PR/43617. Add simplistic tmpfs_parentcheck_p(); to be
expanded and used for further rename() locking fixes.
- Cache directory entry "hint" in the tmpfs node, add tmpfs_dir_cached(),
and thus avoid unnecessary lookup in tmpfs_remove() and tmpfs_rmdir().
- Set correct _PC_FILESIZEBITS value in tmpfs_pathconf(). Fixes PR/43576.
- Few minor fixes.
To generate a diff of this commit:
cvs rdiff -u -r1.43 -r1.44 src/sys/fs/tmpfs/tmpfs.h
cvs rdiff -u -r1.70 -r1.71 src/sys/fs/tmpfs/tmpfs_subr.c
cvs rdiff -u -r1.50 -r1.51 src/sys/fs/tmpfs/tmpfs_vfsops.c
cvs rdiff -u -r1.84 -r1.85 src/sys/fs/tmpfs/tmpfs_vnops.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
Responsible-Changed-From-To: kern-bug-people->rmind
Responsible-Changed-By: rmind@NetBSD.org
Responsible-Changed-When: Sun, 29 May 2011 22:40:44 +0000
Responsible-Changed-Why:
State-Changed-From-To: open->closed
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Sun, 29 May 2011 22:40:44 +0000
State-Changed-Why:
Fixed, thanks.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.