NetBSD Problem Report #44288

From jruohone@gmail.com  Tue Dec 28 20:19:09 2010
Return-Path: <jruohone@gmail.com>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id F0C5B63B89A
	for <gnats-bugs@gnats.netbsd.org>; Tue, 28 Dec 2010 20:19:08 +0000 (UTC)
Message-Id: <20101228201904.2994CF891@marx.bitnet>
Date: Tue, 28 Dec 2010 22:19:03 +0200 (EET)
From: Jukka Ruohonen <jruohonen@iki.fi>
Sender: a b <jruohone@gmail.com>
Reply-To: jruohonen@iki.fi
To: gnats-bugs@gnats.NetBSD.org
Subject: tmpfs_rmdir() panic
X-Send-Pr-Version: 3.95

>Number:         44288
>Category:       kern
>Synopsis:       tmpfs_rmdir() panic
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    rmind
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Dec 28 20:20:01 +0000 2010
>Closed-Date:    Sun May 29 22:40:44 +0000 2011
>Last-Modified:  Sun May 29 22:40:44 +0000 2011
>Originator:     Jukka Ruohonen
>Release:        NetBSD 5.1
>Organization:
-
>Environment:
i386
>Description:

The following panic can be triggered on 5.1/i386:

        panic(... 
        __kernassert(...
        tmpfs_rmdir(...
        VOP_RMDIR(...
        sys_rmdir(...
        syscall(...

I will try to append a better trace later.

NB: The same machine was previously running -current with the exact same
    setup, so it can be concluded that this has been presumably fixed in HEAD.

>How-To-Repeat:

-

>Fix:

Unknown.

>Release-Note:

>Audit-Trail:
From: Jukka Ruohonen <jruohonen@iki.fi>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/44288: tmpfs_rmdir() panic
Date: Wed, 29 Dec 2010 17:48:12 +0200

 Two remarks:

 1. This is not fixed in HEAD. Building archivers/gtar-base triggers it
    realiably also on AMD64, 5.99.42.

 2. The assertion is:

 	"node->tn_links == 0" failed: file
 	"/usr/src/sys/fs/tmpfs/tmpfs_vnops.c", line 1108.

    The call sequence is the same;

  	syscall()->sys_rmdir()->VOP_RMDIR()->tmpfs_rmdir()
 		->kern_assert()->panic().

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/44288: tmpfs_rmdir() panic
Date: Sun, 2 Jan 2011 00:39:18 +0000

 On Wed, Dec 29, 2010 at 03:50:03PM +0000, Jukka Ruohonen wrote:
  >  2. The assertion is:
  >  
  >  	"node->tn_links == 0" failed: file
  >  	"/usr/src/sys/fs/tmpfs/tmpfs_vnops.c", line 1108.

 Can you slap this in and see if it goes off instead? Probably the
 problem is that the direectory is corrupt; asserting earlier rules out
 a gaffe somewhere in the rmdir logic itself.


 Index: tmpfs_vnops.c
 ===================================================================
 RCS file: /cvsroot/src/sys/fs/tmpfs/tmpfs_vnops.c,v
 retrieving revision 1.75
 diff -p -U 5 -r1.75 tmpfs_vnops.c
 --- tmpfs_vnops.c	30 Nov 2010 10:43:04 -0000	1.75
 +++ tmpfs_vnops.c	2 Jan 2011 00:35:25 -0000
 @@ -1069,10 +1069,11 @@ tmpfs_rmdir(void *v)
  	 * removed. */
  	if (node->tn_size > 0) {
  		error = ENOTEMPTY;
  		goto out;
  	}
 +	KASSERT(node->tn_links == 2);

  	/* This invariant holds only if we are not trying to remove "..".
  	 * We checked for that above so this is safe now. */
  	KASSERT(node->tn_spec.tn_dir.tn_parent == dnode);



 -- 
 David A. Holland
 dholland@netbsd.org

From: Jukka Ruohonen <jruohonen@iki.fi>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/44288: tmpfs_rmdir() panic
Date: Sun, 2 Jan 2011 21:54:46 +0200

 On Sun, Jan 02, 2011 at 12:40:04AM +0000, David Holland wrote:
 >  Can you slap this in and see if it goes off instead? Probably the
 >  problem is that the direectory is corrupt; asserting earlier rules out
 >  a gaffe somewhere in the rmdir logic itself.

 ...

 >  +	KASSERT(node->tn_links == 2);

 The above assertion indeed goes off.

 - Jukka.

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/44288: tmpfs_rmdir() panic
Date: Tue, 4 Jan 2011 06:49:23 +0000

 On Sun, Jan 02, 2011 at 07:55:01PM +0000, Jukka Ruohonen wrote:
  >  >  +	KASSERT(node->tn_links == 2);
  >  
  >  The above assertion indeed goes off.

 I'm not particularly surprised.

 Now I guess we get to figure out what's corrupting the link count...

 Maybe try changing the above assertion to

    if (node->tn_links != 2) {
 	printf("tmpfs_rmdir: tn_links is %u, should be 2\n", node->tn_links);
 	node->tn_links = 2;
    }

 and running the workload under ktrace to see what it's doing that
 causes the problem?

 (well, I suppose it's about 98% likely that it's rename)

 -- 
 David A. Holland
 dholland@netbsd.org

From: Taylor R Campbell <campbell@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/44288: tmpfs_rmdir() panic
Date: Mon, 28 Feb 2011 16:35:50 +0000

 This transcript doesn't look right.  `rename' is a simple wrapper for
 the rename(2) system call, with none of the bells & whistles of mv(1).
 /tmp is mounted on tmpfs.

 % cd /tmp
 % mkdir test
 % cd test
 % mkdir foo bar
 % ls -lid * .
 28570683 drwxr-xr-x  4 riastradh  wheel  40 Feb 28 15:56 .
 28570657 drwxr-xr-x  2 riastradh  wheel   0 Feb 28 15:56 bar
 28570662 drwxr-xr-x  2 riastradh  wheel   0 Feb 28 15:56 foo
 % rename foo bar
 % ls -lid * .
 28570683 drwxr-xr-x  4 riastradh  wheel  20 Feb 28 15:56 .
 28570662 drwxr-xr-x  2 riastradh  wheel   0 Feb 28 15:56 bar

 Note that the link count on . doesn't change.  Looks like tmpfs_rename
 needs to decrement the link count of tdnode if it renames a directory
 over a directory, since tmpfs_dir_detach leaves the caller with the
 responsibility of adjusting the directory's link count (see, e.g.,
 tmpfs_rmdir).  I'll try this in a bit with rump.  (For now, uh, I'll
 try to make sure nobody rmdirs /tmp/test on my laptop...oops.)

 --- tmpfs_vnops.c	2011-01-29 14:52:38.000000000 +0000
 +++ tmpfs_vnops.c	2011-02-28 16:34:52.000000000 +0000
 @@ -982,6 +982,12 @@
  		KASSERT(de2->td_node == tnode);
  		tmpfs_dir_detach(tdvp, de2);

 +		/* If we just detached a directory, decrement the
 +		 * parent's link count, since the to-node's .. will no
 +		 * longer point to it.  */
 +		if (tnode->tn_type == VDIR)
 +			tdnode->tn_links--;
 +
  		/* Free the directory entry we just deleted.  Note that the
  		 * node referred by it will not be removed until the vnode is
  		 * really reclaimed. */

From: Antti Kantee <pooka@cs.hut.fi>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/44288
Date: Mon, 28 Feb 2011 20:07:21 +0200

 tests/vfs/t_vnops has plenty of rename-tests, but of course it misses
 this one, or at least triggering the bug.

 I tested it, and you are correct.  Can you propose a patch to vfs/t_vnops
 so we can conveniently rule out this bug from all file systems?

 log:

 $ mkdir test
 $ cd test
 $ mkdir foo bar
 $ ~/a.out foo bar
 $ cd ..
 $ ls -l
 total 0
 drwxr-xr-x  3 root  wheel  0 Feb 28 19:00 test
 $ rmdir test

 And meanwhile, in the server window:

 pain-rustique:1:~> rump_server -lrumpvfs -lrumpfs_tmpfs -s unix:///tmp/tmpfs
 panic: kernel diagnostic assertion "node->tn_links == 0" failed: file "/usr/alls
 rc/src/sys/rump/fs/lib/libtmpfs/../../../../fs/tmpfs/tmpfs_vnops.c", line 1116
 rump kernel halting...
 halted
 Abort (core dumped)
 pain-rustique:2:~> gdb rump_server rump_server.core 
 [...]
 #0  0xbbac60d7 in _lwp_kill () from /usr/lib/libc.so.12
 #1  0xbbac608e in raise () from /usr/lib/libc.so.12
 #2  0xbbac5914 in abort () from /usr/lib/libc.so.12
 #3  0xbbb1610d in rumpuser_exit (rv=-1) at rumpuser.c:541
 #4  0xbbbb69fb in cpu_reboot (howto=4, bootstr=0x0) at rump.c:570
 #5  0xbbb80ce0 in panic (
     fmt=0xbbbbecec "kernel %sassertion \"%s\" failed: file \"%s\", line %d")
     at /sys/rump/librump/rumpkern/../../../kern/subr_prf.c:302
 #6  0xbbb64504 in kern_assert (t=0xbb9871bc "diagnostic ", 
     f=0xbb987170 "/usr/allsrc/src/sys/rump/fs/lib/libtmpfs/../../../../fs/tmpfs/tmpfs_vnops.c", l=1116, e=0xbb987764 "node->tn_links == 0")
     at /sys/rump/librump/rumpkern/../../../lib/libkern/kern_assert.c:50
 #7  0xbb982570 in tmpfs_rmdir (v=0xb8bff840)
     at /usr/allsrc/src/sys/rump/fs/lib/libtmpfs/../../../../fs/tmpfs/tmpfs_vnops.c:1116
 #8  0xbbb78202 in VOP_RMDIR (dvp=0xbb8504c8, vp=0xbb8502c4, cnp=0xb8bff8a0)
     at /sys/rump/librump/rumpkern/../../../kern/vnode_if.c:874
 #9  0xbb9b9b30 in sys_rmdir (l=0xb9ec7c80, uap=0xb8e02098, retval=0xb8bff97c)
     at /sys/rump/librump/rumpvfs/../../../kern/vfs_syscalls.c:3708
 #10 0xbbbb6f7a in sy_call (sy=0xbbbccb6c, l=0xb9ec7c80, uap=0xb8e02098, 
     rval=0xb8bff97c) at /sys/rump/librump/rumpkern/../../../sys/syscallvar.h:61
 #11 0xbbbb6f3a in rump_proxy_syscall (num=137, arg=0xb8e02098, 
     retval=0xb8bff97c) at rump.c:753
 #12 0xbbb10838 in rumpsyscall (sysnum=137, data=0xb8e02098, retval=0xb8bff97c)
     at rumpuser_sp.c:222
 #13 0xbbb1134c in serv_handlesyscall (spc=0xbbb17e38, rhdr=0xb8e03194, 
     data=0xb8e02098 ":÷żż", '˙' <repeats 112 times>, ".") at rumpuser_sp.c:598
 #14 0xbbb114b6 in serv_syscallbouncer (arg=0x0) at rumpuser_sp.c:647
 #15 0xbbb084c7 in pthread_setcancelstate () from /usr/lib/libpthread.so.0
 #16 0xbba39610 in swapcontext () from /usr/lib/libc.so.12
 (gdb) 

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/44288: tmpfs_rmdir() panic
Date: Mon, 28 Feb 2011 18:57:47 +0000

 I just tested the patch with rump.  Without the patch, the link count
 was wrong, and rmdiring test/bar and then test made rump_tmpfs crash
 on the same KASSERT as jruoho observed.  With the patch, the link
 count is right, and rmdiring test/bar and then test works dandily.

 I haven't run any other tmpfs tests with the patch, though.

From: "Antti Kantee" <pooka@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/44288 CVS commit: src/tests/fs/vfs
Date: Tue, 1 Mar 2011 14:21:47 +0000

 Module Name:	src
 Committed By:	pooka
 Date:		Tue Mar  1 14:21:46 UTC 2011

 Modified Files:
 	src/tests/fs/vfs: t_vnops.c

 Log Message:
 augment rename test case with the failure from PR kern/44288


 To generate a diff of this commit:
 cvs rdiff -u -r1.15 -r1.16 src/tests/fs/vfs/t_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Jukka Ruohonen <jruohonen@iki.fi>
To: gnats-bugs@NetBSD.org
Cc: Taylor R Campbell <campbell@mumble.net>,
	David Holland <dholland-bugs@NetBSD.org>,
	Antti Kantee <pooka@cs.hut.fi>
Subject: Re: kern/44288: tmpfs_rmdir() panic
Date: Sat, 19 Mar 2011 22:03:37 +0200

 On Mon, Feb 28, 2011 at 07:00:07PM +0000, Taylor R Campbell wrote:
 >  I just tested the patch with rump.  Without the patch, the link count
 >  was wrong, and rmdiring test/bar and then test made rump_tmpfs crash
 >  on the same KASSERT as jruoho observed.  With the patch, the link
 >  count is right, and rmdiring test/bar and then test works dandily.

 I am sorry to inform, but the assertion still goes off with this patch

  +              /* If we just detached a directory, decrement the
  +               * parent's link count, since the to-node's .. will no
  +               * longer point to it.  */
  +              if (tnode->tn_type == VDIR)
  +                      tdnode->tn_links--;
  +

 The frequency of it has also increased in real-world usage, given that it is
 now triggered by several other pkgsrc packages as well.

 - Jukka.

From: "Mindaugas Rasiukevicius" <rmind@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/44288 CVS commit: src/sys/fs/tmpfs
Date: Sun, 29 May 2011 22:29:08 +0000

 Module Name:	src
 Committed By:	rmind
 Date:		Sun May 29 22:29:07 UTC 2011

 Modified Files:
 	src/sys/fs/tmpfs: tmpfs.h tmpfs_subr.c tmpfs_vfsops.c tmpfs_vnops.c

 Log Message:
 - Rework and document inode reference counting.  Also document inode life
   cycle (destruction part).  Perform link counting in tmpfs_dir_attach()
   and tmpfs_dir_detach(), instead of alloc/free and arbitrary places.
   Fixes PR/44285, PR/44288, PR/44657 and likely PR/42484.

 - Fix the race between the lookup and inode destruction.  Fixes PR/43167
   and its duplicates PR/40088, PR/40757.

 - Improve tmpfs_rename() locking a little, fix kqueue event notifications
   and also fix PR/43617.  Add simplistic tmpfs_parentcheck_p(); to be
   expanded and used for further rename() locking fixes.

 - Cache directory entry "hint" in the tmpfs node, add tmpfs_dir_cached(),
   and thus avoid unnecessary lookup in tmpfs_remove() and tmpfs_rmdir().

 - Set correct _PC_FILESIZEBITS value in tmpfs_pathconf().  Fixes PR/43576.

 - Few minor fixes.


 To generate a diff of this commit:
 cvs rdiff -u -r1.43 -r1.44 src/sys/fs/tmpfs/tmpfs.h
 cvs rdiff -u -r1.70 -r1.71 src/sys/fs/tmpfs/tmpfs_subr.c
 cvs rdiff -u -r1.50 -r1.51 src/sys/fs/tmpfs/tmpfs_vfsops.c
 cvs rdiff -u -r1.84 -r1.85 src/sys/fs/tmpfs/tmpfs_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

Responsible-Changed-From-To: kern-bug-people->rmind
Responsible-Changed-By: rmind@NetBSD.org
Responsible-Changed-When: Sun, 29 May 2011 22:40:44 +0000
Responsible-Changed-Why:


State-Changed-From-To: open->closed
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Sun, 29 May 2011 22:40:44 +0000
State-Changed-Why:
Fixed, thanks.


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.