NetBSD Problem Report #43456

From njoly@lanfeust.sis.pasteur.fr  Fri Jun 11 10:16:50 2010
Return-Path: <njoly@lanfeust.sis.pasteur.fr>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id F3FA963B935
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 11 Jun 2010 10:16:49 +0000 (UTC)
Message-Id: <20100611101646.9153EDC9B9@lanfeust.sis.pasteur.fr>
Date: Fri, 11 Jun 2010 12:16:46 +0200 (CEST)
From: njoly@pasteur.fr
Reply-To: njoly@pasteur.fr
To: gnats-bugs@gnats.NetBSD.org
Subject: KASSERT from ptyfs null mount
X-Send-Pr-Version: 3.95

>Number:         43456
>Category:       kern
>Synopsis:       KASSERT from ptyfs null mount
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    hannken
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Jun 11 10:20:04 +0000 2010
>Closed-Date:    Fri Jan 14 11:51:20 +0000 2011
>Last-Modified:  Fri Jan 14 11:51:20 +0000 2011
>Originator:     Nicolas Joly
>Release:        NetBSD 5.99.30
>Organization:
Institut Pasteur
>Environment:
System: NetBSD lanfeust.sis.pasteur.fr 5.99.30 NetBSD 5.99.30 (LANFEUST) #5: Fri Jun 11 12:01:51 CEST 2010 njoly@lanfeust.sis.pasteur.fr:/local/src/NetBSD/obj.amd64/sys/arch/amd64/compile/LANFEUST amd64
Architecture: x86_64
Machine: amd64
>Description:
I got hit by a KASSERT from a ptyfs null mount in an i386 chroot on my amd64
workstation. It's highly reproducible; simply launch an xterm from the
chroot, then exit.

panic: kernel diagnostic assertion "sn->sn_opencnt == 0" failed: file "/local/src/NetBSD/src/sys/miscfs/specfs/spec_vnops.c", line 321
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff8022b7cd cs 8 rflags 246 cr2  fbfe811c cpl 0 rsp ffff800048ebf7c0
Stopped in pid 23990.1 (ksh) at netbsd:breakpoint+0x5:  leave

Here follows the corresponding backtrace:
#0  0xffffffff804cc8ad in cpu_reboot (howto=256, bootstr=<value optimized out>)
    at /local/src/NetBSD/src/sys/arch/amd64/amd64/machdep.c:675
#1  0xffffffff8024321c in db_sync_cmd (addr=<value optimized out>, 
    have_addr=<value optimized out>, count=0, modif=0x0)
    at /local/src/NetBSD/src/sys/ddb/db_command.c:1375
#2  0xffffffff80243995 in db_command (last_cmdp=0xffffffff80cc6440)
    at /local/src/NetBSD/src/sys/ddb/db_command.c:909
#3  0xffffffff80243bf4 in db_command_loop ()
    at /local/src/NetBSD/src/sys/ddb/db_command.c:567
#4  0xffffffff80248f74 in db_trap (type=<value optimized out>, 
    code=<value optimized out>) at /local/src/NetBSD/src/sys/ddb/db_trap.c:101
#5  0xffffffff80246663 in kdb_trap (type=1, code=0, regs=0xffff800049ced6d0)
    at /local/src/NetBSD/src/sys/arch/amd64/amd64/db_interface.c:214
#6  0xffffffff806cbf40 in trap (frame=0xffff800049ced6d0)
    at /local/src/NetBSD/src/sys/arch/amd64/amd64/trap.c:284
#7  0xffffffff80100fe1 in calltrap ()
#8  0xffffffff8022b7cd in breakpoint ()
#9  0xffffffff8068d9f2 in panic (
    fmt=0xffffffff80adb220 "kernel %sassertion \"%s\" failed: file \"%s\", line %d") at /local/src/NetBSD/src/sys/kern/subr_prf.c:299
#10 0xffffffff807d4915 in kern_assert (t=0x3f8 <Address 0x3f8 out of bounds>, 
    f=0x0, l=-2136142718, e=0x8 <Address 0x8 out of bounds>)
    at /local/src/NetBSD/src/sys/lib/libkern/kern_assert.c:50
#11 0xffffffff8067180f in spec_node_destroy (vp=0xffff80004a55f198)
    at /local/src/NetBSD/src/sys/miscfs/specfs/spec_vnops.c:321
#12 0xffffffff80773fa7 in vrelel (vp=0xffff80004a55f198, flags=0)
    at /local/src/NetBSD/src/sys/kern/vfs_subr.c:1578
#13 0xffffffff807752b6 in vrevoke (vp=<value optimized out>)
    at /local/src/NetBSD/src/sys/kern/vfs_subr.c:2106
#14 0xffffffff802c65ea in genfs_revoke (v=<value optimized out>)
    at /local/src/NetBSD/src/sys/miscfs/genfs/genfs_vnops.c:275
#15 0xffffffff8048d4fa in layer_bypass (v=<value optimized out>)
    at /local/src/NetBSD/src/sys/miscfs/genfs/layer_vnops.c:355
#16 0xffffffff8078b083 in VOP_REVOKE (vp=0xffff80004a569658, 
    flags=<value optimized out>)
    at /local/src/NetBSD/src/sys/kern/vnode_if.c:593
#17 0xffffffff80459aae in exit1 (l=0xffff800049aab800, 
    rv=<value optimized out>) at /local/src/NetBSD/src/sys/kern/kern_exit.c:391
#18 0xffffffff80459ce2 in sys_exit (l=0xffff800049aab800, 
    uap=0xffff800049cedba0, retval=<value optimized out>)
    at /local/src/NetBSD/src/sys/kern/kern_exit.c:183
#19 0xffffffff80521b68 in netbsd32_exit (l=0x0, uap=<value optimized out>, 
    retval=0x8)
    at /local/src/NetBSD/src/sys/compat/netbsd32/netbsd32_netbsd.c:182
#20 0xffffffff80524d53 in netbsd32_syscall (frame=0xffff800049cedc80)
    at /local/src/NetBSD/src/sys/sys/syscallvar.h:61
#21 0xffffffff8010085a in osyscall1 ()
#22 0x00000000fbd1a0b8 in ?? ()
#23 0x00000000ffffdc88 in ?? ()
#24 0x0000000000000000 in ?? ()

>How-To-Repeat:
Do a ptyfs null mount in a chroot, then launch and exit an xterm from it.
>Fix:

>Release-Note:

>Audit-Trail:
From: Antti Kantee <pooka@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/43456 CVS commit: src/tests/fs/ptyfs
Date: Fri, 11 Jun 2010 23:52:38 +0000

 Module Name:	src
 Committed By:	pooka
 Date:		Fri Jun 11 23:52:38 UTC 2010

 Added Files:
 	src/tests/fs/ptyfs: Atffile Makefile t_nullpts.c t_ptyfs.c

 Log Message:
 Add some ptyfs tests.

 Note: I'm not adding these to the build yet, since they depend on
 some other cleanup I might get done only after the weekend.
 Even so, t_nullpts serves as a simple example of how to repeat the
 crash described in PR kern/43456 (just remove "rump_sys_" from the
 calls and it should compile and you should get a host kernel panic
 instead of a coredump).


 To generate a diff of this commit:
 cvs rdiff -u -r0 -r1.1 src/tests/fs/ptyfs/Atffile src/tests/fs/ptyfs/Makefile \
     src/tests/fs/ptyfs/t_nullpts.c src/tests/fs/ptyfs/t_ptyfs.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/43456: KASSERT from ptyfs null mount
Date: Sun, 13 Jun 2010 14:50:10 +0200

 Coming through a null mount the layered node is active (v_usecount == 2)
 whereas the ptyfs node is inactive (v_usecount == 1).

 Therefore vclean() will not call spec_node_revoke() on the ptyfs node -> boom.

 -- 
 Juergen Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)
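
The analysis above can be illustrated with a small userland model. This is
only a sketch under stated assumptions: struct toy_vnode and the toy_*
functions are hypothetical names, not kernel code, and "closing the device"
is reduced to resetting a counter. It shows why a lower node with
v_usecount == 1 skips the close step and then trips the sn_opencnt check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a vnode; this is NOT the kernel's struct vnode. */
struct toy_vnode {
	int usecount;	/* models v_usecount */
	int opencnt;	/* models sn_opencnt of the device node */
};

/* The "active" test described in this PR: v_usecount > 1. */
static bool
toy_is_active(const struct toy_vnode *vp)
{
	assert(vp != NULL);
	return vp->usecount > 1;
}

/*
 * Models the cleaning step: the device is only closed (open count
 * reset) when the vnode is considered active.
 */
static void
toy_clean(struct toy_vnode *vp)
{
	if (toy_is_active(vp))
		vp->opencnt = 0;	/* stands in for spec_node_revoke() */
}

/* Models the condition asserted in spec_node_destroy(). */
static bool
toy_destroy_ok(const struct toy_vnode *vp)
{
	assert(vp != NULL);
	return vp->opencnt == 0;
}

/* Revoke handed straight down to the lower vnode, then destroy it. */
static bool
toy_revoke_lower(struct toy_vnode *lower)
{
	toy_clean(lower);
	return toy_destroy_ok(lower);
}
```

In this model, calling toy_revoke_lower() on a node with usecount 1 and
opencnt 1 returns false, which corresponds to the failed assertion; with
usecount 2 the open count is reset first and the check holds.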

From: Antti Kantee <pooka@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/43456 CVS commit: src/external/bsd/atf/dist
Date: Wed, 16 Jun 2010 15:17:37 +0000

 Module Name:	src
 Committed By:	pooka
 Date:		Wed Jun 16 15:17:37 UTC 2010

 Modified Files:
 	src/external/bsd/atf/dist/atf-c: tcr.c tcr.h
 	src/external/bsd/atf/dist/atf-c++: formats.cpp tests.cpp tests.hpp
 	src/external/bsd/atf/dist/atf-report: atf-report.cpp
 	src/external/bsd/atf/dist/atf-run: atf-run.cpp
 	src/external/bsd/atf/dist/tests/atf/atf-c: t_macros.c
 	src/external/bsd/atf/dist/tests/atf/atf-report: t_integration.sh

 Log Message:
 Introduce expected failures to atf.  They can be used to flag tests
 which are known to fail, e.g.:

         atf_tc_set_md_var(tc, "xfail", "PR kern/43456");

 Expected failures do not count towards the ultimate pass/fail result
 from the test run:

 pain-rustique:39:~/<2>src/tests/fs/ptyfs> atf-run t_nullpts | atf-report
 Tests root: /home/pooka/src/wholesrc2/src/tests/fs/ptyfs

 t_nullpts (1/1): 1 test cases
     nullrevoke: Expected failure: PR kern/43456

 Summary for 1 test programs:
     0 passed test cases.
     0 failed test cases.
     1 expected failures.
     0 skipped test cases.
 pain-rustique:40:~/<2>src/tests/fs/ptyfs> echo $?
 0

 However, an xfail test which passes will count as a failure, i.e.
 xfail inverts test case success/fail.  This way we can get a better
 sense from the ultimate verdict of the NetBSD atf run by seeing if
 there were any unexpected failures, i.e. new regressions.
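
The inversion described above can be sketched as a small truth table in C.
This is an illustration only: enum toy_result, toy_report() and toy_run_ok()
are hypothetical names chosen for this sketch and are not part of the atf
sources or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes; these are not atf's internal types. */
enum toy_result { TOY_PASSED, TOY_FAILED, TOY_EXPECTED_FAILURE };

/*
 * Map the raw outcome of a test body to the reported result.
 * raw_passed: did the test body succeed?
 * xfail:      was the test flagged as an expected failure?
 */
static enum toy_result
toy_report(bool raw_passed, bool xfail)
{
	if (!xfail)
		return raw_passed ? TOY_PASSED : TOY_FAILED;
	/* xfail inverts the verdict: an unexpected pass counts as a failure. */
	return raw_passed ? TOY_FAILED : TOY_EXPECTED_FAILURE;
}

/* The run as a whole is fine unless some case reported TOY_FAILED. */
static bool
toy_run_ok(const enum toy_result *results, int n)
{
	assert(results != NULL || n == 0);
	for (int i = 0; i < n; i++)
		if (results[i] == TOY_FAILED)
			return false;
	return true;
}
```

With this mapping, a run consisting of a single expected failure is still
considered successful, which matches the exit status of 0 in the transcript
above.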

 This feature will be present in the upcoming atf 0.10 release,
 possibly with finer grained control.

 patch reviewed by jmmv


 To generate a diff of this commit:
 cvs rdiff -u -r1.1.1.4 -r1.2 src/external/bsd/atf/dist/atf-c/tcr.c
 cvs rdiff -u -r1.1.1.3 -r1.2 src/external/bsd/atf/dist/atf-c/tcr.h
 cvs rdiff -u -r1.1.1.3 -r1.2 src/external/bsd/atf/dist/atf-c++/formats.cpp
 cvs rdiff -u -r1.1.1.4 -r1.2 src/external/bsd/atf/dist/atf-c++/tests.cpp \
     src/external/bsd/atf/dist/atf-c++/tests.hpp
 cvs rdiff -u -r1.1.1.1 -r1.2 \
     src/external/bsd/atf/dist/atf-report/atf-report.cpp
 cvs rdiff -u -r1.2 -r1.3 src/external/bsd/atf/dist/atf-run/atf-run.cpp
 cvs rdiff -u -r1.1.1.4 -r1.2 \
     src/external/bsd/atf/dist/tests/atf/atf-c/t_macros.c
 cvs rdiff -u -r1.1.1.2 -r1.2 \
     src/external/bsd/atf/dist/tests/atf/atf-report/t_integration.sh

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Juergen Hannken-Illjes" <hannken@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/43456 CVS commit: src
Date: Mon, 10 Jan 2011 11:11:04 +0000

 Module Name:	src
 Committed By:	hannken
 Date:		Mon Jan 10 11:11:04 UTC 2011

 Modified Files:
 	src/sys/miscfs/genfs: layer_extern.h layer_vnops.c
 	src/sys/miscfs/nullfs: null_vnops.c
 	src/sys/miscfs/overlay: overlay_vnops.c
 	src/sys/miscfs/umapfs: umap_vnops.c
 	src/tests/fs/ptyfs: t_nullpts.c

 Log Message:
 Add layer_revoke() that adjusts the lower vnode use count to be at least as
 high as the upper vnode count before passing down the VOP_REVOKE().

 This way vclean() check for active (vp->v_usecount > 1) vnodes gets it right.

 Should fix PR kern/43456.
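
A rough userland sketch of the idea in this log message follows. It is not
the committed layer_revoke(); struct sketch_vnode and sketch_layer_revoke()
are hypothetical, the close step is reduced to resetting a counter, and
dropping the temporary references afterwards is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a vnode; NOT the kernel's struct vnode. */
struct sketch_vnode {
	int usecount;	/* models v_usecount */
	int opencnt;	/* models sn_opencnt */
};

/*
 * Raise the lower use count to at least the upper one, pass the revoke
 * down, then drop the temporary references. Returns true if the lower
 * node would satisfy the sn_opencnt == 0 check afterwards.
 */
static bool
sketch_layer_revoke(const struct sketch_vnode *upper,
    struct sketch_vnode *lower)
{
	int extra = 0;

	assert(upper != NULL && lower != NULL);

	/* Adjust the lower count until it is at least the upper count. */
	while (lower->usecount < upper->usecount) {
		lower->usecount++;
		extra++;
	}

	/* The "active" check (v_usecount > 1) now sees the adjusted count. */
	if (lower->usecount > 1)
		lower->opencnt = 0;	/* stands in for closing the device */

	/* Drop the temporary references again (assumption of this sketch). */
	lower->usecount -= extra;

	return lower->opencnt == 0;
}
```

In this model the result depends on the upper count: with an upper count of
2 the lower node is closed, while with an upper count of 1 nothing changes.
The thread does not say whether that is what happened in the reporter's
setup.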


 To generate a diff of this commit:
 cvs rdiff -u -r1.26 -r1.27 src/sys/miscfs/genfs/layer_extern.h
 cvs rdiff -u -r1.44 -r1.45 src/sys/miscfs/genfs/layer_vnops.c
 cvs rdiff -u -r1.36 -r1.37 src/sys/miscfs/nullfs/null_vnops.c
 cvs rdiff -u -r1.17 -r1.18 src/sys/miscfs/overlay/overlay_vnops.c
 cvs rdiff -u -r1.50 -r1.51 src/sys/miscfs/umapfs/umap_vnops.c
 cvs rdiff -u -r1.4 -r1.5 src/tests/fs/ptyfs/t_nullpts.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Nicolas Joly <njoly@pasteur.fr>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, njoly@pasteur.fr, hannken@netbsd.org
Subject: Re: PR/43456 CVS commit: src
Date: Mon, 10 Jan 2011 20:06:22 +0100

 On Mon, Jan 10, 2011 at 11:15:07AM +0000, Juergen Hannken-Illjes wrote:
 > The following reply was made to PR kern/43456; it has been noted by GNATS.
 > 
 > From: "Juergen Hannken-Illjes" <hannken@netbsd.org>
 > To: gnats-bugs@gnats.NetBSD.org
 > Cc: 
 > Subject: PR/43456 CVS commit: src
 > Date: Mon, 10 Jan 2011 11:11:04 +0000
 > 
 >  Module Name:	src
 >  Committed By:	hannken
 >  Date:		Mon Jan 10 11:11:04 UTC 2011
 >  
 >  Modified Files:
 >  	src/sys/miscfs/genfs: layer_extern.h layer_vnops.c
 >  	src/sys/miscfs/nullfs: null_vnops.c
 >  	src/sys/miscfs/overlay: overlay_vnops.c
 >  	src/sys/miscfs/umapfs: umap_vnops.c
 >  	src/tests/fs/ptyfs: t_nullpts.c
 >  
 >  Log Message:
 >  Add layer_revoke() that adjusts the lower vnode use count to be at least as
 >  high as the upper vnode count before passing down the VOP_REVOKE().
 >  
 >  This way vclean() check for active (vp->v_usecount > 1) vnodes gets it right.
 >  
 >  Should fix PR kern/43456.

 Noticing that the testcase was fixed, I checked my original
 problem again. Unfortunately, it still fails with the same KASSERT panic
 and an unchanged backtrace.

 Thanks.

 -- 
 Nicolas Joly

 Biological Software and Databanks.
 Institut Pasteur, Paris.

From: "Juergen Hannken-Illjes" <hannken@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/43456 CVS commit: src/sys/miscfs/genfs
Date: Thu, 13 Jan 2011 10:28:38 +0000

 Module Name:	src
 Committed By:	hannken
 Date:		Thu Jan 13 10:28:38 UTC 2011

 Modified Files:
 	src/sys/miscfs/genfs: layer_vnops.c

 Log Message:
 Layer_revoke(): change previous to always take an extra reference on the
 lower vnode before passing down the VOP_REVOKE().  This way VOP_REVOKE()
 on a layered file system always inactivates and closes the lower vnode.

 Should finally fix PR kern/43456.
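
The revised approach can be sketched in the same toy style. Again, this is
not the committed code: struct demo_vnode and demo_layer_revoke() are
hypothetical, and releasing the extra reference at the end is an assumption
of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a vnode; NOT the kernel's struct vnode. */
struct demo_vnode {
	int usecount;	/* models v_usecount */
	int opencnt;	/* models sn_opencnt */
};

/*
 * Always take one extra reference on the lower node before the revoke,
 * so the "active" check (v_usecount > 1) holds regardless of the upper
 * count. Returns true if sn_opencnt == 0 would hold afterwards.
 */
static bool
demo_layer_revoke(struct demo_vnode *lower)
{
	assert(lower != NULL && lower->usecount >= 1);

	lower->usecount++;		/* extra reference: count is now >= 2 */

	if (lower->usecount > 1)	/* always true at this point */
		lower->opencnt = 0;	/* stands in for closing the device */

	lower->usecount--;		/* release the extra reference */

	return lower->opencnt == 0;
}
```

Because the extra reference is unconditional, the close step in this model
runs even for a lower node whose use count starts at 1.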


 To generate a diff of this commit:
 cvs rdiff -u -r1.45 -r1.46 src/sys/miscfs/genfs/layer_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Nicolas Joly <njoly@pasteur.fr>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, njoly@pasteur.fr
Subject: Re: PR/43456 CVS commit: src/sys/miscfs/genfs
Date: Thu, 13 Jan 2011 21:44:42 +0100

 On Thu, Jan 13, 2011 at 10:30:05AM +0000, Juergen Hannken-Illjes wrote:
 > The following reply was made to PR kern/43456; it has been noted by GNATS.
 > 
 > From: "Juergen Hannken-Illjes" <hannken@netbsd.org>
 > To: gnats-bugs@gnats.NetBSD.org
 > Cc: 
 > Subject: PR/43456 CVS commit: src/sys/miscfs/genfs
 > Date: Thu, 13 Jan 2011 10:28:38 +0000
 > 
 >  Module Name:	src
 >  Committed By:	hannken
 >  Date:		Thu Jan 13 10:28:38 UTC 2011
 >  
 >  Modified Files:
 >  	src/sys/miscfs/genfs: layer_vnops.c
 >  
 >  Log Message:
 >  Layer_revoke(): change previous to always take an extra reference on the
 >  lower vnode before passing down the VOP_REVOKE().  This way VOP_REVOKE()
 >  on a layered file system always inactivates and closes the lower vnode.
 >  
 >  Should finally fix PR kern/43456.

 I tested it again, and the problem is gone.
 Thanks a lot.

 -- 
 Nicolas Joly

 Biological Software and Databanks.
 Institut Pasteur, Paris.

Responsible-Changed-From-To: kern-bug-people->hannken
Responsible-Changed-By: hannken@NetBSD.org
Responsible-Changed-When: Fri, 14 Jan 2011 11:51:20 +0000
Responsible-Changed-Why:
Fixed it.


State-Changed-From-To: open->closed
State-Changed-By: hannken@NetBSD.org
State-Changed-When: Fri, 14 Jan 2011 11:51:20 +0000
State-Changed-Why:
Fixed in tree.


>Unformatted:
