NetBSD Problem Report #42318

From louis@maat.zabrico.com  Sat Nov 14 21:22:55 2009
Return-Path: <louis@maat.zabrico.com>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 845D163B844
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 14 Nov 2009 21:22:55 +0000 (UTC)
Message-Id: <200911142122.nAELMqql015136@maat.zabrico.com>
Date: Sat, 14 Nov 2009 16:22:52 -0500 (EST)
From: louis@zabrico.com
Reply-To: louis@zabrico.com
To: gnats-bugs@gnats.NetBSD.org
Subject: chroot or pkg_comp causes a hang on netbsd-5
X-Send-Pr-Version: 3.95

>Number:         42318
>Category:       kern
>Synopsis:       chroot or pkg_comp causes a hang on netbsd-5
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bouyer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Nov 14 21:25:00 +0000 2009
>Closed-Date:    Sat Nov 28 19:10:51 +0000 2009
>Last-Modified:  Sat Nov 28 19:15:03 +0000 2009
>Originator:     Louis Guillaume
>Release:        NetBSD 5.0_STABLE - sources from Nov. 11, 2009
>Organization:
>Environment:
System: NetBSD maat.zabrico.com 5.0_STABLE NetBSD 5.0_STABLE (GENERIC) #9: Thu Nov 12 22:02:50 EST 2009 louis@maat.zabrico.com:/usr/obj/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
	After upgrading my NetBSD 5.0_STABLE i386 system, I attempted to update a
	pkg_comp chroot environment to upgrade my packages. I unpacked the latest
	kernel and the base, comp, etc and text sets from the recent build
	(build.sh release). Then I ran postinstall to clean up, and everything
	seemed to be working fine.

	I entered the chroot environment with "sudo pkg_comp chroot" and then
	began to run pkg_rolling-replace as usual. Now the process hung while doing
	the "pkg_chk -uq" to figure out what was out-of-date. From a different
	session, I was able to kill the pkg_comp process, figuring this was all a
	fluke of some kind.

	Then I chrooted again. This time, upon simply executing "pkg_chk -uq" the
	whole system hung! I was not able to get an ssh session, and the console
	took the user part of the login but froze before prompting for password.

	I Ctrl-Alt-Esc'd to get to the debugger and it said...

	fatal breakpoint trap in supervisor mode
	/netbsd: trap type 1 code 0 eip c05788dc cs 8 eflags 202 cr2 cd4e2000 ilevel 6
	syslogd: restart

	After the "restart" it tried to sync disks, printed a bunch of `2's and
	hung. I had to cold reboot it.

	This was repeatable each time I attempted to use pkg_comp for just about
	anything other than going into the chroot itself.

	I concluded that something was wrong with pkg_comp and re-created my
	environment, starting fresh with "pkg_comp makeroot". Things seem to be
	working quite well.

	But I just found a hang when unmounting the pkg_comp filesystems (which
	are null-mounted). Interrupt will not kill the process and it's not
	responsive to a regular kill. It responded to a CTRL-Z, (susp), but if
	I try to "kill %1" I get "/bin/ksh: kill: %1: No such process".
	But that may be just because of using "sudo", I don't know.

	Root was able to kill the chroot process with no issue, but the "umount"
	command is still running anyway, unkillable even with `9'.

	The system hasn't hung again, but this is the kind of thing that was
	happening right before it hung.

>How-To-Repeat:
	Not sure, but try to run pkg_comp (or perhaps other chroot operations)
	with netbsd-5 from at least Nov. 11th.

>Fix:

Unknown
>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->bouyer
Responsible-Changed-By: bouyer@NetBSD.org
Responsible-Changed-When: Sat, 28 Nov 2009 00:09:21 +0000
Responsible-Changed-Why:
Probably the same nullfs issue as kern/42377


From: Manuel Bouyer <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/42318 CVS commit: src/sys/kern
Date: Sat, 28 Nov 2009 10:10:18 +0000

 Module Name:	src
 Committed By:	bouyer
 Date:		Sat Nov 28 10:10:18 UTC 2009

 Modified Files:
 	src/sys/kern: vfs_subr.c

 Log Message:
 Previous did cause a deadlock with layered FS: the vrele thread
 can sleep on the vnode lock, while vget is sleeping on the
 VI_INACTNOW flag (or the vget caller is looping on vget returning failure
 because of the VI_INACTNOW flag). With layered FSes, the upper and lower
 vnodes share the same lock, so the vget() caller above can be already
 holding the vnode lock.

 Fix by dropping VI_INACTNOW before sleeping on the vnode lock in
 vrelel(), and check the ref count again once we have the lock. If the
 vnode has more than one reference, don't VOP_INACTIVE it.
 Fix PR kern/42318 and PR kern/42377
 patch tested by Hisashi T Fujinaka, Joachim König, Stephen Borrill and
 Matthias Scheler.
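
The lock ordering described in the log message can be illustrated with a small user-space model. This is only a sketch, not the code from vfs_subr.c: the Vnode class, its field names, the helper run() and the 0.5 second timeout are all assumptions made for the demonstration. The timeout stands in for the hang, since the real vget() would simply keep waiting. One thread plays the vrele thread, the main thread plays a vget() caller that already holds the vnode lock shared by the upper and lower layers.

```python
import threading


class Vnode:
    """Toy stand-in for a vnode; not the real NetBSD structure."""

    def __init__(self):
        self.interlock = threading.Condition()  # guards the fields below
        self.vlock = threading.Lock()           # vnode lock shared by both layers
        self.usecount = 1                       # the reference being released
        self.inactnow = False                   # models VI_INACTNOW
        self.inactivated = False                # set when "VOP_INACTIVE" runs


def vget(vp, timeout):
    """Take a reference, waiting while VI_INACTNOW is set.

    The real kernel would wait forever; the timeout only lets the demo
    report the hang instead of reproducing it.
    """
    with vp.interlock:
        if not vp.interlock.wait_for(lambda: not vp.inactnow, timeout):
            return False
        vp.usecount += 1
        return True


def vrelel(vp, fixed, reached_lock):
    """Release a reference from the (modelled) vrele thread."""
    with vp.interlock:
        vp.inactnow = True
        if fixed:
            # The fix: drop the flag before sleeping on the vnode lock.
            vp.inactnow = False
            vp.interlock.notify_all()
    reached_lock.set()
    with vp.vlock:                              # may sleep here
        with vp.interlock:
            if fixed and vp.usecount > 1:
                # Someone took a reference while we slept: just drop ours.
                vp.usecount -= 1
                return
            vp.inactivated = True               # stands in for VOP_INACTIVE
            vp.inactnow = False
            vp.usecount -= 1
            vp.interlock.notify_all()


def run(fixed):
    """Return (vget succeeded, vnode was inactivated) for one scenario."""
    vp = Vnode()
    reached_lock = threading.Event()
    vp.vlock.acquire()                          # vget() caller already holds the lock
    releaser = threading.Thread(target=vrelel, args=(vp, fixed, reached_lock))
    releaser.start()
    reached_lock.wait()
    got_ref = vget(vp, timeout=0.5)
    vp.vlock.release()
    releaser.join()
    return got_ref, vp.inactivated


if __name__ == "__main__":
    print("before fix:", run(fixed=False))
    print("after fix: ", run(fixed=True))
```

With fixed=False the modelled vget() gives up after the timeout, which corresponds to the two threads waiting on each other. With fixed=True it obtains its reference at once, and the release path then sees more than one reference and skips the inactivation step.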


 To generate a diff of this commit:
 cvs rdiff -u -r1.391 -r1.392 src/sys/kern/vfs_subr.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Louis Guillaume <louis@zabrico.com>
To: gnats-bugs@NetBSD.org
Cc: bouyer@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: PR/42318 CVS commit: src/sys/kern
Date: Sat, 28 Nov 2009 12:02:28 -0500

 Manuel Bouyer wrote:
 > The following reply was made to PR kern/42318; it has been noted by GNATS.
 > 
 > From: Manuel Bouyer <bouyer@netbsd.org>
 > To: gnats-bugs@gnats.NetBSD.org
 > Cc: 
 > Subject: PR/42318 CVS commit: src/sys/kern
 > Date: Sat, 28 Nov 2009 10:10:18 +0000
 > 
 >  Module Name:	src
 >  Committed By:	bouyer
 >  Date:		Sat Nov 28 10:10:18 UTC 2009
 >  
 >  Modified Files:
 >  	src/sys/kern: vfs_subr.c
 >  

 >  To generate a diff of this commit:
 >  cvs rdiff -u -r1.391 -r1.392 src/sys/kern/vfs_subr.c
 >  

 Hi,

 Can we please have this pulled up to the netbsd-5 branch? Not sure if 
 there was more work done here; the revision I have in my copy is
 1.357.4.7. Are there changes in other files or can I just grab v. 1.391?

 Thanks!

 Louis

From: Stephen Borrill <sborrill@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/42318 CVS commit: [netbsd-5] src/sys/kern
Date: Sat, 28 Nov 2009 18:59:12 +0000

 Module Name:	src
 Committed By:	sborrill
 Date:		Sat Nov 28 18:59:11 UTC 2009

 Modified Files:
 	src/sys/kern [netbsd-5]: vfs_subr.c

 Log Message:
 Pull up the following revision(s) (requested by bouyer in ticket #1171):
 	sys/kern/vfs_subr.c:	revision 1.392

 Previous caused a deadlock with layered FS: the vrele thread can sleep on
 the vnode lock, while vget is sleeping on the VI_INACTNOW flag (or the vget
 caller is looping on vget returning failure because of the VI_INACTNOW
 flag). With layered FSes, the upper and lower vnodes share the same lock, so
 the vget() caller above can be already holding the vnode lock.

 Fix by dropping VI_INACTNOW before sleeping on the vnode lock in
 vrelel(), and check the ref count again once we have the lock. If the
 vnode has more than one reference, don't VOP_INACTIVE it.
 Fix PR kern/42318 and PR kern/42377


 To generate a diff of this commit:
 cvs rdiff -u -r1.357.4.7 -r1.357.4.8 src/sys/kern/vfs_subr.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: bouyer@NetBSD.org
State-Changed-When: Sat, 28 Nov 2009 19:10:51 +0000
State-Changed-Why:
Patch committed and pulled up to netbsd-5


From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Louis Guillaume <louis@zabrico.com>
Cc: gnats-bugs@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: PR/42318 CVS commit: src/sys/kern
Date: Sat, 28 Nov 2009 20:10:09 +0100

 On Sat, Nov 28, 2009 at 12:02:28PM -0500, Louis Guillaume wrote:
 > Manuel Bouyer wrote:
 > >The following reply was made to PR kern/42318; it has been noted by GNATS.
 > >
 > >From: Manuel Bouyer <bouyer@netbsd.org>
 > >To: gnats-bugs@gnats.NetBSD.org
 > >Cc: 
 > >Subject: PR/42318 CVS commit: src/sys/kern
 > >Date: Sat, 28 Nov 2009 10:10:18 +0000
 > >
 > > Module Name:	src
 > > Committed By:	bouyer
 > > Date:		Sat Nov 28 10:10:18 UTC 2009
 > > 
 > > Modified Files:
 > > 	src/sys/kern: vfs_subr.c
 > > 
 > 
 > > To generate a diff of this commit:
 > > cvs rdiff -u -r1.391 -r1.392 src/sys/kern/vfs_subr.c
 > > 
 > 
 > Hi,
 > 
 > Can we please have this pulled up to the netbsd-5 branch? Not sure if 
 > there was more work done here; the revision I have in my copy is
 > 1.357.4.7. Are there changes in other files or can I just grab v. 1.391?

 The pullup has just been done

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 years of experience will always make the difference
 --

>Unformatted:
