NetBSD Problem Report #42377
From htodd@kerry.i8u.org Wed Nov 25 14:55:57 2009
Return-Path: <htodd@kerry.i8u.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id 8673C63B8B4
for <gnats-bugs@gnats.NetBSD.org>; Wed, 25 Nov 2009 14:55:57 +0000 (UTC)
Message-Id: <200911251455.nAPEturq014526@kerry.i8u.org>
Date: Wed, 25 Nov 2009 06:55:56 -0800 (PST)
From: htodd@twofifty.com
Reply-To: htodd@twofifty.com
To: gnats-bugs@gnats.NetBSD.org
Subject: netbsd-5 i386 system hangs on file access after changes of around 11/13
X-Send-Pr-Version: 3.95
>Number: 42377
>Category: kern
>Synopsis: netbsd-5 i386 system hangs on file access after changes of around 11/13
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: bouyer
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Nov 25 15:00:01 +0000 2009
>Closed-Date: Sat Nov 28 19:11:29 +0000 2009
>Last-Modified: Sat Nov 28 19:11:29 +0000 2009
>Originator: H. Todd Fujinaka
>Release: NetBSD 5.0_STABLE
>Organization:
None
>Environment:
System: NetBSD kerry.i8u.org 5.0_STABLE NetBSD 5.0_STABLE (KERRY) #1191: Mon Nov 23 09:44:39 PST 2009 htodd@kerry.i8u.org:/home/obj/sys/arch/i386/compile.i386/KERRY i386
Architecture: i386
Machine: i386
>Description:
My system now hangs during builds of "world" and also on reboot. Similar problems are experienced in amd64-current. On reboot the system seems to "hang" on unmounting disks. Debugger information (screenshots of the debugger) is located at http://www.i8u.org/~htodd/logpix.tar.bz2.
wd0 at atabus2 drive 0: <WDC WD2500KS-00MJB0>
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 232 GB, 484521 cyl, 16 head, 63 sec, 512 bytes/sect x 488397168 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd0(piixide1:0:0): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using DMA)
# /dev/rwd0d:
type: unknown
disk: NetBSD
label:
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 484521
total sectors: 488397168
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0 # microseconds
track-to-track seek: 0 # microseconds
drivedata: 0
16 partitions:
# size offset fstype [fsize bsize cpg/sgs]
a: 1024128 63 4.2BSD 1024 8192 0 # (Cyl. 0*- 1016*)
b: 8192016 1024191 swap # (Cyl. 1016*- 9143*)
c: 488397105 63 unused 0 0 # (Cyl. 0*- 484520)
d: 488397168 0 unused 0 0 # (Cyl. 0 - 484520)
e: 12288528 9216207 4.2BSD 2048 16384 0 # (Cyl. 9143*- 21334*)
f: 6144768 21504735 4.2BSD 2048 16384 0 # (Cyl. 21334*- 27430*)
g: 460747665 27649503 4.2BSD 2048 16384 0 # (Cyl. 27430*- 484520)
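As a sanity check on the disklabel above, the following minimal sketch confirms that the reported geometry matches the sector count and that partitions a, b, e, f and g are laid out back to back with no gaps or overlaps. The helper names (`check_layout`, `size_gib`) are hypothetical and not part of any NetBSD tool; the numbers are copied from the dmesg lines and the label in this report.

```python
# Values copied from the wd0 dmesg lines and the disklabel in this report.
SECTOR_BYTES = 512
TOTAL_SECTORS = 488397168
CYLINDERS, HEADS, SECTORS_PER_TRACK = 484521, 16, 63

# (name, size, offset) in sectors. Partitions c and d are omitted because
# they are marked "unused" and span the disk rather than hold data.
PARTITIONS = [
    ("a", 1024128, 63),
    ("b", 8192016, 1024191),
    ("e", 12288528, 9216207),
    ("f", 6144768, 21504735),
    ("g", 460747665, 27649503),
]


def check_layout(parts, total_sectors):
    """Hypothetical helper: return (name, problem) pairs for layout errors."""
    problems = []
    ordered = sorted(parts, key=lambda p: p[2])  # sort by starting offset
    for (_, size, offset), (next_name, _, next_offset) in zip(ordered, ordered[1:]):
        end = offset + size  # first sector after this partition
        if end < next_offset:
            problems.append((next_name, "gap before partition"))
        elif end > next_offset:
            problems.append((next_name, "overlaps previous partition"))
    last_name, last_size, last_offset = ordered[-1]
    if last_offset + last_size > total_sectors:
        problems.append((last_name, "extends past end of disk"))
    return problems


def size_gib(sectors):
    """Convert a sector count to GiB (2**30 bytes)."""
    return sectors * SECTOR_BYTES / 2**30


if __name__ == "__main__":
    print("geometry matches:", CYLINDERS * HEADS * SECTORS_PER_TRACK == TOTAL_SECTORS)
    print("layout problems:", check_layout(PARTITIONS, TOTAL_SECTORS))
    print("disk size in GiB:", int(size_gib(TOTAL_SECTORS)))
```

For this label the problem list comes back empty, so the disk layout itself does not look like a contributor to the hang.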
>How-To-Repeat:
Start a build, wait for it to hang, then try to reboot.
>Fix:
I'll start reverting things.
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->bouyer
Responsible-Changed-By: bouyer@NetBSD.org
Responsible-Changed-When: Fri, 27 Nov 2009 19:10:15 +0000
Responsible-Changed-Why:
I've reproduced it and have a patch
State-Changed-From-To: open->feedback
State-Changed-By: bouyer@NetBSD.org
State-Changed-When: Fri, 27 Nov 2009 19:10:15 +0000
State-Changed-Why:
Hi,
could you please try the patch I posted in
http://mail-index.netbsd.org/tech-kern/2009/11/27/msg006546.html
From: Hisashi T Fujinaka <htodd@twofifty.com>
To: gnats-bugs@NetBSD.org
Cc: bouyer@NetBSD.org, kern-bug-people@NetBSD.org, netbsd-bugs@NetBSD.org,
gnats-admin@NetBSD.org, bouyer@NetBSD.org
Subject: Re: kern/42377 (netbsd-5 i386 system hangs on file access after
changes of around 11/13)
Date: Fri, 27 Nov 2009 23:53:31 -0800 (PST)
The patch allows my build to finish, fixing my main complaint.
From: Manuel Bouyer <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/42377 CVS commit: src/sys/kern
Date: Sat, 28 Nov 2009 10:10:18 +0000
Module Name: src
Committed By: bouyer
Date: Sat Nov 28 10:10:18 UTC 2009
Modified Files:
src/sys/kern: vfs_subr.c
Log Message:
Previous did cause a deadlock with layered FS: the vrele thread
can sleep on the vnode lock, while vget is sleeping on the
VI_INACTNOW flag (or the vget caller is looping on vget returning failure
because of the VI_INACTNOW flag). With layered FSes, the upper and lower
vnodes share the same lock, so the vget() caller above can be already
holding the vnode lock.
Fix by dropping VI_INACTNOW before sleeping on the vnode lock in
vrelel(), and check the ref count again once we have the lock. If the
vnode has more than one reference, don't VOP_INACTIVE it.
Fix PR kern/42318 and PR kern/42377
patch tested by Hisashi T Fujinaka, Joachim König, Stephen Borrill and
Matthias Scheler.
To generate a diff of this commit:
cvs rdiff -u -r1.391 -r1.392 src/sys/kern/vfs_subr.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
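The ordering described in the log message above can be illustrated with a small single-threaded toy model. This is only a sketch under stated assumptions, not the code in vfs_subr.c: `ToyVnode`, `toy_vrelel` and the `while_waiting_for_lock` callback are hypothetical names, the `inactnow` field merely stands in for VI_INACTNOW, and the real locking, interlock handling and later flag handling are not modelled. It shows only the two points the message names: the flag is dropped before blocking on the vnode lock, and the reference count is checked again once the lock is held.

```python
from dataclasses import dataclass


@dataclass
class ToyVnode:
    """Hypothetical stand-in for a vnode; not the kernel structure."""
    usecount: int = 1
    inactnow: bool = False     # stands in for the VI_INACTNOW flag
    lock_held: bool = False    # stands in for the vnode lock
    inactivated: bool = False  # set when the VOP_INACTIVE stand-in runs


def toy_vrelel(vp, while_waiting_for_lock=None):
    """Toy model of the release path; returns True if the vnode was inactivated.

    while_waiting_for_lock is a hypothetical hook that simulates what another
    thread (for example a vget() caller) does while we sleep on the lock.
    """
    # Step 1: drop the flag before sleeping on the vnode lock, so a vget()
    # caller is neither blocked on the flag nor looping on failure.
    vp.inactnow = False

    # Step 2: simulate the sleep; another thread may gain a reference here.
    if while_waiting_for_lock is not None:
        while_waiting_for_lock(vp)
    vp.lock_held = True

    try:
        # Step 3: check the reference count again now that we hold the lock.
        if vp.usecount > 1:
            vp.usecount -= 1  # someone else still uses the vnode
            return False
        vp.inactivated = True  # stand-in for calling VOP_INACTIVE
        vp.usecount -= 1
        return True
    finally:
        vp.lock_held = False


if __name__ == "__main__":
    def grab_reference(vp):
        # The flag must already be clear while the releaser sleeps.
        assert not vp.inactnow
        vp.usecount += 1

    raced = ToyVnode(usecount=1, inactnow=True)
    print("inactivated after race:", toy_vrelel(raced, grab_reference))
```

In the toy model, a vnode that gains a reference during the simulated sleep is left alone, which corresponds to the "more than one reference" case in the log message.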
From: Stephen Borrill <sborrill@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/42377 CVS commit: [netbsd-5] src/sys/kern
Date: Sat, 28 Nov 2009 18:59:12 +0000
Module Name: src
Committed By: sborrill
Date: Sat Nov 28 18:59:11 UTC 2009
Modified Files:
src/sys/kern [netbsd-5]: vfs_subr.c
Log Message:
Pull up the following revision(s) (requested by bouyer in ticket #1171):
sys/kern/vfs_subr.c: revision 1.392
Previous caused a deadlock with layered FS: the vrele thread can sleep on
the vnode lock, while vget is sleeping on the VI_INACTNOW flag (or the vget
caller is looping on vget returning failure because of the VI_INACTNOW
flag). With layered FSes, the upper and lower vnodes share the same lock, so
the vget() caller above can be already holding the vnode lock.
Fix by dropping VI_INACTNOW before sleeping on the vnode lock in
vrelel(), and check the ref count again once we have the lock. If the
vnode has more than one reference, don't VOP_INACTIVE it.
Fix PR kern/42318 and PR kern/42377
To generate a diff of this commit:
cvs rdiff -u -r1.357.4.7 -r1.357.4.8 src/sys/kern/vfs_subr.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: feedback->closed
State-Changed-By: bouyer@NetBSD.org
State-Changed-When: Sat, 28 Nov 2009 19:11:29 +0000
State-Changed-Why:
Patch committed and pulled up.
>Unformatted:
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.