NetBSD Problem Report #45355
From apb@cequrux.com Sat Sep 10 16:26:19 2011
Return-Path: <apb@cequrux.com>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id EF94363BB48
for <gnats-bugs@gnats.NetBSD.org>; Sat, 10 Sep 2011 16:26:18 +0000 (UTC)
Message-Id: <20110910162150.8F40C2C5E1D3@apb-laptoy.apb.alt.za>
Date: Sat, 10 Sep 2011 16:21:50 +0000 (UTC)
From: apb@cequrux.com
To: gnats-bugs@gnats.NetBSD.org
Subject: Reader/writer lock error: rw_vector_enter: locking against myself
X-Send-Pr-Version: 3.95
>Number: 45355
>Category: kern
>Synopsis: Reader/writer lock error: rw_vector_enter: locking against myself
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Sep 10 16:30:00 +0000 2011
>Closed-Date: Sun Nov 13 18:05:13 +0000 2011
>Last-Modified: Sun Nov 13 18:05:13 +0000 2011
>Originator: Alan Barrett
>Release: NetBSD 5.99.55
>Organization:
Not much
>Environment:
System: NetBSD i386 5.99.55 (sources from 2011-09-02 14:00 UTC)
Architecture: i386
Machine: i386
>Description:
The system crashes frequently (more than once per day). I am usually
not able to get any debug information, because after a crash pressing
alt-control-F1 does not switch the display from graphics to text mode.
The most recent crash was while I was using a text console, so I was
able to get the following information. (This is transcribed from blurry
photos, and it's likely that there's confusion between the digits
0 and 8.)
Reader / writer lock error: rw_vector_enter: locking against myself
lock address : 0x00000000d483e0b0
current cpu : 0
current lwp : 0x00000000d8eac2a0
owner/count : 0x00000000d8eac2a0 flags = 0x0000000000000004
panic: lock error
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c0251254 cs 8 eflags 206 cr2 ce881000 ilevel 0
stopped in pid 15882.1 (vim) at netbsd:breakpoint+0x4: popl %ebp
db{0}> bt
breakpoint ...
panic ...
lockdebug_abort ...
rw_abort ...
rw_vector_enter ...
genfs_lock ...
layer_bypass ...
VOP_LOCK ...
vclean ...
getcleanvnode ...
getnewvnode ...
ffs_vget ...
ffs_valloc ...
ufs_makeinode ...
ufs_create ...
VOP_CREATE ...
vn_open ...
sys_open ...
syscall ...
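
The panic output above shows "current lwp" and "owner/count" holding the same address, which is what "locking against myself" refers to: the thread that already owns the lock tries to take it again. As a rough user-space illustration only (not kernel code), the toy model below reproduces that check. The names `LockError`, `OwnerCheckedLock`, `Vnode`, `vop_lock` and `vop_unlock` are hypothetical, and the assumption that a layered vnode forwards its lock operation to the lower vnode is inferred from the `VOP_LOCK -> layer_bypass -> genfs_lock` frames in the backtrace.

```python
import threading


class LockError(RuntimeError):
    """Hypothetical stand-in for the kernel's 'lock error' panic."""


class OwnerCheckedLock:
    """Toy exclusive lock that remembers which thread owns it."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._owner = None

    def enter(self):
        me = threading.get_ident()
        # Same idea as the check reported in the panic: owner == current thread.
        if self._owner == me:
            raise LockError("locking against myself")
        self._mutex.acquire()
        self._owner = me

    def exit(self):
        if self._owner != threading.get_ident():
            raise LockError("unlocking a lock I do not own")
        self._owner = None
        self._mutex.release()


class Vnode:
    """Toy vnode; a layered vnode (lower is set) forwards locking downwards."""

    def __init__(self, name, lower=None):
        self.name = name
        self.lower = lower
        self._lock = OwnerCheckedLock()

    def vop_lock(self):
        if self.lower is not None:
            # Assumption: the layer bypasses the operation to the lower vnode.
            self.lower.vop_lock()
        else:
            self._lock.enter()

    def vop_unlock(self):
        if self.lower is not None:
            self.lower.vop_unlock()
        else:
            self._lock.exit()


def demo():
    lower = Vnode("ffs")
    upper = Vnode("nullfs", lower=lower)
    lower.vop_lock()          # the caller already holds the lower vnode
    try:
        upper.vop_lock()      # cleaning the layered vnode relocks the lower one
    except LockError as err:
        return str(err)
    finally:
        lower.vop_unlock()
    return "no error"
```

Calling `demo()` returns the string "locking against myself", because the second lock request reaches the lower vnode's lock from the thread that already owns it.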
>How-To-Repeat:
The crash happened as I was attempting to save a file from the
"vim" editor (installed from pkgsrc). The file system is
ffs+wapbl on cgd. The file system also happens to be the
backing layer for a read-only nullfs mount, and it's possible that
a process was attempting to access the same file through the
read-only nullfs mount at the same time as the editor was
attempting to save the file to the writable backing layer.
>Fix:
>Release-Note:
>Audit-Trail:
From: "Juergen Hannken-Illjes" <hannken@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/45355 CVS commit: src/sys/kern
Date: Sun, 2 Oct 2011 13:00:08 +0000
Module Name: src
Committed By: hannken
Date: Sun Oct 2 13:00:07 UTC 2011
Modified Files:
src/sys/kern: vfs_vnode.c
Log Message:
The path getnewvnode()->getcleanvnode()->vclean()->VOP_LOCK() will panic
if the vnode we want to clean is a layered vnode and the caller already
locked its lower vnode.
Change getnewvnode() to always allocate a fresh vnode and add a helper
thread (vdrain) to keep the number of allocated vnodes within desiredvnodes.
Rename getcleanvnode() to cleanvnode() and let it take a vnode from the
lists, clean and free it.
Reviewed by: David Holland <dholland@netbsd.org>
Should fix:
PR #19110 (nullfs mounts over NFS cause lock manager problems)
PR #34102 (ffs panic in NetBSD 3.0_STABLE)
PR #45115 (lock error panic when build.sh*3 and daily script is running)
PR #45355 (Reader/writer lock error: rw_vector_enter: locking against myself)
To generate a diff of this commit:
cvs rdiff -u -r1.11 -r1.12 src/sys/kern/vfs_vnode.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
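
The approach described in the log message above (always allocate a fresh vnode, and let a helper thread keep the total within desiredvnodes) can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the code from vfs_vnode.c: `VnodeCache`, `release`, `shutdown` and the `cleaned` list are hypothetical, vnodes are plain dictionaries, and the model cleans while holding its list lock, which a real implementation would presumably avoid. The point it tries to show is that cleaning moves out of the allocating thread, so the thread that holds a lower vnode lock never has to lock a layered vnode in order to recycle it.

```python
import threading
from collections import deque


class VnodeCache:
    """Toy model: fresh allocation plus a background 'vdrain' thread."""

    def __init__(self, desiredvnodes):
        self.desiredvnodes = desiredvnodes
        self.numvnodes = 0
        self.free_list = deque()
        self.cleaned = []            # hypothetical record of cleaned vnodes
        self._cv = threading.Condition()
        self._stop = False
        self._drainer = threading.Thread(target=self._vdrain, daemon=True)
        self._drainer.start()

    def getnewvnode(self, name):
        # Always hand out a fresh vnode; never clean one in the caller's context.
        with self._cv:
            self.numvnodes += 1
            if self.numvnodes > self.desiredvnodes:
                self._cv.notify()    # ask the helper thread to trim
        return {"name": name}

    def release(self, vnode):
        # Hypothetical: an unused vnode goes onto the free list.
        with self._cv:
            self.free_list.append(vnode)
            self._cv.notify()

    def _cleanvnode(self):
        # Take a vnode from the list, clean it and free it (lock held by caller).
        vnode = self.free_list.popleft()
        self.numvnodes -= 1
        self.cleaned.append(vnode["name"])

    def _vdrain(self):
        with self._cv:
            while True:
                while self.numvnodes > self.desiredvnodes and self.free_list:
                    self._cleanvnode()
                if self._stop:
                    break
                self._cv.wait()

    def shutdown(self):
        with self._cv:
            self._stop = True
            self._cv.notify()
        self._drainer.join()


def demo():
    cache = VnodeCache(desiredvnodes=2)
    vnodes = [cache.getnewvnode(f"v{i}") for i in range(5)]
    for vnode in vnodes:
        cache.release(vnode)
    cache.shutdown()
    return cache.numvnodes, cache.cleaned
```

In this model `demo()` allocates five vnodes against a limit of two, releases them all, and the helper thread trims the three oldest entries from the free list, leaving the count at the limit.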
State-Changed-From-To: open->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sat, 05 Nov 2011 16:38:30 +0000
State-Changed-Why:
is it working better now?
From: Alan Barrett <apb@cequrux.com>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/45355 (Reader/writer lock error: rw_vector_enter: locking
against myself)
Date: Sun, 6 Nov 2011 20:23:57 +0200
On Sat, 05 Nov 2011, dholland@NetBSD.org wrote:
>is it working better now?
I think this is fixed. I am unable to trigger the panic using
a kernel from 2011-10-16 (with src/sys/kern/vfs_vnode.c revision 1.14).
--apb (Alan Barrett)
State-Changed-From-To: feedback->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 13 Nov 2011 18:05:13 +0000
State-Changed-Why:
Fixed, thanks.
>Unformatted:
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.