NetBSD Problem Report #47514
From tsugutomo.enami@jp.sony.com Wed Jan 30 02:57:05 2013
Return-Path: <tsugutomo.enami@jp.sony.com>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
by www.NetBSD.org (Postfix) with ESMTP id B7EEC63EC52
for <gnats-bugs@gnats.NetBSD.org>; Wed, 30 Jan 2013 02:57:05 +0000 (UTC)
Message-Id: <tkrr4l3ml8j.fsf@sigxcpu.sm.sony.co.jp>
Date: Wed, 30 Jan 2013 11:57:00 +0900
From: tsugutomo.enami@jp.sony.com
To: gnats-bugs@gnats.NetBSD.org
Subject: Multiple dump -X triggers kernel panic in fss_ioctl
>Number: 47514
>Category: kern
>Synopsis: Multiple dump -X triggers kernel panic in fss_ioctl
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: hannken
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jan 30 03:00:01 +0000 2013
>Closed-Date: Mon Feb 11 08:49:52 +0000 2013
>Last-Modified: Sun Jun 09 16:25:00 +0000 2013
>Originator: enami tsugutomo
>Release: NetBSD 6.0_STABLE
>Organization:
>Environment:
System: NetBSD rplaca.sm.sony.co.jp 6.0_STABLE NetBSD 6.0_STABLE (GENERIC) #2: Mon Jan 7 16:53:59 JST 2013 enami@sigfpe.sm.sony.co.jp:/home/enami/src/netbsd-6/obj.amd64/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
I recently updated amanda from pkgsrc (from a version a few years
old), and the kernel has been panicking since then. It looks like
amanda in pkgsrc gained the ability to use dump -X where possible
last summer.
Here is the panic message and stacktrace (copied by hand):
uvm_fault(0xfffffe80bda3bd40, 0x0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip ffffffff804bf1af cs 8 rflags 10283 cr2 8 cpl 0 rsp fffffe8006d59820
kernel: page fault trap, code=0
Stopped in pid 1713.1 (dump) at netbsd:mutex_vector_enter+0x80: movq 18(%r15), %rax
db{0}> bt
mutex_vector_enter() at netbsd:mutex_vector_enter+0x80
fss_ioctl() at netbsd:fss_ioctl+0xed
VOP_IOCTL() at netbsd:VOP_IOCTL+0x3b
vn_ioctl() at netbsd:vn_ioctl+0x76
sys_ioctl() at netbsd:sys_ioctl+0x13c
syscall() at netbsd:syscall+0xc4
db{0}>
The value of %r15 is fffffffffffffff0
With my amanda configuration, up to 8 dumps run in parallel.
The system has two CPUs.
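As a cross-check of the trap output above, the faulting address can be recomputed from the register value. The sketch below assumes that ddb prints the displacement in "18(%r15)" in hexadecimal (0x18), which is the reading consistent with the reported cr2. The names effective_address and check_report_values are hypothetical helpers for illustration, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Effective address of "movq disp(%reg), %rax": base + displacement.
 * Unsigned 64-bit addition wraps modulo 2^64, as address arithmetic does.
 */
uint64_t
effective_address(uint64_t base, uint64_t disp)
{
	return base + disp;
}

/* Assert that the values copied from the report agree with each other. */
void
check_report_values(void)
{
	const uint64_t r15 = UINT64_C(0xfffffffffffffff0); /* %r15 in the report */
	const uint64_t disp = 0x18;	/* assumed hex, from "18(%r15)" */
	const uint64_t cr2 = 0x8;	/* "cr2 8" in the trap line */

	assert(effective_address(r15, disp) == cr2);
}
```

Under that assumption the sum wraps around to 0x8, matching cr2. Read as a signed value, %r15 is -16, so the load went through a value that was not a usable pointer.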
>How-To-Repeat:
Install amanda from pkgsrc and set it up to run multiple dumps in parallel.
>Fix:
I guess there is a race condition between fss_open and fss_close.
Here is a possible scenario:
One process calls fss_open while another process is in
fss_close (possible because the device driver is marked
MPSAFE). In fss_close, no lock is held while control is between
mutex_exit(&sc->sc_slock) and fss_ioctl(dev, FSSIOCCLR...), for
example, so fss_open may return successfully in that window.
fss_close then detaches the device (destroying the mutexes and
freeing the softc) before the process that opened the fss
device issues the FSSIOCSET ioctl.
When that ioctl is finally issued, the kernel panics.
The value of %r15 may indicate a destroyed mutex.
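To make the suspected interleaving concrete, here is a minimal userspace sketch in C with POSIX threads. It is not the fss(4) code: struct unit, model_open, model_use, model_close, model_attached and run_model are hypothetical names, and the sketch assumes that open attaches the unit under a device-wide lock. It shows one way to close the window: close takes the device-wide lock first and keeps it until teardown is complete, so no open can succeed halfway through a close.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model of the locking; NOT the fss(4) driver. */
struct unit {
	pthread_mutex_t slock;	/* per-unit lock, standing in for sc_slock */
	int opens;		/* number of current openers */
	long uses;		/* work done through this unit */
};

static pthread_mutex_t device_lock = PTHREAD_MUTEX_INITIALIZER;
static struct unit *unit;	/* NULL while detached */
static long total_uses;		/* protected by device_lock */

/* Open: attach on demand, entirely under device_lock. */
static void
model_open(void)
{
	pthread_mutex_lock(&device_lock);
	if (unit == NULL) {
		unit = calloc(1, sizeof(*unit));
		assert(unit != NULL);
		pthread_mutex_init(&unit->slock, NULL);
	}
	pthread_mutex_lock(&unit->slock);
	unit->opens++;
	pthread_mutex_unlock(&unit->slock);
	pthread_mutex_unlock(&device_lock);
}

/* Stand-in for the later ioctl; only legal while the caller holds an open. */
static void
model_use(void)
{
	pthread_mutex_lock(&unit->slock);
	unit->uses++;
	pthread_mutex_unlock(&unit->slock);
}

/* Close: device_lock is taken FIRST and held across the whole teardown. */
static void
model_close(void)
{
	int last;

	pthread_mutex_lock(&device_lock);
	pthread_mutex_lock(&unit->slock);
	last = (--unit->opens == 0);
	pthread_mutex_unlock(&unit->slock);
	if (last) {
		/* No open can run here, so nobody can see the dying unit. */
		total_uses += unit->uses;
		pthread_mutex_destroy(&unit->slock);
		free(unit);
		unit = NULL;
	}
	pthread_mutex_unlock(&device_lock);
}

static void *
worker(void *arg)
{
	long n = *(long *)arg;

	for (long i = 0; i < n; i++) {
		model_open();
		model_use();
		model_close();
	}
	return NULL;
}

/* Returns 1 if a unit is currently attached. Call only when idle. */
int
model_attached(void)
{
	return unit != NULL;
}

/* Run nthreads workers (1..16); returns the total number of uses. */
long
run_model(int nthreads, long iterations)
{
	pthread_t tid[16];
	int rc;

	assert(nthreads > 0 && nthreads <= 16);
	total_uses = 0;
	for (int i = 0; i < nthreads; i++) {
		rc = pthread_create(&tid[i], NULL, worker, &iterations);
		assert(rc == 0);
	}
	for (int i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);
	return total_uses;
}
```

Calling run_model(8, 1000) mirrors the eight parallel dumps from the description. If model_close dropped device_lock between the last-opener decision and the teardown, a concurrent model_open could return with a unit that is then freed, and the following model_use would touch a destroyed mutex, which is the scenario described above.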
>Release-Note:
>Audit-Trail:
From: "J. Hannken-Illjes" <hannken@eis.cs.tu-bs.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/47514: Multiple dump -X triggers kernel panic in fss_ioctl
Date: Wed, 30 Jan 2013 11:35:12 +0100
Please try the attached patch. If you are not able to build a kernel
please drop me a note containing the output of "uname -a".
--
J. Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)
Index: fss.c
===================================================================
RCS file: /cvsroot/src/sys/dev/fss.c,v
retrieving revision 1.83
diff -p -u -2 -r1.83 fss.c
--- fss.c 28 Jul 2012 16:14:17 -0000 1.83
+++ fss.c 30 Jan 2013 10:33:15 -0000
@@ -224,4 +224,5 @@ fss_close(dev_t dev, int flags, int mode
error = 0;
+ mutex_enter(&fss_device_lock);
restart:
mutex_enter(&sc->sc_slock);
@@ -229,4 +230,5 @@ restart:
sc->sc_flags &= ~mflag;
mutex_exit(&sc->sc_slock);
+ mutex_exit(&fss_device_lock);
return 0;
}
@@ -240,10 +242,7 @@ restart:
if ((sc->sc_flags & FSS_ACTIVE) != 0) {
mutex_exit(&sc->sc_slock);
+ mutex_exit(&fss_device_lock);
return error;
}
- if (! mutex_tryenter(&fss_device_lock)) {
- mutex_exit(&sc->sc_slock);
- goto restart;
- }
KASSERT((sc->sc_flags & FSS_ACTIVE) == 0);
From: tsugutomo.enami@jp.sony.com
To: <gnats-bugs@netbsd.org>
Cc: <kern-bug-people@netbsd.org>, <gnats-admin@netbsd.org>,
<netbsd-bugs@netbsd.org>
Subject: Re: kern/47514: Multiple dump -X triggers kernel panic in fss_ioctl
Date: Fri, 01 Feb 2013 15:07:53 +0900
"J. Hannken-Illjes" <hannken@eis.cs.tu-bs.de> writes:
> Please try the attached patch. If you are not able to build a kernel
> please drop me a note containing the output of "uname -a".
I've applied the patch to my netbsd-6 working directory and it looks
like the system survived at least the nightly dump of last night.
Thanks.
enami.
From: "J. Hannken-Illjes" <hannken@eis.cs.tu-bs.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/47514: Multiple dump -X triggers kernel panic in fss_ioctl
Date: Fri, 1 Feb 2013 09:01:18 +0100
On Feb 1, 2013, at 7:10 AM, tsugutomo.enami@jp.sony.com wrote:
> I've applied the patch to my netbsd-6 working directory and it looks
> like the system survived at least the nightly dump of last night.
Will commit in a few days then ...
--
J. Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)
From: "Juergen Hannken-Illjes" <hannken@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47514 CVS commit: src/sys/dev
Date: Wed, 6 Feb 2013 09:29:47 +0000
Module Name: src
Committed By: hannken
Date: Wed Feb 6 09:29:46 UTC 2013
Modified Files:
src/sys/dev: fss.c
Log Message:
Take fss_device_lock first when closing a fss device.
Fixes PR kern/47514 (Multiple dump -X triggers kernel panic in fss_ioctl)
To generate a diff of this commit:
cvs rdiff -u -r1.83 -r1.84 src/sys/dev/fss.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
Responsible-Changed-From-To: kern-bug-people->hannken
Responsible-Changed-By: hannken@NetBSD.org
Responsible-Changed-When: Fri, 08 Feb 2013 10:01:54 +0000
Responsible-Changed-Why:
Take.
State-Changed-From-To: open->pending-pullups
State-Changed-By: hannken@NetBSD.org
State-Changed-When: Fri, 08 Feb 2013 10:01:54 +0000
State-Changed-Why:
Fixed in tree -- pullup requested.
From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47514 CVS commit: [netbsd-6] src/sys/dev
Date: Sun, 10 Feb 2013 23:57:26 +0000
Module Name: src
Committed By: riz
Date: Sun Feb 10 23:57:26 UTC 2013
Modified Files:
src/sys/dev [netbsd-6]: fss.c
Log Message:
Pull up following revision(s) (requested by hannken in ticket #808):
sys/dev/fss.c: revision 1.84
Take fss_device_lock first when closing a fss device.
Fixes PR kern/47514 (Multiple dump -X triggers kernel panic in fss_ioctl)
To generate a diff of this commit:
cvs rdiff -u -r1.81.4.1 -r1.81.4.2 src/sys/dev/fss.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47514 CVS commit: [netbsd-6-0] src/sys/dev
Date: Sun, 10 Feb 2013 23:57:38 +0000
Module Name: src
Committed By: riz
Date: Sun Feb 10 23:57:38 UTC 2013
Modified Files:
src/sys/dev [netbsd-6-0]: fss.c
Log Message:
Pull up following revision(s) (requested by hannken in ticket #808):
sys/dev/fss.c: revision 1.84
Take fss_device_lock first when closing a fss device.
Fixes PR kern/47514 (Multiple dump -X triggers kernel panic in fss_ioctl)
To generate a diff of this commit:
cvs rdiff -u -r1.81.4.1 -r1.81.4.1.4.1 src/sys/dev/fss.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: pending-pullups->closed
State-Changed-By: hannken@NetBSD.org
State-Changed-When: Mon, 11 Feb 2013 08:49:52 +0000
State-Changed-Why:
Pulled up.
From: "SAITOH Masanobu" <msaitoh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47514 CVS commit: [netbsd-5] src/sys/dev
Date: Sun, 9 Jun 2013 11:29:43 +0000
Module Name: src
Committed By: msaitoh
Date: Sun Jun 9 11:29:43 UTC 2013
Modified Files:
src/sys/dev [netbsd-5]: fss.c
Log Message:
Pull up following revision(s) (requested by gdt in ticket #1853):
sys/dev/fss.c: revision 1.84
Take fss_device_lock first when closing a fss device.
Fixes PR kern/47514 (Multiple dump -X triggers kernel panic in fss_ioctl)
To generate a diff of this commit:
cvs rdiff -u -r1.60.4.6 -r1.60.4.7 src/sys/dev/fss.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "SAITOH Masanobu" <msaitoh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47514 CVS commit: [netbsd-5-1] src/sys/dev
Date: Sun, 9 Jun 2013 16:18:57 +0000
Module Name: src
Committed By: msaitoh
Date: Sun Jun 9 16:18:57 UTC 2013
Modified Files:
src/sys/dev [netbsd-5-1]: fss.c
Log Message:
Pull up following revision(s) (requested by gdt in ticket #1853):
sys/dev/fss.c: revision 1.84
Take fss_device_lock first when closing a fss device.
Fixes PR kern/47514 (Multiple dump -X triggers kernel panic in fss_ioctl)
To generate a diff of this commit:
cvs rdiff -u -r1.60.4.3 -r1.60.4.3.2.1 src/sys/dev/fss.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "SAITOH Masanobu" <msaitoh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47514 CVS commit: [netbsd-5-2] src/sys/dev
Date: Sun, 9 Jun 2013 16:20:29 +0000
Module Name: src
Committed By: msaitoh
Date: Sun Jun 9 16:20:29 UTC 2013
Modified Files:
src/sys/dev [netbsd-5-2]: fss.c
Log Message:
Pull up following revision(s) (requested by gdt in ticket #1853):
sys/dev/fss.c: revision 1.84
Take fss_device_lock first when closing a fss device.
Fixes PR kern/47514 (Multiple dump -X triggers kernel panic in fss_ioctl)
To generate a diff of this commit:
cvs rdiff -u -r1.60.4.6 -r1.60.4.6.2.1 src/sys/dev/fss.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
>Unformatted:
Pullup request to -5.
The following script provokes the crash around 10% of the time (with 4
snapshot devices). With the fix, 100 runs are fine.
----------------------------------------
#!/bin/sh
fssconfig -l
for fs in / /usr /n1; do
for lev in 0 1 2 3 4; do
dump $lev -f - -XS $fs > /dev/null &
done
done
sleep 1
pkill dump
sleep 1
fssconfig -l
----------------------------------------
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.