NetBSD Problem Report #46232
From mm_lists@pulsar-zone.net Tue Mar 20 17:01:32 2012
Return-Path: <mm_lists@pulsar-zone.net>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
by www.NetBSD.org (Postfix) with ESMTP id AC4C063E275
for <gnats-bugs@gnats.NetBSD.org>; Tue, 20 Mar 2012 17:01:31 +0000 (UTC)
Message-Id: <201203201701.q2KH1c9H022293@ginseng.pulsar-zone.net>
Date: Tue, 20 Mar 2012 13:01:37 -0400
From: Matthew Mondor <mm_lists@pulsar-zone.net>
To: gnats-bugs@gnats.NetBSD.org
Subject: netbsd-6 pad(4) panic: locking against myself
>Number: 46232
>Category: kern
>Synopsis: netbsd-6 pad(4) panic: locking against myself
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: mrg
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Mar 20 17:05:01 +0000 2012
>Closed-Date: Mon Apr 16 06:19:58 +0000 2012
>Last-Modified: Tue Apr 17 17:05:01 +0000 2012
>Originator: Matthew Mondor
>Release: NetBSD 6.0_BETA
>Organization:
>Environment:
System: NetBSD ninja.xisop 6.0_BETA NetBSD 6.0_BETA (GENERIC_MM) #2: Tue Mar 20 12:14:05 EDT 2012 root@ninja.xisop:/usr/obj/sys/arch/amd64/compile/GENERIC_MM amd64
Architecture: x86_64
Machine: amd64
>Description:
I can reliably reproduce a pad(4) panic.
Mutex error: mutex_vector_enter: locking against myself
wait/spin: 0/0
...
lockdebug_abort()
mutex_vector_enter()
audio_clear()
audio_pint()
pad_read()
spec_read()
VOP_READ()
vn_read()
dofileread()
sys_read()
syscall()
BTW, must I do something special for ddb to do VGA_POST? The kernel
was built with support for it, but if within X11 I can't see ddb
entering. I had crash dumps enabled because of that instead, but I
wasn't patient enough to wait for the long dump to finish...
>How-To-Repeat:
Shell 1:
$ audioplay -d /dev/sound1 foo.wav
Shall 2:
$ dd if=/dev/pad0 of=foo.pcm
>Fix:
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->mrg
Responsible-Changed-By: mrg@NetBSD.org
Responsible-Changed-When: Wed, 21 Mar 2012 00:37:25 +0000
Responsible-Changed-Why:
i'll take a look.
From: Matthew Mondor <mm_lists@pulsar-zone.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/46232: netbsd-6 pad(4) panic: locking against myself
Date: Thu, 22 Mar 2012 02:34:26 -0400
On Tue, 20 Mar 2012 17:05:01 +0000 (UTC)
Matthew Mondor <mm_lists@pulsar-zone.net> wrote:
When attempting to reproduce the test today in order to see if the
ddb.commandonenter=call ddb_vgapost sysctl worked, I initially failed
to crash the system, with instead a locked dd process stuck in an
endless uninterruptible loop (and unkillable even with SIGKILL), and
stuck audio writer process.
There were two differences from my other tests in this case:
- Playing a mono rather than stereo sample
- Using -d /dev/sound1 rather than
relinking /dev/sound, /dev/audio, /dev/audioctl to sound1, audio1 and
audioctl1, respectively.
With the symlinks pointing to the 1 devices, and playing stereo
content, I still could reproduce the panic, and fortunately ddb_vgapost
works.
Thanks,
--
Matt
From: Matthew Mondor <mm_lists@pulsar-zone.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/46232: netbsd-6 pad(4) panic: locking against myself
Date: Thu, 22 Mar 2012 05:30:36 -0400
On Tue, 20 Mar 2012 17:05:01 +0000 (UTC)
Matthew Mondor <mm_lists@pulsar-zone.net> wrote:
> >Synopsis: netbsd-6 pad(4) panic: locking against myself
I had not noticed when posting the PR, but kern/39204 might be related.
--
Matt
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/46232: netbsd-6 pad(4) panic: locking against myself
Date: Mon, 26 Mar 2012 17:21:47 +1100
this bug is most certainly not related to prior bugs or crashes in
pad itself, but is an audiomp issue.
looking at the code is fairly obvious. pad_read() takes the lock
thread lock, does a bunch of processing, and then right before it
calls into the audio interrupt it takes the intr lock. then we
traverse down audio_pint()->audio_clear(), and audio clear expects
to need to take the intr_lock here.
there's a bunch of bad uses of audio_clear() in audio.c.
.mrg.
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
Matthew Mondor <mm_lists@pulsar-zone.net>
Subject: re: kern/46232: netbsd-6 pad(4) panic: locking against myself
Date: Mon, 26 Mar 2012 17:46:33 +1100
could you please try this patch? thanks.
.mrg.
Index: audio.c
===================================================================
RCS file: /cvsroot/src/sys/dev/audio.c,v
retrieving revision 1.258
diff -p -r1.258 audio.c
*** audio.c 21 Feb 2012 20:53:34 -0000 1.258
--- audio.c 26 Mar 2012 06:45:55 -0000
*************** int audio_initbufs(struct audio_softc *)
*** 241,246 ****
--- 241,247 ----
void audio_calcwater(struct audio_softc *);
int audio_drain(struct audio_softc *);
void audio_clear(struct audio_softc *);
+ void audio_clear_intr_unlocked(struct audio_softc *sc);
static inline void audio_pint_silence
(struct audio_softc *, struct audio_ringbuffer *, uint8_t *, int);
*************** audio_clear(struct audio_softc *sc)
*** 1992,1998 ****
KASSERT(mutex_owned(sc->sc_lock));
- mutex_enter(sc->sc_intr_lock);
if (sc->sc_rbus) {
cv_broadcast(&sc->sc_rchan);
sc->hw_if->halt_input(sc->hw_hdl);
--- 1993,1998 ----
*************** audio_clear(struct audio_softc *sc)
*** 2005,2010 ****
--- 2005,2018 ----
sc->sc_pbus = false;
sc->sc_pr.pause = false;
}
+ }
+
+ void
+ audio_clear_intr_unlocked(struct audio_softc *sc)
+ {
+
+ mutex_enter(sc->sc_intr_lock);
+ audio_clear(sc);
mutex_exit(sc->sc_intr_lock);
}
*************** audiosetinfo(struct audio_softc *sc, str
*** 3643,3649 ****
setmode = 0;
if (nr > 0) {
if (!cleared) {
! audio_clear(sc);
cleared = true;
}
modechange = true;
--- 3651,3657 ----
setmode = 0;
if (nr > 0) {
if (!cleared) {
! audio_clear_intr_unlocked(sc);
cleared = true;
}
modechange = true;
*************** audiosetinfo(struct audio_softc *sc, str
*** 3651,3657 ****
}
if (np > 0) {
if (!cleared) {
! audio_clear(sc);
cleared = true;
}
modechange = true;
--- 3659,3665 ----
}
if (np > 0) {
if (!cleared) {
! audio_clear_intr_unlocked(sc);
cleared = true;
}
modechange = true;
*************** audiosetinfo(struct audio_softc *sc, str
*** 3660,3666 ****
if (SPECIFIED(ai->mode)) {
if (!cleared) {
! audio_clear(sc);
cleared = true;
}
modechange = true;
--- 3668,3674 ----
if (SPECIFIED(ai->mode)) {
if (!cleared) {
! audio_clear_intr_unlocked(sc);
cleared = true;
}
modechange = true;
*************** audiosetinfo(struct audio_softc *sc, str
*** 3754,3760 ****
if (SPECIFIED(p->port)) {
if (!cleared) {
! audio_clear(sc);
cleared = true;
}
error = au_set_port(sc, &sc->sc_outports, p->port);
--- 3762,3768 ----
if (SPECIFIED(p->port)) {
if (!cleared) {
! audio_clear_intr_unlocked(sc);
cleared = true;
}
error = au_set_port(sc, &sc->sc_outports, p->port);
*************** audiosetinfo(struct audio_softc *sc, str
*** 3763,3769 ****
}
if (SPECIFIED(r->port)) {
if (!cleared) {
! audio_clear(sc);
cleared = true;
}
error = au_set_port(sc, &sc->sc_inports, r->port);
--- 3771,3777 ----
}
if (SPECIFIED(r->port)) {
if (!cleared) {
! audio_clear_intr_unlocked(sc);
cleared = true;
}
error = au_set_port(sc, &sc->sc_inports, r->port);
*************** audiosetinfo(struct audio_softc *sc, str
*** 3825,3831 ****
/* Block size specified explicitly. */
if (ai->blocksize == 0) {
if (!cleared) {
! audio_clear(sc);
cleared = true;
}
sc->sc_blkset = false;
--- 3833,3839 ----
/* Block size specified explicitly. */
if (ai->blocksize == 0) {
if (!cleared) {
! audio_clear_intr_unlocked(sc);
cleared = true;
}
sc->sc_blkset = false;
*************** audiosetinfo(struct audio_softc *sc, str
*** 3836,3842 ****
/* check whether new blocksize changes actually */
if (hw->round_blocksize == NULL) {
if (!cleared) {
! audio_clear(sc);
cleared = true;
}
sc->sc_pr.blksize = ai->blocksize;
--- 3844,3850 ----
/* check whether new blocksize changes actually */
if (hw->round_blocksize == NULL) {
if (!cleared) {
! audio_clear_intr_unlocked(sc);
cleared = true;
}
sc->sc_pr.blksize = ai->blocksize;
*************** audiosetinfo(struct audio_softc *sc, str
*** 3849,3855 ****
if (pblksize != sc->sc_pr.blksize ||
rblksize != sc->sc_rr.blksize) {
if (!cleared) {
! audio_clear(sc);
cleared = true;
}
sc->sc_pr.blksize = ai->blocksize;
--- 3857,3863 ----
if (pblksize != sc->sc_pr.blksize ||
rblksize != sc->sc_rr.blksize) {
if (!cleared) {
! audio_clear_intr_unlocked(sc);
cleared = true;
}
sc->sc_pr.blksize = ai->blocksize;
State-Changed-From-To: open->feedback
State-Changed-By: mrg@NetBSD.org
State-Changed-When: Mon, 26 Mar 2012 07:00:56 +0000
State-Changed-Why:
patch sent for testing.
From: Matthew Mondor <mm_lists@pulsar-zone.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/46232: netbsd-6 pad(4) panic: locking against myself
Date: Mon, 26 Mar 2012 13:23:21 -0400
On Mon, 26 Mar 2012 06:50:05 +0000 (UTC)
matthew green <mrg@eterna.com.au> wrote:
> could you please try this patch? thanks.
With that patch I didn't experience a panic so far. However:
Try #1: killing dd was not possible, it went in some 100% system-time
busy loop after the recording session once I tried to kill it. The
resulting file only had about 1 second of data, the rest of the file
was zeros.
Try #2: killing dd was successful. The file content however was
irregular, with many large zero sections. Thus playing back the
results with the proper parameters, it was correct stereo output, but
filled with zero gaps, so playing with regular cuts.
Thanks,
--
Matt
State-Changed-From-To: feedback->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Mon, 26 Mar 2012 17:37:17 +0000
State-Changed-Why:
patch improved it, still borked though
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
Matthew Mondor <mm_lists@pulsar-zone.net>
Subject: re: kern/46232: netbsd-6 pad(4) panic: locking against myself
Date: Tue, 27 Mar 2012 04:47:48 +1100
> From: Matthew Mondor <mm_lists@pulsar-zone.net>
> To: gnats-bugs@NetBSD.org
> Cc:
> Subject: Re: kern/46232: netbsd-6 pad(4) panic: locking against myself
> Date: Mon, 26 Mar 2012 13:23:21 -0400
>
> On Mon, 26 Mar 2012 06:50:05 +0000 (UTC)
> matthew green <mrg@eterna.com.au> wrote:
>
> > could you please try this patch? thanks.
>
> With that patch I didn't experience a panic so far. However:
>
> Try #1: killing dd was not possible, it went in some 100% system-time
> busy loop after the recording session once I tried to kill it. The
> resulting file only had about 1 second of data, the rest of the file
> was zeros.
>
> Try #2: killing dd was successful. The file content however was
> irregular, with many large zero sections. Thus playing back the
> results with the proper parameters, it was correct stereo output, but
> filled with zero gaps, so playing with regular cuts.
yeah - these both are different bugs. one of them might be the
same as 39204 (#2). so there are generic pad(4) bugs but they
weren't introduced by audiomp it seems. :-(
thanks for testing, but i'm not sure what the real problem is for
either of the remaining issues.
.mrg.
State-Changed-From-To: open->closed
State-Changed-By: mrg@NetBSD.org
State-Changed-When: Mon, 26 Mar 2012 18:27:13 +0000
State-Changed-Why:
this problem fixed. the other pad problems should be in other PRs.
seems that PR 39204 covers #2, but the unkillable dd is not covered.
From: "matthew green" <mrg@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/46232 CVS commit: src/sys/dev
Date: Mon, 26 Mar 2012 18:26:11 +0000
Module Name: src
Committed By: mrg
Date: Mon Mar 26 18:26:10 UTC 2012
Modified Files:
src/sys/dev: audio.c
Log Message:
deal with PR#46232: ensure locking around audio_clear() is handled
consistently and avoid locking against self.
To generate a diff of this commit:
cvs rdiff -u -r1.258 -r1.259 src/sys/dev/audio.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Matthew Mondor <mm_lists@pulsar-zone.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/46232 (netbsd-6 pad(4) panic: locking against myself)
Date: Mon, 26 Mar 2012 14:39:10 -0400
On Mon, 26 Mar 2012 18:27:14 +0000 (UTC)
mrg@NetBSD.org wrote:
> Synopsis: netbsd-6 pad(4) panic: locking against myself
>
> State-Changed-From-To: open->closed
> State-Changed-By: mrg@NetBSD.org
> State-Changed-When: Mon, 26 Mar 2012 18:27:13 +0000
> State-Changed-Why:
> this problem fixed. the other pad problems should be in other PRs.
> seems that PR 39204 covers #2, but the unkillable dd is not covered.
Should this be in pending-pullups for netbsd-6 though?
Thanks,
--
Matt
State-Changed-From-To: closed->pending-pullups
State-Changed-By: mrg@NetBSD.org
State-Changed-When: Mon, 26 Mar 2012 19:49:52 +0000
State-Changed-Why:
right, awaiting on netbsd-6 pullups.
From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/46232 CVS commit: [netbsd-6] src/sys/dev
Date: Mon, 9 Apr 2012 17:46:59 +0000
Module Name: src
Committed By: riz
Date: Mon Apr 9 17:46:58 UTC 2012
Modified Files:
src/sys/dev [netbsd-6]: audio.c
Log Message:
Pull up following revision(s) (requested by mrg in ticket #164):
sys/dev/audio.c: revision 1.259
deal with PR#46232: ensure locking around audio_clear() is handled
consistently and avoid locking against self.
To generate a diff of this commit:
cvs rdiff -u -r1.257.2.1 -r1.257.2.2 src/sys/dev/audio.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Matthew Mondor <mm_lists@pulsar-zone.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: PR/46232 CVS commit: [netbsd-6] src/sys/dev
Date: Sat, 14 Apr 2012 21:31:38 -0400
On Mon, 9 Apr 2012 17:50:07 +0000 (UTC)
"Jeff Rizzo" <riz@netbsd.org> wrote:
> Module Name: src
> Committed By: riz
> Date: Mon Apr 9 17:46:58 UTC 2012
>
> Modified Files:
> src/sys/dev [netbsd-6]: audio.c
>
> Log Message:
> Pull up following revision(s) (requested by mrg in ticket #164):
> sys/dev/audio.c: revision 1.259
> deal with PR#46232: ensure locking around audio_clear() is handled
> consistently and avoid locking against self.
I confirm that this fix works. Unfortunately pad(4) still doesn't work
as well as it did on netbsd-5, but I agree that this is another issue.
Thanks,
--
Matt
State-Changed-From-To: pending-pullups->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Mon, 16 Apr 2012 06:19:58 +0000
State-Changed-Why:
pullup done.
Can you make sure that the other issues get their own PRs?
(and preferably post a crossreference in here for the record too)
From: Matthew Mondor <mm_lists@pulsar-zone.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/46232 (netbsd-6 pad(4) panic: locking against myself)
Date: Mon, 16 Apr 2012 18:03:42 -0400
On Mon, 16 Apr 2012 06:20:00 +0000 (UTC)
dholland@NetBSD.org wrote:
> Synopsis: netbsd-6 pad(4) panic: locking against myself
>
> State-Changed-From-To: pending-pullups->closed
> State-Changed-By: dholland@NetBSD.org
> State-Changed-When: Mon, 16 Apr 2012 06:19:58 +0000
> State-Changed-Why:
> pullup done.
> Can you make sure that the other issues get their own PRs?
> (and preferably post a crossreference in here for the record too)
I think that kern/39204 touches this other issue. However, its
synopsis might be a little confusing as I do get PCM data, albeit
damaged data...
--
Matt
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/46232 (netbsd-6 pad(4) panic: locking against myself)
Date: Tue, 17 Apr 2012 17:02:47 +0000
On Mon, Apr 16, 2012 at 10:05:04PM +0000, Matthew Mondor wrote:
> > Can you make sure that the other issues get their own PRs?
> > (and preferably post a crossreference in here for the record too)
>
> I think that kern/39204 touches this other issue. However, its
> synopsis might be a little confusing as I do get PCM data, albeit
> damaged data...
Thanks.
(synopses can be edited)
--
David A. Holland
dholland@netbsd.org
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.