NetBSD Problem Report #46121

From www@NetBSD.org  Thu Mar  1 10:51:42 2012
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id 6270963E17F
	for <gnats-bugs@gnats.NetBSD.org>; Thu,  1 Mar 2012 10:51:42 +0000 (UTC)
Message-Id: <20120301105141.3561663DFD4@www.NetBSD.org>
Date: Thu,  1 Mar 2012 10:51:41 +0000 (UTC)
From: ad@netbsd.org
Reply-To: ad@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: audiomp: locking protocol error
X-Send-Pr-Version: www-1.0

>Number:         46121
>Category:       kern
>Synopsis:       audiomp: locking protocol error
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    mrg
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 01 10:55:00 +0000 2012
>Closed-Date:    Fri Jan 04 03:45:07 +0000 2013
>Last-Modified:  Fri Jan 04 03:45:07 +0000 2013
>Originator:     Andrew Doran
>Release:        -current
>Organization:
The NetBSD Project
>Environment:
N/A
>Description:
audio.c has this descriptive block:

    122  * List of hardware interface methods, and which locks are held when each
    123  * is called by this module:
    124  *
    125  *	METHOD			INTR	THREAD  NOTES
    126  *	----------------------- ------- -------	-------------------------
...
    146  *	allocm 			-	-	Called at attach time
    147  *	freem 			-	-	Called at attach time

Lets look at the code.

    381 void
    382 audioattach(device_t parent, device_t self, void *aux)
...
    454 	/*
    455 	 * XXX  Would like to not hold the sc_lock around this whole block
    456 	 * escpially for audio_alloc_ring(), except that the latter calls
    457 	 * ->round_blocksize() which demands the thread lock to be taken.
    458 	 *
    459 	 * Revisit.
    460 	 */
    461 	mutex_enter(sc->sc_lock);
...
    463 		error = audio_alloc_ring(sc, &sc->sc_pr,
...
    476 			if (sc->sc_pr.s.start != 0)
    477 				audio_free_ring(sc, &sc->sc_pr);

Further down the chain routines such as this are called:

   1041 static void
   1042 btsco_freem(void *hdl, void *addr, size_t size)
...
   1052 		while (sc->sc_tx_refcnt> 0 && count-- > 0)
   1053 			kpause("drain", false, 1, NULL);

So comment block does not match reality and we call allocm/freem routines with lock held. This is a serious problem as:

(a) down the chain we do e.g. kpause(), kmem_alloc() etc.
(b) audio driver "thread" lock is taken from soft interrupt context.
(c) thread lock may be shared with other subsystems.  For instance, in the case of bluetooth the thread lock will be bt_lock and this is also used to lock bluetooth sockets (so->so_lock == bt_lock). We cannot block with socket locks held.
>How-To-Repeat:
Code inspection.
>Fix:
Do not call allocm/freem methods with sc_lock held.  Instead inspect allocm/freem methods and calling code path.  Sprinkle synchronization as necessary.

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->mrg
Responsible-Changed-By: mrg@NetBSD.org
Responsible-Changed-When: Wed, 21 Mar 2012 00:38:17 +0000
Responsible-Changed-Why:
i'll take a look.


From: matthew green <mrg@eterna.com.au>
To: ad@netbsd.org, gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org
Subject: re: kern/46121: audiomp: locking protocol error
Date: Mon, 26 Mar 2012 18:07:02 +1100

 this looks like it should do the trick.

 oh, i'll fix the comment as well.


 .mrg.


 Index: audio.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/audio.c,v
 retrieving revision 1.258
 diff -p -r1.258 audio.c
 *** audio.c	21 Feb 2012 20:53:34 -0000	1.258
 --- audio.c	26 Mar 2012 07:01:37 -0000
 *************** audioattach(device_t parent, device_t se
 *** 389,394 ****
 --- 390,396 ----
   	mixer_devinfo_t mi;
   	int iclass, mclass, oclass, rclass, props;
   	int record_master_found, record_source_found;
 + 	bool can_capture, can_playback;

   	sc = device_private(self);
   	sc->dev = self;
 *************** audioattach(device_t parent, device_t se
 *** 459,482 ****
   	 * Revisit.
   	 */
   	mutex_enter(sc->sc_lock);
 ! 	if (audio_can_playback(sc)) {
   		error = audio_alloc_ring(sc, &sc->sc_pr,
   		    AUMODE_PLAY, AU_RING_SIZE);
   		if (error) {
   			sc->hw_if = NULL;
 -  			mutex_exit(sc->sc_lock);
   			aprint_error("audio: could not allocate play buffer\n");
   			return;
   		}
   	}
 ! 	if (audio_can_capture(sc)) {
   		error = audio_alloc_ring(sc, &sc->sc_rr,
   		    AUMODE_RECORD, AU_RING_SIZE);
   		if (error) {
   			if (sc->sc_pr.s.start != 0)
   				audio_free_ring(sc, &sc->sc_pr);
   			sc->hw_if = NULL;
 -  			mutex_exit(sc->sc_lock);
   			aprint_error("audio: could not allocate record buffer\n");
   			return;
   		}
 --- 461,486 ----
   	 * Revisit.
   	 */
   	mutex_enter(sc->sc_lock);
 ! 	can_playback = audio_can_playback(sc);
 ! 	can_capture = audio_can_capture(sc);
 !  	mutex_exit(sc->sc_lock);
 ! 
 ! 	if (can_playback) {
   		error = audio_alloc_ring(sc, &sc->sc_pr,
   		    AUMODE_PLAY, AU_RING_SIZE);
   		if (error) {
   			sc->hw_if = NULL;
   			aprint_error("audio: could not allocate play buffer\n");
   			return;
   		}
   	}
 ! 	if (can_capture) {
   		error = audio_alloc_ring(sc, &sc->sc_rr,
   		    AUMODE_RECORD, AU_RING_SIZE);
   		if (error) {
   			if (sc->sc_pr.s.start != 0)
   				audio_free_ring(sc, &sc->sc_pr);
   			sc->hw_if = NULL;
   			aprint_error("audio: could not allocate record buffer\n");
   			return;
   		}
 *************** audioattach(device_t parent, device_t se
 *** 484,489 ****
 --- 488,494 ----

   	sc->sc_lastgain = 128;

 + 	mutex_enter(sc->sc_lock);
   	error = audio_set_defaults(sc, 0);
   	mutex_exit(sc->sc_lock);
   	if (error != 0) {
 *************** audio_alloc_ring(struct audio_softc *sc,
 *** 884,891 ****
   	if (bufsize < AUMINBUF)
   		bufsize = AUMINBUF;
   	ROUNDSIZE(bufsize);
 ! 	if (hw->round_buffersize)
   		bufsize = hw->round_buffersize(hdl, direction, bufsize);
   	if (hw->allocm)
   		r->s.start = hw->allocm(hdl, direction, bufsize);
   	else
 --- 889,899 ----
   	if (bufsize < AUMINBUF)
   		bufsize = AUMINBUF;
   	ROUNDSIZE(bufsize);
 ! 	if (hw->round_buffersize) {
 ! 		mutex_enter(sc->sc_lock);
   		bufsize = hw->round_buffersize(hdl, direction, bufsize);
 +  		mutex_exit(sc->sc_lock);
 + 	}
   	if (hw->allocm)
   		r->s.start = hw->allocm(hdl, direction, bufsize);
   	else

From: David Laight <david@l8s.co.uk>
To: matthew green <mrg@eterna.com.au>
Cc: ad@netbsd.org, gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/46121: audiomp: locking protocol error
Date: Mon, 26 Mar 2012 08:33:01 +0100

 On Mon, Mar 26, 2012 at 06:07:02PM +1100, matthew green wrote:
 > 
 > this looks like it should do the trick.
 ...
 >   	mutex_enter(sc->sc_lock);
 > ! 	can_playback = audio_can_playback(sc);
 > ! 	can_capture = audio_can_capture(sc);
 > !  	mutex_exit(sc->sc_lock);
 > ! 
 > ! 	if (can_playback) {

 Hmmm... that sort of code tends to look dubious [1].
 While acquiring a mutex across a function call can protect the underlying
 data from immediate corruption, it isn't necessarily obvious that
 the checked condition can't change state after the mutex is released.

 	David

 [1] As does releaseing a mutex across a function call!

 -- 
 David Laight: david@l8s.co.uk

From: matthew green <mrg@eterna.com.au>
To: David Laight <david@l8s.co.uk>
Cc: ad@netbsd.org, gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
    gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: re: kern/46121: audiomp: locking protocol error
Date: Tue, 27 Mar 2012 04:44:11 +1100

 > > this looks like it should do the trick.
 > ...
 > >   	mutex_enter(sc->sc_lock);
 > > ! 	can_playback = audio_can_playback(sc);
 > > ! 	can_capture = audio_can_capture(sc);
 > > !  	mutex_exit(sc->sc_lock);
 > > ! 
 > > ! 	if (can_playback) {
 > 
 > Hmmm... that sort of code tends to look dubious [1].
 > While acquiring a mutex across a function call can protect the underlying
 > data from immediate corruption, it isn't necessarily obvious that
 > the checked condition can't change state after the mutex is released.
 > 
 > 	David
 > 
 > [1] As does releaseing a mutex across a function call!

 in a general you're right, but this case doesn't fit.  we need the
 lock to ask the hw driver about play/record state, and this is in
 audioattach() so there won't be other activity here yet.  so what
 this really does is push locking around that probably isn't strictly
 necessary for this path, but is necessary for others.

 the conditions can change, but only after open() etc.


 .mrg.

State-Changed-From-To: open->pending-pullups
State-Changed-By: mrg@NetBSD.org
State-Changed-When: Fri, 06 Apr 2012 06:15:22 +0000
State-Changed-Why:
fix commited to -current


From: "matthew green" <mrg@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/46121 CVS commit: src/sys/dev
Date: Fri, 6 Apr 2012 06:15:14 +0000

 Module Name:	src
 Committed By:	mrg
 Date:		Fri Apr  6 06:15:14 UTC 2012

 Modified Files:
 	src/sys/dev: audio.c

 Log Message:
 don't hold the thread lock while calling allocm() or freem().  fixes PR#46121


 To generate a diff of this commit:
 cvs rdiff -u -r1.259 -r1.260 src/sys/dev/audio.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/46121 CVS commit: [netbsd-6] src/sys/dev
Date: Mon, 7 May 2012 03:03:18 +0000

 Module Name:	src
 Committed By:	riz
 Date:		Mon May  7 03:03:18 UTC 2012

 Modified Files:
 	src/sys/dev [netbsd-6]: audio.c

 Log Message:
 Pull up following revision(s) (requested by mrg in ticket #205):
 	sys/dev/audio.c: revision 1.260
 don't hold the thread lock while calling allocm() or freem().  fixes PR#46121


 To generate a diff of this commit:
 cvs rdiff -u -r1.257.2.2 -r1.257.2.3 src/sys/dev/audio.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: pending-pullups->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Fri, 04 Jan 2013 03:45:07 +0000
State-Changed-Why:
pulled up to netbsd-6.


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.