NetBSD Problem Report #52331

From www@NetBSD.org  Sun Jun 25 08:42:35 2017
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id EC2507A267
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 25 Jun 2017 08:42:34 +0000 (UTC)
Message-Id: <20170625084233.A8DC17A291@mollari.NetBSD.org>
Date: Sun, 25 Jun 2017 08:42:33 +0000 (UTC)
From: baijiaju1990@163.com
Reply-To: baijiaju1990@163.com
To: gnats-bugs@NetBSD.org
Subject: ydc driver: sleep-under-spin-mutex bugs in yds_allocmem
X-Send-Pr-Version: www-1.0

>Number:         52331
>Category:       kern
>Synopsis:       ydc driver: sleep-under-spin-mutex bugs in yds_allocmem
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Jun 25 08:45:00 +0000 2017
>Last-Modified:  Sun Jun 25 16:15:00 +0000 2017
>Originator:     Jia-Ju Bai
>Release:        NetBSD-7.1
>Organization:
Tsinghua University
>Environment:
i386
>Description:
The driver may sleep while holding a spin mutex. The function call path in file "sys/dev/pci/yds.c" in the NetBSD-7.1 release is:
yds_resume [acquire the spin mutex]
  yds_init
    yds_allocate_slots
      yds_allocmem
        bus_dmamem_alloc(BUS_DMA_WAITOK) --> may sleep
        bus_dmamem_map(BUS_DMA_WAITOK) --> may sleep
        bus_dmamap_create(BUS_DMA_WAITOK) --> may sleep
        bus_dmamap_load(BUS_DMA_WAITOK) --> may sleep

These bugs were found by a static analysis tool that I wrote, and I confirmed them by reviewing the NetBSD code.
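
The reported path can be illustrated with a small userland model. This is only a sketch: the `model_*` names are hypothetical and are not NetBSD APIs, a counter stands in for the spin mutex, and a call that "may sleep" is merely recorded rather than performed. It shows why such a call is flagged when a spin mutex is held, and makes no claim about whether the path is reachable in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland model; none of these names are NetBSD APIs. */
enum model_flag { MODEL_WAITOK, MODEL_NOWAIT };

struct model_cpu {
	int  spin_held;     /* count of spin mutexes currently held */
	bool slept_in_spin; /* set when a sleepable call runs with one held */
};

/* Stand-in for one bus_dma call: MODEL_WAITOK means it may sleep. */
static void
model_dma_call(struct model_cpu *cpu, enum model_flag flag)
{
	if (flag == MODEL_WAITOK && cpu->spin_held > 0)
		cpu->slept_in_spin = true;
}

/* Stand-in for yds_allocmem(): four allocation steps in a row. */
static void
model_allocmem(struct model_cpu *cpu, enum model_flag flag)
{
	for (int step = 0; step < 4; step++)
		model_dma_call(cpu, flag);
}

/* Mirrors the reported path: take the spin mutex, then allocate. */
static bool
model_resume_path(enum model_flag flag)
{
	struct model_cpu cpu = { 0, false };

	cpu.spin_held++;            /* models acquiring the spin mutex */
	model_allocmem(&cpu, flag);
	cpu.spin_held--;            /* models releasing it */
	assert(cpu.spin_held == 0);
	return cpu.slept_in_spin;
}
```

In this model, `model_resume_path(MODEL_WAITOK)` reports a violation and `model_resume_path(MODEL_NOWAIT)` does not.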
>How-To-Repeat:

>Fix:
A possible fix for this bug is to replace "BUS_DMA_WAITOK" with "BUS_DMA_NOWAIT".

>Audit-Trail:
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc: nat@netbsd.org
Subject: Re: kern/52331: ydc driver: sleep-under-spin-mutex bugs in yds_allocmem
Date: Sun, 25 Jun 2017 19:53:31 +0700

     Date:        Sun, 25 Jun 2017 08:45:00 +0000 (UTC)
     From:        baijiaju1990@163.com
     Message-ID:  <20170625084500.68FE47A2B0@mollari.NetBSD.org>

 While your analysis tool seems good at finding code worth reviewing, I
 am not sure your review of the code to determine if there is a bug or
 not in this case is quite up to it.

   | The driver may sleep in interrupt, and the function call path in file "sys/dev/pci/yds.c" in NetBSD-7.1 release is:
   | yds_resume [acquire the spin mutex]
   |   yds_init
   |     yds_allocate_slots
   |       yds_allocmem
   |         bus_dmamem_alloc(BUS_DMA_WAITOK) --> may sleep
   |         bus_dmamem_map(BUS_DMA_WAITOK) --> may sleep
   |         bus_dmamem_create(BUS_DMA_WAITOK) --> may sleep
   |         bus_dmamem_load(BUS_DMA_WAITOK) --> may sleep
   | 

 First:

   | The possible fix of this bug is to replace "BUS_DMA_WAITOK" with
   | "BUS_DMA_NOWAIT".

 while that would avoid a potential sleep it would not actually work (if
 the sleep was ever necessary) as then the resources would not be allocated.
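
The trade-off described here can be sketched in a small userland model. The names (`model_alloc`, `struct model_pool`) are hypothetical and this is not the bus_dma implementation. It only illustrates that a no-wait request fails immediately where a waiting request would have slept, so the caller ends up without the resource.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland model; not the bus_dma implementation. */
enum model_flag { MODEL_WAITOK, MODEL_NOWAIT };

struct model_pool {
	bool busy;   /* true when the resource is not immediately available */
	int  sleeps; /* how many times a caller would have had to sleep */
};

static char model_backing[16];

static int
model_alloc(struct model_pool *pool, enum model_flag flag, void **out)
{
	assert(pool != NULL && out != NULL);
	if (pool->busy) {
		if (flag == MODEL_NOWAIT)
			return ENOMEM;  /* no sleep, but nothing allocated */
		pool->sleeps++;         /* a real caller would block here */
		pool->busy = false;     /* pretend the resource was freed */
	}
	*out = model_backing;
	return 0;
}
```

With a busy pool, the `MODEL_NOWAIT` request returns `ENOMEM` and leaves the output pointer untouched, while the `MODEL_WAITOK` request succeeds after one recorded sleep.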

 When yds_resume() calls yds_init() the driver must have already been
 initialised: yds_init() is first called in yds_attach(), and if
 it fails, the attach also fails. In that case the code never
 reaches the point (right at the end of yds_attach()) which
 establishes yds_resume as the "switch back on" power handler.

 So yds_resume() cannot be called unless yds_init() has succeeded.

 One of the events that makes yds_init() fail is if yds_allocate_slots() fails.

 yds_allocate_slots() only calls yds_allocmem() if KERNADDR(p) is NULL,
 where p = &sc->sc_ctrldata; (KERNADDR is p->addr)

 If that happens, that is, if yds_allocmem() is called, yds_allocate_slots()
 fails if yds_allocmem() fails - once again, if that happens in the call
 that comes from yds_init() from yds_attach() the attach fails, and yds_resume
 can never be called.

 yds_allocmem() does call bus_dmamem_alloc() (etc) as your PR revealed,
 but remember it is only called if p->addr == NULL.

 The second bus_dma*() call in yds_allocmem() is

 	        error = bus_dmamem_map(sc->sc_dmatag, p->segs, p->nsegs, p->size,
                                &p->addr, BUS_DMA_WAITOK|BUS_DMA_COHERENT);

 That sets p->addr (unless it fails, in which case yds_allocmem returns the
 resources it has already claimed, and fails, and when that happens,
 yds_allocate_slots() also fails, which causes yds_init() to fail, which
 causes yds_attach() to fail, and yds_resume can never be called).

 So we know that for yds_resume to be called, the yds_init() in yds_attach()
 must have succeeded, which means that yds_allocate_slots() succeeded, which
 means that yds_allocmem() succeeded, which means that p->addr != NULL when
 yds_attach() is finished with the yds_init() call.

 Any later call of yds_allocate_slots() will find p->addr != NULL, and never
 call yds_allocmem() again (or not until yds_freemem() called from yds_free()
 has returned it all - that is only called from audio.c, but it will take
 someone more familiar with the code than I can ever be to know whether that
 is possible in a situation where the power management resume function might
 still later be called.)

 But what's more, when yds_freemem() actually releases the resources
 identified by p->addr (and the others allocated by yds_allocmem()) it
 never bothers to set the pointer(s) back to NULL, so even if it were
 possible that yds_free() might be called from audio.c, and the power
 handler resume function called later, I still don't see how yds_allocmem()
 can ever be called again.
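
The argument above can be traced with a small userland model of the control flow. It is a sketch under assumptions: the `model_*` names and the counter are hypothetical, allocation always succeeds, and only the NULL check and the "free does not reset the pointer" behaviour are taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland model of the control flow; not driver code. */
struct model_dma {
	void *addr; /* plays the role of p->addr, i.e. KERNADDR(p) */
};

struct model_softc {
	struct model_dma ctrldata;
	int  allocmem_calls;    /* how often the allocator was entered */
	bool resume_registered; /* set only when attach succeeds */
};

static char model_backing[64];

static int
model_allocmem(struct model_softc *sc, struct model_dma *p)
{
	sc->allocmem_calls++;
	p->addr = model_backing; /* success leaves addr non-NULL */
	return 0;
}

/* As described above, freeing does not reset the pointer to NULL. */
static void
model_freemem(struct model_dma *p)
{
	(void)p; /* resources released; p->addr left stale on purpose */
}

static int
model_allocate_slots(struct model_softc *sc)
{
	struct model_dma *p = &sc->ctrldata;

	/* The allocator is entered only while addr is still NULL. */
	if (p->addr == NULL && model_allocmem(sc, p) != 0)
		return 1;
	return 0;
}

static int
model_init(struct model_softc *sc)
{
	return model_allocate_slots(sc);
}

static int
model_attach(struct model_softc *sc)
{
	if (model_init(sc) != 0)
		return 1; /* attach fails, resume is never registered */
	sc->resume_registered = true;
	return 0;
}

static int
model_resume(struct model_softc *sc)
{
	assert(sc->resume_registered); /* only reachable after attach */
	return model_init(sc);
}
```

Calling `model_attach()` once and `model_resume()` any number of times, even after `model_freemem()`, leaves `allocmem_calls` at 1 in this model.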

 I have cc'd Nathanial Sloss <nat@netbsd.org> on this reply - Nat, do you
 want this PR, or can we just assume that the bug reported is not in fact
 possible, and close it?

 kre

From: Robert Elz <kre@munnari.OZ.AU>
To: Jia-Ju Bai <baijiaju1990@163.com>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org
Subject: Re: kern/52331: ydc driver: sleep-under-spin-mutex bugs in yds_allocmem
Date: Sun, 25 Jun 2017 21:08:33 +0700

     Date:        Sun, 25 Jun 2017 21:14:19 +0800
     From:        Jia-Ju Bai <baijiaju1990@163.com>
     Message-ID:  <594FB72B.4050202@163.com>

   | From your words, I can see that "if (KERNADDR(p) == NULL)" is always 
   | not satisfied at runtime, is it?

 That happens (KERNADDR(p) == NULL) when yds_init() is called the first
 time from yds_attach().  That time no resources are yet allocated, and
 yds_allocmem() needs to allocate them.

 But that call is made from a context where sleeping is allowed.

   | If it is, I think it is okay to remove the code (including yds_allocmem) 
   | of this if condition.

 So, no, the code is still needed. It is called, just not in the case of
 concern (I think; we need to wait and see what nat@ says about just
 how audio.c might influence things).

 kre

From: Jia-Ju Bai <baijiaju1990@163.com>
To: Robert Elz <kre@munnari.OZ.AU>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, 
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/52331: ydc driver: sleep-under-spin-mutex bugs in yds_allocmem
Date: Sun, 25 Jun 2017 21:14:19 +0800

 Thanks for your reply and detailed analysis :)

 From what you wrote, it seems that "if (KERNADDR(p) == NULL)" is never
 satisfied at runtime. Is that right?
 If so, I think it is okay to remove the code guarded by this condition
 (including yds_allocmem).

 Thanks,
 Jia-Ju Bai

From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52331 CVS commit: src/sys/dev/pci
Date: Sun, 25 Jun 2017 12:07:48 -0400

 Module Name:	src
 Committed By:	christos
 Date:		Sun Jun 25 16:07:48 UTC 2017

 Modified Files:
 	src/sys/dev/pci: yds.c ydsvar.h

 Log Message:
 PR/52331: ydc driver: sleep-under-spin-mutex bugs in yds_allocmem
 Don't hold the spin interrupt mutex while calling yds_init from resume.
 Instead use a flag to short-circuit the interrupt while disabled.


 To generate a diff of this commit:
 cvs rdiff -u -r1.58 -r1.59 src/sys/dev/pci/yds.c
 cvs rdiff -u -r1.11 -r1.12 src/sys/dev/pci/ydsvar.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: christos@zoulas.com (Christos Zoulas)
To: Jia-Ju Bai <baijiaju1990@163.com>, Robert Elz <kre@munnari.OZ.AU>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/52331: ydc driver: sleep-under-spin-mutex bugs in yds_allocmem
Date: Sun, 25 Jun 2017 12:13:30 -0400

 On Jun 25,  9:14pm, baijiaju1990@163.com (Jia-Ju Bai) wrote:
 -- Subject: Re: kern/52331: ydc driver: sleep-under-spin-mutex bugs in yds_al

 | Thanks for your reply and detailed analysis :)
 | 
 |  From your words, I can see that "if (KERNADDR(p) == NULL)" is always 
 | not satisfied at runtime, is it?
 | If it is, I think it is okay to remove the code (including yds_allocmem) 
 | of this if condition.

 I fixed it a bit differently before reading this message. Since the spin
 mutex is only used to protect interrupts, and the init function is not
 called with the mutex held during attach, I changed resume to call init
 without the spin mutex held, and protected the interrupt handler with
 sc_enabled.
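
The approach described here might look roughly like the following userland model. It is a sketch, not the committed diff: apart from sc_enabled, every name is hypothetical, a counter stands in for the spin mutex, and whether the real code updates the flag with the mutex held is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland model; only sc_enabled is named in the thread. */
struct model_softc {
	int  spin_held;        /* count of spin mutexes currently held */
	bool sc_enabled;       /* false while the device is re-initialised */
	bool slept_in_spin;    /* set if init ran with the mutex held */
	int  intr_handled;     /* interrupts that were actually serviced */
	int  intr_during_init; /* result of an interrupt arriving mid-init */
};

/* Interrupt handler: takes the spin mutex, bails out while disabled. */
static int
model_intr(struct model_softc *sc)
{
	int handled = 0;

	sc->spin_held++;
	if (sc->sc_enabled) {
		sc->intr_handled++;
		handled = 1;
	}
	sc->spin_held--;
	return handled;
}

/* Init may sleep, so it must not run with the spin mutex held. */
static void
model_init(struct model_softc *sc)
{
	if (sc->spin_held > 0)
		sc->slept_in_spin = true;
	/* Simulate an interrupt firing while init is in progress. */
	sc->intr_during_init = model_intr(sc);
}

static void
model_resume(struct model_softc *sc)
{
	sc->sc_enabled = false; /* interrupts are short-circuited from here */
	model_init(sc);         /* called without the spin mutex */
	sc->sc_enabled = true;
	assert(sc->spin_held == 0);
}
```

In this model an interrupt that arrives during `model_init()` returns 0 without touching the device state, and `model_init()` never runs with the modelled spin mutex held.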

 christos
