NetBSD Problem Report #57783

From www@netbsd.org  Mon Dec 18 21:35:13 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 92A7F1A9238
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 18 Dec 2023 21:35:13 +0000 (UTC)
Message-Id: <20231218213442.163911A923C@mollari.NetBSD.org>
Date: Mon, 18 Dec 2023 21:34:42 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: usbd_set_polling calls ubm_softint with polling enabled but bus lock held
X-Send-Pr-Version: www-1.0

>Number:         57783
>Category:       kern
>Synopsis:       usbd_set_polling calls ubm_softint with polling enabled but bus lock held
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Dec 18 21:40:00 +0000 2023
>Closed-Date:    Wed Mar 13 02:46:37 +0000 2024
>Last-Modified:  Wed Mar 13 02:46:37 +0000 2024
>Originator:     Taylor R Campbell
>Release:        current, 10, 9, 8
>Organization:
The NetBSD Pollingation
>Environment:
>Description:
[ 61805.181248] fatal breakpoint trap in supervisor mode
[ 61805.181248] trap type 1 code 0 rip 0xffffffff80235415 cs 0x8 rflags 0x202 cr2 0x78d8318c6000 ilevel 0x6 rsp 0xffff949093d1aab0
[ 61805.181248] curlwp 0xffff8a75ebc1dc00 pid 0.5 lowest kstack 0xffff949093d162c0
[ 61805.181248] Mutex error: mutex_vector_enter,551: locking against myself

[ 61805.181248] lock address : ffff8a66f5e3a9c8
[ 61805.181248] current cpu  :                  0
[ 61805.181248] current lwp  : 0xffff8a75ebc1dc00
[ 61805.181248] owner field  : 0xffff8a75ebc1dc00 wait/spin:                0/0

[ 61805.181248] panic: lock error: Mutex: mutex_vector_enter,551: locking against myself: lock 0xffff8a66f5e3a9c8 cpu 0 lwp 0xffff8a75ebc1dc00
[ 61805.181248] cpu0: Begin traceback...
[ 61805.181248] vpanic() at netbsd:vpanic+0x171
[ 61805.181248] panic() at netbsd:panic+0x3c
[ 61805.181248] lockdebug_abort() at netbsd:lockdebug_abort+0x114
[ 61805.181248] mutex_vector_enter() at netbsd:mutex_vector_enter+0x381
[ 61805.181248] usbd_transfer() at netbsd:usbd_transfer+0xdb
[ 61805.181248] ucomsubmitread() at netbsd:ucomsubmitread+0x54
[ 61805.181248] ucom_read_complete() at netbsd:ucom_read_complete+0x105
[ 61805.181248] ucomreadcb() at netbsd:ucomreadcb+0x392
[ 61805.181248] usb_transfer_complete() at netbsd:usb_transfer_complete+0x536
[ 61805.181248] xhci_softintr() at netbsd:xhci_softintr+0xef8
[ 61805.181248] usbd_set_polling() at netbsd:usbd_set_polling+0x3f
[ 61805.181248] ukbd_cnpollc() at netbsd:ukbd_cnpollc+0x6c
[ 61805.181248] cnpollc() at netbsd:cnpollc+0x71
[ 61805.181248] kdb_trap() at netbsd:kdb_trap+0xfd
[ 61805.181248] trap() at netbsd:trap+0x1bc
[ 61805.181248] --- trap (number 1) ---
[ 61805.181248] breakpoint() at netbsd:breakpoint+0x5
[ 61805.181248] wskbd_translate() at netbsd:wskbd_translate+0xf22
[ 61805.181248] wskbd_input() at netbsd:wskbd_input+0xbe
[ 61805.181248] ukbd_decode() at netbsd:ukbd_decode+0x27e
[ 61805.181248] ukbd_delayed_decode() at netbsd:ukbd_delayed_decode+0x3e
[ 61805.181248] callout_softclock() at netbsd:callout_softclock+0x490
[ 61805.181248] softint_dispatch() at netbsd:softint_dispatch+0x100
[ 61805.181248] cpu0: End traceback...

What happened is:

1. usbd_set_polling incremented bus->ub_usepolling
2. usbd_set_polling acquired bus->ub_lock and called xhci_softintr
3. xhci_softintr processed the queue and called xfer callbacks _without_ releasing the lock because bus->ub_usepolling is nonzero
4. ucomreadcb called usbd_transfer
5. usbd_transfer unconditionally tried mutex_enter(bus->ub_lock) via usbd_lock_pipe and crashed because the lock was already held from(2)

There are two problems:

(a) xhci_softintr is calling xfer callbacks with the lock held, which it should never do
(b) usbd_transfer is taking the lock while polling is enabled and maybe it shouldn't do that
>How-To-Repeat:
1. Attach USB serial port and do I/O on it in background.
2. Type C-A-ESC at USB keyboard to trigger ddb.
>Fix:
1. Change usbd_set_polling sequence from:

        if (on)
                bus->ub_usepolling++;
        else
                bus->ub_usepolling--;

        mutex_enter(ub_lock);
        bus->ub_methods->ubm_softint(bus);
        mutex_exit(ub_lock);

   to:

        mutex_enter(ub_lock);
        if (on) {
                if (bus->ub_usepolling == 0)
                        bus->ub_methods->ubm_softint(bus);
                bus->ub_usepolling++;
        } else {
                bus->ub_usepolling--;
                if (bus->ub_usepolling == 0)
                        bus->ub_methods->ubm_softint(bus);
        }
        mutex_exit(ub_lock);

   This way we we avoid calling ubm_softint with the lock held but polling enabled.

2. Maybe make usbd_lock_pipe or usbd_transfer conditional on bus->ub_usepolling.  This way it is safe to invoke when polling is enabled.  Unclear if this is sensible or not -- need to think more about it.

>Release-Note:

>Audit-Trail:

State-Changed-From-To: open->analyzed
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Mon, 18 Dec 2023 22:30:22 +0000
State-Changed-Why:
problem analyzed, candidate fix posted


From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: netbsd-bugs@NetBSD.org, mrg@NetBSD.org, skrll@NetBSD.org
Subject: Re: kern/57783: usbd_set_polling calls ubm_softint with polling enabled but bus lock held
Date: Mon, 18 Dec 2023 22:29:09 +0000

 This is a multi-part message in MIME format.
 --=_nsoQjY3CD7bBvFEdm/FiyAa+axP8majm

 The attached patch series attempts to address this issue -- untested.
 Each patch should independently resolve this specific panic, but I
 think they may both generally be needed.

 --=_nsoQjY3CD7bBvFEdm/FiyAa+axP8majm
 Content-Type: text/plain; charset="ISO-8859-1"; name="pr57783-usbpollingsoftintxfer"
 Content-Transfer-Encoding: quoted-printable
 Content-Disposition: attachment; filename="pr57783-usbpollingsoftintxfer.patch"

 From db0cfab258b4fd1a691082195a2541988f11ab52 Mon Sep 17 00:00:00 2001
 From: Taylor R Campbell <riastradh@NetBSD.org>
 Date: Mon, 18 Dec 2023 22:17:13 +0000
 Subject: [PATCH 1/2] usbdi(9): Avoid calling ubm_softint with lock held and
  polling on.

 PR kern/57783

 XXX pullup-10
 XXX pullup-9
 XXX pullup-8
 ---
  sys/dev/usb/usbdi.c | 32 ++++++++++++++++++++++++++------
  1 file changed, 26 insertions(+), 6 deletions(-)

 diff --git a/sys/dev/usb/usbdi.c b/sys/dev/usb/usbdi.c
 index fac06b94df44..125ef3b6ba63 100644
 --- a/sys/dev/usb/usbdi.c
 +++ b/sys/dev/usb/usbdi.c
 @@ -1362,14 +1362,34 @@ usbd_dopoll(struct usbd_interface *iface)
  void
  usbd_set_polling(struct usbd_device *dev, int on)
  {
 -	if (on)
 -		dev->ud_bus->ub_usepolling++;
 -	else
 -		dev->ud_bus->ub_usepolling--;
 =20
 -	/* Kick the host controller when switching modes */
  	mutex_enter(dev->ud_bus->ub_lock);
 -	dev->ud_bus->ub_methods->ubm_softint(dev->ud_bus);
 +	if (on) {
 +		/*
 +		 * Enabling polling.  If we're enabling for the first
 +		 * time, call the softint routine on transition while
 +		 * we hold the lock and polling is still disabled, and
 +		 * then enable polling -- once polling is enabled, we
 +		 * must not hold the lock when we call the softint
 +		 * routine.
 +		 */
 +		KASSERT(dev->ud_bus->ub_usepolling < __type_max(char));
 +		if (dev->ud_bus->ub_usepolling =3D=3D 0)
 +			dev->ud_bus->ub_methods->ubm_softint(dev->ud_bus);
 +		dev->ud_bus->ub_usepolling++;
 +	} else {
 +		/*
 +		 * Disabling polling.  If we're disabling polling for
 +		 * the last time, disable polling first and then call
 +		 * the softint routine while we hold the lock -- until
 +		 * polling is disabled, we must not hold the lock when
 +		 * we call the softint routine.
 +		 */
 +		KASSERT(dev->ud_bus->ub_usepolling > 0);
 +		dev->ud_bus->ub_usepolling--;
 +		if (dev->ud_bus->ub_usepolling =3D=3D 0)
 +			dev->ud_bus->ub_methods->ubm_softint(dev->ud_bus);
 +	}
  	mutex_exit(dev->ud_bus->ub_lock);
  }
 =20

 From c8553615bd0ddac1eab7133e0c334a76f3482eb8 Mon Sep 17 00:00:00 2001
 From: Taylor R Campbell <riastradh@NetBSD.org>
 Date: Mon, 18 Dec 2023 22:19:13 +0000
 Subject: [PATCH 2/2] usbdi(9): Avoid taking locks in usbd_transfer while
  polling.

 PR kern/57783

 XXX pullup-10
 XXX pullup-9
 XXX pullup-8
 ---
  sys/dev/usb/usbdi.c | 23 ++++++++++++++++-------
  1 file changed, 16 insertions(+), 7 deletions(-)

 diff --git a/sys/dev/usb/usbdi.c b/sys/dev/usb/usbdi.c
 index 125ef3b6ba63..80e432a22bc2 100644
 --- a/sys/dev/usb/usbdi.c
 +++ b/sys/dev/usb/usbdi.c
 @@ -410,14 +410,18 @@ usbd_transfer(struct usbd_xfer *xfer)
  		}
  	}
 =20
 -	usbd_lock_pipe(pipe);
 +	if (pipe->up_dev->ud_bus->ub_usepolling =3D=3D 0)
 +		usbd_lock_pipe(pipe);
  	if (pipe->up_aborting) {
  		/*
  		 * XXX For synchronous transfers this is fine.  What to
  		 * do for asynchronous transfers?  The callback is
  		 * never run, not even with status USBD_CANCELLED.
 +		 *
 +		 * XXX Does it make sense to abort while polling?
  		 */
 -		usbd_unlock_pipe(pipe);
 +		if (pipe->up_dev->ud_bus->ub_usepolling =3D=3D 0)
 +			usbd_unlock_pipe(pipe);
  		USBHIST_LOG(usbdebug, "<- done xfer %#jx, aborting",
  		    (uintptr_t)xfer, 0, 0, 0);
  		SDT_PROBE2(usb, device, xfer, done,  xfer, USBD_CANCELLED);
 @@ -443,7 +447,8 @@ usbd_transfer(struct usbd_xfer *xfer)
  	} while (0);
  	SDT_PROBE3(usb, device, pipe, transfer__done,  pipe, xfer, err);
 =20
 -	usbd_unlock_pipe(pipe);
 +	if (pipe->up_dev->ud_bus->ub_usepolling =3D=3D 0)
 +		usbd_unlock_pipe(pipe);
 =20
  	if (err !=3D USBD_IN_PROGRESS && err) {
  		/*
 @@ -453,7 +458,8 @@ usbd_transfer(struct usbd_xfer *xfer)
  		 */
  		USBHIST_LOG(usbdebug, "xfer failed: %jd, reinserting",
  		    err, 0, 0, 0);
 -		usbd_lock_pipe(pipe);
 +		if (pipe->up_dev->ud_bus->ub_usepolling =3D=3D 0)
 +			usbd_lock_pipe(pipe);
  		SDT_PROBE1(usb, device, xfer, preabort,  xfer);
  #ifdef DIAGNOSTIC
  		xfer->ux_state =3D XFER_BUSY;
 @@ -461,7 +467,8 @@ usbd_transfer(struct usbd_xfer *xfer)
  		SIMPLEQ_REMOVE_HEAD(&pipe->up_queue, ux_next);
  		if (pipe->up_serialise)
  			usbd_start_next(pipe);
 -		usbd_unlock_pipe(pipe);
 +		if (pipe->up_dev->ud_bus->ub_usepolling =3D=3D 0)
 +			usbd_unlock_pipe(pipe);
  	}
 =20
  	if (!(flags & USBD_SYNCHRONOUS)) {
 @@ -480,7 +487,8 @@ usbd_transfer(struct usbd_xfer *xfer)
  	}
 =20
  	/* Sync transfer, wait for completion. */
 -	usbd_lock_pipe(pipe);
 +	if (pipe->up_dev->ud_bus->ub_usepolling =3D=3D 0)
 +		usbd_lock_pipe(pipe);
  	while (!xfer->ux_done) {
  		if (pipe->up_dev->ud_bus->ub_usepolling)
  			panic("usbd_transfer: not done");
 @@ -503,7 +511,8 @@ usbd_transfer(struct usbd_xfer *xfer)
  	}
  	err =3D xfer->ux_status;
  	SDT_PROBE2(usb, device, xfer, done,  xfer, err);
 -	usbd_unlock_pipe(pipe);
 +	if (pipe->up_dev->ud_bus->ub_usepolling =3D=3D 0)
 +		usbd_unlock_pipe(pipe);
  	return err;
  }
 =20

 --=_nsoQjY3CD7bBvFEdm/FiyAa+axP8majm--

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57783 CVS commit: src/sys/dev/usb
Date: Sat, 13 Jan 2024 12:27:54 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Jan 13 12:27:54 UTC 2024

 Modified Files:
 	src/sys/dev/usb: usbdi.c

 Log Message:
 usbdi(9): Avoid calling ubm_softint with lock held and polling on.

 PR kern/57783

 XXX pullup-10
 XXX pullup-9
 XXX pullup-8


 To generate a diff of this commit:
 cvs rdiff -u -r1.247 -r1.248 src/sys/dev/usb/usbdi.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57783 CVS commit: src/sys/dev/usb
Date: Sat, 13 Jan 2024 12:42:10 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Jan 13 12:42:10 UTC 2024

 Modified Files:
 	src/sys/dev/usb: usbdi.c

 Log Message:
 usbdi(9): Avoid taking locks in usbd_transfer while polling.

 PR kern/57783

 XXX pullup-10
 XXX pullup-9
 XXX pullup-8


 To generate a diff of this commit:
 cvs rdiff -u -r1.248 -r1.249 src/sys/dev/usb/usbdi.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: analyzed->needs-pullups
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Sat, 13 Jan 2024 18:25:11 +0000
State-Changed-Why:
fix committed to HEAD, needs pullups to 10, 9, 8


State-Changed-From-To: needs-pullups->pending-pullups
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Fri, 01 Mar 2024 15:11:22 +0000
State-Changed-Why:
pullup-10 #613
pullup-9 #1807 (usbd_set_polling change only)
pullup-8 #1940 (usbd_set_polling change only)

(The usbd_transfer change is painful to apply before 10 because of
other changes to it in usbdi renovation.)


From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57783 CVS commit: [netbsd-10] src/sys/dev/usb
Date: Sun, 10 Mar 2024 18:36:13 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Sun Mar 10 18:36:12 UTC 2024

 Modified Files:
 	src/sys/dev/usb [netbsd-10]: usbdi.c

 Log Message:
 Pull up following revision(s) (requested by riastradh in ticket #613):

 	sys/dev/usb/usbdi.c: revision 1.248
 	sys/dev/usb/usbdi.c: revision 1.249

 usbdi(9): Avoid calling ubm_softint with lock held and polling on.
 usbdi(9): Avoid taking locks in usbd_transfer while polling.

 PR kern/57783


 To generate a diff of this commit:
 cvs rdiff -u -r1.247 -r1.247.4.1 src/sys/dev/usb/usbdi.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57783 CVS commit: [netbsd-9] src/sys/dev/usb
Date: Tue, 12 Mar 2024 12:34:50 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Tue Mar 12 12:34:50 UTC 2024

 Modified Files:
 	src/sys/dev/usb [netbsd-9]: usbdi.c

 Log Message:
 Pull up following revision(s) (requested by riastradh in ticket #1807):

 	sys/dev/usb/usbdi.c: revision 1.248

 usbdi(9): Avoid calling ubm_softint with lock held and polling on.

 PR kern/57783


 To generate a diff of this commit:
 cvs rdiff -u -r1.182.4.5 -r1.182.4.6 src/sys/dev/usb/usbdi.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57783 CVS commit: [netbsd-8] src/sys/dev/usb
Date: Tue, 12 Mar 2024 12:36:17 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Tue Mar 12 12:36:17 UTC 2024

 Modified Files:
 	src/sys/dev/usb [netbsd-8]: usbdi.c

 Log Message:
 Pull up following revision(s) (requested by riastradh in ticket #1940):

 	sys/dev/usb/usbdi.c: revision 1.248

 usbdi(9): Avoid calling ubm_softint with lock held and polling on.

 PR kern/57783


 To generate a diff of this commit:
 cvs rdiff -u -r1.173.2.5 -r1.173.2.6 src/sys/dev/usb/usbdi.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: pending-pullups->closed
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Wed, 13 Mar 2024 02:46:37 +0000
State-Changed-Why:
fixed and pulled up


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2024 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.