NetBSD Problem Report #53019
From paul@whooppee.com Mon Feb 12 23:44:02 2018
Return-Path: <paul@whooppee.com>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 6FBC57A1EA
for <gnats-bugs@gnats.NetBSD.org>; Mon, 12 Feb 2018 23:44:02 +0000 (UTC)
Message-Id: <20180212234359.66D6316E44@speedy.whooppee.com>
Date: Tue, 13 Feb 2018 07:43:59 +0800 (+08)
From: paul@whooppee.com
Reply-To: paul@whooppee.com
To: gnats-bugs@NetBSD.org
Subject: xhci-connected keyboard with LOCKDEBUG kernel causes panic
X-Send-Pr-Version: 3.95
>Number: 53019
>Category: kern
>Synopsis: xhci-connected keyboard with LOCKDEBUG kernel causes panic
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Feb 12 23:45:00 +0000 2018
>Closed-Date: Wed Jun 09 16:10:29 +0000 2021
>Last-Modified: Wed Jun 09 16:10:29 +0000 2021
>Originator: Paul Goyette
>Release: NetBSD 8.99.12
>Organization:
+------------------+--------------------------+----------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:          |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+------------------+--------------------------+----------------------------+
>Environment:
System: NetBSD speedy.whooppee.com 8.99.12 NetBSD 8.99.12 (SPEEDY 2018-02-12 00:00:12 UTC) #3: Mon Feb 12 06:57:12 UTC 2018 paul@speedy.whooppee.com:/build/netbsd-local/obj/amd64/sys/arch/amd64/compile/SPEEDY amd64
Architecture: x86_64
Machine: amd64
>Description:
With a LOCKDEBUG kernel and a USB keyboard attached via an xhci USB 3 port,
typing a character at the ddb(4) prompt causes a kernel panic.
It appears that the xhci_device_intr_start() code is trying to obtain
a spin mutex while another spin mutex is already held (perhaps in the
xhci_poll() routine?).
Here's the console output from the LOCKDEBUG panic - all transcribed by
hand, but hopefully without too many typos!
Mutex error: mutex_vector_enter,523: spin lock held
lock address: 0xffffe410e9d1d9a0 type: spin
initialized: 0xffffffff802bac06
shared holds: 0 exclusive: 1
shares wanted: 0 exclusive: 0
current CPU: 11 last held: 11
curlwp: 0xffffe41fc09ad2c0 last held: 0xffffe41fc09ad2c0
last locked*: 0xffffffff802b81de unlocked: 0xffffffff80291179
owner field: 0x0000000000010600 wait/spin: 0/1
panic: LOCKDEBUG: Mutex error: mutex_vector_enter,523: spin lock held
And the backtrace is
vpanic+0x140
snprintf
lockdebug_more
mutex_enter+0x69d
xhci_device_intr_start+0x125
usbd_start_next+0x65
xhci_soft_intr+0x49b
xhci_poll+0x37
ukbd_cngetc+0x19
cngetc+0x34
db_readline+0x65
db_read_line+0x15
db_command_loop+0x84
db_trap+0xe3
kdb_trap+0xe2
trap (number 4)
(This is then followed by the original backtrace which caused ddb(4)
to be entered in the first place.)
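
The failure mode suggested by the report above can be sketched in user-space
C. This is only an illustration of the hypothesis, not the xhci(4) code: every
toy_* name is hypothetical, the lock is a plain struct rather than a
kmutex_t, and it is an assumption (the report itself says "perhaps") that
the outer holder is the polling path. The LOCKDEBUG output shows the same
CPU (11) and the same LWP as both current and last holder, which is the
case modelled here: a non-recursive spin lock re-entered by its own holder.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive spin mutex. */
struct toy_spin {
	bool held;	/* true while some CPU owns the lock */
	int owner_cpu;	/* owning CPU; meaningful only while held */
};

/*
 * Model of a LOCKDEBUG-style check: report "spin lock held" (false)
 * when the calling CPU already owns the lock, instead of spinning
 * forever.  The model is single-threaded, so no other CPU contends.
 */
static bool
toy_spin_enter(struct toy_spin *m, int cpu)
{
	if (m->held && m->owner_cpu == cpu)
		return false;		/* self-deadlock detected */
	assert(!m->held);
	m->held = true;
	m->owner_cpu = cpu;
	return true;
}

static void
toy_spin_exit(struct toy_spin *m, int cpu)
{
	assert(m->held && m->owner_cpu == cpu);
	m->held = false;
}

/* Inner path: takes the lock itself (assumed role of the intr-start code). */
static bool
toy_intr_start(struct toy_spin *m, int cpu)
{
	if (!toy_spin_enter(m, cpu))
		return false;
	toy_spin_exit(m, cpu);
	return true;
}

/*
 * Outer path (assumed role of the polling code).  With hold_lock set it
 * keeps the lock across the call into the inner path, which then fails.
 */
static bool
toy_poll(struct toy_spin *m, int cpu, bool hold_lock)
{
	bool ok;

	if (hold_lock && !toy_spin_enter(m, cpu))
		return false;
	ok = toy_intr_start(m, cpu);
	if (hold_lock)
		toy_spin_exit(m, cpu);
	return ok;
}
```

Calling toy_poll() with hold_lock set reports the re-entry. Calling it with
hold_lock clear succeeds, which corresponds to one possible shape of a fix:
the inner path must not take a lock its caller already holds.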
>How-To-Repeat:
See above. Boot a LOCKDEBUG kernel and enter ddb(4) (I got there via some
pre-existing bug; I have not tried entering via the cnmagic key combination).
Type a character and watch it go boom.
>Fix:
>Release-Note:
>Audit-Trail:
From: David Holland <dholland-gnats@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/53019: xhci-connected keyboard with LOCKDEBUG kernel causes panic
Date: Mon, 12 Mar 2018 00:18:31 +0000
Not sent to gnats (gnats@ is the administrator address; use gnats-bugs@)
------
From: Paul Goyette <paul@whooppee.com>
To: gnats@netbsd.org
Subject: Re: kern/53019
Date: Tue, 13 Feb 2018 16:05:34 +0800 (+08)
# addr2line -e /netbsd.gdb 0xffffffff802bac06
/build/netbsd-local/src_ro/sys/dev/usb/xhci.c:1154
#
It looks like the lock is initialized in xhci_init(), right before the erst
variable is set. So it is likely sc->sc_intr_lock ...
From: "David H. Gutteridge" <david@gutteridge.ca>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/53019
Date: Tue, 18 Sep 2018 22:09:49 -0400
Hi,
I've filed what I believe is a related bug as kern/52944. mrg@ has
made some changes in -current that may fix this and asked for a
re-test, so I thought I'd mention that here, too.
Regards,
Dave
From: Paul Goyette <paul@whooppee.com>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/53019
Date: Wed, 19 Sep 2018 16:13:21 +0800 (+08)
I attempted to retest kern/53019, but unfortunately a -current kernel
does not work on my hardware setup. I suspect the problem is related to
my video card (GTX 1050-Ti).
State-Changed-From-To: open->closed
State-Changed-By: pgoyette@NetBSD.org
State-Changed-When: Wed, 09 Jun 2021 16:10:29 +0000
State-Changed-Why:
This seems to have been fixed by one or more of the somewhat-recent
changes to the xhci code. At least, I can no longer reproduce it.
>Unformatted:
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.