NetBSD Problem Report #56099

From reinoud@13thmonkey.org  Thu Apr  8 20:50:48 2021
Return-Path: <reinoud@13thmonkey.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 24C1A1A923E
	for <gnats-bugs@gnats.NetBSD.org>; Thu,  8 Apr 2021 20:50:48 +0000 (UTC)
Message-Id: <20210408205044.18D6AC1E908@dropje.13thmonkey.org>
Date: Thu,  8 Apr 2021 22:50:44 +0200 (CEST)
From: reinoud@NetBSD.org
Reply-To: reinoud@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: USB xHCI panic when using microphone
X-Send-Pr-Version: 3.95

>Number:         56099
>Category:       kern
>Synopsis:       USB xHCI panic when using microphone
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Apr 08 20:55:00 +0000 2021
>Last-Modified:  Wed Apr 28 21:20:00 +0000 2021
>Originator:     Reinoud Zandijk
>Release:        NetBSD 9.99.81
>Organization:
NetBSD

>Environment:


System: NetBSD dropje.13thmonkey.org 9.99.81 NetBSD 9.99.81 (GENERIC) #0: Sat Mar 27 14:24:25 CET 2021 reinoud@diablo.13thmonkey.org:/tmp/obj/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64

uhub1 at usb1: NetBSD (0x0000) xHCI root hub (0x0000), class 9/0, rev 2.00/1.00, addr 0
uhub1: 8 ports with 8 removable, self powered
uaudio0 at uhub1 port 1 configuration 1 interface 0
uaudio0: BEHRINGER (0x1397) C-1U (0x0301), rev 1.10/0.01, addr 1
uaudio0: audio rev 1.00
audio1 at uaudio0: capture
audio1: slinear_le:16 2ch 48000Hz, blk 12000 bytes (62.5ms) for recording


>Description:

The kernel paniced when I used Audacity to record audio from an USB
microphone. Recording can go well for quite some time, up to say half an hour
and then suddenly panic the machine. It can also happen just after a minute of
recording. Idling time seems not to matter.

The last messages in dmesg are:
netbsd: [ 544.4279640] audio_unlink@1 {,#92} pid=3001.3001 po=0 ro=1
netbsd: [ 561.4778564] audio_open@1 dev=audio flags=0x1 po=0 ro=0
netbsd: [ 561.4778564] audio track#- play mode: mulaw 8bit 1ch 8000Hz
netbsd: [ 561.4778564] audio track#93 rec  mode: mulaw 8bit 1ch 8000Hz
netbsd: [ 561.4878565] audio track#93 rec  mode: slinear_le 32bit 1ch 8000Hz
netbsd: [ 561.4878565] audio track#93 rec  mode: slinear_le 16bit 1ch 8000Hz
netbsd: [ 561.4878565] audio track#93 rec  mode: slinear_le 8bit 1ch 8000Hz
netbsd: [ 561.4878565] audio track#93 rec  mode: slinear_le 32bit 1ch 8000Hz
netbsd: [ 561.4878565] audio track#93 rec  mode: slinear_le 32bit 1ch 48000Hz
netbsd: [ 561.4878565] audio track#93 rec  mode: slinear_le 32bit 2ch 48000Hz
netbsd: [ 583.6078931] audio_unlink@1 {,#93} pid=3001.3001 po=0 ro=1
netbsd: [ 583.6078931] panic: kernel diagnostic assertion "xfer == SIMPLEQ_FIRST(&pipe->up_queue)" failed:
	file "/usr/sources/cvs.netbsd.org/src-clean/sys/dev/usb/usbdi.c", line 1035 xfer 0xffff80666b1c2a40 is not start of queue (0xffff80666b1c2b80 is at start)
netbsd: [ 583.6078931] cpu0: Begin traceback...
netbsd: [ 583.6078931] vpanic() at netbsd:vpanic+0x156
netbsd: [ 583.6078931] __x86_indirect_thunk_rax() at netbsd:__x86_indirect_thunk_rax
netbsd: [ 583.6078931] usb_transfer_complete() at netbsd:usb_transfer_complete+0x448
netbsd: [ 583.6078931] xhci_softintr() at netbsd:xhci_softintr+0x597
netbsd: [ 583.6078931] usb_soft_intr() at netbsd:usb_soft_intr+0x25
netbsd: [ 583.6178943] softint_dispatch() at netbsd:softint_dispatch+0xf5
netbsd: [ 583.6178943] DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xffffbf813d78b0f0
netbsd: [ 583.6178943] Xsoftintr() at netbsd:Xsoftintr+0x4f
netbsd: [ 583.6178943] --- interrupt ---
netbsd: [ 583.6178943] 877922af6802fc58:
netbsd: [ 583.6178943] cpu0: End traceback...

The traceback in gdb gives:

(gdb) target kvm netbsd.19.core
0xffffffff802261d5 in cpu_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0) at /usr/sources/cvs.netbsd.org/src-clean/sys/arch/amd64/amd64/machdep.c:713
713                     dumpsys();
(gdb) where
#0  0xffffffff802261d5 in cpu_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0) at
	/usr/sources/cvs.netbsd.org/src-clean/sys/arch/amd64/amd64/machdep.c:713
#1  0xffffffff80cb5511 in kern_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0) at
	/usr/sources/cvs.netbsd.org/src-clean/sys/kern/kern_reboot.c:73
#2  0xffffffff80cf5373 in vpanic (fmt=0xffffffff81396ba0 "kernel %sassertion \"%s\" failed: file \"%s\", line %d xfer %p is not start of queue (%p is at start)", ap=ap@entry=0xffffbf813d78aee8) at
	 /usr/sources/cvs.netbsd.org/src-clean/sys/kern/subr_prf.c:290
#3  0xffffffff80e67eb3 in kern_assert (fmt=fmt@entry=0xffffffff81396ba0 "kernel %sassertion \"%s\" failed: file \"%s\", line %d xfer %p is not start of queue (%p is at start)") at
	 /usr/sources/cvs.netbsd.org/src-clean/sys/lib/libkern/kern_assert.c:51
#4  0xffffffff80465dd3 in usb_transfer_complete (xfer=0xffff80666b1c2a40) at
	/usr/sources/cvs.netbsd.org/src-clean/sys/dev/usb/usbdivar.h:370
#5  0xffffffff8068bc25 in xhci_event_transfer (trb=<optimized out>, sc=0xffff80666ad14000) at
	/usr/sources/cvs.netbsd.org/src-clean/sys/dev/usb/xhci.c:2122
#6  xhci_handle_event (trb=<optimized out>, sc=0xffff80666ad14000) at
	/usr/sources/cvs.netbsd.org/src-clean/sys/dev/usb/xhci.c:2198
#7  xhci_softintr (v=<optimized out>) at
	/usr/sources/cvs.netbsd.org/src-clean/sys/dev/usb/xhci.c:2238i
#8  0xffffffff80461ece in usb_soft_intr (arg=0xffff80666ad14050) at
	/usr/sources/cvs.netbsd.org/src-clean/sys/dev/usb/usb.c:1316
#9  0xffffffff80cc3620 in softint_execute (s=5, l=0xffff80696a9ee4c0) at
	 /usr/sources/cvs.netbsd.org/src-clean/sys/kern/kern_softint.c:565
#10 softint_dispatch (pinned=<optimized out>, s=5) at
	/usr/sources/cvs.netbsd.org/src-clean/sys/kern/kern_softint.c:814
#11 0xffffffff80220e3f in Xsoftintr ()


usbdi.c:1035:
1029            if (!repeat) {
1030                    /* Remove request from queue. */
1031    
1032                    KASSERTMSG(!SIMPLEQ_EMPTY(&pipe->up_queue),
1033                        "pipe %p is empty, but xfer %p wants to complete", pipe,
1034                         xfer);
1035                    KASSERTMSG(xfer == SIMPLEQ_FIRST(&pipe->up_queue),
1036                        "xfer %p is not start of queue (%p is at start)", xfer,
1037                       SIMPLEQ_FIRST(&pipe->up_queue));
1038    
1039    #ifdef DIAGNOSTIC
1040                    xfer->ux_state = XFER_BUSY;
1041    #endif
1042                    SIMPLEQ_REMOVE_HEAD(&pipe->up_queue, ux_next);
1043            }

In the usb_transfer_complete() stackframe:
(gdb) print *pipe
$4 = {up_iface = 0xffff80666b1448b0, up_dev = 0xffff80666b13ac40, up_endpoint = 0xffff8066a9037440,
	up_running = 0 '\000', up_aborting = 1 '\001', up_serialise = false,
	up_queue = {sqh_first = 0xffff80666b1c2b80, sqh_last = 0xffff80666b1c2998},
	up_next = {le_next = 0x0, le_prev = 0xffff80666b1448d8},
	up_async_task = {next = {tqe_next = 0xffff8066bde38080, tqe_prev = 0xffff8066ddf77600},
		 fun = 0xffffffff8046573f <usbd_clear_endpoint_stall_task>,
		 arg = 0xffff8066bde57000, queue = 2, flags = 128},
	up_intrxfer = 0x0, up_repeat = 0 '\000', up_interval = -1, up_flags = 128 '', <-- one letter garbage
	up_methods = 0xffffffff8122bf80 <xhci_device_isoc_methods>
}



>How-To-Repeat:

Start recording with audio for some length of time on 44kHz over xHCI and see
it crash after an undetermined length of time.


>Fix:


>Audit-Trail:
From: sc dying <sc.dying@gmail.com>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/56099: USB xHCI panic when using microphone
Date: Thu, 29 Apr 2021 06:19:29 +0900

 Does xhci_event_transfer() forget to increment xx_isoc_done and
 avoid calling usb_transfer_comeplete() in error paths?
 When an error occurs while isoc transfer, i.e. completion code is not
 SUCCESS nor SHORT_PKT, usb_transfer_comeplete is called without
 incrementing xx_isoc_done and the xfer is removed from up_queue.
 Then, usb_transfer_comeplete is called at the last of transferring
 ux_frlengths[], but it cannot find the xfer in up_queue and fires KASSERT.

>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.46 2020/01/03 16:35:01 leot Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.