NetBSD Problem Report #56393

From yorickhardy@gmail.com  Thu Sep  9 20:42:09 2021
Return-Path: <yorickhardy@gmail.com>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 37DB31A9239
	for <gnats-bugs@gnats.NetBSD.org>; Thu,  9 Sep 2021 20:42:09 +0000 (UTC)
Message-Id: <613a719c.1c69fb81.adb42.650c@mx.google.com>
Date: Thu, 09 Sep 2021 22:41:54 +0200
From: yorickhardy@gmail.com
Reply-To: yorickhardy@gmail.com
To: gnats-bugs@NetBSD.org
Subject: panic in usbd_create_xfer
X-Send-Pr-Version: 3.95

>Number:         56393
>Category:       kern
>Synopsis:       panic in usbd_create_xfer
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    skrll
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Sep 09 20:45:00 +0000 2021
>Last-Modified:  Sun Dec 12 09:01:19 +0000 2021
>Originator:     Yorick Hardy
>Release:        NetBSD 9.99.88
>Organization:

>Environment:
System: NetBSD HOME 9.99.88 NetBSD 9.99.88 (YORICK.amd64) #7: Wed Sep 8 08:49:53 SAST 2021 root@HOME:/root/build.amd64.local/obj/sys/arch/amd64/compile/YORICK.amd64 amd64
Architecture: x86_64
Machine: amd64
>Description:
The kernel occasionally panics in usbd_create_xfer, usually after at
least a day of uptime. The panic typically occurs when a program linked
against devel/SDL2 is run.

Here is the panic message (note the ohci1 WARNING):

[ 75728.072302] uvm_fault(0xffff81bb6cc20b40, 0x810000, 1) -> e
[ 75728.072302] fatal page fault in supervisor mode
[ 75728.072302] trap type 6 code 0 rip 0xffffffff8038366d cs 0x8 rflags 0x10286 cr2 0x810602 ilevel 0 rsp 0xffffde01536049c0
[ 75728.072302] curlwp 0xffff81bbe0a59500 pid 4992.4992 lowest kstack 0xffffde01536002c0
[ 75728.072302] panic: trap
[ 75728.072302] cpu0: Begin traceback...
[ 75728.072302] ohci1: WARNING: addr 0x40055170 not found
[ 75728.072302] vpanic() at netbsd:vpanic+0x156
[ 75728.082297] device_printf() at netbsd:device_printf
[ 75728.082297] startlwp() at netbsd:startlwp
[ 75728.082297] alltraps() at netbsd:alltraps+0xc3
[ 75728.092294] usbd_create_xfer() at netbsd:usbd_create_xfer+0x27b
[ 75728.092294] usbd_open_pipe_intr() at netbsd:usbd_open_pipe_intr+0x7f
[ 75728.092294] uhidev_open() at netbsd:uhidev_open+0x293
[ 75728.102294] uhidopen() at netbsd:uhidopen+0x107
[ 75728.102294] cdev_open() at netbsd:cdev_open+0xae
[ 75728.102294] spec_open() at netbsd:spec_open+0x176
[ 75728.112294] VOP_OPEN() at netbsd:VOP_OPEN+0x3c
[ 75728.112294] vn_open() at netbsd:vn_open+0x23b
[ 75728.122295] do_open() at netbsd:do_open+0xc3
[ 75728.122295] do_sys_openat() at netbsd:do_sys_openat+0x74
[ 75728.122295] sys_open() at netbsd:sys_open+0x24
[ 75728.132294] syscall() at netbsd:syscall+0x196
[ 75728.132294] --- syscall (number 5) ---
[ 75728.132294] netbsd:syscall+0x196:
[ 75728.132294] cpu0: End traceback...
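
As a quick consistency check on the trace above, the fault address in cr2
(0x810602) and the address passed to uvm_fault (0x810000) refer to the same
page: the second is the first rounded down to a 4 KiB boundary. The helper
names below are hypothetical and exist only for this sketch; the 4 KiB page
size is the amd64 base page size. Such a low, unmapped address would be
consistent with a stale or corrupted pointer being dereferenced, but the
report does not establish that.

```c
#include <assert.h>
#include <stdint.h>

/* amd64 base page size; named SKETCH_* to mark it as part of this sketch. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Round a virtual address down to the start of its page. */
static uint64_t
sketch_trunc_page(uint64_t va)
{
	return va & ~(SKETCH_PAGE_SIZE - 1);
}

/* Offset of a virtual address within its page. */
static uint64_t
sketch_page_offset(uint64_t va)
{
	return va & (SKETCH_PAGE_SIZE - 1);
}

/* Check the two addresses quoted in the panic message against each other. */
static void
sketch_check_panic_addresses(void)
{
	const uint64_t cr2 = 0x810602ULL;	/* from the "trap type 6" line */
	const uint64_t fault = 0x810000ULL;	/* from the uvm_fault line */

	assert(sketch_trunc_page(cr2) == fault);
}
```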

Here are the relevant lines from dmesg (I am using a custom kernel with the
uintuos driver, but the panic also occurs with GENERIC from netbsd.org):

[     1.015121] ohci0 at pci0 dev 18 function 0: ATI Technologies SB700-SB900 USB OHCI Controller (rev. 0x00)
[     1.015121] ohci0: interrupting at ioapic0 pin 18
[     1.015121] ohci0: OHCI version 1.0, legacy support
[     1.015121] usb0 at ohci0: USB revision 1.0
[     1.015121] ehci0 at pci0 dev 18 function 2: ATI Technologies SB700-SB900 USB EHCI Controller (rev. 0x00)
[     1.015121] ehci0: interrupting at ioapic0 pin 17
[     1.015121] ehci0: dropped intr workaround enabled
[     1.015121] ehci0: BIOS has given up ownership
[     1.015121] ehci0: EHCI version 1.0
[     1.015121] ehci0: 1 companion controller, 5 ports: ohci0
[     1.015121] usb1 at ehci0: USB revision 2.0
[     1.015121] ohci1 at pci0 dev 19 function 0: ATI Technologies SB700-SB900 USB OHCI Controller (rev. 0x00)
[     1.015121] ohci1: interrupting at ioapic0 pin 18
[     1.015121] ohci1: OHCI version 1.0, legacy support
[     1.015121] usb2 at ohci1: USB revision 1.0
[     1.015121] ehci1 at pci0 dev 19 function 2: ATI Technologies SB700-SB900 USB EHCI Controller (rev. 0x00)
[     1.015121] ehci1: interrupting at ioapic0 pin 17
[     1.015121] ehci1: dropped intr workaround enabled
[     1.015121] ehci1: EHCI version 1.0
[     1.015121] ehci1: 1 companion controller, 5 ports: ohci1
[     1.015121] usb3 at ehci1: USB revision 2.0
[     1.015121] ohci2 at pci0 dev 20 function 5: ATI Technologies SB700-SB900 USB OHCI Controller (rev. 0x00)
[     1.015121] ohci2: interrupting at ioapic0 pin 18
[     1.015121] ohci2: OHCI version 1.0, legacy support
[     1.015121] usb4 at ohci2: USB revision 1.0
[     1.015121] ohci3 at pci0 dev 22 function 0: ATI Technologies SB700-SB900 USB OHCI Controller (rev. 0x00)
[     1.015121] ohci3: interrupting at ioapic0 pin 18
[     1.015121] ohci3: OHCI version 1.0, legacy support
[     1.015121] usb5 at ohci3: USB revision 1.0
[     1.015121] ehci2 at pci0 dev 22 function 2: ATI Technologies SB700-SB900 USB EHCI Controller (rev. 0x00)
[     1.015121] ehci2: interrupting at ioapic0 pin 17
[     1.015121] ehci2: dropped intr workaround enabled
[     1.015121] ehci2: EHCI version 1.0
[     1.015121] ehci2: 1 companion controller, 4 ports: ohci3
[     1.015121] usb6 at ehci2: USB revision 2.0
[     1.933523] uhub0 at usb0: NetBSD (0x0000) OHCI root hub (0x0000), class 9/0, rev 1.00/1.00, addr 1
[     1.933523] uhub1 at usb2: NetBSD (0x0000) OHCI root hub (0x0000), class 9/0, rev 1.00/1.00, addr 1
[     1.933523] uhub2 at usb4: NetBSD (0x0000) OHCI root hub (0x0000), class 9/0, rev 1.00/1.00, addr 1
[     1.933523] uhub3 at usb6: NetBSD (0x0000) EHCI root hub (0x0000), class 9/0, rev 2.00/1.00, addr 1
[     1.933523] uhub4 at usb1: NetBSD (0x0000) EHCI root hub (0x0000), class 9/0, rev 2.00/1.00, addr 1
[     1.933523] uhub5 at usb3: NetBSD (0x0000) EHCI root hub (0x0000), class 9/0, rev 2.00/1.00, addr 1
[     1.933523] uhub6 at usb5: NetBSD (0x0000) OHCI root hub (0x0000), class 9/0, rev 1.00/1.00, addr 1
[     2.953529] ehci2: handing over low speed device on port 2 to ohci3
[     2.953529] ehci1: handing over low speed device on port 2 to ohci1
[     3.603531] ehci1: handing over full speed device on port 3 to ohci1
[     5.313540] uhidev0 at uhub1 port 2 configuration 1 interface 0
[     5.313540] uhidev0: Logitech (0x046d) Trackball (0xc404), rev 1.10/2.20, addr 2, iclass 3/1
[     5.313540] uhidev1 at uhub6 port 2 configuration 1 interface 0
[     5.313540] uhidev1: NOVATEK (0x0603) USB Keyboard (0x00f2), rev 1.10/1.12, addr 2, iclass 3/1
[     5.323540] ukbd0 at uhidev1
[     5.323540] ums0 at uhidev0: 3 buttons and Z dir
[     5.723535] uhidev2 at uhub6 port 2 configuration 1 interface 1
[     5.723535] uhidev2: NOVATEK (0x0603) USB Keyboard (0x00f2), rev 1.10/1.12, addr 2, iclass 3/0
[     5.733534] uhidev2: 4 report ids
[     5.733534] uhid0 at uhidev2 reportid 2: input=1, output=0, feature=0
[     5.733534] uhid1 at uhidev2 reportid 3: input=3, output=0, feature=0
[     5.733534] uhid2 at uhidev2 reportid 4: input=2, output=0, feature=0
[     6.903549] uhidev3 at uhub1 port 3 configuration 1 interface 0
[     6.903549] uhidev3: Wacom Co.,Ltd. (0x056a) Intuos PTS (0x033c), rev 2.00/1.00, addr 3, iclass 3/0
[     6.953538] uhidev3: 192 report ids
[     6.953538] uhid3 at uhidev3 reportid 2: input=0, output=0, feature=1
[     6.953538] uhid4 at uhidev3 reportid 3: input=0, output=0, feature=1
[     6.953538] uhid5 at uhidev3 reportid 4: input=0, output=0, feature=1
[     6.953538] uhid6 at uhidev3 reportid 5: input=0, output=0, feature=1
[     6.953538] uhid7 at uhidev3 reportid 7: input=0, output=0, feature=9
[     6.953538] uhid8 at uhidev3 reportid 8: input=0, output=0, feature=9
[     6.953538] uintuos0 at uhidev3 reportid 16
[     6.953538] uhid9 at uhidev3 reportid 17: input=0, output=0, feature=16
[     6.953538] uhid10 at uhidev3 reportid 19: input=0, output=0, feature=1
[     6.953538] uhid11 at uhidev3 reportid 20: input=0, output=0, feature=31
[     6.953538] uhid12 at uhidev3 reportid 32: input=0, output=0, feature=5
[     6.953538] uhid13 at uhidev3 reportid 33: input=0, output=0, feature=1
[     6.953538] uhid14 at uhidev3 reportid 34: input=0, output=0, feature=1
[     6.953538] uhid15 at uhidev3 reportid 35: input=0, output=0, feature=14
[     6.953538] uhid16 at uhidev3 reportid 36: input=0, output=0, feature=31
[     6.953538] uhid17 at uhidev3 reportid 37: input=0, output=0, feature=4
[     6.953538] uhid18 at uhidev3 reportid 48: input=0, output=0, feature=2
[     6.953538] uhid19 at uhidev3 reportid 49: input=0, output=0, feature=33
[     6.963549] uhid20 at uhidev3 reportid 50: input=0, output=0, feature=33
[     6.963549] uhid21 at uhidev3 reportid 51: input=0, output=0, feature=1
[     6.963549] uhid22 at uhidev3 reportid 64: input=0, output=0, feature=10
[     6.963549] uhid23 at uhidev3 reportid 192: input=9, output=0, feature=0
[     6.963549] uhidev4 at uhub1 port 3 configuration 1 interface 1
[     6.963549] uhidev4: Wacom Co.,Ltd. (0x056a) Intuos PTS (0x033c), rev 2.00/1.00, addr 3, iclass 3/0
[     6.963549] uhidev4: 3 report ids
[     6.963549] uhid24 at uhidev4 reportid 2: input=63, output=0, feature=0
[     6.963549] uhid25 at uhidev4 reportid 3: input=63, output=0, feature=0
[     6.963549] uhidev5 at uhub1 port 3 configuration 1 interface 2
[     6.963549] uhidev5: Wacom Co.,Ltd. (0x056a) Intuos PTS (0x033c), rev 2.00/1.00, addr 3, iclass 3/1
[     6.973550] uhidev5: 1 report ids
[     6.973550] ums1 at uhidev5 reportid 1: 5 buttons

>How-To-Repeat:
Run a program that uses devel/SDL2, for example emulators/dosbox or
games/quakespasm, on a system that has been up for at least a day. The
panic presumably occurs when joysticks/game controllers are queried,
since net/lagrange does not trigger it.

I have not found a way to trigger the panic reliably, although after
about 2 days of uptime it becomes reasonably likely.
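
The backtrace goes through sys_open, uhidopen and usbd_open_pipe_intr, so
one way to exercise the same path without SDL2 might be to open and close
the uhid device nodes repeatedly. The following is a minimal sketch under
that assumption; it has not been confirmed to reproduce the panic. The
function names, the /dev/uhidN path pattern used here, and the choice of
O_RDONLY | O_NONBLOCK are assumptions of the sketch, not taken from SDL2.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Open and immediately close one device node.
 * Returns 0 on success, or the errno value from open(2) on failure.
 */
static int
sketch_open_close(const char *path)
{
	int fd = open(path, O_RDONLY | O_NONBLOCK);

	if (fd == -1)
		return errno;
	(void)close(fd);
	return 0;
}

/*
 * Walk /dev/uhid0 .. /dev/uhid<count-1>, `rounds` times over, opening
 * and closing each node.  Returns the number of successful opens.
 */
static unsigned long
sketch_poll_uhid(unsigned count, unsigned rounds)
{
	char path[32];
	unsigned long ok = 0;

	for (unsigned r = 0; r < rounds; r++) {
		for (unsigned i = 0; i < count; i++) {
			snprintf(path, sizeof(path), "/dev/uhid%u", i);
			if (sketch_open_close(path) == 0)
				ok++;
		}
	}
	return ok;
}
```

A caller could, for example, invoke sketch_poll_uhid(26, 1000) from main()
to cover uhid0 through uhid25 as listed in the dmesg output, provided the
corresponding device nodes exist.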
>Fix:
Reverting to src/sys/dev/usb/ohci.c revision 1.310 seems to solve the problem,
but I am not sure what the correct fix is.

>Release-Note:

>Audit-Trail:
From: Yorick Hardy <yorickhardy@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56393: panic in usbd_create_xfer
Date: Sat, 11 Dec 2021 22:32:43 +0200

 Masking the interrupts earlier seems to solve the problem.

 Index: sys/dev/usb/ohci.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/usb/ohci.c,v
 retrieving revision 1.317
 diff -u -r1.317 ohci.c
 --- sys/dev/usb/ohci.c	24 Jun 2021 23:01:03 -0000	1.317
 +++ sys/dev/usb/ohci.c	11 Dec 2021 17:34:57 -0000
 @@ -1336,9 +1336,16 @@
  		/* XXX do what */
  		eintrs &= ~OHCI_SO;
  	}
 +	if (eintrs != 0) {
 +		/* Block unprocessed interrupts listed below. */
 +		OWRITE4(sc, OHCI_INTERRUPT_DISABLE, eintrs);
 +		sc->sc_eintrs &= ~eintrs;
 +		DPRINTF("sc %#jx blocking intrs %#jx", (uintptr_t)sc,
 +		    eintrs, 0, 0);
 +	}
  	if (eintrs & OHCI_WDH) {
  		/*
 -		 * We block the interrupt below, and reenable it later from
 +		 * We blocked the interrupt above, and reenable it later from
  		 * ohci_softintr().
  		 */
  		usb_schedsoftintr(&sc->sc_bus);
 @@ -1356,7 +1363,7 @@

  		KASSERT(TAILQ_EMPTY(&sc->sc_abortingxfers));
  		DPRINTFN(10, "end SOF %#jx", (uintptr_t)sc, 0, 0, 0);
 -		/* Don't remove OHIC_SF from eintrs so it is blocked below */
 +		/* Don't remove OHIC_SF from eintrs, it is blocked above */
  	}
  	if (eintrs & OHCI_RD) {
  		DPRINTFN(5, "resume detect sc=%#jx", (uintptr_t)sc, 0, 0, 0);
 @@ -1372,19 +1379,12 @@
  	}
  	if (eintrs & OHCI_RHSC) {
  		/*
 -		 * We block the interrupt below, and reenable it later from
 +		 * We blocked the interrupt above, and reenable it later from
  		 * a timeout.
  		 */
  		softint_schedule(sc->sc_rhsc_si);
  	}

 -	if (eintrs != 0) {
 -		/* Block unprocessed interrupts. */
 -		OWRITE4(sc, OHCI_INTERRUPT_DISABLE, eintrs);
 -		sc->sc_eintrs &= ~eintrs;
 -		DPRINTF("sc %#jx blocking intrs %#jx", (uintptr_t)sc,
 -		    eintrs, 0, 0);
 -	}

  	return 1;
  }
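
 To illustrate why the position of the masking step can matter, here is
 a small user-space model of the two orderings in the patch above. It is
 only a sketch: the struct, the function names and the bit value are
 hypothetical, and the model assumes that the deferred handler can run and
 unmask the interrupt before the hard handler reaches its masking step.
 Whether that interleaving can occur in the kernel, and whether it explains
 the panic, is not established here.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_WDH 0x2u	/* hypothetical bit standing in for OHCI_WDH */

struct sketch_sc {
	uint32_t enabled;	/* models sc_eintrs: sources currently unmasked */
};

/* Models the deferred handler: after its work, it unmasks the source. */
static void
sketch_softintr(struct sketch_sc *sc)
{
	sc->enabled |= SKETCH_WDH;
}

/* Models the masking step of the hard interrupt handler. */
static void
sketch_block(struct sketch_sc *sc, uint32_t eintrs)
{
	sc->enabled &= ~eintrs;
}

/*
 * Old ordering: schedule first, mask at the end.  `soft_runs_early`
 * models the deferred handler completing before the mask is applied.
 */
static void
sketch_intr_mask_late(struct sketch_sc *sc, bool soft_runs_early)
{
	if (soft_runs_early)
		sketch_softintr(sc);	/* unmask happens first ... */
	sketch_block(sc, SKETCH_WDH);	/* ... and the late mask undoes it */
	if (!soft_runs_early)
		sketch_softintr(sc);
}

/*
 * Patched ordering: mask first, then schedule.  The unmask always
 * follows the mask, so the flag makes no difference in this model.
 */
static void
sketch_intr_mask_early(struct sketch_sc *sc, bool soft_runs_early)
{
	(void)soft_runs_early;
	sketch_block(sc, SKETCH_WDH);
	sketch_softintr(sc);
}
```

 In this model the old ordering leaves the source masked when the deferred
 handler runs early, while the patched ordering always ends with the source
 unmasked again.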

Responsible-Changed-From-To: kern-bug-people->skrll
Responsible-Changed-By: skrll@NetBSD.org
Responsible-Changed-When: Sun, 12 Dec 2021 09:01:19 +0000
Responsible-Changed-Why:
Take


>Unformatted:
