NetBSD Problem Report #53233

From www@NetBSD.org  Sun Apr 29 19:48:31 2018
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 1513D7A110
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 29 Apr 2018 19:48:31 +0000 (UTC)
Message-Id: <20180429194829.B07DC7A225@mollari.NetBSD.org>
Date: Sun, 29 Apr 2018 19:48:29 +0000 (UTC)
From: coypu@sdf.org
Reply-To: coypu@sdf.org
To: gnats-bugs@NetBSD.org
Subject: one-off kernel panic while connecting a urtwn device
X-Send-Pr-Version: www-1.0

>Number:         53233
>Category:       kern
>Synopsis:       one-off kernel panic while connecting a urtwn device
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Apr 29 19:50:00 +0000 2018
>Closed-Date:    Sat Sep 01 23:57:52 +0000 2018
>Last-Modified:  Sat Sep 01 23:57:52 +0000 2018
>Originator:     coypu
>Release:        NetBSD 8.99.14
>Organization:
>Environment:
NetBSD planets 8.99.14 NetBSD 8.99.14 (GENERIC) #0: Mon Mar 26 06:12:37 IDT 2018  fly@planets:/home/fly/obj/sys/arch/amd64/compile/GENERIC amd64

>Description:
connect urtwn device. Win hotplug lottery.

urtwn0 at uhub3 port 1
urtwn0: Realtek (0x7392) 802.11n WLAN Adapter (0x7811), rev 2.00/2.00, addr 3
urtwn0: MAC/BB RTL8188CUS, RF 6052 1T1R, address 74:da:38:5b:6c:6c
urtwn0: 1 rx pipe, 2 tx pipes
urtwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
urtwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
fatal page fault in supervisor mode
trap type 6 code 0 rip 0xffffffff80976a36 cs 0x8 rflags 0x10286 cr2 0x8 ilevel 0x6 rsp 0xffff800064f6ae10
curlwp 0xffffe40137a09440 pid 0.6 lowest kstack 0xffff800064f672c0
panic: trap
cpu0: Begin traceback...
vpanic() at netbsd:vpanic+0x140
snprintf() at netbsd:snprintf
startlwp() at netbsd:startlwp
alltraps() at netbsd:alltraps+0xb7
mutex_vector_enter() at netbsd:mutex_vector_enter+0xc6
ieee80211_find_rxnode() at netbsd:ieee80211_find_rxnode+0x3e
urtwn_rxeof() at netbsd:urtwn_rxeof+0x29a
usb_transfer_complete() at netbsd:usb_transfer_complete+0x146
ehci_softintr() at netbsd:ehci_softintr+0x19c
usb_soft_intr() at netbsd:usb_soft_intr+0x1f
softint_dispatch() at netbsd:softint_dispatch+0xd9
DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xffff800064f6b0f0
Xsoftintr() at netbsd:Xsoftintr+0x4f
--- interrupt ---
fbe4ef6ffafe10dc:
cpu0: End traceback...

dumping to dev 0,1 (offset=1896, size=1013425):
dump succeeded

>How-To-Repeat:

>Fix:
Don't expect netbsd to support hotplug.

>Release-Note:

>Audit-Trail:
From: coypu@sdf.org
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/53233: one-off kernel panic while connecting a urtwn device
Date: Sun, 29 Apr 2018 20:00:15 +0000

 This is pretty easy to reproduce. I have both wpa_supplicant and dhcpcd
 running as services, and I just re-attach the device repeatedly.

 I have a second backtrace too.

 I assume these need more sc_dying checks.

 urtwn0 at uhub3 port 1
 urtwn0: Realtek (0x7392) 802.11n WLAN Adapter (0x7811), rev 2.00/2.00, addr 3
 urtwn0: MAC/BB RTL8188CUS, RF 6052 1T1R, address 74:da:38:5b:6c:6c
 urtwn0: 1 rx pipe, 2 tx pipes
 urtwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
 urtwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
 uvm_fault(0xffffe401074e68a0, 0x0, 1) -> e
 fatal page fault in supervisor mode
 trap type 6 code 0 rip 0xffffffff8044a7d8 cs 0x8 rflags 0x10282 cr2 0 ilevel 0x6 rsp 0xffff800067b3db70
 curlwp 0xffffe4010e2848c0 pid 1534.1 lowest kstack 0xffff800067b3a2c0
 panic: trap
 cpu2: Begin traceback...
 vpanic() at netbsd:vpanic+0x140
 snprintf() at netbsd:snprintf
 startlwp() at netbsd:startlwp
 alltraps() at netbsd:alltraps+0xb7
 urtwn_init() at netbsd:urtwn_init+0x1dd1
 urtwn_ioctl() at netbsd:urtwn_ioctl+0xf9
 doifioctl() at netbsd:doifioctl+0x79a
 sys_ioctl() at netbsd:sys_ioctl+0x103
 syscall() at netbsd:syscall+0x1d8
 --- syscall (number 54) ---
 77f93d518bda:
 cpu2: End traceback...

 dumping to dev 0,1 (offset=1896, size=1013425):
 dump succeeded

From: coypu@sdf.org
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/53233: one-off kernel panic while connecting a urtwn device
Date: Sun, 29 Apr 2018 23:19:01 +0000

 Relayed from mlelstv, who is not writing in himself:

 The problem is that the driver calls if_attach before
 ieee80211_media_init, and it is not specific to urtwn.

From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/53233: one-off kernel panic while connecting a urtwn device
Date: Mon, 30 Apr 2018 05:00:11 -0000 (UTC)

 coypu@sdf.org writes:

 >fatal page fault in supervisor mode
 >trap type 6 code 0 rip 0xffffffff80976a36 cs 0x8 rflags 0x10286 cr2 0x8 ilevel 0x6 rsp 0xffff800064f6ae10
 >curlwp 0xffffe40137a09440 pid 0.6 lowest kstack 0xffff800064f672c0
 >panic: trap
 >cpu0: Begin traceback...
 >vpanic() at netbsd:vpanic+0x140
 >snprintf() at netbsd:snprintf
 >startlwp() at netbsd:startlwp
 >alltraps() at netbsd:alltraps+0xb7
 >mutex_vector_enter() at netbsd:mutex_vector_enter+0xc6
 >ieee80211_find_rxnode() at netbsd:ieee80211_find_rxnode+0x3e
 >urtwn_rxeof() at netbsd:urtwn_rxeof+0x29a
 >usb_transfer_complete() at netbsd:usb_transfer_complete+0x146
 >ehci_softintr() at netbsd:ehci_softintr+0x19c
 >usb_soft_intr() at netbsd:usb_soft_intr+0x1f
 >softint_dispatch() at netbsd:softint_dispatch+0xd9

 This is probably a race condition between urtwn_attach and something
 setting IFF_UP, and it affects almost all wifi devices. The attach code does:

 if_attach(ifp)
 ieee80211_ifattach(ifp)
 ieee80211_media_init(ifp)

 Only the last call finishes the initialization and, for example, allocates
 the mutexes that are used by ieee80211_find_rxnode. But if_attach() already
 makes the interface globally visible and lets someone do ioctls.

 The better attach sequence might be:

 if_initialize(ifp)
 ieee80211_ifattach(ifp)
 ieee80211_media_init(ifp)
 ifp->if_percpuq = if_percpuq_create(ifp);
 if_register(ifp)
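
The two orderings above can be compared in a small userspace model. This is only a sketch, not kernel code: every name in it (model_ifnet, model_publish, model_media_init, model_access) is hypothetical and stands in for the corresponding step of the real attach path. It assumes the failure is a lock pointer that is still NULL when another code path reaches the interface.

```c
/* Userspace model of the attach-order race. Hypothetical names, not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_lock { int unused; };      /* stand-in for a kernel mutex */

struct model_ifnet {
	struct model_lock *node_lock;   /* NULL until "media init" has run */
	bool visible;                   /* true once other code can reach the interface */
};

static struct model_lock the_lock;

/* Stand-in for the step that finishes initialization and allocates locks. */
static void model_media_init(struct model_ifnet *ifp)
{
	assert(ifp != NULL);
	ifp->node_lock = &the_lock;
}

/* Stand-in for the step that makes the interface globally visible. */
static void model_publish(struct model_ifnet *ifp)
{
	ifp->visible = true;
}

/*
 * What a concurrent ioctl or receive path would do.
 * Returns false where the real kernel would take the lock through a NULL pointer.
 */
static bool model_access(const struct model_ifnet *ifp)
{
	if (!ifp->visible)
		return true;            /* not reachable yet, nothing is touched */
	return ifp->node_lock != NULL;
}

/* Old order: publish first, then initialize. The probe sits in the window. */
static bool attach_publish_first(struct model_ifnet *ifp)
{
	bool ok;

	model_publish(ifp);
	ok = model_access(ifp);         /* another thread gets in here */
	model_media_init(ifp);
	return ok;
}

/* Proposed order: initialize first, publish last. */
static bool attach_init_first(struct model_ifnet *ifp)
{
	bool ok;

	model_media_init(ifp);
	ok = model_access(ifp);         /* interface is not visible yet */
	model_publish(ifp);
	return ok && model_access(ifp);
}
```

In this model attach_publish_first() reports a bad access and attach_init_first() does not, which is the argument for registering the interface only after initialization is complete.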

 The attach routine is supposed to be protected by KERNEL_LOCK. So the
 race can only happen if something between if_attach() and
 ieee80211_media_init() sleeps.

 However, that protection is missing in usb_subr.c:

 --- usb_subr.c  26 Dec 2017 18:44:52 -0000      1.223
 +++ usb_subr.c  30 Apr 2018 04:56:40 -0000
 @@ -858,7 +858,9 @@
         uaa.uaa_subclass = dd->bDeviceSubClass;
         uaa.uaa_proto = dd->bDeviceProtocol;

 +       KERNEL_LOCK(1, curlwp);
         dv = config_found_ia(parent, "usbroothubif", &uaa, 0);
 +       KERNEL_UNLOCK_ONE(curlwp);
         if (dv) {
                 dev->ud_subdevs = kmem_alloc(sizeof(dv), KM_SLEEP);
                 dev->ud_subdevs[0] = dv;
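
The effect of holding a big lock across the whole attach can be sketched with a single-threaded model. It is an illustration under stated assumptions, not the kernel's locking code: struct sim and both functions are hypothetical, the "intruder" stands for any ioctl that arrives during attach, and the model assumes the intruder needs the same lock, so it stays blocked while attach holds it.

```c
/* Single-threaded model of a big lock held across attach. Hypothetical names. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sim {
	bool big_lock_held;      /* models the kernel lock being held by attach */
	bool visible;            /* interface has been made globally visible */
	bool initialized;        /* initialization has finished */
	bool intruder_pending;   /* an ioctl is waiting to run */
	bool saw_half_init;      /* the ioctl ran on a half-initialized interface */
};

/* Run the waiting ioctl, unless it is blocked on the big lock. */
static void run_intruder_if_allowed(struct sim *s)
{
	assert(s != NULL);
	if (!s->intruder_pending || s->big_lock_held)
		return;
	s->intruder_pending = false;
	if (s->visible && !s->initialized)
		s->saw_half_init = true;
}

/* Returns true if the ioctl never observed a half-initialized interface. */
static bool sim_attach(bool take_lock)
{
	struct sim s = { .intruder_pending = true };

	s.big_lock_held = take_lock;
	s.visible = true;                /* interface becomes visible */
	run_intruder_if_allowed(&s);     /* point where another thread could run */
	s.initialized = true;            /* initialization finishes */
	s.big_lock_held = false;         /* lock released after attach */
	run_intruder_if_allowed(&s);     /* blocked ioctl proceeds now */
	return !s.saw_half_init;
}
```

Here sim_attach(false) lets the ioctl run inside the window, while sim_attach(true) defers it until attach has finished. The model does not cover the case mentioned above where attach sleeps and gives up the lock in the middle.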

 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

State-Changed-From-To: open->closed
State-Changed-By: maya@NetBSD.org
State-Changed-When: Sat, 01 Sep 2018 23:57:52 +0000
State-Changed-Why:
Commit was applied by mlelstv. I don't recall it working 100%, but I lost my urtwn on a bus, so it'd be hard to reproduce it. :-)
(Also pulled up, thanks)


>Unformatted:
