NetBSD Problem Report #53233
From www@NetBSD.org Sun Apr 29 19:48:31 2018
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 1513D7A110
for <gnats-bugs@gnats.NetBSD.org>; Sun, 29 Apr 2018 19:48:31 +0000 (UTC)
Message-Id: <20180429194829.B07DC7A225@mollari.NetBSD.org>
Date: Sun, 29 Apr 2018 19:48:29 +0000 (UTC)
From: coypu@sdf.org
Reply-To: coypu@sdf.org
To: gnats-bugs@NetBSD.org
Subject: one-off kernel panic while connecting a urtwn device
X-Send-Pr-Version: www-1.0
>Number: 53233
>Category: kern
>Synopsis: one-off kernel panic while connecting a urtwn device
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Apr 29 19:50:00 +0000 2018
>Closed-Date: Sat Sep 01 23:57:52 +0000 2018
>Last-Modified: Sat Sep 01 23:57:52 +0000 2018
>Originator: coypu
>Release: NetBSD 8.99.14
>Organization:
>Environment:
NetBSD planets 8.99.14 NetBSD 8.99.14 (GENERIC) #0: Mon Mar 26 06:12:37 IDT 2018 fly@planets:/home/fly/obj/sys/arch/amd64/compile/GENERIC amd64
>Description:
connect urtwn device. Win hotplug lottery.
urtwn0 at uhub3 port 1
urtwn0: Realtek (0x7392) 802.11n WLAN Adapter (0x7811), rev 2.00/2.00, addr 3
urtwn0: MAC/BB RTL8188CUS, RF 6052 1T1R, address 74:da:38:5b:6c:6c
urtwn0: 1 rx pipe, 2 tx pipes
urtwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
urtwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
fatal page fault in supervisor mode
trap type 6 code 0 rip 0xffffffff80976a36 cs 0x8 rflags 0x10286 cr2 0x8 ilevel 0x6 rsp 0xffff800064f6ae10
curlwp 0xffffe40137a09440 pid 0.6 lowest kstack 0xffff800064f672c0
panic: trap
cpu0: Begin traceback...
vpanic() at netbsd:vpanic+0x140
snprintf() at netbsd:snprintf
startlwp() at netbsd:startlwp
alltraps() at netbsd:alltraps+0xb7
mutex_vector_enter() at netbsd:mutex_vector_enter+0xc6
ieee80211_find_rxnode() at netbsd:ieee80211_find_rxnode+0x3e
urtwn_rxeof() at netbsd:urtwn_rxeof+0x29a
usb_transfer_complete() at netbsd:usb_transfer_complete+0x146
ehci_softintr() at netbsd:ehci_softintr+0x19c
usb_soft_intr() at netbsd:usb_soft_intr+0x1f
softint_dispatch() at netbsd:softint_dispatch+0xd9
DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xffff800064f6b0f0
Xsoftintr() at netbsd:Xsoftintr+0x4f
--- interrupt ---
fbe4ef6ffafe10dc:
cpu0: End traceback...
dumping to dev 0,1 (offset=1896, size=1013425):
dump succeeded
>How-To-Repeat:
>Fix:
Don't expect netbsd to support hotplug.
>Release-Note:
>Audit-Trail:
From: coypu@sdf.org
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53233: one-off kernel panic while connecting a urtwn device
Date: Sun, 29 Apr 2018 20:00:15 +0000
This is pretty easy to reproduce. I have both wpa_supplicant and dhcpcd
running as services, and I just re-attach the device repeatedly.
I have a second backtrace too.
I assume these need more sc_dying checks.
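For readers unfamiliar with the idiom, the following is a minimal userland sketch of what an sc_dying check usually looks like. It is not the urtwn driver's code: struct sim_softc, sim_detach and sim_rxeof are hypothetical stand-ins, and the later messages in this PR point to attach ordering rather than missing sc_dying checks as the likely cause.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland stand-in for a USB wifi driver softc. */
struct sim_softc {
	bool sc_dying;       /* set once detach has started */
	int  sc_rx_handled;  /* number of frames passed up the stack */
};

/* Detach marks the softc as dying before anything is torn down. */
static void
sim_detach(struct sim_softc *sc)
{
	assert(sc != NULL);
	sc->sc_dying = true;
}

/*
 * Receive completion callback (stand-in for something like an rxeof
 * handler). Returns true if the frame was processed, false if it was
 * dropped because the device is going away.
 */
static bool
sim_rxeof(struct sim_softc *sc)
{
	assert(sc != NULL);
	if (sc->sc_dying)
		return false;    /* leave the 802.11 state alone */
	sc->sc_rx_handled++;
	return true;
}
```

The point of the pattern is that every asynchronous entry point tests the flag first, so a callback that fires after detach has begun does nothing.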
urtwn0 at uhub3 port 1
urtwn0: Realtek (0x7392) 802.11n WLAN Adapter (0x7811), rev 2.00/2.00, addr 3
urtwn0: MAC/BB RTL8188CUS, RF 6052 1T1R, address 74:da:38:5b:6c:6c
urtwn0: 1 rx pipe, 2 tx pipes
urtwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
urtwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
uvm_fault(0xffffe401074e68a0, 0x0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip 0xffffffff8044a7d8 cs 0x8 rflags 0x10282 cr2 0 ilevel 0x6 rsp 0xffff800067b3db70
curlwp 0xffffe4010e2848c0 pid 1534.1 lowest kstack 0xffff800067b3a2c0
panic: trap
cpu2: Begin traceback...
vpanic() at netbsd:vpanic+0x140
snprintf() at netbsd:snprintf
startlwp() at netbsd:startlwp
alltraps() at netbsd:alltraps+0xb7
urtwn_init() at netbsd:urtwn_init+0x1dd1
urtwn_ioctl() at netbsd:urtwn_ioctl+0xf9
doifioctl() at netbsd:doifioctl+0x79a
sys_ioctl() at netbsd:sys_ioctl+0x103
syscall() at netbsd:syscall+0x1d8
--- syscall (number 54) ---
77f93d518bda:
cpu2: End traceback...
dumping to dev 0,1 (offset=1896, size=1013425):
dump succeeded
From: coypu@sdf.org
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53233: one-off kernel panic while connecting a urtwn device
Date: Sun, 29 Apr 2018 23:19:01 +0000
Relayed from mlelstv, who is not writing himself:
the problem is that the driver calls if_attach before
ieee80211_media_init; it is not specific to urtwn.
From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/53233: one-off kernel panic while connecting a urtwn device
Date: Mon, 30 Apr 2018 05:00:11 -0000 (UTC)
coypu@sdf.org writes:
>fatal page fault in supervisor mode
>trap type 6 code 0 rip 0xffffffff80976a36 cs 0x8 rflags 0x10286 cr2 0x8 ilevel 0x6 rsp 0xffff800064f6ae10
>curlwp 0xffffe40137a09440 pid 0.6 lowest kstack 0xffff800064f672c0
>panic: trap
>cpu0: Begin traceback...
>vpanic() at netbsd:vpanic+0x140
>snprintf() at netbsd:snprintf
>startlwp() at netbsd:startlwp
>alltraps() at netbsd:alltraps+0xb7
>mutex_vector_enter() at netbsd:mutex_vector_enter+0xc6
>ieee80211_find_rxnode() at netbsd:ieee80211_find_rxnode+0x3e
>urtwn_rxeof() at netbsd:urtwn_rxeof+0x29a
>usb_transfer_complete() at netbsd:usb_transfer_complete+0x146
>ehci_softintr() at netbsd:ehci_softintr+0x19c
>usb_soft_intr() at netbsd:usb_soft_intr+0x1f
>softint_dispatch() at netbsd:softint_dispatch+0xd9
This is probably a race condition between urtwn_attach and something
setting IFF_UP and affects almost all wifi devices. The attach code does
    if_attach(ifp)
    ieee80211_ifattach(ifp)
    ieee80211_media_init(ifp)
Only the last call finishes the initialization and e.g. allocates mutexes
that are used by ieee80211_find_rxnode. But if_attach() already makes
the interface globally visible and lets someone do ioctls.
The better attach sequence might be:
    if_initialize(ifp)
    ieee80211_ifattach(ifp)
    ieee80211_media_init(ifp)
    ifp->if_percpuq = if_percpuq_create(ifp);
    if_register(ifp)
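The difference between the two orderings can be illustrated with a small userland model. This is only a sketch under stated assumptions: struct sim_ifnet, sim_media_init, sim_register and sim_use are hypothetical names, not NetBSD kernel API, and the model reduces "finishing initialization" to setting one lock pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel mutex. */
struct sim_lock {
	int held;
};

/* Hypothetical stand-in for an interface plus its 802.11 state. */
struct sim_ifnet {
	struct sim_lock *node_lock;  /* NULL until initialization finishes */
	bool visible;                /* true once other code can reach it */
};

static struct sim_lock sim_lock_storage;

/* Models the last init step: the lock exists only after this runs. */
static void
sim_media_init(struct sim_ifnet *ifp)
{
	assert(ifp != NULL);
	ifp->node_lock = &sim_lock_storage;
}

/* Models publishing the interface so that ioctls can find it. */
static void
sim_register(struct sim_ifnet *ifp)
{
	assert(ifp != NULL);
	ifp->visible = true;
}

/*
 * Models an outside user (ioctl or receive path).
 * Returns 0 on success, -1 if the interface is not visible yet,
 * -2 if it is visible but the lock is still NULL. In a kernel the
 * -2 case would be a NULL pointer dereference.
 */
static int
sim_use(struct sim_ifnet *ifp)
{
	assert(ifp != NULL);
	if (!ifp->visible)
		return -1;
	if (ifp->node_lock == NULL)
		return -2;
	ifp->node_lock->held++;
	ifp->node_lock->held--;
	return 0;
}
```

Publishing first and using the interface before sim_media_init() yields -2, while initializing first and publishing last leaves no window in which a visible interface has a NULL lock.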
The attach routine is supposed to be protected by KERNEL_LOCK. So the
race can only happen if something between if_attach() and
ieee80211_media_init() sleeps.
However, that protection is missing in usb_subr.c:
--- usb_subr.c 26 Dec 2017 18:44:52 -0000 1.223
+++ usb_subr.c 30 Apr 2018 04:56:40 -0000
@@ -858,7 +858,9 @@
 	uaa.uaa_subclass = dd->bDeviceSubClass;
 	uaa.uaa_proto = dd->bDeviceProtocol;
+	KERNEL_LOCK(1, curlwp);
 	dv = config_found_ia(parent, "usbroothubif", &uaa, 0);
+	KERNEL_UNLOCK_ONE(curlwp);
 	if (dv) {
 		dev->ud_subdevs = kmem_alloc(sizeof(dv), KM_SLEEP);
 		dev->ud_subdevs[0] = dv;
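The shape of the change above, taking one hold of a big lock around the attach call and dropping exactly that hold afterwards, can be modelled in userland as follows. This is a sketch only: sim_biglock_depth, sim_kernel_lock, sim_kernel_unlock_one and sim_config_found_locked are hypothetical names, and a plain counter stands in for the real kernel lock.

```c
#include <assert.h>

/* Hypothetical hold counter standing in for a recursive big lock. */
static int sim_biglock_depth;

/* Take one hold of the lock. */
static void
sim_kernel_lock(void)
{
	sim_biglock_depth++;
}

/* Drop exactly one hold of the lock. */
static void
sim_kernel_unlock_one(void)
{
	assert(sim_biglock_depth > 0);
	sim_biglock_depth--;
}

typedef int (*sim_attach_fn)(void);

/* Run an attach routine with the lock held, then release that hold. */
static int
sim_config_found_locked(sim_attach_fn attach)
{
	int rv;

	sim_kernel_lock();
	rv = attach();
	sim_kernel_unlock_one();
	return rv;
}

/* Example attach routine: reports the hold count it observes. */
static int
sim_attach_reports_depth(void)
{
	return sim_biglock_depth;
}
```

In this model the attach routine always observes a hold count of at least one, and the count returns to its previous value once the wrapper returns.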
--
Michael van Elst
Internet: mlelstv@serpens.de
"A potential Snark may lurk in every tree."
State-Changed-From-To: open->closed
State-Changed-By: maya@NetBSD.org
State-Changed-When: Sat, 01 Sep 2018 23:57:52 +0000
State-Changed-Why:
Commit was applied by mlelstv. I don't recall it working 100%, but I lost my urtwn on a bus, so it'd be hard to reproduce it. :-)
(Also pulled up, thanks)
>Unformatted: