NetBSD Problem Report #48954
From alnsn@NetBSD.org Fri Jun 27 09:45:03 2014
Return-Path: <alnsn@NetBSD.org>
Received: by mollari.NetBSD.org (Postfix, from userid 1459)
id 9D813A653D; Fri, 27 Jun 2014 09:45:03 +0000 (UTC)
Message-Id: <20140627094503.9D813A653D@mollari.NetBSD.org>
Date: Fri, 27 Jun 2014 09:45:03 +0000 (UTC)
From: alnsn@NetBSD.org
Reply-To: alnsn@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: USB diagnostic message: actlen (-15996) > len (4)
X-Send-Pr-Version: 3.95
>Number: 48954
>Category: kern
>Synopsis: USB diagnostic message: actlen (-15996) > len (4)
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Jun 27 09:50:00 +0000 2014
>Last-Modified: Sun Jul 06 09:10:07 +0000 2014
>Originator: Alexander Nasonov
>Release: NetBSD 6.99.44
>Organization:
TNF
>Environment:
$NetBSD: usb.c,v 1.149 2014/03/16 05:20:29 dholland Exp $
$NetBSD: usb_mem.c,v 1.64 2013/12/22 18:29:25 mlelstv Exp $
$NetBSD: usb_pci.c,v 1.7 2008/04/28 20:23:55 martin Exp $
$NetBSD: usb_quirks.c,v 1.80 2013/11/14 16:33:20 nonaka Exp $
$NetBSD: usb_subr.c,v 1.196 2014/02/17 07:34:21 skrll Exp $
$NetBSD: usbdi.c,v 1.160 2013/11/30 12:16:14 skrll Exp $
$NetBSD: usbdi_util.c,v 1.62 2013/09/26 07:25:31 skrll Exp $
$NetBSD: if_urtwn.c,v 1.30 2014/05/08 05:59:09 mrg Exp $
System: NetBSD neva 6.99.44 NetBSD 6.99.44 (GENERIC) #1: Thu Jun 26 11:53:57 BST 2014 alnsn@neva:/home/alnsn/netbsd-current/src/sys/arch/amd64/compile/obj/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
The urtwn(4) driver is unstable for me, and I suspect something
is corrupting memory. The diagnostic message at the end of this
dmesg excerpt supports that suspicion:
urtwn0: link state UP (was UNKNOWN)
urtwn1: link state UP (was UNKNOWN)
urtwn0: link state DOWN (was UP)
urtwn0: link state UP (was DOWN)
urtwn0: link state DOWN (was UP)
urtwn0: link state UP (was DOWN)
urtwn0: link state DOWN (was UP)
usb_transfer_complete: actlen (-15996) > len (4)
The kernel is built with DIAGNOSTIC, DEBUG, LOCKDEBUG, USB_DEBUG
and URTWN_DEBUG options.
>How-To-Repeat:
Not easily reproducible.
>Fix:
Not known.
>Audit-Trail:
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: re: kern/48954: USB diagnostic message: actlen (-15996) > len (4)
Date: Fri, 27 Jun 2014 21:21:10 +1000
i've recently been using urtwn(4) as well (as you can see from my
1.30 revision to if_urtwn.c :-).
i've not seen anything that suggested corrupted memory, though it
does seem possible. i have seen it lock up twice, unable to talk
to the network at all, requiring being unplugged and reinserted
to work again.
so there is certainly something problematic, if not multiple things.
in dmesg:
urtwn0: link state UP (was UNKNOWN)
urtwn0: link state DOWN (was UP)
urtwn0: link state UP (was DOWN)
and
urtwn0: could not load firmware page 3
and
urtwn0: timeout waiting for MAC auto ON
and
urtwn0: device timeout
none of which seem to say much useful.
.mrg.
From: Alexander Nasonov <alnsn@yandex.ru>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, alnsn@NetBSD.org
Subject: Re: kern/48954: USB diagnostic message: actlen (-15996) > len (4)
Date: Fri, 27 Jun 2014 15:08:28 +0100
matthew green wrote:
> i've not seen anything that suggested corrupted memory, though it
> does seem possible. i have seen it lock up twice, unable to talk
> to the network at all, requiring being unplugged and reinserted
> to work again.
Replugging my card almost always leads to a crash. The location of the
crash is quite predictable, but it depends on the compilation flags and
the verbosity of the debugging messages.
I picked one crash, which happens between the usbd_setup_xfer and
usbd_transfer calls:
ffffffff8044b34c: 48 8b bb f8 32 00 00 mov 0x32f8(%rbx),%rdi
ffffffff8044b353: 48 c7 44 24 08 4d 75 movq $0xffffffff8044754d,0x8(%rsp)
ffffffff8044b35a: 44 80
ffffffff8044b35c: c7 04 24 00 00 00 00 movl $0x0,(%rsp)
ffffffff8044b363: 41 b9 05 00 00 00 mov $0x5,%r9d
ffffffff8044b369: 41 b8 00 40 00 00 mov $0x4000,%r8d
ffffffff8044b36f: 4c 89 e2 mov %r12,%rdx
ffffffff8044b372: e8 e7 17 41 00 callq ffffffff8085cb5e <usbd_setup_xfer>
ffffffff8044b377: 48 8b bb f8 32 00 00 mov 0x32f8(%rbx),%rdi
^^^^^^^^^^^^
IT CRASHES HERE
ffffffff8044b37e: e8 78 11 41 00 callq ffffffff8085c4fb <usbd_transfer>
Note that it's reading the same memory location 0x32f8(%rbx) twice but
the second read crashes the kernel.
Alex
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/48954: USB diagnostic message: actlen (-15996) > len (4)
Date: Fri, 27 Jun 2014 14:13:41 +0000
On Fri, Jun 27, 2014 at 02:10:14PM +0000, Alexander Nasonov wrote:
> ffffffff8044b34c: 48 8b bb f8 32 00 00 mov 0x32f8(%rbx),%rdi
> ffffffff8044b353: 48 c7 44 24 08 4d 75 movq $0xffffffff8044754d,0x8(%rsp)
> ffffffff8044b35a: 44 80
> ffffffff8044b35c: c7 04 24 00 00 00 00 movl $0x0,(%rsp)
> ffffffff8044b363: 41 b9 05 00 00 00 mov $0x5,%r9d
> ffffffff8044b369: 41 b8 00 40 00 00 mov $0x4000,%r8d
> ffffffff8044b36f: 4c 89 e2 mov %r12,%rdx
> ffffffff8044b372: e8 e7 17 41 00 callq ffffffff8085cb5e <usbd_setup_xfer>
> ffffffff8044b377: 48 8b bb f8 32 00 00 mov 0x32f8(%rbx),%rdi
>
> ^^^^^^^^^^^^
> IT CRASHES HERE
>
> ffffffff8044b37e: e8 78 11 41 00 callq ffffffff8085c4fb <usbd_transfer>
>
> Note that it's reading the same memory location 0x32f8(%rbx) twice but
> the second read crashes the kernel.
That means either compiled code isn't preserving %rbx according to the
function call ABI (unlikely) or the stack's being overwritten.
--
David A. Holland
dholland@netbsd.org
From: Nick Hudson <skrll@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: alnsn@NetBSD.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/48954: USB diagnostic message: actlen (-15996) > len (4)
Date: Fri, 27 Jun 2014 15:34:43 +0100
Have you tried setting kmem_guard_depth as described in kmem(9)?
Nick
From: Alexander Nasonov <alnsn@yandex.ru>
To: Nick Hudson <skrll@netbsd.org>
Cc: gnats-bugs@NetBSD.org, alnsn@NetBSD.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/48954: USB diagnostic message: actlen (-15996) > len (4)
Date: Fri, 27 Jun 2014 19:48:30 +0100
Nick Hudson wrote:
> Have you tried setting kmem_guard_depth as described in kmem(9)?
kmem_guard_depth=50000 made no difference.
usbd_get_string: getting lang failed, using 0
urtwn0 at uhub1 port 1
urtwn0: Realtek 802.11n WLAN Adapter, rev 2.00/2.00, addr 3
urtwn0: MAC/BB RTL8188CUS, RF 6052 1T1R, address 80:1f:02:84:fb:fe
urtwn0: 1 rx pipe, 2 tx pipes
urtwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
urtwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
uvm_fault(0xfffffe811cc122e8, 0x0, 2) -> e
fatal page fault in supervisor mode
trap type 6 code 2 rip ffffffff80899582 cs 8 rflags 10286 cr2 0 ilevel 6
rsp fffffe80ca6fc6d0
curlwp 0xfffffe811c501b40 pid 155.1 lowest kstack 0xfffffe80ca6f92c0
panic: trap
cpu3: Begin traceback...
vpanic() at netbsd:vpanic+0x13c
snprintf() at netbsd:snprintf
startlwp() at netbsd:startlwp
alltraps() at netbsd:alltraps+0x9e
urtwn_init() at netbsd:urtwn_init+0x182c
ether_ioctl() at netbsd:ether_ioctl+0xc8
ieee80211_ioctl() at netbsd:ieee80211_ioctl+0x9a4
urtwn_ioctl() at netbsd:urtwn_ioctl+0x91
in6_update_ifa1() at netbsd:in6_update_ifa1+0x6fd
in6_update_ifa() at netbsd:in6_update_ifa+0x36
in6_control1() at netbsd:in6_control1+0x3c3
in6_control() at netbsd:in6_control+0x78
udp6_ioctl_wrapper() at netbsd:udp6_ioctl_wrapper+0x3e
compat_ifioctl() at netbsd:compat_ifioctl+0x127
doifioctl() at netbsd:doifioctl+0x43b
soo_ioctl() at netbsd:soo_ioctl+0x2b8
sys_ioctl() at netbsd:sys_ioctl+0x17e
syscall() at netbsd:syscall+0x9a
--- syscall (number 54) ---
7f7ff74d088a:
cpu3: End traceback...
ffffffff804536a1 <urtwn_init>:
ffffffff804536a1: 55 push %rbp
ffffffff804536a2: 48 89 e5 mov %rsp,%rbp
ffffffff804536a5: 41 57 push %r15
ffffffff804536a7: 41 56 push %r14
ffffffff804536a9: 41 55 push %r13
ffffffff804536ab: 41 54 push %r12
ffffffff804536ad: 53 push %rbx
ffffffff804536ae: 48 83 ec 58 sub $0x58,%rsp
ffffffff804536b2: 48 89 7d c8 mov %rdi,-0x38(%rbp)
ffffffff804536b6: 48 8b 1f mov (%rdi),%rbx
ffffffff804536b9: f6 05 f8 46 b0 00 02 testb $0x2,0xb046f8(%rip) # ffffffff80f57db8 <urtwn_debug>
ffffffff804536c0: 0f 85 fa 03 00 00 jne ffffffff80453ac0 <urtwn_init+0x41f>
...
ffffffff80454e8b: e8 68 e0 ff ff callq ffffffff80452ef8 <urtwn_set_chan.constprop.7>
ffffffff80454e90: 48 8b 8b 00 33 00 00 mov 0x3300(%rbx),%rcx
ffffffff80454e97: 48 8d 93 f0 32 00 00 lea 0x32f0(%rbx),%rdx
ffffffff80454e9e: 48 8b b3 68 11 00 00 mov 0x1168(%rbx),%rsi
ffffffff80454ea5: 48 8b bb f8 32 00 00 mov 0x32f8(%rbx),%rdi
ffffffff80454eac: 48 c7 44 24 08 1d 0f movq $0xffffffff80450f1d,0x8(%rsp)
ffffffff80454eb3: 45 80
ffffffff80454eb5: c7 04 24 00 00 00 00 movl $0x0,(%rsp)
ffffffff80454ebc: 41 b9 05 00 00 00 mov $0x5,%r9d
ffffffff80454ec2: 41 b8 00 40 00 00 mov $0x4000,%r8d
ffffffff80454ec8: e8 b1 46 44 00 callq ffffffff8089957e <usbd_setup_xfer>
ffffffff80454ecd: 48 8b bb f8 32 00 00 mov 0x32f8(%rbx),%rdi
ffffffff80454ed4: e8 42 40 44 00 callq ffffffff80898f1b <usbd_transfer>
ffffffff80454ed9: 83 f8 01 cmp $0x1,%eax
Alex
From: Nick Hudson <skrll@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: Alexander Nasonov <alnsn@yandex.ru>, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, alnsn@NetBSD.org
Subject: Re: kern/48954: USB diagnostic message: actlen (-15996) > len (4)
Date: Mon, 30 Jun 2014 08:07:25 +0100
Does disabling IPv6 on your machine help?
Nick
From: Alexander Nasonov <alnsn@yandex.ru>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, alnsn@NetBSD.org
Subject: Re: kern/48954: USB diagnostic message: actlen (-15996) > len (4)
Date: Sun, 6 Jul 2014 10:09:34 +0100
Nick Hudson wrote:
> Does disabling IPv6 on your machine help?
It makes no difference.
After spending a bit more time looking at the bug, I found out that
the stack trace printed in dmesg is misleading. The kernel actually
crashed in usbd_setup_xfer, at the instruction that corresponds to the
first line of the function:
xfer->pipe = pipe;
Also, I got three more crashes from ehci_softintr. Two crashes were at
memcpy+0x14 and the third was at
crash> x/i 0xffffffff802a0582
ehci_idone+0xa9: movslq 58 (%r14),%rsi
The new kernel doesn't have DIAGNOSTIC, so I don't know whether the
crashes in memcpy were preceded by any diagnostic messages.
Alex