NetBSD Problem Report #49831
From plunky@ogmig.net Thu Apr 9 18:33:54 2015
Return-Path: <plunky@ogmig.net>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id EF77FA6558
for <gnats-bugs@gnats.NetBSD.org>; Thu, 9 Apr 2015 18:33:54 +0000 (UTC)
Message-Id: <20150409183345.5BB952600C5@galant.ogmig.net>
Date: Thu, 9 Apr 2015 19:33:45 +0100 (BST)
From: plunky@ogmig.net
Reply-To: plunky@ogmig.net
To: gnats-bugs@NetBSD.org
Subject: kernel panic on close of ucom tty
X-Send-Pr-Version: 3.95
>Number: 49831
>Category: kern
>Synopsis: closing a ucom(4) tty causes a kernel diagnostic panic
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: skrll
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Apr 09 18:35:00 +0000 2015
>Last-Modified: Thu Jun 30 18:25:01 +0000 2016
>Originator: Iain Hibbert
>Release: NetBSD 7.99.7, 2015-03-22 snapshot
>Organization:
none
>Environment:
System: NetBSD galant.ogmig.net 7.99.7 NetBSD 7.99.7 (GALANT.i386) #0: Mon Mar 23 08:17:43 GMT 2015 plunky@galant.ogmig.net:/var/work/NetBSD-current/obj/sys/arch/i386/compile/GALANT i386
(the kernel is basically GENERIC, with BLUETOOTH_DEBUG and a couple of other options added)
Architecture: i386
Machine: i386
The machine is a ThinkPad T60, with Core Duo T2500 CPU
>Description:
When closing a /dev/ttyU0 device, the system panics with a diagnostic assertion
>How-To-Repeat:
I have a Targus PA088E USB->RS232 adapter. When I plug it in, I see
ehci0: handing over full speed device on port 1 to uhci0
umct0 at uhub0 port 1
umct0: Targus Group Intl Targus Group Intl, rev 1.10/1.03, addr 3
ucom0 at umct0
Then, I connect to the tty device from a shell
% cu -l /dev/ttyU0
Connected
If something is connected to the serial port, it seems you can use it as normal;
whether anything is connected does not seem to matter for the purpose of this PR.
Then disconnect using the escape sequence ~. and see the following backtrace (typed in from a photo)
Kernel diagnostic assertion "xfer->pipe->intrxfer == xfer" failed: file "/var/cvs/NetBSD-current/src/sys/dev/usb/uhci.c", line 2453
fatal breakpoint trap in supervisor mode
..
Stopped in pid 16004.1 (cu) at
db{1}> bt
breakpoint()
vpanic()
kern_assert()
uhci_device_intr_abort()
usbd_abort_pipe()
ucom_cleanup()
ucomclose()
cdev_close()
spec_close()
VOP_CLOSE()
vn_close()
vn_closefile()
closef()
fd_free()
exit1()
sys_exit()
syscall()
--- syscall (number 1) ---
bbb7afd7
db{1}>
I note that this KASSERT() was added by skrll@ in r1.264
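The backtrace above reaches ucomclose() through the close of the file descriptor when cu exits, so the steps might be reducible to a plain open followed by a close. The following is a minimal sketch under that assumption; it has not been confirmed against this panic. The helper name open_and_close and the hold_seconds parameter are hypothetical, and /dev/ttyU0 is simply the device node used in this report.

```python
import os
import sys
import time


def open_and_close(path, hold_seconds=0.0):
    """Open a tty node, optionally hold it open, then close it again.

    Hypothetical helper for illustration; not part of any NetBSD tool.
    """
    # O_NONBLOCK avoids blocking in open() while waiting for carrier;
    # O_NOCTTY keeps the device from becoming our controlling terminal.
    fd = os.open(path, os.O_RDWR | os.O_NONBLOCK | os.O_NOCTTY)
    try:
        time.sleep(hold_seconds)
    finally:
        # If this is the last open reference, the kernel runs the
        # driver's close routine here.
        os.close(fd)
    return True


if __name__ == "__main__":
    # Assumption: the adapter attached as ucom0, as in this report.
    device = sys.argv[1] if len(sys.argv) > 1 else "/dev/ttyU0"
    open_and_close(device, hold_seconds=1.0)
    print("opened and closed", device)
```

If the assumption holds, running this against the attached adapter on an affected DIAGNOSTIC kernel would be expected to hit the same assertion without involving cu at all.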
>Fix:
One workaround: just pulling the USB cable works fine and dumps you out of cu,
I guess because the ordering of the teardown is different.
>Release-Note:
>Audit-Trail:
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: re: kern/49831: kernel panic on close of ucom tty
Date: Fri, 10 Apr 2015 06:49:04 +1000
hmmm, i don't see this with umsc(4) or uplcom(4). i wonder if there
is something specific to umct(4)..
.mrg.
Responsible-Changed-From-To: kern-bug-people->skrll
Responsible-Changed-By: skrll@NetBSD.org
Responsible-Changed-When: Fri, 10 Apr 2015 07:36:34 +0000
Responsible-Changed-Why:
Talke
From: Iain Hibbert <plunky@ogmig.net>
To: Nick Hudson <skrll@netbsd.org>, gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/49831: kernel panic on close of ucom tty
Date: Sat, 25 Jun 2016 09:13:51 +0100 (BST)
On Thu, 9 Jun 2016, Nick Hudson wrote:
> On 05/08/16 20:53, Iain Hibbert wrote:
> > On Sun, 8 May 2016, Nick Hudson wrote:
> >
> > > Hi,
> > >
> > > I have a fix that I've been working on
> > >
> > > Can you test it if I send the diff through?
> >
> > I got a new laptop (upgraded to T61p) and have only just installed amd64
> > (I just slipped the old HD in and ran i386 for a while).. I'll see if I
> > can copy an i386 live-image to a USB stick and boot from that on my T60
> > (which has no disk now). Can't do it tonight though, perhaps this week ..
> > please send patch at least?
>
> sorry, for the delay. I've added the diffs to my branch and created i386/amd64
> GENERIC kernels
> for you. They're here:
>
> http://www.netbsd.org/~skrll/nhusb.i386.netbsd
> http://www.netbsd.org/~skrll/nhusb.amd64.netbsd
Hi
Sorry for the delay on my part too!
as said, I'm on a new laptop - I still have the old one in storage but no
disk at this time, and my USB stick is defunct, so I need another
anyway, I tried this on the T61p running a standard NetBSD amd64 7.99.29
kernel and had a slightly different process but similar result. I plugged
the serial/parallel adapter in and it attached as normal
uhub7 at uhub4 port 3: vendor 0517 product 0606, class 9/0, rev 2.00/7.02, addr 2
uhub7: single transaction translator
uhub7: 2 ports with 0 removable, self powered
ulpt0 at uhub7 port 1 configuration 1 interface 0
ulpt0: Prolific Technology Inc. IEEE-1284 Controller, rev 1.00/2.02, addr 3, iclass 7/1
ulpt0: using bi-directional mode
uplcom0 at uhub7 port 2
uplcom0: Prolific Technology Inc. USB-Serial Controller, rev 1.10/3.00, addr 4
ucom0 at uplcom0
I then connected to the com port (nothing plugged in) with 'cu -l
/dev/ttyU0' and disconnected ok .. tried again, and the system froze I was
unable to see anything else -- bit of blind typing and I managed to
reboot, all I could get from the message buffer was
ulpt0: detached
ulpt0: at uhub7 port 1 (addr 3) disconnected
panic: kernel diagnostic assertion "xfer->ux_state == XFER_BUSY" failed: file "/var/cvs/NetBSD-current/src/sys/dev/usb/ehci.c", line 1573 xfer 0xfffffe8107880960 state 158
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff80114ae5 cs 8 rflags 246 cr2 75b607107068 ilevel 0 rsp fffffe8040defb00
curlwp 0xfffffe81078865a0 pid 0.55 lowest kstack 0xfffffe8040dec2c0
rebooting...
so anyway, I booted your amd64 kernel and I have a slightly different
problem with that, because the nouveau driver doesn't quite work on this
laptop so the screen is unreadable (though I can see green/white blobs so
I know its working ok inside :)
anyway, the same process (although in single-user mode) and no panic. So,
I'd say that something is good with your code!
iain
From: Nick Hudson <skrll@netbsd.org>
To: gnats-bugs@NetBSD.org, skrll@NetBSD.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, plunky@ogmig.net
Cc:
Subject: Re: kern/49831: kernel panic on close of ucom tty
Date: Sat, 25 Jun 2016 09:41:41 +0100
On 06/25/16 09:15, Iain Hibbert wrote:
> anyway, I tried this on the T61p running a standard NetBSD amd64 7.99.29
> kernel and had a slightly different process but similar result. I plugged
> the serial/parallel adapter in and it attached as normal
>
> uhub7 at uhub4 port 3: vendor 0517 product 0606, class 9/0, rev 2.00/7.02, addr 2
> uhub7: single transaction translator
> uhub7: 2 ports with 0 removable, self powered
> ulpt0 at uhub7 port 1 configuration 1 interface 0
> ulpt0: Prolific Technology Inc. IEEE-1284 Controller, rev 1.00/2.02, addr 3, iclass 7/1
> ulpt0: using bi-directional mode
> uplcom0 at uhub7 port 2
> uplcom0: Prolific Technology Inc. USB-Serial Controller, rev 1.10/3.00, addr 4
> ucom0 at uplcom0
oh, so no umct(4) anymore?
This device is still interesting, however.
>
> I then connected to the com port (nothing plugged in) with 'cu -l
> /dev/ttyU0' and disconnected ok .. tried again, and the system froze I was
> unable to see anything else -- bit of blind typing and I managed to
> reboot, all I could get from the message buffer was
>
> ulpt0: detached
> ulpt0: at uhub7 port 1 (addr 3) disconnected
> panic: kernel diagnostic assertion "xfer->ux_state == XFER_BUSY" failed: file "/var/cvs/NetBSD-current/src/sys/dev/usb/ehci.c", line 1573 xfer 0xfffffe8107880960 state 158
More like a problem in the ulpt(4) code. I'll take a look.
>
> fatal breakpoint trap in supervisor mode
> trap type 1 code 0 rip ffffffff80114ae5 cs 8 rflags 246 cr2 75b607107068 ilevel 0 rsp fffffe8040defb00
> curlwp 0xfffffe81078865a0 pid 0.55 lowest kstack 0xfffffe8040dec2c0
> rebooting...
>
> so anyway, I booted your amd64 kernel and I have a slightly different
> problem with that, because the nouveau driver doesn't quite work on this
> laptop so the screen is unreadable (though I can see green/white blobs so
> I know its working ok inside :)
>
> anyway, the same process (although in single-user mode) and no panic. So,
> I'd say that something is good with your code!
Bit early to say that without a umct(4) I'm afraid
>
> iain
>
Thanks for testing
Nick
From: Iain Hibbert <plunky@ogmig.net>
To: Nick Hudson <skrll@netbsd.org>, gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/49831: kernel panic on close of ucom tty
Date: Sat, 25 Jun 2016 09:51:07 +0100 (BST)
On Sat, 25 Jun 2016, Iain Hibbert wrote:
> anyway, I tried this on the T61p running a standard NetBSD amd64 7.99.29
> kernel and had a slightly different process but similar result. I plugged
> the serial/parallel adapter in and it attached as normal
gah please disregard that. I was using the wrong adapter!
Ok so I tried this again, with the right adapter and the NetBSD amd64
7.99.29 kernel that I am already using. Same result as originally
ehci0: handing over full speed device on port 3 to uhci1
umct0 at uhub2 port 1
umct0: Targus Group Intl Targus Group Intl, rev 1.10/1.03, addr 2
ucom0 at umct0
attaches as normal, but with 'cu -l /dev/ttyU0' upon disconnect it fails
(at least this laptop doesn't wipe its message buffer on reboot)
panic: kernel diagnostic assertion "xfer->ux_pipe->up_intrxfer == xfer" failed: file "/var/cvs/NetBSD-current/src/sys/dev/usb/uhci.c", line 2878
cpu0: Begin traceback...
vpanic() at netbsd:vpanic+0x13c
valid_user_selector() at netbsd:valid_user_selector
uhci_device_intr_abort() at netbsd:uhci_device_intr_abort+0x89
usbd_ar_pipe() at netbsd:usbd_ar_pipe+0x37
usbd_abort_pipe() at netbsd:usbd_abort_pipe+0x27
ucom_cleanup() at netbsd:ucom_cleanup+0xa2
ucomclose() at netbsd:ucomclose+0xe2
spec_close() at netbsd:spec_close+0x12c
VOP_CLOSE() at netbsd:VOP_CLOSE+0x33
vn_close() at netbsd:vn_close+0x33
closef() at netbsd:closef+0x54
fd_free() at netbsd:fd_free+0xcc
exit1() at netbsd:exit1+0x10d
sys_exit() at netbsd:sys_exit+0x39
syscall() at netbsd:syscall+0xbe
--- syscall (number 1) ---
78d004d13f2a:
cpu0: End traceback...
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff80114ae5 cs 8 rflags 246 cr2 78d004f5ee10 ilevel 0 rsp fffffe8040e49b78
curlwp 0xfffffe813ad6b240 pid 15.1 lowest kstack 0xfffffe8040e462c0
dumping to dev 0,1 (offset=8, size=1031757):
dump succeeded
and I do have a core dump, if that is helpful.
So then, I tried your amd64 kernel - disabled nouveau in userconf so I
could see what was going on and booted single-user
NetBSD 7.99.29 (GENERIC) #21: Thu Jun 9 06:18:11 BST 2016
nick@zoom:/wrk/nhusb/obj.amd64/wrk/nhusb/src/sys/arch/amd64/compile/GENERIC
and the device attaches ok,
ehci0: handing over full speed device on port 3 to uhci1
umct0 at uhub2 port 1
umct0: Targus Group Intl Targus Group Intl, rev 1.10/1.03, addr 2
ucom0 at umct0
but when I connect to the device (cu -l /dev/ttyU0) it says Connected but
when I pressed a button, immediately page fault.
fatal page fault in supervisor mode
trap type 6 code 0 rip ffffffff80335434 cs 8 rflags 10246 cr2 0 ilevel 4 rsp fffffe8040007e58
curlwp 0xfffffe813bb32420 pid 0.3 lowest kstack 0xfffffe80400042c0
dumping to dev 0,1 (offset=8, size=1031757):
dump succeeded
although.. I didn't get a savecore this time (I don't know enough about
that). There was more information shown on the screen about the panic,
but it didn't make it into the message buffer, hmm.. a photo of the
screen is here
http://www.netbsd.org/~plunky/nhusb.screen.0.jpg
regards,
iain
From: Nick Hudson <skrll@netbsd.org>
To: gnats-bugs@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
plunky@ogmig.net
Cc:
Subject: Re: kern/49831: kernel panic on close of ucom tty
Date: Tue, 28 Jun 2016 14:28:18 +0100
On 06/25/16 09:55, Iain Hibbert wrote:
> but when I connect to the device (cu -l /dev/ttyU0) it says Connected but
> when I pressed a button, immediately page fault.
>
> fatal page fault in supervisor mode
> trap type 6 code 0 rip ffffffff80335434 cs 8 rflags 10246 cr2 0 ilevel 4 rsp fffffe8040007e58
> curlwp 0xfffffe813bb32420 pid 0.3 lowest kstack 0xfffffe80400042c0
Please try
http://www.netbsd.org/~skrll/nhusb.amd64.2.netbsd
Thanks,
Nick
From: Iain Hibbert <plunky@ogmig.net>
To: Nick Hudson <skrll@netbsd.org>, gnats-bugs@NetBSD.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc:
Subject: Re: kern/49831: kernel panic on close of ucom tty
Date: Tue, 28 Jun 2016 21:16:14 +0100 (BST)
On Tue, 28 Jun 2016, Nick Hudson wrote:
> Please try
>
> http://www.netbsd.org/~skrll/nhusb.amd64.2.netbsd
Hmm.. boot single user with nouveau disabled .. then plug the device in,
and immediate panic in ucom_attach()
umct0 at uhub1 port 1
umct0: Targus Group Intl Targus Group Intl, rev 1.10/1.03, addr 2
ucom0 at umct0
panic: kernel diagnostic assertion "ucaa->ucaa_bulkin != -1 || (ucaa->ucaa_ipipe && ucaa->ucaa_ixfer)" failed: file "/wrk/nhusb/src/sys/dev/usb/ucom.c", line 271
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff80114c15 cs 8 rflags 246 cr2 706682946000 ilevel 0 rsp fffffe8040defa00
curlwp 0xfffffe81077e6160 pid 0.53 lowest kstack 0xfffffe8040dec2c0
Stopped in pid 0.53 (system) at netbsd:breakpoint+0x5: leave
db{1}>
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x140
cd_play_msf() at netbsd:cd_play_msf
ucom_attach() at netbsd:ucom_attach+0x85a
config_attach_loc() at netbsd:config_attach_loc+0x17a
config_found_sm_loc() at netbsd:config_found_sm_loc+0x48
umct_attach() at netbsd:umct_attach+0x23a
config_attach_loc() at netbsd:config_attach_loc+0x17a
config_found_sm_loc() at netbsd:config_found_sm_loc+0x48
usbd_attachwholedevice() at netbsd:usbd_attachwholedevice+0x8e
usbd_probe_and_attach() at netbsd:usbd_probe_and_attach+0x46
usbd_new_device() at netbsd:usbd_new_device+0xf0d
uhub_explore() at netbsd:uhub_explore+0x2f4
usb_discover() at netbsd:usb_discover+0x6f
usb_event_thread() at netbsd:usb_event_thread+0x238
db{1}>
I got a backtrace this time, though no core .. and also I tried with
hw.ucom.debug=1 but no additional information was shown
iain
From: Nick Hudson <skrll@netbsd.org>
To: Iain Hibbert <plunky@ogmig.net>, gnats-bugs@NetBSD.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc:
Subject: Re: kern/49831: kernel panic on close of ucom tty
Date: Wed, 29 Jun 2016 08:06:27 +0100
On 06/28/16 21:16, Iain Hibbert wrote:
> On Tue, 28 Jun 2016, Nick Hudson wrote:
>
>> Please try
>>
>> http://www.netbsd.org/~skrll/nhusb.amd64.2.netbsd
Oops,
http://www.netbsd.org/~skrll/nhusb.amd64.3.netbsd
is ready for you to try
Nick
From: Iain Hibbert <plunky@ogmig.net>
To: Nick Hudson <skrll@netbsd.org>, gnats-bugs@NetBSD.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc:
Subject: Re: kern/49831: kernel panic on close of ucom tty
Date: Thu, 30 Jun 2016 19:19:38 +0100 (BST)
On Wed, 29 Jun 2016, Nick Hudson wrote:
> http://www.netbsd.org/~skrll/nhusb.amd64.3.netbsd
>
> is ready for you to try
Hi
(I found it, was not in public directory :-)
with this one, some progress
NetBSD 7.99.29 (GENERIC) #26: Wed Jun 29 07:53:42 BST 2016
nick@zoom:/wrk/nhusb/obj.amd64/wrk/nhusb/src/sys/arch/amd64/compile/GENERIC
[...]
ehci0: handing over full speed device on port 3 to uhci1
umct0 at uhub1 port 1
umct0: Targus Group Intl Targus Group Intl, rev 1.10/1.03, addr 2
ucom0 at umct0
at this point the device attaches and detaches ok. With nothing connected
to the remote end I can open the device 'cu -l /dev/ttyU0' and disconnect
just fine; I did this several times with no error
Then I connected a device to the RS232 port, and connected again.. it
opened fine, but as soon as I tried to send data I got a panic assertion
panic: kernel debugging assertion "cp == ub->ub_data" failed: file "/wrk/nhusb/src/sys/dev/usb/ucom.c", line 1485
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff80114c15 cs 8 rflags 246 cr2 7e86e2d5ee10 ilevel 5 rsp fffffe8040013d90
curlwp 0xfffffe813bb2b440 pid 0.6 lowest kstack 0xfffffe80400102c0
Stopped in pid 0.6 (system) at netbsd:breakpoint+0x5: leave
db{0}>
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x140
cd_play_msf() at netbsd:cd_play_msf
ucomreadcb() at netbsd:ucomreadcb+0x3fc
usb_transfer_complete() at netbsd:usb_transfer_complete+0x39f
uhci_softintr() at netbsd:uhci_softintr+0x14d
usb_soft_intr() at netbsd:usb_soft_intr+0x1f
softint_dispatch() at netbsd:softint_dispatch+0xd3
DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xfffffe8040013ff0
Xsoftintr() at netbsd:Xsoftintr+0x4f
--- interrupt ---
0:
db{0}>
thanks
iain
>Unformatted: