NetBSD Problem Report #49831

From plunky@ogmig.net  Thu Apr  9 18:33:54 2015
Return-Path: <plunky@ogmig.net>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id EF77FA6558
	for <gnats-bugs@gnats.NetBSD.org>; Thu,  9 Apr 2015 18:33:54 +0000 (UTC)
Message-Id: <20150409183345.5BB952600C5@galant.ogmig.net>
Date: Thu,  9 Apr 2015 19:33:45 +0100 (BST)
From: plunky@ogmig.net
Reply-To: plunky@ogmig.net
To: gnats-bugs@NetBSD.org
Subject: kernel panic on close of ucom tty
X-Send-Pr-Version: 3.95

>Number:         49831
>Category:       kern
>Synopsis:       closing a ucom(4) tty causes a kernel diagnostic panic
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    skrll
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Apr 09 18:35:00 +0000 2015
>Last-Modified:  Thu Jun 30 18:25:01 +0000 2016
>Originator:     Iain Hibbert
>Release:        NetBSD 7.99.7, 2015-03-22 snapshot
>Organization:
	none
>Environment:


System: NetBSD galant.ogmig.net 7.99.7 NetBSD 7.99.7 (GALANT.i386) #0: Mon Mar 23 08:17:43 GMT 2015 plunky@galant.ogmig.net:/var/work/NetBSD-current/obj/sys/arch/i386/compile/GALANT i386

(the kernel is basically GENERIC, with BLUETOOTH_DEBUG and a couple of other options added)

Architecture: i386
Machine: i386

The machine is a ThinkPad T60, with Core Duo T2500 CPU
>Description:
  When closing a /dev/ttyU0 device, the system panics with a diagnostic assertion
>How-To-Repeat:
  I have a Targus PA088E USB->RS232 adapter here. I plug it in and see

ehci0: handing over full speed device on port 1 to uhci0
umct0 at uhub0 port 1
umct0: Targus Group Intl Targus Group Intl, rev 1.10/1.03, addr 3
ucom0 at umct0

  Then, I connect to the tty device from a shell

% cu -l /dev/ttyU0
Connected

  If something is connected, it seems you can use it as normal, but for the purpose
  of this PR it does not seem to matter whether anything is connected. Then attempt to
  disconnect using the escape code ~. and see the following backtrace (typed in from a photo)

Kernel diagnostic assertion "xfer->pipe->intrxfer == xfer" failed: file "/var/cvs/NetBSD-current/src/sys/dev/usb/uhci.c", line 2453
fatal breakpoint trap in supervisor mode
..
Stopped in pid 16004.1 (cu) at
db{1}> bt
breakpoint()
vpanic()
kern_assert()
uhci_device_intr_abort()
usbd_abort_pipe()
ucom_cleanup()
ucomclose()
cdev_close()
spec_close()
VOP_CLOSE()
vn_close()
vn_closefile()
closef()
fd_free()
exit1()
sys_exit()
syscall()
--- syscall (number 1) ---
bbb7afd7
db{1}>

I note that this KASSERT() was added to uhci.c by skrll@ in r1.264
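
For reference, the condition in the failed assertion can be modelled outside
the kernel. The following is a minimal sketch in C, using hypothetical model_*
types and functions that are not the real usbd_xfer/usbd_pipe structures and
not the uhci.c code. It only shows that the check holds while the pipe still
records the transfer being aborted, and fails once that record has been cleared
or replaced. It does not establish which ordering goes wrong in the real driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_pipe;

struct model_xfer {
	struct model_pipe *pipe;	/* pipe this transfer was queued on */
};

struct model_pipe {
	struct model_xfer *intrxfer;	/* interrupt transfer the pipe remembers */
};

/* Queue an interrupt transfer: link the transfer and the pipe to each other. */
void
model_intr_start(struct model_pipe *pipe, struct model_xfer *xfer)
{
	xfer->pipe = pipe;
	pipe->intrxfer = xfer;
}

/* Forget the pipe's interrupt transfer (assumed to happen on some close path). */
void
model_intr_forget(struct model_pipe *pipe)
{
	pipe->intrxfer = NULL;
}

/* The condition named in the panic message: xfer->pipe->intrxfer == xfer. */
bool
model_intr_abort_ok(const struct model_xfer *xfer)
{
	assert(xfer != NULL && xfer->pipe != NULL);
	return xfer->pipe->intrxfer == xfer;
}
```

In this model, calling model_intr_abort_ok() right after model_intr_start()
returns true, while calling it after model_intr_forget() returns false, which
is the situation the KASSERT() reports.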

>Fix:
One workaround is that just pulling the USB cable works fine and dumps you out of cu.
I guess because the ordering is different.

>Release-Note:

>Audit-Trail:
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org
Subject: re: kern/49831: kernel panic on close of ucom tty
Date: Fri, 10 Apr 2015 06:49:04 +1000

 hmmm, i don't see this with umsc(4) or uplcom(4).  i wonder if there
 is something specific to umct(4)..


 .mrg.

Responsible-Changed-From-To: kern-bug-people->skrll
Responsible-Changed-By: skrll@NetBSD.org
Responsible-Changed-When: Fri, 10 Apr 2015 07:36:34 +0000
Responsible-Changed-Why:
Talke


From: Iain Hibbert <plunky@ogmig.net>
To: Nick Hudson <skrll@netbsd.org>, gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/49831: kernel panic on close of ucom tty
Date: Sat, 25 Jun 2016 09:13:51 +0100 (BST)

 On Thu, 9 Jun 2016, Nick Hudson wrote:

 > On 05/08/16 20:53, Iain Hibbert wrote:
 > > On Sun, 8 May 2016, Nick Hudson wrote:
 > > 
 > > > Hi,
 > > > 
 > > > I have a fix that I've been working on
 > > > 
 > > > Can you test it if I send the diff through?
 > > 
 > > I got a new laptop (upgraded to T61p) and have only just installed amd64
 > > (I just slipped the old HD in and ran i386 for a while).. I'll see if I
 > > can copy an i386 live-image to a USB stick and boot from that on my T60
 > > (which has no disk now). Can't do it tonight though, perhaps this week ..
 > > please send patch at least?
 > 
 > Sorry for the delay. I've added the diffs to my branch and created
 > i386/amd64 GENERIC kernels for you.  They're here:
 > 
 >     http://www.netbsd.org/~skrll/nhusb.i386.netbsd
 >     http://www.netbsd.org/~skrll/nhusb.amd64.netbsd

 Hi

 Sorry for the delay on my part too!

 As I said, I'm on a new laptop. I still have the old one in storage but with no
 disk at this time, and my USB stick is defunct, so I need another.

 anyway, I tried this on the T61p running a standard NetBSD amd64 7.99.29 
 kernel and had a slightly different process but similar result. I plugged 
 the serial/parallel adapter in and it attached as normal

 uhub7 at uhub4 port 3: vendor 0517 product 0606, class 9/0, rev 2.00/7.02, addr 2
 uhub7: single transaction translator
 uhub7: 2 ports with 0 removable, self powered
 ulpt0 at uhub7 port 1 configuration 1 interface 0
 ulpt0: Prolific Technology Inc. IEEE-1284 Controller, rev 1.00/2.02, addr 3, iclass 7/1
 ulpt0: using bi-directional mode
 uplcom0 at uhub7 port 2
 uplcom0: Prolific Technology Inc. USB-Serial Controller, rev 1.10/3.00, addr 4
 ucom0 at uplcom0

 I then connected to the com port (nothing plugged in) with 'cu -l 
 /dev/ttyU0' and disconnected ok .. tried again, and the system froze I was 
 unable to see anything else -- bit of blind typing and I managed to 
 reboot, all I could get from the message buffer was

 ulpt0: detached
 ulpt0: at uhub7 port 1 (addr 3) disconnected
 panic: kernel diagnostic assertion "xfer->ux_state == XFER_BUSY" failed: file "/var/cvs/NetBSD-current/src/sys/dev/usb/ehci.c", line 1573 xfer 0xfffffe8107880960 state 158

 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 rip ffffffff80114ae5 cs 8 rflags 246 cr2 75b607107068 ilevel 0 rsp fffffe8040defb00
 curlwp 0xfffffe81078865a0 pid 0.55 lowest kstack 0xfffffe8040dec2c0
 rebooting...

 so anyway, I booted your amd64 kernel and I have a slightly different 
 problem with that, because the nouveau driver doesn't quite work on this 
 laptop so the screen is unreadable (though I can see green/white blobs so 
 I know it's working ok inside :)

 anyway, the same process (although in single-user mode) and no panic. So, 
 I'd say that something is good with your code!

 iain

From: Nick Hudson <skrll@netbsd.org>
To: gnats-bugs@NetBSD.org, skrll@NetBSD.org, gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org, plunky@ogmig.net
Cc: 
Subject: Re: kern/49831: kernel panic on close of ucom tty
Date: Sat, 25 Jun 2016 09:41:41 +0100

 On 06/25/16 09:15, Iain Hibbert wrote:


 >   anyway, I tried this on the T61p running a standard NetBSD amd64 7.99.29
 >   kernel and had a slightly different process but similar result. I plugged
 >   the serial/parallel adapter in and it attached as normal
 >   
 >   uhub7 at uhub4 port 3: vendor 0517 product 0606, class 9/0, rev 2.00/7.02, addr 2
 >   uhub7: single transaction translator
 >   uhub7: 2 ports with 0 removable, self powered
 >   ulpt0 at uhub7 port 1 configuration 1 interface 0
 >   ulpt0: Prolific Technology Inc. IEEE-1284 Controller, rev 1.00/2.02, addr 3, iclass 7/1
 >   ulpt0: using bi-directional mode
 >   uplcom0 at uhub7 port 2
 >   uplcom0: Prolific Technology Inc. USB-Serial Controller, rev 1.10/3.00, addr 4
 >   ucom0 at uplcom0

 oh, so no umct(4) anymore?

 This device is still interesting, however.

 >   
 >   I then connected to the com port (nothing plugged in) with 'cu -l
 >   /dev/ttyU0' and disconnected ok .. tried again, and the system froze I was
 >   unable to see anything else -- bit of blind typing and I managed to
 >   reboot, all I could get from the message buffer was
 >   
 >   ulpt0: detached
 >   ulpt0: at uhub7 port 1 (addr 3) disconnected
 >   panic: kernel diagnostic assertion "xfer->ux_state == XFER_BUSY" failed: file "/var/cvs/NetBSD-current/src/sys/dev/usb/ehci.c", line 1573 xfer 0xfffffe8107880960 state 158

 More like a problem in the ulpt(4) code. I'll take a look.
 >   
 >   fatal breakpoint trap in supervisor mode
 >   trap type 1 code 0 rip ffffffff80114ae5 cs 8 rflags 246 cr2 75b607107068 ilevel 0 rsp fffffe8040defb00
 >   curlwp 0xfffffe81078865a0 pid 0.55 lowest kstack 0xfffffe8040dec2c0
 >   rebooting...
 >   
 >   so anyway, I booted your amd64 kernel and I have a slightly different
 >   problem with that, because the nouveau driver doesn't quite work on this
 >   laptop so the screen is unreadable (though I can see green/white blobs so
 >   I know it's working ok inside :)
 >   
 >   anyway, the same process (although in single-user mode) and no panic. So,
 >   I'd say that something is good with your code!

 Bit early to say that without a umct(4) I'm afraid

 >   
 >   iain
 >   
 Thanks for testing

 Nick

From: Iain Hibbert <plunky@ogmig.net>
To: Nick Hudson <skrll@netbsd.org>, gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/49831: kernel panic on close of ucom tty
Date: Sat, 25 Jun 2016 09:51:07 +0100 (BST)

 On Sat, 25 Jun 2016, Iain Hibbert wrote:

 > anyway, I tried this on the T61p running a standard NetBSD amd64 7.99.29 
 > kernel and had a slightly different process but similar result. I plugged 
 > the serial/parallel adapter in and it attached as normal

 gah please disregard that. I was using the wrong adapter!

 Ok so I tried this again, with the right adapter and the NetBSD amd64 
 7.99.29 kernel that I am already using. Same result as originally

 ehci0: handing over full speed device on port 3 to uhci1
 umct0 at uhub2 port 1
 umct0: Targus Group Intl Targus Group Intl, rev 1.10/1.03, addr 2
 ucom0 at umct0

 attaches as normal, but with 'cu -l /dev/ttyU0' upon disconnect it fails 
 (at least this laptop doesn't wipe its message buffer on reboot)

 panic: kernel diagnostic assertion "xfer->ux_pipe->up_intrxfer == xfer" failed: file "/var/cvs/NetBSD-current/src/sys/dev/usb/uhci.c", line 2878 
 cpu0: Begin traceback...
 vpanic() at netbsd:vpanic+0x13c
 valid_user_selector() at netbsd:valid_user_selector
 uhci_device_intr_abort() at netbsd:uhci_device_intr_abort+0x89
 usbd_ar_pipe() at netbsd:usbd_ar_pipe+0x37
 usbd_abort_pipe() at netbsd:usbd_abort_pipe+0x27
 ucom_cleanup() at netbsd:ucom_cleanup+0xa2
 ucomclose() at netbsd:ucomclose+0xe2
 spec_close() at netbsd:spec_close+0x12c
 VOP_CLOSE() at netbsd:VOP_CLOSE+0x33
 vn_close() at netbsd:vn_close+0x33
 closef() at netbsd:closef+0x54
 fd_free() at netbsd:fd_free+0xcc
 exit1() at netbsd:exit1+0x10d
 sys_exit() at netbsd:sys_exit+0x39
 syscall() at netbsd:syscall+0xbe
 --- syscall (number 1) ---
 78d004d13f2a:
 cpu0: End traceback...
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 rip ffffffff80114ae5 cs 8 rflags 246 cr2 78d004f5ee10 ilevel 0 rsp fffffe8040e49b78
 curlwp 0xfffffe813ad6b240 pid 15.1 lowest kstack 0xfffffe8040e462c0

 dumping to dev 0,1 (offset=8, size=1031757):
 dump succeeded

 and I do have a core dump, if that is helpful.

 So then, I tried your amd64 kernel - disabled nouveau in userconf so I 
 could see what was going on and booted single-user

 NetBSD 7.99.29 (GENERIC) #21: Thu Jun  9 06:18:11 BST 2016
 	nick@zoom:/wrk/nhusb/obj.amd64/wrk/nhusb/src/sys/arch/amd64/compile/GENERIC

 and the device attaches ok,

  ehci0: handing over full speed device on port 3 to uhci1
  umct0 at uhub2 port 1
  umct0: Targus Group Intl Targus Group Intl, rev 1.10/1.03, addr 2
  ucom0 at umct0

 but when I connect to the device (cu -l /dev/ttyU0) it says Connected but 
 when I pressed a button, immediately page fault.

 fatal page fault in supervisor mode
 trap type 6 code 0 rip ffffffff80335434 cs 8 rflags 10246 cr2 0 ilevel 4 rsp fffffe8040007e58
 curlwp 0xfffffe813bb32420 pid 0.3 lowest kstack 0xfffffe80400042c0

 dumping to dev 0,1 (offset=8, size=1031757):
 dump succeeded

 although.. didn't get a savecore this time (I don't know enough about 
 that) and there was more information shown on the screen about the panic 
 but it doesn't make it into the message buffer, hmm..  a photo of the 
 screen is here

 	http://www.netbsd.org/~plunky/nhusb.screen.0.jpg

 regards,
 iain

From: Nick Hudson <skrll@netbsd.org>
To: gnats-bugs@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
 plunky@ogmig.net
Cc: 
Subject: Re: kern/49831: kernel panic on close of ucom tty
Date: Tue, 28 Jun 2016 14:28:18 +0100

 On 06/25/16 09:55, Iain Hibbert wrote:


 >   but when I connect to the device (cu -l /dev/ttyU0) it says Connected but
 >   when I pressed a button, immediately page fault.
 >   
 >   fatal page fault in supervisor mode
 >   trap type 6 code 0 rip ffffffff80335434 cs 8 rflags 10246 cr2 0 ilevel 4 rsp fffffe8040007e58
 >   curlwp 0xfffffe813bb32420 pid 0.3 lowest kstack 0xfffffe80400042c0

 Please try

 http://www.netbsd.org/~skrll/nhusb.amd64.2.netbsd

 Thanks,
 Nick

From: Iain Hibbert <plunky@ogmig.net>
To: Nick Hudson <skrll@netbsd.org>, gnats-bugs@NetBSD.org, 
    gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: kern/49831: kernel panic on close of ucom tty
Date: Tue, 28 Jun 2016 21:16:14 +0100 (BST)

 On Tue, 28 Jun 2016, Nick Hudson wrote:

 > Please try
 > 
 > http://www.netbsd.org/~skrll/nhusb.amd64.2.netbsd

 Hmm.. boot single user with nouveau disabled .. then plug the device in, 
 and immediate panic in ucom_attach()

 umct0 at uhub1 port 1
 umct0: Targus Group Intl Targus Group Intl, rev 1.10/1.03, addr 2
 ucom0 at umct0
 panic: kernel diagnostic assertion "ucaa->ucaa_bulkin != -1 || (ucaa->ucaa_ipipe && ucaa->ucaa_ixfer)" failed: file "/wrk/nhusb/src/sys/dev/usb/ucom.c", line 271 
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 rip ffffffff80114c15 cs 8 rflags 246 cr2 706682946000 ilevel 0 rsp fffffe8040defa00
 curlwp 0xfffffe81077e6160 pid 0.53 lowest kstack 0xfffffe8040dec2c0
 Stopped in pid 0.53 (system) at	netbsd:breakpoint+0x5:	leave	
 db{1}>
 breakpoint() at netbsd:breakpoint+0x5
 vpanic() at netbsd:vpanic+0x140
 cd_play_msf() at netbsd:cd_play_msf
 ucom_attach() at netbsd:ucom_attach+0x85a
 config_attach_loc() at netbsd:config_attach_loc+0x17a
 config_found_sm_loc() at netbsd:config_found_sm_loc+0x48
 umct_attach() at netbsd:umct_attach+0x23a
 config_attach_loc() at netbsd:config_attach_loc+0x17a
 config_found_sm_loc() at netbsd:config_found_sm_loc+0x48
 usbd_attachwholedevice() at netbsd:usbd_attachwholedevice+0x8e
 usbd_probe_and_attach() at netbsd:usbd_probe_and_attach+0x46
 usbd_new_device() at netbsd:usbd_new_device+0xf0d
 uhub_explore() at netbsd:uhub_explore+0x2f4
 usb_discover() at netbsd:usb_discover+0x6f
 usb_event_thread() at netbsd:usb_event_thread+0x238
 db{1}> 

 I got a backtrace this time, though no core .. and also I tried with 
 hw.ucom.debug=1 but no additional information was shown

 iain

From: Nick Hudson <skrll@netbsd.org>
To: Iain Hibbert <plunky@ogmig.net>, gnats-bugs@NetBSD.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: kern/49831: kernel panic on close of ucom tty
Date: Wed, 29 Jun 2016 08:06:27 +0100

 On 06/28/16 21:16, Iain Hibbert wrote:
 > On Tue, 28 Jun 2016, Nick Hudson wrote:
 >
 >> Please try
 >>
 >> http://www.netbsd.org/~skrll/nhusb.amd64.2.netbsd

 Oops,

 http://www.netbsd.org/~skrll/nhusb.amd64.3.netbsd

 is ready for you to try

 Nick

From: Iain Hibbert <plunky@ogmig.net>
To: Nick Hudson <skrll@netbsd.org>, gnats-bugs@NetBSD.org, 
    gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: kern/49831: kernel panic on close of ucom tty
Date: Thu, 30 Jun 2016 19:19:38 +0100 (BST)

 On Wed, 29 Jun 2016, Nick Hudson wrote:

 > http://www.netbsd.org/~skrll/nhusb.amd64.3.netbsd
 > 
 > is ready for you to try

 Hi

 (I found it, was not in public directory :-)

 with this one, some progress

 NetBSD 7.99.29 (GENERIC) #26: Wed Jun 29 07:53:42 BST 2016
 	nick@zoom:/wrk/nhusb/obj.amd64/wrk/nhusb/src/sys/arch/amd64/compile/GENERIC
 [...]
 ehci0: handing over full speed device on port 3 to uhci1
 umct0 at uhub1 port 1
 umct0: Targus Group Intl Targus Group Intl, rev 1.10/1.03, addr 2
 ucom0 at umct0

 at this point the device attaches and detaches ok. With nothing connected 
 to the remote end I can open the device 'cu -l /dev/ttyU0' and disconnect 
 just fine; I did this several times with no error
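
The repeated open/disconnect cycle described here could also be driven without
cu. Below is a minimal sketch in C, assuming that a plain open(2)/close(2) on
the tty is enough to reach ucomclose(). cu also changes terminal settings, so
this may not exercise exactly the same path. open_close_cycles() is a
hypothetical helper name, not something from the NetBSD tree.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Open and close `path` up to `count` times.
 * Returns the number of cycles that completed without error and stops
 * at the first failure.
 */
int
open_close_cycles(const char *path, int count)
{
	int done = 0;

	assert(path != NULL && count >= 0);

	for (int i = 0; i < count; i++) {
		/*
		 * O_NONBLOCK: do not wait for carrier on a line with nothing
		 * attached. O_NOCTTY: do not make it the controlling tty.
		 */
		int fd = open(path, O_RDWR | O_NONBLOCK | O_NOCTTY);
		if (fd == -1) {
			fprintf(stderr, "open %s: %s\n", path, strerror(errno));
			break;
		}
		if (close(fd) == -1) {
			fprintf(stderr, "close %s: %s\n", path, strerror(errno));
			break;
		}
		done++;
	}
	return done;
}
```

Calling open_close_cycles("/dev/ttyU0", 5) from a small main() would attempt
five cycles on the adapter. On a kernel that still has the problem, the panic
would be expected during a close rather than reported as a return value.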

 Then I connected a device to the RS232 port, and connected again.. it 
 opened fine, but as soon as I tried to send data I got a panic assertion

 panic: kernel debugging assertion "cp == ub->ub_data" failed: file "/wrk/nhusb/src/sys/dev/usb/ucom.c", line 1485 
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 rip ffffffff80114c15 cs 8 rflags 246 cr2 7e86e2d5ee10 ilevel 5 rsp fffffe8040013d90
 curlwp 0xfffffe813bb2b440 pid 0.6 lowest kstack 0xfffffe80400102c0
 Stopped in pid 0.6 (system) at	netbsd:breakpoint+0x5:	leave	
 db{0}>
 breakpoint() at netbsd:breakpoint+0x5
 vpanic() at netbsd:vpanic+0x140
 cd_play_msf() at netbsd:cd_play_msf
 ucomreadcb() at netbsd:ucomreadcb+0x3fc
 usb_transfer_complete() at netbsd:usb_transfer_complete+0x39f
 uhci_softintr() at netbsd:uhci_softintr+0x14d
 usb_soft_intr() at netbsd:usb_soft_intr+0x1f
 softint_dispatch() at netbsd:softint_dispatch+0xd3
 DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xfffffe8040013ff0
 Xsoftintr() at netbsd:Xsoftintr+0x4f
 --- interrupt ---
 0:
 db{0}>

 thanks
 iain

>Unformatted:
