NetBSD Problem Report #57662

From www@netbsd.org  Sun Oct 15 20:43:16 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 7529D1A9238
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 15 Oct 2023 20:43:16 +0000 (UTC)
Message-Id: <20231015204314.2B9A41A9239@mollari.NetBSD.org>
Date: Sun, 15 Oct 2023 20:43:14 +0000 (UTC)
From: als@thangorodrim.ch
Reply-To: als@thangorodrim.ch
To: gnats-bugs@NetBSD.org
Subject: StarTech ICUSB23208FD 8-Port USB-to-Serial Adapter Hub fails on Alix with NetBSD/i386 9.3
X-Send-Pr-Version: www-1.0

>Number:         57662
>Category:       port-i386
>Synopsis:       StarTech ICUSB23208FD 8-Port USB-to-Serial Adapter Hub fails on Alix with NetBSD/i386 9.3
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    port-i386-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Oct 15 20:45:01 +0000 2023
>Last-Modified:  Mon Nov 13 08:20:01 +0000 2023
>Originator:     Alexander Schreiber
>Release:        
>Organization:
>Environment:
NetBSD khazad-dum 9.3 NetBSD 9.3 (KHAZAD_DUM) #3: Wed Aug 17 18:46:46 UTC 2022  root@khazad-dum:/net/build/khazad-dum/obj/sys/arch/i386/compile/KHAZAD_DUM i386

>Description:
I have acquired a StarTech 8-Port USB-to-Serial Adapter Hub, model number ICUSB23208FD with the intent of setting up a serial console server. Choosen machine is an alix2f13 (https://www.pcengines.ch/alix2f13.htm, LX800 AMD Geode CPU, Pentium Class).

Upon plugging in, the USB-to-Serial Adapter Hub is recognized:

Oct 15 14:48:59 khazad-dum /netbsd: [ 14167442.4086486] fatal breakpoint trap in supervisor mode
Oct 15 14:48:59 khazad-dum /netbsd: [ 14167442.4086486] trap type 1 code 0 eip 0xc0114bf4 cs 0x8 eflags 0x200202 cr2 0xb83d7860 ilevel 0x8 esp 0xcc8a9f6c
Oct 15 14:48:59 khazad-dum /netbsd: [ 14167442.4086486] curlwp 0xc0e80020 pid 0 lid 2 lowest kstack 0xcca062c0
Oct 15 14:48:59 khazad-dum /netbsd: uhub2 at uhub1 port 1: Terminus Technology (0x1a40) USB 2.0 Hub [MTT] (0x101), class 9/0, rev 2.00/1.00, addr 2
Oct 15 14:48:59 khazad-dum /netbsd: [ 14404697.6109689] uhub2: multiple transaction translators
Oct 15 14:48:59 khazad-dum /netbsd: [ 14404697.6109689] uhub2: 4 ports with 4 removable, self powered
Oct 15 14:49:01 khazad-dum /netbsd: [ 14404698.9188070] uftdi0 at uhub2 port 2 configuration 1 interface 0
Oct 15 14:49:01 khazad-dum /netbsd: [ 14404698.9295400] uftdi0: FTDI (0x403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 3
Oct 15 14:49:01 khazad-dum /netbsd: [ 14404698.9409797] ucom0 at uftdi0 portno 1
Oct 15 14:49:01 khazad-dum /netbsd: [ 14404698.9409797] uftdi1 at uhub2 port 2 configuration 1 interface 1
Oct 15 14:49:01 khazad-dum /netbsd: [ 14404698.9488820] uftdi1: FTDI (0x403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 3
Oct 15 14:49:01 khazad-dum /netbsd: [ 14404698.9597156] ucom1 at uftdi1 portno 2
Oct 15 14:49:01 khazad-dum /netbsd: [ 14404698.9597156] uftdi2 at uhub2 port 2 configuration 1 interface 2
Oct 15 14:49:01 khazad-dum /netbsd: [ 14404698.9690499] uftdi2: FTDI (0x403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 3
Oct 15 14:49:01 khazad-dum /netbsd: [ 14404698.9690499] ucom2 at uftdi2 portno 3
Oct 15 14:49:01 khazad-dum /netbsd: [ 14404698.9815384] uftdi3 at uhub2 port 2 configuration 1 interface 3
Oct 15 14:49:01 khazad-dum /netbsd: [ 14404698.9888359] uftdi3: FTDI (0x403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 3
Oct 15 14:49:01 khazad-dum /netbsd: [ 14404698.9888359] ucom3 at uftdi3 portno 4
Oct 15 14:49:02 khazad-dum /netbsd: [ 14404699.9689657] uftdi4 at uhub2 port 3 configuration 1 interface 0
Oct 15 14:49:02 khazad-dum /netbsd: [ 14404699.9804221] uftdi4: FTDI (0x403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 4
Oct 15 14:49:02 khazad-dum /netbsd: [ 14404699.9918598] ucom4 at uftdi4 portno 1
Oct 15 14:49:02 khazad-dum /netbsd: [ 14404699.9918598] uftdi5 at uhub2 port 3 configuration 1 interface 1
Oct 15 14:49:02 khazad-dum /netbsd: [ 14404699.9992033] uftdi5: FTDI (0x403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 4
Oct 15 14:49:02 khazad-dum /netbsd: [ 14404700.0106410] ucom5 at uftdi5 portno 2
Oct 15 14:49:02 khazad-dum /netbsd: [ 14404700.0106410] uftdi6 at uhub2 port 3 configuration 1 interface 2
Oct 15 14:49:02 khazad-dum /netbsd: [ 14404700.0199490] uftdi6: FTDI (0x403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 4
Oct 15 14:49:02 khazad-dum /netbsd: [ 14404700.0293960] ucom6 at uftdi6 portno 3
Oct 15 14:49:02 khazad-dum /netbsd: [ 14404700.0293960] uftdi7 at uhub2 port 3 configuration 1 interface 3
Oct 15 14:49:02 khazad-dum /netbsd: [ 14404700.0399267] uftdi7: FTDI (0x403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 4
Oct 15 14:49:02 khazad-dum /netbsd: [ 14404700.0399267] ucom7 at uftdi7 portno 4


(and yes, the "fatal breakpoint trap in supervisor mode" smells)

I can then connect via "cu -l /dev/ttyU0 -s 115200" to an attached device,
and disconnect via the "~." sequence. This works 1-2 times and then any further attempts to do so only yield "cu: link down" and then cu exiting.

I've trace cu and the relevant section from the trace dumped with
"kdump -R" is:

 12400      1 cu       0.000009779 CALL  open(0xbfb4b77a,6,0)
 12400      1 cu       0.000009779 NAMI  "/dev/ttyU0"
 12400      1 cu       12.994282345 RET   open -1 errno 5 Input/output error

Unplugging the device results in the USB devices disconnecting with some
delay:

Oct 15 18:38:57 khazad-dum /netbsd: [ 14418495.3727440] ucom0: detached
Oct 15 18:38:57 khazad-dum /netbsd: [ 14418495.3727440] uftdi0: detached
Oct 15 18:38:57 khazad-dum /netbsd: [ 14418495.3727440] uftdi0: at uhub2 port 2 (addr 3) disconnected
Oct 15 18:38:59 khazad-dum /netbsd: [ 14418497.3830518] ucom1: detached
Oct 15 18:38:59 khazad-dum /netbsd: [ 14418497.3830518] uftdi1: detached
Oct 15 18:38:59 khazad-dum /netbsd: [ 14418497.3830518] uftdi1: at uhub2 port 2 (addr 3) disconnected
Oct 15 18:39:01 khazad-dum /netbsd: [ 14418499.3933598] ucom2: detached
Oct 15 18:39:01 khazad-dum /netbsd: [ 14418499.3933598] uftdi2: detached
Oct 15 18:39:01 khazad-dum /netbsd: [ 14418499.3933598] uftdi2: at uhub2 port 2 (addr 3) disconnected
Oct 15 18:39:03 khazad-dum /netbsd: [ 14418501.4036678] ucom3: detached
Oct 15 18:39:03 khazad-dum /netbsd: [ 14418501.4036678] uftdi3: detached
Oct 15 18:39:03 khazad-dum /netbsd: [ 14418501.4036678] uftdi3: at uhub2 port 2 (addr 3) disconnected
Oct 15 18:39:06 khazad-dum /netbsd: [ 14418504.4141291] ucom4: detached
Oct 15 18:39:06 khazad-dum /netbsd: [ 14418504.4141291] uftdi4: detached
Oct 15 18:39:06 khazad-dum /netbsd: [ 14418504.4141291] uftdi4: at uhub2 port 3 (addr 4) disconnected
Oct 15 18:39:08 khazad-dum /netbsd: [ 14418506.4244371] ucom5: detached
Oct 15 18:39:08 khazad-dum /netbsd: [ 14418506.4244371] uftdi5: detached
Oct 15 18:39:08 khazad-dum /netbsd: [ 14418506.4244371] uftdi5: at uhub2 port 3 (addr 4) disconnected
Oct 15 18:39:10 khazad-dum /netbsd: [ 14418508.4347451] ucom6: detached
Oct 15 18:39:10 khazad-dum /netbsd: [ 14418508.4347451] uftdi6: detached
Oct 15 18:39:10 khazad-dum /netbsd: [ 14418508.4347451] uftdi6: at uhub2 port 3 (addr 4) disconnected
Oct 15 18:39:12 khazad-dum /netbsd: [ 14418510.4450534] ucom7: detached
Oct 15 18:39:12 khazad-dum /netbsd: [ 14418510.4450534] uftdi7: detached
Oct 15 18:39:12 khazad-dum /netbsd: [ 14418510.4450534] uftdi7: at uhub2 port 3 (addr 4) disconnected
Oct 15 18:39:15 khazad-dum /netbsd: [ 14418513.4555144] uhub2: detached
Oct 15 18:39:15 khazad-dum /netbsd: [ 14418513.4555144] uhub2: at uhub1 port 1 (addr 2) disconnected


Plugging it back in only yields an error after some time:

Oct 15 18:40:43 khazad-dum /netbsd: [ 14418601.0689389] uhub1: autoconfiguration error: device problem, disabling port 1


I've reinstalled the same machine with Linux (Debian bookworm) and it
works with no apparent issues. 

relevant dmesg snippet from that setup (mostly for h/w id):
[   46.587001] usbserial: USB Serial support registered for FTDI USB Serial Device
[   46.630146] ftdi_sio 2-2:1.0: FTDI USB Serial Device converter detected
[   46.650945] usb 2-2: Detected FT232R
[   46.668978] usb 2-2: FTDI USB Serial Device converter now attached to ttyUSB0
[   46.704109] ftdi_sio 1-1.2:1.0: FTDI USB Serial Device converter detected
[   46.927654] usb 1-1.2: Detected FT4232H
[   46.944386] usb 1-1.2: FTDI USB Serial Device converter now attached to ttyUSB1
[   46.964563] ftdi_sio 1-1.2:1.1: FTDI USB Serial Device converter detected
[   46.987101] usb 1-1.2: Detected FT4232H
[   46.998834] usb 1-1.2: FTDI USB Serial Device converter now attached to ttyUSB2
[   47.023395] ftdi_sio 1-1.2:1.2: FTDI USB Serial Device converter detected
[   47.043118] usb 1-1.2: Detected FT4232H
[   47.054859] usb 1-1.2: FTDI USB Serial Device converter now attached to ttyUSB3
[   47.079375] ftdi_sio 1-1.2:1.3: FTDI USB Serial Device converter detected
[   47.099113] usb 1-1.2: Detected FT4232H
[   47.111007] usb 1-1.2: FTDI USB Serial Device converter now attached to ttyUSB4
[   47.134324] ftdi_sio 1-1.3:1.0: FTDI USB Serial Device converter detected
[   47.154893] usb 1-1.3: Detected FT4232H
[   47.170720] usb 1-1.3: FTDI USB Serial Device converter now attached to ttyUSB5
[   47.191453] ftdi_sio 1-1.3:1.1: FTDI USB Serial Device converter detected
[   47.211397] usb 1-1.3: Detected FT4232H
[   47.226716] usb 1-1.3: FTDI USB Serial Device converter now attached to ttyUSB6
[   47.246908] ftdi_sio 1-1.3:1.2: FTDI USB Serial Device converter detected
[   47.266903] usb 1-1.3: Detected FT4232H
[   47.282737] usb 1-1.3: FTDI USB Serial Device converter now attached to ttyUSB7
[   47.303569] ftdi_sio 1-1.3:1.3: FTDI USB Serial Device converter detected
[   47.322895] usb 1-1.3: Detected FT4232H
[   47.338747] usb 1-1.3: FTDI USB Serial Device converter now attached to ttyUSB8


I've tried the same USB-to-Serial Adapter Hub on an APU (https://www.pcengines.ch/apu2.htm, AMD GX-412TC, amd64) running NetBSD/amd64 9.3 and it works without any apparent problems.

I currently have no other machine types running NetBSD/i386 to crosstest.

So it looks like this is confined to the combination of
- PCEngines Alix
- NetBSD/i386

I do not realistically expect a fix and this PR is just intended as documentation if someone else wants to use that device combo.

Note: while the NetBSD logs come from a machine running a (slightly, mostly
enabling diagnostics and dropping unneeded stuff) custom kernel, the initial installation having the same issues ran GENERIC from a fresh install.

>How-To-Repeat:
- acquire PEngines Alix
- install NetBSD/i386 9.3
- acquire StarTech ICUSB23208FD 8-Port USB-to-Serial Adapter Hub 
- plug above device into Alix
- try to use the above device as a serial console interface
  (e.g.: "cu -l /dev/ttyU8 -s 115200")


>Fix:
none

>Audit-Trail:
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@netbsd.org
Cc: port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
        netbsd-bugs@netbsd.org
Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Sun, 15 Oct 2023 23:31:07 +0200

 On Sun, Oct 15, 2023 at 08:45:01PM +0000, als@thangorodrim.ch wrote:
 > [...]
 > 
 > I can then connect via "cu -l /dev/ttyU0 -s 115200" to an attached device,
 > and disconnect via the "~." sequence. This works 1-2 times and then any further attempts to do so only yield "cu: link down" and then cu exiting.
 > 

 I've seen this with other USB a non-USB serial ports.
 Usually this is because the serial port is not completely closed (cu usually
 has 2 processes running, one may be left over when exiting).
 killing the leftover processes allows a new cu session to start

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Alexander Schreiber <als@thangorodrim.ch>
To: gnats-bugs@netbsd.org
Cc: port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Mon, 16 Oct 2023 01:51:48 +0200

 On Sun, Oct 15, 2023 at 09:35:02PM +0000, Manuel Bouyer wrote:
 > The following reply was made to PR port-i386/57662; it has been noted by GNATS.
 > 
 > From: Manuel Bouyer <bouyer@antioche.eu.org>
 > To: gnats-bugs@netbsd.org
 > Cc: port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
 >         netbsd-bugs@netbsd.org
 > Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 >  Adapter Hub fails on Alix with NetBSD/i386 9.3
 > Date: Sun, 15 Oct 2023 23:31:07 +0200
 > 
 >  On Sun, Oct 15, 2023 at 08:45:01PM +0000, als@thangorodrim.ch wrote:
 >  > [...]
 >  > 
 >  > I can then connect via "cu -l /dev/ttyU0 -s 115200" to an attached device,
 >  > and disconnect via the "~." sequence. This works 1-2 times and then any further attempts to do so only yield "cu: link down" and then cu exiting.
 >  > 
 >  
 >  I've seen this with other USB a non-USB serial ports.
 >  Usually this is because the serial port is not completely closed (cu usually
 >  has 2 processes running, one may be left over when exiting).
 >  killing the leftover processes allows a new cu session to start

 The fact that:
  - upon device plug-in, I get a "fatal breakpoint trap in supervisor mode"
  - upon removing, waiting a few min and plugging the device back
    in I get "uhub1: autoconfiguration error: device problem, disabling port 1"
 tells me that this might not be the problem.

 I suspect either some state corruption in the kernel driver or (actually
 more likely) incorrect handling of device quirks (quirk being a polite
 way of saying the device does something broken). This might be a combination
 of USB host issues (it appears to work fine on a different h/w with
 amd64) and NetBSD not having as much developer time to deal with devices
 behaving oddly (it works with Linux on the same h/w).

 -- 
 "Opportunity is missed by most people because it is dressed in overalls and
  looks like work."                                      -- Thomas A. Edison

From: "David H. Gutteridge" <david@gutteridge.ca>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Sun, 15 Oct 2023 21:28:05 -0400

 On Mon, 16 Oct 2023 at 01:51:48 +0200, Alexander Schreiber wrote:
 >The fact that:
 > - upon device plug-in, I get a "fatal breakpoint trap in supervisor mode"
 > - upon removing, waiting a few min and plugging the device back
 >   in I get "uhub1: autoconfiguration error: device problem, disabling por=
 t 1"
 >tells me that this might not be the problem.
 >
 >I suspect either some state corruption in the kernel driver or (actually
 >more likely) incorrect handling of device quirks (quirk being a polite
 >way of saying the device does something broken). This might be a combinati=
 on
 >of USB host issues (it appears to work fine on a different h/w with
 >amd64) and NetBSD not having as much developer time to deal with devices
 >behaving oddly (it works with Linux on the same h/w).

 You don't mention what USB controller details are specific to this
 hardware in question (e.g., dmesg excerpt of that part). Separately,
 have you tried a -current kernel?

 While I don't know if there's an overlap here, there have been issues
 (with memory fragmentation) that can cause USB devices to not work in
 some contexts (on some hardware). E.g., if I plug a USB 3 external
 hard drive into a Raspberry Pi 3B+ on 9.3_STABLE, it fails to attach,
 whereas there are no such issues on a typical amd64 laptop also
 running 9.3_STABLE (different arch, different USB controller, 8x
 difference in RAM). See for instance
 http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=3D54977

 Regards,

 Dave

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Alexander Schreiber <als@thangorodrim.ch>
Cc: gnats-bugs@netbsd.org, port-i386-maintainer@netbsd.org,
        gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Mon, 16 Oct 2023 09:01:17 +0200

 On Mon, Oct 16, 2023 at 01:51:48AM +0200, Alexander Schreiber wrote:
 > On Sun, Oct 15, 2023 at 09:35:02PM +0000, Manuel Bouyer wrote:
 > > The following reply was made to PR port-i386/57662; it has been noted by GNATS.
 > > 
 > > From: Manuel Bouyer <bouyer@antioche.eu.org>
 > > To: gnats-bugs@netbsd.org
 > > Cc: port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
 > >         netbsd-bugs@netbsd.org
 > > Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 > >  Adapter Hub fails on Alix with NetBSD/i386 9.3
 > > Date: Sun, 15 Oct 2023 23:31:07 +0200
 > > 
 > >  On Sun, Oct 15, 2023 at 08:45:01PM +0000, als@thangorodrim.ch wrote:
 > >  > [...]
 > >  > 
 > >  > I can then connect via "cu -l /dev/ttyU0 -s 115200" to an attached device,
 > >  > and disconnect via the "~." sequence. This works 1-2 times and then any further attempts to do so only yield "cu: link down" and then cu exiting.
 > >  > 
 > >  
 > >  I've seen this with other USB a non-USB serial ports.
 > >  Usually this is because the serial port is not completely closed (cu usually
 > >  has 2 processes running, one may be left over when exiting).
 > >  killing the leftover processes allows a new cu session to start
 > 
 > The fact that:
 >  - upon device plug-in, I get a "fatal breakpoint trap in supervisor mode"

 this I don't get. This should cause a panic or a drop to ddb,
 but in any case the kernel shouldn't keep running

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Alexander Schreiber <als@thangorodrim.ch>
To: gnats-bugs@netbsd.org
Cc: port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Mon, 16 Oct 2023 09:14:45 +0200

 On Mon, Oct 16, 2023 at 01:30:02AM +0000, David H. Gutteridge wrote:
 > The following reply was made to PR port-i386/57662; it has been noted by GNATS.
 > 
 > From: "David H. Gutteridge" <david@gutteridge.ca>
 > To: gnats-bugs@netbsd.org
 > Cc: 
 > Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 >  Adapter Hub fails on Alix with NetBSD/i386 9.3
 > Date: Sun, 15 Oct 2023 21:28:05 -0400
 > 
 >  On Mon, 16 Oct 2023 at 01:51:48 +0200, Alexander Schreiber wrote:
 >  >The fact that:
 >  > - upon device plug-in, I get a "fatal breakpoint trap in supervisor mode"
 >  > - upon removing, waiting a few min and plugging the device back
 >  >   in I get "uhub1: autoconfiguration error: device problem, disabling por=
 >  t 1"
 >  >tells me that this might not be the problem.
 >  >
 >  >I suspect either some state corruption in the kernel driver or (actually
 >  >more likely) incorrect handling of device quirks (quirk being a polite
 >  >way of saying the device does something broken). This might be a combinati=
 >  on
 >  >of USB host issues (it appears to work fine on a different h/w with
 >  >amd64) and NetBSD not having as much developer time to deal with devices
 >  >behaving oddly (it works with Linux on the same h/w).
 >  
 >  You don't mention what USB controller details are specific to this
 >  hardware in question (e.g., dmesg excerpt of that part). Separately,
 >  have you tried a -current kernel?

 I haven't tried -current, no.

 As for machine info, here is the full dmesg:

 [     1.000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
 [     1.000000]     2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
 [     1.000000]     2018, 2019, 2020, 2021, 2022
 [     1.000000]     The NetBSD Foundation, Inc.  All rights reserved.
 [     1.000000] Copyright (c) 1982, 1986, 1989, 1991, 1993
 [     1.000000]     The Regents of the University of California.  All rights reserved.

 [     1.000000] NetBSD 9.3 (KHAZAD_DUM) #3: Wed Aug 17 18:46:46 UTC 2022
 [     1.000000]         root@khazad-dum.angband.thangorodrim.de:/net/build/khazad-dum/obj/sys/arch/i386/compile/KHAZAD_DUM
 [     1.000000] total memory = 255 MB
 [     1.000000] avail memory = 242 MB
 [     1.000000] rnd: seeded with 256 bits
 [     1.000000] timecounter: Timecounters tick every 10.000 msec
 [     1.000000] Kernelized RAIDframe activated
 [     1.000000] running cgd selftest aes-xts-256 aes-xts-512 done
 [     1.000000] timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
 [     1.000003] Generic PC
 [     1.000003] mainbus0 (root)
 [     1.000003] Firmware Error (ACPI): A valid RSDP was not found (20190405/tbxfroot-261)
 [     1.000003] autoconfiguration error: acpi_probe: failed to initialize tables
 [     1.000003] ACPI Error: Could not remove SCI handler (20190405/evmisc-312)
 [     1.000003] cpu0 at mainbus0
 [     1.000003] cpu0: Geode(TM) Integrated Processor by AMD PCS, id 0x5a2
 [     1.000003] cpu0: package 0, core 0, smt 0
 [     1.000003] pci0 at mainbus0 bus 0: configuration mode 1
 [     1.000003] pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
 [     1.000003] pchb0 at pci0 dev 1 function 0: vendor 1022 product 2080 (rev. 0x33)
 [     1.000003] glxsb0 at pci0 dev 1 function 2: RNG AES
 [     1.000003] vr0 at pci0 dev 9 function 0: vendor 1106 product 3053 (rev. 0x96)
 [     1.000003] vr0: interrupting at irq 10
 [     1.000003] vr0: Ethernet address 00:0d:b9:19:a3:44
 [     1.000003] ukphy0 at vr0 phy 1: OUI 0x0002c6, model 0x0034, rev. 3
 [     1.000003] ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
 [     1.000003] vr1 at pci0 dev 11 function 0: vendor 1106 product 3053 (rev. 0x96)
 [     1.000003] vr1: interrupting at irq 15
 [     1.000003] vr1: Ethernet address 00:0d:b9:19:a3:45
 [     1.000003] ukphy1 at vr1 phy 1: OUI 0x0002c6, model 0x0034, rev. 3
 [     1.000003] ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
 [     1.000003] gcscpcib0 at pci0 dev 15 function 0: vendor 1022 product 2090 (rev. 0x03)
 [     1.000003] timecounter: Timecounter "gcscpcib0" frequency 3579545 Hz quality 1000
 [     1.006160] gcscpcib0: Watchdog Timer via MFGPT0, GPIO
 [     1.006160] gpio0 at gcscpcib0: 32 pins
 [     1.006160] viaide0 at pci0 dev 15 function 2: AMD CS5536 IDE Controller (rev. 0x01)
 [     1.006160] viaide0: bus-master DMA support present
 [     1.006160] viaide0: primary channel wired to compatibility mode
 [     1.006160] viaide0: primary channel interrupting at irq 14
 [     1.006160] atabus0 at viaide0 channel 0
 [     1.006160] viaide0: secondary channel wired to compatibility mode
 [     1.006160] viaide0: secondary channel ignored (disabled)
 [     1.006160] ohci0 at pci0 dev 15 function 4: vendor 1022 product 2094 (rev. 0x02)
 [     1.006160] ohci0: interrupting at irq 12
 [     1.006160] ohci0: OHCI version 1.0, legacy support
 [     1.006160] usb0 at ohci0: USB revision 1.0
 [     1.006160] gcscehci0 at pci0 dev 15 function 5: vendor 1022 product 2095 (rev. 0x02)
 [     1.006160] gcscehci0: interrupting at irq 12
 [     1.006160] gcscehci0: EHCI version 1.0
 [     1.006160] gcscehci0: 1 companion controller, 4 ports: ohci0
 [     1.006160] usb1 at gcscehci0: USB revision 2.0
 [     1.006160] isa0 at gcscpcib0
 [     1.006160] com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
 [     1.006160] com0: console
 [     1.006160] com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
 [     1.006160] attimer0 at isa0 port 0x40-0x43
 [     1.006160] pcppi0 at isa0 port 0x61
 [     1.006160] midi0 at pcppi0: PC speaker
 [     1.006160] sysbeep0 at pcppi0
 [     1.006160] attimer0: attached to pcppi0
 [     1.006160] timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
 [     1.599963] uhub0 at usb0: NetBSD (0000) OHCI root hub (0000), class 9/0, rev 1.00/1.00, addr 1
 [     1.610805] uhub0: 4 ports with 4 removable, self powered
 [     1.610805] uhub1 at usb1: NetBSD (0000) EHCI root hub (0000), class 9/0, rev 2.00/1.00, addr 1
 [     1.610805] uhub1: 4 ports with 4 removable, self powered
 [     1.619966] IPsec: Initialized Security Association Processing.
 [     4.590009] wd0 at atabus0 drive 0
 [     4.590009] wd0: <SanDisk SDCFX3-008G>
 [     4.602862] wd0: drive supports 4-sector PIO transfers, LBA addressing
 [     4.602862] wd0: 7815 MB, 15880 cyl, 16 head, 63 sec, 512 bytes/sect x 16007040 sectors
 [     4.610660] wd0: 32-bit data port
 [     4.610660] wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4 (Ultra/66)
 [     4.610660] wd0(viaide0:0:0): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA)
 [     4.620014] WARNING: 1 error while detecting hardware; check system log.
 [     4.630593] boot device: wd0
 [     4.630593] root on wd0a dumps on wd0b
 [     4.640012] root file system type: ffs
 [     4.640012] kern.module.path=/stand/i386/9.3/modules

 >  
 >  While I don't know if there's an overlap here, there have been issues
 >  (with memory fragmentation) that can cause USB devices to not work in
 >  some contexts (on some hardware). E.g., if I plug a USB 3 external
 >  hard drive into a Raspberry Pi 3B+ on 9.3_STABLE, it fails to attach,
 >  whereas there are no such issues on a typical amd64 laptop also
 >  running 9.3_STABLE (different arch, different USB controller, 8x
 >  difference in RAM). See for instance
 >  http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=3D54977

 Invalid PR number, I assume you meant 54977? And while I can see memory
 fragmentation being an issue with block devices, I wouldn't expect a
 serial device to need that much memory to run into it, especially on
 a freshly booted machine that runs little else (sshd, ntpd). Although
 yes, this machine only does have 256M of RAM, so .. maybe?

 Kind regards,
            Alex.
 -- 
 "Opportunity is missed by most people because it is dressed in overalls and
  looks like work."                                      -- Thomas A. Edison

From: "David H. Gutteridge" <david@gutteridge.ca>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Mon, 23 Oct 2023 00:02:02 -0400

 On Mon, 16 Oct 2023 at 09:14:45 +0200, Alexander Schreiber wrote:
 [...]
 >Invalid PR number, I assume you meant 54977? And while I can see memory
 >fragmentation being an issue with block devices, I wouldn't expect a
 >serial device to need that much memory to run into it, especially on
 >a freshly booted machine that runs little else (sshd, ntpd). Although
 >yes, this machine only does have 256M of RAM, so .. maybe?

 Yes, that's Gnats not handling escaped/encoded content correctly,
 should indeed be 54977.

 There's perhaps more than one issue occurring here. While the final
 "autoconfiguration" error you mentioned is quite generic, part of why I
 mentioned the potential overlap is because that's the same symptom I
 see when I'm able to initially mount a umass device successfully upon
 initial boot, but some time later, the same device will be rejected
 with that error. Could be completely different causes, of course.

 I'd suggest testing with a -current kernel. (Sorry, I don't have any
 particularly useful insights here; I do use USB-serial adapters, but
 nothing quite like that.)

 Regards,

 Dave

From: Alexander Schreiber <als@thangorodrim.ch>
To: gnats-bugs@netbsd.org
Cc: port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Mon, 23 Oct 2023 21:34:56 +0200

 On Mon, Oct 23, 2023 at 04:05:01AM +0000, David H. Gutteridge wrote:
 > The following reply was made to PR port-i386/57662; it has been noted by GNATS.
 > 
 > From: "David H. Gutteridge" <david@gutteridge.ca>
 > To: gnats-bugs@netbsd.org
 > Cc: 
 > Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 >  Adapter Hub fails on Alix with NetBSD/i386 9.3
 > Date: Mon, 23 Oct 2023 00:02:02 -0400
 > 
 >  On Mon, 16 Oct 2023 at 09:14:45 +0200, Alexander Schreiber wrote:
 >  [...]
 >  >Invalid PR number, I assume you meant 54977? And while I can see memory
 >  >fragmentation being an issue with block devices, I wouldn't expect a
 >  >serial device to need that much memory to run into it, especially on
 >  >a freshly booted machine that runs little else (sshd, ntpd). Although
 >  >yes, this machine only does have 256M of RAM, so .. maybe?
 >  
 >  Yes, that's Gnats not handling escaped/encoded content correctly,
 >  should indeed be 54977.
 >  
 >  There's perhaps more than one issue occurring here. While the final
 >  "autoconfiguration" error you mentioned is quite generic, part of why I
 >  mentioned the potential overlap is because that's the same symptom I
 >  see when I'm able to initially mount a umass device successfully upon
 >  initial boot, but some time later, the same device will be rejected
 >  with that error. Could be completely different causes, of course.
 >  
 >  I'd suggest testing with a -current kernel. (Sorry, I don't have any
 >  particularly useful insights here; I do use USB-serial adapters, but
 >  nothing quite like that.)

 And I just did, I grabbed NetBSD 10.0_BETA i386 boot iso from:

 http://nycdn.netbsd.org/pub/NetBSD-daily/netbsd-10/202310212330Z/i386/installation/cdrom/boot.iso

 and installed it on another one of those Alix machines I've got kicking
 around. Somewhat similar behaviour. I can do a few sessions with cu
 (3-5 seems to be typical) and then, with the next session, cu just hangs
 (sitting in status D, no output and so far I've been willing to wait a few
 min but to no avail).

 top says:

 CPU states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle

 Memory: 57M Act, 11M Wired, 13M Exec, 31M File, 138M Free
 Swap: 512M Total, 512M Free / Pools: 22M Used

 and again, nothing in dmesg. 

 uname -a says:
 NetBSD machine-name 10.0_BETA NetBSD 10.0_BETA (GENERIC) #0: Sat Oct 21 13:01:00 UTC 2023  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/i386/compile/GENERIC i386

 Kind regards,
             Alex.
 -- 
 "Opportunity is missed by most people because it is dressed in overalls and
  looks like work."                                      -- Thomas A. Edison

From: matthew green <mrg@eterna.com.au>
To: Alexander Schreiber <als@thangorodrim.ch>
Cc: port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org, gnats-bugs@netbsd.org
Subject: re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Tue, 24 Oct 2023 07:40:43 +1100

 Alexander Schreiber writes:
 > And I just did, I grabbed NetBSD 10.0_BETA i386 boot iso from:
 >
 > http://nycdn.netbsd.org/pub/NetBSD-daily/netbsd-10/202310212330Z/i386/in=
 stallation/cdrom/boot.iso
 >
 > and installed it on another one of those Alix machines I've got kicking
 > around. Somewhat similar behaviour. I can do a few sessions with cu
 > (3-5 seems to be typical) and then, with the next session, cu just hangs
 > (sitting in status D, no output and so far I've been willing to wait a f=
 ew
 > min but to no avail).
 >
 > top says:
 >
 > CPU states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100%=
  idle
 >
 > Memory: 57M Act, 11M Wired, 13M Exec, 31M File, 138M Free
 > Swap: 512M Total, 512M Free / Pools: 22M Used
 >
 > and again, nothing in dmesg. =


 so cu(1) is hung, let's find out why.  you can use crash(8)
 to do this fairly easily.

 # crash
 crash> ps|grep cu
 26171 26171 3   1       180   ffffa11149cde640                 cu poll
 crash> bt/a ffffa11149cde640
 trace: pid 26171 lid 26171 at 0xffffac019fc24d00
 sleepq_block() at sleepq_block+0x15e
 sel_do_scan() at sel_do_scan+0x62a
 pollcommon() at pollcommon+0xd2
 sys_poll() at sys_poll+0x57
 syscall() at syscall+0x1ae
 --- syscall (number 209) ---
 syscall+0x1ae:

 where the argument to "bt/a" is the address ps provided.
 in my example, cu(1) is sitting in poll() like i'd expect,
 can you show what your cu(1) backtraces are?

 from the back trace we'll have an idea on the next step.

 thanks.


 .mrg.

From: Alexander Schreiber <als@thangorodrim.ch>
To: matthew green <mrg@eterna.com.au>
Cc: port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, gnats-bugs@netbsd.org
Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Wed, 25 Oct 2023 01:06:51 +0200

 On Tue, Oct 24, 2023 at 07:40:43AM +1100, matthew green wrote:
 > Alexander Schreiber writes:
 > > And I just did, I grabbed NetBSD 10.0_BETA i386 boot iso from:
 > >
 > > http://nycdn.netbsd.org/pub/NetBSD-daily/netbsd-10/202310212330Z/i386/installation/cdrom/boot.iso
 > >
 > > and installed it on another one of those Alix machines I've got kicking
 > > around. Somewhat similar behaviour. I can do a few sessions with cu
 > > (3-5 seems to be typical) and then, with the next session, cu just hangs
 > > (sitting in status D, no output and so far I've been willing to wait a few
 > > min but to no avail).
 > >
 > > top says:
 > >
 > > CPU states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
 > >
 > > Memory: 57M Act, 11M Wired, 13M Exec, 31M File, 138M Free
 > > Swap: 512M Total, 512M Free / Pools: 22M Used
 > >
 > > and again, nothing in dmesg. 
 > 
 > so cu(1) is hung, let's find out why.  you can use crash(8)
 > to do this fairly easily.
 > 
 > # crash
 > crash> ps|grep cu
 > 26171 26171 3   1       180   ffffa11149cde640                 cu poll
 > crash> bt/a ffffa11149cde640
 > trace: pid 26171 lid 26171 at 0xffffac019fc24d00
 > sleepq_block() at sleepq_block+0x15e
 > sel_do_scan() at sel_do_scan+0x62a
 > pollcommon() at pollcommon+0xd2
 > sys_poll() at sys_poll+0x57
 > syscall() at syscall+0x1ae
 > --- syscall (number 209) ---
 > syscall+0x1ae:
 > 
 > where the argument to "bt/a" is the address ps provided.
 > in my example, cu(1) is sitting in poll() like i'd expect,
 > can you show what your cu(1) backtraces are?
 > 
 > from the back trace we'll have an idea on the next step.

 Did that. Interestingly, what I did this time:
  - did one cu session, terminated it normally
  - got distracted for at least an hour
  - did another cu session, that promptly hung

 invoked crash:

 console# crash
 Crash version 10.0_BETA, image version 10.0_BETA.
 Kernel compiled without options LOCKDEBUG.
 Output from a running system is unreliable.
 crash> ps|grep cu
 1524  1524 3   0   1000000           c260c640                 cu usbxfer
 0       17 3   0       200           c1eaa100           lnxrcugc lnxrcugc
 crash> bt/a c260c640
 trace: pid 1524 lid 1524 at 0xce80ab18
 sleepq_block(0,0,c14afda4,c14afda4,0,c1d43100,c1ec1808,c1d43100,c1f0ce80,c1d4310
 0) at sleepq_block+0x138
 cv_wait(c1edbe84,c1ec1808,c1edbe78,1ee4c25,0,2,23d6c,0,ce80abf8,ce80abf8) at cv_
 wait+0xc6
 usbd_transfer(c1edbe78,0,0,0,ce80abb0,0,8,8,8,c1edbe78) at usbd_transfer+0x15c
 usbd_do_request_len(c1f0ce00,ce80abf8,0,0,0,0,1388,ce80ac3c,c03ed0df,c1f0ce00) a
 t usbd_do_request_len+0xad
 usbd_do_request(c1f0ce00,ce80abf8,0,40,1,0,4af,c1eaf00c,dd313180,ce80ac70) at us
 bd_do_request+0x3f
 uftdi_open(c1f14340,1,0,c1ef5c00,0,0,4200,1841,0,c1ef5c00) at uftdi_open+0x4e
 ucomopen(4200,0,7,2000,c260c640,c0d28fe2,c22d17c0,0,c1f153c0,100) at ucomopen+0x
 307
 cdev_open(4200,0,7,2000,c260c640,d,c22db040,42,c22b8dc0,0) at cdev_open+0x11c
 spec_open(ce80ad90,0,c116aed8,c22db040,7,c2cf8640,7,c1fc7000,ce80ae9c,c0d18b64) 
 at spec_open+0x1ef
 VOP_OPEN(c22db040,7,c2cf8640,7,c22db040,c2cf8640,c22db040,0,c1f1a590,c1fdd000) a
 t VOP_OPEN+0x3e
 vn_open(0,c1f1a590,10,7,0,ce80aed0,ce80aecb,ce80aed4,c1f1a590,c260c640) at vn_op
 en+0x2ff
 do_open(c260c640,0,c1f1a590,6,0,ce80af34,c260b5c0,0,c1f1a590,c260c640) at do_ope
 n+0xa1
 do_sys_openat(6,0,ce80af34,c14af694,ce80af9c,c049b4c6,c260c640,ce80af68,ce80af60
 ,c1f09b00) at do_sys_openat+0x72
 sys_open(c260c640,ce80af68,ce80af60,c1f09b00,0,5,ce80af60,ce80af68,0,0) at sys_o
 pen+0x2c
 syscall() at syscall+0x1d6
 --- syscall (number 5) ---
 b8f54857:

 I hope that is helpful. Let me know what else I can do to help.

 Kind regards,
            Alex.
 -- 
 "Opportunity is missed by most people because it is dressed in overalls and
  looks like work."                                      -- Thomas A. Edison

From: matthew green <mrg@eterna.com.au>
To: Alexander Schreiber <als@thangorodrim.ch>
Cc: port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org, gnats-bugs@netbsd.org
Subject: re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Thu, 26 Oct 2023 08:45:34 +1100

 > Did that. Interestingly, what I did this time:
 >  - did one cu session, terminated it normally
 >  - got distracted for at least an hour
 >  - did another cu session, that promptly hung
 >
 > invoked crash:
 >
 > console# crash
 > Crash version 10.0_BETA, image version 10.0_BETA.
 > Kernel compiled without options LOCKDEBUG.
 > Output from a running system is unreliable.
 > crash> ps|grep cu
 > 1524  1524 3   0   1000000           c260c640                 cu usbxfer
 > 0       17 3   0       200           c1eaa100           lnxrcugc lnxrcug=
 c
 > crash> bt/a c260c640
 > trace: pid 1524 lid 1524 at 0xce80ab18
 > sleepq_block(0,0,c14afda4,c14afda4,0,c1d43100,c1ec1808,c1d43100,c1f0ce80=
 ,c1d43100) at sleepq_block+0x138
 > cv_wait(c1edbe84,c1ec1808,c1edbe78,1ee4c25,0,2,23d6c,0,ce80abf8,ce80abf8=
 ) at cv_wait+0xc6
 > usbd_transfer(c1edbe78,0,0,0,ce80abb0,0,8,8,8,c1edbe78) at usbd_transfer=
 +0x15c
 > usbd_do_request_len(c1f0ce00,ce80abf8,0,0,0,0,1388,ce80ac3c,c03ed0df,c1f=
 0ce00) at usbd_do_request_len+0xad
 > usbd_do_request(c1f0ce00,ce80abf8,0,40,1,0,4af,c1eaf00c,dd313180,ce80ac7=
 0) at usbd_do_request+0x3f
 > uftdi_open(c1f14340,1,0,c1ef5c00,0,0,4200,1841,0,c1ef5c00) at uftdi_open=
 +0x4e
 > ucomopen(4200,0,7,2000,c260c640,c0d28fe2,c22d17c0,0,c1f153c0,100) at uco=
 mopen+0x307
 [ ... ]
 >
 > I hope that is helpful. Let me know what else I can do to help.

 OK, so seems like a transfer isn't completing like it should.

 can you build a kernel with USB_DEBUG, UFTDI_DEBUG, which ever of
 XHCI_DEBUG / UHCI_DEBUG / OHCI_DEBUG / EHCI_DEBUG the ufdti is
 attached to, reboot without them plugged in, set the various
 sysctl that will be eg hw.usb.debug / hw.xhci.debug / etc, to 1,
 plug the devies in and try to trigger a hang.  when it hangs, use
 "vmstat -u usbhist" to obtain the usb history which should provide
 a bunch of useful data to what has happened.

 usbd_xfer_schedule_timeout() should have setup a timeout for this,
 the default would be 5 i think with this via usbd_do_request().


 .mrg.

From: Alexander Schreiber <als@thangorodrim.ch>
To: matthew green <mrg@eterna.com.au>
Cc: port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, gnats-bugs@netbsd.org
Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Fri, 27 Oct 2023 22:41:27 +0200

 On Thu, Oct 26, 2023 at 08:45:34AM +1100, matthew green wrote:
 > > Did that. Interestingly, what I did this time:
 > >  - did one cu session, terminated it normally
 > >  - got distracted for at least an hour
 > >  - did another cu session, that promptly hung
 > >
 > > invoked crash:
 > >
 > > console# crash
 > > Crash version 10.0_BETA, image version 10.0_BETA.
 > > Kernel compiled without options LOCKDEBUG.
 > > Output from a running system is unreliable.
 > > crash> ps|grep cu
 > > 1524  1524 3   0   1000000           c260c640                 cu usbxfer
 > > 0       17 3   0       200           c1eaa100           lnxrcugc lnxrcugc
 > > crash> bt/a c260c640
 > > trace: pid 1524 lid 1524 at 0xce80ab18
 > > sleepq_block(0,0,c14afda4,c14afda4,0,c1d43100,c1ec1808,c1d43100,c1f0ce80,c1d43100) at sleepq_block+0x138
 > > cv_wait(c1edbe84,c1ec1808,c1edbe78,1ee4c25,0,2,23d6c,0,ce80abf8,ce80abf8) at cv_wait+0xc6
 > > usbd_transfer(c1edbe78,0,0,0,ce80abb0,0,8,8,8,c1edbe78) at usbd_transfer+0x15c
 > > usbd_do_request_len(c1f0ce00,ce80abf8,0,0,0,0,1388,ce80ac3c,c03ed0df,c1f0ce00) at usbd_do_request_len+0xad
 > > usbd_do_request(c1f0ce00,ce80abf8,0,40,1,0,4af,c1eaf00c,dd313180,ce80ac70) at usbd_do_request+0x3f
 > > uftdi_open(c1f14340,1,0,c1ef5c00,0,0,4200,1841,0,c1ef5c00) at uftdi_open+0x4e
 > > ucomopen(4200,0,7,2000,c260c640,c0d28fe2,c22d17c0,0,c1f153c0,100) at ucomopen+0x307
 > [ ... ]
 > >
 > > I hope that is helpful. Let me know what else I can do to help.
 > 
 > OK, so seems like a transfer isn't completing like it should.
 > 
 > can you build a kernel with USB_DEBUG, UFTDI_DEBUG, which ever of
 > XHCI_DEBUG / UHCI_DEBUG / OHCI_DEBUG / EHCI_DEBUG the ufdti is
 > attached to, reboot without them plugged in, set the various
 > sysctl that will be eg hw.usb.debug / hw.xhci.debug / etc, to 1,
 > plug the devies in and try to trigger a hang.  when it hangs, use
 > "vmstat -u usbhist" to obtain the usb history which should provide
 > a bunch of useful data to what has happened.
 > 
 > usbd_xfer_schedule_timeout() should have setup a timeout for this,
 > the default would be 5 i think with this via usbd_do_request().

 Ok, done.

 This time it managed to get stuck right at the second connection attempt:
  - establish first connection: cu -l /dev/ttyU0 -s 115200
    --> I get the expected "Connected" and can interact with the
        attached serial console, then I disconnect 
  - establish second connection: cu -l /dev/ttyU1 -s 115200
    --> no output from cu, no response to the ~. disconnect sequence
    --> wait a minute or so, then vmstat -u usbhist output dumped to file

 diff -urN ./sys/arch/i386/conf/GENERIC ./sys/arch/i386/conf/CONSOLE
 --- ./sys/arch/i386/conf/GENERIC        2023-10-09 23:33:46.000000000 +0000
 +++ ./sys/arch/i386/conf/CONSOLE        2023-10-26 12:50:55.582874835 +0000
 @@ -1519,3 +1519,12 @@
  options        PAX_MPROTECT=1          # PaX mprotect(2) restrictions
  options        PAX_MPROTECT_DEBUG=1    # PaX mprotect debug
  options        PAX_ASLR=1              # PaX Address Space Layout Randomization
 +
 +options         USB_DEBUG
 +options         UFTDI_DEBUG
 +options         UHCI_DEBUG
 +options         XHCI_DEBUG
 +options         OHCI_DEBUG
 +options         EHCI_DEBUG
 +

 sysctl -a|grep debug|grep hw|grep -E '(usb|hci)'
 hw.usb.debug = 1
 hw.uhci.debug = 1
 hw.ohci.debug = 1
 hw.ehci.debug = 1
 hw.xhci.debug = 1

 dmesg when the device was plugged in:

 Oct 27 19:31:36 console /netbsd: [ 339.2244729] uhub2 at uhub1 port 1: Terminus Technology (0x1a40) USB 2.0 Hub [MTT] (0x0101), class 9/0, rev 2.00/1.00, addr 2
 Oct 27 19:31:36 console /netbsd: [ 339.2397117] uhub2: multiple transaction translators
 Oct 27 19:31:36 console /netbsd: [ 339.2445789] uhub2: 4 ports with 4 removable, self powered
 Oct 27 19:31:37 console /netbsd: [ 340.0845023] uftdi0 at uhub2 port 2 configuration 1 interface 0
 Oct 27 19:31:37 console /netbsd: [ 340.0845023] uftdi0: FTDI (0x0403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 3
 Oct 27 19:31:37 console /netbsd: [ 340.1002791] ucom0 at uftdi0 portno 1
 Oct 27 19:31:37 console /netbsd: [ 340.1044987] uftdi1 at uhub2 port 2 configuration 1 interface 1
 Oct 27 19:31:37 console /netbsd: [ 340.1044987] uftdi1: FTDI (0x0403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 3
 Oct 27 19:31:37 console /netbsd: [ 340.1183381] ucom1 at uftdi1 portno 2
 Oct 27 19:31:37 console /netbsd: [ 340.1183381] uftdi2 at uhub2 port 2 configuration 1 interface 2
 Oct 27 19:31:37 console /netbsd: [ 340.1253221] uftdi2: FTDI (0x0403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 3
 Oct 27 19:31:37 console /netbsd: [ 340.1364119] ucom2 at uftdi2 portno 3
 Oct 27 19:31:37 console /netbsd: [ 340.1364119] uftdi3 at uhub2 port 2 configuration 1 interface 3
 Oct 27 19:31:37 console /netbsd: [ 340.1453230] uftdi3: FTDI (0x0403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 3
 Oct 27 19:31:37 console /netbsd: [ 340.1544997] ucom3 at uftdi3 portno 4
 Oct 27 19:31:37 console /netbsd: [ 340.6545175] uftdi4 at uhub2 port 3 configuration 1 interface 0
 Oct 27 19:31:37 console /netbsd: [ 340.6545175] uftdi4: FTDI (0x0403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 4
 Oct 27 19:31:37 console /netbsd: [ 340.6710667] ucom4 at uftdi4 portno 1
 Oct 27 19:31:37 console /netbsd: [ 340.6746100] uftdi5 at uhub2 port 3 configuration 1 interface 1
 Oct 27 19:31:37 console /netbsd: [ 340.6746100] uftdi5: FTDI (0x0403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 4
 Oct 27 19:31:37 console /netbsd: [ 340.6891425] ucom5 at uftdi5 portno 2
 Oct 27 19:31:37 console /netbsd: [ 340.6891425] uftdi6 at uhub2 port 3 configuration 1 interface 2
 Oct 27 19:31:37 console /netbsd: [ 340.6961357] uftdi6: FTDI (0x0403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 4
 Oct 27 19:31:37 console /netbsd: [ 340.7072252] ucom6 at uftdi6 portno 3
 Oct 27 19:31:37 console /netbsd: [ 340.7072252] uftdi7 at uhub2 port 3 configuration 1 interface 3
 Oct 27 19:31:37 console /netbsd: [ 340.7147651] uftdi7: FTDI (0x0403) USB <-> Serial Cable (0x6011), rev 2.00/8.00, addr 4
 Oct 27 19:31:37 console /netbsd: [ 340.7252493] ucom7 at uftdi7 portno 4

 The output of "vmstat -u usbhist" is about 2.2M, and I'm not sure how gnats
 deals with either attachments or hilariously long emails, so I've parked
 that here: https://www.thangorodrim.ch/tmp/NetBSD10BETA_serial_vmstat-usbhist

 Kind regards,
            Alex.
 -- 
 "Opportunity is missed by most people because it is dressed in overalls and
  looks like work."                                      -- Thomas A. Edison

From: matthew green <mrg@eterna.com.au>
To: Alexander Schreiber <als@thangorodrim.ch>
Cc: port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org, gnats-bugs@netbsd.org, riastradh@netbsd.org
Subject: re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Sat, 28 Oct 2023 09:25:56 +1100

 OK, this at least agrees with what i thought was happening.  from

 > https://www.thangorodrim.ch/tmp/NetBSD10BETA_serial_vmstat-usbhist

 all the useful stuff is near the very bottom, you can see the parts
 of the 2nd to last successful transfer:

 1698435122.516436 usbd_transfer#608@0: called: xfer =3D 0xc217fe78, flags =
 =3D 0x2, pipe =3D 0xc3000100, running =3D 0
 1698435122.516442 ehci_device_ctrl_start#82@0: called!
 ...
 1698435122.516509 usbd_xfer_schedule_timeout#604@0: called: xfer 0xc217fe7=
 8 timeout 5000ms set 0 reset 0
 1698435122.516515 usbd_transfer#608@0: <- sleeping on xfer 0xc217fe78
 1698435122.516796 ehci_intr#609@0: called!
 ...
 1698435122.516958 usb_transfer_complete#613@0: <- done xfer 0xc217fe78, wa=
 keup

 but the next one doesn't get this far:

 1698435126.993608 usbd_transfer#609@0: called: xfer =3D 0xc217fe78, flags =
 =3D 0x2, pipe =3D 0xc3000100, running =3D 0
 1698435126.993612 ehci_device_ctrl_start#83@0: called!
 ...
 1698435126.993667 usbd_xfer_schedule_timeout#605@0: called: xfer 0xc217fe7=
 8 timeout 5000ms set 0 reset 0
 1698435126.993672 usbd_transfer#609@0: <- sleeping on xfer 0xc217fe78
 1698435131.987617 usbd_xfer_timeout#1@0: called: xfer 0xc217fe78 status 1

 but here it should complete with failure but the next logs are:

 1698435131.987621 usbd_xfer_probe_timeout#1@0: called: xfer 0xc217fe78 tim=
 eout 5000ms set 1 reset 0
 1698435131.987625 usbd_xfer_probe_timeout#1@0: xfer 0xc217fe78 timing out
 1698435131.987628 usbd_xfer_timeout#1@0: xfer 0xc217fe78 schedule timeout =
 task
 1698435131.987631 usb_add_task#1@0: called!
 1698435131.987652 usbd_xfer_timeout_task#1@0: called: xfer 0xc217fe78 stat=
 us 1
 1698435131.987655 usbd_xfer_probe_timeout#2@0: called: xfer 0xc217fe78 tim=
 eout 5000ms set 1 reset 0
 1698435131.987658 usbd_xfer_probe_timeout#2@0: xfer 0xc217fe78 timing out
 1698435131.987661 usbd_xfer_timeout_task#1@0: xfer 0xc217fe78 timed out
 1698435131.987664 ehci_abortx#2@0: called!
 1698435131.987667 ehci_abortx#2@0: xfer=3D0xc217fe78 pipe=3D0xc3000100
 1698435131.987675 ehci_sync_hc#8@0: called!
 1698435131.987679 ehci_sync_hc#8@0: cmd =3D 0x00020071 sts =3D 0x0000c008

 there's no completion event here, even though it timed out.  Taylor,
 any ideas off hand?


 .mrg.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Alexander Schreiber <als@thangorodrim.ch>
Cc: matthew green <mrg@eterna.com.au>,
	port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, gnats-bugs@netbsd.org
Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Sat, 28 Oct 2023 01:51:45 +0000

 This is a multi-part message in MIME format.
 --=_3GH/d8a4GiEuk6sY511/abKYWvHzDwXu

 Can you please try the attached patch and see if it makes a
 difference?

 Note that it might not fix whatever the underlying problem is, but it
 should at least cause the timeout on ehci sync to work instead of
 hanging forever.

 --=_3GH/d8a4GiEuk6sY511/abKYWvHzDwXu
 Content-Type: text/plain; charset="ISO-8859-1"; name="pr57662-ehcisyncloop"
 Content-Transfer-Encoding: quoted-printable
 Content-Disposition: attachment; filename="pr57662-ehcisyncloop.patch"

 From 2ccb4d42055737abfb68af0222d8871cbbc8cb6e Mon Sep 17 00:00:00 2001
 From: Taylor R Campbell <riastradh@NetBSD.org>
 Date: Sat, 28 Oct 2023 01:42:37 +0000
 Subject: [PATCH] ehci(4): Fix cv_timedwait loop in ehci_sync_hc.

 Stop when

 	now - starttime >=3D delta,

 i.e., when at least delta ticks have elapsed since the start, not
 when

 	endtime - now > delta,

 i.e., more than delta ticks _remain_ to sleep, which is never going
 to happen (except on arithmetic overflow).

 As is, what will happen in the case that should time out is that we
 wake up after delta ticks, and find now =3D getticks() is exactly
 endtime, so we retry cv_timedwait with timo=3D(endtime - now)=3D0 which
 means sleep indefinitely with no timeout as if with cv_wait.

 PR kern/57662

 XXX pullup-10
 ---
  sys/dev/usb/ehci.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/sys/dev/usb/ehci.c b/sys/dev/usb/ehci.c
 index 111957b34bd8..b59ad33b6395 100644
 --- a/sys/dev/usb/ehci.c
 +++ b/sys/dev/usb/ehci.c
 @@ -2277,7 +2277,7 @@ ehci_sync_hc(ehci_softc_t *sc)
  	 */
  	while (sc->sc_doorbelllwp =3D=3D curlwp) {
  		now =3D getticks();
 -		if (endtime - now > delta) {
 +		if (now - starttime >=3D delta) {
  			sc->sc_doorbelllwp =3D NULL;
  			cv_signal(&sc->sc_doorbell);
  			DPRINTF("doorbell timeout", 0, 0, 0, 0);

 --=_3GH/d8a4GiEuk6sY511/abKYWvHzDwXu--

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Alexander Schreiber <als@thangorodrim.ch>
Cc: matthew green <mrg@eterna.com.au>,
	port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, gnats-bugs@netbsd.org
Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Sat, 28 Oct 2023 02:05:02 +0000

 This is a multi-part message in MIME format.
 --=_iH22Mq8cEGadrlwrn55fhppMX7+AHtfW

 Also: Can you try this patch on top as well?

 --=_iH22Mq8cEGadrlwrn55fhppMX7+AHtfW
 Content-Type: text/plain; charset="ISO-8859-1"; name="pr57662-ehcisyncloop.2"
 Content-Transfer-Encoding: quoted-printable
 Content-Disposition: attachment; filename="pr57662-ehcisyncloop.2.patch"

 From fe71f2d88628e0a801d04fc4ad4b4667272c065f Mon Sep 17 00:00:00 2001
 From: Taylor R Campbell <riastradh@NetBSD.org>
 Date: Sat, 28 Oct 2023 02:01:10 +0000
 Subject: [PATCH] ehci(4): Fix bug causing missed wakeups since ehci.c 1.308.

 For reasons beyond me now, I used cv_signal on the same cv that is
 used to wait for the doorbell to be available _and_ to wait for the
 host controller to acknowledge the doorbell.  Which means when the
 host controller acknowledges the doorbell, we might wake some thread
 waiting for the doorbell to be available -- and leave the thread
 waiting for the doorbell acknowledgment hanging indefinitely.

 PR port-i386/57662

 XXX pullup-10
 ---
  sys/dev/usb/ehci.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

 diff --git a/sys/dev/usb/ehci.c b/sys/dev/usb/ehci.c
 index b59ad33b6395..3ffdec9c6657 100644
 --- a/sys/dev/usb/ehci.c
 +++ b/sys/dev/usb/ehci.c
 @@ -826,7 +826,7 @@ ehci_doorbell(void *addr)
  	if (sc->sc_doorbelllwp =3D=3D NULL)
  		DPRINTF("spurious doorbell interrupt", 0, 0, 0, 0);
  	sc->sc_doorbelllwp =3D NULL;
 -	cv_signal(&sc->sc_doorbell);
 +	cv_broadcast(&sc->sc_doorbell);
  	mutex_exit(&sc->sc_lock);
  }
 =20
 @@ -2279,7 +2279,7 @@ ehci_sync_hc(ehci_softc_t *sc)
  		now =3D getticks();
  		if (now - starttime >=3D delta) {
  			sc->sc_doorbelllwp =3D NULL;
 -			cv_signal(&sc->sc_doorbell);
 +			cv_broadcast(&sc->sc_doorbell);
  			DPRINTF("doorbell timeout", 0, 0, 0, 0);
  #ifdef DIAGNOSTIC		/* XXX DIAGNOSTIC abuse, do this differently */
  			printf("ehci_sync_hc: timed out\n");

 --=_iH22Mq8cEGadrlwrn55fhppMX7+AHtfW--

From: Alexander Schreiber <als@thangorodrim.ch>
To: Taylor R Campbell <riastradh@netbsd.org>
Cc: matthew green <mrg@eterna.com.au>, port-i386-maintainer@netbsd.org,
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
	gnats-bugs@netbsd.org
Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Sat, 28 Oct 2023 22:26:06 +0200

 On Sat, Oct 28, 2023 at 02:05:02AM +0000, Taylor R Campbell wrote:
 > Also: Can you try this patch on top as well?

 > >From fe71f2d88628e0a801d04fc4ad4b4667272c065f Mon Sep 17 00:00:00 2001
 > From: Taylor R Campbell <riastradh@NetBSD.org>
 > Date: Sat, 28 Oct 2023 02:01:10 +0000
 > Subject: [PATCH] ehci(4): Fix bug causing missed wakeups since ehci.c 1.308.
 > 
 > For reasons beyond me now, I used cv_signal on the same cv that is
 > used to wait for the doorbell to be available _and_ to wait for the
 > host controller to acknowledge the doorbell.  Which means when the
 > host controller acknowledges the doorbell, we might wake some thread
 > waiting for the doorbell to be available -- and leave the thread
 > waiting for the doorbell acknowledgment hanging indefinitely.
 > 
 > PR port-i386/57662
 > 
 > XXX pullup-10
 > ---
 >  sys/dev/usb/ehci.c | 4 ++--
 >  1 file changed, 2 insertions(+), 2 deletions(-)

 Both patches applied and things do look better. Now, instead of hanging
 forever, after the second (so, at the third) connection attempt, cu
 errors out with: "cu: link down" after a few seconds. This behavior then
 sticks.

 Relevant snippet from the trace:

   1699   1699 cu       0.000010616 CALL  open(0xbfb9a56e,6,0)
   1699   1699 cu       0.000011733 NAMI  "/dev/ttyU1"
   1699   1699 cu       11.991814027 RET   open -1 errno 5 Input/output error

 vmstat usbhist dump at:

 https://www.thangorodrim.ch/tmp/NetBSD10BETA_serial_vmstat-usbhist_patch-1

 Kind regards,
            Alex.
 -- 
 "Opportunity is missed by most people because it is dressed in overalls and
  looks like work."                                      -- Thomas A. Edison

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57662 CVS commit: src/sys/dev/usb
Date: Sat, 28 Oct 2023 21:18:15 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Oct 28 21:18:15 UTC 2023

 Modified Files:
 	src/sys/dev/usb: ehci.c

 Log Message:
 ehci(4): Fix cv_timedwait loop in ehci_sync_hc.

 Stop when

 	now - starttime >= delta,

 i.e., when at least delta ticks have elapsed since the start, not
 when

 	endtime - now > delta,

 i.e., more than delta ticks _remain_ to sleep, which is never going
 to happen (except on arithmetic overflow).

 As is, what will happen in the case that should time out is that we
 wake up after delta ticks, and find now = getticks() is exactly
 endtime, so we retry cv_timedwait with timo=(endtime - now)=0 which
 means sleep indefinitely with no timeout as if with cv_wait.

 PR port-i386/57662

 XXX pullup-10


 To generate a diff of this commit:
 cvs rdiff -u -r1.317 -r1.318 src/sys/dev/usb/ehci.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57662 CVS commit: src/sys/dev/usb
Date: Sat, 28 Oct 2023 21:18:32 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Oct 28 21:18:32 UTC 2023

 Modified Files:
 	src/sys/dev/usb: ehci.c

 Log Message:
 ehci(4): Fix bug causing missed wakeups since ehci.c 1.308.

 For reasons beyond me now, I used cv_signal on the same cv that is
 used to wait for the doorbell to be available _and_ to wait for the
 host controller to acknowledge the doorbell.  Which means when the
 host controller acknowledges the doorbell, we might wake some thread
 waiting for the doorbell to be available -- and leave the thread
 waiting for the doorbell acknowledgment hanging indefinitely.

 PR port-i386/57662

 XXX pullup-10


 To generate a diff of this commit:
 cvs rdiff -u -r1.318 -r1.319 src/sys/dev/usb/ehci.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Alexander Schreiber <als@thangorodrim.ch>
Cc: matthew green <mrg@eterna.com.au>,
	port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, gnats-bugs@netbsd.org
Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
	Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Sat, 28 Oct 2023 21:29:09 +0000

 > Date: Sat, 28 Oct 2023 22:26:06 +0200
 > From: Alexander Schreiber <als@thangorodrim.ch>
 >=20
 > Both patches applied and things do look better. Now, instead of hanging
 > forever, after the second (so, at the third) connection attempt, cu
 > errors out with: "cu: link down" after a few seconds. This behavior then
 > sticks.
 >=20
 > Relevant snippet from the trace:
 >=20
 >   1699   1699 cu       0.000010616 CALL  open(0xbfb9a56e,6,0)
 >   1699   1699 cu       0.000011733 NAMI  "/dev/ttyU1"
 >   1699   1699 cu       11.991814027 RET   open -1 errno 5 Input/output er=
 ror
 >=20
 > vmstat usbhist dump at:
 >=20
 > https://www.thangorodrim.ch/tmp/NetBSD10BETA_serial_vmstat-usbhist_patch-1

 Great, thanks!  So we've fixed the logic to abort on timeout that I
 broke last year (which is good!), but now we have to figure out why
 the transfer is timing out at all.

From: Alexander Schreiber <als@thangorodrim.ch>
To: Taylor R Campbell <riastradh@netbsd.org>
Cc: matthew green <mrg@eterna.com.au>, port-i386-maintainer@netbsd.org,
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
	gnats-bugs@netbsd.org
Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Sun, 29 Oct 2023 22:14:32 +0100

 On Sat, Oct 28, 2023 at 09:29:09PM +0000, Taylor R Campbell wrote:
 > > Date: Sat, 28 Oct 2023 22:26:06 +0200
 > > From: Alexander Schreiber <als@thangorodrim.ch>
 > > 
 > > Both patches applied and things do look better. Now, instead of hanging
 > > forever, after the second (so, at the third) connection attempt, cu
 > > errors out with: "cu: link down" after a few seconds. This behavior then
 > > sticks.
 > > 
 > > Relevant snippet from the trace:
 > > 
 > >   1699   1699 cu       0.000010616 CALL  open(0xbfb9a56e,6,0)
 > >   1699   1699 cu       0.000011733 NAMI  "/dev/ttyU1"
 > >   1699   1699 cu       11.991814027 RET   open -1 errno 5 Input/output error
 > > 
 > > vmstat usbhist dump at:
 > > 
 > > https://www.thangorodrim.ch/tmp/NetBSD10BETA_serial_vmstat-usbhist_patch-1
 > 
 > Great, thanks!  So we've fixed the logic to abort on timeout that I
 > broke last year (which is good!), but now we have to figure out why
 > the transfer is timing out at all.

 Small update: I rebuilt the kernel (stripped out a lot of unneeded stuff
 from the config to get build time down and enable DIAGNOSTICS) and redid
 the test. Now, while cu hangs, I see matching complaints in dmesg:

 [ 394.1174957] ehci_sync_hc: timed out
 [ 400.1176959] ehci_sync_hc: timed out
 [ 406.1178962] ehci_sync_hc: timed out

 and then cu comes back with "cu: link down". I didn't see those kernel
 messages before, so I guess that is courtesy of DIAGNOSTICS.

 Kind regards,
            Alex.
 -- 
 "Opportunity is missed by most people because it is dressed in overalls and
  looks like work."                                      -- Thomas A. Edison

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57662 CVS commit: [netbsd-10] src/sys/dev/usb
Date: Mon, 30 Oct 2023 17:45:11 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Mon Oct 30 17:45:10 UTC 2023

 Modified Files:
 	src/sys/dev/usb [netbsd-10]: ehci.c

 Log Message:
 Pull up following revision(s) (requested by riastradh in ticket #440):

 	sys/dev/usb/ehci.c: revision 1.318
 	sys/dev/usb/ehci.c: revision 1.319

 ehci(4): Fix cv_timedwait loop in ehci_sync_hc.

 Stop when
         now - starttime >= delta,
 i.e., when at least delta ticks have elapsed since the start, not
 when
         endtime - now > delta,
 i.e., more than delta ticks _remain_ to sleep, which is never going
 to happen (except on arithmetic overflow).

 As is, what will happen in the case that should time out is that we
 wake up after delta ticks, and find now = getticks() is exactly
 endtime, so we retry cv_timedwait with timo=(endtime - now)=0 which
 means sleep indefinitely with no timeout as if with cv_wait.

 PR port-i386/57662

 ehci(4): Fix bug causing missed wakeups since ehci.c 1.308.

 For reasons beyond me now, I used cv_signal on the same cv that is
 used to wait for the doorbell to be available _and_ to wait for the
 host controller to acknowledge the doorbell.  Which means when the
 host controller acknowledges the doorbell, we might wake some thread
 waiting for the doorbell to be available -- and leave the thread
 waiting for the doorbell acknowledgment hanging indefinitely.

 PR port-i386/57662


 To generate a diff of this commit:
 cvs rdiff -u -r1.315.2.1 -r1.315.2.2 src/sys/dev/usb/ehci.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Alexander Schreiber <als@thangorodrim.ch>
To: Taylor R Campbell <riastradh@netbsd.org>
Cc: matthew green <mrg@eterna.com.au>, port-i386-maintainer@netbsd.org,
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
	gnats-bugs@netbsd.org
Subject: Re: port-i386/57662: StarTech ICUSB23208FD 8-Port USB-to-Serial
 Adapter Hub fails on Alix with NetBSD/i386 9.3
Date: Mon, 13 Nov 2023 09:12:00 +0100

 On Sat, Oct 28, 2023 at 09:29:09PM +0000, Taylor R Campbell wrote:
 > > Date: Sat, 28 Oct 2023 22:26:06 +0200
 > > From: Alexander Schreiber <als@thangorodrim.ch>
 > > 
 > > Both patches applied and things do look better. Now, instead of hanging
 > > forever, after the second (so, at the third) connection attempt, cu
 > > errors out with: "cu: link down" after a few seconds. This behavior then
 > > sticks.
 > > 
 > > Relevant snippet from the trace:
 > > 
 > >   1699   1699 cu       0.000010616 CALL  open(0xbfb9a56e,6,0)
 > >   1699   1699 cu       0.000011733 NAMI  "/dev/ttyU1"
 > >   1699   1699 cu       11.991814027 RET   open -1 errno 5 Input/output error
 > > 
 > > vmstat usbhist dump at:
 > > 
 > > https://www.thangorodrim.ch/tmp/NetBSD10BETA_serial_vmstat-usbhist_patch-1
 > 
 > Great, thanks!  So we've fixed the logic to abort on timeout that I
 > broke last year (which is good!), but now we have to figure out why
 > the transfer is timing out at all.

 I've done another test: Boot up machine, plug in USB2serial device,
 try some connections, wait for them to time out and things to settle.
 Unplug USB2serial device, wait a bit for things to settle, plug it
 back in. Result:
  - unplugging the device is noticed and the expected messages show
    up in the kernel log, but with a scattering of time out messages
  - plugging the device yields USB problems, ports disabled, device
    no longer recognized

 So, maybe the hardware side of the USB stack for this little Geode CPU
 manages to get wedged?

 Syslog snippet attached:

 --------------------- /var/log/messages begins ----------------------


 Oct 31 14:47:42 console /netbsd: [ 394.1174957] ehci_sync_hc: timed out
 Oct 31 14:47:42 console /netbsd: [ 400.1176959] ehci_sync_hc: timed out
 Oct 31 14:47:42 console /netbsd: [ 406.1178962] ehci_sync_hc: timed out
 Oct 31 14:47:42 console /netbsd: [ 154334.7451176] ehci_sync_hc: timed out
 Oct 31 14:47:42 console /netbsd: [ 154335.7451510] ehci_sync_hc: timed out
 Oct 31 14:47:42 console /netbsd: [ 154335.7451510] ucom0: detached
 Oct 31 14:47:42 console /netbsd: [ 154335.7451510] uftdi0: detached
 Oct 31 14:47:42 console /netbsd: [ 154335.7551535] uftdi0: at uhub2 port 2 (addr 3) disconnected
 Oct 31 14:47:44 console /netbsd: [ 154336.7551844] ehci_sync_hc: timed out
 Oct 31 14:47:44 console /netbsd: [ 154337.7552181] ehci_sync_hc: timed out
 Oct 31 14:47:44 console /netbsd: [ 154337.7552181] ucom1: detached
 Oct 31 14:47:44 console /netbsd: [ 154337.7552181] uftdi1: detached
 Oct 31 14:47:44 console /netbsd: [ 154337.7652254] uftdi1: at uhub2 port 2 (addr 3) disconnected
 Oct 31 14:47:46 console /netbsd: [ 154338.7652515] ehci_sync_hc: timed out
 Oct 31 14:47:46 console /netbsd: [ 154339.7652849] ehci_sync_hc: timed out
 Oct 31 14:47:46 console /netbsd: [ 154339.7652849] ucom2: detached
 Oct 31 14:47:46 console /netbsd: [ 154339.7652849] uftdi2: detached
 Oct 31 14:47:46 console /netbsd: [ 154339.7752871] uftdi2: at uhub2 port 2 (addr 3) disconnected
 Oct 31 14:47:48 console /netbsd: [ 154340.7753188] ehci_sync_hc: timed out
 Oct 31 14:47:48 console /netbsd: [ 154341.7753519] ehci_sync_hc: timed out
 Oct 31 14:47:48 console /netbsd: [ 154341.7753519] ucom3: detached
 Oct 31 14:47:48 console /netbsd: [ 154341.7753519] uftdi3: detached
 Oct 31 14:47:48 console /netbsd: [ 154341.7853609] uftdi3: at uhub2 port 2 (addr 3) disconnected
 Oct 31 14:47:51 console /netbsd: [ 154342.7853857] ehci_sync_hc: timed out
 Oct 31 14:47:51 console /netbsd: [ 154343.7854193] ehci_sync_hc: timed out
 Oct 31 14:47:51 console /netbsd: [ 154344.7854541] ehci_sync_hc: timed out
 Oct 31 14:47:51 console /netbsd: [ 154344.7854541] ucom4: detached
 Oct 31 14:47:51 console /netbsd: [ 154344.7854541] uftdi4: detached
 Oct 31 14:47:51 console /netbsd: [ 154344.7854541] uftdi4: at uhub2 port 3 (addr 4) disconnected
 Oct 31 14:47:53 console /netbsd: [ 154345.7954864] ehci_sync_hc: timed out
 Oct 31 14:47:53 console /netbsd: [ 154346.7955198] ehci_sync_hc: timed out
 Oct 31 14:47:53 console /netbsd: [ 154346.7955198] ucom5: detached
 Oct 31 14:47:53 console /netbsd: [ 154346.7955198] uftdi5: detached
 Oct 31 14:47:53 console /netbsd: [ 154346.8055240] uftdi5: at uhub2 port 3 (addr 4) disconnected
 Oct 31 14:47:55 console /netbsd: [ 154347.8055532] ehci_sync_hc: timed out
 Oct 31 14:47:55 console /netbsd: [ 154348.8055869] ehci_sync_hc: timed out
 Oct 31 14:47:55 console /netbsd: [ 154348.8055869] ucom6: detached
 Oct 31 14:47:55 console /netbsd: [ 154348.8055869] uftdi6: detached
 Oct 31 14:47:55 console /netbsd: [ 154348.8155903] uftdi6: at uhub2 port 3 (addr 4) disconnected
 Oct 31 14:47:57 console /netbsd: [ 154349.8156214] ehci_sync_hc: timed out
 Oct 31 14:47:57 console /netbsd: [ 154350.8156548] ehci_sync_hc: timed out
 Oct 31 14:47:57 console /netbsd: [ 154350.8156548] ucom7: detached
 Oct 31 14:47:57 console /netbsd: [ 154350.8156548] uftdi7: detached
 Oct 31 14:47:57 console /netbsd: [ 154350.8256604] uftdi7: at uhub2 port 3 (addr 4) disconnected
 Oct 31 14:48:00 console /netbsd: [ 154351.8256876] ehci_sync_hc: timed out
 Oct 31 14:48:00 console /netbsd: [ 154352.8257210] ehci_sync_hc: timed out
 Oct 31 14:48:00 console /netbsd: [ 154353.8257544] ehci_sync_hc: timed out
 Oct 31 14:48:00 console /netbsd: [ 154353.8257544] uhub2: detached
 Oct 31 14:48:00 console /netbsd: [ 154353.8257544] uhub2: at uhub1 port 1 (addr 2) disconnected
 Oct 31 14:48:30 console ntpd[732]: Soliciting pool server 195.186.1.101
 Oct 31 14:49:13 console /netbsd: [ 154354.8357878] ehci_sync_hc: timed out
 Oct 31 14:49:13 console /netbsd: [ 154369.6462824] ehci_sync_hc: timed out
 Oct 31 14:49:13 console /netbsd: [ 154375.8564896] ehci_sync_hc: timed out
 Oct 31 14:49:13 console /netbsd: [ 154382.0666969] ehci_sync_hc: timed out
 Oct 31 14:49:13 console /netbsd: [ 154388.2769042] ehci_sync_hc: timed out
 Oct 31 14:49:13 console /netbsd: [ 154394.8471234] ehci_sync_hc: timed out
 Oct 31 14:49:13 console /netbsd: [ 154401.0573307] ehci_sync_hc: timed out
 Oct 31 14:49:13 console /netbsd: [ 154407.2675380] ehci_sync_hc: timed out
 Oct 31 14:49:13 console /netbsd: [ 154413.4777449] ehci_sync_hc: timed out
 Oct 31 14:49:13 console /netbsd: [ 154413.9677616] gcscehci0: handing over full speed device on port 1 to ohci0
 Oct 31 14:49:13 console /netbsd: [ 154420.0279638] ehci_sync_hc: timed out
 Oct 31 14:49:13 console /netbsd: [ 154426.2381711] ehci_sync_hc: timed out
 Oct 31 14:49:13 console /netbsd: [ 154427.4482115] ehci_sync_hc: timed out
 Oct 31 14:49:13 console /netbsd: [ 154427.4482115] uhub1: autoconfiguration error: device problem, disabling port 1
 Oct 31 14:49:34 console ntpd[732]: Soliciting pool server 81.6.2.165
 Oct 31 14:50:00 console cron[5467]: (root) CMD START (/usr/libexec/atrun)
 Oct 31 14:50:00 console cron[4190]: (root) CMD FINISH (/usr/libexec/atrun)
 Oct 31 14:50:39 console ntpd[732]: Soliciting pool server 81.94.123.16
 Oct 31 14:50:58 console /netbsd: [ 154474.8897948] ehci_sync_hc: timed out
 Oct 31 14:50:58 console /netbsd: [ 154481.1000032] ehci_sync_hc: timed out
 Oct 31 14:50:58 console /netbsd: [ 154487.3102093] ehci_sync_hc: timed out
 Oct 31 14:50:58 console /netbsd: [ 154493.5204177] ehci_sync_hc: timed out
 Oct 31 14:50:58 console /netbsd: [ 154494.0104329] gcscehci0: handing over full speed device on port 1 to ohci0
 Oct 31 14:50:58 console /netbsd: [ 154500.0706352] ehci_sync_hc: timed out
 Oct 31 14:50:58 console /netbsd: [ 154506.2808422] ehci_sync_hc: timed out
 Oct 31 14:50:58 console /netbsd: [ 154512.4910497] ehci_sync_hc: timed out
 Oct 31 14:50:58 console /netbsd: [ 154518.7012567] ehci_sync_hc: timed out
 Oct 31 14:50:58 console /netbsd: [ 154519.1912733] gcscehci0: handing over full speed device on port 1 to ohci0
 Oct 31 14:50:58 console /netbsd: [ 154525.2514756] ehci_sync_hc: timed out
 Oct 31 14:50:58 console /netbsd: [ 154531.4616828] ehci_sync_hc: timed out
 Oct 31 14:50:58 console /netbsd: [ 154532.6717232] ehci_sync_hc: timed out
 Oct 31 14:50:58 console /netbsd: [ 154532.6717232] uhub1: autoconfiguration error: device problem, disabling port 1


 --------------------- /var/log/messages ends   ----------------------

 Kind regards,
            Alex.
 -- 
 "Opportunity is missed by most people because it is dressed in overalls and
  looks like work."                                      -- Thomas A. Edison

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2023 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.