NetBSD Problem Report #52605

From martin@aprisoft.de  Mon Oct  9 09:26:39 2017
Return-Path: <martin@aprisoft.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id CD2687A25E
	for <gnats-bugs@gnats.NetBSD.org>; Mon,  9 Oct 2017 09:26:39 +0000 (UTC)
Message-Id: <20171009092622.386AF5CC761@emmas.aprisoft.de>
Date: Mon,  9 Oct 2017 11:26:22 +0200 (CEST)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: aceride crashes
X-Send-Pr-Version: 3.95

>Number:         52605
>Category:       kern
>Synopsis:       aceride crashes
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    jdolecek
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Oct 09 09:30:00 +0000 2017
>Closed-Date:    Mon Oct 16 04:41:37 +0000 2017
>Last-Modified:  Mon Oct 16 04:41:37 +0000 2017
>Originator:     Martin Husemann
>Release:        NetBSD 8.99.3
>Organization:
The NetBSD Foundation, Inc.
>Environment:

System: NetBSD whoever-brings-the-night.aprisoft.de 8.99.2 NetBSD 8.99.2 (WHOEVER) #162: Fri Sep 15 15:50:54 CEST 2017 martin@seven-days-to-the-wolves.aprisoft.de:/work/src/sys/arch/sparc64/compile/WHOEVER sparc64
Architecture: sparc64
Machine: sparc64

>Description:

I have an on-board aceride and a few slightly strange devices on the ATA bus.
Old kernel shows:

aceride0 at pci2 dev 13 function 0: Acer Labs M5229 UDMA IDE Controller (rev. 0xc4)
aceride0: bus-master DMA support present
aceride0: using PIO transfers above 137GB as workaround for 48bit DMA access bug, expect reduced performance
aceride0: primary channel configured to native-PCI mode
aceride0: using ivec 1f98 for native-PCI interrupt
atabus4 at aceride0 channel 0
aceride0: secondary channel configured to native-PCI mode
atabus5 at aceride0 channel 1
[..]
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd1(aceride0:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA)
atapibus0 at atabus5: 2 targets
cd0 at atapibus0 drive 0: <JLMS XJ-HD166S, , D3S4> cdrom removable
cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33)
cd0(aceride0:1:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA)

After the NCQ merge, device detection fails. Before the
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
line is printed, I get:

panic: kernel diagnostic assertion "chq->queue_active > 0" failed in ....dev/ata/ata.c, line 1589
...
ata_deactivate_xfer+0x120
__wdc_command_done+0x50
__wdccommand_intr+0x1f4
ata_xfer_start+0xa8
atastart+0x234
wdc_exec_command+0xd0
ata_get_params+0xcc
wdc_drvprobe+0x2a0
atabusconfig+0x1e8

(manually transcribed)

This happens before root is mounted, no other processes going on (besides
maybe concurrent probes for other ATA devices).

>How-To-Repeat:
Try to boot -current on a Sun Blade 2500?

>Fix:
n/a

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->jdolecek
Responsible-Changed-By: jdolecek@NetBSD.org
Responsible-Changed-When: Mon, 09 Oct 2017 22:02:05 +0000
Responsible-Changed-Why:
My changes broke this.


From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: mlelstv@NetBSD.org, jdolecek@NetBSD.org
Subject: Re: kern/52605 (aceride crashes)
Date: Tue, 10 Oct 2017 09:47:11 +0200

 Something changed since yesterday, I booted a very -current kernel
 with some channel activation/deactivation and timeout printfs added
 and it results in a similar hang as kern/52606 now.

 Martin

 Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
     2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017
     The NetBSD Foundation, Inc.  All rights reserved.
 Copyright (c) 1982, 1986, 1989, 1991, 1993
     The Regents of the University of California.  All rights reserved.

 NetBSD 8.99.3 (WHOEVER) #166: Tue Oct 10 09:32:04 CEST 2017
 	martin@seven-days-to-the-wolves.aprisoft.de:/work/src/sys/arch/sparc64/compile/WHOEVER
 total memory = 8192 MB
 avail memory = 8026 MB
 running cgd selftest aes-xts-256 aes-xts-512 done
 mainbus0 (root): SUNW,Sun-Blade-2500-S (Sun Blade 2500): hostid 83cd9ff1
 cpu0 at mainbus0: SUNW,UltraSPARC-IIIi @ 1600 MHz, CPU id 0
 cpu0: manuf 3e, impl 16, mask 34
 cpu0: system tick frequency 12 MHz
 cpu0: 32K instruction (32 b/l), 64K data (32 b/l), 1024K external (64 b/l)
 cpu1 at mainbus0: SUNW,UltraSPARC-IIIi @ 1600 MHz, CPU id 1
 cpu1: manuf 3e, impl 16, mask 34
 cpu1: system tick frequency 12 MHz
 cpu1: 32K instruction (32 b/l), 64K data (32 b/l), 1024K external (64 b/l)
 memory-controller at mainbus0 not configured
 memory-controller at mainbus0 not configured
 schizo0 at mainbus0: addr 4000e600000: Tomatillo, version 4, ign 700, bus A 0 to 0
 schizo0:  pci0 at schizo0
 bge0 at pci0 dev 3 function 0: Broadcom BCM5703 Gigabit Ethernet
 bge0: interrupting at ivec 371c
 bge0: HW config 00000000, 00000000, 00000000, 00000000 00000000
 bge0: ASIC BCM5702/5703 A2 (0x1002), Ethernet address 00:03:ba:cd:9f:f1
 brgphy0 at bge0 phy 1: BCM5703 1000BASE-T media interface, rev. 2
 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
 siisata0 at pci0 dev 2 function 0: CMD Technology SiI3124 SATALink (rev. 0x02)
 siisata0: interrupting at ivec 700
 atabus0 at siisata0 channel 0
 atabus1 at siisata0 channel 1
 atabus2 at siisata0 channel 2
 atabus3 at siisata0 channel 3
 ppm at mainbus0 not configured
 schizo1 at mainbus0: addr 4000ef00000: Tomatillo, version 4, ign 740, bus B 0 to 0
 schizo1:  pci1 at schizo1
 mpt0 at pci1 dev 4 function 0: Symbios Logic 53c1020/53c1030 (rev. 0x07)
 mpt0: applying 1030 quirk
 mpt0: interrupting at ivec 1f69
 scsibus0 at mpt0: 16 targets, 8 luns per target
 mpt1 at pci1 dev 4 function 1: Symbios Logic 53c1020/53c1030 (rev. 0x07)
 mpt1: applying 1030 quirk
 mpt1: interrupting at ivec 1f68
 scsibus1 at mpt1: 16 targets, 8 luns per target
 schizo2 at mainbus0: addr 4000f600000: Tomatillo, version 4, ign 780, bus A 0 to 1
 schizo2:  pci2 at schizo2
 ebus0 at pci2 dev 7 function 0
 ebus0: Acer Labs M1533 PCI-ISA Bridge, revision 0x00
 flashprom at ebus0 addr 0-fffff not configured
 rtc0 at ebus0 addr 70-71: mc146818 compatible time-of-day clock: m5823
 pcfiic0 at ebus0 addr 320-321 ipl 2e
 iic0 at pcfiic0: I2C bus
 i2c-bridge at iic0 addr 0x09 not configured
 gpio at iic0 addr 0x18 not configured
 hardware-monitor at iic0 addr 0x29 not configured
 dbcool0 at iic0 addr 0x2c
 dbcool0: ADM1031 dBCool(tm) Controller (rev 0x0083)
 dbcool1 at iic0 addr 0x2e
 dbcool1: ADM1031 dBCool(tm) Controller (rev 0x0083)
 gpio at iic0 addr 0x37 not configured
 gpio at iic0 addr 0x4e not configured
 seeprom0 at iic0 addr 0x50: audio-card-fru-prom: size 8192
 seeprom1 at iic0 addr 0x51: motherboard-fru-prom: size 8192
 seeprom2 at iic0 addr 0x54: scsi-backplane-fru-prom: size 8192
 spdmem0 at iic0 addr 0x5b
 spdmem0: DDR SDRAM (registered), data ECC, 1GB, 286MHz (PC-2300)
 spdmem1 at iic0 addr 0x5c
 spdmem1: DDR SDRAM (registered), data ECC, 1GB, 286MHz (PC-2300)
 spdmem2 at iic0 addr 0x5d
 spdmem2: DDR SDRAM (registered), data ECC, 1GB, 286MHz (PC-2300)
 spdmem3 at iic0 addr 0x5e
 spdmem3: DDR SDRAM (registered), data ECC, 1GB, 286MHz (PC-2300)
 spdmem4 at iic0 addr 0x63
 spdmem4: DDR SDRAM (registered), data ECC, 1GB, 286MHz (PC-2300)
 spdmem5 at iic0 addr 0x64
 spdmem5: DDR SDRAM (registered), data ECC, 1GB, 286MHz (PC-2300)
 spdmem6 at iic0 addr 0x65
 spdmem6: DDR SDRAM (registered), data ECC, 1GB, 286MHz (PC-2300)
 spdmem7 at iic0 addr 0x66
 spdmem7: DDR SDRAM (registered), data ECC, 1GB, 286MHz (PC-2300)
 clock-generator at iic0 addr 0x69 not configured
 power at ebus0 addr 800-82f ipl 20 not configured
 com0 at ebus0 addr 3f8-3ff ipl 2c: ns16550a, working fifo
 com0: console
 com1 at ebus0 addr 2e8-2ef ipl 2c: ns16550a, working fifo
 dma at ebus0 addr 0-ffff not configured
 alipm0 at pci2 dev 6 function 0: 223KHz clock
 iic1 at alipm0: I2C bus
 card-reader at iic1 addr 0x20 not configured
 autri0 at pci2 dev 8 function 0: Acer Labs M5451 AC-Link Controller Audio Device (rev. 0x02)
 autri0: interrupting at ivec 7a4
 autri0: ac97: Analog Devices AD1881A codec; headphone, Analog Devices Phat Stereo
 audio0 at autri0: full duplex, playback, capture, mmap, independent
 autri0: Virtual format configured - Format SLINEAR, precision 16, channels 2, frequency 48000
 midi0 at autri0: 4DWAVE MIDI UART
 ohci0 at pci2 dev 10 function 0: Acer Labs M5237 USB 1.1 Host Controller (rev. 0x03)
 ohci0: interrupting at ivec 7a7
 ohci0: OHCI version 1.0, legacy support
 usb0 at ohci0: USB revision 1.0
 ohci1 at pci2 dev 11 function 0: Acer Labs M5237 USB 1.1 Host Controller (rev. 0x03)
 ohci1: interrupting at ivec 7a6
 ohci1: OHCI version 1.0, legacy support
 usb1 at ohci1: USB revision 1.0
 aceride0 at pci2 dev 13 function 0: Acer Labs M5229 UDMA IDE Controller (rev. 0xc4)
 aceride0: using ivec 1f98 for native-PCI interrupt
 atabus4 at aceride0 channel 0
 atabus5 at aceride0 channel 1
 ppb0 at pci2 dev 4 function 0: Texas Instruments product ac23 (rev. 0x02)
 pci3 at ppb0 bus 1
 ohci2 at pci3 dev 8 function 0: NEC USB Host Controller (rev. 0x43)
 ohci2: interrupting at ivec 794
 ohci2: OHCI version 1.0
 usb2 at ohci2: USB revision 1.0
 ohci3 at pci3 dev 8 function 1: NEC USB Host Controller (rev. 0x43)
 ohci3: interrupting at ivec 795
 ohci3: OHCI version 1.0
 usb3 at ohci3: USB revision 1.0
 ehci0 at pci3 dev 8 function 2: NEC USB2 Host Controller (rev. 0x04)
 ehci0: interrupting at ivec 796
 ehci0: 2 companion controllers, 3 ports each: ohci2 ohci3
 usb4 at ehci0: USB revision 2.0
 fwohci0 at pci3 dev 11 function 0: Texas Instruments TSB43AA23 IEEE 1394 Host Controller (rev. 0x00)
 fwohci0: interrupting at ivec 797
 fwohci0: OHCI version 1.10 (ROM=1)
 fwohci0: No. of Isochronous channels is 4.
 fwohci0: EUI64 00:05:16:00:00:41:f8:4a
 fwohci0: Phy 1394a available S400, 3 ports.
 fwohci0: Link S400, max_rec 2048 bytes.
 ieee1394if0 at fwohci0: IEEE1394 bus
 fwip0 at ieee1394if0: IP over IEEE1394
 fwohci0: Initiate bus reset
 ppm at mainbus0 not configured
 schizo3 at mainbus0: addr 4000ff00000: Tomatillo, version 4, ign 7c0, bus B 0 to 0
 schizo3:  pci4 at schizo3
 radeonfb0 at pci4 dev 2 function 0: ATI Technologies Radeon 7000/VE QY (rev. 0x00)
 no data for est. mode 640x480x67
 radeonfb0: 64 MB aperture at 0x08000000, 64 KB registers at 0x00100000
 radeonfb0: display 0: initial virtual resolution 1280x1024 at 8 bpp
 radeonfb0: using 32 MB per display
 radeonfb0: port 0: physical 1280x1024 60Hz
 radeonfb0: port 1: physical 1024x768 60Hz
 wsdisplay1 at radeonfb0 kbdmux 1
 drm at radeonfb0 not configured
 i2c at mainbus0 not configured
 pcons at mainbus0 not configured
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x0
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 fwohci0: BUS reset
 fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
 ieee1394if0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me)
 ieee1394if0: bus manager 0
 No counter-timer -- using %stick at 12MHz as system clock.
 scsibus0: waiting 2 seconds for devices to settle...
 scsibus1: waiting 2 seconds for devices to settle...
 ata_activate_xfer_locked(chp=0x108ea9508, xfer=0x108eaba58)
 ata_deactivate_xfer(chp=0x108ea9508, xfer=0x108eaba58)
 uhub0 at usb1: Acer Labs (0x10b9) OHCI root hub (0000), class 9/0, rev 1.00/1.00, addr 1
 uhub1 at usb3: NEC (0x1033) OHCI root hub (0000), class 9/0, rev 1.00/1.00, addr 1
 uhub2 at usb0: Acer Labs (0x10b9) OHCI root hub (0000), class 9/0, rev 1.00/1.00, addr 1
 uhub3 at usb2: NEC (0x1033) OHCI root hub (0000), class 9/0, rev 1.00/1.00, addr 1
 uhub4 at usb4: NEC (0x1033) EHCI root hub (0000), class 9/0, rev 2.00/1.00, addr 1
 siisata0 port 3: device present, speed: 3.0Gb/s
 wd0 at atabus3 drive 0
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 wd0: <Samsung SSD 850 PRO 256GB>
 wd0: 238 GB, 496149 cyl, 16 head, 63 sec, 512 bytes/sect x 500118192 sectors
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 dk0 at wd0: "sb2k5swap", 93008640 blocks at 36, type: swap
 dk1 at wd0: "sb2k5root", 407109480 blocks at 93008676, type: ffs
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x0
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 wd1 at atabus4 drive 0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 wd1: <SINTECHI HighSpeed SD to CF Adapter V1.0>
 wd1: 3796 MB, 7712 cyl, 16 head, 63 sec, 512 bytes/sect x 7774208 sectors
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 dk2 at wd1: "sb2.5kboot/a", 7774208 blocks at 0, type: ffs
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 atapibus0 at atabus5: 2 targets
 ata_activate_xfer_locked(chp=0x108ea9508, xfer=0x108eaba58)
 ata_deactivate_xfer(chp=0x108ea9508, xfer=0x108eaba58)
 ata_activate_xfer_locked(chp=0x108ea9508, xfer=0x108eaba58)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x0
 wdcintr(chp=0x108ea9508): dequeued xfer=0x108eaba58
 ata_deactivate_xfer(chp=0x108ea9508, xfer=0x108eaba58)
 cd0 at atapibus0 drive 0: <JLMS XJ-HD166S, , D3S4> cdrom removable
 ata_activate_xfer_locked(chp=0x108ea9508, xfer=0x108eaba58)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x0
 wdcintr(chp=0x108ea9508): dequeued xfer=0x108eaba58
 ata_deactivate_xfer(chp=0x108ea9508, xfer=0x108eaba58)
 ata_activate_xfer_locked(chp=0x108ea9508, xfer=0x108eaba58)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x0
 wdcintr(chp=0x108ea9508): dequeued xfer=0x108eaba58
 ata_deactivate_xfer(chp=0x108ea9508, xfer=0x108eaba58)
 ata_activate_xfer_locked(chp=0x108ea9508, xfer=0x108eaba58)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x0
 wdcintr(chp=0x108ea9508): dequeued xfer=0x108eaba58
 ata_deactivate_xfer(chp=0x108ea9508, xfer=0x108eaba58)
 ata_activate_xfer_locked(chp=0x108ea9508, xfer=0x108eaba58)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x0
 wdcintr(chp=0x108ea9508): dequeued xfer=0x108eaba58
 ata_deactivate_xfer(chp=0x108ea9508, xfer=0x108eaba58)
 ata_activate_xfer_locked(chp=0x108ea9508, xfer=0x108eaba58)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x0
 wdcintr(chp=0x108ea9508): dequeued xfer=0x108eaba58
 ata_deactivate_xfer(chp=0x108ea9508, xfer=0x108eaba58)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108c53088, xfer=0x1076ea080)
 ata_deactivate_xfer(chp=0x108c53088, xfer=0x1076ea080)
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ata_activate_xfer_locked(chp=0x108ea91f0, xfer=0x108e9c218)
 wdcintr(chp=0x108ea91f0): dequeued xfer=0x108e9c218
 wd1: transfer error, downgrading to Ultra-DMA mode 4
 ata_deactivate_xfer(chp=0x108ea91f0, xfer=0x108e9c218)
 wd1c: error reading fsbn 32 of 32-33 (wd1 bn 32; cn 0 tn 0 sn 32), slot 0, retry 1
 wd1: (aborted command, interface CRC error)
 wdcintr(chp=0x108ea9508): dequeued xfer=0x0
 ~Stopped in pid 0.2 (system) at  netbsd:cpu_Debugger+0x4:        nop
 db{0}> ps
 PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
 1        1 3   1         0          109212920               init lbolt
 0       66 3   1       200          1092120e0          atapibus0 sccomp
 0       62 3   0       200          108eae8e0               usb4 usbevt
 0       61 3   0       200          108eae4c0               usb2 usbevt
 0       60 3   0       200          108eaf120               usb3 usbevt
 0       59 3   0       200          108eaed00               usb0 usbevt
 0       58 3   1       200          109212500               usb1 usbevt
 0       57 3   1       200          109212d40            rt_free rt_free
 0       56 3   1       200          109213160              unpgc unpgc
 0       55 3   1       200          109213580    icmp6_wqinput/1 icmp6_wqinput
 0       54 3   0       200          1090c1980    icmp6_wqinput/0 icmp6_wqinput
 0       53 3   1       200          1092139a0          nd6_timer nd6_timer
 0       52 3   1       200          1090c00c0     icmp_wqinput/1 icmp_wqinput
 0       51 3   0       200          1090c04e0     icmp_wqinput/0 icmp_wqinput
 0       50 3   0       200          1090c1560           rt_timer rt_timer
 0       49 3   0       200          1090c1140        vmem_rehash vmem_rehash
 0       48 3   0       200          1090c0900            dbcool1 dbcool1
 0       47 3   0       200          1090c0d20            dbcool0 dbcool0
 0       38 3   1       280          108eaf540           fw0probe ieee1394
 0       37 3   1       200          108eaf960            atabus5 atath
 0       36 3   1       200          108e7e080            atabus4 atath
 0       35 3   0       200          108e7e4a0         usbtask-dr usbtsk
 0       34 3   0       200          108e7e8c0         usbtask-hc usbtsk
 0       33 3   1       280          108e7ece0           audiomix play
 0       32 3   1       280          108e7f100           audiorec record
 0       31 3   0       200          108e7f520               iic1 iicintr
 0       30 3   1       200          108e7f940               iic0 iicintr
 0       29 5   1       200          108c5e060           (zombie)
 0       28 3   0       200          108c5e480           scsibus1 sccomp
 0       26 3   0       200          108c5ecc0           scsibus0 sccomp
 0       25 3   1       200          108c5f0e0            atabus3 atath
 0       24 3   0       200          108c5f500            atabus2 atath
 0       23 3   1       200          108c5f920            atabus1 atath
 0       22 3   0       200          10769c040            atabus0 atath
 0       21 3   1       200          10769c460            xcall/1 xcall
 0       20 1   1       200          10769c880          softser/1
 0       19 1   1       200          10769cca0          softclk/1
 0       18 1   1       200          10769d0c0          softbio/1
 0       17 1   1       200          10769d4e0          softnet/1
 0    >  16 7   1       201          10769d900             idle/1
 0       15 3   1       200          10767a020             sysmon smtaskq
 0       14 3   0       200          10767a440         pmfsuspend pmfsuspend
 0       13 3   1       200          10767a860           pmfevent pmfevent
 0       12 3   0       200          10767ac80         sopendfree sopendfr
 0       11 3   0       200          10767b0a0           nfssilly nfssilly
 0       10 3   1       200          10767b4c0            cachegc cachegc
 0        9 3   0       200          10767b8e0             vdrain vdrain
 0        8 3   1       200          10766a000          modunload mod_unld
 0        7 3   0       200          10766a420            xcall/0 xcall
 0        6 1   0       200          10766a840          softser/0
 0        5 1   0       200          10766ac60          softclk/0
 0        4 1   0       200          10766b080          softbio/0
 0        3 1   0       200          10766b4a0          softnet/0
 0    >   2 7   0       201          10766b8c0             idle/0
 0        1 3   0       200            1c872a0            swapper biowait
 db{0}> 

From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52605 CVS commit: src/sys/dev/scsipi
Date: Tue, 10 Oct 2017 21:37:49 +0000

 Module Name:	src
 Committed By:	jdolecek
 Date:		Tue Oct 10 21:37:49 UTC 2017

 Modified Files:
 	src/sys/dev/scsipi: atapi_wdc.c

 Log Message:
 revert the logic in wdc_atapi_intr() for wdc_wait_for_unbusy() to what it
 was before the NCQ merge; it got broken during the effort to remove ch_status
 and ch_error on the branch

 fixes atapi timeouts in vbox and with real hardware reported separately
 by Abhinav Upadhyay, Paul Goyette, Chavdar Ivanov, and Rares
 Aioanei; with a bit of luck it could also fix PR kern/52605 and/or PR
 kern/52606 by Martin Husemann


 To generate a diff of this commit:
 cvs rdiff -u -r1.127 -r1.128 src/sys/dev/scsipi/atapi_wdc.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->feedback
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Tue, 10 Oct 2017 21:44:14 +0000
State-Changed-Why:
You mentioned you no longer get the crash. Can you check if rev. 1.128
of sys/dev/scsipi/atapi_wdc.c fixes the hang you still see with aceride?


From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52605 CVS commit: src/sys/dev/ata
Date: Sat, 14 Oct 2017 13:15:14 +0000

 Module Name:	src
 Committed By:	jdolecek
 Date:		Sat Oct 14 13:15:14 UTC 2017

 Modified Files:
 	src/sys/dev/ata: wd.c

 Log Message:
 only call drive reset with AT_POLL when the command itself was
 polled, so that the logic for AT_POLL matches how e.g. ata_dmaerr() is
 called; this was the original intent of the change in 1.428.2.25,
 to make the error handling safe wrt. polled xfers

 this is a stopgap fix for the ATA channel wedge after a DMA error, as reported
 by Martin Husemann in PR kern/52606 and PR kern/52605

 the problem happened because ata_reset_channel() was called once in ata_dmaerr()
 with flags == 0, which froze the channel and set the flag to reset via the thread;
 then ata_reset_channel() was called via wdc_drive_reset() with AT_POLL, which
 just executed the reset and cleared the flag, without clearing the extra
 freeze; that logic will be refactored in a separate commit


 To generate a diff of this commit:
 cvs rdiff -u -r1.430 -r1.431 src/sys/dev/ata/wd.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
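
The channel wedge described in the wd.c log message above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: struct model_channel, MODEL_POLL and all model_* functions are hypothetical names invented for illustration, not the real ata(4) data structures or the code changed in rev. 1.431. The model assumes a deferred reset request takes one freeze that only the channel thread drops.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified channel state; not the real struct ata_channel. */
struct model_channel {
	int  freeze_count;   /* > 0 means no new transfers are started */
	bool reset_pending;  /* a reset was handed to the channel thread */
};

#define MODEL_POLL 0x1   /* stands in for AT_POLL */

/* Models a reset request as described in the log message. */
static void
model_reset_channel(struct model_channel *ch, int flags)
{
	if ((flags & MODEL_POLL) == 0) {
		/* Deferred: freeze the channel and leave the reset to the thread. */
		if (!ch->reset_pending) {
			ch->reset_pending = true;
			ch->freeze_count++;
		}
		return;
	}
	/* Polled: the reset runs right away and the flag is cleared ... */
	ch->reset_pending = false;
	/* ... but a freeze taken by an earlier deferred request stays in place. */
}

/* Models the channel thread: performs a pending reset and thaws the channel. */
static void
model_thread_run(struct model_channel *ch)
{
	if (ch->reset_pending) {
		assert(ch->freeze_count > 0);
		ch->reset_pending = false;
		ch->freeze_count--;
	}
}

/* A channel is wedged when it is frozen and nobody is left to thaw it. */
static bool
model_channel_wedged(const struct model_channel *ch)
{
	return ch->freeze_count > 0 && !ch->reset_pending;
}

/*
 * DMA error on a non-polled command: the first request comes from the
 * DMA error path with flags == 0, the second from the drive reset path
 * with the flags given by the caller.
 */
static bool
model_dma_error_wedges(int drive_reset_flags)
{
	struct model_channel ch = { 0, false };

	model_reset_channel(&ch, 0);
	model_reset_channel(&ch, drive_reset_flags);
	model_thread_run(&ch);
	return model_channel_wedged(&ch);
}
```

In this model, model_dma_error_wedges(MODEL_POLL) reports a wedged channel, while model_dma_error_wedges(0) does not. That mirrors the intent stated in the log message: pass AT_POLL to the drive reset only when the command itself was polled.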

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/52605 (aceride crashes)
Date: Sun, 15 Oct 2017 09:40:58 +0200

 With a -current kernel from a few minutes ago I still get the assertion
 failure as long as I have no debug output enabled.

 With debug output, it gets further (but with that mass of information it
 is hard to tell how far, I saw it print the Hostname: from rc.d and
 it was still chugging along fine).

 Martin

From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52605 CVS commit: src/sys/dev/ic
Date: Sun, 15 Oct 2017 18:02:33 +0000

 Module Name:	src
 Committed By:	jdolecek
 Date:		Sun Oct 15 18:02:33 UTC 2017

 Modified Files:
 	src/sys/dev/ic: wdc.c

 Log Message:
 explicitly ignore polled xfers in wdcintr(), so they won't be processed
 twice - it seems setting WDSD_IBM actually has no effect at least
 on some PCI-IDE, and the interrupt ends up being triggered when we release
 the channel lock to call the c_poll hook

 fixes PR kern/52605, and should also fix the 'New panic in wdc_ata_bio_intr'
 reported on current-users@


 To generate a diff of this commit:
 cvs rdiff -u -r1.284 -r1.285 src/sys/dev/ic/wdc.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
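
The double processing described in the wdc.c log message above, and the resulting "queue_active > 0" assertion from the original report, can be sketched with a small stand-alone model. All names here (struct model_xfer, struct model_queue, the model_* functions and the ignore_polled switch) are hypothetical and exist only for illustration; this is not the code committed in rev. 1.285. The model assumes the interrupt handler runs before the polling path finishes the same transfer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transfer and queue; not the real struct ata_xfer / ata_queue. */
struct model_xfer {
	bool polled;                 /* command is completed by polling */
};

struct model_queue {
	int queue_active;            /* number of active transfers */
	struct model_xfer *active;   /* at most one in this model */
};

static void
model_activate(struct model_queue *q, struct model_xfer *x)
{
	assert(q->active == NULL);
	q->active = x;
	q->queue_active++;
}

/*
 * Returns false where the real kernel would trip the
 * "queue_active > 0" diagnostic assertion.
 */
static bool
model_deactivate(struct model_queue *q, struct model_xfer *x)
{
	if (q->queue_active <= 0)
		return false;
	assert(q->active == x);
	q->active = NULL;
	q->queue_active--;
	return true;
}

/* Interrupt handler: completes the active transfer unless told to skip polled ones. */
static bool
model_intr(struct model_queue *q, struct model_xfer *x, bool ignore_polled)
{
	if (x == NULL)
		return true;             /* nothing active, nothing to do */
	if (ignore_polled && x->polled)
		return true;             /* leave it to the polling path */
	return model_deactivate(q, x);
}

/* One polled command during which the controller still raises an interrupt. */
static bool
model_polled_command_ok(bool ignore_polled)
{
	struct model_queue q = { 0, NULL };
	struct model_xfer x = { true };

	model_activate(&q, &x);
	/* The interrupt fires even though the command is polled. */
	bool intr_ok = model_intr(&q, q.active, ignore_polled);
	/* The polling path then completes the same transfer. */
	bool poll_ok = model_deactivate(&q, &x);
	return intr_ok && poll_ok;
}
```

With ignore_polled set to false, the transfer is deactivated twice and the second attempt fails, which corresponds to the assertion seen in the polled probe path of the backtrace. With ignore_polled set to true, only the polling path completes the transfer.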

From: =?UTF-8?B?SmFyb23DrXIgRG9sZcSNZWs=?= <jaromir.dolecek@gmail.com>
To: gnats-bugs@netbsd.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, 
	Martin Husemann <martin@netbsd.org>
Subject: Re: kern/52605 (aceride crashes)
Date: Sun, 15 Oct 2017 20:06:13 +0200

 Hello,

 committed the patch for sys/dev/ic/wdc.c to ignore polled commands in rev.
 1.285; that should fix the aceride problem. Can you please confirm it fixes
 this problem?

 Jaromir



State-Changed-From-To: feedback->closed
State-Changed-By: martin@NetBSD.org
State-Changed-When: Mon, 16 Oct 2017 04:41:37 +0000
State-Changed-Why:
Confirmed fixed, thanks!


>Unformatted:

Copyright © 1994-2014 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.