NetBSD Problem Report #52606
From martin@duskware.de Mon Oct 9 10:19:25 2017
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id EF5157A208
for <gnats-bugs@gnats.NetBSD.org>; Mon, 9 Oct 2017 10:19:24 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: cmdide transfers never finish
X-Send-Pr-Version: 3.95
>Number: 52606
>Category: kern
>Synopsis: cmdide transfers never finish
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: jdolecek
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Oct 09 10:20:00 +0000 2017
>Closed-Date: Sun Oct 22 13:22:09 +0000 2017
>Last-Modified: Sun Oct 22 13:22:09 +0000 2017
>Originator: Martin Husemann
>Release: NetBSD 8.99.2
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD setting-sun.duskware.de 8.99.2 NetBSD 8.99.2 (SETTINGSUN) #1: Fri Sep 15 14:34:34 CEST 2017 martin@seven-days-to-the-wolves.aprisoft.de:/work/src/sys/arch/sparc64/compile/SETTINGSUN sparc64
Architecture: sparc64
Machine: sparc64
(can't boot the new kernel, so the above is from an older one)
>Description:
I have:
cmdide0 at pci1 dev 3 function 0: CMD Technology PCI0646 (rev. 0x03)
cmdide0: bus-master DMA support present
cmdide0: primary channel configured to native-PCI mode
cmdide0: using ivec 1820 for native-PCI interrupt
atabus0 at cmdide0 channel 0
cmdide0: secondary channel configured to native-PCI mode
atabus1 at cmdide0 channel 1
[..]
wd0 at atabus0 drive 0
wd0: <Maxtor 32049H2>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 19541 MB, 39704 cyl, 16 head, 63 sec, 512 bytes/sect x 40021632 sectors
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd0(cmdide0:0:0): using PIO mode 4, DMA mode 2 (using DMA)
wd1 at atabus1 drive 0
wd1: <WDC WD205AA>
wd1: drive supports 16-sector PIO transfers, LBA addressing
wd1: 19569 MB, 39761 cyl, 16 head, 63 sec, 512 bytes/sect x 40079088 sectors
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4 (Ultra/66)
wd1(cmdide0:1:0): using PIO mode 4, DMA mode 2 (using DMA)
and the system hangs idle at mountroot(); apparently the cmdide transfers
never finish.
>How-To-Repeat:
Boot -current on a Sun U5
>Fix:
n/a
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->jdolecek
Responsible-Changed-By: jdolecek@NetBSD.org
Responsible-Changed-When: Mon, 09 Oct 2017 22:02:31 +0000
Responsible-Changed-Why:
My changes broke this.
From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/52606 CVS commit: src/sys/dev/scsipi
Date: Tue, 10 Oct 2017 21:37:49 +0000
Module Name: src
Committed By: jdolecek
Date: Tue Oct 10 21:37:49 UTC 2017
Modified Files:
src/sys/dev/scsipi: atapi_wdc.c
Log Message:
revert the logic in wdc_atapi_intr() for wdc_wait_for_unbusy() to what it
was before NCQ merge; it got broken during the effort to remove ch_status
and ch_error on the branch
fixes atapi timeouts in vbox and with real hardware reported separately
by Abhinav Upadhyay, Paul Goyette, Chavdar Ivanov, and Rares
Aioanei; with a bit of luck it could also fix PR kern/52605 and/or PR
kern/52606 by Martin Husemann
To generate a diff of this commit:
cvs rdiff -u -r1.127 -r1.128 src/sys/dev/scsipi/atapi_wdc.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/52606
Date: Wed, 11 Oct 2017 07:06:44 +0200
With a -current kernel it gets a tiny bit further:
root on raid0a dumps on raid0b
root file system type: ffs
kern.module.path=/stand/sparc64/8.99.4/modules
Wed Oct 11 06:30:53 MEST 2017
Not checking /: fs_passno = 0 in /etc/fstab
then it hangs endlessly and ddb ps shows:
0 42 3 0 200 100c16ce0 raidio0 biowait
Martin
From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/52606 CVS commit: src/sys/dev/ata
Date: Sat, 14 Oct 2017 13:15:14 +0000
Module Name: src
Committed By: jdolecek
Date: Sat Oct 14 13:15:14 UTC 2017
Modified Files:
src/sys/dev/ata: wd.c
Log Message:
only call drive reset with AT_POLL when the command itself was
polled, so that the logic for AT_POLL matches how e.g. ata_dmaerr() is
called; this was the original intent of the change in 1.428.2.25,
to make the error handling safe wrt. polled xfers
this is a stopgap fix for the ATA channel wedge after a DMA error, as reported
by Martin Husemann in PR kern/52606 and PR kern/52605
the problem happened due to ata_reset_channel() being called once in ata_dmaerr()
with flags == 0, which froze the channel and set the flag to reset via thread;
then ata_reset_channel() was called via wdc_drive_reset() with AT_POLL, which
just executed the reset and cleared the flag, without clearing the extra
freeze; that logic will be refactored in a separate commit
To generate a diff of this commit:
cvs rdiff -u -r1.430 -r1.431 src/sys/dev/ata/wd.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
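
The wedge described in the log message above can be illustrated with a
small stand-alone model. This is only a sketch under assumptions: the
struct, the MODEL_POLL flag and all model_* functions are hypothetical
names invented for illustration, not the NetBSD ATA code, and the real
ata_reset_channel() does considerably more. It only shows how a deferred
reset request followed by a polled reset can leave one freeze behind, and
why matching the poll flag to the original command avoids that.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_POLL 0x1		/* hypothetical stand-in for AT_POLL */

/* Hypothetical, heavily simplified channel state (not the kernel struct). */
struct model_channel {
	int  freeze_cnt;	/* new transfers start only while this is 0 */
	bool reset_pending;	/* a reset has been deferred to the thread */
};

/* Model of a reset request: defer when not polled, reset at once when polled. */
static void
model_reset_channel(struct model_channel *ch, int flags)
{
	if ((flags & MODEL_POLL) == 0) {
		if (!ch->reset_pending) {
			ch->freeze_cnt++;	/* freeze until the thread runs */
			ch->reset_pending = true;
		}
		return;
	}
	/* Polled: the reset runs now and the flag is cleared, but a freeze
	 * taken by an earlier deferred request is not dropped here. */
	ch->reset_pending = false;
}

/* Model of the channel thread: performs a deferred reset, then thaws. */
static void
model_reset_thread(struct model_channel *ch)
{
	if (!ch->reset_pending)
		return;			/* nothing requested, nothing thawed */
	ch->reset_pending = false;
	ch->freeze_cnt--;
}

/*
 * Run the DMA error path for a transfer issued with xfer_flags and return
 * the freeze count left afterwards (0 means the channel can run again).
 * With match_poll_to_xfer false the drive reset is always polled (the
 * behaviour described as broken); with true it is polled only when the
 * command itself was polled (the stopgap).
 */
static int
model_dma_error(int xfer_flags, bool match_poll_to_xfer)
{
	struct model_channel ch = { 0, false };
	int cmd_poll = xfer_flags & MODEL_POLL;
	int reset_flags = match_poll_to_xfer ? cmd_poll : MODEL_POLL;

	model_reset_channel(&ch, cmd_poll);	/* from the DMA error handler */
	model_reset_channel(&ch, reset_flags);	/* from the drive reset */
	model_reset_thread(&ch);
	assert(ch.freeze_cnt >= 0);
	return ch.freeze_cnt;
}
```

In this model, model_dma_error(0, false) leaves a freeze count of 1, i.e.
the channel stays frozen, while model_dma_error(0, true) returns 0.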
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/52606
Date: Sun, 15 Oct 2017 09:38:31 +0200
On Wed, Oct 11, 2017 at 07:06:44AM +0200, Martin Husemann wrote:
> With a -current kernel it gets a tiny bit further:
>
> root on raid0a dumps on raid0b
> root file system type: ffs
> kern.module.path=/stand/sparc64/8.99.4/modules
> Wed Oct 11 06:30:53 MEST 2017
> Not checking /: fs_passno = 0 in /etc/fstab
>
> then it hangs endlessly and ddb ps shows:
>
> 0 42 3 0 200 100c16ce0 raidio0 biowait
Exactly the same still happens with a -current kernel as of a few minutes
ago.
Martin
State-Changed-From-To: open->analyzed
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Thu, 19 Oct 2017 19:53:13 +0000
State-Changed-Why:
The cmdide driver shares the queue between the two channels, and the queue code
doesn't account for this case. I'll need to figure out a solution for this.
From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/52606 CVS commit: src/sys/dev/pci
Date: Thu, 19 Oct 2017 20:11:38 +0000
Module Name: src
Committed By: jdolecek
Date: Thu Oct 19 20:11:38 UTC 2017
Modified Files:
src/sys/dev/pci: cmdide.c pciidevar.h
Log Message:
replace the check for the shared channel of cmdide(4) with a flag in the
product array, rather than a switch inside the attach routine
XXX judging from product name, Silicon Image 0680 might be newer than 0649
XXX and hence have actually independent channels, but I don't have the hw
XXX so keeping as-is
no functional change, just to improve visibility in course of fixing
PR kern/52606
To generate a diff of this commit:
cvs rdiff -u -r1.39 -r1.40 src/sys/dev/pci/cmdide.c
cvs rdiff -u -r1.47 -r1.48 src/sys/dev/pci/pciidevar.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/52606 CVS commit: src/sys
Date: Fri, 20 Oct 2017 07:06:08 +0000
Module Name: src
Committed By: jdolecek
Date: Fri Oct 20 07:06:08 UTC 2017
Modified Files:
src/sys/arch/acorn32/eb7500atx: rside.c
src/sys/arch/acorn32/mainbus: wdc_pioc.c
src/sys/arch/acorn32/podulebus: icside.c rapide.c simide.c
src/sys/arch/amiga/dev: efa.c wdc_acafh.c wdc_amiga.c wdc_buddha.c
wdc_xsurf.c
src/sys/arch/arm/gemini: obio_wdc.c
src/sys/arch/atari/dev: wdc_mb.c
src/sys/arch/dreamcast/dev/g1: wdc_g1.c
src/sys/arch/evbarm/iq31244: wdc_obio.c
src/sys/arch/evbarm/tsarm: wdc_ts.c
src/sys/arch/evbppc/mpc85xx: wdc_obio.c
src/sys/arch/i386/pnpbios: pciide_pnpbios.c
src/sys/arch/landisk/dev: wdc_obio.c
src/sys/arch/mac68k/obio: wdc_obio.c
src/sys/arch/macppc/dev: kauai.c wdc_obio.c
src/sys/arch/mips/adm5120/dev: wdc_extio.c
src/sys/arch/mmeye/dev: wdc_mainbus.c
src/sys/arch/playstation2/dev: wdc_spd.c
src/sys/arch/prep/pnpbus: wdc_pnpbus.c
src/sys/dev/ata: ata.c ata_subr.c
src/sys/dev/ic: ahcisata_core.c ninjaata32.c siisata.c wdc.c wdc_upc.c
src/sys/dev/isa: wdc_isa.c
src/sys/dev/isapnp: wdc_isapnp.c
src/sys/dev/ofisa: wdc_ofisa.c
src/sys/dev/pci: artsata.c cmdide.c cypide.c pciide_common.c pdcsata.c
satalink.c viaide.c
src/sys/dev/pcmcia: wdc_pcmcia.c
src/sys/dev/podulebus: dtide.c hcide.c
src/sys/dev/usb: umass_isdata.c
Log Message:
move ata_queue_alloc(1) and ata_queue_free() calls to ata_channel_init()
and ata_channel_destroy() respectively, to make attachment code simpler,
and to make it easier to spot special queue manipulation like cmdide(4)
on topic of PR kern/52606
To generate a diff of this commit:
cvs rdiff -u -r1.15 -r1.16 src/sys/arch/acorn32/eb7500atx/rside.c
cvs rdiff -u -r1.29 -r1.30 src/sys/arch/acorn32/mainbus/wdc_pioc.c
cvs rdiff -u -r1.33 -r1.34 src/sys/arch/acorn32/podulebus/icside.c
cvs rdiff -u -r1.31 -r1.32 src/sys/arch/acorn32/podulebus/rapide.c
cvs rdiff -u -r1.30 -r1.31 src/sys/arch/acorn32/podulebus/simide.c
cvs rdiff -u -r1.14 -r1.15 src/sys/arch/amiga/dev/efa.c
cvs rdiff -u -r1.5 -r1.6 src/sys/arch/amiga/dev/wdc_acafh.c \
src/sys/arch/amiga/dev/wdc_xsurf.c
cvs rdiff -u -r1.39 -r1.40 src/sys/arch/amiga/dev/wdc_amiga.c
cvs rdiff -u -r1.9 -r1.10 src/sys/arch/amiga/dev/wdc_buddha.c
cvs rdiff -u -r1.7 -r1.8 src/sys/arch/arm/gemini/obio_wdc.c
cvs rdiff -u -r1.39 -r1.40 src/sys/arch/atari/dev/wdc_mb.c
cvs rdiff -u -r1.2 -r1.3 src/sys/arch/dreamcast/dev/g1/wdc_g1.c
cvs rdiff -u -r1.10 -r1.11 src/sys/arch/evbarm/iq31244/wdc_obio.c
cvs rdiff -u -r1.10 -r1.11 src/sys/arch/evbarm/tsarm/wdc_ts.c
cvs rdiff -u -r1.5 -r1.6 src/sys/arch/evbppc/mpc85xx/wdc_obio.c
cvs rdiff -u -r1.32 -r1.33 src/sys/arch/i386/pnpbios/pciide_pnpbios.c
cvs rdiff -u -r1.9 -r1.10 src/sys/arch/landisk/dev/wdc_obio.c
cvs rdiff -u -r1.28 -r1.29 src/sys/arch/mac68k/obio/wdc_obio.c
cvs rdiff -u -r1.37 -r1.38 src/sys/arch/macppc/dev/kauai.c
cvs rdiff -u -r1.60 -r1.61 src/sys/arch/macppc/dev/wdc_obio.c
cvs rdiff -u -r1.9 -r1.10 src/sys/arch/mips/adm5120/dev/wdc_extio.c
cvs rdiff -u -r1.5 -r1.6 src/sys/arch/mmeye/dev/wdc_mainbus.c
cvs rdiff -u -r1.28 -r1.29 src/sys/arch/playstation2/dev/wdc_spd.c
cvs rdiff -u -r1.14 -r1.15 src/sys/arch/prep/pnpbus/wdc_pnpbus.c
cvs rdiff -u -r1.139 -r1.140 src/sys/dev/ata/ata.c
cvs rdiff -u -r1.3 -r1.4 src/sys/dev/ata/ata_subr.c
cvs rdiff -u -r1.58 -r1.59 src/sys/dev/ic/ahcisata_core.c
cvs rdiff -u -r1.19 -r1.20 src/sys/dev/ic/ninjaata32.c
cvs rdiff -u -r1.34 -r1.35 src/sys/dev/ic/siisata.c
cvs rdiff -u -r1.287 -r1.288 src/sys/dev/ic/wdc.c
cvs rdiff -u -r1.30 -r1.31 src/sys/dev/ic/wdc_upc.c
cvs rdiff -u -r1.60 -r1.61 src/sys/dev/isa/wdc_isa.c
cvs rdiff -u -r1.43 -r1.44 src/sys/dev/isapnp/wdc_isapnp.c
cvs rdiff -u -r1.35 -r1.36 src/sys/dev/ofisa/wdc_ofisa.c
cvs rdiff -u -r1.27 -r1.28 src/sys/dev/pci/artsata.c
cvs rdiff -u -r1.40 -r1.41 src/sys/dev/pci/cmdide.c
cvs rdiff -u -r1.31 -r1.32 src/sys/dev/pci/cypide.c
cvs rdiff -u -r1.64 -r1.65 src/sys/dev/pci/pciide_common.c
cvs rdiff -u -r1.28 -r1.29 src/sys/dev/pci/pdcsata.c
cvs rdiff -u -r1.54 -r1.55 src/sys/dev/pci/satalink.c
cvs rdiff -u -r1.85 -r1.86 src/sys/dev/pci/viaide.c
cvs rdiff -u -r1.125 -r1.126 src/sys/dev/pcmcia/wdc_pcmcia.c
cvs rdiff -u -r1.29 -r1.30 src/sys/dev/podulebus/dtide.c
cvs rdiff -u -r1.26 -r1.27 src/sys/dev/podulebus/hcide.c
cvs rdiff -u -r1.35 -r1.36 src/sys/dev/usb/umass_isdata.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/52606 CVS commit: src/sys/dev/pci
Date: Sun, 22 Oct 2017 13:13:56 +0000
Module Name: src
Committed By: jdolecek
Date: Sun Oct 22 13:13:55 UTC 2017
Modified Files:
src/sys/dev/pci: cmdide.c pciidevar.h
Log Message:
do not share the queue between the non-independent channels; instead make
sure only one of the channels is ever active on the same controller
fixes PR kern/52606 by Martin Husemann, thanks for report and testing
To generate a diff of this commit:
cvs rdiff -u -r1.42 -r1.43 src/sys/dev/pci/cmdide.c
cvs rdiff -u -r1.48 -r1.49 src/sys/dev/pci/pciidevar.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
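
The approach named in the log message above (separate queues, but only
one channel active per controller at a time) might look roughly like the
following sketch. It is not taken from cmdide.c; the structs and the
model_* functions are hypothetical and exist only to illustrate the idea
of a controller-wide "active channel" gate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical controller state shared by both channels. */
struct model_controller {
	int active_channel;	/* -1 when idle, else index of the owner */
};

/* Hypothetical channel with its own (unshared) queue depth. */
struct model_chan {
	struct model_controller *ctrl;
	int index;
	int queued;		/* transfers waiting on this channel */
};

/* Start the next transfer only if no channel currently owns the controller. */
static bool
model_try_start(struct model_chan *ch)
{
	if (ch->queued == 0)
		return false;
	if (ch->ctrl->active_channel != -1)
		return false;	/* some channel (possibly this one) is busy */
	ch->ctrl->active_channel = ch->index;
	ch->queued--;
	return true;
}

/* Transfer finished: release the controller, let the sibling go first. */
static void
model_done(struct model_chan *ch, struct model_chan *sibling)
{
	assert(ch->ctrl->active_channel == ch->index);
	ch->ctrl->active_channel = -1;
	if (!model_try_start(sibling))
		(void)model_try_start(ch);
}
```

Offering the controller to the sibling first in model_done() is one
possible design choice to keep either channel from starving; whether the
real driver does this is not stated in the report.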
State-Changed-From-To: analyzed->closed
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Sun, 22 Oct 2017 13:22:09 +0000
State-Changed-Why:
Problem fixed. Thanks for report and testing!
>Unformatted: