NetBSD Problem Report #33291

From abs@purplei.com  Wed Apr 19 09:59:00 2006
Return-Path: <abs@purplei.com>
Received: from gta.purplei.com (host-84-9-61-34.bulldogdsl.com [84.9.61.34])
	by narn.netbsd.org (Postfix) with ESMTP id 9399F63B884
	for <gnats-bugs@gnats.NetBSD.org>; Wed, 19 Apr 2006 09:58:59 +0000 (UTC)
Message-Id: <E1FW9Sb-0001RX-I7@gta.purplei.com>
Date: Wed, 19 Apr 2006 10:58:57 +0100
From: abs@absd.org
Reply-To: abs@absd.org
To: gnats-bugs@netbsd.org
Subject: pdcsata hang kernel on writing device timeout/retry
X-Send-Pr-Version: 3.95

>Number:         33291
>Category:       port-i386
>Synopsis:       pdcsata hang kernel on writing device timeout/retry
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    port-i386-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Apr 19 10:00:00 +0000 2006
>Last-Modified:  Sat Nov 16 09:35:01 +0000 2019
>Originator:     David Brownlee
>Release:        NetBSD 3.0_STABLE
>Organization:
>Environment:


System: NetBSD gta.i 3.0_STABLE NetBSD 3.0_STABLE (_ACPI_) #1: Thu Mar 16 15:47:21 GMT 2006 root@tll.i:/var/obj/i386/files/netbsd/3/sys/arch/i386/compile/_ACPI_ i386
Architecture: i386
Machine: i386
>Description:
	The machine is an Asus A8V-Deluxe motherboard with two 300GB RAID1
	raidframe filesystems (four WD3200 disks). We occasionally
	see write retries on some of the disks. When this happens on a
	disk attached to the viaide SATA controller, the write is retried
	and everything is fine:

viaide0:1:0: lost interrupt
	type: ata tc_bcount: 65536 tc_skip: 0
viaide0:1:0: bus-master DMA error: missing interrupt, status=0x21
viaide0:1:0: device timeout, c_bcount=65536, c_skip0
wd3e: device timeout writing fsbn 177628224 of 177628224-177628351 (wd3 bn 177628287; cn 176218 tn 8 sn 39), retrying
wd3: soft error (corrected)

	When it happens on a disk attached to pdcsata, the machine locks
	solid (it is not even possible to switch wscons virtual consoles)
	after printing the following:

pdcsata0:1:0: lost interrupt
        type: ata tc_bcount: 16384 tc_skip:0

	The machine is currently running with ddb.fromconsole enabled; I
	will try to capture more details if it happens again.
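
	For reference, a minimal sketch of how ddb.fromconsole can be
	enabled so that a hung machine can still be inspected. It assumes
	a kernel built with "options DDB"; the DDB commands in the
	comments are suggestions for what to capture, not something taken
	from this report.

```
# /etc/sysctl.conf fragment: allow entering DDB from the console.
# The same setting can be applied at runtime with:
#   sysctl -w ddb.fromconsole=1
ddb.fromconsole=1

# Once in DDB after a hang, commands worth capturing output from:
#   trace     stack trace of the current context
#   ps        process list with wait channels
```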

pdcsata0 at pci0 dev 8 function 0
pdcsata0: Promise PDC20378 SATA150 controller (rev. 0x02)
pdcsata0: interrupting at ioapic0 pin 18 (irq 10)
pdcsata0: bus-master DMA support present
atabus0 at pdcsata0 channel 0
atabus1 at pdcsata0 channel 1
atabus2 at pdcsata0 channel 2
[...]
viaide0 at pci0 dev 15 function 0
viaide0: VIA Technologies VT8237 SATA Controller (rev. 0x80)
viaide0: bus-master DMA support present
viaide0: primary channel wired to native-PCI mode
viaide0: using ioapic0 pin 20 (irq 11) for native-PCI interrupt
atabus5 at viaide0 channel 0
viaide0: secondary channel wired to native-PCI mode
atabus6 at viaide0 channel 1
[...]
wd0 at atabus0 drive 0: <WDC WD3200SD-01KNB0>
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 298 GB, 620181 cyl, 16 head, 63 sec, 512 bytes/sect x 625142448 sectors
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd0(pdcsata0:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA)
wd1 at atabus1 drive 0: <WDC WD3200SD-01KNB0>
wd1: drive supports 16-sector PIO transfers, LBA48 addressing
wd1: 298 GB, 620181 cyl, 16 head, 63 sec, 512 bytes/sect x 625142448 sectors
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd1(pdcsata0:1:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA)
wd2 at atabus5 drive 0: <WDC WD3200JD-00KLB0>
wd2: drive supports 16-sector PIO transfers, LBA48 addressing
wd2: 298 GB, 620181 cyl, 16 head, 63 sec, 512 bytes/sect x 625142448 sectors
wd2: 32-bit data port
wd2: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd2(viaide0:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA)
wd3 at atabus6 drive 0: <WDC WD3200JD-00KLB0>
wd3: drive supports 16-sector PIO transfers, LBA48 addressing
wd3: 298 GB, 620181 cyl, 16 head, 63 sec, 512 bytes/sect x 625142448 sectors
wd3: 32-bit data port
wd3: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd3(viaide0:1:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA)

>How-To-Repeat:

	Run a disk attached to a Promise PDC20378 SATA150 controller until
	it gets a 'lost interrupt'.
>Fix:

	Do not use pdcsata.



>Audit-Trail:
From: T <bobs@thelibertytree.org>
To: gnats-bugs@NetBSD.org
Cc: "abs@absd.org >> David Brownlee" <abs@absd.org>
Subject: Re: port-i386/33291
Date: Sat, 16 Nov 2019 02:31:56 -0700

 Hello, I was also having this problem with the pdcsata driver on a
 PDC40719 FastTrak TX4. I am using a different port (NetBSD/prep) but
 saw errors similar to:

 pdcsata0:1:0: lost interrupt
          type: ata tc_bcount: 16384 tc_skip:01

 The difference was the 'tc_bcount' number and a zero 'tc_skip'.
 Certain tasks with high drive I/O would eventually result in a "stuck"
 terminal, although DDB could still be entered using the key sequence,
 the caps/num lock LEDs would toggle, and the machine could be pinged.
 Adding the 'buffer queue strategy' option BUFQ_READPRIO to the kernel
 config resolved these problems, and I have not seen them since. The
 other option, BUFQ_PRIOSCAN, may also work, but I cannot tell at this
 point because enabling it causes a kernel panic on this specific
 system. Note also that many ports, including i386, enable
 BUFQ_PRIOSCAN in their GENERIC configs, so this problem may not
 resurface unless someone specifically uses the 'disksort' strategy in
 combination with this driver. This is an old ticket, but I figured
 this information would help others who run into this problem.

 -Tim
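
 The workaround described above could look roughly like the following
 kernel config fragment. This is a sketch only: it assumes a custom
 config derived from the port's GENERIC file, and the comments are
 illustrative rather than copied from any shipped config.

```
# Buffer queue strategy (assumed custom kernel config fragment).
# BUFQ_READPRIO is the option reported above to avoid the hang.
options 	BUFQ_READPRIO

# BUFQ_PRIOSCAN is the alternative strategy; it is left disabled here
# because it was reported above to panic on the commenter's system.
#options 	BUFQ_PRIOSCAN
```

 After changing the options, the kernel has to be reconfigured and
 rebuilt for the new strategy to take effect.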

>Unformatted:
