NetBSD Problem Report #48214

From gson@gson.org  Sun Sep 15 13:04:22 2013
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 88A3C72150
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 15 Sep 2013 13:04:22 +0000 (UTC)
Message-Id: <20130915130417.D14FB75FC8@guava.gson.org>
Date: Sun, 15 Sep 2013 16:04:17 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@gnats.NetBSD.org
Subject: "clearing WDCTL_RST failed" during boot
X-Send-Pr-Version: 3.95

>Number:         48214
>Category:       kern
>Synopsis:       "clearing WDCTL_RST failed" during boot
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Sep 15 13:05:00 +0000 2013
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current, source date 2013.09.12.07.26.13
>Organization:
>Environment:
System: NetBSD guido.araneus.fi
Architecture: x86_64
Machine: amd64
>Description:

I have an Intel DH67CLB3 motherboard and a Western Digital WD2500KS
SATA disk.  When the disk is plugged into either one of the two 6 Gbps
SATA ports on the motherboard, about half of the boot attempts fail.
When they fail, the system ends up at a "root device:" prompt, and
earlier in the boot messages, there will be an error message:

  ahcisata0 channel 0: clearing WDCTL_RST failed for drive 0

If I boot NetBSD 6.0 instead of -current, the behavior is similar,
except that the error message then reads

  ahcisata0: BSY never cleared, TD 0x80

If the machine gets past the boot, the disk works reliably after that.

The motherboard also has four 3 Gbps SATA ports.  If the disk is
plugged into one of those instead of the 6 Gbps ones, it works fine:
I have done 100 successive reboots without running into the problem.
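
For anyone trying to reproduce the failure rate, the following is a
minimal sketch of how saved boot logs could be tallied.  It assumes the
console output of each boot attempt has been captured as text (for
example over a serial console); the function names are hypothetical and
not part of any NetBSD tool.  The only strings taken from this report
are the two failure messages quoted above.

```python
import re

# Failure signatures quoted in this report:
# the first is from -current, the second from NetBSD 6.0.
FAILURE_PATTERNS = [
    re.compile(r"clearing WDCTL_RST failed"),
    re.compile(r"BSY never cleared"),
]


def boot_failed(dmesg_text):
    """Return True if the boot log contains a known failure message."""
    return any(p.search(dmesg_text) for p in FAILURE_PATTERNS)


def failure_rate(logs):
    """Return the fraction of boot logs (a list of strings) that failed."""
    if not logs:
        return 0.0
    return sum(boot_failed(log) for log in logs) / len(logs)
```

Comparing the rate for logs captured on a 6 Gbps port against logs
captured on a 3 Gbps port would put a number on the difference
described above.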

Replacing the SATA data and power cables made no difference.  I have
not tried replacing the disk or motherboard.

The same machine is also afflicted by PRs 38970, 46596, 46696, 47153,
and 48213.

A full dmesg from a successful boot (well, successful apart from PR
46696) follows.  Here, the disk was attached to a 3 Gbps SATA port.

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 6.99.23 (GENERIC) #0: Thu Sep 12 16:26:55 EEST 2013
	gson@guido.araneus.fi:/tmp/bracket/build/2013.09.12.07.26.13-amd64/obj/sys/arch/amd64/compile/GENERIC
total memory = 16292 MB
avail memory = 15803 MB
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
                                                                    (                                 )
mainbus0 (root)
cpu0 at mainbus0 apid 0: Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz, id 0x206a7
cpu1 at mainbus0 apid 2: Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz, id 0x206a7
cpu2 at mainbus0 apid 4: Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz, id 0x206a7
cpu3 at mainbus0 apid 6: Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz, id 0x206a7
ioapic0 at mainbus0 apid 0: pa 0xfec00000, version 0x20, 24 pins
acpi0 at mainbus0: Intel ACPICA 20110623
acpi0: X/RSDT: OemId <INTEL ,DH67CL  ,01072009>, AslId <AMI ,00010013>
acpi0: SCI interrupting at int 9
timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000
hpet0 at acpi0: high precision event timer (mem 0xfed00000-0xfed00400)
timecounter: Timecounter "hpet0" frequency 14318180 Hz quality 2000
MCH (PNP0C01) at acpi0 not configured
SIO1 (PNP0C02) at acpi0 not configured
attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x43 irq 0
pcppi1 at acpi0 (SPKR, PNP0800): io 0x61
midi0 at pcppi1: PC speaker
sysbeep0 at pcppi1
RMSC (PNP0C02) at acpi0 not configured
PCH (PNP0C01) at acpi0 not configured
CWDT (INT3F0D) at acpi0 not configured
acpibut0 at acpi0 (PWRB, PNP0C0C-170): ACPI Power Button
RMEM (PNP0C01) at acpi0 not configured
OMSC (PNP0C02) at acpi0 not configured
attimer1: attached to pcppi1
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: vendor 0x8086 product 0x0100 (rev. 0x09)
vga0 at pci0 dev 2 function 0: vendor 0x8086 product 0x0102 (rev. 0x09)
wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation)
wsmux1: connecting to wsdisplay0
drm at vga0 not configured
vendor 0x8086 product 0x1c3a (miscellaneous communications, revision 0x04) at pci0 dev 22 function 0 not configured
wm0 at pci0 dev 25 function 0: PCH2 LAN (82579V) Controller (rev. 0x05)
wm0: interrupting at ioapic0 pin 20
wm0: PCI-Express bus
wm0: FLASH
wm0: Ethernet address 38:60:77:b4:e5:f5
ihphy0 at wm0 phy 2: i82579 10/100/1000 media interface, rev. 3
ihphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
ehci0 at pci0 dev 26 function 0: vendor 0x8086 product 0x1c2d (rev. 0x05)
ehci0: interrupting at ioapic0 pin 16
ehci0: EHCI version 1.0
usb0 at ehci0: USB revision 2.0
hdaudio0 at pci0 dev 27 function 0: HD Audio Controller
hdaudio0: interrupting at ioapic0 pin 22
hdafg0 at hdaudio0: Realtek ALC892
hdafg0: DAC00 6ch: Speaker [Jack]
hdafg0: DAC01 2ch: HP Out [Jack]
hdafg0: DIG02 2ch: SPDIF Out [Jack]
hdafg0: DIG03 2ch: SPDIF Out [Jack]
hdafg0: ADC04 2ch: Line In [Jack], Mic In [Jack]
hdafg0: ADC05 2ch: Mic In [Jack]
hdafg0: 6ch/2ch 32000Hz 44100Hz 48000Hz 88200Hz 96000Hz 192000Hz PCM16 PCM20 PCM24 AC3
audio0 at hdafg0: full duplex, playback, capture, independent
hdafg1 at hdaudio0: Intel product 2805
hdafg1: DP00 8ch: Digital Out [Jack]
hdafg1: 8ch/0ch 48000Hz PCM16*
ppb0 at pci0 dev 28 function 0: vendor 0x8086 product 0x1c10 (rev. 0xb5)
ppb0: PCI Express 2.0 <Root Port of PCI-E Root Complex> x1 @ 5.0Gb/s
ppb0: link is x1 @ 2.5Gb/s
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
ppb1 at pci1 dev 0 function 0: vendor 0x1283 product 0x8892 (rev. 0x30)
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
puc0 at pci2 dev 1 function 0: NetMos NM9865 1 UART (com)
com2 at puc0 port 0: ioaddr 0xe130, interrupting at ioapic0 pin 17
com2: ns16550a, working fifo
puc1 at pci2 dev 1 function 1: NetMos NM9865 1 UART (com)
com3 at puc1 port 0: ioaddr 0xe120, interrupting at ioapic0 pin 18
com3: ns16550a, working fifo
re0 at pci2 dev 2 function 0: RealTek 8169/8110 Gigabit Ethernet (rev. 0x10)
re0: interrupting at ioapic0 pin 18
re0: Ethernet address 00:50:fc:fb:10:45
re0: using 256 tx descriptors
rgephy0 at re0 phy 7: RTL8169S/8110S/8211 1000BASE-T media interface, rev. 0
rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
ehci1 at pci0 dev 29 function 0: vendor 0x8086 product 0x1c26 (rev. 0x05)
ehci1: interrupting at ioapic0 pin 23
ehci1: EHCI version 1.0
usb1 at ehci1: USB revision 2.0
ichlpcib0 at pci0 dev 31 function 0: vendor 0x8086 product 0x1c4a (rev. 0x05)
timecounter: Timecounter "ichlpcib0" frequency 3579545 Hz quality 1000
ichlpcib0: 24-bit timer
ichlpcib0: TCO (watchdog) timer configured.
ahcisata0 at pci0 dev 31 function 2: vendor 0x8086 product 0x1c02 (rev. 0x05)
ahcisata0: interrupting at ioapic0 pin 19
ahcisata0: 64-bit DMA
ahcisata0: AHCI revision 1.30, 6 ports, 32 slots, CAP 0xe730ff65<SXS,EMS,PSC,SSC,PMD,ISS=0x3=Gen3,SCLO,SAL,SALP,SSNTF,SNCQ,S64A>
atabus0 at ahcisata0 channel 0
atabus1 at ahcisata0 channel 1
atabus2 at ahcisata0 channel 2
atabus3 at ahcisata0 channel 3
atabus4 at ahcisata0 channel 4
atabus5 at ahcisata0 channel 5
ichsmb0 at pci0 dev 31 function 3: vendor 0x8086 product 0x1c22 (rev. 0x05)
ichsmb0: interrupting at ioapic0 pin 18
iic0 at ichsmb0: I2C bus
isa0 at ichlpcib0
pckbc0 at isa0 port 0x60-0x64
acpicpu0 at cpu0: ACPI CPU
acpicpu0: C1: FFH, lat   1 us, pow  1000 mW
acpicpu0: C2: FFH, lat  80 us, pow   500 mW
acpicpu0: C3: FFH, lat 104 us, pow   350 mW, bus master check
acpicpu0: P0: FFH, lat  10 us, pow 95000 mW, 3301 MHz, turbo boost
acpicpu0: P1: FFH, lat  10 us, pow 95000 mW, 3300 MHz
acpicpu0: P2: FFH, lat  10 us, pow 87000 mW, 3100 MHz
acpicpu0: P3: FFH, lat  10 us, pow 80000 mW, 2900 MHz
acpicpu0: P4: FFH, lat  10 us, pow 72000 mW, 2700 MHz
acpicpu0: P5: FFH, lat  10 us, pow 66000 mW, 2500 MHz
acpicpu0: P6: FFH, lat  10 us, pow 59000 mW, 2300 MHz
acpicpu0: P7: FFH, lat  10 us, pow 53000 mW, 2100 MHz
acpicpu0: P8: FFH, lat  10 us, pow 47000 mW, 1900 MHz
acpicpu0: P9: FFH, lat  10 us, pow 41000 mW, 1700 MHz
acpicpu0: P10: FFH, lat  10 us, pow 38000 mW, 1600 MHz
acpicpu0: T0: FFH, lat   1 us, pow  1000 mW, 100 %
acpicpu0: T1: FFH, lat   1 us, pow   875 mW,  88 %
acpicpu0: T2: FFH, lat   1 us, pow   750 mW,  75 %
acpicpu0: T3: FFH, lat   1 us, pow   625 mW,  63 %
acpicpu0: T4: FFH, lat   1 us, pow   500 mW,  50 %
acpicpu0: T5: FFH, lat   1 us, pow   375 mW,  38 %
acpicpu0: T6: FFH, lat   1 us, pow   250 mW,  25 %
acpicpu0: T7: FFH, lat   1 us, pow   125 mW,  13 %
coretemp0 at cpu0: thermal sensor, 1 C resolution
acpicpu1 at cpu1: ACPI CPU
coretemp1 at cpu1: thermal sensor, 1 C resolution
acpicpu2 at cpu2: ACPI CPU
coretemp2 at cpu2: thermal sensor, 1 C resolution
acpicpu3 at cpu3: ACPI CPU
coretemp3 at cpu3: thermal sensor, 1 C resolution
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
timecounter: Timecounter "TSC" frequency 3292742840 Hz quality 3000
uhub0 at usb0: vendor 0x8086 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub1 at usb1: vendor 0x8086 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
ahcisata0 port 2: device present, speed: 3.0Gb/s
wd0 at atabus2 drive 0
wd0: <WDC WD2500KS-00MJB0>
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 232 GB, 484521 cyl, 16 head, 63 sec, 512 bytes/sect x 488397168 sectors
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd0(ahcisata0:2:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133) (using DMA)
uhub2 at uhub0 port 1: vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2
uhub2: single transaction translator
uhub3 at uhub1 port 1: vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2
uhub3: single transaction translator
uhub2: 6 ports with 6 removable, self powered
uhub3: 8 ports with 8 removable, self powered
uhub2: device problem, disabling port 4
Kernelized RAIDframe activated
pad0: outputs: 44100Hz, 16-bit, stereo
audio1 at pad0: half duplex, playback, capture
boot device: wd0
root on wd0a dumps on wd0b
root file system type: ffs
wsdisplay0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)

>How-To-Repeat:

>Fix:
