NetBSD Problem Report #47290
From cheusov@tut.by Thu Dec 6 23:18:26 2012
Return-Path: <cheusov@tut.by>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
by www.NetBSD.org (Postfix) with ESMTP id 58CC963EAA8
for <gnats-bugs@gnats.netbsd.org>; Thu, 6 Dec 2012 23:18:26 +0000 (UTC)
Message-Id: <s93a9trqhwy.fsf@cheusov.imb.invention.com>
Date: Thu, 06 Dec 2012 19:11:57 +0300
From: cheusov@tut.by
To: gnats-bugs@gnats.NetBSD.org
Subject: Boot hangs up (regression in 6.0.0_PATCH since 6.0_RELEASE)
X-Send-Pr-Version: 3.95
>Number: 47290
>Notify-List: ignatios@cs.uni-bonn.de
>Category: kern
>Synopsis: Boot hangs up (regression in 6.0.0_PATCH since 6.0_RELEASE)
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: chs
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Dec 06 23:20:06 +0000 2012
>Last-Modified: Fri Jan 31 12:20:00 +0000 2014
>Originator: Aleksey Cheusov
>Release: NetBSD 6.0.0_PATCH
>Organization:
>Environment:
System: NetBSD cheusov.imb.invention.com 6.0.0_PATCH NetBSD 6.0.0_PATCH (GENERIC) #2: Thu Dec 6 18:53:59 FET 2012 cheusov@cheusov.imb.invention.com:/srv/obj/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
After updating from 6.0_RELEASE to 6.0.0_PATCH
my system stopped booting. The last message I see is the following:
acpi0 at mainbus0: Intel ACPICA 20110623
that is, it hangs at an early stage.
This is a relatively modern machine with a dual-core AMD Athlon(tm) 64 X2 4600+.
>How-To-Repeat:
>Fix:
Reverting the following commit solves the problem
revision 1.18.14.1
date: 2012/11/22 00:34:25; author: riz; state: Exp; lines: +2 -4
Pull up following revision(s) (requested by chs in ticket #682):
sys/dev/acpi/acpi_pci_link.c: revision 1.19
re-enable the code to disable link devices at startup, ie. revert rev 1.3.
this fixes PCI interrupts on some systems (eg. HP XW9400) and we suspect that
the problems which led to the original change were caused by buggy early
implementations of ACPI, which are now ignored by date.
>Release-Note:
>Audit-Trail:
From: Jeff Rizzo <riz@boogers.sf.ca.us>
To: gnats-bugs@NetBSD.org
Cc: cheusov@tut.by
Subject: Re: kern/47290: Boot hangs up (regression in 6.0.0_PATCH since 6.0_STABLE)
Date: Thu, 06 Dec 2012 16:41:37 -0800
On 12/6/12 3:20 PM, cheusov@tut.by wrote:
> After update from 6.0_STABLE to 6.0.0_PATCH
> my system stops booting. The last message I see is the following
>
The particular problem you noted has been seen by others; however, you
should be aware that 6.0_STABLE to 6.0_PATCH is a downgrade in many
ways. 6.0_STABLE will become 6.1, while 6.0_PATCH will become 6.0.1.
You should not be mixing-and-matching.
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/47290: Boot hangs up (regression in 6.0.0_PATCH since 6.0_STABLE)
Date: Fri, 7 Dec 2012 09:42:26 +0100
I have a machine that is affected as well. Chuck and I are debugging it,
but progress is slooow. My machine hangs up midway through interpreting the _DIS
call, while (for example) FreeBSD seems to do the same _DIS call and my
machine survives it there.
Martin
From: Aleksey Cheusov <cheusov@tut.by>
To: Jeff Rizzo <riz@boogers.sf.ca.us>
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/47290: Boot hangs up (regression in 6.0.0_PATCH since 6.0_STABLE)
Date: Fri, 7 Dec 2012 13:37:16 +0300
On Fri, Dec 7, 2012 at 3:41 AM, Jeff Rizzo <riz@boogers.sf.ca.us> wrote:
> you should be aware that 6.0_STABLE to 6.0_PATCH is a downgrade in many ways.
> 6.0_STABLE will become 6.1, while 6.0_PATCH will become 6.0.1. You should
> not be mixing-and-matching.
Fix: s/_STABLE/_RELEASE/
I updated the system along the netbsd-6-0 branch.
Responsible-Changed-From-To: kern-bug-people->chs
Responsible-Changed-By: chs@NetBSD.org
Responsible-Changed-When: Fri, 07 Dec 2012 12:53:03 +0000
Responsible-Changed-Why:
due to my change
From: "Kai-Uwe Eckhardt" <kuehro@gmx.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/47290 (Boot hangs up (regression in 6.0.0_PATCH since
6.0_RELEASE))
Date: Sun, 16 Dec 2012 12:25:15 +0100
6.99.15 hangs with the same boot message on my Athlon X4. After
reverting acpi_pci_link.c from revision 1.19 to 1.18 it boots and
installs. ACPI related parts of dmesg below.
Thanks for the fix.
Kai-Uwe
NetBSD 6.99.15 (GENERIC) #0: Sat Dec 15 18:03:16 CET 2012
root@vm7.backusf2x.de:/usr/objdir/sys/arch/amd64/compile/GENERIC
total memory = 4095 MB
avail memory = 3960 MB
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
To Be Filled By O.E.M. To Be Filled By O.E.M. (To Be Filled By O.E.M.)
mainbus0 (root)
cpu0 at mainbus0 apid 0: AMD Athlon(tm) II X4 640 Processor, id 0x100f53
cpu1 at mainbus0 apid 1: AMD Athlon(tm) II X4 640 Processor, id 0x100f53
cpu2 at mainbus0 apid 2: AMD Athlon(tm) II X4 640 Processor, id 0x100f53
cpu3 at mainbus0 apid 3: AMD Athlon(tm) II X4 640 Processor, id 0x100f53
ioapic0 at mainbus0 apid 4: pa 0xfec00000, version 11, 24 pins
acpi0 at mainbus0: Intel ACPICA 20110623
acpi0: X/RSDT: OemId <A_M_I ,OEMRSDT ,12001014>, AslId <MSFT,00000097>
ioapic0 reenabling
acpi0: SCI interrupting at int 9
timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000
attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x43 irq 0
pcppi1 at acpi0 (SPKR, PNP0800): io 0x61
midi0 at pcppi1: PC speaker
sysbeep0 at pcppi1
LPTE (PNP0401) at acpi0 not configured
RMSC (PNP0C02) at acpi0 not configured
OMSC (PNP0C02) at acpi0 not configured
pckbc1 at acpi0 (PS2K, PNP0303) (kbd port): io 0x60,0x64 irq 1
pckbc2 at acpi0 (PS2M, PNP0F03) (aux port): irq 12
UAR1 (PNP0501) at acpi0 not configured
SIOR (PNP0C02) at acpi0 not configured
PCIE (PNP0C02) at acpi0 not configured
RMEM (PNP0C01) at acpi0 not configured
acpibut0 at acpi0 (PWRB, PNP0C0C-170): ACPI Power Button
acpicpu0 at cpu0: ACPI CPU
acpicpu0: C1: HLT, lat 0 us, pow 0 mW
acpicpu0: P0: FFH, lat 4 us, pow 25515 mW, 3000 MHz
acpicpu0: P1: FFH, lat 4 us, pow 17875 mW, 2300 MHz
acpicpu0: P2: FFH, lat 4 us, pow 14720 mW, 1800 MHz
acpicpu0: P3: FFH, lat 4 us, pow 9545 mW, 800 MHz
acpicpu1 at cpu1: ACPI CPU
acpicpu2 at cpu2: ACPI CPU
acpicpu3 at cpu3: ACPI CPU
--
--
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/47290: Boot hangs up (regression in 6.0.0_PATCH since 6.0_RELEASE)
Date: Fri, 11 Jan 2013 11:03:55 +0100
After a bit of painful debugging (including booting patched FreeBSD kernels
to verify their acpi subsystem does the same _DIS calls) I ended up with a
patch that seems to work around the problem - but it is a bit mind puzzling.
Unfortunately today's -current doesn't seem to boot on this machine, probably
due to something completely unrelated - so my boot still does not complete.
Anyway, the interesting part of the long story: the execution of some _DIS
method on my machine ends with an 8-bit wide write to PCI config space.
Since those writes are only well defined for 32-bit access, we do a
read-modify-write cycle to set the requested byte. It is the write part of
that cycle that kills my machine.
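The read-modify-write emulation described above can be sketched in isolation. This is a minimal illustration, not the NetBSD code: fake_cfg, cfg_read32, cfg_write32 and cfg_write8_rmw are hypothetical names, and an in-memory array stands in for real config space, so it shows only the masking arithmetic and cannot reproduce the hang.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one device's 256-byte config space,
 * stored as 64 dwords. Not a real hardware access path. */
uint32_t fake_cfg[64];

/* 32-bit aligned read from the simulated config space. */
uint32_t cfg_read32(unsigned reg)
{
	assert(reg < 256 && (reg & 3) == 0);
	return fake_cfg[reg >> 2];
}

/* 32-bit aligned write to the simulated config space. */
void cfg_write32(unsigned reg, uint32_t value)
{
	assert(reg < 256 && (reg & 3) == 0);
	fake_cfg[reg >> 2] = value;
}

/* Emulate an 8-bit write with a 32-bit read-modify-write cycle:
 * read the containing dword, replace one byte lane, write it back. */
void cfg_write8_rmw(unsigned reg, uint8_t value)
{
	unsigned shift = (reg & 3) * 8;           /* byte lane within the dword */
	uint32_t tmp = cfg_read32(reg & ~3u);     /* read all four bytes */

	tmp &= ~((uint32_t)0xff << shift);        /* clear the target byte */
	tmp |= (uint32_t)value << shift;          /* insert the new byte */
	cfg_write32(reg & ~3u, tmp);              /* write all four bytes back */
}
```

With the dword at register 124 preset to 0x11223344, cfg_write8_rmw(125, 0x00) leaves 0x11220044. Only byte lane 1 changes in value, but all four bytes are written back, which is where this differs from a true byte-wide access.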
So, despite knowing this is a bad idea, I hacked a patch that uses byte
access to directly store the requested byte in pci config space - and that
makes my machine survive. Now I am not sure if the out-of-spec byte access
is ignored by the hardware (so I basically disabled the deadly store), or if
this now all works as intended.
Now, I agree that in general we should forbid such writes, as some
architectures cannot even implement them for arbitrary PCI devices.
However, given the tight binding of ACPI and hardware, we should allow
them for this case.
The deadly store is this one:
calling _DIS for LNK3...
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 125 width 8 value 0
... done with _DIS for LNK3...
and the device it fiddles with is:
pcib0 at pci0 dev 1 function 0: vendor 0x10de product 0x0051 (rev. 0xa3)
The FreeBSD code does:
pci_cfgregwrite(PciId->Bus, PciId->Device, PciId->Function, Register,
Value, Width / 8);
and it ends up doing an outb() in this case:
	port = pci_cfgenable(bus, slot, func, reg, bytes);
	if (port != 0) {
		switch (bytes) {
		case 1:
			outb(port, data);
			break;
		case 2:
			outw(port, data);
			break;
		case 4:
			outl(port, data);
			break;
		}
		pci_cfgdisable();
	}
So I guess my patch is more or less correct (but obviously incomplete).
Here is the dmesg from the patched kernel:
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013
The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
NetBSD 6.99.16 (GENERIC) #57: Fri Jan 11 11:12:16 CET 2013
martin@seven-days-to-the-wolves.aprisoft.de:/usr/src/sys/arch/amd64/compile/GENERIC
total memory = 4094 MB
avail memory = 3960 MB
mainbus0 (root)
cpu0 at mainbus0 apid 0: AMD Opteron(tm) Processor 248, id 0xf5a
cpu0: WARNING: errata present, BIOS upgrade may be
cpu0: WARNING: necessary to ensure reliable operation
cpu1 at mainbus0 apid 1: AMD Opteron(tm) Processor 248, id 0xf5a
ioapic0 at mainbus0 apid 2
ioapic1 at mainbus0 apid 3
ioapic2 at mainbus0 apid 4
ioapic3 at mainbus0 apid 5
acpi0 at mainbus0: Intel ACPICA 20110623
calling _DIS for LSMB...
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 129 width 8 value 70
... done with _DIS for LSMB...
calling _DIS for LUS0...
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 132 width 8 value 0
... done with _DIS for LUS0...
calling _DIS for LUS2...
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 129 width 8 value 0
... done with _DIS for LUS2...
calling _DIS for LMAC...
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 133 width 8 value 0
... done with _DIS for LMAC...
calling _DIS for LACI...
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 134 width 8 value 0
... done with _DIS for LACI...
calling _DIS for LMCI...
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 134 width 8 value c
... done with _DIS for LMCI...
calling _DIS for LPID...
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 135 width 8 value 0
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 135 width 8 value 0
... done with _DIS for LPID...
calling _DIS for LTID...
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 131 width 8 value a
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 127 width 8 value 0
... done with _DIS for LTID...
calling _DIS for LSI1...
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 131 width 8 value b0
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 126 width 8 value 0
... done with _DIS for LSI1...
calling _DIS for LNK1...
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 124 width 8 value 0
... done with _DIS for LNK1...
calling _DIS for LNK2...
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 124 width 8 value 0
... done with _DIS for LNK2...
calling _DIS for LNK3...
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 125 width 8 value 0
... done with _DIS for LNK3...
calling _DIS for LNK4...
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 125 width 8 value 5
... done with _DIS for LNK4...
calling _DIS for LMAC...
AcpiOsWritePciConfiguration: bus 128 device 1 function 0: register 133 width 8 value 0
... done with _DIS for LMAC...
calling _DIS for LNK3...
AcpiOsWritePciConfiguration: bus 128 device 1 function 0: register 125 width 8 value 0
... done with _DIS for LNK3...
calling _DIS for LNK4...
AcpiOsWritePciConfiguration: bus 128 device 1 function 0: register 125 width 8 value 0
... done with _DIS for LNK4...
calling _DIS for LNK1...
AcpiOsWritePciConfiguration: bus 128 device 1 function 0: register 124 width 8 value 0
... done with _DIS for LNK1...
calling _DIS for LNK2...
AcpiOsWritePciConfiguration: bus 128 device 1 function 0: register 124 width 8 value 0
... done with _DIS for LNK2...
ioapic0 reenabling
acpibut0 at acpi0 (PWRB, PNP0C0C): ACPI Power Button
MEM0 (PNP0C01) at acpi0 not configured
PMIO (PNP0C02) at acpi0 not configured
SYS0 (PNP0C02) at acpi0 not configured
attimer1 at acpi0 (PIT0, PNP0100): io 0x40-0x43 irq 0
pcppi1 at acpi0 (SPK0, PNP0800): io 0x61
midi0 at pcppi1: PC speaker
sysbeep0 at pcppi1
COM1 (PNP0501) at acpi0 not configured
FDC (PNP0700) at acpi0 not configured
pckbc1 at acpi0 (PS2K, PNP0303) (kbd port): io 0x60,0x64 irq 1
pckbc2 at acpi0 (PS2M, PNP0F13) (aux port): irq 12
NVRB (_NVRAIDBUS) at acpi0 not configured
attimer1: attached to pcppi1
pckbd0 at pckbc1 (kbd slot)
pckbc1: using irq 1 for kbd slot
wskbd0 at pckbd0 mux 1
pci0 at mainbus0 bus 0: configuration mode 1
vendor 0x10de product 0x005e (miscellaneous memory, revision 0xa3) at pci0 dev 0 function 0 not configured
pcib0 at pci0 dev 1 function 0: vendor 0x10de product 0x0051 (rev. 0xa3)
nfsmbc0 at pci0 dev 1 function 1: vendor 0x10de product 0x0052 (rev. 0xa2)
nfsmb0 at nfsmbc0 SMBus 1
iic0 at nfsmb0: I2C bus
nfsmb1 at nfsmbc0 SMBus 2
iic1 at nfsmb1: I2C bus
ohci0 at pci0 dev 2 function 0: vendor 0x10de product 0x005a (rev. 0xa2)
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 132 width 8 value 8
ohci0: interrupting at ioapic0 pin 20
ohci0: OHCI version 1.0, legacy support
usb0 at ohci0: USB revision 1.0
ehci0 at pci0 dev 2 function 1: vendor 0x10de product 0x005b (rev. 0xa3)
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 129 width 8 value d0
ehci0: interrupting at ioapic0 pin 21
ehci0: BIOS refuses to give up ownership, using force
ehci0: companion controller, 4 ports each: ohci0
usb1 at ehci0: USB revision 2.0
auich0 at pci0 dev 4 function 0: nForce4 AC-97 Audio
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 134 width 8 value 2
auich0: interrupting at ioapic0 pin 22
auich0: ac97: Analog Devices AD1981B codec; headphone, 20 bit DAC, no 3D stereo
auich0: ac97: ext id 0x605<AC97_22,AMAP,SPDIF,VRA>
viaide0 at pci0 dev 6 function 0: NVIDIA nForce4 IDE Controller (rev. 0xf2)
viaide0: primary channel interrupting at ioapic0 pin 14
atabus0 at viaide0 channel 0
viaide0: secondary channel interrupting at ioapic0 pin 15
atabus1 at viaide0 channel 1
viaide1 at pci0 dev 7 function 0: NVIDIA nForce4 Serial ATA Controller (rev. 0xf3)
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 131 width 8 value 1a
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 127 width 8 value 1
viaide1: using ioapic0 pin 23 for native-PCI interrupt
atabus2 at viaide1 channel 0
atabus3 at viaide1 channel 1
viaide2 at pci0 dev 8 function 0: NVIDIA nForce4 Serial ATA Controller (rev. 0xf3)
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 131 width 8 value b8
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 126 width 8 value 80
viaide2: using ioapic0 pin 20 for native-PCI interrupt
atabus4 at viaide2 channel 0
atabus5 at viaide2 channel 1
ppb0 at pci0 dev 9 function 0: vendor 0x10de product 0x005c (rev. 0xa2)
pci1 at ppb0 bus 1
nfe0 at pci0 dev 10 function 0: vendor 0x10de product 0x0057 (rev. 0xa3)
AcpiOsWritePciConfiguration: bus 0 device 1 function 0: register 133 width 8 value d
nfe0: interrupting at ioapic0 pin 21
nfe0: Ethernet address 00:e0:81:54:9d:e8
makphy0 at nfe0 phy 1: Marvell 88E1111 Gigabit PHY, rev. 1
makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
ppb1 at pci0 dev 14 function 0: vendor 0x10de product 0x005d (rev. 0xa3)
ppb1: PCI Express 1.0 <Root Port of PCI-E Root Complex> x16 @ 2.5Gb/s
pci2 at ppb1 bus 2
vga0 at pci2 dev 0 function 0: vendor 0x1002 product 0x5e4b (rev. 0x00)
wsdisplay0 at vga0 kbdmux 1
radeondrm0 at vga0: ATI Radeon RV410 X700 Pro
radeondrm0: Initialized radeon 1.29.0 20080613
genfb0 at pci2 dev 0 function 1: vendor 0x1002 product 0x5e6b (rev. 0x00)
pchb0 at pci0 dev 24 function 0: vendor 0x1022 product 0x1100 (rev. 0x00)
pchb1 at pci0 dev 24 function 1: vendor 0x1022 product 0x1101 (rev. 0x00)
pchb2 at pci0 dev 24 function 2: vendor 0x1022 product 0x1102 (rev. 0x00)
amdnb_misc0 at pci0 dev 24 function 3: AMD NB Misc Configuration
pchb3 at pci0 dev 25 function 0: vendor 0x1022 product 0x1100 (rev. 0x00)
pchb4 at pci0 dev 25 function 1: vendor 0x1022 product 0x1101 (rev. 0x00)
pchb5 at pci0 dev 25 function 2: vendor 0x1022 product 0x1102 (rev. 0x00)
amdnb_misc1 at pci0 dev 25 function 3: AMD NB Misc Configuration
isa0 at pcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
pci3 at mainbus0 bus 16
ppb2 at pci3 dev 10 function 0: vendor 0x1022 product 0x7450 (rev. 0x12)
pci4 at ppb2 bus 17
aapic0 at pci3 dev 10 function 1: vendor 0x1022 product 0x7451 (rev. 0x01)
ppb3 at pci3 dev 11 function 0: vendor 0x1022 product 0x7450 (rev. 0x12)
pci5 at ppb3 bus 18
vendor 0x108e product 0x2bad (ethernet network, revision 0x01) at pci5 dev 4 function 0 not configured
aapic1 at pci3 dev 11 function 1: vendor 0x1022 product 0x7451 (rev. 0x01)
pci6 at mainbus0 bus 128
vendor 0x10de product 0x005e (miscellaneous memory, revision 0xa3) at pci6 dev 0 function 0 not configured
vendor 0x10de product 0x00d3 (miscellaneous memory, revision 0xa3) at pci6 dev 1 function 0 not configured
nfe1 at pci6 dev 10 function 0: vendor 0x10de product 0x0057 (rev. 0xa3)
AcpiOsWritePciConfiguration: bus 128 device 1 function 0: register 133 width 8 value 8
nfe1: interrupting at ioapic3 pin 20
nfe1: Ethernet address 00:e0:81:54:9d:e9
makphy1 at nfe1 phy 1: Marvell 88E1111 Gigabit PHY, rev. 1
makphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
ppb4 at pci6 dev 14 function 0: vendor 0x10de product 0x005d (rev. 0xa3)
ppb4: PCI Express 1.0 <Root Port of PCI-E Root Complex> x16 @ 2.5Gb/s
pci7 at ppb4 bus 129
acpicpu0 at cpu0: ACPI CPU
acpicpu1 at cpu1: ACPI CPU
audio0 at auich0: full duplex, playback, capture, mmap, independent
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
uhub0 at usb0: vendor 0x10de OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1 at usb1: vendor 0x10de EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
atapibus0 at atabus0: 2 targets
cd0 at atapibus0 drive 0: <HL-DT-STDVD-ROM GDR8164B, , 0L06> cdrom removable
viaide1 port 1: device present, speed: 1.5Gb/s
viaide2 port 0: device present, speed: 1.5Gb/s
viaide2 port 1: device present, speed: 1.5Gb/s
ehci0: handing over low speed device on port 1 to ohci0
wd0 at atabus3 drive 0
~fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff8025447d cs 8 rflags 202 cr2 0 ilevel 8 rsp fffffe80800138e8
curlwp 0xfffffe813fb36440 pid 0 lid 6 lowest kstack 0xfffffe8080010000
Stopped in pid 0.6 (system) at netbsd:breakpoint+0x5: leave
db{0}> bt
breakpoint() at netbsd:breakpoint+0x5
comintr() at netbsd:comintr+0x51d
Xintr_ioapic_edge8() at netbsd:Xintr_ioapic_edge8+0xea
--- interrupt ---
inw() at netbsd:inw+0x8
AcpiHwReadPort() at netbsd:AcpiHwReadPort+0xcc
AcpiHwRead() at netbsd:AcpiHwRead+0x5c
AcpiHwReadMultiple() at netbsd:AcpiHwReadMultiple+0x2a
AcpiHwRegisterRead() at netbsd:AcpiHwRegisterRead+0x84
AcpiEvFixedEventDetect() at netbsd:AcpiEvFixedEventDetect+0x1d
AcpiEvSciXruptHandler() at netbsd:AcpiEvSciXruptHandler+0xf
intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x1d
Xintr_ioapic_level0() at netbsd:Xintr_ioapic_level0+0xf2
--- interrupt ---
Xspllower() at netbsd:Xspllower+0xe
DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xfffffe8080013d70
Xsoftintr() at netbsd:Xsoftintr+0x4f
--- interrupt ---
0:
db{0}> ps
PID LID S CPU FLAGS STRUCT LWP * NAME WAIT
1 1 3 1 0 fffffe8107d8e5c0 init lbolt
0 47 3 1 200 fffffe810801a580 viaide1cnf wdccmd
0 46 3 0 200 fffffe810801a9a0 unpgc unpgc
0 45 3 1 200 fffffe81074e4560 usb1 usbevt
0 44 3 1 200 fffffe8107d8e9e0 usb0 usbevt
0 43 3 1 200 fffffe8107ffb5a0 vmem_rehash vmem_rehash
0 42 3 1 200 fffffe8107ffb180 atapibus0 sccomp
0 32 3 1 200 fffffe81074e4980 atabus5 atainitq
0 31 3 1 200 fffffe81074de120 atabus4 atainitq
0 30 3 1 200 fffffe81074de540 atabus3 atath
0 29 3 1 200 fffffe81074de960 atabus2 atath
0 28 3 1 200 fffffe81074e3100 atabus1 atath
0 27 3 0 200 fffffe81074e3520 atabus0 atath
0 26 3 0 200 fffffe81074e3940 usbtask-dr usbtsk
0 25 3 0 200 fffffe81074c20e0 usbtask-hc usbtsk
0 24 3 1 200 fffffe81074c2500 iic1 iicintr
0 23 3 1 200 fffffe81074c2920 iic0 iicintr
0 22 3 1 200 fffffe81074650c0 xcall/1 xcall
0 21 1 1 200 fffffe81074654e0 softser/1
0 20 1 1 200 fffffe8107465900 softclk/1
0 19 1 1 200 fffffe81074580a0 softbio/1
0 18 1 1 200 fffffe81074584c0 softnet/1
0 > 17 7 1 201 fffffe81074588e0 idle/1
0 16 3 0 200 fffffe813ea1b080 sysmon smtaskq
0 15 3 1 200 fffffe813ea1b4a0 pmfsuspend pmfsuspend
0 14 3 0 200 fffffe813ea1b8c0 pmfevent pmfevent
0 13 3 0 200 fffffe813f726060 sopendfree sopendfr
0 12 3 1 200 fffffe813f726480 nfssilly nfssilly
0 11 3 1 200 fffffe813f7268a0 cachegc cachegc
0 10 3 1 200 fffffe813fb2a040 vrele vrele
0 9 3 0 200 fffffe813fb2a460 vdrain vdrain
0 8 3 0 200 fffffe813fb2a880 modunload mod_unld
0 7 3 0 200 fffffe813fb36020 xcall/0 xcall
0 > 6 7 0 200 fffffe813fb36440 softser/0
0 5 1 0 200 fffffe813fb36860 softclk/0
0 4 1 0 200 fffffe813fb45000 softbio/0
0 > 3 7 0 200 fffffe813fb45420 softnet/0
0 > 2 7 0 201 fffffe813fb45840 idle/0
0 1 3 0 200 ffffffff80e65f40 swapper cfgmisc
The patch (including a bit of debug output) can be found below.
Martin
Index: sys/arch/x86/pci/pci_machdep.c
===================================================================
RCS file: /cvsroot/src/sys/arch/x86/pci/pci_machdep.c,v
retrieving revision 1.56
diff -u -p -r1.56 pci_machdep.c
--- sys/arch/x86/pci/pci_machdep.c 1 Mar 2012 20:16:27 -0000 1.56
+++ sys/arch/x86/pci/pci_machdep.c 11 Jan 2013 09:02:44 -0000
@@ -507,6 +507,20 @@ pci_conf_write(pci_chipset_tag_t pc, pci
pci_conf_unlock(&ocl);
}
+
+void
+pci_conf_write_8(pci_chipset_tag_t pc, pcitag_t tag, int reg, pcireg_t data);
+
+void
+pci_conf_write_8(pci_chipset_tag_t pc, pcitag_t tag, int reg, pcireg_t data)
+{
+ struct pci_conf_lock ocl;
+
+ pci_conf_lock(&ocl, pci_conf_selector(tag, reg));
+ outb(pci_conf_port(tag, reg), data);
+ pci_conf_unlock(&ocl);
+}
+
void
pci_mode_set(int mode)
{
Index: sys/dev/acpi/acpi_pci_link.c
===================================================================
RCS file: /cvsroot/src/sys/dev/acpi/acpi_pci_link.c,v
retrieving revision 1.19
diff -u -p -r1.19 acpi_pci_link.c
--- sys/dev/acpi/acpi_pci_link.c 23 Sep 2012 00:26:25 -0000 1.19
+++ sys/dev/acpi/acpi_pci_link.c 11 Jan 2013 09:02:44 -0000
@@ -533,6 +533,7 @@ acpi_pci_link_attach(struct acpi_pci_lin
* run _DIS (i.e., the method doesn't exist), assume the initial
* IRQ was routed by the BIOS.
*/
+ printf("calling _DIS for %s...\n", sc->pl_name);
if (ACPI_SUCCESS(AcpiEvaluateObject(sc->pl_handle, "_DIS", NULL,
NULL)))
for (i = 0; i < sc->pl_num_links; i++)
@@ -541,6 +542,7 @@ acpi_pci_link_attach(struct acpi_pci_lin
for (i = 0; i < sc->pl_num_links; i++)
if (PCI_INTERRUPT_VALID(sc->pl_links[i].l_irq))
sc->pl_links[i].l_routed = TRUE;
+ printf("... done with _DIS for %s...\n", sc->pl_name);
if (boothowto & AB_VERBOSE) {
printf("%s: Links after disable:\n", sc->pl_name);
acpi_pci_link_dump(sc);
Index: sys/dev/acpi/acpica/OsdHardware.c
===================================================================
RCS file: /cvsroot/src/sys/dev/acpi/acpica/OsdHardware.c,v
retrieving revision 1.8
diff -u -p -r1.8 OsdHardware.c
--- sys/dev/acpi/acpica/OsdHardware.c 17 Feb 2011 10:23:43 -0000 1.8
+++ sys/dev/acpi/acpica/OsdHardware.c 11 Jan 2013 09:02:44 -0000
@@ -234,6 +234,9 @@ AcpiOsReadPciConfiguration(ACPI_PCI_ID *
return AE_OK;
}
+
+extern void pci_conf_write_8(pci_chipset_tag_t pc, pcitag_t tag, int reg, pcireg_t data);
+
/*
* AcpiOsWritePciConfiguration:
*
@@ -253,10 +256,16 @@ AcpiOsWritePciConfiguration(ACPI_PCI_ID
switch (Width) {
case 8:
+ printf("AcpiOsWritePciConfiguration: bus %d device %d function %d: register %d width %d value %lx\n",
+ PciId->Bus, PciId->Device, PciId->Function, Register, Width, (unsigned long)Value);
+ pci_conf_write_8(acpi_softc->sc_pc, tag, Register, Value);
+ return AE_OK;
+#if 0
tmp = pci_conf_read(acpi_softc->sc_pc, tag, Register & ~3);
tmp &= ~(0xff << ((Register & 3) * 8));
tmp |= (Value << ((Register & 3) * 8));
break;
+#endif
case 16:
tmp = pci_conf_read(acpi_softc->sc_pc, tag, Register & ~3);
From: David Young <dyoung@pobox.com>
To: gnats-bugs@NetBSD.org
Cc: chs@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
cheusov@tut.by
Subject: Re: kern/47290: Boot hangs up (regression in 6.0.0_PATCH since
6.0_RELEASE)
Date: Fri, 11 Jan 2013 11:51:37 -0600
On Fri, Jan 11, 2013 at 10:05:03AM +0000, Martin Husemann wrote:
> The following reply was made to PR kern/47290; it has been noted by GNATS.
>
> From: Martin Husemann <martin@duskware.de>
> To: gnats-bugs@NetBSD.org
> Cc:
> Subject: Re: kern/47290: Boot hangs up (regression in 6.0.0_PATCH since 6.0_RELEASE)
> Date: Fri, 11 Jan 2013 11:03:55 +0100
>
> After a bit of painful debugging (including booting patched FreeBSD kernels
> to verify their acpi subsystem does the same _DIS calls) I ended up with a
> patch that seems to work around the problem - but it is a bit mind puzzling.
>
> Unfortunately today's -current doesn't seem to boot on this machine, probably
> due to something completely unrelated - so my boot still does not complete.
>
> Anyway, the interesting part of the long story: the execution of some _DIS
> method on my machine ends with an 8-bit wide write to PCI config space.
> Since those writes are only well defined for 32-bit access, we do a
> read-modify-write cycle to set the requested byte. It is the write part of
> that cycle that kills my machine.
>
> So, despite knowing this is a bad idea, I hacked a patch that uses byte
> access to directly store the requested byte in pci config space - and that
> makes my machine survive. Now I am not sure if the out-of-spec byte access
> is ignored by the hardware (so I basically disabled the deadly store), or if
> this now all works as intended.
>
> Now, I agree that in general we should forbid such writes, as some
> architectures cannot even implement them for arbitrary PCI devices.
> However, given the tight binding of ACPI and hardware, we should allow
> them for this case.
It's my understanding that bits 0:1 of port 0xCF8 are read-only and
always zero, so your implementation of pci_conf_write_8(..., reg = 125,
val) may write one byte to reg 124. I think that in order to write to
the correct address, it may be necessary to write (reg & ~3) to 0xCFA
and to outb(0xCFC + (reg & 3), val).
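The address arithmetic behind this point can be sketched as follows. It is a hedged illustration only: it assumes the usual configuration mechanism #1 layout, with the address port at 0xCF8 and the 4-byte data window at 0xCFC-0xCFF, and the names mode1_access, mode1_byte_access and the SKETCH_ constants are hypothetical, not taken from the NetBSD tree. No port I/O is performed; the function only computes what would be written where.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mechanism #1 constants (hypothetical names). */
#define SKETCH_MODE1_ADDRESS_PORT 0x0cf8u
#define SKETCH_MODE1_DATA_PORT    0x0cfcu
#define SKETCH_MODE1_ENABLE       0x80000000u

struct mode1_access {
	uint32_t address;   /* value to write to the address port */
	uint16_t data_port; /* I/O port to use for the byte access */
};

/* Compute the address-port value and data port for a byte-wide access.
 * The low two bits of the register number cannot be carried in the
 * address value, so they select a byte within the data window instead. */
struct mode1_access
mode1_byte_access(unsigned bus, unsigned dev, unsigned func, unsigned reg)
{
	struct mode1_access a;

	assert(bus < 256 && dev < 32 && func < 8 && reg < 256);
	a.address = SKETCH_MODE1_ENABLE | (bus << 16) | (dev << 11) |
	    (func << 8) | (reg & ~3u);
	a.data_port = (uint16_t)(SKETCH_MODE1_DATA_PORT + (reg & 3));
	return a;
}
```

For bus 0, device 1, function 0, register 125 this yields address 0x8000087c and data port 0xcfd. A byte written to the base data port 0xcfc with the same address value would land in register 124 instead.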
Dave
--
David Young
dyoung@pobox.com Urbana, IL (217) 721-9981
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/47290: Boot hangs up (regression in 6.0.0_PATCH since
6.0_RELEASE)
Date: Sat, 12 Jan 2013 06:49:12 +0000
again, not sent to gnats (please fix your mailreader)
------
From: David Young <dyoung@pobox.com>
To: Chuck Silvers <chuq@chuq.com>, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Cc: Martin Husemann <martin@duskware.de>, cheusov@tut.by, dsl@netbsd.org
Subject: Re: kern/47290: Boot hangs up (regression in 6.0.0_PATCH since
6.0_RELEASE)
Date: Fri, 11 Jan 2013 12:02:31 -0600
Mail-Followup-To: David Young <dyoung@pobox.com>, Chuck Silvers
<chuq@chuq.com>, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, Martin
Husemann <martin@duskware.de>, cheusov@tut.by, dsl@netbsd.org
On Fri, Jan 11, 2013 at 09:20:28AM -0800, Chuck Silvers wrote:
> On Thu, Jan 10, 2013 at 05:41:54PM +0100, Martin Husemann wrote:
> > Ok - I have something, but you won't like it.
> >
> > I traced down further where it hangs up:
> >
> > As last step of executing the third (and last) AML instruction in the _DIS
> > method of LNK3 it ends up doing a pci config space write via:
> >
> >
> > /*
> > * AcpiOsWritePciConfiguration:
> > *
> > * Write a value to a PCI configuration register.
> > */
> > ACPI_STATUS
> > AcpiOsWritePciConfiguration(ACPI_PCI_ID *PciId, UINT32 Register,
> > ACPI_INTEGER Value, UINT32 Width)
> >
> > in sys/dev/acpi/acpica/OsdHardware.c.
> >
> > The call in question asks to write a 0 byte to register 125 of bus 0
> > device 1 function 0.
> >
> > Since a byte access to config space is not defined, we do a 32bit read
> > (resulting in a value of 0 as well) and mask and set the byte, then
> > write back the full 32bit word - and then the machine hangs.
> >
> > I wondered if FreeBSD does just a byte write instead, but I am lost in their
> > code.
> >
> > Easy to test: I hacked a pci_conf_write_8() call (doing outb instead
> > of outl) and modified AcpiOsWritePciConfiguration() to use that if Width is 8.
> >
> > This made my machine boot. Yay!
> >
> > However, this hack makes me really nervous. We could consider allowing
> > sub-word config writes limited to acpi scope, given how tight the bios
> > and hardware are tied, but this still sounds hackish.
>
>
> wow, excellent detective work!
> I would never have guessed that this was the problem.
>
> I'd say you're right, we need to access PCI config space exactly like the BIOS
> tells us to, even though this might cause us to do so in ways which violate
> the normal rules. the BIOS knows better than we do about such things.
>
> I see that you verified that freebsd does this, and I confirmed just now
> that linux does it too. it looks like openbsd does NOT do this,
> so I would guess that openbsd wouldn't work on this box either.
> I don't remember if you've tried that. if openbsd actually does work,
> then that might be worth some more investigation.
>
>
> for a real fix, I guess we need to add pci_conf_{read,write}_width()
> (or somesuch) that add a "width" argument to what the normal calls have,
> and have ACPI use those instead of the normal versions.
>
> I'm not sure if it would be necessary to add these to the "struct pci_overrides" stuff...
> in the abstract I suppose it should be added but it's hard to say because nothing
> in the tree actually uses that override mechanism.
>
> dyoung, do you have any opinion on this, especially the overrides question?
>
> is there anyone else we should specifically ask about this?
Let me suggest the names pci_conf_{read,write}_{1,2,4} since
that's parallel with bus_space_{read,write}_{1,2,4}. I look at
pci_conf_write_8() and make a double-take: a 64-bit wide configuration
write, preposterous! :-)
I think that pci_conf_{read,write}_4 can simply be aliases for
pci_conf_{read,write}?
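The proposed naming can be sketched like this. It is only an illustration of the shape of such an interface: every sketch_ name is hypothetical, the chipset tag and device tag arguments of the real pci_conf_read()/pci_conf_write() are left out, and a byte array stands in for one function's config space (assumed little-endian, as on x86).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one PCI function's 256-byte config space. */
uint8_t sketch_cfg[256];

/* Width-generic helpers: nbytes is 1, 2 or 4, reg must be aligned. */
uint32_t sketch_conf_read_n(unsigned reg, unsigned nbytes)
{
	uint32_t v = 0;

	assert(reg % nbytes == 0 && reg + nbytes <= sizeof(sketch_cfg));
	for (unsigned i = 0; i < nbytes; i++)
		v |= (uint32_t)sketch_cfg[reg + i] << (8 * i);
	return v;
}

void sketch_conf_write_n(unsigned reg, uint32_t data, unsigned nbytes)
{
	assert(reg % nbytes == 0 && reg + nbytes <= sizeof(sketch_cfg));
	for (unsigned i = 0; i < nbytes; i++)
		sketch_cfg[reg + i] = (uint8_t)(data >> (8 * i));
}

/* Suffix is the access width in bytes, parallel to bus_space_*_{1,2,4}. */
uint8_t  sketch_conf_read_1(unsigned reg) { return (uint8_t)sketch_conf_read_n(reg, 1); }
uint16_t sketch_conf_read_2(unsigned reg) { return (uint16_t)sketch_conf_read_n(reg, 2); }
uint32_t sketch_conf_read_4(unsigned reg) { return sketch_conf_read_n(reg, 4); }

void sketch_conf_write_1(unsigned reg, uint8_t v)  { sketch_conf_write_n(reg, v, 1); }
void sketch_conf_write_2(unsigned reg, uint16_t v) { sketch_conf_write_n(reg, v, 2); }
void sketch_conf_write_4(unsigned reg, uint32_t v) { sketch_conf_write_n(reg, v, 4); }

/* The existing full-width calls become aliases for the _4 variants. */
uint32_t sketch_conf_read(unsigned reg) { return sketch_conf_read_4(reg); }
void sketch_conf_write(unsigned reg, uint32_t v) { sketch_conf_write_4(reg, v); }
```

A caller such as the ACPI glue would then pick the variant from the requested width (8, 16 or 32 bits) rather than always going through the 32-bit path.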
I'm about 75% finished getting rid of cardbus attachments using
pci_overrides, so their upkeep is important to me. :-)
Regarding the overrides, it's pretty easy to add a new one.
Basically you just need to reserve the next four bits, 9 - 13, from
pci_override_idx for PCI_OVERRIDE_CONF_{READ,WRITE}_{1,2}. Then add
the overrides to struct pci_overrides in the same order. The code can
be copied & pasted from pci_conf_{read,write}() and adapted to use the
right bits & struct members. I think that's all you will need to do.
Anyway, I'll be happy to help out.
Dave
--
David Young
dyoung@pobox.com Urbana, IL (217) 721-9981
From: Reinoud Zandijk <reinoud@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: chs@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
cheusov@tut.by
Subject: Re: kern/47290: Boot hangs up (regression in 6.0.0_PATCH since
6.0_RELEASE)
Date: Wed, 13 Mar 2013 11:56:29 +0100
On Fri, Jan 11, 2013 at 10:05:03AM +0000, Martin Husemann wrote:
> The patch (including a bit of debug output) can be found below.
I tried the patch on my machine since it gives me an ioapic0 pin 9 storm
(285247/sec!) (acpi0: SCI interrupting at int 9). My auixp(9) also ceased to
work giving a codec read timeout, most likely related since it worked fine on
the 6.0_BETA2 that i ran before.
recap: i added the pci_machdep.c patch, (re) enabled the acpi_pci_link.c code
(_DIS) and added the OsdHardware.c patch. It made no difference.
Any ideas? Or is this worthy of a PR of its own?
With regards,
Reinoud
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/47290: Boot hangs up (regression in 6.0.0_PATCH since 6.0_RELEASE)
Date: Wed, 13 Mar 2013 12:02:12 +0100
On Wed, Mar 13, 2013 at 11:00:13AM +0000, Reinoud Zandijk wrote:
> recap: i added the pci_machdep.c patch, (re) enabled the acpi_pci_link.c code
> (_DIS) and added the OsdHardware.c patch. It made no difference.
Sounds like a different problem, please file another PR.
Martin
From: Ignatios Souvatzis <ignatios@cs.uni-bonn.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/47290
Date: Fri, 31 Jan 2014 12:04:12 +0100
I see the problem with 6.99.30.
Hangs right after
acpi0 at mainbus0: Intel ACPICA 20131218
I made it work with
cvs update -kk -j1.19 -j1.18 acpi_pci_link.c
Below's a bit more of dmesg of the booting kernel:
Regards,
-is
NetBSD 6.99.30 (GENERIC) #2: Fri Jan 31 10:43:35 CET 2014
ignatios@random84.cs.uni-bonn.de:/var/itch/obj/cur/Oamd64/sys/arch/amd64/compile/GENERIC
total memory = 2046 MB
avail memory = 1970 MB
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
Sun Microsystems Sun Ultra 20 Workstation (Rev 50)
mainbus0 (root)
ACPI: RSDP 0xf7bc0 000014 (v00 SUNW )
ACPI: RSDT 0x7fee3040 000038 (v01 SUNW AWRDACPI 42302E31 AWRD 00000000)
ACPI: FACP 0x7fee30c0 000074 (v01 SUNW AWRDACPI 42302E31 AWRD 00000000)
ACPI: DSDT 0x7fee3180 00611A (v01 SUNW AWRDACPI 00001000 MSFT 0100000E)
ACPI: FACS 0x7fee0000 000040
ACPI: SSDT 0x7fee93c0 000139 (v01 PTLTD POWERNOW 00000001 LTP 00000001)
ACPI: SRAT 0x7fee9540 000090 (v01 AMD HAMMER 00000001 AMD 00000001)
ACPI: MCFG 0x7fee9640 00003C (v01 SUNW AWRDACPI 42302E31 AWRD 00000000)
ACPI: APIC 0x7fee9300 00006C (v01 SUNW AWRDACPI 42302E31 AWRD 00000000)
ACPI: All ACPI Tables successfully acquired
cpu0 at mainbus0 apid 0: AMD Opteron(tm) Processor 152, id 0x20f71
ioapic0 at mainbus0 apid 2: pa 0xfec00000, version 0x11, 24 pins
acpi0 at mainbus0: Intel ACPICA 20131218
acpi0: X/RSDT: OemId <SUNW ,AWRDACPI,42302e31>, AslId <AWRD,00000000>
acpi0: SCI interrupting at int 9
timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000
acpibut0 at acpi0 (PWRB, PNP0C0C): ACPI Power Button
MBIO (PNP0C02) at acpi0 not configured
SYSR (PNP0C02) at acpi0 not configured
attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x43 irq 0
pcppi1 at acpi0 (SPKR, PNP0800): io 0x61
midi0 at pcppi1: PC speaker
sysbeep0 at pcppi1
EXPL (PNP0C02) at acpi0 not configured
MEM (PNP0C01) at acpi0 not configured
ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] (20131218/hwxface-646)
attimer1: attached to pcppi1
pci0 at mainbus0 bus 0: configuration mode 1
>Unformatted: