NetBSD Problem Report #53811
From www@NetBSD.org Tue Dec 25 16:30:03 2018
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id A6C627A1B1
for <gnats-bugs@gnats.NetBSD.org>; Tue, 25 Dec 2018 16:30:03 +0000 (UTC)
Message-Id: <20181225163002.2FF257A1E9@mollari.NetBSD.org>
Date: Tue, 25 Dec 2018 16:30:02 +0000 (UTC)
From: prlw1@cam.ac.uk
Reply-To: prlw1@cam.ac.uk
To: gnats-bugs@NetBSD.org
Subject: wm0 device timeout when uefi booting
X-Send-Pr-Version: www-1.0
>Number: 53811
>Category: port-amd64
>Synopsis: wm0 device timeout when uefi booting
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: port-amd64-maintainer
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Dec 25 16:35:00 +0000 2018
>Closed-Date: Tue Feb 05 06:48:14 +0000 2019
>Last-Modified: Tue Feb 05 06:48:14 +0000 2019
>Originator: Patrick Welche
>Release: NetBSD-8.99.27/amd64
>Organization:
>Environment:
boot64.efi from Dec 20 code
>Description:
When booting using the BIOS / MBR, I have a working box (apart from nouveau). When booting using uefi, wm0 has timeouts:
[ 700.413533] wm0: device timeout (txfree 4068 txsfree 42 txnext 28)
[ 719.225142] wm0: device timeout (txfree 4095 txsfree 63 txnext 1)
[ 734.034281] wm0: device timeout (txfree 4095 txsfree 63 txnext 1)
etc.
tcpdump -nvi wm0 only shows packets sent, no incoming traffic of any kind.
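To compare boots, the watchdog lines quoted above can be pulled out of a saved dmesg and their spacing measured. The following is a minimal sketch, not part of the original report: the regular expression is written only against the three lines shown here, and the names `TIMEOUT_RE` and `parse_timeouts` are hypothetical.

```python
import re

# Matches the "device timeout" lines exactly as quoted in this report.
# The pattern is an assumption based on those three samples only.
TIMEOUT_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+(?P<dev>\w+): device timeout "
    r"\(txfree (?P<txfree>\d+) txsfree (?P<txsfree>\d+) txnext (?P<txnext>\d+)\)"
)


def parse_timeouts(dmesg_text):
    """Return one dict per 'device timeout' line found in dmesg_text."""
    events = []
    for m in TIMEOUT_RE.finditer(dmesg_text):
        events.append({
            "time": float(m.group("time")),   # seconds since boot
            "dev": m.group("dev"),
            "txfree": int(m.group("txfree")),
            "txsfree": int(m.group("txsfree")),
            "txnext": int(m.group("txnext")),
        })
    return events


SAMPLE = """\
[ 700.413533] wm0: device timeout (txfree 4068 txsfree 42 txnext 28)
[ 719.225142] wm0: device timeout (txfree 4095 txsfree 63 txnext 1)
[ 734.034281] wm0: device timeout (txfree 4095 txsfree 63 txnext 1)
"""

events = parse_timeouts(SAMPLE)
# Seconds between consecutive timeouts.
gaps = [round(b["time"] - a["time"], 3) for a, b in zip(events, events[1:])]
print(len(events), gaps)
```

Running the same function over the BIOS-boot dmesg should return an empty list if that boot is healthy.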
A diff of working dmesg vs uefi dmesg only shows graphics card differences, so I don't really have a clue of what changed:
--- bios.boot 2018-12-22 19:47:10.128354034 +0000
+++ uefi.boot 2018-12-22 21:54:30.611972805 +0000
@@ -7,35 +7,37 @@
NetBSD 8.99.27 (QUANTZ) #323: Sat Dec 15 20:03:50 GMT 2018
prlw1@quantz:/usr/src/sys/arch/amd64/compile/QUANTZ
total memory = 32699 MB
-avail memory = 31733 MB
+avail memory = 31731 MB
cpu_rng: RDSEED
timecounter: Timecounters tick every 10.000 msec
Kernelized RAIDframe activated
running cgd selftest aes-xts-256 aes-xts-512 done
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
+efi: systbl at pa db7e8018
System manufacturer System Product Name (System Version)
mainbus0 (root)
-ACPI: RSDP 0x00000000000F05B0 000024 (v02 ALASKA)
-ACPI: XSDT 0x00000000CC9CC098 0000A4 (v01 ALASKA A M I 01072009 AMI 00010013)
-ACPI: FACP 0x00000000CC9D7CB0 000114 (v06 ALASKA A M I 01072009 AMI 00010013)
+ACPI: RSDP 0x00000000D3AE9000 000024 (v02 ALASKA)
+ACPI: XSDT 0x00000000D3AE9098 0000AC (v01 ALASKA A M I 01072009 AMI 00010013)
+ACPI: FACP 0x00000000D3AF4CC0 000114 (v06 ALASKA A M I 01072009 AMI 00010013)
...
@@ -221,10 +223,11 @@
ppb8: link is x16 @ 2.5GT/s
pci9 at ppb8 bus 9
pci9: i/o space, memory space enabled, rd/line, wr/inv ok
-vga0 at pci9 dev 0 function 0: NVIDIA product 1180 (rev. 0xa1)
-wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation)
+genfb0 at pci9 dev 0 function 0: NVIDIA product 1180 (rev. 0xa1)
+genfb0: framebuffer at 0xf1000000, size 800x600, depth 32, stride 3200
+wsdisplay0 at genfb0 kbdmux 1: console (default, vt100 emulation)
wsmux1: connecting to wsdisplay0
-drm at vga0 not configured
+drm at genfb0 not configured
hdaudio0 at pci9 dev 0 function 1: HD Audio Controller
hdaudio0: interrupting at msi4 vec 0
hdafg0 at hdaudio0: vendor 10de product 0040
The particular network card:
ppb6 at pci2 dev 6 function 0: AMD product 43b4 (rev. 0x02)
ppb6: PCI Express capability version 2 <Downstream Port of PCI-E Switch> x1 @ 5.0GT/s
ppb6: link is x1 @ 2.5GT/s
pci7 at ppb6 bus 7
pci7: i/o space, memory space enabled, rd/line, wr/inv ok
wm0 at pci7 dev 0 function 0: I211 Ethernet (COPPER) (rev. 0x03)
wm0: for TX and RX interrupting at msix3 vec 0 affinity to 1
wm0: for TX and RX interrupting at msix3 vec 1 affinity to 2
wm0: for LINK interrupting at msix3 vec 2
wm0: PCI-Express bus
wm0: 64 words iNVM, version 0.6
wm0: Ethernet address 60:45:cb:9e:13:dd
wm0: Copper
wm0: 0xc214420<INVM,IOH_VALID,PCIE,NEWQUEUE,WOL,PLLWA,CLSEMWA>
ukphy0 at wm0 phy 1: OUI 0x000ac2, model 0x0000, rev. 0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
From: Masanobu SAITOH <msaitoh@execsw.org>
To: gnats-bugs@NetBSD.org, port-amd64-maintainer@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: msaitoh@execsw.org
Subject: Re: port-amd64/53811: wm0 device timeout when uefi booting
Date: Wed, 26 Dec 2018 12:03:02 +0900
On 2018/12/26 1:35, prlw1@cam.ac.uk wrote:
>> Number: 53811
>> Category: port-amd64
>> Synopsis: wm0 device timeout when uefi booting
>> Confidential: no
>> Severity: serious
>> Priority: medium
>> Responsible: port-amd64-maintainer
>> State: open
>> Class: sw-bug
>> Submitter-Id: net
>> Arrival-Date: Tue Dec 25 16:35:00 +0000 2018
>> Originator: Patrick Welche
>> Release: NetBSD-8.99.27/amd64
>> Organization:
>> Environment:
> boot64.efi from Dec 20 code
>> Description:
> When booting using the BIOS / MBR, I have a working box (apart from nouveau). When booting using uefi, wm0 has timeouts:
>
> [ 700.413533] wm0: device timeout (txfree 4068 txsfree 42 txnext 28)
> [ 719.225142] wm0: device timeout (txfree 4095 txsfree 63 txnext 1)
> [ 734.034281] wm0: device timeout (txfree 4095 txsfree 63 txnext 1)
>
> etc.
>
> tcpdump -nvi wm0 only shows packets sent, no incoming traffic of any kind.
>
> A diff of working dmesg vs uefi dmesg only shows graphics card differences, so I don't really have a clue of what changed:
>
> --- bios.boot 2018-12-22 19:47:10.128354034 +0000
> +++ uefi.boot 2018-12-22 21:54:30.611972805 +0000
> @@ -7,35 +7,37 @@
> NetBSD 8.99.27 (QUANTZ) #323: Sat Dec 15 20:03:50 GMT 2018
> prlw1@quantz:/usr/src/sys/arch/amd64/compile/QUANTZ
> total memory = 32699 MB
> -avail memory = 31733 MB
> +avail memory = 31731 MB
> cpu_rng: RDSEED
> timecounter: Timecounters tick every 10.000 msec
> Kernelized RAIDframe activated
> running cgd selftest aes-xts-256 aes-xts-512 done
> timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
> +efi: systbl at pa db7e8018
> System manufacturer System Product Name (System Version)
> mainbus0 (root)
> -ACPI: RSDP 0x00000000000F05B0 000024 (v02 ALASKA)
> -ACPI: XSDT 0x00000000CC9CC098 0000A4 (v01 ALASKA A M I 01072009 AMI 00010013)
> -ACPI: FACP 0x00000000CC9D7CB0 000114 (v06 ALASKA A M I 01072009 AMI 00010013)
> +ACPI: RSDP 0x00000000D3AE9000 000024 (v02 ALASKA)
> +ACPI: XSDT 0x00000000D3AE9098 0000AC (v01 ALASKA A M I 01072009 AMI 00010013)
> +ACPI: FACP 0x00000000D3AF4CC0 000114 (v06 ALASKA A M I 01072009 AMI 00010013)
> ...
> @@ -221,10 +223,11 @@
> ppb8: link is x16 @ 2.5GT/s
> pci9 at ppb8 bus 9
> pci9: i/o space, memory space enabled, rd/line, wr/inv ok
> -vga0 at pci9 dev 0 function 0: NVIDIA product 1180 (rev. 0xa1)
> -wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation)
> +genfb0 at pci9 dev 0 function 0: NVIDIA product 1180 (rev. 0xa1)
> +genfb0: framebuffer at 0xf1000000, size 800x600, depth 32, stride 3200
> +wsdisplay0 at genfb0 kbdmux 1: console (default, vt100 emulation)
> wsmux1: connecting to wsdisplay0
> -drm at vga0 not configured
> +drm at genfb0 not configured
> hdaudio0 at pci9 dev 0 function 1: HD Audio Controller
> hdaudio0: interrupting at msi4 vec 0
> hdafg0 at hdaudio0: vendor 10de product 0040
>
> The particular network card:
>
> ppb6 at pci2 dev 6 function 0: AMD product 43b4 (rev. 0x02)
> ppb6: PCI Express capability version 2 <Downstream Port of PCI-E Switch> x1 @ 5.0GT/s
> ppb6: link is x1 @ 2.5GT/s
> pci7 at ppb6 bus 7
> pci7: i/o space, memory space enabled, rd/line, wr/inv ok
> wm0 at pci7 dev 0 function 0: I211 Ethernet (COPPER) (rev. 0x03)
> wm0: for TX and RX interrupting at msix3 vec 0 affinity to 1
> wm0: for TX and RX interrupting at msix3 vec 1 affinity to 2
> wm0: for LINK interrupting at msix3 vec 2
> wm0: PCI-Express bus
> wm0: 64 words iNVM, version 0.6
> wm0: Ethernet address 60:45:cb:9e:13:dd
> wm0: Copper
> wm0: 0xc214420<INVM,IOH_VALID,PCIE,NEWQUEUE,WOL,PLLWA,CLSEMWA>
> ukphy0 at wm0 phy 1: OUI 0x000ac2, model 0x0000, rev. 0
> ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
>
>> How-To-Repeat:
>
>> Fix:
>
Please show the following output:
0) full dmesg of "boot -xv" on UEFI
1) intrctl list
2) cpuctl list
3) cpuctl -v identify 0
4) /proc/cpuinfo
5) acpidump -dt
--
-----------------------------------------------
SAITOH Masanobu (msaitoh@execsw.org
msaitoh@netbsd.org)
From: Patrick Welche <prlw1@cam.ac.uk>
To: Masanobu SAITOH <msaitoh@execsw.org>, gnats-bugs@NetBSD.org
Cc:
Subject: Re: port-amd64/53811: wm0 device timeout when uefi booting
Date: Wed, 26 Dec 2018 14:24:03 +0000
On Wed, Dec 26, 2018 at 12:03:02PM +0900, Masanobu SAITOH wrote:
> Please show the following output:
>
> 0) full dmesg of "boot -xv" on UEFI
> 1) intrctl list
> 2) cpuctl list
> 3) cpuctl -v identify 0
> 4) /proc/cpuinfo
> 5) acpidump -dt
Your mentioning ACPI made me check the BIOS: I updated this ASUS PRIME X370-PRO
from 4024 (7 Sep 2018) to 4207 (8 Dec 2018). Release notes say:
1.Update AGESA 1006
2.Improve compatibility and performance for Athlon™ with Radeon™ Vega Graphics Processors
so it is not entirely obvious what changed to help. wm0 now works on the UEFI side.
[Should the ACPI tables not be the same whichever way one boots?]
Thanks!
From: Patrick Welche <prlw1@cam.ac.uk>
To: Masanobu SAITOH <msaitoh@execsw.org>, gnats-bugs@NetBSD.org
Cc:
Subject: Re: port-amd64/53811: wm0 device timeout when uefi booting
Date: Wed, 26 Dec 2018 14:47:29 +0000
I was just lucky - timeouts are back on the next boot. Compiling the
information you asked for...
From: Patrick Welche <prlw1@cam.ac.uk>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-amd64/53811: wm0 device timeout when uefi booting
Date: Sat, 29 Dec 2018 15:35:43 +0000
The information is now at: http://www.netbsd.org/~prlw1/pr53811.tar.xz
From: Masanobu SAITOH <msaitoh@execsw.org>
To: gnats-bugs@NetBSD.org, port-amd64-maintainer@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, prlw1@cam.ac.uk
Cc: msaitoh@execsw.org
Subject: Re: port-amd64/53811: wm0 device timeout when uefi booting
Date: Sun, 30 Dec 2018 14:36:04 +0900
On 2018/12/30 0:40, Patrick Welche wrote:
> The following reply was made to PR port-amd64/53811; it has been noted by GNATS.
>
> From: Patrick Welche <prlw1@cam.ac.uk>
> To: gnats-bugs@netbsd.org
> Cc:
> Subject: Re: port-amd64/53811: wm0 device timeout when uefi booting
> Date: Sat, 29 Dec 2018 15:35:43 +0000
>
> The information is now at: http://www.netbsd.org/~prlw1/pr53811.tar.xz
>
>
In the dmesg:
> ioapic0 at mainbus0 apid 17: pa 0xfec00000, version 0x21, 24 pins
> ioapic0: misconfigured as apic 1
> ioapic0: autoconfiguration error: can't remap to apid 17
> ioapic1 at mainbus0 apid 18: pa 0xfec01000, version 0x21, 32 pins
> ioapic1: misconfigured as apic 2
> ioapic1: autoconfiguration error: can't remap to apid 18
I suspect this is the key to this problem.
--
-----------------------------------------------
SAITOH Masanobu (msaitoh@execsw.org
msaitoh@netbsd.org)
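The I/O APIC messages quoted in the mail above can be picked out of a saved dmesg mechanically. This is a minimal sketch under stated assumptions: reading "apid N" as the ID the kernel expected and "misconfigured as apic M" as the ID the hardware reported is an interpretation of the message text, the patterns are written only against the six lines quoted, and `ioapic_id_mismatches` is a hypothetical name.

```python
import re

# Patterns written against the six dmesg lines quoted in this PR only.
APID_RE = re.compile(r"^(ioapic\d+) at mainbus\d+ apid (\d+):", re.M)
MISCONF_RE = re.compile(r"^(ioapic\d+): misconfigured as apic (\d+)", re.M)
REMAP_FAIL_RE = re.compile(
    r"^(ioapic\d+): autoconfiguration error: can't remap to apid (\d+)", re.M
)


def ioapic_id_mismatches(dmesg_text):
    """Return (device, expected_id, reported_id, remap_failed) tuples."""
    expected = {dev: int(n) for dev, n in APID_RE.findall(dmesg_text)}
    reported = {dev: int(n) for dev, n in MISCONF_RE.findall(dmesg_text)}
    failed = {dev for dev, _ in REMAP_FAIL_RE.findall(dmesg_text)}
    return [
        (dev, expected.get(dev), reported[dev], dev in failed)
        for dev in sorted(reported)
    ]


SAMPLE = """\
ioapic0 at mainbus0 apid 17: pa 0xfec00000, version 0x21, 24 pins
ioapic0: misconfigured as apic 1
ioapic0: autoconfiguration error: can't remap to apid 17
ioapic1 at mainbus0 apid 18: pa 0xfec01000, version 0x21, 32 pins
ioapic1: misconfigured as apic 2
ioapic1: autoconfiguration error: can't remap to apid 18
"""

mismatches = ioapic_id_mismatches(SAMPLE)
for dev, want, got, remap_failed in mismatches:
    print(dev, "expected", want, "reported", got, "remap failed:", remap_failed)
```

Running it on both the BIOS and the UEFI dmesg would show whether the mismatch is specific to one boot path.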
From: "SAITOH Masanobu" <msaitoh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/53811 CVS commit: src/sys/dev/pci
Date: Mon, 28 Jan 2019 04:09:51 +0000
Module Name: src
Committed By: msaitoh
Date: Mon Jan 28 04:09:51 UTC 2019
Modified Files:
src/sys/dev/pci: ppb.c
Log Message:
Explicitly enable bus mastering in case BIOS, UEFI or firmware don't enable
it. Might fix PR kern/53811.
To generate a diff of this commit:
cvs rdiff -u -r1.65 -r1.66 src/sys/dev/pci/ppb.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/53811 CVS commit: [netbsd-8] src/sys/dev/pci
Date: Fri, 1 Feb 2019 11:25:13 +0000
Module Name: src
Committed By: martin
Date: Fri Feb 1 11:25:13 UTC 2019
Modified Files:
src/sys/dev/pci [netbsd-8]: ppb.c
Log Message:
Pull up following revision(s) (requested by msaitoh in ticket #1181):
sys/dev/pci/ppb.c: revision 1.66
sys/dev/pci/ppb.c: revision 1.67
Explicitly enable bus mastering in case BIOS, UEFI or firmware don't enable
it. Might fix PR kern/53811.
-
If the secondary bus is configured and the bus mastering is not enabled,
enable it. Suggested by thorpej@.
To generate a diff of this commit:
cvs rdiff -u -r1.63 -r1.63.2.1 src/sys/dev/pci/ppb.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
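The two log messages above describe a check on the PCI-PCI bridge: if its secondary bus is configured and Bus Master Enable is clear, set it. On a bridge this bit gates forwarding of upstream requests, which includes DMA and MSI/MSI-X writes from devices behind it, so a cleared bit would be consistent with the wm0 symptom. The following is a minimal simulation of that logic, not the code from ppb.c. The register offsets and the bit position follow the PCI specification. `FakeConfigSpace`, `ensure_bus_master`, the register values, and treating "configured" as "non-zero secondary bus number" are assumptions made for illustration.

```python
PCI_COMMAND_STATUS_REG = 0x04       # command (low 16 bits) + status (high 16 bits)
PCI_COMMAND_MASTER_ENABLE = 0x0004  # Bus Master Enable, bit 2 of the command register
PCI_BRIDGE_BUS_REG = 0x18           # primary / secondary / subordinate bus numbers


class FakeConfigSpace:
    """Hypothetical stand-in for one bridge's PCI configuration space."""

    def __init__(self, regs):
        self.regs = dict(regs)

    def read(self, reg):
        return self.regs.get(reg, 0)

    def write(self, reg, value):
        self.regs[reg] = value & 0xFFFFFFFF


def secondary_bus(businfo):
    """Secondary bus number lives in bits 15:8 of the bus register."""
    return (businfo >> 8) & 0xFF


def ensure_bus_master(cfg):
    """Set Bus Master Enable if the secondary bus is configured.

    Returns True if the command register was changed.
    Assumption: "configured" means a non-zero secondary bus number.
    """
    if secondary_bus(cfg.read(PCI_BRIDGE_BUS_REG)) == 0:
        return False
    csr = cfg.read(PCI_COMMAND_STATUS_REG)
    if csr & PCI_COMMAND_MASTER_ENABLE:
        return False
    cfg.write(PCI_COMMAND_STATUS_REG, csr | PCI_COMMAND_MASTER_ENABLE)
    return True


# Hypothetical values: primary bus 2, secondary bus 7, subordinate bus 7,
# with I/O and memory decoding enabled but bus mastering left off.
bridge = FakeConfigSpace({PCI_BRIDGE_BUS_REG: 0x00070702,
                          PCI_COMMAND_STATUS_REG: 0x0003})
changed = ensure_bus_master(bridge)
print(changed, hex(bridge.read(PCI_COMMAND_STATUS_REG)))
```

In the kernel the reads and writes would go through the pci(9) configuration-space accessors rather than a dictionary. The check is idempotent, so a bridge that firmware already set up correctly is left untouched.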
From: Patrick Welche <prlw1@cam.ac.uk>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: PR/53811 CVS commit: src/sys/dev/pci
Date: Mon, 4 Feb 2019 11:15:22 +0000
On Mon, Jan 28, 2019 at 04:10:01AM +0000, SAITOH Masanobu wrote:
> Log Message:
> Explicitly enable bus mastering in case BIOS, UEFI or firmware don't enable
> it. Might fix PR kern/53811.
It does fix it for me - thank you!
State-Changed-From-To: open->closed
State-Changed-By: msaitoh@NetBSD.org
State-Changed-When: Tue, 05 Feb 2019 06:48:14 +0000
State-Changed-Why:
Fixed (and pulled up to netbsd-8).
Thanks.
>Unformatted:
Copyright © 1994-2017
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.