NetBSD Problem Report #53811

From www@NetBSD.org  Tue Dec 25 16:30:03 2018
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id A6C627A1B1
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 25 Dec 2018 16:30:03 +0000 (UTC)
Message-Id: <20181225163002.2FF257A1E9@mollari.NetBSD.org>
Date: Tue, 25 Dec 2018 16:30:02 +0000 (UTC)
From: prlw1@cam.ac.uk
Reply-To: prlw1@cam.ac.uk
To: gnats-bugs@NetBSD.org
Subject: wm0 device timeout when uefi booting
X-Send-Pr-Version: www-1.0

>Number:         53811
>Category:       port-amd64
>Synopsis:       wm0 device timeout when uefi booting
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-amd64-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Dec 25 16:35:00 +0000 2018
>Closed-Date:    Tue Feb 05 06:48:14 +0000 2019
>Last-Modified:  Tue Feb 05 06:48:14 +0000 2019
>Originator:     Patrick Welche
>Release:        NetBSD-8.99.27/amd64
>Organization:
>Environment:
boot64.efi from Dec 20 code
>Description:
When booting using BIOS / MBR, I have a working box (apart from nouveau). When booting using UEFI, wm0 has timeouts:

[   700.413533] wm0: device timeout (txfree 4068 txsfree 42 txnext 28)
[   719.225142] wm0: device timeout (txfree 4095 txsfree 63 txnext 1)
[   734.034281] wm0: device timeout (txfree 4095 txsfree 63 txnext 1)

etc.

tcpdump -nvi wm0 only shows packets sent, no incoming traffic of any kind.

A diff of the working (BIOS) dmesg against the UEFI dmesg only shows graphics card differences, so I don't really have a clue as to what changed:

--- bios.boot   2018-12-22 19:47:10.128354034 +0000
+++ uefi.boot   2018-12-22 21:54:30.611972805 +0000
@@ -7,35 +7,37 @@
 NetBSD 8.99.27 (QUANTZ) #323: Sat Dec 15 20:03:50 GMT 2018
        prlw1@quantz:/usr/src/sys/arch/amd64/compile/QUANTZ
 total memory = 32699 MB
-avail memory = 31733 MB
+avail memory = 31731 MB
 cpu_rng: RDSEED
 timecounter: Timecounters tick every 10.000 msec
 Kernelized RAIDframe activated
 running cgd selftest aes-xts-256 aes-xts-512 done
 timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
+efi: systbl at pa db7e8018
 System manufacturer System Product Name (System Version)
 mainbus0 (root)
-ACPI: RSDP 0x00000000000F05B0 000024 (v02 ALASKA)
-ACPI: XSDT 0x00000000CC9CC098 0000A4 (v01 ALASKA A M I    01072009 AMI  00010013)
-ACPI: FACP 0x00000000CC9D7CB0 000114 (v06 ALASKA A M I    01072009 AMI  00010013)
+ACPI: RSDP 0x00000000D3AE9000 000024 (v02 ALASKA)
+ACPI: XSDT 0x00000000D3AE9098 0000AC (v01 ALASKA A M I    01072009 AMI  00010013)
+ACPI: FACP 0x00000000D3AF4CC0 000114 (v06 ALASKA A M I    01072009 AMI  00010013)
...
@@ -221,10 +223,11 @@
 ppb8: link is x16 @ 2.5GT/s
 pci9 at ppb8 bus 9
 pci9: i/o space, memory space enabled, rd/line, wr/inv ok
-vga0 at pci9 dev 0 function 0: NVIDIA product 1180 (rev. 0xa1)
-wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation)
+genfb0 at pci9 dev 0 function 0: NVIDIA product 1180 (rev. 0xa1)
+genfb0: framebuffer at 0xf1000000, size 800x600, depth 32, stride 3200
+wsdisplay0 at genfb0 kbdmux 1: console (default, vt100 emulation)
 wsmux1: connecting to wsdisplay0
-drm at vga0 not configured
+drm at genfb0 not configured
 hdaudio0 at pci9 dev 0 function 1: HD Audio Controller
 hdaudio0: interrupting at msi4 vec 0
 hdafg0 at hdaudio0: vendor 10de product 0040

The particular network card:

ppb6 at pci2 dev 6 function 0: AMD product 43b4 (rev. 0x02)
ppb6: PCI Express capability version 2 <Downstream Port of PCI-E Switch> x1 @ 5.0GT/s
ppb6: link is x1 @ 2.5GT/s
pci7 at ppb6 bus 7
pci7: i/o space, memory space enabled, rd/line, wr/inv ok
wm0 at pci7 dev 0 function 0: I211 Ethernet (COPPER) (rev. 0x03)
wm0: for TX and RX interrupting at msix3 vec 0 affinity to 1
wm0: for TX and RX interrupting at msix3 vec 1 affinity to 2
wm0: for LINK interrupting at msix3 vec 2
wm0: PCI-Express bus
wm0: 64 words iNVM, version 0.6
wm0: Ethernet address 60:45:cb:9e:13:dd
wm0: Copper
wm0: 0xc214420<INVM,IOH_VALID,PCIE,NEWQUEUE,WOL,PLLWA,CLSEMWA>
ukphy0 at wm0 phy 1: OUI 0x000ac2, model 0x0000, rev. 0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto

>How-To-Repeat:

>Fix:

>Release-Note:

>Audit-Trail:
From: Masanobu SAITOH <msaitoh@execsw.org>
To: gnats-bugs@NetBSD.org, port-amd64-maintainer@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: msaitoh@execsw.org
Subject: Re: port-amd64/53811: wm0 device timeout when uefi booting
Date: Wed, 26 Dec 2018 12:03:02 +0900

 Please show the following output:

 	0) full dmesg of "boot -xv" on UEFI
 	1) intrctl list
 	2) cpuctl list
 	3) cpuctl -v identify 0
 	4) /proc/cpuinfo
 	5) acpidump -dt

 -- 
 -----------------------------------------------
                  SAITOH Masanobu (msaitoh@execsw.org
                                   msaitoh@netbsd.org)

From: Patrick Welche <prlw1@cam.ac.uk>
To: Masanobu SAITOH <msaitoh@execsw.org>, gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-amd64/53811: wm0 device timeout when uefi booting
Date: Wed, 26 Dec 2018 14:24:03 +0000

 On Wed, Dec 26, 2018 at 12:03:02PM +0900, Masanobu SAITOH wrote:
 > Please show the following output:
 > 
 > 	0) full dmesg of "boot -xv" on UEFI
 > 	1) intrctl list
 > 	2) cpuctl list
 > 	3) cpuctl -v identify 0
 > 	4) /proc/cpuinfo
 > 	5) acpidump -dt

 Your mention of ACPI made me check the BIOS: I updated this ASUS PRIME X370-PRO
 from 4024 (7 Sep 2018) to 4207 (8 Dec 2018). The release notes say:

 1.Update AGESA 1006
 2.Improve compatibility and performance for Athlon™ with Radeon™ Vega Graphics Processors

 so it is not entirely obvious what changed to help. wm0 now works on the UEFI side.
 [Should the ACPI tables not be the same whichever way one boots?]

 Thanks!

From: Patrick Welche <prlw1@cam.ac.uk>
To: Masanobu SAITOH <msaitoh@execsw.org>, gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-amd64/53811: wm0 device timeout when uefi booting
Date: Wed, 26 Dec 2018 14:47:29 +0000

 I was just lucky - timeouts are back on the next boot. Compiling the
 information you asked for...

From: Patrick Welche <prlw1@cam.ac.uk>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-amd64/53811: wm0 device timeout when uefi booting
Date: Sat, 29 Dec 2018 15:35:43 +0000

 The information is now at: http://www.netbsd.org/~prlw1/pr53811.tar.xz

From: Masanobu SAITOH <msaitoh@execsw.org>
To: gnats-bugs@NetBSD.org, port-amd64-maintainer@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, prlw1@cam.ac.uk
Cc: msaitoh@execsw.org
Subject: Re: port-amd64/53811: wm0 device timeout when uefi booting
Date: Sun, 30 Dec 2018 14:36:04 +0900

 On 2018/12/30 0:40, Patrick Welche wrote:
 > The following reply was made to PR port-amd64/53811; it has been noted by GNATS.
 > 
 > From: Patrick Welche <prlw1@cam.ac.uk>
 > To: gnats-bugs@netbsd.org
 > Cc:
 > Subject: Re: port-amd64/53811: wm0 device timeout when uefi booting
 > Date: Sat, 29 Dec 2018 15:35:43 +0000
 > 
 >   The information is now at: http://www.netbsd.org/~prlw1/pr53811.tar.xz
 >   
 > 

 In the dmesg:
 > ioapic0 at mainbus0 apid 17: pa 0xfec00000, version 0x21, 24 pins
 > ioapic0: misconfigured as apic 1
 > ioapic0: autoconfiguration error: can't remap to apid 17
 > ioapic1 at mainbus0 apid 18: pa 0xfec01000, version 0x21, 32 pins
 > ioapic1: misconfigured as apic 2
 > ioapic1: autoconfiguration error: can't remap to apid 18

 I suspect this is the key to this problem.

 -- 
 -----------------------------------------------
                  SAITOH Masanobu (msaitoh@execsw.org
                                   msaitoh@netbsd.org)

From: "SAITOH Masanobu" <msaitoh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/53811 CVS commit: src/sys/dev/pci
Date: Mon, 28 Jan 2019 04:09:51 +0000

 Module Name:	src
 Committed By:	msaitoh
 Date:		Mon Jan 28 04:09:51 UTC 2019

 Modified Files:
 	src/sys/dev/pci: ppb.c

 Log Message:
  Explicitly enable bus mastering in case BIOS, UEFI or firmware don't enable
 it. Might fix PR kern/53811.


 To generate a diff of this commit:
 cvs rdiff -u -r1.65 -r1.66 src/sys/dev/pci/ppb.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/53811 CVS commit: [netbsd-8] src/sys/dev/pci
Date: Fri, 1 Feb 2019 11:25:13 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Fri Feb  1 11:25:13 UTC 2019

 Modified Files:
 	src/sys/dev/pci [netbsd-8]: ppb.c

 Log Message:
 Pull up following revision(s) (requested by msaitoh in ticket #1181):

 	sys/dev/pci/ppb.c: revision 1.66
 	sys/dev/pci/ppb.c: revision 1.67

   Explicitly enable bus mastering in case BIOS, UEFI or firmware don't enable
 it. Might fix PR kern/53811.

  -

   If the secondary bus is configured and the bus mastering is not enabled,
 enable it. Suggested by thorpej@.


 To generate a diff of this commit:
 cvs rdiff -u -r1.63 -r1.63.2.1 src/sys/dev/pci/ppb.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
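
 The log message for revision 1.67 describes a simple rule: if the bridge's
 secondary bus is configured and bus mastering is not enabled, enable it.
 A minimal C sketch of that rule follows. It is not the code from ppb.c:
 the struct, the function name and the SKETCH_ macro are hypothetical, the
 bridge is modelled as plain memory rather than accessed through PCI
 configuration space, and treating a secondary bus number of 0 as
 "unconfigured" is an assumption. The only fact taken from the PCI
 specification is that Bus Master Enable is bit 2 of the command register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bus Master Enable is bit 2 of the PCI command register (PCI spec). */
#define SKETCH_PCI_COMMAND_MASTER_ENABLE 0x0004u

/*
 * Hypothetical in-memory model of a PCI-PCI bridge. A real driver would
 * read and write these values through PCI configuration space instead.
 */
struct sketch_bridge {
	uint16_t command;        /* PCI command register */
	uint8_t  secondary_bus;  /* assumption: 0 means "not configured" */
};

/*
 * Enable bus mastering on a bridge whose secondary bus is configured
 * but whose firmware left the Bus Master Enable bit clear.
 * Returns true if the command register was changed.
 */
static bool
sketch_enable_bus_master(struct sketch_bridge *br)
{
	assert(br != NULL);

	if (br->secondary_bus == 0)
		return false;	/* nothing behind the bridge to serve */
	if ((br->command & SKETCH_PCI_COMMAND_MASTER_ENABLE) != 0)
		return false;	/* firmware already enabled it */

	br->command |= SKETCH_PCI_COMMAND_MASTER_ENABLE;
	return true;
}
```

 A bridge forwards DMA from the devices behind it only while it is allowed
 to act as a bus master, so a clear bit on ppb6 would be consistent with the
 symptoms reported for wm0 behind it: packets queued for transmit never
 complete and nothing is received.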

From: Patrick Welche <prlw1@cam.ac.uk>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: PR/53811 CVS commit: src/sys/dev/pci
Date: Mon, 4 Feb 2019 11:15:22 +0000

 On Mon, Jan 28, 2019 at 04:10:01AM +0000, SAITOH Masanobu wrote:
 >  Log Message:
 >   Explicitly enable bus mastering in case BIOS, UEFI or firmware don't enable
 >  it. Might fix PR kern/53811.

 It does fix it for me - thank you!

State-Changed-From-To: open->closed
State-Changed-By: msaitoh@NetBSD.org
State-Changed-When: Tue, 05 Feb 2019 06:48:14 +0000
State-Changed-Why:
Fixed (and pulled up to netbsd-8).
Thanks.


>Unformatted:
