NetBSD Problem Report #52818

From clare@akizuki.csel.org  Thu Dec 14 14:44:08 2017
Return-Path: <clare@akizuki.csel.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 4F0157A1A1
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 14 Dec 2017 14:44:08 +0000 (UTC)
Message-Id: <20171214144329.107A11267DA@akizuki.csel.org>
Date: Thu, 14 Dec 2017 23:43:29 +0900 (JST)
From: Shinichi Doyashiki <clare@csel.org>
Reply-To: clare@csel.org
To: gnats-bugs@NetBSD.org
Subject: The wm driver stops working after large traffic
X-Send-Pr-Version: 3.95

>Number:         52818
>Category:       kern
>Synopsis:       The wm driver stops working after large traffic
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Dec 14 14:45:00 +0000 2017
>Closed-Date:    Mon Jan 15 04:49:45 +0000 2018
>Last-Modified:  Sat Aug 11 13:35:02 +0000 2018
>Originator:     Shinichi Doyashiki
>Release:        NetBSD 8.99.9
>Organization:
	at home
>Environment:
System: NetBSD akizuki.csel.org 8.99.9 NetBSD 8.99.9 (XCYMINIPC) #3: Tue Dec 12 07:17:00 JST 2017 clare@akizuki.csel.org:/export/stage/hack/sys/arch/amd64/compile/XCYMINIPC amd64
Architecture: x86_64
Machine: amd64
>Description:
	The wm driver stops working after large traffic.
	Disabling TSO and the other offload flags does not help.
	NET_MPSAFE is enabled at this point.
	The driver appears to stop transmitting.
	This may be a driver bug or a unit-to-unit difference
	in the device hardware, since another box of the
	same model runs fine with checksum offloading
	and jumbo frames.

# dmesg | grep wm1
wm1 at pci2 dev 0 function 0: Intel i82583V (rev. 0x00)
wm1: interrupting at msi2 vec 0
wm1: PCI-Express bus
wm1: 512 words (8 address bits) SPI EEPROM, version 1.10.0, Image Unique ID ffffffff
wm1: Ethernet address 0c:e8:6c:**:**:**
wm1: 0x2a4440<SPI,IOH_VALID,PCIE,ASF_FIRM,AMT,WOL>
makphy1 at wm1 phy 1: Marvell 88E1149 Gigabit PHY, rev. 1
wm1: device timeout (txfree 4093 txsfree 61 txnext 3838)
wm1: device timeout (txfree 4086 txsfree 54 txnext 10)
wm1: device timeout (txfree 4088 txsfree 56 txnext 8)
wm1: device timeout (txfree 4089 txsfree 57 txnext 7)
wm1: device timeout (txfree 4089 txsfree 57 txnext 7)
wm1: device timeout (txfree 4089 txsfree 57 txnext 7)
wm1: device timeout (txfree 4087 txsfree 55 txnext 9)
wm1: device timeout (txfree 4088 txsfree 56 txnext 8)

# ifconfig wm1
wm1: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 9000
	capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
	capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
	capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
	enabled=0
	ec_capabilities=7<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU>
	ec_enabled=0
	address: 0c:e8:6c:**:**:**
	media: Ethernet autoselect (1000baseT full-duplex,master)
	status: active
	inet 192.168.***.***/24 broadcast 192.168.4.255 flags 0x0
	inet6 fe80::****:****:****:****%wm1/64 flags 0x0 scopeid 0x2
	inet6 ****:****:****:****::****/64 flags 0x0
>How-To-Repeat:
	Mount an NFS file system over IPv6 (TCP or UDP) through the
	wm device, then run rsync on it to generate heavy traffic.
>Fix:
	Unknown.

>Release-Note:

>Audit-Trail:
From: clare@csel.org
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/52818: The wm driver stops working after large traffic
Date: Sat, 16 Dec 2017 23:58:50 +0900

 PCIe ASPM is disabled by default in the BIOS setup
 of the XCY mini PC.
 Changing the ASPM flag in the EEPROM has no effect.

 I installed FreeBSD on the problematic box,
 and the problem does not appear with the FreeBSD kernel.

 -- 
 Shinichi Doyashiki <clare@csel.org>

From: clare@csel.org
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org
Subject: Re: kern/52818: The wm driver stops working after large traffic
Date: Mon, 18 Dec 2017 23:58:48 +0900

 >  PCIe ASPM is disabled by the BIOS by the default
 >  in the XCY mini PC.
 >  Fixing ASPM flag in EEPROM does not effective.

 The buggy BIOS does not disable ASPM regardless of the setup configuration.
 The following patch works around the device bug and
 restores network stability on my machine.

 Index: if_wm.c
 ===================================================================
 RCS file: /export/cvsroot/netbsd/src/sys/dev/pci/if_wm.c,v
 retrieving revision 1.549
 diff -u -r1.549 if_wm.c
 --- if_wm.c     8 Dec 2017 05:22:23 -0000       1.549
 +++ if_wm.c     18 Dec 2017 14:08:56 -0000
 @@ -1851,6 +1851,35 @@

         wm_adjust_qnum(sc, pci_msix_count(pa->pa_pc, pa->pa_tag));

 +       /*
 +        * XXX This is a quick hack; please clean up before commit.
 +        *
 +        * The 82583 does not support PCIe ASPM.
 +        * We should always disable PCIe ASPM on the device.
 +        */
 +       if (sc->sc_type == WM_T_82583) {
 +               int ok;
 +               int capoff;
 +
 +               ok = pci_get_capability(pc, pa->pa_tag,
 +                   PCI_CAP_PCIEXPRESS, &capoff, NULL);
 +               if (!ok) {
 +                       printf("XXX: I'm not on PCIe bus.\n");
 +                       goto skip_disable_aspm;
 +               }
 +               reg = pci_conf_read(pc, pa->pa_tag, capoff + PCIE_LCAP);
 +               if ((reg & PCIE_LCAP_ASPM) == 0) {
 +                       printf("XXX: ASPM was disabled.\n");
 +                       goto skip_disable_aspm;
 +               }
 +               printf("XXX: disabling ASPM now.\n");
 +               reg = pci_conf_read(pc, pa->pa_tag, capoff + PCIE_LCSR);
 +               reg &= ~PCIE_LCSR_ASPM_L1;
 +               reg &= ~PCIE_LCSR_ASPM_L0S;
 +               pci_conf_write(pc, pa->pa_tag, capoff + PCIE_LCSR, reg);
 +       skip_disable_aspm:;
 +       }
 +
         /* Allocation settings */
         max_type = PCI_INTR_TYPE_MSIX;
         /*
 ===================================================================

 The i82583 specification update says that ASPM L0s should be disabled
 with some chipsets.
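
 The read-modify-write that the patch above performs on the Link Control
 register can be sketched in user-space C.  This is only a sketch: the
 EX_* names and ex_lcsr_clear_aspm() are made up for illustration and are
 not the NetBSD PCIE_LCSR_* macros.  The bit positions assume the layout
 in the PCIe base specification, where the ASPM Control field occupies
 bits 1:0 of Link Control.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative names, not the NetBSD macros.  Assumption: the ASPM
 * Control field sits in bits 1:0 of the Link Control register.
 */
#define EX_LCSR_ASPM_L0S	UINT32_C(0x00000001)	/* L0s entry enabled */
#define EX_LCSR_ASPM_L1		UINT32_C(0x00000002)	/* L1 entry enabled */

/*
 * Return lcsr with the requested ASPM enable bits cleared.  Every other
 * bit is preserved.  mask may contain only ASPM Control bits.
 */
static uint32_t
ex_lcsr_clear_aspm(uint32_t lcsr, uint32_t mask)
{
	assert((mask & ~(EX_LCSR_ASPM_L0S | EX_LCSR_ASPM_L1)) == 0);
	return lcsr & ~mask;
}
```

 In a driver the input value would come from a configuration-space read
 and the result would be written back to the same offset, as the patch
 does with pci_conf_read() and pci_conf_write().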

 The FreeBSD kernel disables ASPM in the following cases:

         switch (adapter->hw.mac.type) {
                 case e1000_82573:
                 case e1000_82574:
                 case e1000_82583:
                         break;
                 default:
                         return;
         }


 -- 
 Shinichi Doyashiki <clare@csel.org>

From: Hisashi T Fujinaka <htodd@twofifty.com>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, 
    clare@csel.org
Subject: Re: kern/52818: The wm driver stops working after large traffic
Date: Mon, 18 Dec 2017 09:14:48 -0800 (PST)

 ASPM needs to be negotiated on the PCIe link and really should be turned
 off in the BIOS.
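
 As a rough illustration of the negotiation mentioned above: each end of
 a PCIe link advertises the ASPM states it supports in its Link
 Capabilities register, and a state is normally enabled only when both
 ends support it.  The sketch below decodes that field.  The EX_* names
 and both functions are hypothetical, and the field position (bits 11:10)
 is an assumption taken from the PCIe base specification layout.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names.  Assumption: ASPM Support is bits 11:10 of LCAP. */
#define EX_LCAP_ASPM_SHIFT	10
#define EX_LCAP_ASPM_MASK	UINT32_C(0x3)

enum ex_aspm_support {
	EX_ASPM_NONE = 0,	/* no ASPM state advertised */
	EX_ASPM_L0S  = 1,	/* L0s only */
	EX_ASPM_L1   = 2,	/* L1 only */
	EX_ASPM_BOTH = 3	/* L0s and L1 */
};

/* Extract the ASPM Support field from a Link Capabilities value. */
static enum ex_aspm_support
ex_lcap_aspm_support(uint32_t lcap)
{
	return (enum ex_aspm_support)
	    ((lcap >> EX_LCAP_ASPM_SHIFT) & EX_LCAP_ASPM_MASK);
}

/* States usable on a link: only those advertised by both ends. */
static enum ex_aspm_support
ex_link_aspm_common(uint32_t lcap_upstream, uint32_t lcap_device)
{
	unsigned int a = (unsigned int)ex_lcap_aspm_support(lcap_upstream);
	unsigned int b = (unsigned int)ex_lcap_aspm_support(lcap_device);

	assert(a <= 3 && b <= 3);
	return (enum ex_aspm_support)(a & b);
}
```

 A device that advertises a state it cannot actually sustain, as the
 errata discussed in this PR describe, defeats this scheme, which is why
 a driver-side override is being considered here.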

 For the workaround I would suggest that you don't differentiate between
 the 82583 and the 82574 (do the same thing for both).

 -- 
 Hisashi T Fujinaka - htodd@twofifty.com
 BSEE + BSChem + BAEnglish + MSCS + $2.50 = coffee

From: clare@csel.org
To: Hisashi T Fujinaka <htodd@twofifty.com>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/52818: The wm driver stops working after large traffic
Date: Wed, 20 Dec 2017 07:06:00 +0900

 On Mon, 18 Dec 2017 09:14:48 -0800 (PST)
 Hisashi T Fujinaka <htodd@twofifty.com> wrote:

 > ASPM needs to be negotiated on the PCIe link and really should be turned
 > off in the BIOS.

 The XCY mini PC's system firmware has a bug: it cannot handle PCIe ASPM.
 The driver should disable PCIe ASPM so that such boxes can be used.
 I think most end users cannot fix the behavior of the system firmware.


 > For the workaround I would suggest that you don't differentiate between
 > the 82583 and the 82574 (do the same thing for both).

 I have read the specification updates for the 82573 and 82574.
 I'll post a candidate patch later.


 -- 
 Shinichi Doyashiki <clare@csel.org>

From: Shinichi Doyashiki <clare@csel.org>
To: gnats-bugs@NetBSD.org
Cc: Hisashi T Fujinaka <htodd@twofifty.com>
Subject: Re: kern/52818: The wm driver stops working after large traffic
Date: Thu, 21 Dec 2017 01:22:38 +0900 (JST)

 This patch is a workaround for the 82583 errata.

 Index: if_wm.c
 ===================================================================
 RCS file: /export/cvsroot/netbsd/src/sys/dev/pci/if_wm.c,v
 retrieving revision 1.549
 diff -u -r1.549 if_wm.c
 --- if_wm.c	8 Dec 2017 05:22:23 -0000	1.549
 +++ if_wm.c	20 Dec 2017 16:07:06 -0000
 @@ -1851,6 +1851,43 @@

  	wm_adjust_qnum(sc, pci_msix_count(pa->pa_pc, pa->pa_tag));

 +	/*
 +	 * The 82573 disappears when PCIe ASPM L0s is enabled.
 +	 *
 +	 * The 82574 and 82583 do not support PCIe ASPM L0s with
 +	 * some chipsets.  The 82574 and 82583 documents say that
 +	 * disabling L0s with those specific chipsets is sufficient,
 +	 * but we follow what the Intel em driver does.
 +	 *
 +	 * References:
 +	 * Errata 8 of the Specification Update of i82573.
 +	 * Errata 20 of the Specification Update of i82574.
 +	 * Errata 9 of the Specification Update of i82583.
 +	 */
 +	switch (sc->sc_type) {
 +		int ok;
 +		int capoff;
 +
 +	case WM_T_82573:
 +	case WM_T_82574:
 +	case WM_T_82583:
 +		ok = pci_get_capability(pc, pa->pa_tag,
 +		    PCI_CAP_PCIEXPRESS, &capoff, NULL);
 +		if (!ok)
 +			break;
 +		reg = pci_conf_read(pc, pa->pa_tag, capoff + PCIE_LCAP);
 +		if ((reg & PCIE_LCAP_ASPM) == 0)
 +			break;
 +		reg = pci_conf_read(pc, pa->pa_tag, capoff + PCIE_LCSR);
 +		reg &= ~PCIE_LCSR_ASPM_L1;
 +		reg &= ~PCIE_LCSR_ASPM_L0S;
 +		pci_conf_write(pc, pa->pa_tag, capoff + PCIE_LCSR, reg);
 +		aprint_verbose_dev(sc->sc_dev, "ASPM was disabled to work around the errata.\n");
 +		break;
 +	default:
 +		break;
 +	}
 +
  	/* Allocation settings */
  	max_type = PCI_INTR_TYPE_MSIX;
  	/*

From: Masanobu SAITOH <msaitoh@execsw.org>
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, clare@csel.org
Cc: msaitoh@execsw.org
Subject: Re: kern/52818: The wm driver stops working after large traffic
Date: Thu, 28 Dec 2017 21:07:36 +0900

 Hi.

 Could you test the following patch?

 ---------
   Add ASPM workaround for 8257[1234] and 82583.

 Index: if_wm.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/pci/if_wm.c,v
 retrieving revision 1.550
 diff -u -p -r1.550 if_wm.c
 --- if_wm.c	28 Dec 2017 06:13:50 -0000	1.550
 +++ if_wm.c	28 Dec 2017 12:03:45 -0000
 @@ -921,6 +921,7 @@ static void	wm_ulp_disable(struct wm_sof
   static void	wm_enable_phy_wakeup(struct wm_softc *);
   static void	wm_igp3_phy_powerdown_workaround_ich8lan(struct wm_softc *);
   static void	wm_enable_wakeup(struct wm_softc *);
 +static void	wm_disable_aspm(struct wm_softc *);
   /* LPLU (Low Power Link Up) */
   static void	wm_lplu_d0_disable(struct wm_softc *);
   /* EEE */
 @@ -2048,6 +2049,9 @@ alloc_retry:
   		    (sc->sc_flags & WM_F_PCIX) ? "PCIX" : "PCI");
   	}

 +	/* Disable ASPM L0s and/or L1 for workaround */
 +	wm_disable_aspm(sc);
 +
   	/* clear interesting stat counters */
   	CSR_READ(sc, WMREG_COLC);
   	CSR_READ(sc, WMREG_RXERRC);
 @@ -2911,6 +2915,8 @@ wm_resume(device_t self, const pmf_qual_
   {
   	struct wm_softc *sc = device_private(self);

 +	/* Disable ASPM L0s and/or L1 for workaround */
 +	wm_disable_aspm(sc);
   	wm_init_manageability(sc);

   	return true;
 @@ -13806,6 +13812,66 @@ wm_enable_wakeup(struct wm_softc *sc)
   	pci_conf_write(sc->sc_pc, sc->sc_pcitag, pmreg + PCI_PMCSR, pmode);
   }

 +/* Disable ASPM L0s and/or L1 for workaround */
 +static void
 +wm_disable_aspm(struct wm_softc *sc)
 +{
 +	pcireg_t reg, mask = 0;
 +	const char *str = "";
 +
 +	/*
 +	 * Only for PCIe devices that have the PCIe capability in the PCI
 +	 * config space.
 +	 */
 +	if (((sc->sc_flags & WM_F_PCIE) == 0) || (sc->sc_pcixe_capoff == 0))
 +		return;
 +
 +	switch (sc->sc_type) {
 +	case WM_T_82571:
 +	case WM_T_82572:
 +		/*
 +		 * 8257[12] Errata 13: Device Does Not Support PCIe Active
 +		 * State Power management L1 State (ASPM L1).
 +		 */
 +		mask = PCIE_LCSR_ASPM_L1;
 +		str = "L1 is";
 +		break;
 +	case WM_T_82573:
 +	case WM_T_82574:
 +	case WM_T_82583:
 +		/*
 +		 * The 82573 disappears when PCIe ASPM L0s is enabled.
 +		 *
 +		 * The 82574 and 82583 do not support PCIe ASPM L0s with
 +		 * some chipsets.  The 82574 and 82583 documents say that
 +		 * disabling L0s with those specific chipsets is sufficient,
 +		 * but we follow what the Intel em driver does.
 +		 *
 +		 * References:
 +		 * Errata 8 of the Specification Update of i82573.
 +		 * Errata 20 of the Specification Update of i82574.
 +		 * Errata 9 of the Specification Update of i82583.
 +		 */
 +		mask = PCIE_LCSR_ASPM_L1 | PCIE_LCSR_ASPM_L0S;
 +		str = "L0s and L1 are";
 +		break;
 +	default:
 +		return;
 +	}
 +
 +	reg = pci_conf_read(sc->sc_pc, sc->sc_pcitag,
 +	    sc->sc_pcixe_capoff + PCIE_LCSR);
 +	reg &= ~mask;
 +	pci_conf_write(sc->sc_pc, sc->sc_pcitag,
 +	    sc->sc_pcixe_capoff + PCIE_LCSR, reg);
 +
 +	/* Print only in wm_attach() */
 +	if ((sc->sc_flags & WM_F_ATTACHED) == 0)
 +		aprint_verbose_dev(sc->sc_dev,
 +		    "ASPM %s disabled to work around the errata.\n",
 +		    str);
 +}
 +
   /* LPLU */

   static void
 ---------

 The same diff is at:

 	http://www.netbsd.org/~msaitoh/wm-aspm-20171228-0.dif
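
 The per-chip policy in the patch can be modelled outside the kernel as
 a pure function, which makes it easy to check which states end up
 disabled for each chip.  This is a sketch under stated assumptions, not
 the driver code: enum ex_chip, the EX_* bit names and both functions
 are hypothetical stand-ins for the WM_T_* types and PCIE_LCSR_* macros
 used in if_wm.c.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit names.  Assumption: ASPM Control is bits 1:0. */
#define EX_LCSR_ASPM_L0S	UINT32_C(0x1)
#define EX_LCSR_ASPM_L1		UINT32_C(0x2)

/* Hypothetical stand-ins for the driver's WM_T_* chip types. */
enum ex_chip {
	EX_82571, EX_82572, EX_82573, EX_82574, EX_82583, EX_OTHER
};

/* ASPM states turned off for a given chip.  0 means leave it alone. */
static uint32_t
ex_aspm_disable_mask(enum ex_chip chip)
{
	switch (chip) {
	case EX_82571:
	case EX_82572:
		/* Only L1 is covered by the 8257[12] erratum. */
		return EX_LCSR_ASPM_L1;
	case EX_82573:
	case EX_82574:
	case EX_82583:
		/* Disable both states, as the patch comment explains. */
		return EX_LCSR_ASPM_L0S | EX_LCSR_ASPM_L1;
	default:
		return 0;
	}
}

/* Apply the policy to a Link Control value and return the new value. */
static uint32_t
ex_apply_workaround(enum ex_chip chip, uint32_t lcsr)
{
	uint32_t mask = ex_aspm_disable_mask(chip);
	uint32_t out = lcsr & ~mask;

	assert((out & mask) == 0);
	return out;
}
```

 For example, a Link Control value of 0x43 (both ASPM states enabled)
 becomes 0x41 for an 82571, 0x40 for an 82583, and stays 0x43 for any
 chip outside the list.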

 -- 
 -----------------------------------------------
                  SAITOH Masanobu (msaitoh@execsw.org
                                   msaitoh@netbsd.org)

From: clare@csel.org
To: Masanobu SAITOH <msaitoh@execsw.org>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/52818: The wm driver stops working after large traffic
Date: Sat, 30 Dec 2017 00:23:31 +0900

 On Thu, 28 Dec 2017 21:07:36 +0900
 Masanobu SAITOH <msaitoh@execsw.org> wrote:

 > Could you test the following patch?

 I tested the patch on my machines that use the Intel i82583V.
 The patch seems to work fine.  Thank you.


 -- 
 Shinichi Doyashiki <clare@csel.org>

From: "SAITOH Masanobu" <msaitoh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52818 CVS commit: src/sys/dev/pci
Date: Thu, 4 Jan 2018 09:43:28 +0000

 Module Name:	src
 Committed By:	msaitoh
 Date:		Thu Jan  4 09:43:28 UTC 2018

 Modified Files:
 	src/sys/dev/pci: if_wm.c

 Log Message:
  Add ASPM workaround for 8257[1234] and 82583 to prevent device timeout or
 hangup. Fixes PR#52818 reported by Shinichi Doyashiki.


 To generate a diff of this commit:
 cvs rdiff -u -r1.551 -r1.552 src/sys/dev/pci/if_wm.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Soren Jacobsen" <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52818 CVS commit: [netbsd-8] src/sys/dev/pci
Date: Sat, 13 Jan 2018 21:42:45 +0000

 Module Name:	src
 Committed By:	snj
 Date:		Sat Jan 13 21:42:45 UTC 2018

 Modified Files:
 	src/sys/dev/pci [netbsd-8]: if_wm.c

 Log Message:
 Pull up following revision(s) (requested by msaitoh in ticket #491):
 	sys/dev/pci/if_wm.c: 1.550, 1.552
  Don't use MSI-X if we can use only one queue to save interrupt resource.
 Written by knakahara and tested by me.
 --
  Add ASPM workaround for 8257[1234] and 82583 to prevent device timeout or
 hangup. Fixes PR#52818 reported by Shinichi Doyashiki.


 To generate a diff of this commit:
 cvs rdiff -u -r1.508.4.11 -r1.508.4.12 src/sys/dev/pci/if_wm.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: msaitoh@NetBSD.org
State-Changed-When: Mon, 15 Jan 2018 04:49:45 +0000
State-Changed-Why:
Fixed and pulled up.
Thanks.


From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52818 CVS commit: [netbsd-7] src
Date: Sat, 11 Aug 2018 13:34:21 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Sat Aug 11 13:34:21 UTC 2018

 Modified Files:
 	src/share/man/man4 [netbsd-7]: wm.4
 	src/sys/dev/mii [netbsd-7]: ihphyreg.h inbmphyreg.h
 	src/sys/dev/pci [netbsd-7]: if_wm.c if_wmreg.h if_wmvar.h pcidevs
 	    pcireg.h

 Log Message:
 Pull up the following, requested by msaitoh in ticket #1628:

 share/man/man4/wm.4				1.40 via patch
 sys/dev/mii/ihphyreg.h				1.2
 sys/dev/mii/inbmphyreg.h			1.10
 sys/dev/pci/if_wm.c				1.504, 1.506, 1.510-1.535, 1.539-1.540, 1.546, 1.548, 1.551-1.552, 1.558, 1.565-1.573, 1.575, 1.579, 1.582, 1.584 via patch
 sys/dev/pci/if_wmreg.h				1.99-1.103, 1.106-1.107 via patch
 sys/dev/pci/if_wmvar.h				1.34-1.39 via patch
 sys/dev/pci/pcidevs				1.1327 via patch
 sys/dev/pci/pcidevs.h				regen
 sys/dev/pci/pcidevs_data.h			regen
 sys/dev/pci/pcireg.h				patch

 	Sync wm(4) up to 2018/08/08 except MSI/MSI-X and NET_MPSAFE:
 	- remove extra "+"
 	- Fix a bug that non-GMII devices don't send a routing message when
 	  the link status is changed.
 	- Set WMREG_KABGTXD not in wm_init_locked() but in wm_reset(). Same as
 	  other OSes.
 	- If an interrupt is spurious, don't print a debug message.
 	- Don't print the Image Unique ID if an NVM is iNVM (i210 and I211).
 	- Print sc_flags with snprintb().
 	- Fix a bug that a RAL was written at incorrect address when the index
 	  number is more than 16 on 82544 and newer.
 	- The layout of RAL on PCH* are different from others. Fix it.
 	- Flush every MTA write. Same as Linux.
 	- Move the location of calling wm_set_filter. Same as some other OSes.
 	- Add CSR_WRITE_FLUSH() after writing WMREG_CTRL in
 	  wm_gmii_mediachange().
 	- Add missing "else" in wm_nvm_release().
 	- Make new wm_phy_post_reset() and use this function at all location
 	  after resetting phy.
 	- Move the location of calling wm_get_hw_control. Same as Linux.
 	- Add I219 specific workaround for legacy interrupt. From OpenBSD.
 	- Move the location of calling wm_lplu_d0_disable().
 	- Fix latency calculation in wm_platform_pm_pch_lpt().
 	- Set OBFF water mark and enable OBFF on PCH_LPT and newer.
 	- Disable D0 LPLU on 8257[12356], 82580, I350 and I21[01], too.	Before
 	  this commit, above devices and non-PCIe devices accessed wrong
 	  register.
 	- Use device_printf() instead of aprint_error_dev() for PHY read/write
 	  functions because those are used not only in device attach.
 	- Fix a bug that wm_gmii_i82544_{read,write}reg() didn't take care of
 	  page select. PHY access from igphy() automatically did it, but
 	  accessing from wm(4) for workaround didn't work correctly. This
 	  change affects 8254[17], 8257[12], ICH8, ICH9 and ICH10.
 	- Call wm_kmrn_lock_loss_workaround_ich8lan() before any PHY access in
 	  wm_linkintr_gmii().
 	- Register access in wm_kmrn_lock_loss_workaround_ich8lan() now works
 	  correctly. Enable this function.
 	- Configure the LCD with the extended configuration region in NVM if
 	  it's required.
 	- If TX is not required to flush, RX is also not required to flush
 	  in wm_flush_desc_rings(). Same as other OSes.
 	- Remove wrong semaphore access in wm_nvm_{read,write}_{ich8,spt} to
 	  prevent hangup. The semaphore is acquired and released in
 	  wm_nvm_{read,write}.
 	- Move some initialization stuff in wm_attach() before wm_reset(). Some
 	  flags and callback function are required to set correctly before
 	  wm_reset() because wm_reset() and some helper functions refer them.
 	- Add wm_write_smbus_addr() to set SMBus address by software.
 	- Modify wm_gmii_hv_{read,write}reg_locked() to make them access
 	  HV_SMB_ADDR correctly.
 	- Use new nvm.{acquire,release}() for semaphore.
 	- Our MII readreg/writereg API has no way to detect an error.
 	  kmrn_{read,write}reg() are not used for MII API, so it's not required
 	  for these functions to use the same API. So,
 	  - Change return value as error code.
 	  - Change register value from int to uint16_t.
 	  - read: pass pointer for uint16_t as an argument.
 	  - Check return value on caller side.
 	- Check whether it's required to use MDIC workaround for 80003 or not
 	  in wm_reset(). If the workaround isn't required, don't use the
 	  workaround code in wm_gmii_i80003_{read,write}reg.
 	- Add WM_F_WA_I210_CLSEM flag for a workaround. FreeBSD/Linux drivers
 	  say "In rare circumstances, the SW semaphore may already be held
 	  unintentionally on I21[01]". PXE boot is one such case.
 	- Qemu's e1000e emulation (82574L)'s SPI has only 64 words. I've never
 	  seen on real 82574 hardware with such small SPI ROM. Check
 	  sc->sc_nvm_wordsize before accessing higher address words to prevent
 	  timeout.
 	- Check some wm_nvm_read() return values.
 	- Print NVM offset and word count when EERD polling failed.
 	- On I219, drop TARC0 bit 28 for DMA hang workaround (from Linux).
 	- 82583 supports jumbo frame. Fixes PR#52773 reported by
 	  Shinichi Doyashiki.
 	- Fix typo in comment. Reported by Shinichi Doyashiki in PR#52885.
 	- Add ASPM workaround for 8257[1234] and 82583 to prevent device
 	  timeout or hangup. Fixes PR#52818 reported by Shinichi Doyashiki.
 	- CID-1427779: Fix uninitialized variables.
 	- Fix a bug that wm_pll_workaround_i210() is not called when
 	  a) Chip is I211 or b) Chip is I210 and it uses iNVM (not FLASH).
 	- Do wm_reset_mdicnfg_82580() on 82580 only.
 	- Fix FLASH access on PCH_SPT and newer. Their FLASH access should be
 	  done by 32bit. Especially for ICH_FLASH_HSFCTL register, it's located
 	  at 0x0006, so it must be accessed via ICH_FLASH_HSFSTS(0x0004) and
 	  use shift or mask.
 	- Make wm_nvm_valid_bank_detect_ich8lan() the same as other OSes.
 	- If the extended configuration size in the EXTCNFSIZE register is 0,
 	  don't continue in wm_init_lcd_from_nvm().
 	- Add PCH_CNP support (I219 with Intel 300 series chipset).
 	- Enable I219 support.
 	- I354 uses an external PHY, so don't use wm_set_eee_i350().
 	- Fix a bug that the link can't be detected in the link interrupt
 	  function for non-SERDES fiber.
 	- Fix a bug that the 82542 misinterprets fiber's signal detection.
 	- Add debug printf()s.
 	- Update comment.
 	- Rename functions and variables.
 	- Add diagnostic code.
 	- Sort registers.
 	- Lowercase hexadecimal values.
 	- KNF.


 To generate a diff of this commit:
 cvs rdiff -u -r1.30.2.1 -r1.30.2.2 src/share/man/man4/wm.4
 cvs rdiff -u -r1.1 -r1.1.38.1 src/sys/dev/mii/ihphyreg.h
 cvs rdiff -u -r1.3.30.1 -r1.3.30.2 src/sys/dev/mii/inbmphyreg.h
 cvs rdiff -u -r1.289.2.14 -r1.289.2.15 src/sys/dev/pci/if_wm.c
 cvs rdiff -u -r1.60.2.8 -r1.60.2.9 src/sys/dev/pci/if_wmreg.h
 cvs rdiff -u -r1.19.2.6 -r1.19.2.7 src/sys/dev/pci/if_wmvar.h
 cvs rdiff -u -r1.1199.2.11 -r1.1199.2.12 src/sys/dev/pci/pcidevs
 cvs rdiff -u -r1.95.2.3 -r1.95.2.4 src/sys/dev/pci/pcireg.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

>Unformatted:
