NetBSD Problem Report #48476

From bouyer@antioche.eu.org  Tue Dec 24 10:16:41 2013
Return-Path: <bouyer@antioche.eu.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "Postmaster NetBSD.org" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id CFA45A40BA
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 24 Dec 2013 10:16:41 +0000 (UTC)
Message-Id: <201312241015.rBOAFLxM001826@localhost.>
Date: Tue, 24 Dec 2013 11:15:21 +0100 (CET)
From: bouyer@antioche.eu.org
Reply-To: bouyer@antioche.eu.org
To: gnats-bugs@gnats.NetBSD.org
Subject: wm(4) transmit hang
X-Send-Pr-Version: 3.95

>Number:         48476
>Category:       kern
>Synopsis:       wm(4) transmit hang
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    msaitoh
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Dec 24 10:20:00 +0000 2013
>Closed-Date:    Wed Sep 12 07:47:02 +0000 2018
>Last-Modified:  Wed Sep 12 07:47:02 +0000 2018
>Originator:     Manuel Bouyer
>Release:        NetBSD 6.1_STABLE
>Organization:
>Environment:
System: NetBSD 6.1_STABLE NetBSD 6.1_STABLE (ANTIOCHE6-64) amd64
Architecture: amd64
Machine: amd64
	     NetBSD: if_wm.c,v 1.227.2.10 2013/07/29 20:24:04 jdc Exp
>Description:
	This is on ftp.fr.netbsd.org, a Supermicro X7DBR motherboard.
	Its wm0 interface is connected to a gigabit switch and is identified
	as:
wm0 at pci4 dev 0 function 0: i80003 dual 1000baseT Ethernet (rev. 0x01)
wm0: interrupting at ioapic0 pin 18
wm0: PCI-Express bus
wm0: 65536 word (16 address bits) SPI EEPROM
wm0: Ethernet address 00:30:48:31:43:82
ikphy0 at wm0 phy 1: i82563 10/100/1000 media interface, rev. 2
ikphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
# ifconfig wm0
wm0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx,TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
        enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx,TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
        address: 00:30:48:31:43:82
        media: Ethernet autoselect (1000baseT full-duplex)
        status: active

	Under traffic, transmission stops. ping fails with
	"no buffer space available", and named logs:
# Dec 24 11:01:07 antioche named[181]: client 80.10.200.35#32697 (www.antioche.eu.org): error sending response: not enough free resources
	ifconfig wm0 doesn't show the OACTIVE flag.

	An "ifconfig wm0 down up" fixes it for a few minutes.

	Note that this system has been used for other tasks before,
	with older versions of NetBSD without problems.
	Also, other systems with different wm(4) models running
	kernels built from the same source show no problems even
	under load.
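
The OACTIVE check mentioned above can be scripted. This is a minimal sketch, assuming the flags word printed by ifconfig is hexadecimal and that IFF_OACTIVE is bit 0x400, as in NetBSD's <net/if.h>. The function names ifconfig_flags and has_oactive are hypothetical helpers, not part of any NetBSD tool.

```sh
#!/bin/sh
# Sketch: test the IFF_OACTIVE bit (assumed 0x400, per NetBSD <net/if.h>)
# in the hexadecimal flags word that ifconfig prints, e.g. "flags=8843<...>".

# Extract the hex flags word from ifconfig output read on stdin.
ifconfig_flags() {
    sed -n 's/^[^ ]*: flags=\([0-9a-fA-F]*\)<.*/\1/p' | head -n 1
}

# Succeed (exit status 0) if the OACTIVE bit is set in hex flags word $1.
has_oactive() {
    [ $(( 0x$1 & 0x400 )) -ne 0 ]
}
```

On the affected host this could be used as `has_oactive "$(ifconfig wm0 | ifconfig_flags)"`. For the flags value 8843 shown in this report, the check fails, which matches the observation that OACTIVE is not set.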
>How-To-Repeat:
	Put this motherboard's wm(4) under load?
>Fix:
	unknown

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->msaitoh
Responsible-Changed-By: msaitoh@NetBSD.org
Responsible-Changed-When: Tue, 07 Jan 2014 13:18:55 +0000
Responsible-Changed-Why:
mine :-<


From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/48476: wm(4) transmit hang
Date: Tue, 7 Jan 2014 17:54:50 +0000

 (not sent to gnats)
    ------

 From: Manuel Bouyer <bouyer@antioche.eu.org>
 To: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
 Subject: Re: kern/48476: wm(4) transmit hang
 Date: Tue, 7 Jan 2014 15:24:19 +0100

 On Tue, Dec 24, 2013 at 10:20:00AM +0000, bouyer@antioche.eu.org wrote:
 > >Description:
 > 	This is on ftp.fr.netbsd.org, a Supermicro X7DBR motherboard.
 > 	Its wm0 interface is connected to a gigabit switch and is identified
 > 	as:
 > wm0 at pci4 dev 0 function 0: i80003 dual 1000baseT Ethernet (rev. 0x01)
 > wm0: interrupting at ioapic0 pin 18
 > wm0: PCI-Express bus
 > wm0: 65536 word (16 address bits) SPI EEPROM
 > wm0: Ethernet address 00:30:48:31:43:82
 > ikphy0 at wm0 phy 1: i82563 10/100/1000 media interface, rev. 2
 > ikphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
 > # ifconfig wm0
 > wm0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
 >         capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx,TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
 >         enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx,TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
 >         address: 00:30:48:31:43:82
 >         media: Ethernet autoselect (1000baseT full-duplex)
 >         status: active
 > 
 > 	under traffic, transmission stops. ping complains with
 > 	"no buffer space available" and named with:
 > # Dec 24 11:01:07 antioche named[181]: client 80.10.200.35#32697 (www.antioche.eu.org): error sending response: not enough free resources
 > 	ifconfig wm0 doesn't show "OACTIVE flag".
 > 
 > 	A "ifconfig wm0 down up" fixes it for a few minutes.
 > 
 > 	Note that this system has been used for other tasks before,
 > 	with older versions of NetBSD without problems.
 > 	Also, other systems with different wm(4) models running
 > 	kernels built from the same source show no problems even
 > 	under load.

 It looks like disabling tso4 and tso6 helps, but I don't know if TSO is the
 real cause of the problem, or if it only hides it by lowering the network
 load a bit.
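
The workaround described above can be sketched as follows, assuming the -tso4 and -tso6 parameters documented in NetBSD's ifconfig(8). The helper name disable_tso and the DRY_RUN variable are hypothetical; they exist only so the command can be inspected before it touches a live interface.

```sh
#!/bin/sh
# Sketch: turn off TCP segmentation offload (IPv4 and IPv6) on one interface.
# With DRY_RUN=1 the command is printed instead of executed.
disable_tso() {
    ifname=$1
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "ifconfig $ifname -tso4 -tso6"
    else
        ifconfig "$ifname" -tso4 -tso6
    fi
}
```

Running `DRY_RUN=1 disable_tso wm0` prints the command without changing anything. Whether disabling TSO removes the cause of the hang or only masks it is left open in this report.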

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 years of experience will always make the difference
 --

From: Masanobu SAITOH <msaitoh@execsw.org>
To: gnats-bugs@NetBSD.org, msaitoh@NetBSD.org, gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org, bouyer@antioche.eu.org, bouyer@netbsd.org
Cc: msaitoh@execsw.org
Subject: Re: kern/48476: wm(4) transmit hang
Date: Wed, 12 Sep 2018 14:25:59 +0900

 Do you still have the system?

 This PR is for netbsd-6, which has reached end of life.
 Since this PR was filed, a TSO problem was fixed in rev. 1.269
 and pulled up to the netbsd-6 branch (rev. 1.227.2.11).
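
The kernel in the original report was built from if_wm.c rev. 1.227.2.10, one revision before the pullup named above. A minimal sketch of that comparison follows, assuming plain dotted CVS revision numbers. The helper name rev_at_least is hypothetical, and it deliberately refuses to compare revisions on different branches.

```sh
#!/bin/sh
# Sketch: is CVS revision $1 at or after revision $2 on the same branch?
# Exit status: 0 = yes, 1 = no, 2 = different branches (not comparable here).
rev_at_least() {
    have=$1
    want=$2
    # Everything before the last dot identifies the branch (e.g. 1.227.2).
    [ "${have%.*}" = "${want%.*}" ] || return 2
    # The last component counts revisions along that branch.
    [ "${have##*.}" -ge "${want##*.}" ]
}
```

For example, `rev_at_least 1.227.2.10 1.227.2.11` fails, which is consistent with the reported kernel predating the fix. How the revision string is obtained from a running kernel is outside this sketch.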

 Is it OK to close this PR, or keep it?

 -- 
 -----------------------------------------------
                  SAITOH Masanobu (msaitoh@execsw.org
                                   msaitoh@netbsd.org)

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Masanobu SAITOH <msaitoh@execsw.org>
Cc: gnats-bugs@NetBSD.org, msaitoh@NetBSD.org, gnats-admin@netbsd.org,
        netbsd-bugs@netbsd.org
Subject: Re: kern/48476: wm(4) transmit hang
Date: Wed, 12 Sep 2018 09:37:01 +0200

 hello,

 On Wed, Sep 12, 2018 at 02:25:59PM +0900, Masanobu SAITOH wrote:
 > Do you still have the system?
 > 
 > This PR is for netbsd-6, which has reached end of life.
 > Since this PR was filed, a TSO problem was fixed in rev. 1.269
 > and pulled up to the netbsd-6 branch (rev. 1.227.2.11).
 > 
 > Is it OK to close this PR, or keep it?

 I still have the system, and it's still ftp.fr.
 It's now running netbsd-8, but even before the upgrade I hadn't seen
 this problem for a very long time. I guess it can be closed.

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 years of experience will always make the difference
 --

State-Changed-From-To: open->closed
State-Changed-By: msaitoh@NetBSD.org
State-Changed-When: Wed, 12 Sep 2018 07:47:02 +0000
State-Changed-Why:
The submitter said he has not seen this problem for a very long time.
Thanks.


>Unformatted:
