NetBSD Problem Report #34834

From mlelstv@fud.1st.de  Mon Oct 16 18:29:35 2006
Return-Path: <mlelstv@fud.1st.de>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id 2795963B84D
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 16 Oct 2006 18:29:35 +0000 (UTC)
Message-Id: <20061016182929.9BE4CA656@fud.1st.de>
Date: Mon, 16 Oct 2006 20:29:29 +0200 (CEST)
From: mlelstv@serpens.de
Reply-To: mlelstv@serpens.de
To: gnats-bugs@NetBSD.org
Subject: ste driver generates packet burst after timeout
X-Send-Pr-Version: 3.95

>Number:         34834
>Category:       kern
>Synopsis:       ste driver generates packet burst after timeout
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Oct 16 18:30:00 +0000 2006
>Last-Modified:  Mon Jul 30 22:20:01 +0000 2007
>Originator:     Michael van Elst
>Release:        NetBSD 4.0_BETA
>Organization:

>Environment:


System: NetBSD fud 4.0_BETA NetBSD 4.0_BETA (FUD) #4: Tue Oct 10 13:49:00 CEST 2006 mlelstv@henery:/home/netbsd4/obj/home/netbsd4/src/sys/arch/i386/compile/FUD i386
Architecture: i386
Machine: i386
>Description:
Under high load the ste hardware seems to stall. The interface watchdog
then triggers the timeout handler, which prints:

ste0: device timeout

The handler resets the interface and starts it again. On the wire you can
see that no packets are sent for about a second before the timeout. After
the timeout message appears, however, one packet (here a small ACK packet)
is sent repeatedly, back-to-back, at about 90000 packets per second on
this system. After a couple of seconds the burst stops and traffic
continues as normal until the next timeout happens. The outgoing packets
show up in the netstat -i counters for that interface.

The ste interface is on a 4-way Ethernet board with a local PCI bridge:

cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel Pentium M (Yonah) (686-class), 1666.81 MHz, id 0x6e8
cpu0: features bfe9fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
cpu0: features bfe9fbff<PGE,MCA,CMOV,PAT,CFLUSH,DS,ACPI,MMX>
cpu0: features bfe9fbff<FXSR,SSE,SSE2,SS,HTT,TM,SBF>
cpu0: features2 c1a9<SSE3,MONITOR,VMX,EST,TM2,xTPR>
cpu0: "Genuine Intel(R) CPU           T2300  @ 1.66GHz"
cpu0: I-cache 32 KB 64B/line 8-way, D-cache 32 KB 64B/line 8-way
cpu0: L2 cache 2 MB 64B/line 8-way
cpu0: using thermal monitor 1
cpu0: Enhanced SpeedStep (1404 mV) 1667 MHz
cpu0: unknown Enhanced SpeedStep CPU.
cpu0: using only highest and lowest  power states.
cpu0: Enhanced SpeedStep frequencies available (MHz): 1667 1000
cpu0: calibrating local timer
cpu0: apic clock running at 166 MHz
cpu0: 64 page colors
cpu1 at mainbus0: apid 1 (application processor)
cpu1: starting
cpu1: Intel Pentium M (Yonah) (686-class), 1666.67 MHz, id 0x6e8
cpu1: features bfe9fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
cpu1: features bfe9fbff<PGE,MCA,CMOV,PAT,CFLUSH,DS,ACPI,MMX>
cpu1: features bfe9fbff<FXSR,SSE,SSE2,SS,HTT,TM,SBF>
cpu1: features2 c1a9<SSE3,MONITOR,VMX,EST,TM2,xTPR>
cpu1: "Genuine Intel(R) CPU           T2300  @ 1.66GHz"
cpu1: I-cache 32 KB 64B/line 8-way, D-cache 32 KB 64B/line 8-way
cpu1: L2 cache 2 MB 64B/line 8-way
cpu1: using thermal monitor 1
cpu1: Enhanced SpeedStep (1404 mV) 1667 MHz
cpu1: unknown Enhanced SpeedStep CPU.
cpu1: using only highest and lowest  power states.
cpu1: Enhanced SpeedStep frequencies available (MHz): 1667 1000
ioapic0 at mainbus0 apid 2 (I/O APIC)
ioapic0: pa 0xfec00000, version 20, 24 pins
ioapic0: misconfigured as apic 1
ioapic0: remapped to apic 2
[...]
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
[...]
ppb3 at pci0 dev 30 function 0: Intel 82801BAM Hub-PCI Bridge (rev. 0xe2)
pci4 at ppb3 bus 4
pci4: i/o space, memory space enabled
ppb4 at pci4 dev 9 function 0: Intel S21152BB PCI-PCI Bridge (rev. 0x00)
pci5 at ppb4 bus 5
pci5: i/o space, memory space enabled
ste0 at pci5 dev 4 function 0: D-Link DL-1002 10/100 Ethernet
pci_mem_find: void region
ste0: interrupting at ioapic0 pin 21 (irq 10)
ste0: Ethernet address 00:05:5d:7d:07:98
ukphy0 at ste0 phy 1: Generic IEEE 802.3u media interface
ukphy0: OUI 0x0009c3, model 0x0004, rev. 0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ste1 at pci5 dev 5 function 0: D-Link DL-1002 10/100 Ethernet
pci_mem_find: void region
ste1: interrupting at ioapic0 pin 22 (irq 11)
ste1: Ethernet address 00:05:5d:7d:07:99
ukphy1 at ste1 phy 1: Generic IEEE 802.3u media interface
ukphy1: OUI 0x0009c3, model 0x0004, rev. 0
ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ste2 at pci5 dev 6 function 0: D-Link DL-1002 10/100 Ethernet
pci_mem_find: void region
ste2: interrupting at ioapic0 pin 23 (irq 3)
ste2: Ethernet address 00:05:5d:7d:07:9a
ukphy2 at ste2 phy 1: Generic IEEE 802.3u media interface
ukphy2: OUI 0x0009c3, model 0x0004, rev. 0
ukphy2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ste3 at pci5 dev 7 function 0: D-Link DL-1002 10/100 Ethernet
pci_mem_find: void region
ste3: interrupting at ioapic0 pin 20 (irq 5)
ste3: Ethernet address 00:05:5d:7d:07:9b
ukphy3 at ste3 phy 1: Generic IEEE 802.3u media interface
ukphy3: OUI 0x0009c3, model 0x0004, rev. 0
ukphy3: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

The problem appears on all ste ports, not just ste0.

>How-To-Repeat:
The problem did not occur with a different (much slower) motherboard, so
it might be specific to the Core Duo or the PCI bridge.

>Fix:


>Audit-Trail:
From: Michael van Elst <mlelstv@serpens.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/34834 ste driver generates packet burst after timeout
Date: Thu, 4 Jan 2007 21:28:10 +0100

 I had a look at the FreeBSD driver and found that the watchdog
 function polls the device for interrupts before resetting it.

 The following equivalent patch seems to eliminate the packet bursts:

 Index: if_ste.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/pci/if_ste.c,v
 retrieving revision 1.25
 diff -u -r1.25 if_ste.c
 --- if_ste.c	16 Nov 2006 01:33:09 -0000	1.25
 +++ if_ste.c	4 Jan 2007 20:24:29 -0000
 @@ -784,6 +784,8 @@
  	printf("%s: device timeout\n", sc->sc_dev.dv_xname);
  	ifp->if_oerrors++;

 +	ste_txintr(sc);
 +	ste_rxintr(sc);
  	(void) ste_init(ifp);

  	/* Try to get more packets going. */

 I still see the device timeouts, but the driver now recovers quickly.

 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: Sergey Svishchev <svs+pr@grep.ru>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/34834
Date: Thu, 1 Feb 2007 23:21:19 +0300

 I'm seeing this problem on an i386 box running 3.0.2.  The fix doesn't
 help: after a timeout, the card continuously generates 90K pps or more
 (the packet rate appears to grow over time).

 -- 
 Sergey Svishchev

From: Michael van Elst <mlelstv@serpens.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/34834: ste driver generates packet burst after timeout
Date: Tue, 31 Jul 2007 00:15:16 +0200

 I haven't found the reason for the packet bursts, but I have found the
 reason for the device timeouts.

 When the ste chip raises an interrupt, all corresponding interrupt
 sources are masked in the interrupt enable register (by virtue of
 reading the IntStatusAck register). When the driver has completed the
 interrupt handling, the interrupts are enabled again.

 The problem is that another interrupt condition might be met while
 that specific interrupt source is disabled. Re-enabling that interrupt
 source afterwards will not cause the interrupt signal to be posted.

 As a workaround I have added a loop to the interrupt handler that
 continues handling interrupts until no interrupt is pending.

 I haven't encountered a device timeout since then.
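
 The effect can be illustrated with a small user-space simulation. This is
 only a sketch of the behaviour described above, not the actual if_ste.c
 change: the device model and all names (sim_dev, sim_raise,
 sim_read_status_ack, sim_intr_once, sim_intr_loop, sim_run) are
 hypothetical, and the register semantics are assumed from the description
 rather than taken from the chip documentation.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_TX_DONE 0x0001u
#define SIM_ALL     0x0001u

/* Hypothetical device model; not the real ste register layout. */
struct sim_dev {
	uint16_t status;    /* latched interrupt conditions */
	uint16_t enable;    /* interrupt enable mask */
	int      irq_edges; /* times the interrupt signal was posted */
	int      handled;   /* events serviced by the handler */
	int      inject;    /* events that arrive while the handler runs */
};

/* Latch a condition; post the interrupt only if the source is enabled now. */
static void sim_raise(struct sim_dev *d, uint16_t bits)
{
	d->status |= bits;
	if (d->enable & bits)
		d->irq_edges++;
}

/* Assumed IntStatusAck behaviour: return and clear the pending bits,
 * and mask the corresponding sources as a side effect. */
static uint16_t sim_read_status_ack(struct sim_dev *d)
{
	uint16_t pending = d->status;

	d->status = 0;
	d->enable &= (uint16_t)~pending;
	return pending;
}

/* Service one batch; a new event may arrive while the source is masked. */
static void sim_service(struct sim_dev *d, uint16_t pending)
{
	if (pending & SIM_TX_DONE)
		d->handled++;
	if (d->inject > 0) {
		d->inject--;
		sim_raise(d, SIM_TX_DONE);
	}
}

/* Single-pass handler: read status once, service, re-enable. */
static void sim_intr_once(struct sim_dev *d)
{
	uint16_t pending = sim_read_status_ack(d);

	if (pending != 0)
		sim_service(d, pending);
	d->enable = SIM_ALL; /* re-enabling posts nothing for latched bits */
}

/* Looping handler: keep servicing until nothing is pending. */
static void sim_intr_loop(struct sim_dev *d)
{
	uint16_t pending;

	while ((pending = sim_read_status_ack(d)) != 0)
		sim_service(d, pending);
	d->enable = SIM_ALL;
}

/* Returns 1 if an event is left latched with no interrupt posted for it. */
int sim_run(int use_loop)
{
	struct sim_dev d = { 0, SIM_ALL, 0, 0, 1 };

	sim_raise(&d, SIM_TX_DONE);
	assert(d.irq_edges == 1); /* the first event does post an interrupt */
	if (use_loop)
		sim_intr_loop(&d);
	else
		sim_intr_once(&d);
	return d.status != 0 && d.irq_edges == 1;
}
```

 In this model sim_run(0) leaves one event stranded (latched, but with no
 interrupt posted), which is the kind of stall a watchdog would eventually
 report, while sim_run(1) services both events. In the model, the loop
 narrows the window rather than closing it completely, since an event
 could still arrive between the final status read and the re-enable.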

 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

>Unformatted:
