NetBSD Problem Report #52876
From clare@csel.org Fri Dec 29 00:13:10 2017
Return-Path: <clare@csel.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 3EB807A1AE
for <gnats-bugs@gnats.NetBSD.org>; Fri, 29 Dec 2017 00:13:10 +0000 (UTC)
Message-Id: <20171229001304.7911E146CFA@router.csel.org>
Date: Fri, 29 Dec 2017 09:13:04 +0900 (JST)
From: Shinichi Doyashiki <clare@csel.org>
Reply-To: clare@csel.org
To: gnats-bugs@NetBSD.org
Subject: The vlan(4) over wm(4) behaves something strange
X-Send-Pr-Version: 3.95
>Number: 52876
>Category: kern
>Synopsis: The vlan(4) over wm(4) behaves something strange
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Dec 29 00:15:00 +0000 2017
>Last-Modified: Tue Jan 02 00:25:00 +0000 2018
>Originator: Shinichi Doyashiki
>Release: NetBSD 8.99.9
>Organization:
at home
>Environment:
System: NetBSD router.csel.org 8.99.9 NetBSD 8.99.9 (XCYMINIPC) #**: Fri Dec 29 **:**:** JST 2017 clare@mizuki.csel.org:/export/stage/hack/sys/arch/amd64/compile/XCYMINIPC amd64
Architecture: x86_64
Machine: amd64
>Description:
vlan(4) over wm(4) behaves strangely.
Packet forwarding itself seems fine,
but ssh or telnet sessions to the box sometimes
freeze (or packets are dropped) on the box.
>How-To-Repeat:
Details are unknown.
My router box has vlan(4) configured over wm(4).
Another endpoint machine uses wm(4) only, and that works fine.
wm0: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
ec_capabilities=7<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU>
ec_enabled=3<VLAN_MTU,VLAN_HWTAGGING>
address: **:**:**:**:**:**
media: Ethernet autoselect (1000baseT full-duplex)
status: active
inet 192.168.0.1/24 broadcast 192.168.0.255 flags 0x0
inet 192.168.1.1/24 broadcast 192.168.1.255 flags 0x0
inet 192.168.0.34/24 broadcast 192.168.0.255 flags 0x0
inet 192.168.1.34/24 broadcast 192.168.1.255 flags 0x0
inet6 fe**::e**:****:****:**d4%wm0/64 flags 0x0 scopeid 0x1
inet6 fe**::**%wm0/64 flags 0x0 scopeid 0x1
inet6 24**:****:****:****::**/64 flags 0x0
(snip)
vlan10: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
vlan: 10 parent: wm0
address: **:**:**:**:**:**
inet 192.168.10.1/24 broadcast 192.168.10.255 flags 0x0
inet6 fe**::*:****:****:****%vlan10/64 flags 0x0 scopeid 0x8
inet6 fe**::1%vlan10/64 flags 0x0 scopeid 0x8
inet6 24**:****:****:****::1/64 flags 0x0
vlan11: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
vlan: 11 parent: wm0
address: **:**:**:**:**:**
inet 192.168.11.1/24 broadcast 192.168.11.255 flags 0x0
inet6 fe**::*:****:****:****%vlan11/64 flags 0x0 scopeid 0x9
inet6 fe**::1%vlan11/64 flags 0x0 scopeid 0x9
inet6 24**:****:****:****::1/64 flags 0x0
vlan2: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
vlan: 2 parent: wm0
address: **:**:**:**:**:**
inet 192.168.2.1/24 broadcast 192.168.2.255 flags 0x0
inet6 fe**::*:****:****:****%vlan2/64 flags 0x0 scopeid 0xa
inet6 fe**::1%vlan2/64 flags 0x0 scopeid 0xa
inet6 24**:****:****:****::1/64 flags 0x0
vlan29: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
vlan: 29 parent: wm0
address: **:**:**:**:**:**
inet 192.168.29.1/24 broadcast 192.168.29.255 flags 0x0
inet6 fe**::*:****:****:****%vlan29/64 flags 0x0 scopeid 0xb
inet6 fe**::1%vlan29/64 flags 0x0 scopeid 0xb
inet6 24**:****:****:****::1/64 flags 0x0
vlan3: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
vlan: 3 parent: wm0
address: **:**:**:**:**:**
inet 192.168.3.1/24 broadcast 192.168.3.255 flags 0x0
inet6 fe**::e**:****:****:****%vlan3/64 flags 0x0 scopeid 0xc
inet6 fe**::1%vlan3/64 flags 0x0 scopeid 0xc
inet6 24**:****:****:****::1/64 flags 0x0
vlan30: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
vlan: 30 parent: wm0
address: **:**:**:**:**:**
inet 192.168.30.1/24 broadcast 192.168.30.255 flags 0x0
inet6 fe**::e**:****:****:****%vlan30/64 flags 0x0 scopeid 0xd
inet6 fe**::1%vlan30/64 flags 0x0 scopeid 0xd
inet6 80:****:****:****::1/64 flags 0x0
vlan31: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
vlan: 31 parent: wm0
address: **:**:**:**:**:**
inet6 fe80::*%vlan31/64 flags 0x0 scopeid 0xe
vlan4: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 9000
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
vlan: 4 parent: wm1
address: **:**:**:**:**:**
inet 192.168.4.1/24 broadcast 192.168.4.255 flags 0x0
inet6 fe80::*%vlan4/64 flags 0x0 scopeid 0xf
inet6 fe80::1%vlan4/64 flags 0x0 scopeid 0xf
inet6 */64 flags 0x0
>Fix:
Unknown.
>Audit-Trail:
From: clare@csel.org
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/52876: The vlan(4) over wm(4) behaves something strange
Date: Fri, 29 Dec 2017 10:02:55 +0900
Disabling TSO works around the problem.
--
Shinichi Doyashiki <clare@csel.org>
From: SAITOH Masanobu <msaitoh@execsw.org>
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, clare@csel.org
Cc: msaitoh@execsw.org
Subject: Re: kern/52876: The vlan(4) over wm(4) behaves something strange
Date: Fri, 29 Dec 2017 12:27:01 +0900
Hi.
On 2017/12/29 10:05, clare@csel.org wrote:
> The following reply was made to PR kern/52876; it has been noted by GNATS.
>
> From: clare@csel.org
> To: gnats-bugs@NetBSD.org
> Cc:
> Subject: Re: kern/52876: The vlan(4) over wm(4) behaves something strange
> Date: Fri, 29 Dec 2017 10:02:55 +0900
>
> Disabling TSO works around the problem.
>
> --
> Shinichi Doyashiki <clare@csel.org>
>
>
Have you checked /var/log/messages?
One possibility is:
> error = bus_dmamap_load_mbuf(sc->sc_dmat, dmamap, m0,
>     BUS_DMA_WRITE | BUS_DMA_NOWAIT);
> if (error) {
> 	if (error == EFBIG) {
> 		WM_Q_EVCNT_INCR(txq, txdrop);
> 		log(LOG_ERR, "%s: Tx packet consumes too many "
> 		    "DMA segments, dropping...\n",
> 		    device_xname(sc->sc_dev));
> 		wm_dump_mbuf_chain(sc, m0);
> 		m_freem(m0);
> 		continue;
> 	}
> 	/* Short on resources, just stop for now. */
> 	DPRINTF(WM_DEBUG_TX,
> 	    ("%s: TX: dmamap load failed: %d\n",
> 	    device_xname(sc->sc_dev), error));
> 	break;
> }
This error is reported via log(LOG_ERR), so it is not printed in dmesg but
in /var/log/messages.
Also, could you test with "options WM_EVENT_COUNTERS" in your
kernel config and show me the output of "vmstat -ev |grep wm"
after the problem occurs?
Thanks in advance.
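The two checks requested above can be sketched as follows. This is only an illustration: it assumes syslog writes kernel messages to the default /var/log/messages and that the kernel was built with "options WM_EVENT_COUNTERS". The function names are made up for this note; the search string is taken from the log() call quoted above.

```shell
#!/bin/sh
# Hypothetical helper names; only the log text and the vmstat pipeline
# come from this thread.

# Print wm(4) "too many DMA segments" drops recorded in a syslog file.
# Defaults to /var/log/messages when no file is given.
wm_txdrop_log() {
	logfile=${1:-/var/log/messages}
	grep -E 'wm[0-9]+: Tx packet consumes too many DMA segments' "$logfile"
}

# Show the wm(4) event counters (needs "options WM_EVENT_COUNTERS").
wm_evcnt() {
	vmstat -ev | grep wm
}
```

Running wm_txdrop_log with no argument after the hang shows whether the EFBIG path was taken; it prints nothing when no such drop was logged. wm_evcnt gives the counter output asked for above.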
--
-----------------------------------------------
SAITOH Masanobu (msaitoh@execsw.org
msaitoh@netbsd.org)
From: clare@csel.org
To: SAITOH Masanobu <msaitoh@execsw.org>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/52876: The vlan(4) over wm(4) behaves something strange
Date: Sat, 30 Dec 2017 11:12:38 +0900
On Fri, 29 Dec 2017 12:27:01 +0900
SAITOH Masanobu <msaitoh@execsw.org> wrote:
> Have you checked /var/log/messages?
> One possibility is:
>
> > error = bus_dmamap_load_mbuf(sc->sc_dmat, dmamap, m0,
> >     BUS_DMA_WRITE | BUS_DMA_NOWAIT);
> > if (error) {
> > 	if (error == EFBIG) {
> > 		WM_Q_EVCNT_INCR(txq, txdrop);
> > 		log(LOG_ERR, "%s: Tx packet consumes too many "
> > 		    "DMA segments, dropping...\n",
> > 		    device_xname(sc->sc_dev));
> > 		wm_dump_mbuf_chain(sc, m0);
> > 		m_freem(m0);
> > 		continue;
> > 	}
> > 	/* Short on resources, just stop for now. */
> > 	DPRINTF(WM_DEBUG_TX,
> > 	    ("%s: TX: dmamap load failed: %d\n",
> > 	    device_xname(sc->sc_dev), error));
> > 	break;
> > }
>
> This error is reported via log(LOG_ERR), so it is not printed in dmesg but
> in /var/log/messages.
I couldn't find any messages generated from wm in /var/log/messages.
> Also, could you test with "options WM_EVENT_COUNTERS" in your
> kernel config and show me the output of "vmstat -ev |grep wm"
> after the problem occurs?
I placed the log at the following URL:
https://www.csel.org/netbsd/pr/52876/vmstat-ev-wm-20171230.txt
--
Shinichi Doyashiki <clare@csel.org>
From: clare@csel.org
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/52876: The vlan(4) over wm(4) behaves something strange
Date: Sat, 30 Dec 2017 15:11:08 +0900
The problematic device is the i82583V.
Erratum 7 is complex; I don't understand it.
wm0 at pci1 dev 0 function 0: Intel i82583V (rev. 0x00)
wm0: interrupting at msi2 vec 0
wm0: PCI-Express bus
wm0: ASPM L0s and L1 are disabled to workaround the errata.
wm0: 512 words (8 address bits) SPI EEPROM, version 1.10.0, Image Unique ID ffffffff
wm0: Ethernet address 0c:e8:5c:**:**:**
wm0: 0x2a4440<SPI,IOH_VALID,PCIE,ASF_FIRM,AMT,WOL>
makphy0 at wm0 phy 1: Marvell 88E1149 Gigabit PHY, rev. 1
--
Shinichi Doyashiki <clare@csel.org>
From: clare@csel.org
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/52876: The vlan(4) over wm(4) behaves something strange
Date: Sat, 30 Dec 2017 17:37:55 +0900
The interface generates strange packets while the problem
is occurring. What is this?
17:31:05.780362 IP6 164.250.162.1 > 0.0.0.0: [|tcp]
17:31:06.820008 IP6 164.250.162.1 > 0.0.0.0: [|tcp]
17:31:08.859493 IP6 164.250.162.1 > 0.0.0.0: [|tcp]
--
Shinichi Doyashiki <clare@csel.org>
From: clare@csel.org
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/52876: The vlan(4) over wm(4) behaves something strange
Date: Mon, 1 Jan 2018 18:03:14 +0900
Transmit DMA seems to have halted on such packets.
This was not detected by the wm(4) watchdog logic.
I installed a kernel with WM_DEBUG and WM_DEBUG_TX enabled,
took a debug log, and placed it at the following URL:
https://www.csel.org/netbsd/pr/52876/wm-debug-tx-20180101.txt
How to repeat:
* set up an Intel i82583V.
* set up the wm(4) driver with all hardware offload flags enabled.
* set up a vlan(4) interface attached to the wm(4) and
enable its hardware offload flags, including TSO.
* connect to sshd via the vlan(4) and generate heavy traffic
(running cat /var/log/messages is sufficient).
How to work around:
* disabling TSO on the vlan(4) is sufficient.
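A minimal sketch of the workaround follows. The helper only prints the ifconfig commands, so they can be reviewed before being run. The function name tso_off_cmds and the interface names in the usage line are illustrative; -tso4 and -tso6 are the keywords documented in NetBSD ifconfig(8).

```shell
#!/bin/sh
# Print (do not run) the commands that disable TSO on each named interface.
# tso_off_cmds is a hypothetical helper name, not part of NetBSD.
tso_off_cmds() {
	for ifname in "$@"; do
		# -tso4 / -tso6 turn off TCP segmentation offload for IPv4 / IPv6.
		printf 'ifconfig %s -tso4 -tso6\n' "$ifname"
	done
}
```

For example, "tso_off_cmds vlan10 vlan11" prints one ifconfig line per interface; piping that output to sh as root applies it.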
--
Shinichi Doyashiki <clare@csel.org>
From: clare@csel.org
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/52876: The vlan(4) over wm(4) behaves something strange
Date: Tue, 2 Jan 2018 09:24:19 +0900
> How to repeat:
> * set up an Intel i82583V.
> * set up the wm(4) driver with all hardware offload flags enabled.
> * set up a vlan(4) interface attached to the wm(4) and
> enable its hardware offload flags, including TSO.
> * connect to sshd via the vlan(4) and generate heavy traffic
> (running cat /var/log/messages is sufficient).
On FreeBSD-11.1, the problem is not reproduced.
$ ifconfig em5
em5: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,
TSO4,WOL_MAGIC,VLAN_HWTSO>
ether 0c:e8:6c:**:**:**
hwaddr 0c:e8:6c:**:**:**
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
$ ifconfig em5.10
em5.10: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=103<RXCSUM,TXCSUM,TSO4>
ether 0c:e8:6c:**:**:**
inet 192.168.**.** netmask 0xffffff00 broadcast 192.168.**.255
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
vlan: 10 vlanpcp: 0 parent interface: em5
groups: vlan
--
Shinichi Doyashiki <clare@csel.org>