NetBSD Problem Report #53216

From bouyer@antioche.eu.org  Thu Apr 26 08:22:37 2018
Return-Path: <bouyer@antioche.eu.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 628B87A177
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 26 Apr 2018 08:22:37 +0000 (UTC)
Message-Id: <20180426082230.2AAB627FB@rochebonne.antioche.eu.org>
Date: Thu, 26 Apr 2018 10:22:30 +0200 (CEST)
From: bouyer@antioche.eu.org
Reply-To: bouyer@antioche.eu.org
To: gnats-bugs@NetBSD.org
Subject: sunxi awge is unreliable at gigabit speed
X-Send-Pr-Version: 3.95

>Number:         53216
>Category:       port-arm
>Synopsis:       sunxi awge is unreliable at gigabit speed
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    port-arm-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Apr 26 08:25:00 +0000 2018
>Last-Modified:  Fri Apr 27 07:15:00 +0000 2018
>Originator:     Manuel Bouyer
>Release:        NetBSD 8.99.14
>Organization:
>Environment:
System: NetBSD lime2 8.99.14 NetBSD 8.99.14 (SUNXI_CAN) #21: Wed Apr 25 14:57:43 CEST 2018 bouyer@bip.soc.lip6.fr:/dsk/l1/misc/bouyer/tmp/evbarm-earmhf/obj/dsk/l1/misc/bouyer/HEAD/clean/src/sys/arch/evbarm/compile/SUNXI_CAN evbarm
Architecture: earmv7hf
Machine: evbarm

>Description:
	On an Olimex Lime2 board with:
awge0 at fdt1 (/soc@1c00000/ethernet@1c50000)fdt: [ethernet@1c50000] decoded addr #0: 1c50000 -> 1c50000
: GMAC
awge0: interrupting on GIC irq 117
awge0: Ethernet address: 02:c7:04:82:c2:37
rgephy0 at awge0 phy 0: RTL8169S/8110S/8211 1000BASE-T media interface, rev. 5
rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
rgephy1 at awge0 phy 1: RTL8169S/8110S/8211 1000BASE-T media interface, rev. 5
rgephy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
awge0: WARNING: power management not supported

	(note that the PHY attaches twice), the network is sometimes
	unreliable. When connected to a 100Mb/s Cisco switch everything
	works fine. When connected to a 1Gb/s D-Link switch, the green LED
	on the board's Ethernet connector (link state) flashes about once
	per second and there is packet loss:
100 packets transmitted, 96 packets received, 4.0% packet loss
round-trip min/avg/max/stddev = 0.286974/1.172346/10.286504/2.295635 ms
	(scp is also much slower than it should be).
	The link state on the switch side and in the ifconfig output doesn't
	show this down/up problem.
	While the green LED is off on the board's Ethernet connector, the
	yellow LED (link activity) still seems to flash as usual.
	Also, the ping packet loss doesn't reflect the LED off/on ratio.

>How-To-Repeat:
	Connect a Lime2 to a 1Gb/s switch.
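	To compare runs on different switches or kernels, the loss figure
	can be taken from the ping summary line. The following is only a
	minimal sketch, not part of the original report: the function name
	parse_ping_summary is hypothetical, and it assumes the summary
	format shown in the description above.

```python
import re

# Matches the summary line printed by ping, in the format quoted in this
# report, e.g.:
#   "100 packets transmitted, 96 packets received, 4.0% packet loss"
SUMMARY_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) packets received, ([\d.]+)% packet loss"
)


def parse_ping_summary(text):
    """Return (sent, received, loss_percent), or None if no summary is found.

    parse_ping_summary is a hypothetical helper name used for illustration.
    """
    match = SUMMARY_RE.search(text)
    if match is None:
        return None
    sent = int(match.group(1))
    received = int(match.group(2))
    # Recompute the loss from the counters instead of trusting the
    # rounded percentage that ping printed.
    loss = 100.0 * (sent - received) / sent if sent else 0.0
    return sent, received, loss


# The summary line from the description of this report.
sample = "100 packets transmitted, 96 packets received, 4.0% packet loss"
print(parse_ping_summary(sample))
```

	Feeding it the output of "ping -c 100 <host>" from the 100Mb/s and
	the 1Gb/s setups gives two tuples that can be compared directly.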
>Fix:
	unknown

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-arm/53216: sunxi awge is unreliable at gigabit speed
Date: Thu, 26 Apr 2018 11:21:01 +0200

 FWIW, my cubietruck does this when doing ping -c 10000 -f $host on a gigE
 link:

 10000 packets transmitted, 10000 packets received, 0.0% packet loss
 round-trip min/avg/max/stddev = 0.548294/1.014064/5.901567/0.262285 ms
   984.0 packets/sec sent,  984.0 packets/sec received


 Martin

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: port-arm-maintainer@netbsd.org, gnats-admin@netbsd.org,
        netbsd-bugs@netbsd.org
Subject: Re: port-arm/53216: sunxi awge is unreliable at gigabit speed
Date: Thu, 26 Apr 2018 11:36:14 +0200

 On Thu, Apr 26, 2018 at 09:25:01AM +0000, Martin Husemann wrote:
 > The following reply was made to PR port-arm/53216; it has been noted by GNATS.
 > 
 > From: Martin Husemann <martin@duskware.de>
 > To: gnats-bugs@NetBSD.org
 > Cc: 
 > Subject: Re: port-arm/53216: sunxi awge is unreliable at gigabit speed
 > Date: Thu, 26 Apr 2018 11:21:01 +0200
 > 
 >  FWIW, my cubietruck does this when doing ping -c 10000 -f $host on a gigE
 >  link:
 >  
 >  10000 packets transmitted, 10000 packets received, 0.0% packet loss
 >  round-trip min/avg/max/stddev = 0.548294/1.014064/5.901567/0.262285 ms
 >    984.0 packets/sec sent,  984.0 packets/sec received

 How does the ethernet attach ?
 Does it also see 2 PHYs ?

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 years of experience will always make the difference
 --

From: Martin Husemann <martin@duskware.de>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@NetBSD.org, port-arm-maintainer@netbsd.org,
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-arm/53216: sunxi awge is unreliable at gigabit speed
Date: Thu, 26 Apr 2018 11:41:21 +0200

 On Thu, Apr 26, 2018 at 11:36:14AM +0200, Manuel Bouyer wrote:
 > How does the ethernet attach ?
 > Does it also see 2 PHYs ?

 Yes, there are two phys in the soc, only one is connected on the cubietruck
 (AFAIK).

 awge0 at fdt1: GMAC
 awge0: interrupting on GIC irq 117
 awge0: Ethernet address: 02:0e:03:41:63:14
 rgephy0 at awge0 phy 0: RTL8169S/8110S/8211 1000BASE-T media interface, rev. 5
 rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
 rgephy1 at awge0 phy 1: RTL8169S/8110S/8211 1000BASE-T media interface, rev. 5
 rgephy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto

 and:

 awge0: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
         ec_capabilities=1<VLAN_MTU>
         ec_enabled=0
         address: 02:0e:03:41:63:14
         media: Ethernet autoselect (1000baseT full-duplex)
         status: active
 [..]

 Which reminds me I need to add checksum offload to the driver.

 Martin

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Martin Husemann <martin@duskware.de>
Cc: gnats-bugs@NetBSD.org, port-arm-maintainer@netbsd.org,
        gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-arm/53216: sunxi awge is unreliable at gigabit speed
Date: Thu, 26 Apr 2018 11:46:38 +0200

 On Thu, Apr 26, 2018 at 11:41:21AM +0200, Martin Husemann wrote:
 > On Thu, Apr 26, 2018 at 11:36:14AM +0200, Manuel Bouyer wrote:
 > > How does the ethernet attach ?
 > > Does it also see 2 PHYs ?
 > 
 > Yes, there are two phys in the soc, only one is connected on the cubietruck
 > (AFAIK).
 > 
 > awge0 at fdt1: GMAC
 > awge0: interrupting on GIC irq 117
 > awge0: Ethernet address: 02:0e:03:41:63:14
 > rgephy0 at awge0 phy 0: RTL8169S/8110S/8211 1000BASE-T media interface, rev. 5
 > rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
 > rgephy1 at awge0 phy 1: RTL8169S/8110S/8211 1000BASE-T media interface, rev. 5
 > rgephy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto

 Actually I don't think there are 2 PHYs: in the cubietruck schematics, as
 well as the lime2 schematics, and by board inspection, there is only one
 PHY. I think the same PHY is responding on 2 different addresses.

 Also, it looks like the lime2 revision I have and the cubietruck have
 the same 8211 variant.

 I'll have to investigate some more on my side, then.

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 years of experience will always make the difference
 --

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Martin Husemann <martin@duskware.de>
Cc: gnats-bugs@NetBSD.org, port-arm-maintainer@netbsd.org,
        gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-arm/53216: sunxi awge is unreliable at gigabit speed
Date: Thu, 26 Apr 2018 19:46:32 +0200

 On Thu, Apr 26, 2018 at 11:46:38AM +0200, Manuel Bouyer wrote:
 > > Yes, there are two phys in the soc, only one is connected on the cubietruck
 > > (AFAIK).
 > > 
 > > awge0 at fdt1: GMAC
 > > awge0: interrupting on GIC irq 117
 > > awge0: Ethernet address: 02:0e:03:41:63:14
 > > rgephy0 at awge0 phy 0: RTL8169S/8110S/8211 1000BASE-T media interface, rev. 5
 > > rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
 > > rgephy1 at awge0 phy 1: RTL8169S/8110S/8211 1000BASE-T media interface, rev. 5
 > > rgephy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
 > 
 > Actually I don't think there are 2 PHYs: in the cubietruck schematics, as
 > well as the lime2 schematics, and by board inspection, there is only one
 > PHY. I think the same PHY is responding on 2 different addresses.
 > 
 > Also, it looks like the lime2 revision I have and the cubietruck have
 > the same 8211 variant.
 > 
 > I'll have to investigate some more on my side, then.

 Could be some uninitialized register problem.
 I tried to apply the no-rx-delay property for my board in sunxi_platform.c;
 this made the problem worse (more packet loss with ping, I couldn't
 even connect via ssh, and the link LED was off on the board but still on
 at the switch). Rebooting with a known-good kernel didn't get me a working
 Ethernet; I had to power cycle to get back to the previous working state.

 Now to find what appropriate "no-rx-delay" values would work for this
 board ...

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 years of experience will always make the difference
 --

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-arm/53216: sunxi awge is unreliable at gigabit speed
Date: Fri, 27 Apr 2018 09:07:57 +0200

 On Thu, Apr 26, 2018 at 05:50:01PM +0000, Manuel Bouyer wrote:
 >  Could be some uninitialized register problem.

 May depend on the u-boot version too, I have:

 U-Boot 2017.11 (Dec 05 2017 - 14:37:38 +0100) Allwinner Technology

 arm-none-eabi-gcc (GCC) 7.2.0
 GNU ld (GNU Binutils) 2.29


 (IIRC I built that myself from pkgsrc)

 Martin

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-arm/53216: sunxi awge is unreliable at gigabit speed
Date: Fri, 27 Apr 2018 09:11:53 +0200

 On Fri, Apr 27, 2018 at 07:10:01AM +0000, Martin Husemann wrote:
 >  May depend on the u-boot version too, I have:
 >  
 >  U-Boot 2017.11 (Dec 05 2017 - 14:37:38 +0100) Allwinner Technology

 And also I am netbooting the machine (via gigE).

 Martin
