NetBSD Problem Report #57613

From www@netbsd.org  Sun Sep 10 11:40:27 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id BE7131A9238
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 10 Sep 2023 11:40:26 +0000 (UTC)
Message-Id: <20230910114025.5E2451A9239@mollari.NetBSD.org>
Date: Sun, 10 Sep 2023 11:40:25 +0000 (UTC)
From: herdware@sdf.org
Reply-To: herdware@sdf.org
To: gnats-bugs@NetBSD.org
Subject: Abysmal network performance on shark
X-Send-Pr-Version: www-1.0

>Number:         57613
>Category:       port-shark
>Synopsis:       Abysmal network performance on shark
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-shark-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Sep 10 11:45:00 +0000 2023
>Last-Modified:  Sat Sep 16 12:20:01 +0000 2023
>Originator:     Björn Johannesson
>Release:        10_BETA
>Organization:
>Environment:
NetBSD dnard.home.lan 10.0_BETA NetBSD 10.0_BETA (GENERIC) #1: Wed Sep  6 21:54:35 CEST 2023  herdware@regin.home.lan:/usr/obj/sys/arch/shark/compile/GENERIC shark
>Description:
Network performance is ABYSMAL on shark. Speed is often 1.5-5 kbyte/s, with frequent stalling, when using programs like ftp. I tried playing a WAV file from an NFS share and it hung; I can't break out with ^C.

Trying to use nfs as root hangs after setting the hostname.
However, booting in single-user mode (root on NFS) and playing that same WAV file works, which is strange.
>How-To-Repeat:
Use the builtin cs0 ethernet on a shark.
>Fix:

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-shark/57613: Abysmal network performance on shark
Date: Mon, 11 Sep 2023 12:49:42 +0200

 This could depend on your switch or the negotiated link.
 Can you check ifconfig output and also error counters?

 I have a diskless shark and it certainly isn't fast when using the network,
 but not as bad as you describe.

 Martin

From: Björn Johannesson <herdware@sdf.org>
To: gnats-bugs@netbsd.org
Cc: port-shark-maintainer@netbsd.org, gnats-admin@netbsd.org,
        netbsd-bugs@netbsd.org
Subject: Re: port-shark/57613: Abysmal network performance on shark
Date: Mon, 11 Sep 2023 21:39:52 +0000 (UTC)

 On Mon, 11 Sep 2023, Martin Husemann wrote:

 >
 > This could depend on your switch or the negotiated link.
 > Can you check ifconfig output and also error counters?
 >
 Well this is embarrassing. I took all the things you wanted and tested two 
 separate network switches. The problem turns out that "mediaopt full-duplex"
 fubars everything. Don't even remember that. I briefly tested this with 
 netbsd-9 before going to netbsd-10 and I can remember that network was 
 pretty pants on that too. Unsure of when that full-duplex was added 
 though. Removing full-duplex makes the net behave normally.

 FTR there was zero in the error counters even when using full-duplex.

 Perhaps it would be possible to turn off the possibility to enable 
 full-duplex since it's so broken? (Other stuff connected to that switch 
 runs fine full-duplex.)

 The other problem that nfs-root hangs going multi-user turns out 
 to be that /etc/rc.d/network flushes routes after setting (or not in this 
 case) NIS/YP domain. Since this shark is on another subnet than the file 
 server and the gateway the machine got from dhcpd got flushed things 
 break.
 Setting flushroutes=NO in rc.conf fixes the problem but is that the 
 intended way? After network_start_domainname() flushes the route we can 
 never reach network_start_defaultroute() which comes later.

 Anyway, sorry for the noise and please close this PR.

 /B

From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-shark/57613: Abysmal network performance on shark
Date: Tue, 12 Sep 2023 06:03:20 -0000 (UTC)

 herdware@sdf.org (Björn Johannesson) writes:

 >Perhaps it would be possible to turn off the possibility to enable 
 >full-duplex since it's so broken? (Other stuff connected to that switch 
 >runs fine full-duplex.)

 Machine and switch both need to use auto-negotiation or both need to
 be statically configured to the same duplex mode. Mixing auto and
 static configuration does not work, and the default for an interface
 is to use auto negotiation.


 >The other problem that nfs-root hangs going multi-user turns out 
 >to be that /etc/rc.d/network flushes routes after setting (or not in this 
 >case) NIS/YP domain. Since this shark is on another subnet than the file 
 >server and the gateway the machine got from dhcpd got flushed things 
 >break.
 >Setting flushroutes=NO in rc.conf fixes the problem but is that the 
 >intended way? After network_start_domainname() flushes the route we can 
 >never reach network_start_defaultroute() which comes later.

 It would be smarter if /etc/rc.d/network knew how to handle NFS roots.
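
 The ordering problem described above can be illustrated with a toy model.
 This is only a sketch under stated assumptions: the step names, the
 boot() function and its parameters are hypothetical and are not taken
 from /etc/rc.d/network. The model assumes only what the report says: the
 routes are flushed before the default route is set again, and the NFS
 root is reachable only through the gateway learned from dhcpd.

```python
def boot(flushroutes, root_on_remote_subnet=True):
    """Toy model of the boot sequence; returns (completed_steps, outcome)."""
    # Assumption: the gateway from DHCP is installed before these steps run.
    routes = {"default"}
    completed = []
    # Hypothetical step names, in the order described in the report.
    for step in ("domainname", "flushroutes", "defaultroute"):
        # With root on an NFS server in another subnet, nothing more can be
        # read from the root file system once the default route is gone.
        if root_on_remote_subnet and "default" not in routes:
            return completed, "hang"
        if step == "flushroutes":
            if flushroutes:
                routes.clear()
        elif step == "defaultroute":
            routes.add("default")
        completed.append(step)
    return completed, "ok"


if __name__ == "__main__":
    print(boot(flushroutes=True))   # flush enabled
    print(boot(flushroutes=False))  # models flushroutes=NO in rc.conf
```

 In this model the flush strands the boot before the default route step
 is reached, while the flushroutes=NO case runs through all three steps.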

From: Havard Eidnes <he@NetBSD.org>
To: herdware@sdf.org
Cc: gnats-bugs@netbsd.org
Subject: Re: port-shark/57613: Abysmal network performance on shark
Date: Sat, 16 Sep 2023 14:18:32 +0200 (CEST)

 > Well this is embarrassing. I took all the things you wanted and
 > tested two separate network switches. The problem turns out
 > that "mediaopt full-duplex" fubars everything. Don't even
 > remember that. I briefly tested this with netbsd-9 before going
 > to netbsd-10 and I can remember that network was pretty pants
 > on that too. Unsure of when that full-duplex was added
 > though. Removing full-duplex makes the net behave normally.
 >
 > FTR there was zero in the error counters even when using full-duplex.


 Oh, ancient knowledge about 100Mbit/s ethernet and auto-
 negotiation status and results becomes useful again :)

 Setting the host to "static full-duplex" causes auto-negotiation
 to be disabled on the host, and *the other end* if it is set up
 to do auto-negotiation, will *by the spec* then need to fall back
 to half-duplex.  The half-duplex end will then end up seeing
 "jabber errors", i.e. the host "talking" while the local half-
 duplex end is also transmitting, so the half-duplex rules are not
 followed by the host (naturally).  It thus makes perfect sense
 that you do not see the resulting errors on the host doing
 full-duplex, as they will be seen on the switch end instead.

 So it *does* have an explanation...

 Configurations which are "OK":

 host       switch
 100M auto  100M auto   <-- recommended if supported
 100M FDX   100M FDX
 100M HDX   100M HDX
 100M HDX   100M auto
 100M auto  100M HDX

 Configurations which result in duplex-conflict, and are thus
 *not* OK:

 host       switch
 100M FDX   100M auto
 100M auto  100M FDX
 100M FDX   100M HDX
 100M HDX   100M FDX
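
 The two tables above follow from one rule, which can be sketched in a
 few lines. This is a minimal illustration, not driver code: the function
 names resolve_duplex() and link_ok() and the strings "auto", "FDX" and
 "HDX" are made up for the example. It assumes both ends are capable of
 full-duplex, so that two auto-negotiating ends agree on it.

```python
def resolve_duplex(setting, peer_setting):
    """Return the duplex mode ("FDX" or "HDX") that one end ends up using.

    setting and peer_setting are each "auto", "FDX" or "HDX".
    """
    if setting != "auto":
        # A static setting is used as-is; auto-negotiation is disabled.
        return setting
    if peer_setting == "auto":
        # Both ends negotiate; assumed here to settle on full-duplex.
        return "FDX"
    # The peer does not negotiate, so this end falls back to half-duplex.
    return "HDX"


def link_ok(host, switch):
    """True if both ends of the link end up in the same duplex mode."""
    return resolve_duplex(host, switch) == resolve_duplex(switch, host)


if __name__ == "__main__":
    for host in ("auto", "FDX", "HDX"):
        for switch in ("auto", "FDX", "HDX"):
            verdict = "OK" if link_ok(host, switch) else "duplex conflict"
            print(f"{host:5} {switch:5} {verdict}")
```

 Running it prints one line for each of the nine host/switch combinations
 listed in the two tables.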

 Regards,

 - Håvard
