NetBSD Problem Report #57613
From www@netbsd.org Sun Sep 10 11:40:27 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id BE7131A9238
for <gnats-bugs@gnats.NetBSD.org>; Sun, 10 Sep 2023 11:40:26 +0000 (UTC)
Message-Id: <20230910114025.5E2451A9239@mollari.NetBSD.org>
Date: Sun, 10 Sep 2023 11:40:25 +0000 (UTC)
From: herdware@sdf.org
Reply-To: herdware@sdf.org
To: gnats-bugs@NetBSD.org
Subject: Abysmal network performance on shark
X-Send-Pr-Version: www-1.0
>Number: 57613
>Category: port-shark
>Synopsis: Abysmal network performance on shark
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: port-shark-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Sep 10 11:45:00 +0000 2023
>Last-Modified: Sat Sep 16 12:20:01 +0000 2023
>Originator: Björn Johannesson
>Release: 10_BETA
>Organization:
>Environment:
NetBSD dnard.home.lan 10.0_BETA NetBSD 10.0_BETA (GENERIC) #1: Wed Sep 6 21:54:35 CEST 2023 herdware@regin.home.lan:/usr/obj/sys/arch/shark/compile/GENERIC shark
>Description:
Network performance is ABYSMAL on shark. Speed is often 1.5-5 kbyte/s with frequent stalling when using programs like ftp. Tried playing a wav-file from an nfs-share and it hung; can't break out with ^C.
Trying to use nfs as root hangs after setting the hostname.
However, booting in single user mode (root on nfs) and playing that same wav-file works, which is strange.
>How-To-Repeat:
Use the builtin cs0 ethernet on a shark.
>Fix:
>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-shark/57613: Abysmal network performance on shark
Date: Mon, 11 Sep 2023 12:49:42 +0200
This could depend on your switch or the negotiated link.
Can you check ifconfig output and also error counters?
I have a diskless shark and it certainly isn't fast when using the network,
but not as bad as you describe.
Martin
From: Björn Johannesson <herdware@sdf.org>
To: gnats-bugs@netbsd.org
Cc: port-shark-maintainer@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: port-shark/57613: Abysmal network performance on shark
Date: Mon, 11 Sep 2023 21:39:52 +0000 (UTC)
On Mon, 11 Sep 2023, Martin Husemann wrote:
>
> This could depend on your switch or the negotiated link.
> Can you check ifconfig output and also error counters?
>
Well this is embarrassing. I took all the things you wanted and tested two
separate network switches. The problem turns out that "mediaopt full-duplex"
fubars everything. Don't even remember that. I briefly tested this with
netbsd-9 before going to netbsd-10 and I can remember that network was
pretty pants on that too. Unsure of when that full-duplex was added
though. Removing full-duplex makes the net behave normally.
FTR there was zero in the error counters even when using full-duplex.
Perhaps it would be possible to turn off the possibility to enable
full-duplex since it's so broken? (Other stuff connected to that switch
runs fine full-duplex.)
The other problem that nfs-root hangs going multi-user turns out
to be that /etc/rc.d/network flushes routes after setting (or not in this
case) NIS/YP domain. Since this shark is on another subnet than the file
server and the gateway the machine got from dhcpd got flushed things
break.
Setting flushroutes=NO in rc.conf fixes the problem but is that the
intended way? After network_start_domainname() flushes the route we can
never reach network_start_defaultroute() which comes later.
Anyway, sorry for the noise and please close this PR.
/B
From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-shark/57613: Abysmal network performance on shark
Date: Tue, 12 Sep 2023 06:03:20 -0000 (UTC)
herdware@sdf.org (Björn Johannesson) writes:
>Perhaps it would be possible to turn off the possibility to enable
>full-duplex since it's so broken? (Other stuff connected to that switch
>runs fine full-duplex.)
Machine and switch both need to use auto-negotiation or both need to
be statically configured to the same duplex mode. Mixing auto and
static configuration does not work, and the default for an interface
is to use auto negotiation.
>The other problem that nfs-root hangs going multi-user turns out
>to be that /etc/rc.d/network flushes routes after setting (or not in this
>case) NIS/YP domain. Since this shark is on another subnet than the file
>server and the gateway the machine got from dhcpd got flushed things
>break.
>Setting flushroutes=NO in rc.conf fixes the problem but is that the
>intended way? After network_start_domainname() flushes the route we can
>never reach network_start_defaultroute() which comes later.
It would be smarter if /etc/rc.d/network knew how to handle NFS roots.
From: Havard Eidnes <he@NetBSD.org>
To: herdware@sdf.org
Cc: gnats-bugs@netbsd.org
Subject: Re: port-shark/57613: Abysmal network performance on shark
Date: Sat, 16 Sep 2023 14:18:32 +0200 (CEST)
> Well this is embarrassing. I took all the things you wanted and
> tested two separate network switches. The problem turns out
> that "mediaopt full-duplex" fubars everything. Don't even
> remember that. I briefly tested this with netbsd-9 before going
> to netbsd-10 and I can remember that network was pretty pants
> on that too. Unsure of when that full-duplex was added
> though. Removing full-duplex makes the net behave normally.
>
> FTR there was zero in the error counters even when using full-duplex.
Oh, ancient knowledge about 100Mbit/s ethernet and auto-
negotiation status and results becomes useful again :)
Setting the host to "static full-duplex" causes auto-negotiation
to be disabled on the host, and *the other end* if it is set up
to do auto-negotiation, will *by the spec* then need to fall back
to half-duplex. The half-duplex end will then end up seeing
"jabber errors", i.e. the host "talking" while the local half-
duplex end is also transmitting, so the half-duplex rules are not
followed by the host (naturally). It thus makes perfect sense
that you do not see the resulting errors on the host doing
full-duplex, as they will be seen on the switch end instead.
So it *does* have an explanation...
Configurations which are "OK":
  host         switch
  100M auto    100M auto    <-- recommended if supported
  100M FDX     100M FDX
  100M HDX     100M HDX
  100M HDX     100M auto
  100M auto    100M HDX
Configurations which result in duplex-conflict, and are thus
*not* OK:
  host         switch
  100M FDX     100M auto
  100M auto    100M FDX
  100M FDX     100M HDX
  100M HDX     100M FDX
Regards,
- Håvard
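
The two tables above can be reproduced with a small sketch. It is only an illustration of the rule described in this thread, not code from NetBSD or from any driver: a statically configured end uses its configured duplex, and an auto-negotiating end gets full duplex only when the peer also negotiates, otherwise it falls back to half duplex. The function names (effective_duplex, link_ok) and the mode strings are made up for this example, and it assumes both ends support full duplex when they do negotiate.

```python
# Illustrative model of duplex resolution between a host and a switch.
# Modes: "auto" (auto-negotiation), "fdx" (static full), "hdx" (static half).
# Assumption: both ends are capable of full duplex when negotiating.

MODES = ("auto", "fdx", "hdx")


def effective_duplex(local, peer):
    """Return the duplex ("fdx" or "hdx") that the local end ends up using."""
    if local not in MODES or peer not in MODES:
        raise ValueError("mode must be one of %r" % (MODES,))
    if local != "auto":
        # A static setting is used as-is; negotiation is disabled.
        return local
    # Auto-negotiating end: full duplex only if the peer negotiates too,
    # otherwise it must fall back to half duplex.
    return "fdx" if peer == "auto" else "hdx"


def link_ok(host, switch):
    """True when both ends end up with the same duplex (no duplex conflict)."""
    return effective_duplex(host, switch) == effective_duplex(switch, host)


if __name__ == "__main__":
    # Print every host/switch combination with the resulting verdict.
    for host in MODES:
        for switch in MODES:
            verdict = "OK" if link_ok(host, switch) else "duplex conflict"
            print("host %-4s  switch %-4s  %s" % (host, switch, verdict))
```

For the case in this PR, link_ok("fdx", "auto") is False: the host forces full duplex, the auto-negotiating switch falls back to half duplex, and the two ends disagree.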
Copyright © 1994-2023
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.