NetBSD Problem Report #57160
From bouyer@antioche.eu.org Wed Jan 4 19:40:03 2023
Return-Path: <bouyer@antioche.eu.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id C88191A9239
for <gnats-bugs@gnats.NetBSD.org>; Wed, 4 Jan 2023 19:40:03 +0000 (UTC)
Message-Id: <20230104193959.3483B10683@rochebonne.antioche.eu.org>
Date: Wed, 4 Jan 2023 20:39:59 +0100 (CET)
From: bouyer@antioche.eu.org
Reply-To: bouyer@antioche.eu.org
To: gnats-bugs@NetBSD.org
Subject: NFS (over TCP) very slow in 10_BETA
X-Send-Pr-Version: 3.95
>Number: 57160
>Category: kern
>Synopsis: NFS (over TCP) very slow in 10_BETA
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jan 04 19:45:00 +0000 2023
>Last-Modified: Tue Jan 31 14:00:01 +0000 2023
>Originator: Manuel Bouyer
>Release: NetBSD 10.0_BETA
>Organization:
>Environment:
System: NetBSD rochebonne.antioche.eu.org 10.0_BETA NetBSD 10.0_BETA (GENERIC_CAN) #0: Wed Jan 4 16:41:59 MET 2023 bouyer@armandeche.soc.lip6.fr:/local/armandeche1/tmp/build/amd64/obj/local/armandeche2/netbsd-10/src/sys/arch/amd64/compile/GENERIC_CAN amd64
Architecture: x86_64
Machine: amd64
>Description:
I have my mailbox (in mbox format) stored on an NFS server (which
is a cubieboard 2 with a SATA SSD). The NFS mount uses:
type nfs (nodev, nosuid, fsid: 0xb06/0x70b, reads: sync 0 async 0, writes: sync 0 async 0)
The client is connected to a gigabit switch (using an alc(4) interface);
the server is connected at 100 Mb/s.
Since upgrading the client to 10.0_BETA, loading the mailbox
with mutt is very slow, and the transfer seems to pause
several times.
Reading the mailbox with dd gives the expected speed
(I made sure there is no local cache in the way):
rochebonne:~> dd if=Mail/Inbox of=/dev/null
132+1 records in
132+1 records out
138625298 bytes transferred in 17.273 secs (8025548 bytes/sec)
I can't reproduce the slow read with dd.
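As a rough sanity check on the dd figure above, the following sketch recomputes the throughput from the byte count and elapsed time, then compares it with the server's 100 Mb/s link. The function names are illustrative only, and the comparison assumes the server link is the bottleneck and ignores Ethernet, IP, TCP and RPC overhead, so the utilisation is approximate.

```python
# Figures copied from the dd output in this report.
BYTES_TRANSFERRED = 138625298
ELAPSED_SECONDS = 17.273
# Assumption: the server's 100 Mb/s link is the bottleneck (decimal megabits).
LINK_BITS_PER_SECOND = 100_000_000


def throughput_bytes_per_second(nbytes: int, seconds: float) -> float:
    """Return the average transfer rate in bytes per second."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return nbytes / seconds


def link_utilisation(bytes_per_second: float, link_bits_per_second: float) -> float:
    """Return the fraction of the raw link rate carried as payload."""
    return bytes_per_second * 8 / link_bits_per_second


rate = throughput_bytes_per_second(BYTES_TRANSFERRED, ELAPSED_SECONDS)
utilisation = link_utilisation(rate, LINK_BITS_PER_SECOND)

if __name__ == "__main__":
    print(f"{rate:.0f} bytes/sec, {utilisation:.1%} of the raw link rate")
```

With these inputs the rate agrees with what dd printed and corresponds to roughly 64% of the raw 100 Mb/s line rate.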
I have a similar setup at work (but with a much faster NFS server)
and I didn't notice any change in mutt performance after the upgrade.
A tcpdump pcap file captured while loading the mailbox with mutt is
available on request by private mail.
>How-To-Repeat:
Store a mailbox on an NFS server and read it with mutt on a 10.0 system;
it may depend on the network hardware.
>Fix:
>Release-Note:
>Audit-Trail:
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/57160: NFS (over TCP) very slow in 10_BETA
Date: Thu, 5 Jan 2023 11:13:45 +0100
On Wed, Jan 04, 2023 at 07:45:00PM +0000, bouyer@antioche.eu.org wrote:
> >Description:
> I have my mailbox (in mbox format) stored on an NFS server (which
> is a cubieboard 2 with a SATA SSD). The NFS mount uses:
> type nfs (nodev, nosuid, fsid: 0xb06/0x70b, reads: sync 0 async 0, writes: sync 0 async 0)
> The client is connected to a gigabit switch (using an alc(4) interface);
> the server is connected at 100 Mb/s.
> Since upgrading the client to 10.0_BETA, loading the mailbox
> with mutt is very slow, and the transfer seems to pause
> several times.
> Reading the mailbox with dd gives the expected speed
> (I made sure there is no local cache in the way):
> rochebonne:~> dd if=Mail/Inbox of=/dev/null
> 132+1 records in
> 132+1 records out
> 138625298 bytes transferred in 17.273 secs (8025548 bytes/sec)
> I can't reproduce the slow read with dd.
But creating a tar archive with the files on NFS is also slower than it
used to be.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 years of experience will always make the difference
--
From: Martin Husemann <martin@duskware.de>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/57160: NFS (over TCP) very slow in 10_BETA
Date: Thu, 5 Jan 2023 11:27:34 +0100
On Thu, Jan 05, 2023 at 11:13:45AM +0100, Manuel Bouyer wrote:
> But creating a tar archive with the files on NFS is also slower than it
> used to be.
Can you check raw network speed?
I guess benchmarks/netio is the canonical first step.
Martin
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Martin Husemann <martin@duskware.de>
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/57160: NFS (over TCP) very slow in 10_BETA
Date: Thu, 5 Jan 2023 20:43:37 +0100
On Thu, Jan 05, 2023 at 11:27:34AM +0100, Martin Husemann wrote:
> On Thu, Jan 05, 2023 at 11:13:45AM +0100, Manuel Bouyer wrote:
> > But creating a tar archive with the files on NFS is also slower than it
> > used to be.
>
> Can you check raw network speed?
> I guess benchmarks/netio is the canonical first step.
Looks good:
rochebonne:/home/bouyer/tmp/netio>./netio -t 10.0.0.254
NETIO - Network Throughput Benchmark, Version 1.33
(C) 1997-2012 Kai Uwe Rommel
TCP connection established.
Packet size 1k bytes: 11.24 MByte/s Tx, 10.08 MByte/s Rx.
Packet size 2k bytes: 11.24 MByte/s Tx, 10.19 MByte/s Rx.
Packet size 4k bytes: 11.25 MByte/s Tx, 10.10 MByte/s Rx.
Packet size 8k bytes: 11.22 MByte/s Tx, 10.07 MByte/s Rx.
Packet size 16k bytes: 11.26 MByte/s Tx, 10.06 MByte/s Rx.
Packet size 32k bytes: 11.25 MByte/s Tx, 10.05 MByte/s Rx.
Done.
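To compare the netio figures above with the NFS numbers, a small sketch like the following can parse the result lines and average them. The regular expression and function names are assumptions made for this illustration, based only on the output format shown here. The report does not say whether netio's "MByte" means 10^6 or 2^20 bytes, so no unit conversion is attempted.

```python
import re
from statistics import mean

# netio result lines copied from this report.
NETIO_OUTPUT = """\
Packet size 1k bytes: 11.24 MByte/s Tx, 10.08 MByte/s Rx.
Packet size 2k bytes: 11.24 MByte/s Tx, 10.19 MByte/s Rx.
Packet size 4k bytes: 11.25 MByte/s Tx, 10.10 MByte/s Rx.
Packet size 8k bytes: 11.22 MByte/s Tx, 10.07 MByte/s Rx.
Packet size 16k bytes: 11.26 MByte/s Tx, 10.06 MByte/s Rx.
Packet size 32k bytes: 11.25 MByte/s Tx, 10.05 MByte/s Rx.
"""

# Assumed line format, inferred from the output above.
LINE_RE = re.compile(
    r"Packet size\s+(\d+)k bytes:\s+([\d.]+) MByte/s Tx,\s+([\d.]+) MByte/s Rx"
)


def parse_netio(text: str) -> list[tuple[int, float, float]]:
    """Return (packet_size_k, tx_rate, rx_rate) for each result line."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            size, tx, rx = match.groups()
            rows.append((int(size), float(tx), float(rx)))
    return rows


rows = parse_netio(NETIO_OUTPUT)
mean_tx = mean(tx for _, tx, _ in rows)
mean_rx = mean(rx for _, _, rx in rows)

if __name__ == "__main__":
    print(f"mean Tx {mean_tx:.2f} MByte/s, mean Rx {mean_rx:.2f} MByte/s")
```

The rates are nearly flat across packet sizes in both directions, which is consistent with the raw TCP path running close to what a 100 Mb/s link can carry.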
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 years of experience will always make the difference
--
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/57160: NFS (over TCP) very slow in 10_BETA
Date: Wed, 25 Jan 2023 23:35:56 +0100
I just noticed that if there is significant network traffic in the background
while reading my mailbox, NFS is back at full speed.
I noticed this while I had a file download running over ADSL.
A single ping has a noticeable effect; a ping -i0.1 gives an appreciable
boost.
Without background traffic, reading a small (45 messages) mailbox takes 6 or 7
seconds. With a background ping, reading the same mailbox takes half the
time. With ping -i0.1 it's about 0.5s.
For a large mailbox, with ping -i0.1 in the background it takes about 15s.
With a
cat /dev/zero | ssh someremotehost "cat > /dev/null"
in the background, this time is down to 4 or 5s, which is the speed I had with
NetBSD 9.x.
Maybe the issue is related to my network adapter (e.g. RX interrupts are
sometimes missed). It's a
[ 1.032289] alc0 at pci2 dev 0 function 0: Atheros AR8161 PCIe Gigabit Ethernet
[ 1.032289] alc0: interrupting at ioapic0 pin 16
[ 1.032289] alc0: Ethernet address 94:de:80:21:be:c0
[ 1.032289] atphy0 at alc0 phy 0: Atheros AR8035 10/100/1000 PHY, rev. 9
[ 1.032289] atphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, auto
This driver got some changes between -9 and -10; I have not investigated them yet.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 years of experience will always make the difference
--
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/57160: NFS (over TCP) very slow in 10_BETA
Date: Mon, 30 Jan 2023 21:48:23 +0100
On Wed, Jan 25, 2023 at 11:35:56PM +0100, Manuel Bouyer wrote:
> I just noticed that if there is significant network traffic in the background
> while reading my mailbox, NFS is back at full speed.
> I noticed this while I had a file download running over ADSL.
> A single ping has a noticeable effect; a ping -i0.1 gives an appreciable
> boost.
>
> Without background traffic, reading a small (45 messages) mailbox takes 6 or 7
> seconds. With a background ping, reading the same mailbox takes half the
> time. With ping -i0.1 it's about 0.5s.
>
> For a large mailbox, with ping -i0.1 in the background it takes about 15s.
> With a
> cat /dev/zero | ssh someremotehost "cat > /dev/null"
> in the background, this time is down to 4 or 5s, which is the speed I had with
> NetBSD 9.x.
>
> Maybe the issue is related to my network adapter (e.g. RX interrupts are
> sometimes missed). It's a
> [ 1.032289] alc0 at pci2 dev 0 function 0: Atheros AR8161 PCIe Gigabit Ethernet
> [ 1.032289] alc0: interrupting at ioapic0 pin 16
> [ 1.032289] alc0: Ethernet address 94:de:80:21:be:c0
> [ 1.032289] atphy0 at alc0 phy 0: Atheros AR8035 10/100/1000 PHY, rev. 9
> [ 1.032289] atphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, auto
>
> This driver got some changes between -9 and -10; I have not investigated them yet.
I reverted to the netbsd-9 driver (with 1.47 pulled up, as this is required
to build on netbsd-10) and this didn't change the performance.
So I guess the problem is not caused by a direct change in the driver.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 years of experience will always make the difference
--
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/57160: NFS (over TCP) very slow in 10_BETA
Date: Tue, 31 Jan 2023 14:57:39 +0100
On Mon, Jan 30, 2023 at 09:48:23PM +0100, Manuel Bouyer wrote:
> On Wed, Jan 25, 2023 at 11:35:56PM +0100, Manuel Bouyer wrote:
> > I just noticed that if there is significant network traffic in the background
> > while reading my mailbox, NFS is back at full speed.
> > I noticed this while I had a file download running over ADSL.
> > A single ping has a noticeable effect; a ping -i0.1 gives an appreciable
> > boost.
> >
> > Without background traffic, reading a small (45 messages) mailbox takes 6 or 7
> > seconds. With a background ping, reading the same mailbox takes half the
> > time. With ping -i0.1 it's about 0.5s.
> >
> > For a large mailbox, with ping -i0.1 in the background it takes about 15s.
> > With a
> > cat /dev/zero | ssh someremotehost "cat > /dev/null"
> > in the background, this time is down to 4 or 5s, which is the speed I had with
> > NetBSD 9.x.
> >
> > Maybe the issue is related to my network adapter (e.g. RX interrupts are
> > sometimes missed). It's a
> > [ 1.032289] alc0 at pci2 dev 0 function 0: Atheros AR8161 PCIe Gigabit Ethernet
> > [ 1.032289] alc0: interrupting at ioapic0 pin 16
> > [ 1.032289] alc0: Ethernet address 94:de:80:21:be:c0
> > [ 1.032289] atphy0 at alc0 phy 0: Atheros AR8035 10/100/1000 PHY, rev. 9
> > [ 1.032289] atphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, auto
> >
> > This driver got some changes between -9 and -10; I have not investigated them yet.
>
> I reverted to the netbsd-9 driver (with 1.47 pulled up, as this is required
> to build on netbsd-10) and this didn't change the performance.
> So I guess the problem is not caused by a direct change in the driver.
A way to work around the problem is to use a UDP mount.
With UDP:
rochebonne:~> /usr/bin/time mutt -f Mail/Inbox
3.59 real 0.09 user 0.05 sys
With TCP:
rochebonne:~> /usr/bin/time mutt -f Mail/Inbox
37.91 real 0.08 user 0.06 sys
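To put the two timings above side by side, the following sketch parses the /usr/bin/time lines and separates CPU time from time spent waiting. The helper names are hypothetical, and treating "real minus user minus sys" as waiting time is a simplification that assumes a single process.

```python
# Timing lines copied from this report.
UDP_LINE = "3.59 real 0.09 user 0.05 sys"
TCP_LINE = "37.91 real 0.08 user 0.06 sys"


def parse_time_line(line: str) -> dict[str, float]:
    """Parse a line like '3.59 real 0.09 user 0.05 sys' into a dict of floats."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"unexpected time output: {line!r}")
    # Fields alternate between a value and its label.
    return {label: float(value) for value, label in zip(fields[0::2], fields[1::2])}


def waiting_seconds(timing: dict[str, float]) -> float:
    """Elapsed time not accounted for by user or system CPU time."""
    return timing["real"] - timing["user"] - timing["sys"]


udp = parse_time_line(UDP_LINE)
tcp = parse_time_line(TCP_LINE)
slowdown = tcp["real"] / udp["real"]

if __name__ == "__main__":
    print(f"TCP takes {slowdown:.1f}x as long as UDP")
    print(f"waiting: UDP {waiting_seconds(udp):.2f}s, TCP {waiting_seconds(tcp):.2f}s")
```

CPU time is about 0.14s in both runs, so nearly all of the extra elapsed time under TCP is spent waiting rather than computing. The TCP run takes roughly 10.6 times as long as the UDP run.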
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 years of experience will always make the difference
--
>Unformatted: