NetBSD Problem Report #57160

From bouyer@antioche.eu.org  Wed Jan  4 19:40:03 2023
Return-Path: <bouyer@antioche.eu.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id C88191A9239
	for <gnats-bugs@gnats.NetBSD.org>; Wed,  4 Jan 2023 19:40:03 +0000 (UTC)
Message-Id: <20230104193959.3483B10683@rochebonne.antioche.eu.org>
Date: Wed,  4 Jan 2023 20:39:59 +0100 (CET)
From: bouyer@antioche.eu.org
Reply-To: bouyer@antioche.eu.org
To: gnats-bugs@NetBSD.org
Subject: NFS (over TCP) very slow in 10_BETA
X-Send-Pr-Version: 3.95

>Number:         57160
>Category:       kern
>Synopsis:       NFS (over TCP) very slow in 10_BETA
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jan 04 19:45:00 +0000 2023
>Last-Modified:  Tue Jan 31 14:00:01 +0000 2023
>Originator:     Manuel Bouyer
>Release:        NetBSD 10.0_BETA
>Organization:
>Environment:
System: NetBSD rochebonne.antioche.eu.org 10.0_BETA NetBSD 10.0_BETA (GENERIC_CAN) #0: Wed Jan 4 16:41:59 MET 2023 bouyer@armandeche.soc.lip6.fr:/local/armandeche1/tmp/build/amd64/obj/local/armandeche2/netbsd-10/src/sys/arch/amd64/compile/GENERIC_CAN amd64
Architecture: x86_64
Machine: amd64
>Description:
	I have my mailbox (in mbox format) stored on an NFS server (which
	is a cubieboard 2 with a SATA SSD). The NFS mount uses:
	type nfs (nodev, nosuid, fsid: 0xb06/0x70b, reads: sync 0 async 0, writes: sync 0 async 0)
	The client is connected to a gigabit switch (using an alc(4) interface),
	the server is connected at 100Mb/s.
	Since upgrading the client to 10.0_BETA, loading the mailbox
	with mutt is very slow, and the transfer seems to pause
	several times.
	Reading the mailbox with dd gives the expected speed
	(I made sure there is no local cache in the way):
	rochebonne:~> dd if=Mail/Inbox of=/dev/null
	132+1 records in
	132+1 records out
	138625298 bytes transferred in 17.273 secs (8025548 bytes/sec)
	I can't reproduce the slow read with dd.
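
As a rough cross-check of the dd figure above, the sketch below recomputes the transfer rate and compares it with the raw capacity of the server's 100Mb/s link. The 12.5e6 bytes/s ceiling ignores Ethernet, IP, TCP and RPC overhead and is an assumption; the function names are hypothetical.

```sh
#!/bin/sh
# Recompute the dd transfer rate and express it as a share of the link rate.
# Assumption: a 100Mb/s link carries at most 100e6 / 8 bytes per second,
# ignoring all protocol overhead.

# throughput BYTES SECONDS: print bytes per second, rounded.
throughput() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.0f\n", b / s }'
}

# link_share BYTES SECONDS LINK_BITS_PER_SEC: print percent of raw link rate.
link_share() {
    awk -v b="$1" -v s="$2" -v l="$3" \
        'BEGIN { printf "%.1f\n", 100 * (b / s) / (l / 8) }'
}

# Figures taken from the dd output above.
throughput 138625298 17.273            # prints 8025548
link_share 138625298 17.273 100000000  # prints 64.2
```

Under that assumption dd already uses about two thirds of the raw link rate, which is consistent with the sequential read path being healthy.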

	I have a similar setup at work (but with a much faster NFS server)
	and I didn't notice any change in mutt performance on upgrade.

	A tcpdump pcap file captured while loading the mailbox with mutt is
	available on request by private mail.

>How-To-Repeat:
	Store a mailbox on an NFS server and read it with mutt on a 10.0 system;
	the result may depend on the network hardware.
>Fix:


>Release-Note:

>Audit-Trail:
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/57160: NFS (over TCP) very slow in 10_BETA
Date: Thu, 5 Jan 2023 11:13:45 +0100

 On Wed, Jan 04, 2023 at 07:45:00PM +0000, bouyer@antioche.eu.org wrote:
 > >Description:
 > 	I have my mailbox (in mbox format) stored on an NFS server (which
 > 	is a cubieboard 2 with a SATA SSD). The NFS mount uses:
 > 	type nfs (nodev, nosuid, fsid: 0xb06/0x70b, reads: sync 0 async 0, writes: sync 0 async 0)
 > 	The client is connected to a gigabit switch (using an alc(4) interface),
 > 	the server is connected at 100Mb/s.
 > 	Since upgrading the client to 10.0_BETA, loading the mailbox
 > 	with mutt is very slow, and the transfer seems to pause
 > 	several times.
 > 	Reading the mailbox with dd gives the expected speed
 > 	(I made sure there is no local cache in the way):
 > 	rochebonne:~> dd if=Mail/Inbox of=/dev/null
 > 	132+1 records in
 > 	132+1 records out
 > 	138625298 bytes transferred in 17.273 secs (8025548 bytes/sec)
 > 	I can't reproduce the slow read with dd.

 But creating a tar archive with the files on NFS is also slower than it
 used to be.

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Martin Husemann <martin@duskware.de>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/57160: NFS (over TCP) very slow in 10_BETA
Date: Thu, 5 Jan 2023 11:27:34 +0100

 On Thu, Jan 05, 2023 at 11:13:45AM +0100, Manuel Bouyer wrote:
 > But creating a tar archive with the files on NFS is also slower than it
 > used to be.

 Can you check raw network speed?
 I guess benchmarks/netio is the canonical first step.

 Martin

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Martin Husemann <martin@duskware.de>
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/57160: NFS (over TCP) very slow in 10_BETA
Date: Thu, 5 Jan 2023 20:43:37 +0100

 On Thu, Jan 05, 2023 at 11:27:34AM +0100, Martin Husemann wrote:
 > On Thu, Jan 05, 2023 at 11:13:45AM +0100, Manuel Bouyer wrote:
 > > But creating a tar archive with the files on NFS is also slower than it
 > > used to be.
 > 
 > Can you check raw network speed?
 > I guess benchmarks/netio is the canonical first step.

 Looks good:
 rochebonne:/home/bouyer/tmp/netio>./netio -t 10.0.0.254

 NETIO - Network Throughput Benchmark, Version 1.33
 (C) 1997-2012 Kai Uwe Rommel

 TCP connection established.
 Packet size  1k bytes:  11.24 MByte/s Tx,  10.08 MByte/s Rx.
 Packet size  2k bytes:  11.24 MByte/s Tx,  10.19 MByte/s Rx.
 Packet size  4k bytes:  11.25 MByte/s Tx,  10.10 MByte/s Rx.
 Packet size  8k bytes:  11.22 MByte/s Tx,  10.07 MByte/s Rx.
 Packet size 16k bytes:  11.26 MByte/s Tx,  10.06 MByte/s Rx.
 Packet size 32k bytes:  11.25 MByte/s Tx,  10.05 MByte/s Rx.
 Done.

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/57160: NFS (over TCP) very slow in 10_BETA
Date: Wed, 25 Jan 2023 23:35:56 +0100

 I just noticed that if there is significant network traffic in the background when
 reading my mailbox, NFS is back at full speed.
 I noticed this while I had a file download running over ADSL.
 A single ping has a noticeable effect; a ping -i0.1 gives an appreciable
 boost.

 Without background traffic, reading a small (45 messages) mailbox takes 6 or 7
 seconds. With a background ping, reading the same mailbox takes half the
 time. With ping -i0.1 it's about 0.5s.

 For a large mailbox, with ping -i0.1 in the background it takes about 15s.
 With a
 cat /dev/zero | ssh someremotehost "cat > /dev/null"
 in the background, this time is down to 4 or 5s, which is the speed I had with
 NetBSD 9.x.
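
To make the comparison above repeatable, a minimal sketch of a timing wrapper follows. It runs a command while ping -i 0.1 generates background traffic and prints the elapsed time. The function name is hypothetical, and the host to ping is an assumption, since the report does not say which host was pinged. Resolution is whole seconds only; /usr/bin/time is the better tool for sub-second runs.

```sh
#!/bin/sh
# Usage: time_with_ping HOST COMMAND [ARG...]
# Runs COMMAND while "ping -i 0.1 HOST" produces background traffic,
# prints the elapsed wall-clock time in whole seconds on stdout,
# and returns COMMAND's exit status.
time_with_ping() {
    host=$1
    shift

    # Start the background traffic and remember its PID.
    ping -i 0.1 "$host" >/dev/null 2>&1 &
    ping_pid=$!

    start=$(date +%s)
    status=0
    "$@" || status=$?
    end=$(date +%s)

    # Stop the background ping; ignore errors if it already exited.
    kill "$ping_pid" 2>/dev/null || true
    wait "$ping_pid" 2>/dev/null || true

    echo "$((end - start))"
    return "$status"
}
```

For example, assuming 10.0.0.254 (the netio target used earlier) is a suitable host to ping: time_with_ping 10.0.0.254 mutt -f Mail/Inbox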

 Maybe the issue is related to my network adapter (like: RX interrupts are
 sometimes missed). It's a
 [     1.032289] alc0 at pci2 dev 0 function 0: Atheros AR8161 PCIe Gigabit Ethernet
 [     1.032289] alc0: interrupting at ioapic0 pin 16
 [     1.032289] alc0: Ethernet address 94:de:80:21:be:c0
 [     1.032289] atphy0 at alc0 phy 0: Atheros AR8035 10/100/1000 PHY, rev. 9
 [     1.032289] atphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, auto

 This driver got some changes between -9 and -10; I have not investigated them yet.

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/57160: NFS (over TCP) very slow in 10_BETA
Date: Mon, 30 Jan 2023 21:48:23 +0100

 On Wed, Jan 25, 2023 at 11:35:56PM +0100, Manuel Bouyer wrote:
 > I just noticed that if there is significant network traffic in the background when
 > reading my mailbox, NFS is back at full speed.
 > I noticed this while I had a file download running over ADSL.
 > A single ping has a noticeable effect; a ping -i0.1 gives an appreciable
 > boost.
 > 
 > Without background traffic, reading a small (45 messages) mailbox takes 6 or 7
 > seconds. With a background ping, reading the same mailbox takes half the
 > time. With ping -i0.1 it's about 0.5s.
 > 
 > For a large mailbox, with ping -i0.1 in the background it takes about 15s.
 > With a
 > cat /dev/zero | ssh someremotehost "cat > /dev/null"
 > in the background, this time is down to 4 or 5s, which is the speed I had with
 > NetBSD 9.x.
 > 
 > Maybe the issue is related to my network adapter (like: RX interrupts are
 > sometimes missed). It's a
 > [     1.032289] alc0 at pci2 dev 0 function 0: Atheros AR8161 PCIe Gigabit Ethernet
 > [     1.032289] alc0: interrupting at ioapic0 pin 16
 > [     1.032289] alc0: Ethernet address 94:de:80:21:be:c0
 > [     1.032289] atphy0 at alc0 phy 0: Atheros AR8035 10/100/1000 PHY, rev. 9
 > [     1.032289] atphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, auto
 > 
 > This driver got some changes between -9 and -10; I have not investigated them yet.

 I reverted to the netbsd-9 driver (with 1.47 pulled up as this is required
 to build on netbsd-10) and this didn't change the performance.
 So I guess it's not a direct change in the driver that causes this problem.

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/57160: NFS (over TCP) very slow in 10_BETA
Date: Tue, 31 Jan 2023 14:57:39 +0100

 On Mon, Jan 30, 2023 at 09:48:23PM +0100, Manuel Bouyer wrote:
 > On Wed, Jan 25, 2023 at 11:35:56PM +0100, Manuel Bouyer wrote:
 > > I just noticed that if there is significant network traffic in the background when
 > > reading my mailbox, NFS is back at full speed.
 > > I noticed this while I had a file download running over ADSL.
 > > A single ping has a noticeable effect; a ping -i0.1 gives an appreciable
 > > boost.
 > > 
 > > Without background traffic, reading a small (45 messages) mailbox takes 6 or 7
 > > seconds. With a background ping, reading the same mailbox takes half the
 > > time. With ping -i0.1 it's about 0.5s.
 > > 
 > > For a large mailbox, with ping -i0.1 in the background it takes about 15s.
 > > With a
 > > cat /dev/zero | ssh someremotehost "cat > /dev/null"
 > > in the background, this time is down to 4 or 5s, which is the speed I had with
 > > NetBSD 9.x.
 > > 
 > > Maybe the issue is related to my network adapter (like: RX interrupts are
 > > sometimes missed). It's a
 > > [     1.032289] alc0 at pci2 dev 0 function 0: Atheros AR8161 PCIe Gigabit Ethernet
 > > [     1.032289] alc0: interrupting at ioapic0 pin 16
 > > [     1.032289] alc0: Ethernet address 94:de:80:21:be:c0
 > > [     1.032289] atphy0 at alc0 phy 0: Atheros AR8035 10/100/1000 PHY, rev. 9
 > > [     1.032289] atphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, auto
 > > 
 > > This driver got some changes between -9 and -10; I have not investigated them yet.
 > 
 > I reverted to the netbsd-9 driver (with 1.47 pulled up as this is required
 > to build on netbsd-10) and this didn't change the performance.
 > So I guess it's not a direct change in the driver that causes this problem.

 A way to work around the problem is to use a UDP mount.
 With UDP:
 rochebonne:~> /usr/bin/time mutt -f Mail/Inbox
         3.59 real         0.09 user         0.05 sys
 With TCP:
 rochebonne:~> /usr/bin/time mutt -f Mail/Inbox 
        37.91 real         0.08 user         0.06 sys
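
To put a number on the difference, the sketch below extracts the "real" field from the two time(1) summary lines above and prints the ratio. The function names are hypothetical; the timings are the ones reported above.

```sh
#!/bin/sh
# real_secs LINE: print the wall-clock ("real") seconds from a BSD
# time(1) summary line such as "3.59 real 0.09 user 0.05 sys".
real_secs() {
    printf '%s\n' "$1" | awk '{ print $1 }'
}

# slowdown FAST SLOW: print how many times longer SLOW took than FAST.
slowdown() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

udp=$(real_secs "        3.59 real         0.09 user         0.05 sys")
tcp=$(real_secs "       37.91 real         0.08 user         0.06 sys")
slowdown "$udp" "$tcp"   # prints 10.6
```

So on this setup the TCP mount takes roughly ten times as long as the UDP mount, while user and sys times are essentially unchanged.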

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

>Unformatted:
