NetBSD Problem Report #50604

From tron@zhadum.org.uk  Fri Jan  1 19:36:29 2016
Return-Path: <tron@zhadum.org.uk>
Received: from mail.netbsd.org (mail.NetBSD.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 1EAFA7ABF2
	for <gnats-bugs@gnats.NetBSD.org>; Fri,  1 Jan 2016 19:36:29 +0000 (UTC)
Message-Id: <20160101193623.78C14F8FA2@lyssa.zhadum.org.uk>
Date: Fri,  1 Jan 2016 19:36:23 +0000 (GMT)
From: tron@zhadum.org.uk
Reply-To: tron@zhadum.org.uk
To: gnats-bugs@NetBSD.org
Subject: ld(4) on top of virtio(4) performs very badly
X-Send-Pr-Version: 3.95

>Number:         50604
>Category:       kern
>Synopsis:       ld(4) on top of virtio(4) performs very badly
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    kern-bug-people
>State:          analyzed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Jan 01 19:40:00 +0000 2016
>Closed-Date:    
>Last-Modified:  Sun Jun 17 12:33:14 +0000 2018
>Originator:     tron@zhadum.org.uk
>Release:        NetBSD 7.99.25 2016-01-01 sources
>Organization:
Matthias Scheler                                 https://zhadum.org.uk/
>Environment:
System: NetBSD lyssa.zhadum.org.uk 7.99.25 NetBSD 7.99.25 (GENERIC) #0: Fri Jan 1 15:28:41 GMT 2016 tron@lyssa.zhadum.org.uk:/src/sys/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
I've got a NetBSD-current virtual machine running as a KVM guest.
The host is a Debian 8.2 system (Linux kernel 3.16.0, QEMU 2.1).

Today I tried to switch all three of the virtual machine's disks from
"IDE" to "VirtIO" to improve performance. To my great surprise I got
a massive performance degradation instead. I used unpacking the
Perl 5.22.1 sources as a crude benchmark. Here are the results:

IDE:		1.67s user 0.83s system 50% cpu 4.935 total
VirtIO:		1.74s user 0.75s system 2% cpu 1:27.43 total

It seems that virtio(4) is almost eighteen times slower. As I've never
experienced similar problems with Linux guests (at work), I guess that this
is a problem with NetBSD's virtio(4) driver.
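
The slowdown factor can be checked from the two timing lines above. The
following is a minimal sketch, not part of the original measurements: it
assumes the lines are in the summary format printed by the shell's `time`
builtin (elapsed time immediately before the word "total"), and the helper
name `elapsed_seconds` is made up for illustration.

```python
import re


def elapsed_seconds(time_line: str) -> float:
    """Return wall-clock seconds from a `time` summary line (assumed format)."""
    # The elapsed field precedes the word "total" and may be plain seconds
    # ("4.935"), minutes:seconds ("1:27.43") or hours:minutes:seconds.
    match = re.search(r"(\d+(?::\d+)*(?:\.\d+)?)\s+total", time_line)
    if match is None:
        raise ValueError(f"no elapsed time found in: {time_line!r}")
    seconds = 0.0
    for part in match.group(1).split(":"):
        # Each colon-separated field is worth 60 times the following one.
        seconds = seconds * 60 + float(part)
    return seconds


# Timing lines copied from the report.
ide = elapsed_seconds("1.67s user 0.83s system 50% cpu 4.935 total")
virtio = elapsed_seconds("1.74s user 0.75s system 2% cpu 1:27.43 total")

slowdown = virtio / ide
print(f"VirtIO took {slowdown:.1f}x as long as IDE")
```

With the numbers above this gives a factor of about 17.7. User and system
CPU time are nearly identical in both runs, so the extra wall-clock time is
spent waiting rather than computing.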

>How-To-Repeat:
cd /some/filesystem/on/ld/at/virtio
time sh -c "tar xjf /usr/pkgsrc/distfiles/perl-5.22.1.tar.bz2; sync"

>Fix:
Not known

>Release-Note:

>Audit-Trail:
From: Valery Ushakov <uwe@stderr.spb.ru>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/50604 (ld(4) on top of virtio(4) performs very badly)
Date: Wed, 30 Nov 2016 05:02:07 +0300

 The following might be related.  Please, can you test with this revision?

 On Tue, Nov 29, 2016 at 20:36:38 -0500, Christos Zoulas wrote:

 > Module Name:	src
 > Committed By:	christos
 > Date:		Wed Nov 30 01:36:38 UTC 2016
 > 
 > Modified Files:
 > 	src/sys/dev/pci: ld_virtio.c viornd.c
 > 
 > Log Message:
 > Don't call virtio_enqueue_abort when virtio_enqueue_reserve fails.
 > Pointed out by uwe@
 > 
 > 
 > To generate a diff of this commit:
 > cvs rdiff -u -r1.12 -r1.13 src/sys/dev/pci/ld_virtio.c
 > cvs rdiff -u -r1.9 -r1.10 src/sys/dev/pci/viornd.c

 -uwe

From: Matthias Scheler <tron@zhadum.org.uk>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/50604 (ld(4) on top of virtio(4) performs very badly)
Date: Wed, 30 Nov 2016 14:35:47 +0000

 On Wed, Nov 30, 2016 at 02:05:01AM +0000, Valery Ushakov wrote:
 >  The following might be related.  Please, can you test with this revision?
 >  
 >  On Tue, Nov 29, 2016 at 20:36:38 -0500, Christos Zoulas wrote:
 >  
 >  > Module Name:	src
 >  > Committed By:	christos
 >  > Date:		Wed Nov 30 01:36:38 UTC 2016
 >  > 
 >  > Modified Files:
 >  > 	src/sys/dev/pci: ld_virtio.c viornd.c
 >  > 
 >  > Log Message:
 >  > Don't call virtio_enqueue_abort when virtio_enqueue_reserve fails.
 >  > Pointed out by uwe@
 >  > 
 >  > 
 >  > To generate a diff of this commit:
 >  > cvs rdiff -u -r1.12 -r1.13 src/sys/dev/pci/ld_virtio.c
 >  > cvs rdiff -u -r1.9 -r1.10 src/sys/dev/pci/viornd.c

 I've tried with a "GENERIC" kernel built from 2016-11-30 08:00 UTC sources:

 	NetBSD lyssa.zhadum.org.uk 7.99.43 NetBSD 7.99.43 (GENERIC) #0: Wed Nov 30 10:22:36 GMT 2016  tron@lyssa.zhadum.org.uk:/src/sys/compile/GENERIC amd64

 I've used the exact same machine on the same hypervisor. These are the
 only differences in the KVM machine configuration:

 --- lyssa-ide.xml	2016-11-30 14:20:33.928453288 +0000
 +++ lyssa-virtio.xml	2016-11-30 14:24:03.459133542 +0000
 @@ -9,6 +9,7 @@
    </resource>
    <os>
      <type arch='x86_64' machine='pc-i440fx-2.1'>hvm</type>
 +    <boot dev='hd'/>
    </os>
    <features>
      <acpi/>
 @@ -68,9 +69,8 @@
      <disk type='file' device='disk'>
        <driver name='qemu' type='qcow2'/>
        <source file='/var/lib/libvirt/images/lyssa.qcow2'/>
 -      <target dev='hda' bus='ide'/>
 -      <boot order='1'/>
 -      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
 +      <target dev='vda' bus='virtio'/>
 +      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
      </disk>
      <disk type='block' device='cdrom'>
        <driver name='qemu' type='raw'/>

 Here are the results of my simple disk I/O benchmark:

 1.) Using IDE:
 	> time sh -c "tar -xzf /scratch/pkgsrc/distfiles/emacs-24.5.tar.gz && sync && rm -rf emacs-24.5 && sync"
 	sh -c   0.90s user 1.29s system 12% cpu 17.144 total

 2.) Using VirtIO:
 	> time sh -c "tar -xzf /scratch/pkgsrc/distfiles/emacs-24.5.tar.gz && sync && rm -rf emacs-24.5 && sync"
 	sh -c   0.96s user 1.33s system 1% cpu 2:19.39 total

 So the problem still exists.

 	Kind regards

 -- 
 Matthias Scheler                                 https://zhadum.org.uk/

Responsible-Changed-From-To: kern-bug-people->jdolecek
Responsible-Changed-By: jdolecek@NetBSD.org
Responsible-Changed-When: Sun, 12 Mar 2017 21:23:42 +0000
Responsible-Changed-Why:
I'm looking at virtio, maybe I'll get to this.


Responsible-Changed-From-To: jdolecek->kern-bug-people
Responsible-Changed-By: jdolecek@NetBSD.org
Responsible-Changed-When: Sat, 16 Jun 2018 00:04:04 +0000
Responsible-Changed-Why:
Back to kern-bug-people, I'm not likely to go further on this.


From: "Jonathan A. Kollasch" <jakllsch@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/50604: ld(4) on top of virtio(4) performs very badly
Date: Sat, 16 Jun 2018 07:43:50 -0500

 This should be fixed since r1.20 src/sys/dev/pci/ld_virtio.c.

State-Changed-From-To: open->feedback
State-Changed-By: jakllsch@NetBSD.org
State-Changed-When: Sat, 16 Jun 2018 12:46:53 +0000
State-Changed-Why:
likely fixed by recent commit


From: Matthias Scheler <tron@zhadum.org.uk>
To: gnats-bugs@NetBSD.org
Cc: jakllsch@NetBSD.org
Subject: Re: kern/50604 (ld(4) on top of virtio(4) performs very badly)
Date: Sat, 16 Jun 2018 23:59:27 +0100

 On Sat, Jun 16, 2018 at 12:46:53PM +0000, jakllsch@NetBSD.org wrote:
 > Synopsis: ld(4) on top of virtio(4) performs very badly
 > 
 > State-Changed-From-To: open->feedback
 > State-Changed-By: jakllsch@NetBSD.org
 > State-Changed-When: Sat, 16 Jun 2018 12:46:53 +0000
 > State-Changed-Why:
 > likely fixed by recent commit

 Here are the performance numbers with a 2018-06-16 kernel:

 IDE:
 sh -c "tar -xzf /scratch/pkgsrc/distfiles/perl-5.26.2.tar.xz; sync"  1.05s user 2.03s system 60% cpu 5.118 total

 virtio(4):
 sh -c "tar -xzf /scratch/pkgsrc/distfiles/perl-5.26.2.tar.xz; sync"  1.18s user 1.55s system 74% cpu 3.665 total

 The problem is fixed as far as I can tell.

 	Thanks for the good work!

 -- 
 Matthias Scheler                                  http://zhadum.org.uk/

State-Changed-From-To: feedback->analyzed
State-Changed-By: tron@NetBSD.org
State-Changed-When: Sun, 17 Jun 2018 12:33:14 +0000
State-Changed-Why:
Feedback was provided, bug is understood and fixed. The fix might be
worth backporting to existing release branches.


>Unformatted:

$NetBSD: query-full-pr,v 1.43 2018/01/16 07:36:43 maya Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2017 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.