NetBSD Problem Report #45734
From www@NetBSD.org Thu Dec 22 23:26:41 2011
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id 93BA663D789
for <gnats-bugs@gnats.NetBSD.org>; Thu, 22 Dec 2011 23:26:41 +0000 (UTC)
Message-Id: <20111222232640.AED1663B954@www.NetBSD.org>
Date: Thu, 22 Dec 2011 23:26:40 +0000 (UTC)
From: jym@NetBSD.org
Reply-To: jym@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: performance regression for sequential/random creation of files, as well as stat(2) (ffs fs)
X-Send-Pr-Version: www-1.0
>Number: 45734
>Category: kern
>Synopsis: performance regression for sequential/random creation of files, as well as stat(2) (ffs fs)
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Dec 22 23:30:00 +0000 2011
>Closed-Date:
>Last-Modified: Tue Jan 03 20:25:01 +0000 2012
>Originator: Jean-Yves Migeon
>Release: -current
>Organization:
TNF
>Environment:
NetBSD paris 5.99.53 NetBSD 5.99.53 (GENERIC) #1: Thu Dec 1 18:48:00 CET 2011 jym@paris:/home/jym/cvs/obj/sys/arch/amd64/compile/GENERIC amd64
>Description:
I observed a performance regression for sequential/randomized creation of files, as well as stat(2)ing them, between a GENERIC kernel from 2011-03-01 and 2011-12-18.
All measurements were taken with bonnie++ (benchmarks/bonnie++) on an ffs file system mounted with -o log, using the following command:
bonnie++ -n 32 -r 4096 -x 3 -u nobody
(32 * 1024 files created/deleted, 4 GiB of RAM, 3 test runs).
I narrowed the regression to this window:
GENERIC 2011-07-03
seq_create  ran_create  ran_stat  (#/s)
      4374        4499      6052
      4488        4553      6008
      4616        4574      6057

GENERIC 2011-07-05
seq_create  ran_create  ran_stat  (#/s)
      1518        1539      1698
      1534        1521      1689
      1532        1541      1713
Somewhere between 2011-07-03 and 2011-07-05, a commit cut file creation/deletion throughput on an ffs file system by a factor of about 3. I don't yet know exactly which commit is at fault, but a three-day window should be fairly quick to analyze.
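As a rough check of the "factor of about 3" figure, the per-column means of the two kernels can be compared. This is only a sketch: the slowdown function name and the before/after row labels are made up for illustration, and the numbers fed to it are the measurements listed above.

```sh
# Read rows of the form "<label> seq_create ran_create ran_stat" on stdin,
# where <label> is "before" (2011-07-03) or "after" (2011-07-05), and print
# the ratio of the column means (before / after) for each of the 3 columns.
slowdown() {
    awk '
        $1 == "before" { for (i = 2; i <= 4; i++) b[i] += $i; nb++ }
        $1 == "after"  { for (i = 2; i <= 4; i++) a[i] += $i; na++ }
        END {
            for (i = 2; i <= 4; i++)
                printf "%.2f%s", (b[i] / nb) / (a[i] / na), (i < 4 ? " " : "\n")
        }'
}

slowdown <<'EOF'
before 4374 4499 6052
before 4488 4553 6008
before 4616 4574 6057
after  1518 1539 1698
after  1534 1521 1689
after  1532 1541 1713
EOF
```

With the figures above this prints 2.94 2.96 3.55, i.e. roughly a 2.9x drop for sequential and random creation and a 3.6x drop for random stat(2).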
>How-To-Repeat:
Build GENERIC kernels from the 2011-07-03 and 2011-07-05 sources, build and install benchmarks/bonnie++, and run it under each kernel:
bonnie++ -n 32 -r 4096 -x 3 -u nobody
Look for the fields that represent file sequential creation, randomized creation, and stat(2) (fields 16, 22 and 24):
=> awk -F, '{print $16 " " $22 " " $24}'
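The extraction step can be wrapped in a small helper so that it can be applied to saved bonnie++ output. This is a sketch under assumptions: the create_stat_fields name and the bonnie.csv file name are hypothetical, and the field positions (16, 22 and 24) are the ones given above, which may differ with other bonnie++ versions.

```sh
# Print fields 16, 22 and 24 (sequential create, random create, random stat)
# of each comma-separated line read from stdin. Lines with fewer than 24
# fields (banners, progress messages) are skipped.
create_stat_fields() {
    awk -F, 'NF >= 24 { print $16 " " $22 " " $24 }'
}

# Hypothetical usage, assuming the CSV output was saved to bonnie.csv:
#   bonnie++ -n 32 -r 4096 -x 3 -u nobody > bonnie.csv
#   create_stat_fields < bonnie.csv
```

Skipping short lines is a design choice to tolerate non-CSV noise in the captured output; it does not validate that the remaining lines really are bonnie++ records.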
>Fix:
Unknown. Likely a locking change in the vfs/ffs layer, committed between 2011-07-03 and 2011-07-05, that affects the performance of ffs file systems.
>Release-Note:
>Audit-Trail:
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/45734: performance regression for sequential/random
creation of files, as well as stat(2) (ffs fs)
Date: Fri, 23 Dec 2011 05:27:56 +0000
On Thu, Dec 22, 2011 at 11:30:00PM +0000, jym@NetBSD.org wrote:
> I narrowed the regression to this window:
>
> GENERIC 2011-07-03
> seq_create ran_create ran_stat (#/s)
> 4374 4499 6052
> 4488 4553 6008
> 4616 4574 6057
> GENERIC 2011-07-05
> seq_create ran_create ran_stat (#/s)
> 1518 1539 1698
> 1534 1521 1689
> 1532 1541 1713
On July 3 DIAGNOSTIC was turned on by default. Can you (1) check if
that's really one of the differences between your two endpoints, and
(2) if so turn it off in the July 5 kernel and see what the
performance looks like?
DIAGNOSTIC is not supposed to be expensive but that isn't necessarily
working right.
The only fs-related commits I see in the period are manu@'s extattr
cleanup, and that isn't really very credible as a source of
performance problems.
This is the only other even remotely likely candidate I can find:
------
Module Name: src
Committed By: yamt
Date: Tue Jul 5 14:03:07 UTC 2011
Modified Files:
src/sys/uvm: uvm_km.c uvm_map.c
Log Message:
- fix a use-after-free bug in uvm_km_free.
(after uvm_km_pgremove frees pages, the following pmap_remove touches them.)
- acquire the object lock for operations on pmap_kernel as it can actually be
raced with P->V operations. eg. pagedaemon.
To generate a diff of this commit:
cvs rdiff -u -r1.109 -r1.110 src/sys/uvm/uvm_km.c
cvs rdiff -u -r1.299 -r1.300 src/sys/uvm/uvm_map.c
------
--
David A. Holland
dholland@netbsd.org
From: Jean-Yves Migeon <jym@NetBSD.org>
To: <gnats-bugs@NetBSD.org>
Cc: <kern-bug-people@netbsd.org>, <gnats-admin@netbsd.org>,
<netbsd-bugs@netbsd.org>
Subject: Re: kern/45734: performance regression for sequential/random creation of files, as well as stat(2) (ffs fs)
Date: Fri, 23 Dec 2011 11:50:18 +0100
On Fri, 23 Dec 2011 05:30:07 +0000 (UTC), David Holland wrote:
> On Thu, Dec 22, 2011 at 11:30:00PM +0000, jym@NetBSD.org wrote:
> > I narrowed the regression to this window:
> >
> > GENERIC 2011-07-03
> > seq_create ran_create ran_stat (#/s)
> > 4374 4499 6052
> > 4488 4553 6008
> > 4616 4574 6057
> > GENERIC 2011-07-05
> > seq_create ran_create ran_stat (#/s)
> > 1518 1539 1698
> > 1534 1521 1689
> > 1532 1541 1713
>
> On July 3 DIAGNOSTIC was turned on by default. Can you (1) check if
> that's really one of the differences between your two endpoints, and
> (2) if so turn it off in the July 5 kernel and see what the
> performance looks like?
>
> DIAGNOSTIC is not supposed to be expensive but that isn't necessarily
> working right.
>
> The only fs-related commits I see in the period are manu@'s extattr
> cleanup, and that isn't really very credible as a source of
> performance problems.
> [snip]
> Log Message:
> - fix a use-after-free bug in uvm_km_free.
> (after uvm_km_pgremove frees pages, the following pmap_remove touches them.)
> - acquire the object lock for operations on pmap_kernel as it can actually be
> raced with P->V operations. eg. pagedaemon.
Came to the same conclusion this morning. Will check.
Merry Christmas to all!
--
Jean-Yves Migeon
jym@NetBSD.org
State-Changed-From-To: open->closed
State-Changed-By: jym@NetBSD.org
State-Changed-When: Mon, 02 Jan 2012 23:25:16 +0000
State-Changed-Why:
I can now confirm that just enabling DIAGNOSTIC is responsible for the
overhead observed when running Bonnie++.
While having file creation/deletion performance cut by a factor of 3 is not
a good thing per se, DIAGNOSTIC should remain enabled for experimental
kernels to help with debugging and bug squashing.
I am closing this PR.
State-Changed-From-To: closed->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Tue, 03 Jan 2012 14:48:29 +0000
State-Changed-Why:
I don't agree - a 3x performance degradation from DIAGNOSTIC (even on a
naive benchmark) is not acceptable.
Are you in a position to check where it's coming from?
From: Jean-Yves Migeon <jeanyves.migeon@free.fr>
To: gnats-bugs@NetBSD.org
Cc: dholland@NetBSD.org, kern-bug-people@netbsd.org,
netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, jym@NetBSD.org
Subject: Re: kern/45734 (performance regression for sequential/random creation
of files, as well as stat(2) (ffs fs))
Date: Tue, 03 Jan 2012 21:24:39 +0100
On 03.01.2012 15:48, dholland@NetBSD.org wrote:
> Synopsis: performance regression for sequential/random creation of files, as well as stat(2) (ffs fs)
>
> State-Changed-From-To: closed->open
> State-Changed-By: dholland@NetBSD.org
> State-Changed-When: Tue, 03 Jan 2012 14:48:29 +0000
> State-Changed-Why:
> I don't agree - a 3x performance degradation from DIAGNOSTIC (even on a
> naive benchmark) is not acceptable.
Beware: it's a 3x performance degradation on specific operations (file
creation/deletion). The degradation is nonexistent for other types of
benchmarks (at least with bonnie++, build.sh or sysbench).
> Are you in a position to check where it's coming from?
Not ATM. My spare time is limited right now and I'd prefer to focus on
my Xen work first.
--
Jean-Yves Migeon
jeanyves.migeon@free.fr
>Unformatted: