NetBSD Problem Report #37245
From yamt@mwd.biglobe.ne.jp Mon Oct 29 03:11:41 2007
Return-Path: <yamt@mwd.biglobe.ne.jp>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by narn.NetBSD.org (Postfix) with ESMTP id 6E2D563B935
for <gnats-bugs@gnats.NetBSD.org>; Mon, 29 Oct 2007 03:11:41 +0000 (UTC)
Message-Id: <20071029002603.927FD11702@yamt.dyndns.org>
Date: Mon, 29 Oct 2007 09:26:03 +0900 (JST)
From: yamt@mwd.biglobe.ne.jp
Reply-To: yamt@mwd.biglobe.ne.jp
To: gnats-bugs@NetBSD.org
Subject: sched_m2 is too unfair
X-Send-Pr-Version: 3.95
>Number: 37245
>Category: kern
>Synopsis: sched_m2 is too unfair
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: rmind
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Oct 29 03:15:01 +0000 2007
>Closed-Date: Sat Apr 12 12:14:07 +0000 2008
>Last-Modified: Sat Apr 12 12:14:07 +0000 2008
>Originator: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
>Release: NetBSD 4.99.34
>Organization:
>Environment:
>Description:
sched_m2 is too unfair for general use.
see the following test program and the output of top.
there seem to be at least two problems:
- cpu-hogging threads never move between cpus unless
there are idle cpus.
- there are threads which get completely starved.
i guess they have never run since fork and their sl_lrtime
is still 0.
--------------------------------------------
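#include <unistd.h>

/* fork 40 children; the parent and every child then spin forever. */
int
main(void)
{
	int i;

	for (i = 0; i < 40; i++) {
		if (fork() == 0)
			break;	/* child: stop forking and go spin */
	}
	for (;;)
		;	/* cpu-bound busy loop, never sleeps */
}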
--------------------------------------------
64 processes: 40 runnable, 22 sleeping, 2 on processor
CPU0 states: 100% user, 0.0% nice, 0.0% system, 0.0% interrupt, 0.0% idle
CPU1 states: 100% user, 0.0% nice, 0.0% system, 0.0% interrupt, 0.0% idle
Memory: 33M Inact, 7692K Wired, 5996K Exec, 11M File, 895M Free
Swap: 2000M Total, 2000M Free
PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
6665 takashi 105 0 20K 624K RUN/0 0:24 48.62% 44.43% a.out
5059 takashi 105 0 20K 380K RUN/0 0:24 48.30% 44.14% a.out
5435 takashi 30 0 20K 380K RUN/1 0:02 4.01% 3.66% a.out
5439 takashi 30 0 20K 380K RUN/1 0:02 4.01% 3.66% a.out
6361 takashi 30 0 20K 380K RUN/1 0:02 4.01% 3.66% a.out
6786 takashi 30 0 20K 380K RUN/1 0:02 3.90% 3.56% a.out
5202 takashi 30 0 20K 380K RUN/1 0:02 3.90% 3.56% a.out
5248 takashi 30 0 20K 380K RUN/1 0:02 3.90% 3.56% a.out
6453 takashi 29 0 20K 380K RUN/1 0:02 3.79% 3.47% a.out
6455 takashi 29 0 20K 380K RUN/1 0:02 3.79% 3.47% a.out
5764 takashi 30 0 20K 380K RUN/1 0:02 3.69% 3.37% a.out
6739 takashi 29 0 20K 380K RUN/1 0:02 3.69% 3.37% a.out
6360 takashi 29 0 20K 380K RUN/1 0:02 3.69% 3.37% a.out
6451 takashi 29 0 20K 380K RUN/1 0:02 3.69% 3.37% a.out
6847 takashi 29 0 20K 380K RUN/1 0:02 3.69% 3.37% a.out
5309 takashi 29 0 20K 380K RUN/1 0:02 3.42% 3.12% a.out
4801 takashi 30 0 20K 380K RUN/1 0:02 2.56% 2.34% a.out
4932 takashi 30 0 20K 380K RUN/1 0:02 2.56% 2.34% a.out
6792 takashi 30 0 20K 380K CPU/1 0:02 2.51% 2.29% a.out
6794 takashi 29 0 20K 380K RUN/1 0:02 2.51% 2.29% a.out
5254 takashi 30 0 20K 380K RUN/1 0:01 2.51% 2.29% a.out
5301 takashi 31 0 20K 380K RUN/1 0:01 2.40% 2.20% a.out
6070 takashi 30 0 20K 380K RUN/1 0:02 2.35% 2.15% a.out
5130 takashi 29 0 20K 380K RUN/1 0:02 2.19% 2.00% a.out
10177 takashi 28 0 756K 1344K CPU/0 0:00 0.00% 0.00% top
6362 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
6366 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
9002 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
8346 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
6365 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
4902 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
4974 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
5939 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
4850 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
7127 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
6364 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
6613 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
6611 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
6363 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
6609 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
6467 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
7977 takashi 48 0 20K 4K RUN/1 0:00 0.00% 0.00% a.out
>How-To-Repeat:
see above.
>Fix:
- make sched_enqueue initialize sl_lrtime properly for new lwps.
- balance cpus periodically.
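
The first item can be illustrated with a small user-space model. This is only a sketch, not the sched_m2 code: struct model_lwp, model_enqueue() and model_is_starved() are hypothetical names, and the sketch assumes (following the guess in the description) that a starvation check which treats a zero timestamp as "no data" will never fire for a thread whose timestamp was never set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the scheduler's per-LWP data. */
struct model_lwp {
	unsigned int lrtime;	/* tick of last run; 0 means "never set" */
};

/*
 * Assumed starvation check: a thread counts as starved once it has
 * waited at least hz ticks, but a zero timestamp is skipped entirely.
 */
static int
model_is_starved(const struct model_lwp *l, unsigned int now, unsigned int hz)
{
	assert(l != NULL);
	return l->lrtime != 0 && now - l->lrtime >= hz;
}

/* Proposed fix: stamp a new thread the first time it is enqueued. */
static void
model_enqueue(struct model_lwp *l, unsigned int now)
{
	assert(l != NULL);
	if (l->lrtime == 0)
		l->lrtime = now;
}
```

In this model a freshly forked thread with lrtime == 0 is never reported as starved, however long it waits. Once model_enqueue() has stamped it, the check starts to fire after hz ticks.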
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->rmind
Responsible-Changed-By: rmind@netbsd.org
Responsible-Changed-When: Mon, 29 Oct 2007 15:54:14 +0000
Responsible-Changed-Why:
Mine.
State-Changed-From-To: open->analyzed
State-Changed-By: rmind@netbsd.org
State-Changed-When: Mon, 29 Oct 2007 15:54:14 +0000
State-Changed-Why:
From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: PR/37245 CVS commit: src/sys/kern
Date: Sun, 4 Nov 2007 12:36:02 +0000 (UTC)
Module Name: src
Committed By: rmind
Date: Sun Nov 4 12:36:01 UTC 2007
Modified Files:
src/sys/kern: sched_m2.c
Log Message:
- sched_setup: use ilog2() for min_catch, which fixes the case when the
CPU count is not a power of 2. Fixes PR/37244.
- sched_enqueue: initialize sl_lrtime, when it is zero (new thread).
Part of PR/37245.
- Fix the mints/maxts sysctl helpers, use mstohz() for the checks. Also,
I meant milliseconds, not microseconds. Found by <bjs>.
To generate a diff of this commit:
cvs rdiff -r1.7 -r1.8 src/sys/kern/sched_m2.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
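
For readers unfamiliar with the two helpers named in the log message, here is a user-space sketch of what they compute. model_ilog2() and model_mstohz() are hypothetical stand-ins, not the kernel's ilog2() and mstohz(); the round-up in the conversion is a choice made for this sketch, and how sched_setup derives min_catch from the result is not shown here.

```c
#include <assert.h>

/*
 * Floor of log2(n). Unlike a computation that assumes n is a power
 * of 2, this is well defined for every positive CPU count.
 */
static int
model_ilog2(unsigned int n)
{
	int r = -1;

	assert(n > 0);
	while (n != 0) {
		n >>= 1;
		r++;
	}
	return r;
}

/*
 * Convert milliseconds to clock ticks at hz ticks per second.
 * Rounds up (sketch-only choice) so that a nonzero interval never
 * becomes zero ticks.
 */
static unsigned int
model_mstohz(unsigned int ms, unsigned int hz)
{
	return (unsigned int)(((unsigned long long)ms * hz + 999) / 1000);
}
```

With hz = 100, a 150 ms time slice corresponds to 15 ticks, so a bound expressed in ticks has to be compared against the converted value, not against the raw millisecond figure.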
State-Changed-From-To: analyzed->feedback
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Tue, 11 Mar 2008 18:25:25 +0000
State-Changed-Why:
I think it is fixed now. Can you confirm?
Note, sched_takecpu() is not very intelligent in sched_pstats_hook(),
but for further optimisations we need a topology of CPUs.
From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/37245 CVS commit: src/sys/kern
Date: Tue, 11 Mar 2008 18:18:49 +0000 (UTC)
Module Name: src
Committed By: rmind
Date: Tue Mar 11 18:18:49 UTC 2008
Modified Files:
src/sys/kern: sched_m2.c
Log Message:
- Perform periodic balancing of CPU-bound threads, which tend to
never sleep. Should fix PR/37245 by <yamt>.
- Fix a regression - disallow catching of bound threads. Also, allow
migration of non-bound kthreads, this restriction seems pointless.
- A few micro-optimisations, misc.
To generate a diff of this commit:
cvs rdiff -r1.21 -r1.22 src/sys/kern/sched_m2.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
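
The idea of periodic balancing can be sketched in user space as follows. This is not the algorithm committed to sched_m2.c: model_balance_once() is a hypothetical helper that only looks at run queue lengths and, each time it is called, moves one thread from the longest queue to the shortest one.

```c
#include <assert.h>
#include <stddef.h>

/*
 * One balancing step over ncpu run queue lengths: move a single
 * runnable thread from the longest queue to the shortest one.
 * Returns 1 if a thread was moved, 0 if the queues already differ
 * by less than 2 threads.
 */
static int
model_balance_once(unsigned int *rq_len, size_t ncpu)
{
	size_t i, busiest = 0, idlest = 0;

	assert(rq_len != NULL && ncpu > 0);
	for (i = 1; i < ncpu; i++) {
		if (rq_len[i] > rq_len[busiest])
			busiest = i;
		if (rq_len[i] < rq_len[idlest])
			idlest = i;
	}
	if (rq_len[busiest] - rq_len[idlest] < 2)
		return 0;
	rq_len[busiest]--;
	rq_len[idlest]++;
	return 1;
}
```

Starting from the distribution in the top output in the description (2 a.out threads on CPU0 and 39 on CPU1), repeated calls move 18 threads and stop at 20 and 21. Without such a step, threads that never sleep stay where they were first placed.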
From: yamt@mwd.biglobe.ne.jp (YAMAMOTO Takashi)
To: gnats-bugs@NetBSD.org
Cc: rmind@NetBSD.org, netbsd-bugs@netbsd.org, gnats-admin@netbsd.org,
rmind@NetBSD.org
Subject: Re: kern/37245 (sched_m2 is too unfair)
Date: Thu, 20 Mar 2008 22:42:37 +0900 (JST)
> Synopsis: sched_m2 is too unfair
>
> State-Changed-From-To: analyzed->feedback
> State-Changed-By: rmind@NetBSD.org
> State-Changed-When: Tue, 11 Mar 2008 18:25:25 +0000
> State-Changed-Why:
> I think it is fixed now. Can you confirm?
> Note, sched_takecpu() is not very intelligent in sched_pstats_hook(),
> but for further optimisations we need a topology of CPUs.
i'm not running M2 anymore on the system.
have you tried the test code i've provided in the PR? if it worked
for you, please close.
YAMAMOTO Takashi
State-Changed-From-To: feedback->closed
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Sat, 12 Apr 2008 12:14:07 +0000
State-Changed-Why:
Fixed some time ago.
>Unformatted:
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.