NetBSD Problem Report #37245

From yamt@mwd.biglobe.ne.jp  Mon Oct 29 03:11:41 2007
Return-Path: <yamt@mwd.biglobe.ne.jp>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id 6E2D563B935
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 29 Oct 2007 03:11:41 +0000 (UTC)
Message-Id: <20071029002603.927FD11702@yamt.dyndns.org>
Date: Mon, 29 Oct 2007 09:26:03 +0900 (JST)
From: yamt@mwd.biglobe.ne.jp
Reply-To: yamt@mwd.biglobe.ne.jp
To: gnats-bugs@NetBSD.org
Subject: sched_m2 is too unfair
X-Send-Pr-Version: 3.95

>Number:         37245
>Category:       kern
>Synopsis:       sched_m2 is too unfair
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    rmind
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Oct 29 03:15:01 +0000 2007
>Closed-Date:    Sat Apr 12 12:14:07 +0000 2008
>Last-Modified:  Sat Apr 12 12:14:07 +0000 2008
>Originator:     YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
>Release:        NetBSD 4.99.34
>Organization:

>Environment:
>Description:
	sched_m2 is too unfair for general use.

	see the following test program and the output of top.
	there seem to be at least two problems:

	- cpu-hogging threads never move between cpus unless
	  there are idle cpus.

	- some threads are completely starved.
	  i guess they have never run since fork and their sl_lrtime
	  is still 0.

--------------------------------------------
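#include <unistd.h>

int
main(void)
{
	int i;

	/* create up to 40 children; each child leaves the loop at once */
	for (i = 0; i < 40; i++) {
		if (fork() == 0)
			break;
	}
	/* parent and children all spin forever to hog the cpu */
	for (;;)
		;
}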
--------------------------------------------

64 processes:  40 runnable, 22 sleeping, 2 on processor
CPU0 states:  100% user,  0.0% nice,  0.0% system,  0.0% interrupt,  0.0% idle
CPU1 states:  100% user,  0.0% nice,  0.0% system,  0.0% interrupt,  0.0% idle
Memory: 33M Inact, 7692K Wired, 5996K Exec, 11M File, 895M Free
Swap: 2000M Total, 2000M Free

  PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
 6665 takashi  105    0    20K  624K RUN/0      0:24 48.62% 44.43% a.out
 5059 takashi  105    0    20K  380K RUN/0      0:24 48.30% 44.14% a.out
 5435 takashi   30    0    20K  380K RUN/1      0:02  4.01%  3.66% a.out
 5439 takashi   30    0    20K  380K RUN/1      0:02  4.01%  3.66% a.out
 6361 takashi   30    0    20K  380K RUN/1      0:02  4.01%  3.66% a.out
 6786 takashi   30    0    20K  380K RUN/1      0:02  3.90%  3.56% a.out
 5202 takashi   30    0    20K  380K RUN/1      0:02  3.90%  3.56% a.out
 5248 takashi   30    0    20K  380K RUN/1      0:02  3.90%  3.56% a.out
 6453 takashi   29    0    20K  380K RUN/1      0:02  3.79%  3.47% a.out
 6455 takashi   29    0    20K  380K RUN/1      0:02  3.79%  3.47% a.out
 5764 takashi   30    0    20K  380K RUN/1      0:02  3.69%  3.37% a.out
 6739 takashi   29    0    20K  380K RUN/1      0:02  3.69%  3.37% a.out
 6360 takashi   29    0    20K  380K RUN/1      0:02  3.69%  3.37% a.out
 6451 takashi   29    0    20K  380K RUN/1      0:02  3.69%  3.37% a.out
 6847 takashi   29    0    20K  380K RUN/1      0:02  3.69%  3.37% a.out
 5309 takashi   29    0    20K  380K RUN/1      0:02  3.42%  3.12% a.out
 4801 takashi   30    0    20K  380K RUN/1      0:02  2.56%  2.34% a.out
 4932 takashi   30    0    20K  380K RUN/1      0:02  2.56%  2.34% a.out
 6792 takashi   30    0    20K  380K CPU/1      0:02  2.51%  2.29% a.out
 6794 takashi   29    0    20K  380K RUN/1      0:02  2.51%  2.29% a.out
 5254 takashi   30    0    20K  380K RUN/1      0:01  2.51%  2.29% a.out
 5301 takashi   31    0    20K  380K RUN/1      0:01  2.40%  2.20% a.out
 6070 takashi   30    0    20K  380K RUN/1      0:02  2.35%  2.15% a.out
 5130 takashi   29    0    20K  380K RUN/1      0:02  2.19%  2.00% a.out
10177 takashi   28    0   756K 1344K CPU/0      0:00  0.00%  0.00% top
 6362 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 6366 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 9002 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 8346 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 6365 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 4902 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 4974 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 5939 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 4850 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 7127 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 6364 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 6613 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 6611 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 6363 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 6609 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 6467 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out
 7977 takashi   48    0    20K    4K RUN/1      0:00  0.00%  0.00% a.out

>How-To-Repeat:
	see above.
>Fix:
	- make sched_enqueue initialize sl_lrtime properly for new lwps.
	- balance cpus periodically.
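
	the first item can be illustrated with a small user-space model.
	this is only a sketch under assumptions: the report does not show
	how the scheduler uses sl_lrtime, so the starvation check below
	(skip threads whose timestamp is 0) is a guess at one way an unset
	timestamp could keep a thread from ever being boosted.  struct
	toy_lwp, toy_is_starved() and toy_enqueue() are hypothetical names;
	only sl_lrtime comes from the report.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-thread scheduler data. */
struct toy_lwp {
	unsigned int sl_lrtime;	/* tick of the last run; 0 means "not set" */
};

/*
 * Assumed starvation check: a thread counts as starved once it has
 * waited at least hz ticks, but only if it has a timestamp at all.
 */
static int
toy_is_starved(const struct toy_lwp *l, unsigned int now, unsigned int hz)
{
	assert(hz > 0);
	return l->sl_lrtime != 0 && now - l->sl_lrtime >= hz;
}

/*
 * Sketch of the proposed fix: stamp a never-run thread when it is
 * queued, and leave an existing timestamp alone.
 */
static void
toy_enqueue(struct toy_lwp *l, unsigned int now)
{
	if (l->sl_lrtime == 0)
		l->sl_lrtime = now;
}
```

	in this model a freshly forked thread with sl_lrtime == 0 is never
	reported as starved, however long it waits; after toy_enqueue() has
	stamped it, the same check succeeds once hz ticks have passed.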

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->rmind
Responsible-Changed-By: rmind@netbsd.org
Responsible-Changed-When: Mon, 29 Oct 2007 15:54:14 +0000
Responsible-Changed-Why:
Mine.


State-Changed-From-To: open->analyzed
State-Changed-By: rmind@netbsd.org
State-Changed-When: Mon, 29 Oct 2007 15:54:14 +0000
State-Changed-Why:


From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: PR/37245 CVS commit: src/sys/kern
Date: Sun,  4 Nov 2007 12:36:02 +0000 (UTC)

 Module Name:	src
 Committed By:	rmind
 Date:		Sun Nov  4 12:36:01 UTC 2007

 Modified Files:
 	src/sys/kern: sched_m2.c

 Log Message:
 - sched_setup: use ilog2() for min_catch, which fixes the case when the
   number of CPUs is not a power of 2.  Fixes PR/37244.
 - sched_enqueue: initialize sl_lrtime when it is zero (new thread).
   Part of PR/37245.
 - Fix the mints/maxts sysctl helpers, use mstohz() for the checks.  Also,
   I meant milliseconds, not microseconds.  Found by <bjs>.


 To generate a diff of this commit:
 cvs rdiff -r1.7 -r1.8 src/sys/kern/sched_m2.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
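
The two numeric points in the log message above can be illustrated in user space. This is a sketch under assumptions, not the committed code: toy_ilog2() and toy_mstohz() are stand-ins for the kernel's ilog2() and mstohz(), the truncating conversion is assumed, and toy_mints_ok() is a hypothetical check invented for illustration. The report does not show how min_catch is used or what the sysctl helpers compare against.

```c
#include <assert.h>

/* floor(log2(n)) for n >= 1; defined for any n, not only powers of 2. */
static unsigned int
toy_ilog2(unsigned int n)
{
	unsigned int r = 0;

	assert(n > 0);
	while (n >>= 1)
		r++;
	return r;
}

/* Convert milliseconds to clock ticks at hz ticks per second (truncating). */
static unsigned int
toy_mstohz(unsigned int ms, unsigned int hz)
{
	return (unsigned int)((unsigned long long)ms * hz / 1000ULL);
}

/*
 * Hypothetical range check for a time-slice given in milliseconds:
 * convert to ticks first, then require at least one tick and no more
 * than max_ticks.
 */
static int
toy_mints_ok(unsigned int ms, unsigned int hz, unsigned int max_ticks)
{
	unsigned int t = toy_mstohz(ms, hz);

	return t >= 1 && t <= max_ticks;
}
```

With three or six CPUs toy_ilog2() still yields a usable value (1 and 2 respectively). Comparing in ticks rather than raw milliseconds catches a 5 ms slice at hz = 100, which would round down to zero ticks.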

State-Changed-From-To: analyzed->feedback
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Tue, 11 Mar 2008 18:25:25 +0000
State-Changed-Why:
I think it is fixed now. Can you confirm?
Note, sched_takecpu() is not very intelligent in sched_pstats_hook(),
but for further optimisations we need a topology of CPUs.


From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/37245 CVS commit: src/sys/kern
Date: Tue, 11 Mar 2008 18:18:49 +0000 (UTC)

 Module Name:	src
 Committed By:	rmind
 Date:		Tue Mar 11 18:18:49 UTC 2008

 Modified Files:
 	src/sys/kern: sched_m2.c

 Log Message:
 - Perform periodic balancing of CPU-bound threads, which tend to
   never sleep.  Should fix PR/37245 by <yamt>.
 - Fix a regression - disallow catching of bound threads.  Also, allow
   migration of non-bound kthreads, this restriction seems pointless.
 - A few micro-optimisations, misc.


 To generate a diff of this commit:
 cvs rdiff -r1.21 -r1.22 src/sys/kern/sched_m2.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
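
The idea of periodic balancing can be sketched with a toy model of per-CPU run-queue lengths. This is not the algorithm from the commit, whose details the report does not give: toy_balance_once() and toy_balance() are hypothetical, and the policy (move one thread at a time from the longest queue to the shortest while they differ by two or more) is an assumption chosen for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/*
 * One balancing pass over nrun[0..ncpu-1], the number of runnable
 * threads per CPU.  Moves one thread from the busiest CPU to the
 * least busy one if that narrows the gap.  Returns 1 if a thread
 * was moved, 0 if the queues are already balanced.
 */
static int
toy_balance_once(unsigned int *nrun, size_t ncpu)
{
	size_t i, hi = 0, lo = 0;

	assert(ncpu > 0);
	for (i = 1; i < ncpu; i++) {
		if (nrun[i] > nrun[hi])
			hi = i;
		if (nrun[i] < nrun[lo])
			lo = i;
	}
	if (nrun[hi] - nrun[lo] < 2)
		return 0;
	nrun[hi]--;
	nrun[lo]++;
	return 1;
}

/* Repeat passes until nothing moves; returns the number of migrations. */
static unsigned int
toy_balance(unsigned int *nrun, size_t ncpu)
{
	unsigned int moved = 0;

	while (toy_balance_once(nrun, ncpu))
		moved++;
	return moved;
}
```

Starting from the split visible in the top output of this PR (2 spinning a.out processes on CPU 0 and 39 on CPU 1), this model settles at 20 and 21 after 18 migrations. A real scheduler would presumably run such a pass from a timer and weigh migration cost, which the sketch ignores.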

From: yamt@mwd.biglobe.ne.jp (YAMAMOTO Takashi)
To: gnats-bugs@NetBSD.org
Cc: rmind@NetBSD.org, netbsd-bugs@netbsd.org, gnats-admin@netbsd.org,
        rmind@NetBSD.org
Subject: Re: kern/37245 (sched_m2 is too unfair)
Date: Thu, 20 Mar 2008 22:42:37 +0900 (JST)

 > Synopsis: sched_m2 is too unfair
 > 
 > State-Changed-From-To: analyzed->feedback
 > State-Changed-By: rmind@NetBSD.org
 > State-Changed-When: Tue, 11 Mar 2008 18:25:25 +0000
 > State-Changed-Why:
 > I think it is fixed now. Can you confirm?
 > Note, sched_takecpu() is not very intelligent in sched_pstats_hook(),
 > but for further optimisations we need a topology of CPUs.

 i'm not running M2 anymore on the system.

 have you tried the test code i've provided in the PR?  if it worked
 for you, please close.

 YAMAMOTO Takashi

State-Changed-From-To: feedback->closed
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Sat, 12 Apr 2008 12:14:07 +0000
State-Changed-Why:
Fixed some time ago.


>Unformatted:
