NetBSD Problem Report #39349
From yamt@mwd.biglobe.ne.jp Thu Aug 14 08:58:40 2008
Return-Path: <yamt@mwd.biglobe.ne.jp>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by narn.NetBSD.org (Postfix) with ESMTP id 2BC9F63B11D
for <gnats-bugs@gnats.NetBSD.org>; Thu, 14 Aug 2008 08:58:40 +0000 (UTC)
Message-Id: <20080814085837.2340111704@yamt.dyndns.org>
Date: Thu, 14 Aug 2008 17:58:37 +0900 (JST)
From: yamt@mwd.biglobe.ne.jp
Reply-To: yamt@mwd.biglobe.ne.jp
To: gnats-bugs@gnats.NetBSD.org
Subject: cpu affinity can make lwps non-schedulable
X-Send-Pr-Version: 3.95
>Number: 39349
>Category: kern
>Synopsis: cpu affinity can make lwps non-schedulable
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: rmind
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Aug 14 09:00:00 +0000 2008
>Closed-Date: Sun Nov 23 20:27:51 +0000 2008
>Last-Modified: Sun Nov 23 20:27:51 +0000 2008
>Originator: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
>Release: NetBSD 4.99.72
>Organization:
>Environment:
>Description:
try:
# cpuctl offline 0
# cpuctl identify 0
"cpuctl identify" binds itself to cpu0, which is offline.
thus it will never be scheduled. if it holds a lock (e.g. p->p_lock),
the entire system will soon hang.
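
The failure mode described above can be illustrated with a small userland
model. This is only a sketch, not kernel code: struct model_cpu,
struct model_lwp, NO_CPU and model_pick_cpu() are hypothetical names made up
for illustration, and the model assumes the scheduler only ever runs an LWP
on an online CPU.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPU	2
#define NO_CPU	(-1)	/* hypothetical marker: LWP has no binding */

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_cpu {
	bool online;
};

struct model_lwp {
	int bound_cpu;		/* CPU index, or NO_CPU if unbound */
};

/*
 * Return the index of an online CPU that is allowed to run the LWP,
 * or NO_CPU if there is none.  A bound LWP may only run on its CPU.
 */
int
model_pick_cpu(const struct model_cpu cpus[NCPU], const struct model_lwp *l)
{
	assert(l->bound_cpu == NO_CPU ||
	    (l->bound_cpu >= 0 && l->bound_cpu < NCPU));

	for (int i = 0; i < NCPU; i++) {
		if (!cpus[i].online)
			continue;	/* offline CPUs never run LWPs */
		if (l->bound_cpu == NO_CPU || l->bound_cpu == i)
			return i;
	}
	return NO_CPU;		/* the LWP is non-schedulable */
}
```

With cpu0 offline and cpu1 online, an LWP bound to cpu0 gets NO_CPU from
model_pick_cpu(), while an unbound LWP is still placed on cpu1. If the
stranded LWP holds a lock, every other LWP that needs the lock waits forever.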
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
From: Jason Thorpe <thorpej@shagadelic.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org,
gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/39349: cpu affinity can make lwps non-schedulable
Date: Thu, 14 Aug 2008 10:38:47 -0700
On Aug 14, 2008, at 2:00 AM, yamt@mwd.biglobe.ne.jp wrote:
>> Number: 39349
>> Category: kern
>> Synopsis: cpu affinity can make lwps non-schedulable
>> Confidential: no
>> Severity: serious
>> Priority: medium
>> Responsible: kern-bug-people
>> State: open
>> Class: sw-bug
>> Submitter-Id: net
>> Arrival-Date: Thu Aug 14 09:00:00 +0000 2008
>> Originator: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
>> Release: NetBSD 4.99.72
>> Organization:
>
>> Environment:
>> Description:
> try:
> # cpuctl offline 0
> # cpuctl identify 0
>
> "cpuctl identify" binds itself to cpu0, which is offline.
> thus it will never be scheduled. if it has a lock (eg. p->p_lock),
> the entire system will hang soon.
Probably need to prevent binding to CPUs that have been taken
offline. But what to do about CPUs that already have bound threads
(which is all of them, of course)? Perhaps we need to make note when
an LWP has taken a lock?
>
>> How-To-Repeat:
>> Fix:
>
>
>> Unformatted:
>
>
-- thorpej
From: jnemeth@victoria.tc.ca (John Nemeth)
To: Jason Thorpe <thorpej@shagadelic.org>, gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/39349: cpu affinity can make lwps non-schedulable
Date: Thu, 14 Aug 2008 12:38:22 -0700
On Jan 4, 5:14am, Jason Thorpe wrote:
} On Aug 14, 2008, at 2:00 AM, yamt@mwd.biglobe.ne.jp wrote:
}
} >> Number: 39349
} >> Category: kern
} >> Synopsis: cpu affinity can make lwps non-schedulable
} >> State: open
} >> Description:
} > try:
} > # cpuctl offline 0
} > # cpuctl identify 0
} >
} > "cpuctl identify" binds itself to cpu0, which is offline.
} > thus it will never be scheduled. if it has a lock (eg. p->p_lock),
} > the entire system will hang soon.
}
} Probably need to prevent binding to CPUs that have been taken
} offline. But what to do about CPUs that already have bound threads
This looks like the obvious answer to me.
} (which is all of them, of course). Perhaps we need to make note when
} an LWP has taken a lock?
I would like to see support for cpu hot swapping eventually. This
means that when a CPU is taken offline all bound threads and interrupts
MUST be migrated.
}-- End of excerpt from Jason Thorpe
Responsible-Changed-From-To: kern-bug-people->rmind
Responsible-Changed-By: rmind@NetBSD.org
Responsible-Changed-When: Tue, 30 Sep 2008 16:23:14 +0000
Responsible-Changed-Why:
State-Changed-From-To: open->analyzed
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Tue, 30 Sep 2008 16:32:24 +0000
State-Changed-Why:
Problem is clear, quick fix is committed. A full fix needs a decision on
what to do with already bound threads.
From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/39349 CVS commit: src/sys/kern
Date: Tue, 30 Sep 2008 16:28:45 +0000 (UTC)
Module Name: src
Committed By: rmind
Date: Tue Sep 30 16:28:45 UTC 2008
Modified Files:
src/sys/kern: kern_runq.c sys_pset.c
Log Message:
- Schedule bound threads even if CPU is offline. Might be revisited later,
once a decision is made on what to do with already bound threads.
- Do not allow an offline CPU to be assigned to a processor-set.
Quick fix for PR/39349.
To generate a diff of this commit:
cvs rdiff -r1.20 -r1.21 src/sys/kern/kern_runq.c
cvs rdiff -r1.8 -r1.9 src/sys/kern/sys_pset.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Mindaugas Rasiukevicius <rmind@NetBSD.org>
To: thorpej@shagadelic.org, yamt@mwd.biglobe.ne.jp
Cc: netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, gnats-bugs@NetBSD.org
Subject: Re: kern/39349 (cpu affinity can make lwps non-schedulable)
Date: Tue, 30 Sep 2008 17:46:31 +0100
> State-Changed-From-To: open->analyzed
> State-Changed-By: rmind@NetBSD.org
> State-Changed-When: Tue, 30 Sep 2008 16:32:24 +0000
> State-Changed-Why:
> Problem is clear, quick fix is committed. Decision what to do with already
> bound threads should be made to make a full fix.
Jason Thorpe <thorpej@shagadelic.org> wrote:
> ...
>
> Probably need to prevent binding to CPUs that have been taken
> offline. But what to do about CPUs that already have bound threads
> (which is all of them, of course). Perhaps we need to make note when
> an LWP has taken a lock?
>
> ...
Perhaps disallow changing a CPU's state to offline (return EBUSY) while
it has bound threads? What do you think?
--
Best regards,
Mindaugas
www.NetBSD.org
From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/39349 CVS commit: src/sys
Date: Fri, 31 Oct 2008 00:36:22 +0000 (UTC)
Module Name: src
Committed By: rmind
Date: Fri Oct 31 00:36:22 UTC 2008
Modified Files:
src/sys/arch/x86/x86: cpu.c
src/sys/arch/xen/x86: cpu.c
src/sys/kern: kern_cpu.c sys_sched.c
src/sys/sys: cpu.h
Log Message:
- Avoid the race with CPU online/offline state changes, when setting the
affinity (cpu_lock protects these operations now).
- Disallow setting the state of a CPU to offline, if there are bound LWPs
which have no other CPU to migrate to.
- Disallow setting of affinity for the LWP(s), if all CPUs in the dynamic
CPU-set are offline.
- sched_setaffinity: fix invalid check of kcpuset_isset().
- Rename cpu_setonline() to cpu_setstate().
Should fix PR/39349.
To generate a diff of this commit:
cvs rdiff -r1.57 -r1.58 src/sys/arch/x86/x86/cpu.c
cvs rdiff -r1.28 -r1.29 src/sys/arch/xen/x86/cpu.c
cvs rdiff -r1.36 -r1.37 src/sys/kern/kern_cpu.c
cvs rdiff -r1.30 -r1.31 src/sys/kern/sys_sched.c
cvs rdiff -r1.23 -r1.24 src/sys/sys/cpu.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
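
The two "Disallow" rules in the log message above can be sketched as a
userland model. This is an illustration only and not the committed code:
model_cpu_offline() and model_setaffinity() are hypothetical names, CPU sets
are modelled as plain bitmasks (0 meaning "no affinity"), and the error
values are assumptions. EBUSY for the offline request follows the proposal
earlier in this trail; EPERM for the affinity request is a placeholder.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NCPU	4

/*
 * Try to take `cpu' offline.  `online' is a bitmask of online CPUs and
 * lwp_sets[i] is the affinity mask of LWP i (0 == unbound).
 * Refuse if some LWP is bound to `cpu' and none of the CPUs in its
 * affinity mask would remain online, i.e. it has nowhere to migrate.
 */
int
model_cpu_offline(unsigned *online, unsigned cpu,
    const unsigned *lwp_sets, size_t nlwps)
{
	assert(cpu < NCPU);
	const unsigned remaining = *online & ~(1u << cpu);

	for (size_t i = 0; i < nlwps; i++) {
		const unsigned set = lwp_sets[i];

		if (set == 0)
			continue;	/* unbound: can run anywhere */
		if ((set & (1u << cpu)) != 0 && (set & remaining) == 0)
			return EBUSY;	/* assumed error value */
	}
	*online = remaining;
	return 0;
}

/*
 * Set the affinity mask of an LWP.  Refuse if every CPU in the
 * requested mask is offline.
 */
int
model_setaffinity(unsigned online, unsigned *lwp_set, unsigned new_set)
{
	if (new_set != 0 && (new_set & online) == 0)
		return EPERM;	/* placeholder error value */
	*lwp_set = new_set;
	return 0;
}
```

In the real change both operations are serialized by cpu_lock, which is what
closes the race between an affinity request and a concurrent state change.
The model leaves locking out and does not cover other rules, such as
refusing to offline the last online CPU.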
State-Changed-From-To: analyzed->feedback
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Fri, 31 Oct 2008 09:39:07 +0000
State-Changed-Why:
Should be fixed, please confirm.
From: Soren Jacobsen <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/39349 CVS commit: [netbsd-5] src/sys
Date: Thu, 13 Nov 2008 00:04:08 +0000 (UTC)
Module Name: src
Committed By: snj
Date: Thu Nov 13 00:04:07 UTC 2008
Modified Files:
src/sys/arch/x86/x86 [netbsd-5]: cpu.c
src/sys/arch/xen/x86 [netbsd-5]: cpu.c
src/sys/kern [netbsd-5]: kern_cpu.c sys_sched.c
src/sys/sys [netbsd-5]: cpu.h
Log Message:
Pull up following revision(s) (requested by rmind in ticket #48):
sys/kern/kern_cpu.c: revision 1.37
sys/arch/x86/x86/cpu.c: revision 1.58
sys/arch/xen/x86/cpu.c: revision 1.29
sys/sys/cpu.h: revision 1.24
sys/kern/sys_sched.c: revision 1.31
- Avoid the race with CPU online/offline state changes, when setting the
affinity (cpu_lock protects these operations now).
- Disallow setting the state of a CPU to offline, if there are bound LWPs
which have no other CPU to migrate to.
- Disallow setting of affinity for the LWP(s), if all CPUs in the dynamic
CPU-set are offline.
- sched_setaffinity: fix invalid check of kcpuset_isset().
- Rename cpu_setonline() to cpu_setstate().
Should fix PR/39349.
To generate a diff of this commit:
cvs rdiff -r1.57 -r1.57.4.1 src/sys/arch/x86/x86/cpu.c
cvs rdiff -r1.28 -r1.28.4.1 src/sys/arch/xen/x86/cpu.c
cvs rdiff -r1.36.4.1 -r1.36.4.2 src/sys/kern/kern_cpu.c
cvs rdiff -r1.30 -r1.30.4.1 src/sys/kern/sys_sched.c
cvs rdiff -r1.23 -r1.23.4.1 src/sys/sys/cpu.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: feedback->closed
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Sun, 23 Nov 2008 20:27:51 +0000
State-Changed-Why:
Feedback timeout. Close.
>Unformatted:
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.