NetBSD Problem Report #39349

From yamt@mwd.biglobe.ne.jp  Thu Aug 14 08:58:40 2008
Return-Path: <yamt@mwd.biglobe.ne.jp>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id 2BC9F63B11D
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 14 Aug 2008 08:58:40 +0000 (UTC)
Message-Id: <20080814085837.2340111704@yamt.dyndns.org>
Date: Thu, 14 Aug 2008 17:58:37 +0900 (JST)
From: yamt@mwd.biglobe.ne.jp
Reply-To: yamt@mwd.biglobe.ne.jp
To: gnats-bugs@gnats.NetBSD.org
Subject: cpu affinity can make lwps non-schedulable
X-Send-Pr-Version: 3.95

>Number:         39349
>Category:       kern
>Synopsis:       cpu affinity can make lwps non-schedulable
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    rmind
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Aug 14 09:00:00 +0000 2008
>Closed-Date:    Sun Nov 23 20:27:51 +0000 2008
>Last-Modified:  Sun Nov 23 20:27:51 +0000 2008
>Originator:     YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
>Release:        NetBSD 4.99.72
>Organization:

>Environment:
>Description:
	try:
		# cpuctl offline 0
		# cpuctl identify 0

	"cpuctl identify" binds itself to cpu0, which is offline,
	so it will never be scheduled.  if it holds a lock (e.g. p->p_lock),
	the entire system will soon hang.
>How-To-Repeat:
>Fix:


>Release-Note:

>Audit-Trail:
From: Jason Thorpe <thorpej@shagadelic.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org
Subject: Re: kern/39349: cpu affinity can make lwps non-schedulable
Date: Thu, 14 Aug 2008 10:38:47 -0700

 On Aug 14, 2008, at 2:00 AM, yamt@mwd.biglobe.ne.jp wrote:

 >> Number:         39349
 >> Category:       kern
 >> Synopsis:       cpu affinity can make lwps non-schedulable
 >> Confidential:   no
 >> Severity:       serious
 >> Priority:       medium
 >> Responsible:    kern-bug-people
 >> State:          open
 >> Class:          sw-bug
 >> Submitter-Id:   net
 >> Arrival-Date:   Thu Aug 14 09:00:00 +0000 2008
 >> Originator:     YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
 >> Release:        NetBSD 4.99.72
 >> Organization:
 > 	
 >> Environment:
 >> Description:
 > 	try:
 > 		# cpuctl offline 0
 > 		# cpuctl identify 0
 >
 > 	"cpuctl identify" binds itself to cpu0, which is offline.
 > 	thus it will never be scheduled.  if it has a lock (eg. p->p_lock),
 > 	the entire system will hang soon.

 Probably need to prevent binding to CPUs that have been taken  
 offline.  But what to do about CPUs that already have bound threads  
 (which is all of them, of course).  Perhaps we need to make note when  
 an LWP has taken a lock?


 >
 >> How-To-Repeat:
 >> Fix:
 > 	
 >
 >> Unformatted:
 > 	
 > 	

 -- thorpej

From: jnemeth@victoria.tc.ca (John Nemeth)
To: Jason Thorpe <thorpej@shagadelic.org>, gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/39349: cpu affinity can make lwps non-schedulable
Date: Thu, 14 Aug 2008 12:38:22 -0700

 On Jan 4,  5:14am, Jason Thorpe wrote:
 } On Aug 14, 2008, at 2:00 AM, yamt@mwd.biglobe.ne.jp wrote:
 } 
 } >> Number:         39349
 } >> Category:       kern
 } >> Synopsis:       cpu affinity can make lwps non-schedulable
 } >> State:          open
 } >> Description:
 } > 	try:
 } > 		# cpuctl offline 0
 } > 		# cpuctl identify 0
 } >
 } > 	"cpuctl identify" binds itself to cpu0, which is offline.
 } > 	thus it will never be scheduled.  if it has a lock (eg. p->p_lock),
 } > 	the entire system will hang soon.
 } 
 } Probably need to prevent binding to CPUs that have been taken  
 } offline.  But what to do about CPUs that already have bound threads  

      This looks like the obvious answer to me.

 } (which is all of them, of course).  Perhaps we need to make note when  
 } an LWP has taken a lock?

      I would like to see support for cpu hot swapping eventually.  This
 means that when a CPU is taken offline all bound threads and interrupts
 MUST be migrated.

 }-- End of excerpt from Jason Thorpe

Responsible-Changed-From-To: kern-bug-people->rmind
Responsible-Changed-By: rmind@NetBSD.org
Responsible-Changed-When: Tue, 30 Sep 2008 16:23:14 +0000
Responsible-Changed-Why:


State-Changed-From-To: open->analyzed
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Tue, 30 Sep 2008 16:32:24 +0000
State-Changed-Why:
The problem is clear and a quick fix is committed.  A full fix requires
deciding what to do with already-bound threads.


From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/39349 CVS commit: src/sys/kern
Date: Tue, 30 Sep 2008 16:28:45 +0000 (UTC)

 Module Name:	src
 Committed By:	rmind
 Date:		Tue Sep 30 16:28:45 UTC 2008

 Modified Files:
 	src/sys/kern: kern_runq.c sys_pset.c

 Log Message:
 - Schedule bound threads even if the CPU is offline.  May be revisited
   once a decision is made on what to do with already-bound threads.
 - Do not allow an offline CPU to be assigned to a processor set.

 Quick fix for PR/39349.


 To generate a diff of this commit:
 cvs rdiff -r1.20 -r1.21 src/sys/kern/kern_runq.c
 cvs rdiff -r1.8 -r1.9 src/sys/kern/sys_pset.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Mindaugas Rasiukevicius <rmind@NetBSD.org>
To: thorpej@shagadelic.org, yamt@mwd.biglobe.ne.jp
Cc: netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, gnats-bugs@NetBSD.org
Subject: Re: kern/39349 (cpu affinity can make lwps non-schedulable)
Date: Tue, 30 Sep 2008 17:46:31 +0100

 > State-Changed-From-To: open->analyzed
 > State-Changed-By: rmind@NetBSD.org
 > State-Changed-When: Tue, 30 Sep 2008 16:32:24 +0000
 > State-Changed-Why:
 > The problem is clear and a quick fix is committed.  A full fix requires
 > deciding what to do with already-bound threads.

 Jason Thorpe <thorpej@shagadelic.org> wrote:
 > ...
 > 
 > Probably need to prevent binding to CPUs that have been taken  
 > offline.  But what to do about CPUs that already have bound threads  
 > (which is all of them, of course).  Perhaps we need to make note when  
 > an LWP has taken a lock?
 > 
 > ...

 Perhaps disallow changing the CPU state to offline (return EBUSY) while
 there are bound threads?  What do you think?
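
The proposal above can be illustrated with a minimal user-space sketch. It is
not NetBSD kernel code: struct lwp_model, cpu_offline_simple(), the 32-bit
affinity mask and the EINVAL range check are all hypothetical names and
assumptions made for illustration. Only the EBUSY return comes from the
message itself.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not NetBSD code: one bit per CPU, 0 = unbound. */
struct lwp_model {
	uint32_t affinity;
};

/*
 * Return EBUSY if any LWP is bound to `cpu`, otherwise 0.
 * EINVAL for an out-of-range CPU index is an assumption of this sketch.
 */
int
cpu_offline_simple(unsigned cpu, const struct lwp_model *lwps, size_t nlwps)
{
	if (cpu >= 32)
		return EINVAL;

	uint32_t bit = (uint32_t)1 << cpu;

	for (size_t i = 0; i < nlwps; i++) {
		if (lwps[i].affinity & bit)
			return EBUSY;	/* a bound thread blocks the change */
	}
	return 0;
}
```

With this rule a CPU could only be taken offline once nothing is bound to it,
which is stricter than migrating the bound threads elsewhere.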

 -- 
 Best regards,
 Mindaugas
 www.NetBSD.org

From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/39349 CVS commit: src/sys
Date: Fri, 31 Oct 2008 00:36:22 +0000 (UTC)

 Module Name:	src
 Committed By:	rmind
 Date:		Fri Oct 31 00:36:22 UTC 2008

 Modified Files:
 	src/sys/arch/x86/x86: cpu.c
 	src/sys/arch/xen/x86: cpu.c
 	src/sys/kern: kern_cpu.c sys_sched.c
 	src/sys/sys: cpu.h

 Log Message:
 - Avoid the race with CPU online/offline state changes, when setting the
   affinity (cpu_lock protects these operations now).
 - Disallow setting a CPU's state to offline if there are bound LWPs
   which have no other CPU to migrate to.
 - Disallow setting of affinity for the LWP(s), if all CPUs in the dynamic
   CPU-set are offline.
 - sched_setaffinity: fix invalid check of kcpuset_isset().
 - Rename cpu_setonline() to cpu_setstate().

 Should fix PR/39349.
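
The two "disallow" rules in the log message above can be sketched in user
space as follows. This is an illustration under stated assumptions, not the
committed kernel code: cpumask_t, struct lwp_model, cpu_offline_check() and
affinity_check() are hypothetical names, the 32-CPU bitmask is a
simplification, and the error values (EBUSY, EINVAL) are assumptions, since
the log message does not say which errors are returned.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU. */
typedef uint32_t cpumask_t;

/* Hypothetical stand-in for an LWP; affinity == 0 means "not bound". */
struct lwp_model {
	cpumask_t affinity;
};

/*
 * Rule 1: refuse to take `cpu` offline if some bound LWP would be
 * left without any online CPU in its affinity mask.
 */
int
cpu_offline_check(cpumask_t online, unsigned cpu,
    const struct lwp_model *lwps, size_t nlwps)
{
	if (cpu >= 32)
		return EINVAL;	/* assumed error for a bad index */

	cpumask_t remaining = online & ~((cpumask_t)1 << cpu);

	for (size_t i = 0; i < nlwps; i++) {
		if (lwps[i].affinity == 0)
			continue;	/* unbound: may run anywhere */
		if ((lwps[i].affinity & remaining) == 0)
			return EBUSY;	/* no CPU left to migrate to */
	}
	return 0;
}

/*
 * Rule 2: refuse an affinity mask in which every CPU is offline.
 * An empty mask is rejected as well in this sketch.
 */
int
affinity_check(cpumask_t online, cpumask_t wanted)
{
	return (wanted & online) == 0 ? EINVAL : 0;
}
```

In this model both checks would need to run under one lock, so that the
online mask cannot change between the check and the state change; the log
message states that cpu_lock serves that purpose in the kernel.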


 To generate a diff of this commit:
 cvs rdiff -r1.57 -r1.58 src/sys/arch/x86/x86/cpu.c
 cvs rdiff -r1.28 -r1.29 src/sys/arch/xen/x86/cpu.c
 cvs rdiff -r1.36 -r1.37 src/sys/kern/kern_cpu.c
 cvs rdiff -r1.30 -r1.31 src/sys/kern/sys_sched.c
 cvs rdiff -r1.23 -r1.24 src/sys/sys/cpu.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: analyzed->feedback
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Fri, 31 Oct 2008 09:39:07 +0000
State-Changed-Why:
Should be fixed, please confirm.


From: Soren Jacobsen <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/39349 CVS commit: [netbsd-5] src/sys
Date: Thu, 13 Nov 2008 00:04:08 +0000 (UTC)

 Module Name:	src
 Committed By:	snj
 Date:		Thu Nov 13 00:04:07 UTC 2008

 Modified Files:
 	src/sys/arch/x86/x86 [netbsd-5]: cpu.c
 	src/sys/arch/xen/x86 [netbsd-5]: cpu.c
 	src/sys/kern [netbsd-5]: kern_cpu.c sys_sched.c
 	src/sys/sys [netbsd-5]: cpu.h

 Log Message:
 Pull up following revision(s) (requested by rmind in ticket #48):
 	sys/kern/kern_cpu.c: revision 1.37
 	sys/arch/x86/x86/cpu.c: revision 1.58
 	sys/arch/xen/x86/cpu.c: revision 1.29
 	sys/sys/cpu.h: revision 1.24
 	sys/kern/sys_sched.c: revision 1.31
 - Avoid the race with CPU online/offline state changes, when setting the
   affinity (cpu_lock protects these operations now).
 - Disallow setting a CPU's state to offline if there are bound LWPs
   which have no other CPU to migrate to.
 - Disallow setting of affinity for the LWP(s), if all CPUs in the dynamic
   CPU-set are offline.
 - sched_setaffinity: fix invalid check of kcpuset_isset().
 - Rename cpu_setonline() to cpu_setstate().
 Should fix PR/39349.


 To generate a diff of this commit:
 cvs rdiff -r1.57 -r1.57.4.1 src/sys/arch/x86/x86/cpu.c
 cvs rdiff -r1.28 -r1.28.4.1 src/sys/arch/xen/x86/cpu.c
 cvs rdiff -r1.36.4.1 -r1.36.4.2 src/sys/kern/kern_cpu.c
 cvs rdiff -r1.30 -r1.30.4.1 src/sys/kern/sys_sched.c
 cvs rdiff -r1.23 -r1.23.4.1 src/sys/sys/cpu.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: feedback->closed
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Sun, 23 Nov 2008 20:27:51 +0000
State-Changed-Why:
Feedback timeout.  Close.


>Unformatted:
