NetBSD Problem Report #40750
From www@NetBSD.org Tue Feb 24 23:11:50 2009
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
    by www.NetBSD.org (Postfix) with ESMTP id AED9663C1C2
    for <gnats-bugs@gnats.netbsd.org>; Tue, 24 Feb 2009 23:11:50 +0000 (UTC)
Message-Id: <20090224231150.800D963C1C1@www.NetBSD.org>
Date: Tue, 24 Feb 2009 23:11:50 +0000 (UTC)
From: ad@netbsd.org
Reply-To: ad@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: hackbench screws up 5.0 kernel
X-Send-Pr-Version: www-1.0
>Number: 40750
>Category: kern
>Synopsis: hackbench screws up 5.0 kernel
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: ad
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Feb 24 23:15:00 +0000 2009
>Closed-Date: Sat Aug 29 17:29:54 +0000 2009
>Last-Modified: Thu Jan 07 07:10:03 +0000 2010
>Originator: Andrew Doran
>Release: 5.0_RC2
>Organization:
The NetBSD Project
>Environment:
i386 smp
very high maxprocs, maxfiles
4GB RAM, 4GB swap
>Description:
As per summary.
>How-To-Repeat:
This did not happen 6 months ago.
Run the new threaded hackbench three times:
- threaded mode, with pipes
- process mode, with pipes
- process mode, with sockets.
Boom, the hackbench processes hang waiting for KVA space to become available.
http://www.netbsd.org/~ad/vm_map/hackbench.c
http://www.netbsd.org/~ad/vm_map/backtrace.txt
http://www.netbsd.org/~ad/vm_map/kernel_map.txt
http://www.netbsd.org/~ad/vm_map/kmem_map.txt
http://www.netbsd.org/~ad/vm_map/ps-axlsww.txt
http://www.netbsd.org/~ad/vm_map/show-map.txt
http://www.netbsd.org/~ad/vm_map/vmstat-C.txt
http://www.netbsd.org/~ad/vm_map/vmstat-e.txt
http://www.netbsd.org/~ad/vm_map/vmstat-m.txt
http://www.netbsd.org/~ad/vm_map/vmstat-s.txt
>Fix:
Not yet debugged.
>Release-Note:
>Audit-Trail:
From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/40750: hackbench screws up 5.0 kernel
Date: Wed, 25 Feb 2009 19:58:03 +0000
The wait for VA is on kmem_map, which is ~130MB in size. The kernel seems to
have leaked 2.6 million ksiginfo_t structures, 120MB worth.
Name     Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
ksiginfo   48  2548220    0        0 30336     0 30336 30336     0   inf    0

Pool cache statistics.
Name     Spin GrpSz Full Emty PoolLayer CacheLayer Hit% CpuLayer Hit%
ksiginfo    0    14    0    0   2548220    2548228  0.0  2577954  1.2
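
As a rough cross-check of the figures above, the ksiginfo pool row can be
turned into byte counts. This is only a sketch: it assumes 4 KiB pages on
i386 and that Requests minus Releases is the number of items still
allocated. The helper name pool_usage is made up for illustration and is
not part of any NetBSD tool.

```python
# Pool row copied from the vmstat output quoted in this report.
HEADER = "Name Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle"
ROW = "ksiginfo 48 2548220 0 0 30336 0 30336 30336 0 inf 0"

PAGE_SIZE = 4096  # assumption: 4 KiB pages on i386
MIB = 1024 * 1024


def pool_usage(header, row, page_size=PAGE_SIZE):
    """Return outstanding item count and byte totals for one pool row."""
    fields = dict(zip(header.split(), row.split()))
    size = int(fields["Size"])
    # Assumption: items requested but never released are still allocated.
    outstanding = int(fields["Requests"]) - int(fields["Releases"])
    return {
        "name": fields["Name"],
        "outstanding": outstanding,
        "item_bytes": outstanding * size,
        "page_bytes": int(fields["Npage"]) * page_size,
    }


usage = pool_usage(HEADER, ROW)
print(f"{usage['name']}: {usage['outstanding']} items outstanding")
print(f"  items: {usage['item_bytes'] / MIB:.1f} MiB")
print(f"  pages: {usage['page_bytes'] / MIB:.1f} MiB")
```

With zero releases, every requested item is still held, and the pages
backing the pool account for most of a kmem_map of roughly 130MB.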
Responsible-Changed-From-To: kern-bug-people->ad
Responsible-Changed-By: ad@NetBSD.org
Responsible-Changed-When: Mon, 02 Mar 2009 19:45:24 +0000
Responsible-Changed-Why:
Take.
There is one obvious leak, in sigtimedwait(), although hackbench does not
use it.
State-Changed-From-To: open->feedback
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Sun, 29 Mar 2009 05:08:21 +0000
State-Changed-Why:
Might not be a problem in 5.0 RC3. Need more information.
From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: ad@netbsd.org
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: kern/40750 (hackbench screws up 5.0 kernel)
Date: Sun, 29 Mar 2009 06:19:34 +0100
rmind@NetBSD.org wrote:
> Synopsis: hackbench screws up 5.0 kernel
>
> State-Changed-From-To: open->feedback
> State-Changed-By: rmind@NetBSD.org
> State-Changed-When: Sun, 29 Mar 2009 05:08:21 +0000
> State-Changed-Why:
> Might not be a problem in 5.0 RC3. Need more information.
>
FYI: I could not reproduce the problem on -current and NetBSD 5.0_RC3:
http://www.netbsd.org/~rmind/hbtest.sh
http://www.netbsd.org/~rmind/hbtest.txt
One of the major changes between RC2 and RC3 was the KVA cache for pipe direct
write (which was also disabled), though I do not see why it would send any
signals if pipe_pgid == 0. With direct write enabled in -current, the problem
does not occur, however. I have not tried reverting the KVA cache changes yet.
--
Best regards,
Mindaugas
From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: ad@netbsd.org
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: kern/40750 (hackbench screws up 5.0 kernel)
Date: Sun, 29 Mar 2009 16:28:23 +0100
Mindaugas Rasiukevicius <rmind@netbsd.org> wrote:
> FYI: I could not reproduce the problem on -current and NetBSD 5.0_RC3:
>
> http://www.netbsd.org/~rmind/hbtest.sh
> http://www.netbsd.org/~rmind/hbtest.txt
>
> One of the major changes between RC2 and RC3 was the KVA cache for pipe direct
> write (which was also disabled), though I do not see why it would send any
> signals if pipe_pgid == 0. With direct write enabled in -current, the problem
> does not occur, however. I have not tried reverting the KVA cache changes yet.
Latest 5.0/i386 with sys_pipe.c reverted to revision 1.105:
http://www.netbsd.org/~rmind/hbtest.2.txt
Seems to be fine? I am missing something.
--
Best regards,
Mindaugas
From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: ad@NetBSD.org
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: kern/40750 (hackbench screws up 5.0 kernel)
Date: Sun, 28 Jun 2009 20:57:28 +0100
rmind@NetBSD.org wrote:
> Synopsis: hackbench screws up 5.0 kernel
>
> State-Changed-From-To: open->feedback
> State-Changed-By: rmind@NetBSD.org
> State-Changed-When: Sun, 29 Mar 2009 05:08:21 +0000
> State-Changed-Why:
> Might not be a problem in 5.0 RC3. Need more information.
>
Have you ever seen it again? Shall we close this PR?
--
Mindaugas
State-Changed-From-To: feedback->closed
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Sat, 29 Aug 2009 17:29:54 +0000
State-Changed-Why:
Could not reproduce, close PR.
From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/40750 CVS commit: src/sys/kern
Date: Sat, 19 Dec 2009 18:25:55 +0000
Module Name: src
Committed By: rmind
Date: Sat Dec 19 18:25:55 UTC 2009
Modified Files:
src/sys/kern: sys_sig.c
Log Message:
sigtimedwait: fix a memory leak (which happens since newlock2 times).
Allocate ksiginfo on stack since it is safe and sigget() assumes that it is
not allocated from pool (pending signals via sigput()/sigget() "mill" should
be dynamically allocated, however). Might be useful to revisit later.
Likely the cause of PR/40750 and indirect cause of PR/39283.
To generate a diff of this commit:
cvs rdiff -u -r1.23 -r1.24 src/sys/kern/sys_sig.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
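
The kind of leak the log message describes can be illustrated with a toy
model. This is not the NetBSD code: the Pool class and both wait functions
are hypothetical stand-ins, written only to show how a record taken from a
pool and never returned drives up the outstanding count, while a record
that lives only for the duration of the call (analogous to a stack
variable) leaves the pool untouched.

```python
class Pool:
    """Toy stand-in for a kernel object pool; it only counts gets and puts."""

    def __init__(self, name):
        self.name = name
        self.requests = 0
        self.releases = 0

    def get(self):
        self.requests += 1
        return {}  # stand-in for a ksiginfo-like record

    def put(self, item):
        self.releases += 1

    @property
    def outstanding(self):
        return self.requests - self.releases


def wait_leaky(pool, signo):
    # Leaky pattern: the record comes from the pool and is never put back.
    info = pool.get()
    info["signo"] = signo
    return info["signo"]


def wait_local(pool, signo):
    # Local pattern: the record exists only for this call, so the pool
    # is never touched and nothing can be left outstanding.
    info = {"signo": signo}
    return info["signo"]


leaky, fixed = Pool("leaky"), Pool("fixed")
for _ in range(1000):
    wait_leaky(leaky, 10)
    wait_local(fixed, 10)

print(leaky.outstanding, fixed.outstanding)
```

In the leaky variant each call adds one outstanding item, which matches the
shape of a pool whose Requests keep growing while Releases stay at zero.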
From: Soren Jacobsen <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/40750 CVS commit: [netbsd-5] src/sys/kern
Date: Thu, 7 Jan 2010 07:04:51 +0000
Module Name: src
Committed By: snj
Date: Thu Jan 7 07:04:51 UTC 2010
Modified Files:
src/sys/kern [netbsd-5]: sys_sig.c
Log Message:
Pull up following revision(s) (requested by rmind in ticket #1199):
sys/kern/sys_sig.c: revision 1.24
sigtimedwait: fix a memory leak (which happens since newlock2 times).
Allocate ksiginfo on stack since it is safe and sigget() assumes that it is
not allocated from pool (pending signals via sigput()/sigget() "mill" should
be dynamically allocated, however). Might be useful to revisit later.
Likely the cause of PR/40750 and indirect cause of PR/39283.
To generate a diff of this commit:
cvs rdiff -u -r1.17.4.2 -r1.17.4.3 src/sys/kern/sys_sig.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Soren Jacobsen <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/40750 CVS commit: [netbsd-5-0] src/sys/kern
Date: Thu, 7 Jan 2010 07:08:34 +0000
Module Name: src
Committed By: snj
Date: Thu Jan 7 07:08:34 UTC 2010
Modified Files:
src/sys/kern [netbsd-5-0]: sys_sig.c
Log Message:
Pull up following revision(s) (requested by rmind in ticket #1199):
sys/kern/sys_sig.c: revision 1.24
sigtimedwait: fix a memory leak (which happens since newlock2 times).
Allocate ksiginfo on stack since it is safe and sigget() assumes that it is
not allocated from pool (pending signals via sigput()/sigget() "mill" should
be dynamically allocated, however). Might be useful to revisit later.
Likely the cause of PR/40750 and indirect cause of PR/39283.
To generate a diff of this commit:
cvs rdiff -u -r1.17.4.2 -r1.17.4.2.2.1 src/sys/kern/sys_sig.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
>Unformatted:
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.