NetBSD Problem Report #45718

From campbell@mumble.net  Sat Dec 17 05:21:31 2011
Return-Path: <campbell@mumble.net>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id E7F2763BF13
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 17 Dec 2011 05:21:30 +0000 (UTC)
Message-Id: <20111217052131.F1BA3982AB@pluto.mumble.net>
Date: Sat, 17 Dec 2011 05:21:31 +0000 (UTC)
From: Taylor R Campbell <campbell+netbsd@mumble.net>
Reply-To: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@gnats.NetBSD.org
Subject: processes sometimes get stuck and spin in vm_map
X-Send-Pr-Version: 3.95

>Number:         45718
>Category:       kern
>Synopsis:       processes sometimes get stuck and spin in vm_map
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Dec 17 05:25:00 +0000 2011
>Closed-Date:    Tue Jan 31 23:49:47 +0000 2023
>Last-Modified:  Tue Jan 31 23:49:47 +0000 2023
>Originator:     Taylor R Campbell <campbell+netbsd@mumble.net>
>Release:        NetBSD 5.99.58
>Organization:
>Environment:
System: NetBSD oberon.local 5.99.58 NetBSD 5.99.58 (RIAMONOHACK) #4: Sat Dec 10 23:18:29 UTC 2011 root@oberon.local:/home/riastradh/netbsd/current/obj/sys/arch/i386/compile/RIAMONOHACK i386
Architecture: i386
Machine: i386
>Description:

	Over the past six to eight months (maybe longer), I've
	occasionally noticed that after a lot of heavy file system
	activity, some processes seem to get stuck.  They occupy some
	CPU time, but spend most of their time in vm_map.

	Tonight I tried to create a null mount, and mount_null got
	stuck like this; ^C did nothing, and umount wedged too.  A
	little while later, mount_null seemed to get unstuck, but when
	I tried again, it got stuck again.  Now several processes,
	including mount_null, zsh, and xargs, are stuck like this.

21204 root     117    0  2904K  728K vm_map/0   2:05 16.41% 16.41% mount_null

	In the past, the only way I have recovered from this situation
	is by rebooting.

	Memory is not scarce; I have a good half a gigabyte of RAM
	free, out of two gigabytes total, and plenty of swap handy.

	I haven't observed a pattern to what processes get stuck like
	this, although if there are file system mounts involved (which
	there often are because I use null mounts all the time), and
	any process is stuck trying to do something like `df -h /mnt',
	then any other process will get stuck if it tries to handle
	/mnt too.

>How-To-Repeat:

	Not sure.  Try a few concurrent rsyncs, cvs updates, finds,
	tars, &c.

>Fix:

	Yes, please!

>Release-Note:

>Audit-Trail:

State-Changed-From-To: open->feedback
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Tue, 20 Dec 2011 11:28:03 +0000
State-Changed-Why:
Can you get into DDB (can use crash(8) as well) and grab (ps/l, t/a) the
backtrace of those LWPs?  Also, the output of 'show uvm'.


From: rudolf <netbsd@eq.cz>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45718 (processes sometimes get stuck and spin in vm_map)
Date: Fri, 24 Feb 2012 15:48:22 +0100

 I've recently observed the same phenomenon.

 "ps /l" in DDB revealed 4 lwps stuck in vm_map: xulrunner-bin, ksh, 
 login, cron.

 "bt /a <lwp>" for xulrunner-bin:
 sleepq_block
 cv_timedwait
 uvm_map_prepare
 uvm_map
 uvm_km_alloc
 uarea_poolpage_alloc
 pool_allocator_alloc
 pool_grow
 pool_get
 pool_cache_get_slow
 pool_cache_get_paddr
 uvm_uarea_alloc
 sys__lwp_create
 sy_call
 syscall

 "bt /a <lwp>" for the other three lwps looked the same, with "fork1, 
 sys_fork" instead of "sys__lwp_create". Unfortunately, I did not
 collect the "show uvm" output; maybe next time.

 My environment: i386 with netbsd-5 userland and netbsd-6 MONOLITHIC 
 kernel updated 2012/02/22, with PAE, 8 GB RAM.

 r.

From: rudolf <netbsd@eq.cz>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45718 (processes sometimes get stuck and spin in vm_map)
Date: Mon, 27 Feb 2012 18:10:10 +0100

 This time 4 lwps in vm_map, 2x sh, 2x xulrunner-bin, all with the same 
 backtrace:
 sleepq_block
 cv_timedwait
 uvm_map_prepare
 uvm_map
 uvm_km_alloc
 pipe_ctor
 pool_cache_get_slow
 pool_cache_get_paddr
 pipe_create
 pipe1
 sys_pipe
 sy_call
 syscall

 Current UVM status:
 pagesize=4096, pagemask=0xfff, pageshift=12, ncolors=64
 2015729 VM pages: 445408 active, 0 inactive, 1479 wired, 1481348 free
 pages 135016 anon, 296929 file, 14942 exec
 freemin=512 free-target=682, wired-max=671909
 cpu0:
 faults=7976773, traps=7613655, intrs=6473188, ctxswitch=14547360
 softint=5749848, syscalls=136291738
 cpu1:
 faults=7132353, traps=6663764, intrs=0, ctxswitch=13848441
 softint=2613137, syscalls=103586607
 fault counts:
 noram=0, noanon=0, pgwait=0, pgrele=0
 ok relocks(total)=6605(6638)
 anget(retrys)=1330518(0)
 amapcopy=712393
 neighbor anon/obj pg=77203/1108057, gets(lock/unlock)=507179/6638
 cases: anon=1309558, anoncow=20966, obj=454788, prcopy=52356,
 przero=12489369
 daemon and swap counts:
 nswapdev=1, swpgavail=524457, swpages=524457, all other attributes = 0.

 r.

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org,
	gnats-admin@netbsd.org, rmind@NetBSD.org
Subject: Re: kern/45718 (processes sometimes get stuck and spin in vm_map)
Date: Sat, 14 Apr 2012 16:57:01 +0000

 I caught some processes wedged in vm_map, although they don't seem to
 be spinning -- just wedged.  ps(1) reports these processes as zombies;
 top(1) does not report their taking any CPU time.  One is a subprocess
 of newsyslog; the other is a subprocess of git-pull.

 crash> t/a ca73daa0
 trace: pid 25970 lid 1 at 0xdc10781c
 sleepq_block(64,0,c0c5fd9d,c0ce63d0,0,0,0,c08e40d3,0,0) at sleepq_block+0xda
 cv_timedwait(c0d291d0,c0d291cc,64,dc1078e8,c0d28fe0,ffffffff,ffffffff,0,801727,c66ef880) at cv_timedwait+0x126
 uvm_map_prepare(c0d291c0,c0000000,40000,c0d28fe0,ffffffff,ffffffff,0,801727,dc107928,dc107b54) at uvm_map_prepare+0x167
 uvm_map(c0d291c0,dc1079b0,40000,c0d28fe0,ffffffff,ffffffff,0,801727,800002,0) at uvm_map+0x78
 uvm_km_alloc(c0d291c0,40000,0,800002,c0d15ea0,1,dc107a4c,c07e117a,c0d15ea0,1) at uvm_km_alloc+0xe6
 exec_pool_alloc(c0d15ea0,1,dc107a0c,c09548fd,0,0,0,0,0,0) at exec_pool_alloc+0x2b
 pool_grow(c0d15f14,1,c385ac12,0,0,c054,ca5ff000,c385ac09,9,c0d15f18) at pool_grow+0x2a
 pool_get(c0d15ea0,1,ce4e4a80,c098bb91,0,c31eec40,dc107adc,c056764b,c3137d00,c3137bc0) at pool_get+0x79
 execve_loadvm(8063f44,c055c480,dc107b3c,c3137d00,ca73daa0,0,8063f1c,c3ec4400,c4519400,c380e000) at execve_loadvm+0x1da
 execve1(ca73daa0,8063f1c,8063f3c,8063f44,c055c480,6300,dc107d1c,0,c06ba833,cd722744) at execve1+0x32
 sys_execve(ca73daa0,dc107cf4,dc107d1c,c08096f0,0,cdb17e3c,c0c8b800,dc107d30,c06babb9,cd722730) at sys_execve+0x30
 syscall(dc107d48,b3,ab,bfbf001f,806001f,8063f1c,8063f3c,bfbfec48,8063f1c,7d7b7cff) at syscall+0x95

 crash> t/a ccc79540
 trace: pid 21497 lid 1 at 0xdc32581c
 sleepq_block(64,0,c0c5fd9d,c0ce63d0,0,0,0,c08e40d3,0,0) at sleepq_block+0xda
 cv_timedwait(c0d291d0,c0d291cc,64,dc3258e8,c0d28fe0,ffffffff,ffffffff,0,801727,c66ef880) at cv_timedwait+0x126
 uvm_map_prepare(c0d291c0,c0000000,40000,c0d28fe0,ffffffff,ffffffff,0,801727,dc325928,dc325b54) at uvm_map_prepare+0x167
 uvm_map(c0d291c0,dc3259b0,40000,c0d28fe0,ffffffff,ffffffff,0,801727,800002,0) at uvm_map+0x78
 uvm_km_alloc(c0d291c0,40000,0,800002,c0d15ea0,1,dc325a4c,c07e117a,c0d15ea0,1) at uvm_km_alloc+0xe6
 exec_pool_alloc(c0d15ea0,1,dc325a0c,c09548fd,0,0,0,0,0,0) at exec_pool_alloc+0x2b
 pool_grow(c0d15f14,1,c387f80b,0,0,c054,cd15fa80,c387f809,2,c0d15f18) at pool_grow+0x2a
 pool_get(c0d15ea0,1,c88faae0,c057c7e1,cda0986c,10,0,c3889700,ccc79540,0) at pool_get+0x79
 execve_loadvm(bb92d404,c055c480,dc325b3c,c38b3e80,c30822f0,0,bb92d48c,c3ec5c00,c3ec4800,ca9b9800) at execve_loadvm+0x1da
 execve1(ccc79540,bb92d48c,8063f9c,bb92d404,c055c480,106,dc325d3c,c0840430,0,0) at execve1+0x32
 sys_execve(ccc79540,dc325cf4,dc325d1c,c08096f0,0,cd722a78,c3663000,cda09868,0,bbb81010) at sys_execve+0x30
 syscall(dc325d48,bb9200b3,ab,bfbf001f,bbbb001f,bb92d48c,8063f9c,bfbfe1f8,bb92d48c,bb92d49c) at syscall+0x95

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org,
	netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, rmind@NetBSD.org
Subject: Re: kern/45718 (processes sometimes get stuck and spin in vm_map)
Date: Sat, 14 Apr 2012 17:01:52 +0000

 Forgot to add: `show uvm' in crash(8) produces no output.  Also, this
 is under a 6.99.4 kernel from about 20120325 with a ~5.1ish userland
 (and crash(8) from a 6.99.4ish userland in a chroot).

From: matthew green <mrg@eterna.com.au>
To: Taylor R Campbell <campbell+netbsd@mumble.net>
Cc: kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org,
    gnats-admin@netbsd.org, rmind@NetBSD.org, gnats-bugs@NetBSD.org,
    oster@netbsd.org, martin@netbsd.org
Subject: re: kern/45718 (processes sometimes get stuck and spin in vm_map)
Date: Sun, 15 Apr 2012 05:14:06 +1000

 > I caught some processes wedged in vm_map, although they don't seem to
 > be spinning -- just wedged.  ps(1) reports these processes as zombies;
 > top(1) does not report their taking any CPU time.  One is a subprocess
 > of newsyslog; the other is a subprocess of git-pull.
 > 
 > crash> t/a ca73daa0
 > trace: pid 25970 lid 1 at 0xdc10781c
 > sleepq_block(64,0,c0c5fd9d,c0ce63d0,0,0,0,c08e40d3,0,0) at sleepq_block+0xda
 > cv_timedwait(c0d291d0,c0d291cc,64,dc1078e8,c0d28fe0,ffffffff,ffffffff,0,801727,c66ef880) at cv_timedwait+0x126
 > uvm_map_prepare(c0d291c0,c0000000,40000,c0d28fe0,ffffffff,ffffffff,0,801727,dc107928,dc107b54) at uvm_map_prepare+0x167
 > uvm_map(c0d291c0,dc1079b0,40000,c0d28fe0,ffffffff,ffffffff,0,801727,800002,0) at uvm_map+0x78
 > uvm_km_alloc(c0d291c0,40000,0,800002,c0d15ea0,1,dc107a4c,c07e117a,c0d15ea0,1) at uvm_km_alloc+0xe6
 > exec_pool_alloc(c0d15ea0,1,dc107a0c,c09548fd,0,0,0,0,0,0) at exec_pool_alloc+0x2b
 > pool_grow(c0d15f14,1,c385ac12,0,0,c054,ca5ff000,c385ac09,9,c0d15f18) at pool_grow+0x2a
 > pool_get(c0d15ea0,1,ce4e4a80,c098bb91,0,c31eec40,dc107adc,c056764b,c3137d00,c3137bc0) at pool_get+0x79
 > execve_loadvm(8063f44,c055c480,dc107b3c,c3137d00,ca73daa0,0,8063f1c,c3ec4400,c4519400,c380e000) at execve_loadvm+0x1da
 > execve1(ca73daa0,8063f1c,8063f3c,8063f44,c055c480,6300,dc107d1c,0,c06ba833,cd722744) at execve1+0x32
 > sys_execve(ca73daa0,dc107cf4,dc107d1c,c08096f0,0,cdb17e3c,c0c8b800,dc107d30,c06babb9,cd722730) at sys_execve+0x30
 > syscall(dc107d48,b3,ab,bfbf001f,806001f,8063f1c,8063f3c,bfbfec48,8063f1c,7d7b7cff) at syscall+0x95

 this looks like the execargs pool leak that greg oster has a patch for.
 (reproduced below.)


 .mrg.

 Index: kern_exec.c
 ===================================================================
 RCS file: /cvsroot/src/sys/kern/kern_exec.c,v
 retrieving revision 1.349
 diff -u -p -r1.349 kern_exec.c
 --- kern_exec.c	9 Apr 2012 19:42:06 -0000	1.349
 +++ kern_exec.c	13 Apr 2012 20:28:14 -0000
 @@ -1991,6 +1991,8 @@ spawn_return(void *arg)
  		rw_exit(&exec_lock);
  	}

 +	execve_free_data(&spawn_data->sed_exec);
 +
  	/* release our refcount on the data */
  	spawn_exec_data_release(spawn_data);


From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: matthew green <mrg@eterna.com.au>
Cc: kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org,
	gnats-admin@netbsd.org, rmind@NetBSD.org,
	gnats-bugs@NetBSD.org, oster@netbsd.org, martin@netbsd.org
Subject: Re: kern/45718 (processes sometimes get stuck and spin in vm_map)
Date: Sat, 14 Apr 2012 19:23:58 +0000

    Date: Sun, 15 Apr 2012 05:14:06 +1000
    From: matthew green <mrg@eterna.com.au>

    this looks like the execargs pool leak that greg oster has a patch for.
    (reproduced below.)

    @@ -1991,6 +1991,8 @@ spawn_return(void *arg)
                    rw_exit(&exec_lock);
            }

    +	execve_free_data(&spawn_data->sed_exec);

 Isn't that code path reachable only if someone uses posix_spawn?
 Certainly nothing in a 5.1 userland will use posix_spawn, and although
 I suppose my having run the 6.99.4 tests in a chroot might have
 triggered it, it wouldn't explain the earlier problem I reported,
 because I believe I reported that before posix_spawn was imported.

From: Greg Oster <oster@cs.usask.ca>
To: matthew green <mrg@eterna.com.au>
Cc: Taylor R Campbell <campbell+netbsd@mumble.net>,
 kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org, gnats-admin@netbsd.org,
 rmind@NetBSD.org, gnats-bugs@NetBSD.org, oster@netbsd.org,
 martin@netbsd.org
Subject: Re: kern/45718 (processes sometimes get stuck and spin in vm_map)
Date: Sat, 14 Apr 2012 22:55:25 -0600

 On Sun, 15 Apr 2012 05:14:06 +1000
 matthew green <mrg@eterna.com.au> wrote:

 > 
 > > I caught some processes wedged in vm_map, although they don't seem
 > > to be spinning -- just wedged.  ps(1) reports these processes as
 > > zombies; top(1) does not report their taking any CPU time.  One is
 > > a subprocess of newsyslog; the other is a subprocess of git-pull.
 > > 
 > > crash> t/a ca73daa0
 > > trace: pid 25970 lid 1 at 0xdc10781c
 > > sleepq_block(64,0,c0c5fd9d,c0ce63d0,0,0,0,c08e40d3,0,0) at
 > > sleepq_block+0xda
 > > cv_timedwait(c0d291d0,c0d291cc,64,dc1078e8,c0d28fe0,ffffffff,ffffffff,0,801727,c66ef880)
 > > at cv_timedwait+0x126
 > > uvm_map_prepare(c0d291c0,c0000000,40000,c0d28fe0,ffffffff,ffffffff,0,801727,dc107928,dc107b54)
 > > at uvm_map_prepare+0x167
 > > uvm_map(c0d291c0,dc1079b0,40000,c0d28fe0,ffffffff,ffffffff,0,801727,800002,0)
 > > at uvm_map+0x78
 > > uvm_km_alloc(c0d291c0,40000,0,800002,c0d15ea0,1,dc107a4c,c07e117a,c0d15ea0,1)
 > > at uvm_km_alloc+0xe6
 > > exec_pool_alloc(c0d15ea0,1,dc107a0c,c09548fd,0,0,0,0,0,0) at
 > > exec_pool_alloc+0x2b
 > > pool_grow(c0d15f14,1,c385ac12,0,0,c054,ca5ff000,c385ac09,9,c0d15f18)
 > > at pool_grow+0x2a
 > > pool_get(c0d15ea0,1,ce4e4a80,c098bb91,0,c31eec40,dc107adc,c056764b,c3137d00,c3137bc0)
 > > at pool_get+0x79
 > > execve_loadvm(8063f44,c055c480,dc107b3c,c3137d00,ca73daa0,0,8063f1c,c3ec4400,c4519400,c380e000)
 > > at execve_loadvm+0x1da
 > > execve1(ca73daa0,8063f1c,8063f3c,8063f44,c055c480,6300,dc107d1c,0,c06ba833,cd722744)
 > > at execve1+0x32
 > > sys_execve(ca73daa0,dc107cf4,dc107d1c,c08096f0,0,cdb17e3c,c0c8b800,dc107d30,c06babb9,cd722730)
 > > at sys_execve+0x30
 > > syscall(dc107d48,b3,ab,bfbf001f,806001f,8063f1c,8063f3c,bfbfec48,8063f1c,7d7b7cff)
 > > at syscall+0x95
 > 
 > this looks like the execargs pool leak that greg oster has a patch
 > for. (reproduced below.)
 > 
 > 
 > .mrg.
 > 
 > Index: kern_exec.c
 > ===================================================================
 > RCS file: /cvsroot/src/sys/kern/kern_exec.c,v
 > retrieving revision 1.349
 > diff -u -p -r1.349 kern_exec.c
 > --- kern_exec.c	9 Apr 2012 19:42:06 -0000	1.349
 > +++ kern_exec.c	13 Apr 2012 20:28:14 -0000
 > @@ -1991,6 +1991,8 @@ spawn_return(void *arg)
 >  		rw_exit(&exec_lock);
 >  	}
 >  
 > +	execve_free_data(&spawn_data->sed_exec);
 > +
 >  	/* release our refcount on the data */
 >  	spawn_exec_data_release(spawn_data);
 >  

 Actually, info from Martin leads me to believe that the
 execve_free_data() needs to go into the: 

  if (have_reflock) {

  }

 bit just above here... I've run tests with that, and they all pass
 too...

 Later...

 Greg Oster

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45718 (processes sometimes get stuck and spin in vm_map)
Date: Sun, 15 Apr 2012 16:49:03 +0200

 On Sat, Apr 14, 2012 at 07:15:06PM +0000, matthew green wrote:
 >  this looks like the execargs pool leak that greg oster has a patch for.

 You should see a log message "should not happen" if this were the case.

 However, if we go through exec_pool_alloc, we can't have hit the hard limit,
 can we?

 Martin

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org,
	gnats-admin@netbsd.org, rmind@NetBSD.org
Subject: Re: kern/45718 (processes sometimes get stuck and spin in vm_map)
Date: Wed, 9 May 2012 00:28:25 +0000

 Caught another one, different stack trace this time.  Forgot to run
 `show uvm', sorry.

 crash> bt/t 0t24257
 trace: pid 24257 lid 1 at 0xfbd0d85c
 sleepq_block(64,0,c0c5fd9d,c0ce63d0,0,0,0,c08e40d3,0,0) at sleepq_block+0xda
 cv_timedwait(c0d291d0,c0d291cc,64,fbd0d928,0,ffffffff,ffffffff,0,801727,c9a83200) at cv_timedwait+0x126
 uvm_map_prepare(c0d291c0,c0000000,3000,0,ffffffff,ffffffff,0,801727,fbd0d968,10) at uvm_map_prepare+0x167
 uvm_map(c0d291c0,fbd0d9f0,3000,0,ffffffff,ffffffff,0,801727,800001,0) at uvm_map+0x78
 uvm_km_alloc(c0d291c0,3000,0,800001,c30ba6c0,1,fbd0da8c,c07e117a,c30ba6c0,1) at uvm_km_alloc+0xe6
 uarea_poolpage_alloc(c30ba6c0,1,fbd0da4c,c09548fd,0,0,0,0,0,0) at uarea_poolpage_alloc+0x41
 pool_grow(c30ba734,c07e1c4e,6,c0d7436c,6,0,c07e1b13,c30ba7bc,cce4fd20,c30ba738) at pool_grow+0x2a
 pool_get(c30ba6c0,1,fbd0db2c,c056697e,0,1,28b8d000,5,21,0) at pool_get+0x79
 pool_cache_get_slow(0,1,0,c08de98e,0,0,1,3,fbd0dc7c,0) at pool_cache_get_slow+0x19a
 pool_cache_get_paddr(c30ba6c0,1,0,0,fbd0dc40,cce4fd20,fbd0dc6c,c05619e5,0,1) at pool_cache_get_paddr+0x21c
 uvm_uarea_alloc(0,1,c6fde044,e9,0,0,ffffffff,3f8,cce4fd20,c4436a40) at uvm_uarea_alloc+0x23
 fork1(cce4fd20,3,14,0,0,0,0,fbd0dd1c,0,106) at fork1+0x115
 sys___vfork14(cce4fd20,fbd0dcf4,fbd0dd1c,c08096f0,7d49f000,c0c8b800,cce4fd20,c838a350,0,8054095) at sys___vfork14+0x50
 syscall(fbd0dd48,b3,bbbb00ab,bfbf001f,806001f,0,0,bfbfe678,bb9102fc,bb9102fc) at syscall+0x95

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org,
	netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, rmind@NetBSD.org
Subject: Re: kern/45718 (processes sometimes get stuck and spin in vm_map)
Date: Sat, 23 Jun 2012 15:27:06 +0000

 I caught some more processes wedged in vm_map today.  I wasn't able to
 get a stack trace because my attempt to chroot into a recent enough
 userland to use crash(8) also wedged in vm_map.

 But then after five minutes, all the processes that were wedged in
 vm_map spontaneously unwedged and continued to make progress!

 When I next looked, there was nothing wedged in vm_map.

From: "Jonathan A. Kollasch" <jakllsch@kollasch.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45718: processes sometimes get stuck and spin in vm_map
Date: Sat, 1 Sep 2012 14:15:44 -0500

 I've been seeing the hangs of the exec_pool_alloc() variety in 6.0RC1 on
 i386.  Using  dd if=/dev/zero of=/dev/null count=1 bs=1G  (machine has
 about 2G of RAM, sometimes needs to be more than 1G though) tends to
 unwedge everything that gets stuck in vm_map wait channel here.

From: Chuck Silvers <chuq@chuq.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45718: processes sometimes get stuck and spin in vm_map
Date: Sat, 1 Sep 2012 17:48:23 -0700

 On Sat, Sep 01, 2012 at 07:20:05PM +0000, Jonathan A. Kollasch wrote:
 >  I've been seeing the hangs of the exec_pool_alloc() variety in 6.0RC1 on
 >  i386.  Using  dd if=/dev/zero of=/dev/null count=1 bs=1G  (machine has
 >  about 2G of RAM, sometimes needs to be more than 1G though) tends to
 >  unwedge everything that gets stuck in vm_map wait channel here.

 this sounds like KVA exhaustion.  this was a problem on i386 some years ago
 and I don't remember whether it was ever really fixed.  I vaguely recall
 there was a particular PR that was the main means of tracking that issue
 but I can't find it now.

 -Chuck

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45718: processes sometimes get stuck and spin in vm_map
Date: Sun, 2 Sep 2012 03:45:14 +0000

 On Sun, Sep 02, 2012 at 12:50:04AM +0000, Chuck Silvers wrote:
  >  On Sat, Sep 01, 2012 at 07:20:05PM +0000, Jonathan A. Kollasch wrote:
  >  >  I've been seeing the hangs of the exec_pool_alloc() variety in 6.0RC1 on
  >  >  i386.  Using  dd if=/dev/zero of=/dev/null count=1 bs=1G  (machine has
  >  >  about 2G of RAM, sometimes needs to be more than 1G though) tends to
  >  >  unwedge everything that gets stuck in vm_map wait channel here.
  >  
  >  this sounds like KVA exhaustion.  this was a problem on i386 some years ago
  >  and I don't remember whether it was ever really fixed.  I vaguely recall
  >  there was a particular PR that was the main means of tracking that issue
  >  but I can't find it now.

 Do you remember anything else about that PR? I can't find it from
 obvious keywords. Is it likely to have been closed?

 -- 
 David A. Holland
 dholland@netbsd.org

From: Chuck Silvers <chuq@chuq.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45718: processes sometimes get stuck and spin in vm_map
Date: Tue, 11 Sep 2012 07:07:40 -0700

 On Sun, Sep 02, 2012 at 03:50:05AM +0000, David Holland wrote:
 >  Do you remember anything else about that PR? I can't find it from
 >  obvious keywords. Is it likely to have been closed?

 I think I found the one I was thinking of: 33185

 -Chuck

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45718: processes sometimes get stuck and spin in vm_map
Date: Sun, 9 Dec 2012 06:13:43 +0000

 Every once in a while I still find processes stuck like this.  Tonight
 I investigated closer with the help of crash(8).  Here are some clues
 I've gathered:

 - The processes that are stuck are all stuck in the cv_timedwait in
 uvm_map_prepare deep inside exec.

 - I can still exec new processes, so the problem is *not* simply kva
 exhaustion causing the uvm_km_alloc in exec_pool_alloc to hang
 indefinitely waiting for kva -- all the requests for kva are for the
 same size, so if I can make new requests, the old ones should be
 serviceable too.

 - Examination of the lwp structures and their l_timeout_ch members
 reveals that the callouts for cv_timedwait are firing (the callouts'
 c_time values keep changing), so it's not that the sleepq mechanism is
 stuck or anything.

 - Examination of kernel_map itself reveals that UVM_MAP_WANTVA is
 persistently flagged.

 So it looks like something freed up kva, but failed to signal to the
 waiters that the kva was freed up.

 I looked around for a race condition surrounding map->flags, but
 although there is a wacky locking dance surrounding vm_maps, I didn't
 see anything obvious there: uvm_unmap_remove looks like it does the
 right thing to signal to the waiters.  However, I suspect that
 uvm_map_replace and uvm_map_extract can free up space in the map, and
 neither of them signals UVM_MAP_WANTVA waiters.

 There may be other places in uvm_map.c that free up space -- it's huge
 and I haven't gone through it all.  Is it plausible that at least
 uvm_map_replace and uvm_map_extract, and perhaps other parts of
 uvm_map.c as well, need to signal UVM_MAP_WANTVA waiters?


 Caveat: I am currently using a kernel from back in March, because
 something broke related to drm in more recent kernels, but uvm hasn't
 changed much lately, so I suspect the problem is still here.

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45718: processes sometimes get stuck and spin in vm_map
Date: Sun, 9 Dec 2012 07:25:25 +0000

    Date: Sun, 9 Dec 2012 06:13:43 +0000
    From: Taylor R Campbell <campbell+netbsd@mumble.net>

    - Examination of kernel_map itself reveals that UVM_MAP_WANTVA is
    persistently flagged.

    So it looks like something freed up kva, but failed to signal to the
    waiters that the kva was freed up.

 Manually clearing UVM_MAP_WANTVA in kernel_map->flags by writing to
 /dev/kmem failed to unwedge the processes that were wedged in vm_map,
 so there goes that hypothesis.

    - I can still exec new processes, so the problem is *not* simply kva
    exhaustion causing the uvm_km_alloc in exec_pool_alloc to hang
    indefinitely waiting for kva -- all the requests for kva are for the
    same size, so if I can make new requests, the old ones should be
    serviceable too.

 This analysis was a little too simple-minded.  It may be that most new
 execs are serviced by exec_pool without going through exec_pool_alloc,
 but every one that does go through exec_pool_alloc gets wedged in vm_map.
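
One way to read the observations so far (the cv_timedwait callouts keep
firing, the flag stays set, and clearing it by hand changes nothing) is a
waiter that wakes on each timeout, retries, fails, and sets the flag again.
The following is only a toy, single-threaded model of that reading, not the
uvm_map_prepare code: toy_map, TOY_WANTVA, toy_try_alloc and toy_alloc_retry
are hypothetical names, and the 0x40000 request size used in the usage note
is the one visible in the uvm_km_alloc frames of the backtraces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_WANTVA 0x1u	/* hypothetical stand-in for the "want VA" map flag */

/* Toy map: tracks only the flag and the largest contiguous free range. */
struct toy_map {
	unsigned flags;
	size_t largest_gap;	/* bytes */
};

/* One allocation attempt; on failure, record that a waiter wants space. */
static bool
toy_try_alloc(struct toy_map *m, size_t size)
{
	if (size <= m->largest_gap) {
		m->largest_gap -= size;
		return true;
	}
	m->flags |= TOY_WANTVA;
	return false;
}

/*
 * Retry after each simulated timeout, at most max_timeouts times.
 * Returns the number of timeouts that elapsed before success, or -1
 * if the request still did not fit after the last retry.
 */
static int
toy_alloc_retry(struct toy_map *m, size_t size, int max_timeouts)
{
	int t;

	assert(m != NULL && max_timeouts >= 0);
	for (t = 0; t <= max_timeouts; t++) {
		if (toy_try_alloc(m, size))
			return t;
		/* A real waiter would sleep here until signalled or timed out. */
	}
	return -1;
}
```

With largest_gap = 0x30000 and a 0x40000 request, toy_alloc_retry never
succeeds, and the flag is set again on the next retry even if it is cleared
by hand in between. Only a change to the free space itself lets the request
through.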

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45718: processes sometimes get stuck and spin in vm_map
Date: Sun, 9 Dec 2012 08:12:19 +0000

    Date: Sun, 9 Dec 2012 07:25:25 +0000
    From: Taylor R Campbell <campbell+netbsd@mumble.net>

       Date: Sun, 9 Dec 2012 06:13:43 +0000
       From: Taylor R Campbell <campbell+netbsd@mumble.net>

       - I can still exec new processes, so the problem is *not* simply kva
       exhaustion causing the uvm_km_alloc in exec_pool_alloc to hang
       indefinitely waiting for kva -- all the requests for kva are for the
       same size, so if I can make new requests, the old ones should be
       serviceable too.

    This analysis was a little too simple-minded.  It may be that most new
    execs are serviced by exec_pool without going through exec_pool_alloc,
    but every one that does go through exec_pool_alloc gets wedged in
    [vm_map].

 It also seems that although there is a hard limit on the number of
 objects currently pooled in exec_pool, namely maxexec (16 on my
 system), there is no limit on the number of pending calls to the
 pool's allocator.  This strikes me as suboptimal...

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45718: processes sometimes get stuck and spin in vm_map
Date: Sun, 9 Dec 2012 08:43:58 +0000

 pmap(8) on the kernel_map (`pmap -V 0x87654321' with the address of
 kernel_map as shown by crash(8)) shows that kva is fragmented enough
 that there is indeed no empty space of at least #x40000 bytes, and
 that every request to exec_pool requiring a new kva allocation is
 guaranteed to fail.
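
The fragmentation check described above can be sketched as a scan over the
allocated ranges of a map, tracking both the total free space and the largest
single gap. This is a minimal illustration under stated assumptions, not what
pmap(8) or uvm does internally: toy_entry and toy_largest_gap are hypothetical
names, and the entries are assumed to be sorted by start address and
non-overlapping.

```c
#include <assert.h>
#include <stddef.h>

/* One allocated range [start, end) in a toy address space. */
struct toy_entry {
	unsigned long start;
	unsigned long end;
};

/*
 * Return the size of the largest free gap inside [lo, hi), given n
 * allocated entries sorted by start address and not overlapping.
 * If total_free is non-NULL, also store the sum of all free gaps.
 */
static unsigned long
toy_largest_gap(const struct toy_entry *e, size_t n,
    unsigned long lo, unsigned long hi, unsigned long *total_free)
{
	unsigned long cursor = lo, largest = 0, total = 0, gap;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Precondition: sorted, non-overlapping, inside [lo, hi). */
		assert(e[i].start >= cursor);
		assert(e[i].end >= e[i].start && e[i].end <= hi);

		gap = e[i].start - cursor;	/* free space before this entry */
		total += gap;
		if (gap > largest)
			largest = gap;
		cursor = e[i].end;
	}

	gap = hi - cursor;			/* free space after the last entry */
	total += gap;
	if (gap > largest)
		largest = gap;

	if (total_free != NULL)
		*total_free = total;
	return largest;
}
```

For example, with entries [0x10000,0x40000), [0x70000,0x90000) and
[0xc0000,0xe0000) in a space of [0,0x100000), the total free space is 0x90000
bytes but the largest gap is only 0x30000, so a 0x40000-byte request cannot be
satisfied even though more than twice that much is free in total.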

From: David Laight <david@l8s.co.uk>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/45718: processes sometimes get stuck and spin in vm_map
Date: Sun, 9 Dec 2012 18:22:22 +0000

 On Sun, Dec 09, 2012 at 08:45:03AM +0000, Taylor R Campbell wrote:
 > The following reply was made to PR kern/45718; it has been noted by GNATS.
 > 
 > From: Taylor R Campbell <campbell+netbsd@mumble.net>
 > To: gnats-bugs@NetBSD.org
 > Cc: 
 > Subject: Re: kern/45718: processes sometimes get stuck and spin in vm_map
 > Date: Sun, 9 Dec 2012 08:43:58 +0000
 > 
 >  pmap(8) on the kernel_map (`pmap -V 0x87654321' with the address of
 >  kernel_map as shown by crash(8)) shows that kva is fragmented enough
 >  that there is indeed no empty space of at least #x40000 bytes, and
 >  that every request to exec_pool requiring a new kva allocation is
 >  guaranteed to fail.

 Allocating 256k blocks has to be sub-optimal!
 Especially for something that is going to be chopped into pieces
 (which is what I suspect is happening here).

 Personally I've no idea why we have this large proliferation of
 memory 'pools' - rather than just allocating items as-needed from
 general free lists.

 There might be some mileage in keeping stats for some uses - and
 predetermining the correct list for fixed size items - but not
 big private memory free lists.

 	David

 -- 
 David Laight: david@l8s.co.uk

From: Jeff Rizzo <riz@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/45718: processes sometimes get stuck and spin in vm_map
Date: Sun, 27 Jan 2013 13:58:23 -0800

 I just hit what appears to be this problem in 6.0_STABLE with 6.0.1 (or 
 maybe 6.0_STABLE, I forget exactly) userland on NetBSD/evbarm (a 
 Sheevaplug):


 db> show uvm
 Current UVM status:
    pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12, ncolors=1
    127411 VM pages: 87390 active, 0 inactive, 1345 wired, 23658 free
    pages  8564 anon, 75753 file, 4418 exec
    freemin=256, free-target=341, wired-max=42470
    cpu0:
      faults=1285187, traps=11298455, intrs=2416539, ctxswitch=2281911
      softint=989950, syscalls=4468244
    fault counts:
      noram=1, noanon=0, pgwait=0, pgrele=0
      ok relocks(total)=54213(54213), anget(retrys)=70879(0), amapcopy=125623
      neighbor anon/obj pg=48318/1379218, gets(lock/unlock)=449220/54213
      cases: anon=47679, anoncow=23189, obj=334661, prcopy=114559, przero=241394
    daemon and swap counts:
      woke=450, revs=450, scans=520574, obscans=499526, anscans=0
      busy=0, freed=499526, reactivate=1194, deactivate=561810
      pageouts=0, pending=0, nswget=0
      nswapdev=1, swpgavail=1874999
      swpages=1874999, swpginuse=0, swpgonly=0, paging=0
 db> ps /l
 PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
 1758     1 3   0         0           c2f4ad20               cron vm_map
 1181     1 3   0         0           c2f36020               cron wait
 7580     1 3   0         0           c2f362c0               cron vm_map
 1627     1 3   0         0           c2f36560               cron wait
 1434     1 3   0         0           c2f36800               cron vm_map
 13657    1 3   0         0           c2f36aa0               cron wait
 13848    1 3   0         0           c2f36d40               cron vm_map
 1175     1 3   0         0           c2f16000               cron wait
 1238     1 3   0         0           c1be4540               cron vm_map
 13333    1 3   0         0           c1ce2aa0               cron vm_map
 12564    1 3   0         0           c1bb0560               cron wait
 1747     1 3   0         0           c1bb0aa0               cron wait
 1426     1 3   0         0           c2f162a0               cron vm_map
 1233     1 3   0         0           c2f16540               cron wait
 976      1 3   0         0           c2f167e0             dhcpcd vm_map
 1103     1 3   0         0           c2f16a80               cron vm_map
 13646    1 3   0         0           c2f16d20               cron wait
 13965    1 3   0         0           c2ef7020               cron vm_map
 5836     1 3   0         0           c2ef72c0               cron wait
 6667     1 3   0         0           c2ef7560               cron vm_map
 14218    1 3   0         0           c2ef7800               cron wait
 137      1 3   0         0           c2ef7aa0               cron vm_map
 328      1 3   0         0           c2ef7d40               cron wait
 135      1 3   0         0           c2eda000               cron vm_map
 326      1 3   0         0           c2eda2a0               cron vm_map
 12549    1 3   0         0           c2eda540               cron wait
 12996    1 3   0         0           c2eda7e0               cron wait
 1859     1 3   0         0           c2edaa80               cron vm_map
 11458    1 3   0         0           c2edad20               cron wait
 193      1 3   0         0           c2ebf020               cron vm_map
 192      1 3   0         0           c2ebf2c0               cron wait
 13363    1 3   0         0           c2ebf560               cron vm_map
 13233    1 3   0         0           c2ebf800               cron wait
 13371    1 3   0         0           c1ce2020             master vm_map
 13269    1 3   0         0           c2b4aaa0             master vm_map
 13784    1 3   0         0           c2ebfaa0               cron vm_map
 14142    1 3   0         0           c2ebfd40               cron wait
 13114    1 3   0         0           c2e8e000               cron vm_map
 14243    1 3   0         0           c2e8e2a0               cron wait
 13454    1 3   0         0           c2e8e540               cron vm_map
 12647    1 3   0         0           c2e8e7e0               cron vm_map
 13810    1 3   0         0           c2e8ea80               cron wait
 13423    1 3   0         0           c2e8ed20               cron wait
 11504    1 3   0         0           c1d65020               cron vm_map
 11266    1 3   0         0           c1d652c0               cron wait
 13559    1 3   0         0           c1d65560               cron vm_map
 13901    1 3   0         0           c1d65800               cron wait
 13309    1 3   0         0           c1d65aa0               cron vm_map
 12485    1 3   0         0           c1d65d40               cron wait
 12774    1 3   0         0           c2df7000               cron vm_map
 14154    1 3   0         0           c2df72a0               cron wait
 13183    1 3   0         0           c2df7540               cron vm_map
 11177    1 3   0         0           c2df77e0               cron wait
 10922    1 3   0         0           c2dc4020               cron vm_map
 12500    1 3   0         0           c2df7a80               cron vm_map
 11816    1 3   0         0           c2dc4aa0               cron wait
 12065    1 3   0         0           c2dc4800               cron wait
 12804    1 3   0         0           c2df7d20               make vm_map
 7516     1 3   0         0           c2dc42c0               make wait
 8630     1 3   0        80           c2dc4560                 sh wait
 6603     1 3   0        80           c2dc4d40               make wait
 5644     1 3   0        80           c2d89000                 sh wait
 96       1 3   0        80           c2d892a0                 sh wait
 1667     1 3   0        80           c2d89540        pbulk-build wait
 1566     1 3   0        80           c2d897e0                 sh wait
 1272     1 3   0        80           c2d89d20               sshd select
 136      1 3   0        80           c2d89a80                ssh select
 134      1 3   0        80           c2b4a020        pbulk-build select
 1555     1 3   0        80           c2b4a560                 sh wait
 98       1 3   0        80           c197e540                 sh wait
 1205     1 3   0        80           c197e000                 sh wait
 71       1 3   0        80           c1bb0800                zsh pause
 73       1 3   0        80           c1bb02c0               tmux kqueue
 1435     1 3   0   1000000           c1bb0d40              getty vm_map
 1262     1 3   0        80           c2b4a800               cron nanoslp
 784      1 3   0        80           c197e2a0               qmgr kqueue
 1117     1 3   0        80           c2b4a2c0              inetd kqueue
 1453     1 3   0        80           c2b4ad40              mdnsd select
 1316     1 3   0        80           c197e7e0             master kqueue
 1111     1 3   0        80           c1ce22c0               sshd select
 1041     1 3   0        80           c1be42a0               ntpd pause
 933      1 3   0        80           c197ea80               qmgr kqueue
 1046     1 3   0        80           c197ed20             master kqueue
 884      1 3   0        80           c1be4d20               sshd select
 364      1 3   0        80           c1bb0020            syslogd kqueue
 188      1 3   0         0           c1be4000             dhcpcd wait
 1        1 3   0        80           c1af82c0               init wait
 0       45 3   0       200           c1ce2560              nfsio nfsiod
 0       44 3   0       200           c1ce2800              nfsio nfsiod
 0       43 3   0       200           c1be4a80              nfsio nfsiod
 0       42 3   0       200           c1ce2d40              nfsio nfsiod
 0       41 3   0       200           c1be47e0            physiod physiod
 0       40 3   0       200           c1af9000           aiodoned aiodoned
 0       39 3   0       200           c1af9540            ioflush syncer
 0       38 3   0       200           c1af92a0           pgdaemon pgdaemon
 0       35 3   0       200           c1af97e0           swdmover swdmvr
 0       34 3   0       200           c1af8d40          cryptoret crypto_w
 0       33 3   0       200           c1a04000           scsibus0 sccomp
 0       32 3   0       200           c1af8aa0         usbtask-dr usbtsk
 0       31 3   0       200           c1af8800         usbtask-hc usbtsk
 0       30 3   0       200           c1af8560               usb0 usbevt
 0       29 3   0       200           c1af8020              unpgc unpgc
 0       28 3   0       200           c1af9d20        vmem_rehash vmem_rehash
 0       27 3   0       200           c1af9a80             sdmmc0 mmctaskq
 0       18 3   0       200           c1a042a0            atabus1 atath
 0       17 3   0       200           c1a04540            atabus0 atath
 0       16 3   0       200           c1a047e0               iic0 iicintr
 0       15 3   0       200           c1a04a80         pmfsuspend pmfsuspend
 0       14 3   0       200           c1a04d20           pmfevent pmfevent
 0       13 3   0       200           c19fd020         sopendfree sopendfr
 0       12 3   0       200           c19fd2c0           nfssilly nfssilly
 0       11 3   0       200           c19fd560            cachegc cachegc
 0       10 3   0       200           c19fd800              vrele vrele
 0        9 3   0       200           c19fdaa0             vdrain vdrain
 0        8 3   0       200           c19fdd40          modunload mod_unld
 0        7 3   0       200           c19f3000            xcall/0 xcall
 0        6 1   0       200           c19f32a0          softser/0
 0        5 1   0       200           c19f3540          softclk/0
 0        4 1   0       200           c19f37e0          softbio/0
 0        3 1   0       200           c19f3a80          softnet/0
 0    >   2 7   0       201           c19f3d20             idle/0
 0        1 3   0       200           c04cc960            swapper uvm
 db> t/a c2f4ad20
 trace: pid 1758 lid 1 at 0xcb691b58
 netbsd:mi_switch+0x10
          scp=0xc014787c rlv=0xc0143eb0 (netbsd:sleepq_block+0x88)
          rsp=0xcb691b5c rfp=0xcb691b80
          r10=0xc198e09c r9=0xc198e558
          r8=0x00000064 r7=0xc2f4ad20 r6=0xc04e9bd4 r5=0x00000000
          r4=0xc04ee2b4
 netbsd:sleepq_block+0x10
          scp=0xc0143e38 rlv=0xc011ae20 (netbsd:cv_timedwait+0xf0)
          rsp=0xcb691b84 rfp=0xcb691bb0
          r8=0x00000064 r7=0xc0367498
          r6=0xc04e9bd4 r5=0xc2f4ad20 r4=0xc04e9bd8
 netbsd:cv_timedwait+0x10
          scp=0xc011ad40 rlv=0xc02ddb40 (netbsd:uvm_map_prepare+0x178)
          rsp=0xcb691bb4 rfp=0xcb691c08
          r10=0x00801727 r9=0x00040000
          r8=0x00069138 r7=0xc04e9bd8 r6=0xc03e318c r5=0xc04e9bd4
          r4=0xc04e9bc4
 netbsd:uvm_map_prepare+0x10
          scp=0xc02dd9d8 rlv=0xc02ddda0 (netbsd:uvm_map+0x88)
          rsp=0xcb691c0c rfp=0xcb691c68
          r10=0x00000000 r9=0xcb691c24
          r8=0x00801727 r7=0xffffffff r6=0xcb691c98 r5=0x00040000
          r4=0xc04e9bc4
 netbsd:uvm_map+0x10
          scp=0xc02ddd28 rlv=0xc02d9394 (netbsd:uvm_km_alloc+0xbc)
          rsp=0xcb691c6c rfp=0xcb691cc4
          r10=0xc036b58c r9=0xc04dbc7c
          r8=0xc04e11c0 r7=0xffffffff r6=0xffffffff r5=0x00000000
          r4=0x00040000
 netbsd:uvm_km_alloc+0x10
          scp=0xc02d92e8 rlv=0xc0230040 (netbsd:pool_grow+0x38)
          rsp=0xcb691cc8 rfp=0xcb691cfc
          r10=0xc04dbc78 r9=0xc04dbc7c
          r8=0xc040934c r7=0xc04dbc78 r6=0x00000001 r5=0xc04dbc04
          r4=0xc04dbc04
 netbsd:pool_grow+0x10
          scp=0xc0230018 rlv=0xc022f994 (netbsd:pool_get+0x80)
          rsp=0xcb691d00 rfp=0xcb691d2c
          r10=0xc04dbc78 r9=0xc04dbc7c
          r8=0x00000001 r7=0x00000001 r6=0x00000000 r5=0x00000000
          r4=0xc04dbc04
 netbsd:pool_get+0x10
          scp=0xc022f924 rlv=0xc0123cd4 (netbsd:execve_loadvm+0x2d4)
          rsp=0xcb691d30 rfp=0xcb691d88
          r10=0xc04ee2b4 r9=0xcb691d5c
          r8=0xc2f38550 r7=0xc2f4ad20 r6=0x00000000 r5=0xc2f4ad20
          r4=0xcb691d94
 netbsd:execve_loadvm+0x10
          scp=0xc0123a10 rlv=0xc0126594 (netbsd:execve1+0x28)
          rsp=0xcb691d8c rfp=0xcb691ee0
          r10=0xc0407b14 r9=0x00000003
          r8=0xc2f38550 r7=0x0000003b r6=0xc2f4ad20 r5=0xc2f4ad20
          r4=0xcb691d94
 netbsd:execve1+0x10
          scp=0xc012657c rlv=0xc01265e4 (netbsd:sys_execve+0x2c)
          rsp=0xcb691ee4 rfp=0xcb691ef4
          r5=0xc2f4ad20 r4=0xcb691fb4
 netbsd:sys_execve+0x10
          scp=0xc01265c8 rlv=0xc02497a0 (netbsd:syscall+0x84)
          rsp=0xcb691ef8 rfp=0xcb691f84
 netbsd:syscall+0x10
          scp=0xc024972c rlv=0xc02499d0 (netbsd:swi_handler+0xb4)
          rsp=0xcb691f88 rfp=0xcb691fb0
          r10=0x00000000 r9=0x5105bad0
          r8=0x00000000 r7=0x00000001 r6=0xc03e322c r5=0xc2f4ad20
          r4=0xcb691fb4
 netbsd:swi_handler+0x10
          scp=0xc024992c rlv=0xc005f12c (netbsd:swi_entry+0x2c)
          rsp=0xcb691fb4 rfp=0xbfffe860
          r8=0xbfffede0 r7=0x00018ecc
          r6=0x2020c0a6 r5=0x00001230 r4=0xc03e322c
 db>

State-Changed-From-To: feedback->analyzed
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Sat, 20 Feb 2016 15:34:14 +0000
State-Changed-Why:
feedback provided and problem analyzed


From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/45718 CVS commit: src/sys/kern
Date: Fri, 20 Oct 2017 14:48:43 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Fri Oct 20 14:48:43 UTC 2017

 Modified Files:
 	src/sys/kern: kern_exec.c

 Log Message:
 Carve out KVA for execargs on boot from an exec_map like we used to.

 Candidate fix for PR kern/45718: `processes sometimes get stuck and
 spin in vm_map', a problem that has been plaguing all our 32-bit
 ports for years.

 Since we currently use large (256k) buffers for execargs, and since
 nobody has stepped up to tackle breaking them into bite-sized (or at
 least page-sized) chunks, after KVA gets sufficiently fragmented we
 can't allocate new execargs buffers from kernel_map.

 Until 2008, we always carved out KVA for execargs on boot with a uvm
 submap exec_map of kernel_map.  Then ad@ found that the uvm_km_free
 call, to discard them when done, cost about 100us, which a pool
 avoided:

 https://mail-index.NetBSD.org/tech-kern/2008/06/25/msg001854.html
 https://mail-index.NetBSD.org/tech-kern/2008/06/26/msg001859.html

 ad@ _simultaneously_ introduced a pool _and_ eliminated the reserved
 KVA in the exec_map submap.  This change preserves the pool, but
 restores exec_map (with less code, by putting it in MI code instead
 of copying it in every MD initialization routine).

 Patch proposed on tech-kern:
 https://mail-index.NetBSD.org/tech-kern/2017/10/19/msg022461.html

 Patch tested by bouyer@:
 https://mail-index.NetBSD.org/tech-kern/2017/10/20/msg022465.html

 I previously discussed the issue on tech-kern before I knew of the
 history around exec_map:
 https://mail-index.NetBSD.org/tech-kern/2012/12/09/msg014695.html

 The candidate workaround I proposed of using pool_setlowat to force
 preallocation of KVA would also force preallocation of physical RAM,
 which is a waste not incurred by using exec_map, and which is part of
 why I never committed it.

 There may remain a general problem that if thread A calls pool_get
 and tries to service that request by a uvm_km_alloc call that hangs
 because KVA is scarce, and thread B does pool_put, the pool_put in
 thread B will not notify the pool_get in thread A that it doesn't
 need to wait for KVA, and so thread A may continue to hang in
 uvm_km_alloc.  However,

 (a) That won't apply here, because there is exactly as much KVA
 available in exec_map as exec_pool will ever try to use.

 (b) It is possible that may not even matter in other cases as long as
 the page daemon eventually tries to shrink the pool, which will cause
 a uvm_km_free that can unhang the hung uvm_km_alloc.

 XXX pullup-8
 XXX pullup-7
 XXX pullup-6
 XXX pullup-5, perhaps...


 To generate a diff of this commit:
 cvs rdiff -u -r1.447 -r1.448 src/sys/kern/kern_exec.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
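
The fragmentation argument in the log message above can be illustrated with a small user-space model. This is only a sketch, not NetBSD code: the `arena` type, its first-fit allocator, the 4 KiB page size, the 1024-page arena, and the count of 16 reserved buffers are assumptions made for illustration. Only the 256k execargs buffer size comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE      4096u                       /* assumed page size */
#define EXECARGS_PAGES (256u * 1024u / PAGE_SIZE)  /* 64 pages per 256k buffer */
#define ARENA_PAGES    1024u                       /* hypothetical arena size */

/* Toy model of a range of kernel virtual address space, one flag per page. */
struct arena {
	bool   used[ARENA_PAGES];
	size_t npages;
};

static void arena_init(struct arena *a, size_t npages)
{
	assert(npages <= ARENA_PAGES);
	memset(a->used, 0, sizeof(a->used));
	a->npages = npages;
}

/* First-fit search for n contiguous free pages; returns start page or -1. */
static long arena_alloc(struct arena *a, size_t n)
{
	size_t run = 0;

	assert(n > 0);
	for (size_t i = 0; i < a->npages; i++) {
		run = a->used[i] ? 0 : run + 1;
		if (run == n) {
			size_t start = i + 1 - n;
			for (size_t j = start; j <= i; j++)
				a->used[j] = true;
			return (long)start;
		}
	}
	return -1;
}

static void arena_free(struct arena *a, size_t start, size_t n)
{
	for (size_t j = start; j < start + n; j++)
		a->used[j] = false;
}

static size_t arena_free_pages(const struct arena *a)
{
	size_t nfree = 0;

	for (size_t i = 0; i < a->npages; i++)
		nfree += a->used[i] ? 0 : 1;
	return nfree;
}

/* Worst-case fragmentation: every other page is held by a small allocation. */
static void arena_fragment(struct arena *a)
{
	for (size_t i = 0; i < a->npages; i++)
		a->used[i] = (i % 2 == 0);
}
```

In the fragmented arena half of the pages are free, yet a request for EXECARGS_PAGES contiguous pages fails. In an arena reserved for execargs alone, every request has the same size, so any freed slot fits the next request and the allocation cannot fail for lack of contiguous space while fewer than the reserved number of buffers are in use.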

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/45718 CVS commit: src/sys/kern
Date: Sat, 28 Oct 2017 17:06:43 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Oct 28 17:06:43 UTC 2017

 Modified Files:
 	src/sys/kern: subr_pool.c

 Log Message:
 Allow only one pending call to a pool's backing allocator at a time.

 Candidate fix for problems with hanging after kva fragmentation related
 to PR kern/45718.

 Proposed on tech-kern:

 https://mail-index.NetBSD.org/tech-kern/2017/10/23/msg022472.html

 Tested by bouyer@ on i386.

 This makes one small change to the semantics of pool_prime and
 pool_setlowat: they may fail with EWOULDBLOCK instead of ENOMEM, if
 there is a pending call to the backing allocator in another thread but
 we are not actually out of memory.  That is unlikely because nearly
 always these are used during initialization, when the pool is not in
 use.

 XXX pullup-8
 XXX pullup-7
 XXX pullup-6 (requires tweaking the patch)
 XXX pullup-5...


 To generate a diff of this commit:
 cvs rdiff -u -r1.208 -r1.209 src/sys/kern/subr_pool.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
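
The "one pending call to the backing allocator" rule described above can be sketched in user space with a mutex and a condition variable. This is an illustrative model under stated assumptions, not the code in subr_pool.c: `toy_pool`, `toy_backend_alloc`, `toy_pool_grow`, `toy_pool_get`, and `TOY_ERESTART` are hypothetical names, and the backend here never blocks or fails. What it is meant to show is the protocol only: a non-sleeping caller that finds a grow already in progress gets EWOULDBLOCK, while a sleeping caller waits for the pending grow to finish and then retries instead of issuing a second backend call.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

#define TOY_ERESTART (-1)  /* hypothetical "retry" code, not the kernel's ERESTART */

struct toy_pool {
	pthread_mutex_t lock;
	pthread_cond_t  cv;
	bool            growing;        /* models a "grow in progress" flag */
	unsigned        nfree;          /* items currently available */
	unsigned        backend_calls;  /* how often the backend was invoked */
};

#define TOY_POOL_INITIALIZER \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false, 0, 0 }

/* Stand-in for the backing allocator: each call yields 8 items. */
static unsigned toy_backend_alloc(void)
{
	return 8;
}

/* Called with p->lock held; returns with p->lock held. */
static int toy_pool_grow(struct toy_pool *p, bool waitok)
{
	if (p->growing) {
		if (!waitok)
			return EWOULDBLOCK;     /* someone else is growing */
		while (p->growing)
			pthread_cond_wait(&p->cv, &p->lock);
		return TOY_ERESTART;            /* caller re-checks the pool */
	}

	/* We are the single pending caller; drop the lock around the backend. */
	p->growing = true;
	pthread_mutex_unlock(&p->lock);
	unsigned n = toy_backend_alloc();
	pthread_mutex_lock(&p->lock);

	p->nfree += n;
	p->backend_calls++;
	p->growing = false;
	pthread_cond_broadcast(&p->cv);         /* wake anyone waiting on the grow */
	return 0;
}

/* Take one item; returns 0 on success or EWOULDBLOCK for a non-sleeping caller. */
static int toy_pool_get(struct toy_pool *p, bool waitok)
{
	int error = 0;

	pthread_mutex_lock(&p->lock);
	while (p->nfree == 0) {
		error = toy_pool_grow(p, waitok);
		if (error == TOY_ERESTART) {
			error = 0;
			continue;               /* loop on the condition we wanted */
		}
		if (error != 0)
			break;
	}
	if (error == 0)
		p->nfree--;
	pthread_mutex_unlock(&p->lock);
	return error;
}
```

With several threads calling toy_pool_get on an empty pool, only the first reaches toy_backend_alloc; the others either sleep on the condition variable and retry, or fail immediately with EWOULDBLOCK if they may not sleep.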

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/45718 CVS commit: [netbsd-8] src/sys
Date: Tue, 27 Feb 2018 09:07:33 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Tue Feb 27 09:07:33 UTC 2018

 Modified Files:
 	src/sys/arch/alpha/alpha [netbsd-8]: pmap.c
 	src/sys/arch/m68k/m68k [netbsd-8]: pmap_motorola.c
 	src/sys/arch/powerpc/oea [netbsd-8]: pmap.c
 	src/sys/arch/sparc64/sparc64 [netbsd-8]: pmap.c
 	src/sys/arch/x86/x86 [netbsd-8]: pmap.c
 	src/sys/dev/dtv [netbsd-8]: dtv_scatter.c
 	src/sys/dev/marvell [netbsd-8]: mvxpsec.c
 	src/sys/kern [netbsd-8]: subr_extent.c subr_pool.c uipc_mbuf.c
 	src/sys/opencrypto [netbsd-8]: crypto.c
 	src/sys/sys [netbsd-8]: mbuf.h pool.h
 	src/sys/ufs/chfs [netbsd-8]: chfs_malloc.c
 	src/sys/uvm [netbsd-8]: uvm_fault.c

 Log Message:
 Pull up following revision(s) (requested by mrg in ticket #593):
 	sys/dev/marvell/mvxpsec.c: revision 1.2
 	sys/arch/m68k/m68k/pmap_motorola.c: revision 1.70
 	sys/opencrypto/crypto.c: revision 1.102
 	sys/arch/sparc64/sparc64/pmap.c: revision 1.308
 	sys/ufs/chfs/chfs_malloc.c: revision 1.5
 	sys/arch/powerpc/oea/pmap.c: revision 1.95
 	sys/sys/pool.h: revision 1.80,1.82
 	sys/kern/subr_pool.c: revision 1.209-1.216,1.219-1.220
 	sys/arch/alpha/alpha/pmap.c: revision 1.262
 	sys/kern/uipc_mbuf.c: revision 1.173
 	sys/uvm/uvm_fault.c: revision 1.202
 	sys/sys/mbuf.h: revision 1.172
 	sys/kern/subr_extent.c: revision 1.86
 	sys/arch/x86/x86/pmap.c: revision 1.266 (via patch)
 	sys/dev/dtv/dtv_scatter.c: revision 1.4

 Allow only one pending call to a pool's backing allocator at a time.
 Candidate fix for problems with hanging after kva fragmentation related
 to PR kern/45718.

 Proposed on tech-kern:
 https://mail-index.NetBSD.org/tech-kern/2017/10/23/msg022472.html
 Tested by bouyer@ on i386.

 This makes one small change to the semantics of pool_prime and
 pool_setlowat: they may fail with EWOULDBLOCK instead of ENOMEM, if
 there is a pending call to the backing allocator in another thread but
 we are not actually out of memory.  That is unlikely because nearly
 always these are used during initialization, when the pool is not in
 use.

 Define the new flag too for previous commit.

 pool_grow can now fail even when sleeping is ok. Catch this case in pool_get
 and retry.

 Assert that pool_get failure happens only with PR_NOWAIT.
 This would have caught the mistake I made last week leading to null
 pointer dereferences all over the place, a mistake which I evidently
 poorly scheduled alongside maxv's change to the panic message on x86
 for null pointer dereferences.

 Since pr_lock is now used to wait for two things (PR_GROWING and
 PR_WANTED) we need to loop for the condition we wanted.
 make the KASSERTMSG/panic strings consistent as '%s: [%s], __func__, wchan'
 Handle the ERESTART case from pool_grow()

 don't pass 0 to the pool flags
 Guess pool_cache_get(pc, 0) means PR_WAITOK here.
 Earlier on in the same context we use kmem_alloc(sz, KM_SLEEP).

 use PR_WAITOK everywhere.
 use PR_NOWAIT.

 Don't use 0 for PR_NOWAIT

 use PR_NOWAIT instead of 0

 panic ex nihilo -- PR_NOWAITing for zerot

 Add assertions that either PR_WAITOK or PR_NOWAIT are set.
 - fix an assert; we can reach there if we are nowait or limitfail.
 - when priming the pool and failing with ERESTART, don't decrement the number
   of pages; this avoids the issue of returning an ERESTART when we get to 0,
   and is more correct.
 - simplify the pool_grow code, and don't wakeup things if we ENOMEM.

 In pmap_enter_ma(), only try to allocate pves if we might need them,
 and even if that fails, only fail the operation if we later discover
 that we really do need them.  This implements the requirement that
 pmap_enter(PMAP_CANFAIL) must not fail when replacing an existing
 mapping with the first mapping of a new page, which is an unintended
 consequence of the changes from the rmind-uvmplock branch in 2011.

 The problem arises when pmap_enter(PMAP_CANFAIL) is used to replace an existing
 pmap mapping with a mapping of a different page (eg. to resolve a copy-on-write).
 If that fails and leaves the old pmap entry in place, then UVM won't hold
 the right locks when it eventually retries.  This entanglement of the UVM and
 pmap locking was done in rmind-uvmplock in order to improve performance,
 but it also means that the UVM state and pmap state need to be kept in sync
 more than they did before.  It would be possible to handle this in the UVM code
 instead of in the pmap code, but these pmap changes improve the handling of
 low memory situations in general, and handling this in UVM would be clunky,
 so this seemed like the better way to go.

 This somewhat indirectly fixes PR 52706, as well as the failing assertion
 about "uvm_page_locked_p(old_pg)".  (but only on x86, various other platforms
 will need their own changes to handle this issue.)

 In uvm_fault_upper_enter(), if pmap_enter(PMAP_CANFAIL) fails, assert that
 the pmap did not leave around a now-stale pmap mapping for an old page.
 If such a pmap mapping still existed after we unlocked the vm_map,
 the UVM code would not know later that it would need to lock the
 lower layer object while calling the pmap to remove or replace that
 stale pmap mapping.  See PR 52706 for further details.

 hopefully work around the irregular "fork fails in init" problem.
 if a pool is growing, and the grower is PR_NOWAIT, mark this.
 if another caller wants to grow the pool and is also PR_NOWAIT,
 busy-wait for the original caller, which should either succeed
 or hard-fail fairly quickly.

 implement the busy-wait by unlocking and relocking this pool's
 mutex and returning ERESTART.  other methods (such as having
 the caller do this) were significantly more code and this hack
 is fairly localised.
 ok chs@ riastradh@

 Don't release the lock in the PR_NOWAIT allocation. Move flags setting
 after acquiring the mutex. (from Tobias Nygren)

 apply the change from arch/x86/x86/pmap.c rev. 1.266 commitid vZRjvmxG7YTHLOfA:

 In pmap_enter_ma(), only try to allocate pves if we might need them,
 and even if that fails, only fail the operation if we later discover
 that we really do need them.  If we are replacing an existing mapping,
 reuse the pv structure where possible.

 This implements the requirement that pmap_enter(PMAP_CANFAIL) must not fail
 when replacing an existing mapping with the first mapping of a new page,
 which is an unintended consequence of the changes from the rmind-uvmplock
 branch in 2011.

 The problem arises when pmap_enter(PMAP_CANFAIL) is used to replace an existing
 pmap mapping with a mapping of a different page (eg. to resolve a copy-on-write).
 If that fails and leaves the old pmap entry in place, then UVM won't hold
 the right locks when it eventually retries.  This entanglement of the UVM and
 pmap locking was done in rmind-uvmplock in order to improve performance,
 but it also means that the UVM state and pmap state need to be kept in sync
 more than they did before.  It would be possible to handle this in the UVM code
 instead of in the pmap code, but these pmap changes improve the handling of
 low memory situations in general, and handling this in UVM would be clunky,
 so this seemed like the better way to go.

 This somewhat indirectly fixes PR 52706 on the remaining platforms where
 this problem existed.


 To generate a diff of this commit:
 cvs rdiff -u -r1.261 -r1.261.8.1 src/sys/arch/alpha/alpha/pmap.c
 cvs rdiff -u -r1.69 -r1.69.8.1 src/sys/arch/m68k/m68k/pmap_motorola.c
 cvs rdiff -u -r1.94 -r1.94.8.1 src/sys/arch/powerpc/oea/pmap.c
 cvs rdiff -u -r1.307 -r1.307.6.1 src/sys/arch/sparc64/sparc64/pmap.c
 cvs rdiff -u -r1.245.6.1 -r1.245.6.2 src/sys/arch/x86/x86/pmap.c
 cvs rdiff -u -r1.3 -r1.3.2.1 src/sys/dev/dtv/dtv_scatter.c
 cvs rdiff -u -r1.1 -r1.1.12.1 src/sys/dev/marvell/mvxpsec.c
 cvs rdiff -u -r1.80 -r1.80.8.1 src/sys/kern/subr_extent.c
 cvs rdiff -u -r1.207 -r1.207.6.1 src/sys/kern/subr_pool.c
 cvs rdiff -u -r1.172 -r1.172.6.1 src/sys/kern/uipc_mbuf.c
 cvs rdiff -u -r1.78.2.4 -r1.78.2.5 src/sys/opencrypto/crypto.c
 cvs rdiff -u -r1.170.2.1 -r1.170.2.2 src/sys/sys/mbuf.h
 cvs rdiff -u -r1.79 -r1.79.10.1 src/sys/sys/pool.h
 cvs rdiff -u -r1.4 -r1.4.30.1 src/sys/ufs/chfs/chfs_malloc.c
 cvs rdiff -u -r1.199.6.2 -r1.199.6.3 src/sys/uvm/uvm_fault.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: analyzed->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Tue, 10 Apr 2018 08:44:40 +0000
State-Changed-Why:
Is this fixed?


State-Changed-From-To: feedback->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Tue, 31 Jan 2023 23:49:47 +0000
State-Changed-Why:
Assume it's fixed. No response since 2018...


>Unformatted:
Copyright © 1994-2023 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.