NetBSD Problem Report #56516

From he@smistad.uninett.no  Sun Nov 21 11:26:16 2021
Return-Path: <he@smistad.uninett.no>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 14DD21A9239
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 21 Nov 2021 11:26:16 +0000 (UTC)
Message-Id: <20211121112609.C584143FFD3@smistad.uninett.no>
Date: Sun, 21 Nov 2021 12:26:09 +0100 (CET)
From: he@NetBSD.org
Reply-To: he@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: pgdaemon spins despite lots of memory available
X-Send-Pr-Version: 3.95

>Number:         56516
>Category:       kern
>Synopsis:       pgdaemon spins despite lots of memory available
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Nov 21 11:30:00 +0000 2021
>Last-Modified:  Sun Nov 21 19:50:00 +0000 2021
>Originator:     he@NetBSD.org
>Release:        NetBSD 9.99.92
>Organization:
   I try...
>Environment:
System: NetBSD jonsvatnet.uninett.no 9.99.92 NetBSD 9.99.92 (JV) #1: Sat Nov 13 19:04:32 CET 2021  he@jonsvatnet.uninett.no:/usr/obj/sys/arch/i386/compile/JV i386
Architecture: i386
Machine: i386
>Description:
	I have this Lenovo T60 laptop with a core2-duo CPU, equipped
	with as much memory as will fit (4GB, 3GB usable), which now
	runs -current from about a week ago.

	On this I have firefox installed (version 86.0.1), and
	although it is usable at first, apparently the VM subsystem
	can be put under some pressure by this.  When this happens, it
	appears that pgdaemon hogs one of the CPU cores, *and is
	unable to recover*.  Woe betide the process which gets
	scheduled on the hogged core -- this creates long-lasting
	apparent "hang" situations, significantly reducing the
	usability of the system.

	By now I have killed firefox, and it took minutes to exit,
	because it was scheduled on the CPU core hogged by pgdaemon,
	and the "load average" shot up to 22 as observed with "top".
	Even now, after firefox has been killed, pgdaemon continues to
	hog the core:

load averages:  0.28,  0.47,  2.92;               up 4+12:17:48        12:03:24
161 threads: 10 idle, 1 runnable, 148 sleeping, 2 on CPU
CPU states:  0.4% user,  0.0% nice, 51.2% system,  0.0% interrupt, 48.4% idle
Memory: 149M Act, 75M Inact, 2084K Wired, 21M Exec, 115M File, 2476M Free
Swap: 2048M Total, 451M Used, 1598M Free

  PID   LID USERNAME PRI STATE       TIME   WCPU    CPU NAME      COMMAND
    0   209 root     126 CPU/1     491:09 99.02% 99.02% pgdaemon  [system]

	Notice, also, that there is plenty of free physical memory,
	and also plenty of swap space available.

	I have collected "top -1", "top -t -1" and several "vmstat -s"
	outputs, both while I had firefox running and afterwards.  Now,
	after killing firefox, diffing two successive "vmstat -s"
	outputs gives, among other things:

-     7403 forks total
-     3607 forks blocked parent
-     3607 forks shared address space with parent
-208820310 pagealloc desired color avail
+     7416 forks total
+     3620 forks blocked parent
+     3620 forks shared address space with parent
+208843989 pagealloc desired color avail
  11053703 pagealloc desired color not avail
-182464528 pagealloc local cpu avail
- 37409485 pagealloc local cpu not avail
+182464752 pagealloc local cpu avail
+ 37432940 pagealloc local cpu not avail
   3505016 faults with no memory
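
	(The snapshots compared here are simply saved "vmstat -s"
	outputs; something along the lines of the following shell
	fragment collects a series of them -- the vm-N file names
	match the full diff further down, the interval is arbitrary:)

# save a numbered "vmstat -s" snapshot every few minutes
i=1
while [ "$i" -le 5 ]; do
	vmstat -s > vm-$i
	i=$((i + 1))
	sleep 300
done
# compare any two snapshots, e.g. two successive ones
diff -u vm-1 vm-2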


	What does "pagealloc local cpu not avail" signify, and is this
	related to what I observe?

	I tried collecting stack backtraces of the page daemon using
	"crash", but obviously that's not easy to do on a running
	system.  The few entries I could get were:


crash> ps
PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
...
0    > 209 7   1     40200           c5629980           pgdaemon
...
crash> trace/a 0xc5629980
trace: pid 0 lid 209 at 0xdd8dfee8
cpu_count_sync(2aa,dd8dff9c,c0c6ffb7,0,1,0,37a5000,0,9,c12b8638) at cpu_count_sync+0x34
uvm_availmem(c150b180,1,0,37a5000,0,9,c12b8638,b0556000,0,a4e11d) at uvm_availmem+0x12
uvm_pageout(c5629980,1792000,179d000,0,c01005a8,0,0,0,0,0) at uvm_pageout+0x11f
crash> trace/a 0xc5629980
trace: pid 0 lid 209 at 0xdd8dfee8
cpu_count_sync(0,dd8dff9c,c0c6ff8e,0,1,0,37a5000,0,9,c12b8638) at cpu_count_sync+0x34
gdt_desc(0,1,0,37a5000,0,9,c12b8638,b0556000,0,a4e11d) at gdt_desc+0x5788b
uvm_pageout(c5629980,1792000,179d000,0,c01005a8,0,0,0,0,0) at uvm_pageout+0x11f
crash> 
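
	(When this happens again, the same backtraces could be taken
	repeatedly and mechanically, assuming crash(8) will read its
	commands from standard input; the LWP address is the
	"STRUCT LWP *" value reported by "crash> ps" above:)

# repeatedly back-trace the pgdaemon LWP (address from "crash> ps")
n=0
while [ "$n" -lt 10 ]; do
	printf 'trace/a 0xc5629980\n' | crash
	n=$((n + 1))
	sleep 1
done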

	I'm also left wondering if there is something else I should
	collect when this happens again, or whether there is some
	missing instrumentation which keeps the actual root cause
	for this problem hidden.

	Full "vmstats -s" diff between the first and last "vmstat -s"
	output looks like this:

--- vm-1        2021-11-21 10:33:42.981139066 +0100
+++ vm-5        2021-11-21 12:05:40.128109281 +0100
@@ -1,63 +1,63 @@
      4096 bytes per page
        64 page colors
    765972 pages managed
-   383342 pages free
-   187360 pages active
-   106346 pages inactive
+   634046 pages free
+    37934 pages active
+    19166 pages inactive
         0 pages paging
-      720 pages wired
+      521 pages wired
         1 reserve pagedaemon pages
        10 reserve kernel pages
     14214 boot kernel pages
-    99145 kernel pool pages
-   266962 anonymous pages
-     9540 cached file pages
-    18022 cached executable pages
+    98416 kernel pool pages
+    22970 anonymous pages
+    29508 cached file pages
+     5271 cached executable pages
       512 minimum free pages
       682 target free pages
    255324 maximum wired pages
         1 swap devices
    524411 swap pages
-   421039 swap pages in use
-  2740609 swap allocations
-172169042 total faults taken
-171356945 traps
- 23930539 device interrupts
-345600500 CPU context switches
- 90257614 software interrupts
-2495042890 system calls
-  2679735 pagein requests
+   115346 swap pages in use
+  2765102 swap allocations
+173975025 total faults taken
+173165874 traps
+ 24244599 device interrupts
+349252741 CPU context switches
+ 91729660 software interrupts
+2510872290 system calls
+  2700872 pagein requests
   3144353 pageout requests
         0 pages swapped in
   4972167 pages swapped out
-     7201 forks total
-     3489 forks blocked parent
-     3489 forks shared address space with parent
-206614703 pagealloc desired color avail
+     7416 forks total
+     3620 forks blocked parent
+     3620 forks shared address space with parent
+208843989 pagealloc desired color avail
  11053703 pagealloc desired color not avail
-182458238 pagealloc local cpu avail
- 35210168 pagealloc local cpu not avail
+182464752 pagealloc local cpu avail
+ 37432940 pagealloc local cpu not avail
   3505016 faults with no memory
         2 faults with no anons
-      412 faults had to wait on pages
+      414 faults had to wait on pages
         0 faults found released page
- 14774546 faults relock (14740483 ok)
- 11243088 anon page faults
-  3096851 anon retry faults
-  8825288 amap copy faults
-  1275389 neighbour anon page faults
-  7819793 neighbour object page faults
- 13943406 locked pager get faults
- 11647072 unlocked pager get faults
- 11082066 anon faults
-   128972 anon copy on write faults
- 13728090 object faults
-   211318 promote copy faults
- 70057788 promote zero fill faults
-   120911 faults upgraded lock
+ 14825337 faults relock (14790938 ok)
+ 11378568 anon page faults
+  3117702 anon retry faults
+  8908849 amap copy faults
+  1283399 neighbour anon page faults
+  7934522 neighbour object page faults
+ 14012170 locked pager get faults
+ 11676724 unlocked pager get faults
+ 11213313 anon faults
+   132908 anon copy on write faults
+ 13790641 object faults
+   217483 promote copy faults
+ 70492656 promote zero fill faults
+   125168 faults upgraded lock
       407 faults couldn't upgrade lock
   1152656 times daemon wokeup
-   606690 revolutions of the clock hand
+   607521 revolutions of the clock hand
  10805533 pages freed by daemon
  40501859 pages scanned by daemon
  14520225 anonymous pages scanned by daemon
@@ -65,26 +65,26 @@
   4744874 pages reactivated
       653 pages found busy by daemon
   1827814 total pending pageouts
- 34013801 pages deactivated
-40964371808 per-cpu stats synced
-    50619 anon pages possibly dirty
-   216343 anon pages dirty
+ 34072981 pages deactivated
+51478979019 per-cpu stats synced
+     5605 anon pages possibly dirty
+    17365 anon pages dirty
         0 anon pages clean
-       21 file pages possibly dirty
-        4 file pages dirty
-    27537 file pages clean
-127320424 total name lookups
- 39747935 good hits
-  2142393 negative hits
-   155267 bad hits
+        0 file pages possibly dirty
+        0 file pages dirty
+    34779 file pages clean
+127592221 total name lookups
+ 39979733 good hits
+  2174909 negative hits
+   155945 bad hits
         0 false hits
- 85274829 miss
+ 85281634 miss
         0 too long
- 16564909 pass2 hits
- 16730992 2passes
-     2805 reverse hits
-        2 reverse miss
+ 16565068 pass2 hits
+ 16731627 2passes
+     2862 reverse hits
+        3 reverse miss
         0 access denied
-          cache hits (31% pos + 1% neg) system 13% per-process
+          cache hits (31% pos + 1% neg) system 12% per-process
           deletions 0%, falsehits 0%, toolong 0%

	And here are some of the "top" outputs I collected while I had
	firefox running:

load averages:  7.27,  7.94,  7.63;               up 4+10:49:44        10:35:20
78 processes: 3 runnable, 73 sleeping, 2 on CPU
CPU0 states:  0.0% user,  0.0% nice,  100% system,  0.0% interrupt,  0.0% idle
CPU1 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Memory: 755M Act, 429M Inact, 2860K Wired, 70M Exec, 37M File, 1460M Free
Swap: 2048M Total, 1645M Used, 404M Free

  PID USERNAME PRI NICE   SIZE   RES STATE       TIME   WCPU    CPU COMMAND
    0 root     126    0     0K   32M CPU/0     418:21 99.02% 99.02% [system]
 2264 he        43    0  1311M  301M parked/0  121:33  5.71%  5.71% firefox
 3019 he        83    0   162M   43M poll/1     22:56  1.27%  1.27% X
 4592 he        43    0  1099M  409M RUN/1      53:11  0.83%  0.83% firefox
 4813 he        43    0   535M  161M RUN/1      70:40  0.39%  0.39% firefox
 4029 he        43    0   638M  114M RUN/1      30:22  0.00%  0.00% firefox
 1515 haldaemo  85    0   233M 5300K xclocv/1    9:00  0.00%  0.00% hald
 2196 he        43    0   667M   34M parked/0    6:09  0.00%  0.00% firefox
 9298 he        43    0   492M   51M parked/0    4:52  0.00%  0.00% firefox
 4769 he        43    0   569M   28M parked/1    4:38  0.00%  0.00% firefox
 1794 he        85    0    16M 2900K select/1    3:37  0.00%  0.00% ssh
 4763 he        43    0   514M   33M parked/1    3:37  0.00%  0.00% firefox
 5439 he        43    0   518M   42M parked/0    3:29  0.00%  0.00% firefox
15333 he        43    0   525M   78M parked/1    3:00  0.00%  0.00% firefox
 1616 he        85    0    15M 1260K select/1    1:39  0.00%  0.00% ssh

	and

load averages:  9.44,  8.77,  8.14;               up 4+10:58:01        10:43:37
393 threads: 10 idle, 11 runnable, 369 sleeping, 1 zombie, 2 on CPU
CPU0 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU1 states:  0.0% user,  0.0% nice,  100% system,  0.0% interrupt,  0.0% idle
Memory: 758M Act, 414M Inact, 2864K Wired, 72M Exec, 39M File, 1473M Free
Swap: 2048M Total, 1644M Used, 405M Free

  PID   LID USERNAME PRI STATE       TIME   WCPU    CPU NAME      COMMAND
    0   209 root     126 CPU/1     411:22 99.02% 99.02% pgdaemon  [system]
 4592  4592 he        85 poll/0     43:44  3.27%  3.27% Web Conte firefox
 2264 25010 he        85 poll/0     28:46  0.05%  0.05% Socket Th firefox
 4813  4813 he        85 poll/0     59:01  0.00%  0.00% Web Conte firefox
 2264  2264 he        85 poll/0     41:07  0.00%  0.00% MainThrea firefox
 4029  4029 he        85 poll/0     24:20  0.00%  0.00% Web Conte firefox
 3019  3019 he        85 poll/0     22:45  0.00%  0.00% -         X
 2264  2607 he        43 parked/0   16:29  0.00%  0.00% Composito firefox
 1515  1515 haldaemo  85 poll/0      9:00  0.00%  0.00% -         hald
 2264  2782 he        85 kqueue/0    8:40  0.00%  0.00% IPC I/O P firefox
    0   210 root     124 syncer/0    8:38  0.00%  0.00% ioflush   [system]
 2264 23272 he        43 parked/0    7:40  0.00%  0.00% DOM Worke firefox
 2196  2196 he        85 poll/0      4:44  0.00%  0.00% Web Conte firefox


>How-To-Repeat:
	See above.
>Fix:
	Sorry, don't know.
	VM issues are deep black magic...

>Audit-Trail:
From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56516: pgdaemon spins despite lots of memory available
Date: Sun, 21 Nov 2021 16:52:09 -0000 (UTC)

 he@NetBSD.org (Havard Eidnes) writes:

 >I got a tip to reduce kern.maxvnodes to about 1/10 of the current
 >value, and YES, this makes pgdaemon no longer hog the core:
 >    0   209 root     126 pgdaem/1  517:23  0.00%  0.00% pgdaemon  [system]

 >So ... this probably gives a lead to what's really causing this?

 With fewer vnodes the kernel needs less memory, and you don't run
 out of (kernel) memory that easily.

 When the pagedaemon is kicked to free memory, it checks some metric
 and, if that looks bad, asks everyone (pools, caches, etc.) to free
 resources, looping until the metric is good again.  If a run doesn't
 actually free any resources, it will hog the CPU, up to the point
 that some freeing operations don't even get started.
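
 One way to see whether such a run actually frees anything is to
 watch the relevant "vmstat -s" counters while pgdaemon spins; a
 minimal sketch:

 # watch free pages, kernel pool pages and daemon wakeups while
 # pgdaemon is spinning; all three lines appear in "vmstat -s"
 while :; do
     vmstat -s | egrep 'pages free|kernel pool pages|times daemon wokeup'
     sleep 5
 done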

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56516: pgdaemon spins despite lots of memory available
Date: Sun, 21 Nov 2021 19:45:51 +0000

 Not sent to gnats.

 From: Havard Eidnes <he@NetBSD.org>
 To: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
 Subject: Re: kern/56516: pgdaemon spins despite lots of memory available
 Date: Sun, 21 Nov 2021 12:39:30 +0100 (CET)

 I got a tip to reduce kern.maxvnodes to about 1/10 of the current
 value, and YES, this makes pgdaemon no longer hog the core:

 jonsvatnet# sysctl -w kern.maxvnodes=12000
 kern.maxvnodes: 110592 -> 12000
 jonsvatnet# 

 load averages:  0.05,  0.13,  0.50;               up 4+12:48:33      12:34:09
 162 threads: 10 idle, 151 sleeping, 1 on CPU
 CPU states:  0.2% user,  0.0% nice,  0.4% system,  0.1% interrupt, 99.3% idle
 Memory: 151M Act, 76M Inact, 2084K Wired, 21M Exec, 115M File, 2468M Free
 Swap: 2048M Total, 451M Used, 1598M Free

   PID   LID USERNAME PRI STATE       TIME   WCPU    CPU NAME      COMMAND
  3019  3019 he        85 poll/1     23:13  0.05%  0.05% -         X
     0   209 root     126 pgdaem/1  517:23  0.00%  0.00% pgdaemon  [system]
  1515  1515 haldaemo  85 poll/0      9:10  0.00%  0.00% -         hald

 So ... this probably gives a lead to what's really causing this?
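
 (To keep the workaround across reboots, the same setting can
 presumably be put in /etc/sysctl.conf, which is applied at boot,
 e.g.:)

 # persist the reduced vnode limit; /etc/sysctl.conf is read at boot
 echo 'kern.maxvnodes=12000' >> /etc/sysctl.conf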

 Regards,

 - Havard
