NetBSD Problem Report #58001

From www@netbsd.org  Wed Mar  6 07:38:59 2024
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 5696E1A9239
	for <gnats-bugs@gnats.NetBSD.org>; Wed,  6 Mar 2024 07:38:59 +0000 (UTC)
Message-Id: <20240306073857.BC1671A923A@mollari.NetBSD.org>
Date: Wed,  6 Mar 2024 07:38:57 +0000 (UTC)
From: michael.cheponis@gmail.com
Reply-To: michael.cheponis@gmail.com
To: gnats-bugs@NetBSD.org
Subject: systat -w1 vmstat reports "Cannot get buffers;Cannot allocate memory"
X-Send-Pr-Version: www-1.0

>Number:         58001
>Category:       bin
>Synopsis:       systat -w1 vmstat reports "Cannot get buffers;Cannot allocate memory"
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Mar 06 07:40:01 +0000 2024
>Last-Modified:  Wed Mar 06 13:45:01 +0000 2024
>Originator:     Mike Cheponis
>Release:        10.0_RC5
>Organization:
self
>Environment:
NetBSD SS.Culver.Net 10.0_RC5 NetBSD 10.0_RC5 (GENERIC) #0: Tue Feb 27 05:27:39 UTC 2024  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC am
>Description:
systat -w1 vmstat reports "Cannot get buffers;Cannot allocate memory"

At the same time, in a different window, this command is running against a 2TB disk:

dd if=/dev/sd0a of=/dev/null bs=1m

avail memory = 31424 MB

If the dd is not running, the error does not appear.

>swapctl -l
Device       1048576-blocks     Used    Avail Capacity  Priority
/dev/dk2              32507        0    32507     0%    0
/r/swapfile0           4096        0     4096     0%    0
Total                 36603        0    36603     0%

>How-To-Repeat:
see above
>Fix:
This only occurs when the line: dd if=/dev/sd0a of=/dev/null bs=1m

is running.  I've not tried other possible long-time-running background tasks to see why systat -w1 vmstat is complaining

>Audit-Trail:
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/58001: systat -w1 vmstat reports "Cannot get buffers;Cannot allocate memory"
Date: Wed, 06 Mar 2024 20:20:36 +0700

     Date:        Wed,  6 Mar 2024 07:40:01 +0000 (UTC)
     From:        michael.cheponis@gmail.com
     Message-ID:  <20240306074001.47D251A923B@mollari.NetBSD.org>


   | systat -w1 vmstat reports "Cannot get buffers;Cannot allocate memory"

 This looks to be an issue with how systat collects buffer cache info.
 ( src/usr.bin/systat/bufcache.c fetchbufcache() )

 It starts by asking the kernel how much memory it needs to malloc()
 to fetch the data, allocates that much, and then tries to actually fetch.

 There's an obvious (and unavoidable) race there - if the amount required
 has grown between the request for how much, and the request to fetch into
 that buffer, then the 2nd will fail (insufficient space provided).

 systat anticipates that, if that happens it goes back and tries again,
 but this time adds 100 bytes to the amount the kernel says is required.

 If that attempt fails again (the same procedure as the first time, but
 this time anticipating the kernel is likely to actually want to send more
 data in the 2nd call than it claims it will in the first) then systat
 tries again, with 200 bytes extra instead of 100, and again, and again
 until it is allowing 1000 extra bytes from the first call's result in
 the second call.

 If that still isn't enough, systat gives up, and you get the error above.
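
The retry procedure described above can be sketched roughly as follows. This is not the code from bufcache.c: struct fake_kernel, fake_sysctl() and fetch_with_linear_slack() are hypothetical names, and the kernel is simulated (its requirement grows by a fixed amount on every call) so that the example is self-contained. It assumes a too-small buffer is reported as ENOMEM.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel side; not a real NetBSD interface. */
struct fake_kernel {
	size_t size;	/* bytes currently needed to return the data */
	size_t growth;	/* bytes the requirement grows on every call */
};

/*
 * Mimics the two uses of a sysctl-style call: with buf == NULL it reports
 * the size currently needed; otherwise it succeeds only if *len is large
 * enough, and fails with ENOMEM if it is not.
 */
static int
fake_sysctl(struct fake_kernel *k, void *buf, size_t *len)
{
	k->size += k->growth;	/* the requirement changes between calls */
	if (buf == NULL) {
		*len = k->size;
		return 0;
	}
	if (*len < k->size) {
		errno = ENOMEM;
		return -1;
	}
	*len = k->size;
	return 0;
}

/* Returns the slack (in bytes) that finally worked, or -1 after giving up. */
static int
fetch_with_linear_slack(struct fake_kernel *k)
{
	for (int extra = 0; extra <= 1000; extra += 100) {
		size_t len;

		if (fake_sysctl(k, NULL, &len) == -1)	/* 1st call: how much? */
			return -1;
		len += (size_t)extra;			/* allow for growth */

		void *buf = malloc(len);
		if (buf == NULL)
			return -1;

		int rv = fake_sysctl(k, buf, &len);	/* 2nd call: fetch */
		int saved = errno;
		free(buf);

		if (rv == 0)
			return extra;
		if (saved != ENOMEM)
			return -1;
		/* Buffer was too small: go around again with 100 more bytes. */
	}
	return -1;	/* the point where the error above would be printed */
}
```

With a simulated growth of 250 bytes per call the loop succeeds once the slack reaches 300; with a growth of 2000 bytes per call no slack up to 1000 is ever enough, and the function gives up.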

 The "Cannot allocate memory" is something of a misnomer, and probably
 shouldn't be included - that's just because sysctl(2) is returning ENOMEM
 to indicate that the buffer provided isn't big enough for the data to
 be returned - it has nothing whatever to do with a malloc() failure
 or similar.   A better error message would be "requirement changing
 too quickly".
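
One possible way to produce the friendlier message is sketched below. It assumes the too-small-buffer case arrives as ENOMEM, which is the errno whose strerror() text is "Cannot allocate memory". The helper name bufcache_error_text() is hypothetical and is not part of systat.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: pick the text to show after a failed buffer fetch.
 * ENOMEM is assumed here to mean "the buffer passed in was too small",
 * so it gets a message that says so; anything else keeps its usual text.
 */
static const char *
bufcache_error_text(int err)
{
	if (err == ENOMEM)
		return "requirement changing too quickly";
	return strerror(err);
}
```

A caller would pass the errno saved right after the failed fetch, so that the user no longer sees a message suggesting that an allocation failed.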

 My guess is that if you start that dd, wait for a while (maybe 10 or
 20 minutes) until things have stabilised a little, and then try the
 vmstat, it will work just fine, as by then the number the kernel returns
 in the first ('how much buffer space do I need') sysctl() call, will be
 close enough to what is actually needed for the algorithm systat implements
 to work OK.

 I'm not sure about the "100" though, or perhaps more the sequence 100, 200,
 300, ... 1000 - I think I'd be doing more like 100 200 400 800 1600, or
 perhaps even better 1024 2048 4096 8192 ... to reduce the chances of
 rapidly increasing data requirements from causing this particular issue.
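
A doubling schedule along those lines might look like the sketch below. Again the kernel is simulated and all names (struct fake_kernel, fake_sysctl(), fetch_with_doubling_slack()) are hypothetical. The max_slack cap is an assumption added so that the loop terminates, and ENOMEM is assumed to be the too-small-buffer error.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel side; not a real NetBSD interface. */
struct fake_kernel {
	size_t size;	/* bytes currently needed to return the data */
	size_t growth;	/* bytes the requirement grows on every call */
};

/* Size query when buf == NULL, otherwise a fetch that may fail with ENOMEM. */
static int
fake_sysctl(struct fake_kernel *k, void *buf, size_t *len)
{
	k->size += k->growth;
	if (buf == NULL) {
		*len = k->size;
		return 0;
	}
	if (*len < k->size) {
		errno = ENOMEM;
		return -1;
	}
	*len = k->size;
	return 0;
}

/*
 * Tries slack of 0, 1024, 2048, 4096, ... bytes, up to max_slack.
 * Returns the slack that worked, or -1 after giving up.
 */
static long
fetch_with_doubling_slack(struct fake_kernel *k, size_t max_slack)
{
	size_t extra = 0;

	for (;;) {
		size_t len;

		if (fake_sysctl(k, NULL, &len) == -1)	/* how much is needed? */
			return -1;
		len += extra;

		void *buf = malloc(len);
		if (buf == NULL)
			return -1;

		int rv = fake_sysctl(k, buf, &len);	/* actual fetch */
		int saved = errno;
		free(buf);

		if (rv == 0)
			return (long)extra;
		if (saved != ENOMEM)
			return -1;

		/* Too small: double the slack, starting from 1024 bytes. */
		extra = (extra == 0) ? 1024 : extra * 2;
		if (extra > max_slack)
			return -1;
	}
}
```

With a simulated growth of 2000 bytes per call, which no slack up to 1000 can cover, this schedule succeeds on its third attempt with 2048 bytes of slack.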

 kre


From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/58001: systat -w1 vmstat reports "Cannot get buffers;Cannot allocate memory"
Date: Wed, 6 Mar 2024 13:42:19 -0000 (UTC)

 michael.cheponis@gmail.com writes:

 >This only occurs when the line: dd if=/dev/sd0a of=/dev/null bs=1m

 >is running.  I've not tried other possible long-time-running background tasks to see why systat -w1 vmstat is complaining

 It tries to fetch information about all allocated buffers (buffer cache)
 and limits this to 1000. If there is data about more buffers, you get the
 misleading message and systat fails.
