NetBSD Problem Report #59996
From wiz@netbsd.org Wed Feb 11 23:02:33 2026
Return-Path: <wiz@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
key-exchange X25519 server-signature RSA-PSS (2048 bits)
client-signature RSA-PSS (2048 bits))
(Client CN "mail.netbsd.org", Issuer "R13" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 5A8821A9239
for <gnats-bugs@gnats.NetBSD.org>; Wed, 11 Feb 2026 23:02:33 +0000 (UTC)
Message-Id: <20260211230232.F177684D95@mail.netbsd.org>
Date: Wed, 11 Feb 2026 23:02:32 +0000 (UTC)
From: wiz@NetBSD.org
Reply-To: wiz@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)
X-Send-Pr-Version: 3.95
X-From4GNATS: "wiz@NetBSD.org via gnats" <gnats-admin@NetBSD.org>
>Number: 59996
>Category: bin
>Synopsis: swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: bin-bug-people
>State: needs-pullups
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Feb 11 23:05:00 +0000 2026
>Closed-Date:
>Last-Modified: Tue Feb 17 19:50:01 +0000 2026
>Originator: Thomas Klausner
>Release: NetBSD 11.99.5
>Organization:
>Environment:
Architecture: x86_64
Machine: amd64
>Description:
When running "swapctl -l" in a loop (1/s) on a busy machine (bulk build)
with 10%-25% of available swap used, I sometimes see errors like
swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)
I changed the code to use warn() instead of warnx(), but errno is not set:
swapctl: SWAP_STATS different to SWAP_NSWAP (2 != 3): Undefined error: 0
swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3): Undefined error: 0
swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3): Undefined error: 0
The machine does have three swap devices, so '3' is the correct value.
>How-To-Repeat:
Run a bulk build. (Doesn't work on an unloaded machine, I think.)
In a second window, run:
while true; do sleep 1; swapctl -l; done
Wait. Be lucky.
>Fix:
Please.
>Release-Note:
>Audit-Trail:
From: "matthew green" <mrg@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/59996 CVS commit: src/sys/uvm
Date: Thu, 12 Feb 2026 00:08:37 +0000
Module Name: src
Committed By: mrg
Date: Thu Feb 12 00:08:37 UTC 2026
Modified Files:
src/sys/uvm: uvm_swap.c
Log Message:
take uvm_swap_data_lock when looping devices in uvm_swap_stats()
should fix PR#59996, where sometimes the rotation of devices in a
priority list would happen and uvm_swap_stats() would exit early,
returning a smaller count than before.
XXX: pullup-*
To generate a diff of this commit:
cvs rdiff -u -r1.211 -r1.212 src/sys/uvm/uvm_swap.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/59996: swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)
Date: Thu, 12 Feb 2026 12:47:19 +0700
Date: Wed, 11 Feb 2026 23:05:00 +0000 (UTC)
From: "wiz@NetBSD.org via gnats" <gnats-admin@NetBSD.org>
Message-ID: <20260211230500.95C061A923D@mollari.NetBSD.org>
| When running "swapctl -l" in a loop (1/s) on a busy machine (bulk build)
| with 10%-25% of available swap used, I sometimes see errors like
|
| swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)
The only way that can happen, I believe, is if something is changing the
order of the swap device list while swapctl is running fetching the stats.
| I changed the code to use warn() instead of warnx(), but errno is not set:
Yes, that was a poor suggestion (a waste of your time). As long as swapctl(2)
is not returning -1, errno would be meaningless even if set, and it isn't
returning -1: it is returning 1 or 2 (for you) in the cases where the warning
is generated.
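To illustrate the point about errno, here is a minimal sketch in C. The names classify() and demo_failed_call_errno() are hypothetical and are not part of swapctl(8); close(-1) is used only as a call that is known to fail. A short count is a successful return, so errno has nothing to say about it.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

enum outcome { CALL_FAILED, SHORT_COUNT, FULL_COUNT };

/*
 * Classify the return value of a call that returns a count, or -1 on error.
 * errno is only meaningful in the CALL_FAILED case.
 */
enum outcome
classify(int rv, int expected)
{
	assert(expected >= 0);
	if (rv == -1)
		return CALL_FAILED;	/* errno explains why */
	return rv < expected ? SHORT_COUNT : FULL_COUNT;
}

/* A call that really fails does set errno: close(-1) gives EBADF. */
int
demo_failed_call_errno(void)
{
	errno = 0;
	if (close(-1) == -1)
		return errno;
	return 0;
}
```

With the values from the report, classify(1, 3) and classify(2, 3) are both SHORT_COUNT, which is why warn() had no useful errno to print.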
So, it cannot be copyout() that is failing, that would cause swapctl(2)
to return -1 - the only way I can see the kernel loop failing to run 3
times (in your case) would be if the code asked for less than 3 stats
records (and swapctl(8) does not do that) or if the end of the kernel swap
device list is reached prematurely.
Since (I gather) after failing, it returns to returning 3 again for a while,
the kernel is obviously not "losing" swap devices, so all I can imagine is
that the list is being reordered, and the swapctl(2, SWAP_STATS) kernel code
is not locking things to prevent that, or detecting when it has happened and
restarting (which would probably be the better solution - stats fetching
should not usually affect operations in any way at all).
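The suspected failure can be sketched as a small single-threaded simulation. This is not the kernel code: struct dev, rotate_to_tail() and count_unlocked() are hypothetical, and the sketch assumes that a "reorder" means moving the device currently being visited to the tail of its list. Under that assumption a traversal that follows d->next after the move sees the end of the list too early.

```c
#include <assert.h>
#include <stddef.h>

#define MAXDEV 8	/* arbitrary cap for this sketch */

struct dev {
	int id;
	struct dev *next;
};

struct devlist {
	struct dev *head;
	struct dev nodes[MAXDEV];
};

static void
devlist_init(struct devlist *l, int n)
{
	assert(n > 0 && n <= MAXDEV);
	for (int i = 0; i < n; i++) {
		l->nodes[i].id = i;
		l->nodes[i].next = (i + 1 < n) ? &l->nodes[i + 1] : NULL;
	}
	l->head = &l->nodes[0];
}

/* Move d to the tail of the list (the assumed "rotation"). */
static void
rotate_to_tail(struct devlist *l, struct dev *d)
{
	struct dev **pp = &l->head;

	if (d->next == NULL)
		return;			/* already the tail */
	while (*pp != d)
		pp = &(*pp)->next;
	*pp = d->next;			/* unlink d */
	while (*pp != NULL)
		pp = &(*pp)->next;
	*pp = d;			/* append d */
	d->next = NULL;
}

/*
 * Walk an n-device list without any locking.  When the device at index
 * rotate_at is being visited, simulate a concurrent rotation of that
 * device.  Pass -1 for no rotation.  Returns the number of devices seen.
 */
int
count_unlocked(int n, int rotate_at)
{
	struct devlist l;
	int count = 0;

	devlist_init(&l, n);
	for (struct dev *d = l.head; d != NULL; d = d->next) {
		if (count == rotate_at)
			rotate_to_tail(&l, d);
		count++;
	}
	return count;
}
```

With three devices, a rotation while visiting the first or second device yields a count of 1 or 2, the same short counts seen in the report, while no rotation yields 3.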
One thing you could try, if you feel inclined, would be to give your 3
swap devices different priorities - either by adding priority=N options
to /etc/fstab, or by running
swapctl -c -p N swapdev-path
afterwards.
If it is just the list of swap devices at the same priority being reordered,
then moving each of them to a different priority should avoid that happening.
Whether this would end up showing anything (as in no more errors) is 100%
uncertain, but if it did it would be a clear indication of the issue.
kre
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org, wiz@netbsd.org
Cc:
Subject: Re: bin/59996: swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)
Date: Thu, 12 Feb 2026 12:57:06 +0700
As an alternative, if you just want to (hopefully) avoid seeing any of
the error messages, you could try this patch to src/sbin/swapctl/swaplist.c
But this is not what I would suggest as any kind of real solution.
kre
Index: swaplist.c
===================================================================
RCS file: /cvsroot/src/sbin/swapctl/swaplist.c,v
retrieving revision 1.19
diff -u -r1.19 swaplist.c
--- swaplist.c 11 Dec 2023 12:47:24 -0000 1.19
+++ swaplist.c 12 Feb 2026 05:55:51 -0000
@@ -73,9 +73,12 @@
 	fsep = sep = (struct swapent *)malloc(nswap * sizeof(*sep));
 	if (sep == NULL)
 		err(1, "malloc");
-	rnswap = swapctl(SWAP_STATS, (void *)sep, nswap);
-	if (rnswap < 0)
-		err(1, "SWAP_STATS");
+	i = 0;			/*XXX*/
+	do {			/*XXX*/
+		rnswap = swapctl(SWAP_STATS, (void *)sep, nswap);
+		if (rnswap < 0)
+			err(1, "SWAP_STATS");
+	} while (nswap != rnswap && ++i < 5);	/*XXX*/
 	if (nswap != rnswap)
 		warnx("SWAP_STATS different to SWAP_NSWAP (%d != %d)",
 		    rnswap, nswap);
From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/59996 CVS commit: src/sys/uvm
Date: Fri, 13 Feb 2026 03:44:49 +0000
Module Name: src
Committed By: kre
Date: Fri Feb 13 03:44:49 UTC 2026
Modified Files:
src/sys/uvm: uvm_swap.c
Log Message:
PR bin/59996 - handle hidden swap list reordering
A different attempt to achieve what 2 revs back was attempting.
The swap lists must be locked (uvm_swap_data_lock) when we are traversing
the lists of swap devices, as otherwise the lists can reorder themselves
behind our back. But we cannot hold that lock when actually doing the
processing, as our process might need to page/swap to copy out data,
and doing that will also attempt to take the lock - panic (or doom).
Instead, traverse the lists with the lock held, so they are stable, but do
nothing but keep a record of all of the swapdevs (independent of their lists)
and then use this new list of swapdevs to actually do the work. The number
or identity of the swap devices cannot change during all of this, as we also
hold swap_syscall_lock which prevents any other swapctl() operations (like
adding or deleting devices) from occurring.
Once we have done that, the number of swap devices found is the number
returned from swapctl(SWAP_STATS) (provided it is no bigger than requested).
Note that this does not guarantee that the number of devices returned from
swapctl(SWAP_STATS) will agree with an earlier call to swapctl(SWAP_NSWAP)
- that is obviously impossible, absolutely anything might have occurred
between the two calls.
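The two-phase approach described in this log message might look roughly like the following userland sketch. It is not the committed kernel code: a pthread mutex stands in for uvm_swap_data_lock, the struct and function names are hypothetical, and the second lock (swap_syscall_lock) that keeps the set of devices fixed is left out for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAXDEV 8	/* arbitrary cap for this sketch */

struct dev {
	int id;
	struct dev *next;
};

struct swapstate {
	pthread_mutex_t data_lock;	/* plays the role of uvm_swap_data_lock */
	struct dev *head;
};

/*
 * Phase 1: walk the list with the lock held, recording pointers only.
 * Phase 2: with the lock dropped, do the per-device work (here, copying
 * out the id), which in the kernel may need to page or swap.
 */
int
stats_snapshot(struct swapstate *st, int *out, int max)
{
	struct dev *snap[MAXDEV];
	int n = 0;

	assert(max >= 0);
	if (max > MAXDEV)
		max = MAXDEV;

	pthread_mutex_lock(&st->data_lock);
	for (struct dev *d = st->head; d != NULL && n < max; d = d->next)
		snap[n++] = d;
	pthread_mutex_unlock(&st->data_lock);

	for (int i = 0; i < n; i++)
		out[i] = snap[i]->id;	/* lock is not held here */
	return n;
}

/* Build an ndev-entry list and ask for at most max entries. */
int
demo_snapshot(int ndev, int max)
{
	struct dev nodes[MAXDEV];
	struct swapstate st;
	int out[MAXDEV];
	int n;

	assert(ndev >= 0 && ndev <= MAXDEV);
	for (int i = 0; i < ndev; i++) {
		nodes[i].id = i;
		nodes[i].next = (i + 1 < ndev) ? &nodes[i + 1] : NULL;
	}
	st.head = ndev > 0 ? &nodes[0] : NULL;
	pthread_mutex_init(&st.data_lock, NULL);
	n = stats_snapshot(&st, out, max);
	pthread_mutex_destroy(&st.data_lock);
	return n;
}
```

The count returned is the number of devices found in phase 1, capped at the number requested, which matches the behaviour the log message describes.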
To generate a diff of this commit:
cvs rdiff -u -r1.213 -r1.214 src/sys/uvm/uvm_swap.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/59996 CVS commit: src/sbin/swapctl
Date: Mon, 16 Feb 2026 20:35:30 +0000
Module Name: src
Committed By: kre
Date: Mon Feb 16 20:35:30 UTC 2026
Modified Files:
src/sbin/swapctl: swaplist.c
Log Message:
PR bin/59996 No more "SWAP_STATS different to SWAP_NSWAP"
That the kernel data structs might have changed between one
system call and another, later one, does not warrant a warning.
In this case this can happen if a swap device is removed from the
system while swapctl(8) (swapctl -l or -s) (or pstat(8) (pstat -s),
which is similarly affected) is running, at just the "wrong" time.
This isn't an error. This was always handled correctly, but with
the meaningless warning added.
There's also the other case, where a swap device has been added, instead
of removed, at just the same "wrong" time. Since the kernel won't return
stats for more devices than requested, which was the number returned from
the earlier call, the code never noticed this case, and simply printed
less data than the kernel could have supplied. That sounds like it is
just another example of the race above, and could simply be ignored - but
it isn't quite that simple. If the kernel was returning the older devices,
and omitting the newer one(s), then it would be OK, but there is no
guarantee that is what happens - it might easily return the new device(s)
(or some of them) along with some of the older ones, omitting others.
That's not ideal.
To cope, just request a few more swap device stats from the kernel than
it reported were available. If it happens that some were added, there
will be space in the buffer provided for the kernel to add the new
one(s) - unless many new ones happened to get added in the relatively
short interval involved. This way, we should usually get all of them.
In the normal case where no devices have been added or deleted, a few
extra bytes (maybe a KB, not a huge amount) will have been malloc()'d.
That's harmless.
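As an illustration of requesting a few extra entries, here is a sketch that uses a stand-in for the kernel call. fake_stats(), list_with_slack() and the SLACK value of 4 are all hypothetical; the log message only says "a few more", and the real code calls swapctl(SWAP_STATS, ...).

```c
#include <assert.h>
#include <stdlib.h>

#define SLACK 4		/* hypothetical value for "a few more" */

struct row {
	int id;
};

/* Stand-in for SWAP_STATS: fills at most max rows, returns rows filled. */
static int
fake_stats(int ndev_now, struct row *out, int max)
{
	int n = ndev_now < max ? ndev_now : max;

	for (int i = 0; i < n; i++)
		out[i].id = i;
	return n;
}

/*
 * nswap is the count from the earlier SWAP_NSWAP-style call; ndev_now is
 * how many devices exist by the time the stats are fetched.  Returns the
 * number of rows obtained, or -1 if allocation fails.
 */
int
list_with_slack(int nswap, int ndev_now)
{
	struct row *rows;
	int want, got;

	assert(nswap >= 0 && ndev_now >= 0);
	want = nswap + SLACK;
	rows = malloc((size_t)want * sizeof(*rows));
	if (rows == NULL)
		return -1;
	got = fake_stats(ndev_now, rows, want);
	free(rows);
	return got;
}
```

If two devices are added between the calls, list_with_slack(3, 5) still gets all five; only when more than SLACK devices appear in the interval is the result truncated.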
Note that this kind of thing was not the reason for the messages reported
in the PR - that was a kernel bug, caused by the kernel reordering its
list of swap devices at an inopportune time (and then potentially returning
fewer devices than requested, even though all were still there).
That's fixed, but the kernel list reordering remains - along with its
effect of returning the swap devices in an almost arbitrary order - it
returns them in priority order, always, but within one priority, the
devices will be returned in any random order.
So, while here, deal with that. Sort the list returned, and thus always
print the devices in a stable order - sorted by priority, and then by the
device name (which is more or less arbitrary, the actual order might not
make much sense, but at least it will be a consistent order).
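The sort described here could be sketched with qsort(3) as follows. struct swap_row is a simplified, hypothetical stand-in for the kernel's swap entry record, the device paths are made up, and ascending numeric priority is assumed to be the kernel's priority order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct swap_row {
	int priority;
	char path[64];
};

/* Order by priority first, then by device name. */
static int
row_cmp(const void *a, const void *b)
{
	const struct swap_row *x = a, *y = b;

	if (x->priority != y->priority)
		return (x->priority > y->priority) -
		    (x->priority < y->priority);
	return strcmp(x->path, y->path);
}

void
sort_rows(struct swap_row *rows, size_t n)
{
	assert(rows != NULL || n == 0);
	if (n > 1)
		qsort(rows, n, sizeof(*rows), row_cmp);
}

/*
 * Two different rotations of the same priority-0 devices should print in
 * the same order once sorted.  Returns 1 if they do.
 */
int
demo_same_order(void)
{
	struct swap_row a[] = {
		{ 0, "/dev/wd1b" }, { 0, "/dev/wd2b" }, { 0, "/dev/wd0b" },
		{ 1, "/dev/wd3b" },
	};
	struct swap_row b[] = {
		{ 0, "/dev/wd2b" }, { 0, "/dev/wd0b" }, { 0, "/dev/wd1b" },
		{ 1, "/dev/wd3b" },
	};
	size_t n = sizeof(a) / sizeof(a[0]);

	sort_rows(a, n);
	sort_rows(b, n);
	for (size_t i = 0; i < n; i++)
		if (row_cmp(&a[i], &b[i]) != 0)
			return 0;
	return strcmp(a[0].path, "/dev/wd0b") == 0 &&
	    strcmp(a[3].path, "/dev/wd3b") == 0;
}
```

Because no two rows share both priority and path, the comparator gives a total order and the output does not depend on the order in which the kernel returned the entries.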
To generate a diff of this commit:
cvs rdiff -u -r1.21 -r1.22 src/sbin/swapctl/swaplist.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc: netbsd-bugs@NetBSD.org, wiz@NetBSD.org, mrg@NetBSD.org
Subject: Re: bin/59996: swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)
Date: Tue, 17 Feb 2026 07:10:26 +0700
I have finished playing with swapctl related issues.
Now we need to decide what should be pulled up, and to where.
The kernel mod is most likely I assume, though it is for a very
rare and relatively harmless condition (no crashes possible, no
unintended data exposed) so could be just left for -12 or could
be pulled up to just -11 (or perhaps back further). I haven't
looked at what is involved in doing any of those yet.
The changes to swapctl(8) I don't think are worthy of any pullups.
The offending message is actually useful on any release where the
kernel issue isn't corrected. When it is corrected, the chances
that anyone, not deliberately trying to provoke it, is ever going to
encounter it, are minuscule. The other changes are just noise.
So, opinions? I will change the PR status to pending-pullups while
this is being decided.
kre
State-Changed-From-To: open->needs-pullups
State-Changed-By: kre@NetBSD.org
State-Changed-When: Tue, 17 Feb 2026 00:16:37 +0000
State-Changed-Why:
This is what I meant in the previous message, needs-pullups, not pending...
From: matthew green <mrg@eterna23.net>
To: Robert Elz <kre@munnari.OZ.AU>
Cc: netbsd-bugs@NetBSD.org, wiz@NetBSD.org, gnats-bugs@NetBSD.org
Subject: re: bin/59996: swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)
Date: Wed, 18 Feb 2026 06:48:36 +1100
Robert Elz writes:
> I have finished playing with swapctl related issues.
>
> Now we need to decide what should be pulled up, and to where.
>
> The kernel mod is most likely I assume, though it is for a very
> rare and relatively harmless condition (no crashes possible, no
> unintended data exposed) so could be just left for -12 or could
> be pulled up to just -11 (or perhaps back further). I haven't
> looked at what is involved in doing any of those yet.
>
> The changes to swapctl(8) I don't think are worthy of any pullups.
> The offending message is actually useful on any release where the
> kernel issue isn't corrected. When it is corrected, the chances
> that anyone, not deliberately trying to provoke it, is ever going to
> encounter it, are minuscule. The other changes are just noise.
>
> So, opinions? I will change the PR status to pending-pullups while
> this is being decided.
thank you again for fixing this properly.
i'm fine with just pulling up the kernel fixes -- maybe the man
pages fixes are also nice, but i'd probably only do those if
doing all of swapctl(8) too.
.mrg.
>Unformatted: