NetBSD Problem Report #59996

From wiz@netbsd.org  Wed Feb 11 23:02:33 2026
Return-Path: <wiz@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits)
	 client-signature RSA-PSS (2048 bits))
	(Client CN "mail.netbsd.org", Issuer "R13" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 5A8821A9239
	for <gnats-bugs@gnats.NetBSD.org>; Wed, 11 Feb 2026 23:02:33 +0000 (UTC)
Message-Id: <20260211230232.F177684D95@mail.netbsd.org>
Date: Wed, 11 Feb 2026 23:02:32 +0000 (UTC)
From: wiz@NetBSD.org
Reply-To: wiz@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)
X-Send-Pr-Version: 3.95
X-From4GNATS: "wiz@NetBSD.org via gnats" <gnats-admin@NetBSD.org>

>Number:         59996
>Category:       bin
>Synopsis:       swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    bin-bug-people
>State:          needs-pullups
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Feb 11 23:05:00 +0000 2026
>Closed-Date:    
>Last-Modified:  Tue Feb 17 19:50:01 +0000 2026
>Originator:     Thomas Klausner
>Release:        NetBSD 11.99.5
>Organization:

>Environment:


Architecture: x86_64
Machine: amd64
>Description:
When running "swapctl -l" in a loop (1/s) on a busy machine (bulk build)
with 10%-25% of available swap used, I sometimes see errors like

swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)

I changed the code to use warn() instead of warnx(), but errno is not set:

swapctl: SWAP_STATS different to SWAP_NSWAP (2 != 3): Undefined error: 0
swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3): Undefined error: 0
swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3): Undefined error: 0

The machine does have three swap devices, so '3' is the correct value.
>How-To-Repeat:
Run a bulk build.  (Doesn't work on an unloaded machine, I think.)

In a second window, run:
while true; do sleep 1; swapctl -l; done

Wait. Be lucky.


>Fix:
Please.

>Release-Note:

>Audit-Trail:
From: "matthew green" <mrg@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/59996 CVS commit: src/sys/uvm
Date: Thu, 12 Feb 2026 00:08:37 +0000

 Module Name:	src
 Committed By:	mrg
 Date:		Thu Feb 12 00:08:37 UTC 2026

 Modified Files:
 	src/sys/uvm: uvm_swap.c

 Log Message:
 take uvm_swap_data_lock when looping devices in uvm_swap_stats()

 should fix PR#59996, where sometimes the rotation of devices in a
 priority list would happen and uvm_swap_stats() would exit early,
 returning a lower count than before.

 XXX: pullup-*


 To generate a diff of this commit:
 cvs rdiff -u -r1.211 -r1.212 src/sys/uvm/uvm_swap.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/59996: swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)
Date: Thu, 12 Feb 2026 12:47:19 +0700

     Date:        Wed, 11 Feb 2026 23:05:00 +0000 (UTC)
     From:        "wiz@NetBSD.org via gnats" <gnats-admin@NetBSD.org>
     Message-ID:  <20260211230500.95C061A923D@mollari.NetBSD.org>

   | When running "swapctl -l" in a loop (1/s) on a busy machine (bulk build)
   | with 10%-25% of available swap used, I sometimes see errors like
   |
   | swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)

 The only way that can happen, I believe, is if something is changing the
 order of the swap device list while swapctl is running and fetching the stats.

   | I changed the code to use warn() instead of warnx(), but errno is not set:

 Yes, that was a poor suggestion (waste of your time) - as long as swapctl(2)
 is not returning -1 errno would be meaningless, even if set, and it isn't
 returning -1 it is returning 1 or 2 (for you) in the cases where the warning
 is generated.

 So, it cannot be copyout() that is failing, that would cause swapctl(2)
 to return -1 - the only way I can see the kernel loop failing to run 3
 times (in your case) would be if the code asked for less than 3 stats
 records (and swapctl(8) does not do that) or if the end of the kernel swap
 device list is reached prematurely.

 Since (I gather) after failing, it returns to returning 3 again for a while,
 the kernel is obviously not "losing" swap devices, so all I can imagine is
 that the list is being reordered, and the swapctl(2, SWAP_STATS) kernel code
 is not locking things to prevent that, or detecting when it has happened and
 restarting (which would probably be the better solution - stats fetching
 should not usually affect operations in any way at all).

 One thing you could try, if you feel inclined, would be to give your 3
 swap devices different priorities - either by adding priority=N options
 to /etc/fstab, or by running
 	swapctl -c -p N swapdev-path
 afterwards.

 If it is just the list of swap devices at the same priority being reordered,
 then moving each of them to a different priority should avoid that happening.

 Whether this would end up showing anything (as in no more errors) is 100%
 uncertain, but if it did it would be a clear indication of the issue.

 kre

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org, wiz@netbsd.org
Cc: 
Subject: Re: bin/59996: swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)
Date: Thu, 12 Feb 2026 12:57:06 +0700

 As an alternative, if you just want to (hopefully) avoid seeing any of
 the error messages, you could try this patch to src/sbin/swapctl/swaplist.c

 But this is not what I would suggest as any kind of real solution.

 kre


 Index: swaplist.c
 ===================================================================
 RCS file: /cvsroot/src/sbin/swapctl/swaplist.c,v
 retrieving revision 1.19
 diff -u -r1.19 swaplist.c
 --- swaplist.c	11 Dec 2023 12:47:24 -0000	1.19
 +++ swaplist.c	12 Feb 2026 05:55:51 -0000
 @@ -73,9 +73,12 @@
  	fsep = sep = (struct swapent *)malloc(nswap * sizeof(*sep));
  	if (sep == NULL)
  		err(1, "malloc");
 -	rnswap = swapctl(SWAP_STATS, (void *)sep, nswap);
 -	if (rnswap < 0)
 -		err(1, "SWAP_STATS");
 +	i = 0;		/*XXX*/
 +	do {		/*XXX*/
 +		rnswap = swapctl(SWAP_STATS, (void *)sep, nswap);
 +		if (rnswap < 0)
 +			err(1, "SWAP_STATS");
 +	} while (nswap != rnswap && ++i < 5);	/*XXX*/
  	if (nswap != rnswap)
  		warnx("SWAP_STATS different to SWAP_NSWAP (%d != %d)",
  		    rnswap, nswap);

From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/59996 CVS commit: src/sys/uvm
Date: Fri, 13 Feb 2026 03:44:49 +0000

 Module Name:	src
 Committed By:	kre
 Date:		Fri Feb 13 03:44:49 UTC 2026

 Modified Files:
 	src/sys/uvm: uvm_swap.c

 Log Message:
 PR bin/59996 - handle hidden swap list reordering

 A different attempt to achieve what 2 revs back was attempting.

 The swap lists must be locked (uvm_swap_data_lock) when we are traversing
 the lists of swap devices, as otherwise the lists can reorder themselves
 behind our back.   But we cannot hold that lock when actually doing the
 processing, as our process might need to page/swap to copy out data,
 and doing that will also attempt to take the lock - panic (or doom).

 Instead, traverse the lists with the lock held, so they are stable, but do
 nothing but keep a record of all of the swapdevs (independent of their lists)
 and then use this new list of swapdevs to actually do the work.  The number
 or identity of the swap devices cannot change during all of this, as we also
 hold swap_syscall_lock which prevents any other swapctl() operations (like
 adding or deleting devices) from occurring.

 Once we have done that, the number of swap devices found is the number
 returned from swapctl(SWAP_STATS) (provided it is no bigger than requested).

 Note that this does not guarantee that the number of devices returned from
 swapctl(SWAP_STATS) will agree with an earlier call to swapctl(SWAP_NSWAP)
 - that is obviously impossible, absolutely anything might have occurred
 between the two calls.
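
The two-phase approach in this log message can be sketched in userland C.
This is not the committed kernel code: struct swapdev_sketch, list_lock,
snapshot_devices() and copy_out_priorities() are hypothetical names, and a
pthread mutex stands in for uvm_swap_data_lock. The sketch only shows the
shape of the technique, which is to record pointers while the lock is held
and to do the work that may sleep after the lock has been released.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for a kernel swap device record. */
struct swapdev_sketch {
	int priority;
	struct swapdev_sketch *next;	/* list linkage; may be reordered */
};

/* Stand-in for uvm_swap_data_lock: protects only the list linkage. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Phase 1: walk the list with the lock held and record pointers only.
 * Returns the number of devices recorded, which is at most cap.
 */
static size_t
snapshot_devices(struct swapdev_sketch *head, struct swapdev_sketch **snap,
    size_t cap)
{
	size_t n = 0;

	assert(snap != NULL);
	pthread_mutex_lock(&list_lock);
	for (struct swapdev_sketch *sd = head; sd != NULL && n < cap;
	    sd = sd->next)
		snap[n++] = sd;
	pthread_mutex_unlock(&list_lock);
	return n;
}

/*
 * Phase 2: process the snapshot with the lock released.  Copying a
 * priority here stands in for the copy to user space, which may page.
 */
static size_t
copy_out_priorities(struct swapdev_sketch **snap, size_t n, int *out)
{
	for (size_t i = 0; i < n; i++)
		out[i] = snap[i]->priority;
	return n;
}
```

Because phase 2 reads only the snapshot array, relinking the list between
the two phases does not change how many devices are reported.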


 To generate a diff of this commit:
 cvs rdiff -u -r1.213 -r1.214 src/sys/uvm/uvm_swap.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/59996 CVS commit: src/sbin/swapctl
Date: Mon, 16 Feb 2026 20:35:30 +0000

 Module Name:	src
 Committed By:	kre
 Date:		Mon Feb 16 20:35:30 UTC 2026

 Modified Files:
 	src/sbin/swapctl: swaplist.c

 Log Message:
 PR bin/59996  No more "SWAP_STATS different to SWAP_NSWAP"

 That the kernel data structs might have changed between one
 system call and another, later one, does not warrant a warning.

 In this case this can happen if a swap device is removed from the
 system while swapctl(8) (swapctl -l or -s) (or pstat(8) (pstat -s),
 which is similarly affected) is running, at just the "wrong" time.
 This isn't an error.   This was always handled correctly, but with
 the meaningless warning added.

 There's also the other case, where a swap device has been added, instead
 of removed, at just the same "wrong" time.   Since the kernel won't return
 stats for more devices than requested, which was the number returned from
 the earlier call, the code never noticed this case, and simply printed
 less data than the kernel could have supplied.  That sounds like it is
 just another example of the race above, and could simply be ignored - but
 it isn't quite that simple.  If the kernel was returning the older devices,
 and omitting the newer one(s), then it would be OK, but there is no
 guarantee that is what happens - it might easily return the new device(s)
 (or some of them) along with some of the older ones, omitting others.
 That's not ideal.

 To cope, just request a few more swap device stats from the kernel than
 it reported were available.   If it happens that some were added, there
 will be space in the buffer provided for the kernel to add the new
 one(s) - unless many new ones happened to get added in the relatively
 short interval involved.   This way, we should usually get all of them.
 In the normal case where no devices have been added or deleted, a few
 extra bytes (maybe a KB, not a huge amount) will have been malloc()'d.
 That's harmless.
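
A minimal sketch of the headroom idea follows, under stated assumptions. It
does not call swapctl(2); sim_nswap() and sim_stats() are hypothetical
stand-ins for the SWAP_NSWAP and SWAP_STATS requests, struct swapent_sketch
is a cut-down stand-in for the real entry type, and the headroom value of 4
is an assumption, since the message above only says "a few more".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for a swap entry. */
struct swapent_sketch {
	int id;
	int priority;
};

#define SWAP_HEADROOM	4	/* assumed value, not taken from the commit */

/* Simulated kernel state. */
static int sim_ndev = 3;		/* devices currently configured */
static int sim_pending_adds = 0;	/* devices added just after the count */

/* Stand-in for the SWAP_NSWAP request: report the current count. */
static int
sim_nswap(void)
{
	int n = sim_ndev;

	/* Simulate the race: devices appear right after the count is read. */
	sim_ndev += sim_pending_adds;
	sim_pending_adds = 0;
	return n;
}

/* Stand-in for the SWAP_STATS request: fill at most cap entries. */
static int
sim_stats(struct swapent_sketch *buf, int cap)
{
	int n = sim_ndev < cap ? sim_ndev : cap;

	for (int i = 0; i < n; i++) {
		buf[i].id = i;
		buf[i].priority = 0;
	}
	return n;
}

/*
 * Ask for the count, then request stats for a few more entries than
 * that, so devices added in between still fit.  The caller frees the
 * returned buffer; *nfound receives the number of entries filled in.
 */
static struct swapent_sketch *
fetch_stats(int *nfound)
{
	int cap = sim_nswap() + SWAP_HEADROOM;
	struct swapent_sketch *buf = calloc((size_t)cap, sizeof(*buf));

	assert(nfound != NULL);
	if (buf == NULL)
		return NULL;
	*nfound = sim_stats(buf, cap);
	return buf;
}
```

The count returned by the second request is the one to trust. The first
count is used only to size the buffer.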

 Note that this kind of thing was not the reason for the messages reported
 in the PR - that was a kernel bug, caused by the kernel reordering its
 list of swap devices at an inopportune time (and then potentially returning
 fewer devices than requested, even though all were still there).

 That's fixed, but the kernel list reordering remains - along with its
 effect of returning the swap devices in an almost arbitrary order - it
 returns them in priority order, always, but within one priority, the
 devices will be returned in any random order.

 So, while here, deal with that.  Sort the list returned, and thus always
 print the devices in a stable order - sorted by priority, and then by the
 device name (which is more or less arbitrary, the actual order might not
 make much sense, but at least it will be a consistent order).
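
The stable ordering described above could be obtained with qsort(3) and a
two-key comparison, as in the sketch below. This is an illustration rather
than the committed code: struct swapent_sketch is a cut-down stand-in for
the real entry type, and sorting priorities in ascending numeric order is an
assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for a swap entry. */
struct swapent_sketch {
	int priority;
	const char *path;
};

/* Order by priority first, then by device path. */
static int
swapent_cmp(const void *a, const void *b)
{
	const struct swapent_sketch *x = a;
	const struct swapent_sketch *y = b;

	if (x->priority != y->priority)
		return (x->priority > y->priority) -
		    (x->priority < y->priority);
	return strcmp(x->path, y->path);
}

/* Sort n entries in place so repeated listings print in the same order. */
static void
sort_swapents(struct swapent_sketch *v, size_t n)
{
	assert(v != NULL || n == 0);
	qsort(v, n, sizeof(*v), swapent_cmp);
}
```

qsort(3) is not guaranteed to be stable, but with distinct device paths no
two entries compare equal, so the result is the same on every call.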


 To generate a diff of this commit:
 cvs rdiff -u -r1.21 -r1.22 src/sbin/swapctl/swaplist.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc: netbsd-bugs@NetBSD.org, wiz@NetBSD.org, mrg@NetBSD.org
Subject: Re: bin/59996: swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)
Date: Tue, 17 Feb 2026 07:10:26 +0700

 I have finished playing with swapctl related issues.

 Now we need to decide what should be pulled up, and to where.

 The kernel mod is the most likely candidate, I assume, though it is for a
 very rare and relatively harmless condition (no crashes possible, no
 unintended data exposed) so could be just left for -12 or could
 be pulled up to just -11 (or perhaps back further).   I haven't
 looked at what is involved in doing any of those yet.

 The changes to swapctl(8) I don't think are worthy of any pullups.
 The offending message is actually useful on any release where the
 kernel issue isn't corrected.  When it is corrected, the chances
 that anyone, not deliberately trying to provoke it, is ever going to
 encounter it, are minuscule.   The other changes are just noise.

 So, opinions?   I will change the PR status to pending-pullups while
 this is being decided.

 kre

State-Changed-From-To: open->needs-pullups
State-Changed-By: kre@NetBSD.org
State-Changed-When: Tue, 17 Feb 2026 00:16:37 +0000
State-Changed-Why:
This is what I meant in the previous message, needs-pullups, not pending...


From: matthew green <mrg@eterna23.net>
To: Robert Elz <kre@munnari.OZ.AU>
Cc: netbsd-bugs@NetBSD.org, wiz@NetBSD.org, gnats-bugs@NetBSD.org
Subject: re: bin/59996: swapctl: SWAP_STATS different to SWAP_NSWAP (1 != 3)
Date: Wed, 18 Feb 2026 06:48:36 +1100

 Robert Elz writes:
 > I have finished playing with swapctl related issues.
 >
 > Now we need to decide what should be pulled up, and to where.
 >
 > The kernel mod is the most likely candidate, I assume, though it is for a
 > very rare and relatively harmless condition (no crashes possible, no
 > unintended data exposed) so could be just left for -12 or could
 > be pulled up to just -11 (or perhaps back further).   I haven't
 > looked at what is involved in doing any of those yet.
 >
 > The changes to swapctl(8) I don't think are worthy of any pullups.
 > The offending message is actually useful on any release where the
 > kernel issue isn't corrected.  When it is corrected, the chances
 > that anyone, not deliberately trying to provoke it, is ever going to
 > encounter it, are minuscule.   The other changes are just noise.
 >
 > So, opinions?   I will change the PR status to pending-pullups while
 > this is being decided.

 thank you again for fixing this properly.

 i'm fine with just pulling up the kernel fixes -- maybe the man
 pages fixes are also nice, but i'd probably only do those if
 doing all of swapctl(8) too.


 .mrg.

>Unformatted:
