NetBSD Problem Report #59147
From www@netbsd.org Thu Mar 6 14:23:06 2025
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
key-exchange X25519 server-signature RSA-PSS (2048 bits)
client-signature RSA-PSS (2048 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 483B11A9239
for <gnats-bugs@gnats.NetBSD.org>; Thu, 6 Mar 2025 14:23:06 +0000 (UTC)
Message-Id: <20250306142304.E6AE81A923C@mollari.NetBSD.org>
Date: Thu, 6 Mar 2025 14:23:04 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: sysctl: bounded-memory lookups by name
X-Send-Pr-Version: www-1.0
>Number: 59147
>Category: kern
>Synopsis: sysctl: bounded-memory lookups by name
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Mar 06 14:25:00 +0000 2025
>Last-Modified: Thu Mar 06 20:05:01 +0000 2025
>Originator: Taylor R Campbell
>Release: current, 10, 9, ...
>Organization:
The NetBSysctlD __attribute__((constructor))
>Environment:
>Description:
Userland sysctl name lookups for nodes like "kern.entropy.epoch", given a prefix of known MIB numbers like CTL_KERN for "kern", work as follows:
1. Query sysctl {CTL_KERN, CTL_QUERY} for the length of a list of all sysctl nodes kern.*, with a null buffer.
2. Allocate a buffer to hold them.
3. Query sysctl {CTL_KERN, CTL_QUERY} for the list of all sysctl nodes kern.*.
4. Search through those sysctl nodes for the next matching component name "entropy".
5. Repeat until all the component names have been matched, "kern.entropy.epoch".
In principle this requires unbounded memory allocation, which makes it troublesome to use in difficult contexts like ELF constructors or signal handlers. For certain special cases, we can estimate the maximum size of the buffer based on what we know about the kernel and allocate a stack buffer of that size, as we did for PR lib/59107: libc constructors on arm use malloc <https://gnats.NetBSD.org/59107>, but this is fragile.
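For illustration only, the lookup steps above can be sketched against a mock node table, since the real CTL_QUERY interface exists only on NetBSD. Everything in this sketch (mock_tree, mock_find, mock_query, lookup_by_name, and the MIB numbers) is a hypothetical stand-in, not the libc implementation. The point it shows is that the malloc() in step 2 is sized by however many siblings exist at each level.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define ROOT		(-1)
#define NOT_FOUND	(-2)

/* Hypothetical stand-in for the kernel's node descriptions. */
struct mock_node {
	int parent;		/* index into mock_tree, ROOT for top level */
	int num;		/* MIB number within the parent (made up) */
	const char *name;
};

static const struct mock_node mock_tree[] = {
	{ ROOT, 1, "kern" },
	{ ROOT, 6, "hw" },
	{ 0, 10, "hostname" },
	{ 0, 20, "entropy" },
	{ 3, 1, "epoch" },
	{ 3, 2, "needed" },
};
#define MOCK_NNODES (sizeof(mock_tree) / sizeof(mock_tree[0]))

/* Resolve a numeric MIB prefix to an index in mock_tree. */
static int
mock_find(const int *mib, size_t miblen)
{
	int cur = ROOT;

	for (size_t i = 0; i < miblen; i++) {
		int next = NOT_FOUND;
		for (size_t j = 0; j < MOCK_NNODES; j++)
			if (mock_tree[j].parent == cur &&
			    mock_tree[j].num == mib[i])
				next = (int)j;
		if (next == NOT_FOUND)
			return NOT_FOUND;
		cur = next;
	}
	return cur;
}

/* Mock of a CTL_QUERY-style call: list the children of a node.
 * With buf == NULL it only reports how many entries are needed. */
static int
mock_query(const int *mib, size_t miblen, struct mock_node *buf,
    size_t *countp)
{
	int parent = mock_find(mib, miblen);
	size_t n = 0;

	assert(countp != NULL);
	if (parent == NOT_FOUND)
		return ENOENT;
	for (size_t j = 0; j < MOCK_NNODES; j++) {
		if (mock_tree[j].parent != parent)
			continue;
		if (buf != NULL) {
			if (n == *countp)
				return ENOMEM;
			buf[n] = mock_tree[j];
		}
		n++;
	}
	*countp = n;
	return 0;
}

/* Translate a dotted name into MIB numbers, one component at a time. */
int
lookup_by_name(const char *name, int *mib, size_t *miblenp)
{
	size_t depth = 0;

	while (*name != '\0') {
		size_t complen = strcspn(name, "."), n = 0, i;
		struct mock_node *list;
		int error;

		if (depth == *miblenp)
			return ENOMEM;
		/* Step 1: ask for the size with a null buffer. */
		if ((error = mock_query(mib, depth, NULL, &n)) != 0)
			return error;
		/* Step 2: allocation sized by the number of siblings. */
		if ((list = malloc(n ? n * sizeof(*list) : 1)) == NULL)
			return ENOMEM;
		/* Step 3: fetch the list of children. */
		if ((error = mock_query(mib, depth, list, &n)) != 0) {
			free(list);
			return error;
		}
		/* Step 4: search for the matching component name. */
		for (i = 0; i < n; i++)
			if (strlen(list[i].name) == complen &&
			    memcmp(list[i].name, name, complen) == 0)
				break;
		if (i < n)
			mib[depth++] = list[i].num;
		free(list);
		if (i == n)
			return ENOENT;
		/* Step 5: move on to the next component. */
		name += complen;
		if (*name == '.')
			name++;
	}
	*miblenp = depth;
	return 0;
}
```

With this mock table, looking up "kern.entropy.epoch" performs three allocations, one per component, and none of them has a size the caller can know in advance.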
Certain information should perhaps be transmitted to userland another way -- e.g., on aarch64, the kernel could emulate MRS xN, ID_*_EL1 instructions on trap from EL0 so userland can execute them without a sysctl; the entropy epoch could be put in a shared page with vDSO -- and of course we can statically allocate sysctl numbers with #defines in sys/*.h, but we should nevertheless really be able to get at arbitrary sysctl nodes with small bounded memory allocation.
>How-To-Repeat:
Try to query sysctls in troublesome contexts like ELF constructors, ifunc selectors, or signal handlers.
>Fix:
This probably requires writing a new kernel interface, say CTL_QUERYBYNAME, that takes a string on input and returns a MIB number on output.
>Audit-Trail:
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/59147: sysctl: bounded-memory lookups by name
Date: Fri, 07 Mar 2025 03:03:27 +0700
Date: Thu, 6 Mar 2025 14:25:00 +0000 (UTC)
From: campbell+netbsd@mumble.net
Message-ID: <20250306142500.B73421A923E@mollari.NetBSD.org>
| This probably requires writing a new kernel interface,
| say CTL_QUERYBYNAME that takes a string on input and returns
| a MIB number on output.
Or just make a sysctlbyname() (approx) system call, and just use
that instead of sysctl() (put the user code implementation in the
kernel, where it is trivial).
The "approx" is because I'd make its signature be
	int whatevername(const char *sname, void *oldp, size_t *oldlenp,
	    const void *newp, size_t newlen, int *name, u_int *namelenp,
	    size_t snamelen);
where the first 5 args are exactly what they are in sysctlbyname()
and the last three are a pointer to an array of ints to hold the numeric
MIB number, a set/alter pointer to an int which contains the number of
elements in that array (on call) and the number that were required (on return)
which might be larger than it was on entry, in which case only the first N
are actually stored in *name (those two work just the same way as oldp/oldlenp
do), and last the length of sname (strlen(sname)) so the kernel doesn't
need to guess how much to copyin.
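A minimal sketch of the name/namelenp behaviour described above, assuming the semantics are exactly as stated in this message: at most the caller's capacity is stored, and the number of elements required is always reported back. The helper mib_copyout is hypothetical and only models the copy-out step, not the proposed system call itself; unsigned int stands in for u_int to keep the sketch portable.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical model of the in/out length argument: on entry *namelenp
 * is the capacity of name[], on return it is the number of elements
 * the full MIB needs. Only the first min(capacity, miblen) elements
 * are stored in name[].
 */
void
mib_copyout(const int *mib, unsigned int miblen, int *name,
    unsigned int *namelenp)
{
	unsigned int n;

	assert(namelenp != NULL);
	n = *namelenp < miblen ? *namelenp : miblen;
	if (n > 0)
		memcpy(name, mib, n * sizeof(mib[0]));
	*namelenp = miblen;
}
```

A caller with a small fixed-size array on the stack can then detect truncation by comparing the returned length with the capacity it passed in, without ever allocating memory.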
Whether we would use u_int or size_t for namelenp isn't all that important,
some of the sysctl*() functions use one, and others the other. u_int
seems more sensible to me (could even use u_short) - its max value should
be well under 1000. Probably always < 100.
The order of the args needn't be that necessarily, just easier here
for me to cut&paste!
That way the same thing can be also used to implement sysctlnametomib().
I think that's cleaner than butchering sysctl() to have a query type
which uses different data types for the input & output.
kre
Copyright © 1994-2025
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.