NetBSD Problem Report #59125
From www@netbsd.org Mon Mar 3 21:34:20 2025
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
key-exchange X25519 server-signature RSA-PSS (2048 bits)
client-signature RSA-PSS (2048 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id BC8161A923A
for <gnats-bugs@gnats.NetBSD.org>; Mon, 3 Mar 2025 21:34:20 +0000 (UTC)
Message-Id: <20250303213419.54F9B1A923D@mollari.NetBSD.org>
Date: Mon, 3 Mar 2025 21:34:19 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: pthread_once(3) races with concurrent fork
X-Send-Pr-Version: www-1.0
>Number: 59125
>Category: lib
>Synopsis: pthread_once(3) races with concurrent fork
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: lib-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Mar 03 21:35:01 +0000 2025
>Last-Modified: Sun Mar 30 23:05:01 +0000 2025
>Originator: Taylor R Campbell
>Release: current, 10, 9, ...
>Organization:
The NetBSD Forkonceler
>Environment:
>Description:
Suppose thread A calls pthread_once(&once, &init), and thread B calls fork at the same time.
In thread A, pthread_once will (a) take the lock inside &once and then (b) call the init function.
But if the fork happens between (a) and (b), the child will see the lock held with no threads running to release it. So when the child process tries pthread_once(&once, ...), it will hang.
>How-To-Repeat:
#include <sys/wait.h>
#include <err.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
pthread_once_t once = PTHREAD_ONCE_INIT;
static void
init(void)
{
}
static void *
forkit(void *cookie)
{
pthread_barrier_t *bar = cookie;
pid_t pid;
int status;
(void)pthread_barrier_wait(bar);
if ((pid = fork()) == -1)
err(1, "fork");
if (pid == 0) {
(void)alarm(1);
(void)pthread_once(&once, &init);
_exit(0);
}
if (waitpid(pid, &status, 0) == -1)
err(1, "waitpid");
if (WIFSIGNALED(status)) {
errx(1, "child exited on signal %d (%s)", WTERMSIG(status),
strsignal(WTERMSIG(status)));
}
if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
errx(1, "child exited 0x%x", status);
return NULL;
}
static void
test(void)
{
pthread_barrier_t bar;
pthread_t t;
int error;
error = pthread_barrier_init(&bar, NULL, 2);
if (error)
errc(1, error, "pthread_barrier_init");
error = pthread_create(&t, NULL, &forkit, &bar);
if (error)
errc(1, error, "pthread_create");
(void)pthread_barrier_wait(&bar);
(void)pthread_once(&once, &init);
error = pthread_join(t, NULL);
if (error)
errc(1, error, "pthread_join");
}
int
main(void)
{
for (unsigned long long n = 1;; n++) {
pid_t pid;
int status;
if ((pid = fork()) == -1)
err(1, "fork");
if (pid == 0) {
test();
_exit(0);
}
if (waitpid(pid, &status, 0) == -1)
err(1, "waitpid");
if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
errx(1, "test exited 0x%x after %zu trial%s", status,
n, n == 1 ? "" : "s");
}
}
return 0;
}
>Fix:
Yes, please!
See also: PR lib/59124: arc4random(3): first call in process races with concurrent fork
>Audit-Trail:
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: netbsd-bugs@NetBSD.org
Subject: Re: lib/59125: pthread_once(3) races with concurrent fork
Date: Wed, 5 Mar 2025 22:37:37 +0000
This is a multi-part message in MIME format.
--=_su1OON3HGWIM8fZuEZsawKeY4iHDSE1Z
My first attempt was to count a fork generation number, and shoehorn
it into the existing pthread_once structure with a futex, so that
pthread_once can reliably distinguish between
(a) another thread in this process is initializing it, or
(b) another thread in some ancestor of this process was initializing
it when we forked,
and handle waiting and wakeups for transition changes just like it
currently uses the fork-unsafe pthread_mutex(3) for.
The fork generation number itself doesn't require any synchronization:
it is only ever modified while the process is single-threaded, in the
fork child. Of course, it will roll over after 2^30 _nested_ forks,
but most process hierarchies never come near that (and any exec resets
it, naturally).
(It turns out this is how glibc works too.)
Unfortunately, that doesn't help: the child can still observe multiple
calls to the init routine, if the fork happens anywhere from the start
of the init routine to when the pthread_once_t is marked done. So the
attached variant of the test program which checks for repeated
initialization calls still fails -- perhaps at lower frequency than
the original test program's deadlocks, but not much lower, often
enough to trigger on my laptop in a VM after a few minutes; in any
case it rounds to `still not fork-safe'.
--=_su1OON3HGWIM8fZuEZsawKeY4iHDSE1Z
Content-Type: text/plain; charset="ISO-8859-1"; name="oncefork"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="oncefork.c"
#include <sys/wait.h>
#include <err.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
pthread_once_t once;
int x;
long long ntrials;
static void
siginfo(int signo)
{
char buf[128];
snprintf_ss(buf, sizeof(buf), "no hang after %llu trials\n", ntrials);
(void)write(STDOUT_FILENO, buf, strlen(buf));
if (signo !=3D SIGINFO) {
(void)signal(signo, SIG_DFL);
raise(signo);
}
}
static void
ofunc_silent(void)
{
x++;
}
static void *
fork_and_once(void *cookie)
{
pthread_barrier_t *bar =3D cookie;
pid_t pid, child;
int status;
(void)pthread_barrier_wait(bar);
if ((pid =3D fork()) =3D=3D -1)
err(1, "fork");
if (pid =3D=3D 0) {
(void)alarm(1);
(void)pthread_once(&once, &ofunc_silent);
_exit(x - 1);
}
if ((child =3D waitpid(pid, &status, 0)) =3D=3D -1)
err(1, "waitpid");
if (child !=3D pid) {
errx(1, "[%llu trials] child=3D%lld pid=3D%lld", ntrials,
(long long)child, (long long)pid);
}
if (WIFSIGNALED(status)) {
errx(1, "[%llu trials] child exited on signal %d (%s)",
ntrials,
WTERMSIG(status),
strsignal(WTERMSIG(status)));
}
if (!WIFEXITED(status) || WEXITSTATUS(status) !=3D 0)
errx(1, "[%llu trials] child exited 0x%x", ntrials, status);
return NULL;
}
int
main(void)
{
static pthread_once_t once0 =3D PTHREAD_ONCE_INIT;
pthread_barrier_t bar;
sigset_t mask, omask;
int error;
error =3D pthread_barrier_init(&bar, NULL, 2);
if (error)
err(1, "pthread_barrier_init");
if (sigemptyset(&mask) =3D=3D -1)
err(1, "sigemptyset");
if (sigaddset(&mask, SIGINFO) =3D=3D -1)
err(1, "sigaddset");
if (sigaddset(&mask, SIGINT) =3D=3D -1)
err(1, "sigaddset");
if (signal(SIGINFO, &siginfo) =3D=3D SIG_ERR)
err(1, "signal(SIGINFO)");
if (signal(SIGINT, &siginfo) =3D=3D SIG_ERR)
err(1, "signal(SIGINFO)");
for (;;) {
pthread_t t;
if (sigprocmask(SIG_BLOCK, &mask, &omask) =3D=3D -1)
err(1, "sigprocmask(SIG_BLOCK)");
once =3D once0;
x =3D 0;
error =3D pthread_create(&t, NULL, &fork_and_once, &bar);
if (error)
errc(1, error, "pthread_create");
(void)alarm(1);
(void)pthread_barrier_wait(&bar);
(void)pthread_once(&once, &ofunc_silent);
(void)alarm(0);
error =3D pthread_join(t, NULL);
if (error)
errc(1, error, "pthread_join");
if (x !=3D 1)
errx(1, "[%llu trials] x=3D%d\n", ntrials, x);
ntrials++;
if (sigprocmask(SIG_SETMASK, &omask, NULL) =3D=3D -1)
err(1, "sigprocmask(SIG_SETMASK)");
}
return 0;
}
--=_su1OON3HGWIM8fZuEZsawKeY4iHDSE1Z
Content-Type: text/plain; charset="ISO-8859-1"; name="pr59125-pthreadonceforkfutex"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="pr59125-pthreadonceforkfutex.patch"
# HG changeset patch
# User Taylor R Campbell <riastradh@NetBSD.org>
# Date 1741198759 0
# Wed Mar 05 18:19:19 2025 +0000
# Branch trunk
# Node ID db0cae397ecdf2795f54954e2009a3d06fdb95ef
# Parent cd6b61c1b621c4bb80fbe22e714ba188467be421
# EXP-Topic riastradh-pr59125-oncefork
WIP: pthread_once(3): Try to avoid fork races with futexes.
Turns out, this doesn't work! It is still possible for a fork
concurrent with pthread_once to cause the child to see the
initialization action run twice.
PR lib/59125: pthread_once(3) races with concurrent fork
diff -r cd6b61c1b621 -r db0cae397ecd lib/libc/gen/pthread_atfork.c
--- a/lib/libc/gen/pthread_atfork.c Wed Mar 05 15:29:16 2025 +0000
+++ b/lib/libc/gen/pthread_atfork.c Wed Mar 05 18:19:19 2025 +0000
@@ -175,6 +175,8 @@ out: mutex_unlock(&atfork_lock);
return error;
}
=20
+uint64_t __libc_forkgen;
+
pid_t
fork(void)
{
@@ -199,6 +201,7 @@ fork(void)
mutex_unlock(&atfork_lock);
} else {
/* We are the child */
+ __libc_forkgen++;
_malloc_postfork_child();
SIMPLEQ_FOREACH(iter, &childq, next)
(*iter->fn)();
diff -r cd6b61c1b621 -r db0cae397ecd lib/libc/include/forkgen.h
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/lib/libc/include/forkgen.h Wed Mar 05 18:19:19 2025 +0000
@@ -0,0 +1,36 @@
+/* $NetBSD$ */
+
+/*-
+ * Copyright (c) 2025 The NetBSD Foundation, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTO=
RS
+ * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIM=
ITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICU=
LAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTO=
RS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF =
THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _LIB_LIBC_FORKGEN_H_
+#define _LIB_LIBC_FORKGEN_H_
+
+#include <stdint.h>
+
+extern uint64_t __libc_forkgen;
+
+#endif /* _LIB_LIBC_FORKGEN_H_ */
diff -r cd6b61c1b621 -r db0cae397ecd lib/libpthread/pthread_once.c
--- a/lib/libpthread/pthread_once.c Wed Mar 05 15:29:16 2025 +0000
+++ b/lib/libpthread/pthread_once.c Wed Mar 05 18:19:19 2025 +0000
@@ -1,12 +1,9 @@
-/* $NetBSD: pthread_once.c,v 1.5 2025/03/04 00:41:00 riastradh Exp $ */
+/* $NetBSD$ */
=20
/*-
- * Copyright (c) 2001, 2003 The NetBSD Foundation, Inc.
+ * Copyright (c) 2025 The NetBSD Foundation, Inc.
* All rights reserved.
*
- * This code is derived from software contributed to The NetBSD Foundation
- * by Nathan J. Williams, and by Jason R. Thorpe.
- *
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
@@ -15,13 +12,6 @@
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
- * 3. All advertising materials mentioning features or use of this software
- * must display the following acknowledgement:
- * This product includes software developed by the NetBSD
- * Foundation, Inc. and its contributors.
- * 4. Neither the name of The NetBSD Foundation nor the names of its
- * contributors may be used to endorse or promote products derived
- * from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTO=
RS
* ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIM=
ITED
@@ -37,41 +27,226 @@
*/
=20
#include <sys/cdefs.h>
-__RCSID("$NetBSD: pthread_once.c,v 1.5 2025/03/04 00:41:00 riastradh Exp $=
");
+__RCSID("$NetBSD");
=20
/* Need to use libc-private names for atomic operations. */
#include "../../common/lib/libc/atomic/atomic_op_namespace.h"
=20
+#include <sys/futex.h>
+#include <sys/syscall.h>
+
+#include <limits.h>
+#include <stdatomic.h>
+#include <unistd.h>
+
+#include "../../lib/libc/include/forkgen.h"
+
#include "pthread.h"
#include "pthread_int.h"
#include "reentrant.h"
=20
-static void
-once_cleanup(void *closure)
+/*************************************************************************=
****/
+/*
+ * BEGIN MOVE ME ELSEWEHRE
+ */
+
+#define atomic_store_relaxed(p, v) \
+ atomic_store_explicit(p, v, memory_order_relaxed)
+
+#define atomic_store_release(p, v) \
+ atomic_store_explicit(p, v, memory_order_release)
+
+#define atomic_load_relaxed(p) \
+ atomic_load_explicit(p, memory_order_relaxed)
+
+#define atomic_load_acquire(p) \
+ atomic_load_explicit(p, memory_order_acquire)
+
+static int
+atomic_cas_int(volatile int *p, int o, int n)
+{
+
+ (void)atomic_compare_exchange_strong_explicit(p, &o, n,
+ memory_order_relaxed, memory_order_relaxed);
+ return o;
+}
+
+static int
+futex(volatile int *uaddr, int op, int val, const struct timespec *timeout,
+ volatile int *uaddr2, int val2, int val3)
+{
+ return syscall(SYS___futex, uaddr, op, val, timeout, uaddr2,
+ val2, val3);
+}
+
+static int
+futex_wait(volatile int *uaddr, int cmpval, const struct timespec *timeout)
+{
+
+ return futex(uaddr, FUTEX_WAIT, /*val*/cmpval, timeout, /*uaddr2*/NULL,
+ /*val2*/0, /*val3*/0);
+}
+
+static int
+futex_wake(volatile int *uaddr, int nwake)
{
=20
- pthread_mutex_unlock((pthread_mutex_t *)closure);
+ return futex(uaddr, FUTEX_WAKE, /*val*/nwake, /*timeout*/NULL,
+ /*uaddr2*/NULL, /*val2*/0, /*val3*/0);
+}
+
+static int
+futex_wake_all(volatile int *uaddr)
+{
+
+ return futex_wake(uaddr, INT_MAX);
+}
+
+/*
+ * END MOVE ME ELSEWEHRE
+ */
+/*************************************************************************=
****/
+
+#define ONCE_LOCKED ((int)__BIT(31))
+#define ONCE_WAITING ((int)__BIT(30))
+#define ONCE_FORKGEN ((int)__BITS(29,0))
+#define ONCE_INIT 0
+#define ONCE_DONE 1
+
+static int
+once_forkgen(void)
+{
+
+ /*
+ * This algorithm is limited to 2^30 - 1 nested forks before it
+ * may deadlock in the event of a race.
+ */
+ return __libc_forkgen & ONCE_FORKGEN;
+}
+
+static void
+once_cleanup(void *cookie)
+{
+ pthread_once_t *once =3D cookie;
+
+ /*
+ * Clear the state and wake other callers so someone else can
+ * try in ourstead.
+ *
+ * If concurrent fork happens before the store, the child will
+ * see a stale fork generation and restore it to the initial
+ * state. If concurrent fork happens after the store, the
+ * child will see the initial state. Either way, the wakeups
+ * don't matter for the child -- the thread calling fork won't
+ * be woken up, and any other threads won't be in the child, so
+ * it's all moot.
+ *
+ * Don't worry about micro-optimizing this with the
+ * ONCE_WAITING bit -- cancellation is an unlikely scenario
+ * that's not worth optimizing; let's keep it simple and
+ * reliable.
+ */
+ pthread__assert((atomic_load_relaxed(&once->pto_done) &
+ ~ONCE_WAITING) =3D=3D (ONCE_LOCKED|once_forkgen()));
+ atomic_store_relaxed(&once->pto_done, ONCE_INIT);
+ (void)futex_wake_all(&once->pto_done);
}
=20
int
-pthread_once(pthread_once_t *once_control, void (*routine)(void))
+pthread_once(pthread_once_t *once, void (*init)(void))
{
+ const int forkgen =3D once_forkgen();
+ int done, cancelstate;
+
if (__predict_false(__uselibcstub))
- return __libc_thr_once_stub(once_control, routine);
+ return __libc_thr_once_stub(once, init);
+
+ /*
+ * Optimistically check whether it's already done, with a
+ * load-acquire to synchronize with the store-release in
+ * whatever thread did the initialization.
+ */
+ if (__predict_true(atomic_load_acquire(&once->pto_done) =3D=3D
+ ONCE_DONE))
+ return 0;
=20
- if (once_control->pto_done =3D=3D 0) {
- pthread_mutex_lock(&once_control->pto_mutex);
- pthread_cleanup_push(&once_cleanup, &once_control->pto_mutex);
- if (once_control->pto_done =3D=3D 0) {
- routine();
- membar_release();
- once_control->pto_done =3D 1;
- }
- pthread_cleanup_pop(1);
- } else {
+ /*
+ * Defer cancellation while we synchronize in case the caller
+ * is configured as PTHREAD_CANCEL_ASYNCHRONOUS.
+ */
+ pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &cancelstate);
+top:
+ /*
+ * Try to claim it, marked with the current fork generation.
+ * If it was done in the interim (unlikely, only if we lose a
+ * race to another thread), great -- issue an acquire barrier
+ * to synchronize with the other thread's initialization,
+ * restore the cancellation state, and we're done.
+ */
+ done =3D atomic_cas_int(&once->pto_done, ONCE_INIT,
+ ONCE_LOCKED|forkgen);
+ if (__predict_false(done =3D=3D ONCE_DONE)) {
membar_acquire();
+ goto out;
}
=20
+ /*
+ * If is held by a another thread in this process, that thread
+ * is presumably still running -- even if it has been
+ * cancelled, the once_cleanup routine will handle it. Wait
+ * for that thread to release it.
+ *
+ * If it was held by another thread in an ancestor of this
+ * process, from a different fork generation, try to clear it
+ * and start over. We have to use atomic_cas_* to clear it
+ * because another thread might be doing the same thing -- it
+ * doesn't matter which of us wins, but the transition to clear
+ * must happen only once (otherwise another thread might have
+ * finished initialization first, at which point we must not
+ * clear it).
+ */
+ if (__predict_false(done !=3D ONCE_INIT)) {
+ if ((done & ~ONCE_WAITING) =3D=3D (ONCE_LOCKED|forkgen)) {
+ (void)futex_wait(&once->pto_done, done, NULL);
+ } else {
+ (void)atomic_cas_int(&once->pto_done, done,
+ ONCE_INIT);
+ }
+ goto top;
+ }
+
+ /*
+ * Acquired at the current fork generation. Arrange to back
+ * out if cancelled, and then run the init routine with the
+ * cancellation state restored to the caller's.
+ */
+ pthread_cleanup_push(&once_cleanup, once);
+ pthread_setcancelstate(cancelstate, NULL);
+ (*init)();
+ pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &cancelstate);
+ pthread_cleanup_pop(/*execute*/0);
+
+ /*
+ * Mark it done with a store-release, to synchronize with other
+ * users' load-acquire. If there were other waiters, notify
+ * other callers that we are done -- but avoid a futex syscall
+ * if not.
+ */
+ membar_release();
+ done =3D atomic_cas_int(&once->pto_done, ONCE_LOCKED|forkgen,
+ ONCE_DONE);
+ if (__predict_false(done !=3D (ONCE_LOCKED|forkgen))) {
+ pthread__assert(done =3D=3D (ONCE_LOCKED|ONCE_WAITING|forkgen));
+ atomic_store_relaxed(&once->pto_done, ONCE_DONE);
+ (void)futex_wake_all(&once->pto_done);
+ }
+out:
+ /*
+ * Restore the cancellation state now that we are done
+ * synchronizing and no longer need to clean up after
+ * ourselves.
+ */
+ pthread_setcancelstate(cancelstate, NULL);
return 0;
}
=20
--=_su1OON3HGWIM8fZuEZsawKeY4iHDSE1Z--
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Cc:
Subject: Re: lib/59125: pthread_once(3) races with concurrent fork
Date: Wed, 5 Mar 2025 22:56:40 +0000
This is a multi-part message in MIME format.
--=_QSVhpQW7wL2t2ZyM0lNb6kI+m4HN6cIl
My second attempt is to serialize pthread_once and fork. I think this
is unavoidable to guarantee that the parent and child both observe a
single call to the init routine; any fork concurrent with a call to
the init routine in the parent will necessarily break this contract,
unless we decide to prohibit pthread_once or make it return failure in
the child -- which makes it not particularly useful.
Serializing pthread_once and fork doesn't require serializing all init
routines, however: instead, if there's any pthread_once call in
progress, we hold up fork, but the init routines for different
pthread_once_t objects can run in parallel. And if you try to call
fork in a pthread_once init routine, it fails with EDEADLK. (I picked
EDEADLK mainly because without logic to detect this scenario, it would
deadlock. Maybe EBUSY or something would be better, no strong feeling
here.)
If we want, we could also implement `writer starvation' avoidance by
having fork (the `writer') block new pthread_once calls (the
`readers') until all the top-level pending calls have completed.
--=_QSVhpQW7wL2t2ZyM0lNb6kI+m4HN6cIl
Content-Type: text/plain; charset="ISO-8859-1"; name="pr59125-pthreadonceforklock"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="pr59125-pthreadonceforklock.patch"
# HG changeset patch
# User Taylor R Campbell <riastradh@NetBSD.org>
# Date 1741201110 0
# Wed Mar 05 18:58:30 2025 +0000
# Branch trunk
# Node ID 652c5f82aa0cacebc9d86ccd9618950fe579752a
# Parent cd6b61c1b621c4bb80fbe22e714ba188467be421
# EXP-Topic riastradh-pr59125-oncefork
pthread_once(3): Fix races with concurrent fork by blocking it.
If fork can copy any states that the pthread_once initialization
routine transitions through, then it is not possible to satisfy the
contract of pthread_once(3), which is that the initialization routine
has been called exactly once.
To avoid this, we hold up fork until any concurrent pthread_once(3)
calls that may do any initialization have completed.
It is also not possible for pthread_once(3) to satisfy its contract
if the initalization routine itself forks. To prevent this, we make
fork(3) fail with EDEADLK if you try to call it during that time.
Nested calls to pthread_once(3) count the call depth. (Without this
explicit detection, fork(3) would simply deadlock instead.)
The runtime cost to fork(3) is two predicted-not-taken branches, one
via indirect function call -- and in the unlikely event that there
are any concurrent pthread_once(3) calls, it waits for them to
complete.
The runtime cost to the first call to pthread_once(3) is potential
contention over the atfork_lock, which serializes fork(3) and
pthread_atfork(3). However, the initialization routines themselves
in concurrent calls to pthread_once(3) are not serialized; they can
run in parallel.
While here, protect pthread_once(3) from PTHREAD_CANCEL_ASYNCHRONOUS.
PR lib/59125: pthread_once(3) races with concurrent fork
diff -r cd6b61c1b621 -r 652c5f82aa0c lib/libc/gen/pthread_atfork.c
--- a/lib/libc/gen/pthread_atfork.c Wed Mar 05 15:29:16 2025 +0000
+++ b/lib/libc/gen/pthread_atfork.c Wed Mar 05 18:58:30 2025 +0000
@@ -41,6 +41,7 @@
#include <unistd.h>
#include <sys/queue.h>
#include "extern.h"
+#include "fork.h"
#include "reentrant.h"
=20
#ifdef __weak_alias
@@ -175,6 +176,28 @@ out: mutex_unlock(&atfork_lock);
return error;
}
=20
+static uint64_t block_fork_count;
+static cond_t block_fork_cv =3D COND_INITIALIZER;
+
+void
+__libc_block_fork(void)
+{
+
+ mutex_lock(&atfork_lock);
+ block_fork_count++; /* 64-bit, can't overflow in 500 years */
+ mutex_unlock(&atfork_lock);
+}
+
+void
+__libc_unblock_fork(void)
+{
+
+ mutex_lock(&atfork_lock);
+ if (--block_fork_count =3D=3D 0)
+ cond_broadcast(&block_fork_cv);
+ mutex_unlock(&atfork_lock);
+}
+
pid_t
fork(void)
{
@@ -182,8 +205,16 @@ fork(void)
pid_t ret;
=20
mutex_lock(&atfork_lock);
+ if (__predict_false(__libc_thr_in_once())) {
+ mutex_unlock(&atfork_lock);
+ errno =3D EDEADLK;
+ return -1;
+ }
+ while (__predict_false(block_fork_count))
+ cond_wait(&block_fork_cv, &atfork_lock);
+
SIMPLEQ_FOREACH(iter, &prepareq, next)
- (*iter->fn)();
+ (*iter->fn)();
_malloc_prefork();
=20
ret =3D __locked_fork(&errno);
diff -r cd6b61c1b621 -r 652c5f82aa0c lib/libc/include/fork.h
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/lib/libc/include/fork.h Wed Mar 05 18:58:30 2025 +0000
@@ -0,0 +1,36 @@
+/* $NetBSD$ */
+
+/*-
+ * Copyright (c) 2025 The NetBSD Foundation, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTO=
RS
+ * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIM=
ITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICU=
LAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTO=
RS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF =
THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _LIB_LIBC_FORK_H_
+#define _LIB_LIBC_FORK_H_
+
+void __libc_block_fork(void);
+void __libc_unblock_fork(void);
+int __libc_thr_in_once(void);
+
+#endif /* _LIB_LIBC_FORK_H_ */
diff -r cd6b61c1b621 -r 652c5f82aa0c lib/libc/include/reentrant.h
--- a/lib/libc/include/reentrant.h Wed Mar 05 15:29:16 2025 +0000
+++ b/lib/libc/include/reentrant.h Wed Mar 05 18:58:30 2025 +0000
@@ -262,6 +262,7 @@ void *__libc_thr_getspecific_stub(thread
int __libc_thr_keydelete_stub(thread_key_t);
=20
int __libc_thr_once_stub(once_t *, void (*)(void));
+int __libc_thr_in_once_stub(void);
int __libc_thr_sigsetmask_stub(int, const sigset_t *, sigset_t *);
thr_t __libc_thr_self_stub(void);
int __libc_thr_yield_stub(void);
diff -r cd6b61c1b621 -r 652c5f82aa0c lib/libc/thread-stub/thread-stub.c
--- a/lib/libc/thread-stub/thread-stub.c Wed Mar 05 15:29:16 2025 +0000
+++ b/lib/libc/thread-stub/thread-stub.c Wed Mar 05 18:58:30 2025 +0000
@@ -355,6 +355,7 @@ int
/* misc. */
=20
__weak_alias(__libc_thr_once,__libc_thr_once_stub)
+__weak_alias(__libc_thr_in_once,__libc_thr_in_once_stub)
__weak_alias(__libc_thr_sigsetmask,__libc_thr_sigsetmask_stub)
__weak_alias(__libc_thr_self,__libc_thr_self_stub)
__weak_alias(__libc_thr_yield,__libc_thr_yield_stub)
@@ -365,6 +366,8 @@ int
__weak_alias(__libc_thr_curcpu,__libc_thr_curcpu_stub)
=20
=20
+static size_t __libc_thr_once_depth;
+
int
__libc_thr_once_stub(once_t *o, void (*r)(void))
{
@@ -372,7 +375,9 @@ int
/* XXX Knowledge of libpthread types. */
=20
if (o->pto_done =3D=3D 0) {
+ __libc_thr_once_depth++;
(*r)();
+ __libc_thr_once_depth--;
o->pto_done =3D 1;
}
=20
@@ -380,6 +385,13 @@ int
}
=20
int
+__libc_thr_in_once_stub(void)
+{
+
+ return __libc_thr_once_depth !=3D 0;
+}
+
+int
__libc_thr_sigsetmask_stub(int h, const sigset_t *s, sigset_t *o)
{
=20
diff -r cd6b61c1b621 -r 652c5f82aa0c lib/libpthread/pthread_int.h
--- a/lib/libpthread/pthread_int.h Wed Mar 05 15:29:16 2025 +0000
+++ b/lib/libpthread/pthread_int.h Wed Mar 05 18:58:30 2025 +0000
@@ -106,6 +106,7 @@ struct __pthread_st {
struct pthread_lock_ops pt_lockops;/* Cached to avoid PIC overhead */
void *(*pt_func)(void *);/* Function to call at start. */
void *pt_arg; /* Argument to pass at start. */
+ size_t pt_oncedepth; /* pthread_once call depth */
=20
/* Stack of cancellation cleanup handlers and their arguments */
PTQ_HEAD(, pt_clean_t) pt_cleanup_stack;
diff -r cd6b61c1b621 -r 652c5f82aa0c lib/libpthread/pthread_mi.expsym
--- a/lib/libpthread/pthread_mi.expsym Wed Mar 05 15:29:16 2025 +0000
+++ b/lib/libpthread/pthread_mi.expsym Wed Mar 05 18:58:30 2025 +0000
@@ -29,6 +29,7 @@
__libc_thr_errno
__libc_thr_exit
__libc_thr_getspecific
+__libc_thr_in_once
__libc_thr_init
__libc_thr_keycreate
__libc_thr_keydelete
diff -r cd6b61c1b621 -r 652c5f82aa0c lib/libpthread/pthread_once.c
--- a/lib/libpthread/pthread_once.c Wed Mar 05 15:29:16 2025 +0000
+++ b/lib/libpthread/pthread_once.c Wed Mar 05 18:58:30 2025 +0000
@@ -1,12 +1,9 @@
-/* $NetBSD: pthread_once.c,v 1.5 2025/03/04 00:41:00 riastradh Exp $ */
+/* $NetBSD$ */
=20
/*-
- * Copyright (c) 2001, 2003 The NetBSD Foundation, Inc.
+ * Copyright (c) 2025 The NetBSD Foundation, Inc.
* All rights reserved.
*
- * This code is derived from software contributed to The NetBSD Foundation
- * by Nathan J. Williams, and by Jason R. Thorpe.
- *
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
@@ -15,13 +12,6 @@
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
- * 3. All advertising materials mentioning features or use of this software
- * must display the following acknowledgement:
- * This product includes software developed by the NetBSD
- * Foundation, Inc. and its contributors.
- * 4. Neither the name of The NetBSD Foundation nor the names of its
- * contributors may be used to endorse or promote products derived
- * from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTO=
RS
* ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIM=
ITED
@@ -37,42 +27,123 @@
*/
=20
#include <sys/cdefs.h>
-__RCSID("$NetBSD: pthread_once.c,v 1.5 2025/03/04 00:41:00 riastradh Exp $=
");
+__RCSID("$NetBSD$");
=20
/* Need to use libc-private names for atomic operations. */
#include "../../common/lib/libc/atomic/atomic_op_namespace.h"
=20
+#include "../../lib/libc/include/fork.h"
+
+#include <stdatomic.h>
+
#include "pthread.h"
#include "pthread_int.h"
#include "reentrant.h"
=20
+#define atomic_load_acquire(p) \
+ atomic_load_explicit(p, memory_order_acquire)
+
+#define atomic_store_release(p, v) \
+ atomic_store_explicit(p, v, memory_order_release)
+
static void
-once_cleanup(void *closure)
+once_cleanup(void *cookie)
{
+ pthread_once_t *once_control =3D cookie;
=20
- pthread_mutex_unlock((pthread_mutex_t *)closure);
+ pthread_mutex_unlock(&once_control->pto_mutex);
+ __libc_unblock_fork();
}
=20
int
-pthread_once(pthread_once_t *once_control, void (*routine)(void))
+pthread_once(pthread_once_t *once_control, void (*init_routine)(void))
{
+ int cancelstate;
+
if (__predict_false(__uselibcstub))
- return __libc_thr_once_stub(once_control, routine);
+ return __libc_thr_once_stub(once_control, init_routine);
+
+ /*
+ * Optimistically check whether it's already done, with a
+ * load-acquire to synchronize with the store-release in
+ * whatever thread did the initialization.
+ */
+ if (__predict_true(atomic_load_acquire(&once_control->pto_done) =3D=3D 1))
+ return 0;
+
+ /*
+ * Carefully prepare to check and run the routine:
+ *
+ * 1. Defer cancellation while we synchronize in case the
+ * caller is configured as PTHREAD_CANCEL_ASYNCHRONOUS, so
+ * we are guaranteed to have a chance to clean up after
+ * ourselves if cancelled.
+ *
+ * 2. Block concurrent fork while we work so that children will
+ * never observe either the once_control state or the state
+ * that the user's init_routine runs partially initialized.
+ *
+ * 3. Serialize access to this once_control in particular by
+ * taking the mutex.
+ *
+ * 4. Push a cleanup action to unwind this preparation when
+ * we're done, or if we're cancelled during init_routine.
+ */
+ pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &cancelstate);
+ __libc_block_fork();
+ pthread_mutex_lock(&once_control->pto_mutex);
+ pthread_cleanup_push(&once_cleanup, &once_control->pto_mutex);
=20
- if (once_control->pto_done =3D=3D 0) {
- pthread_mutex_lock(&once_control->pto_mutex);
- pthread_cleanup_push(&once_cleanup, &once_control->pto_mutex);
- if (once_control->pto_done =3D=3D 0) {
- routine();
- membar_release();
- once_control->pto_done =3D 1;
- }
- pthread_cleanup_pop(1);
- } else {
- membar_acquire();
- }
+ /*
+ * It's possible that we raced with another caller to
+ * pthread_once with the same state. Don't run the routine
+ * again if so.
+ */
+ if (__predict_false(once_control->pto_done))
+ goto out;
+
+ /*
+ * Allow cancellation, if the caller allowed it, while we run
+ * the init routine. If cancelled during the initialization,
+ * we don't mark it done:
+ *
+ * The pthread_once() function is not a cancellation
+ * point. However, if init_routine is a cancellation
+ * point and is canceled, the effect on once_control shall
+ * be as if pthread_once() was never called.
+ *
+ * And mark ourselves in the middle of pthread_once so that if
+ * init_routine tries to use fork(), it will fail -- it is
+ * impossible to satisfy the requirements of pthread_once and
+ * fork.
+ */
+ pthread_setcancelstate(cancelstate, NULL);
+ pthread_self()->pt_oncedepth++; /* stack overflows before this can */
+ (*init_routine)();
+ pthread_self()->pt_oncedepth--;
+ pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, NULL);
+
+ /*
+ * Done. Use a store-release to synchronize with the
+ * load-acquire in unlocked users of pthread_once.
+ */
+ atomic_store_release(&once_control->pto_done, 1);
+out:
+ /*
+ * Release the lock, allow fork again, and restore the caller's
+ * cancellation state.
+ */
+ pthread_cleanup_pop(/*execute*/1);
+ pthread_setcancelstate(cancelstate, NULL);
=20
return 0;
}
=20
+int
+__libc_thr_in_once(void)
+{
+
+ return __predict_false(pthread_self()->pt_oncedepth !=3D 0);
+}
+
__strong_alias(__libc_thr_once,pthread_once)
--=_QSVhpQW7wL2t2ZyM0lNb6kI+m4HN6cIl--
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/59125 CVS commit: src/tests/lib/libpthread
Date: Sun, 30 Mar 2025 23:03:06 +0000
Module Name: src
Committed By: riastradh
Date: Sun Mar 30 23:03:06 UTC 2025
Modified Files:
src/tests/lib/libpthread: t_once.c
Log Message:
tests/lib/libpthread/t_once: Test fork and pthread_once race.
Test is disabled by default because it triggers only with very low
probability in each trial, but let's make it easy to run if you want.
PR lib/59125: pthread_once(3) races with concurrent fork
To generate a diff of this commit:
cvs rdiff -u -r1.2 -r1.3 src/tests/lib/libpthread/t_once.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
(Contact us)
$NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2025
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.