NetBSD Problem Report #59137
From www@netbsd.org Wed Mar 5 20:57:11 2025
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
key-exchange X25519 server-signature RSA-PSS (2048 bits)
client-signature RSA-PSS (2048 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 077341A9239
for <gnats-bugs@gnats.NetBSD.org>; Wed, 5 Mar 2025 20:57:11 +0000 (UTC)
Message-Id: <20250305205709.A55F51A923C@mollari.NetBSD.org>
Date: Wed, 5 Mar 2025 20:57:09 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: pthread_atfork deadlocks if called in atfork handler
X-Send-Pr-Version: www-1.0
>Number: 59137
>Category: lib
>Synopsis: pthread_atfork deadlocks if called in atfork handler
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: lib-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Mar 05 21:00:01 +0000 2025
>Last-Modified: Wed Mar 05 23:30:01 +0000 2025
>Originator: Taylor R Campbell
>Release: current, 10, 9, ...
>Organization:
The NetBSD Atforkatforkation
>Environment:
>Description:
What happens if a pthread_atfork prefork handler calls pthread_atfork?
POSIX is fuzzy on the semantics (and hints that pthread_atfork is destined for the dustbin of history): https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/functions/pthread_atfork.html
NetBSD's libpthread currently deadlocks in this case, because it takes a lock against itself. It could instead fail with EDEADLK, since pthread_atfork is allowed to fail, although the only failure POSIX enumerates for pthread_atfork is ENOMEM.
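To illustrate the EDEADLK option, here is a minimal sketch of a toy handler registry. It is not NetBSD's libpthread code: every toy_* name, the fixed-size table, and the thread-local flag are assumptions made up for illustration. The registry remembers that the current thread is running prefork handlers and refuses re-entrant registration instead of blocking on its own lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define TOY_MAX_HANDLERS 8	/* hypothetical fixed capacity */

static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;
static void (*toy_prepare[TOY_MAX_HANDLERS])(void);
static size_t toy_count;
/* Nonzero only in the thread that is currently running prefork handlers. */
static _Thread_local int toy_in_handlers;

/* Hypothetical stand-in for pthread_atfork (prepare handlers only). */
int
toy_atfork(void (*prepare)(void))
{
	int error = 0;

	if (toy_in_handlers)
		return EDEADLK;	/* this thread already holds toy_lock */
	pthread_mutex_lock(&toy_lock);
	if (toy_count == TOY_MAX_HANDLERS)
		error = ENOMEM;
	else
		toy_prepare[toy_count++] = prepare;
	pthread_mutex_unlock(&toy_lock);
	return error;
}

/* Hypothetical stand-in for the prefork half of fork(). */
void
toy_run_prepare(void)
{
	size_t i;

	pthread_mutex_lock(&toy_lock);
	assert(!toy_in_handlers);	/* forking from a handler is not modeled */
	toy_in_handlers = 1;
	for (i = toy_count; i-- > 0;)	/* reverse order of registration */
		if (toy_prepare[i] != NULL)
			(*toy_prepare[i])();
	toy_in_handlers = 0;
	pthread_mutex_unlock(&toy_lock);
}

/* Example handler: tries to register another handler while running. */
static int toy_nested_result = -1;

void
toy_nested_handler(void)
{
	toy_nested_result = toy_atfork(NULL);
}
```

After registering toy_nested_handler with toy_atfork and calling toy_run_prepare, toy_nested_result holds EDEADLK, and registration from outside a handler keeps working.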
glibc quietly allows it, but then violates the contract by calling a postfork handler in the parent and child that doesn't correspond to any prefork handler that had been called in the parent. It is not possible to call the prefork handler too because that would violate the ordering requirement: prefork handlers must be called in reverse order of registration. Thus, if prefork handler A installs prefork handler B, it's too late to call B because the order has to be B-then-A; A-then-B is forbidden.
Another option would be to quietly allow it but only have it affect subsequent forks.
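A minimal sketch of that last option, assuming a toy append-only table rather than any real libpthread data structure (all snap_* names are hypothetical): the fork path snapshots the handler count once, and both the prefork and postfork passes use that snapshot, so a handler registered during a fork is first seen by the next fork and every postfork handler called has a matching prefork pass.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define SNAP_MAX 8	/* hypothetical fixed capacity */

struct snap_handler {
	void (*prepare)(void);
	void (*parent)(void);
};

static pthread_mutex_t snap_lock = PTHREAD_MUTEX_INITIALIZER;
static struct snap_handler snap_tab[SNAP_MAX];
static size_t snap_count;

/* Hypothetical stand-in for pthread_atfork (no child handler). */
int
snap_atfork(void (*prepare)(void), void (*parent)(void))
{
	int error = 0;

	pthread_mutex_lock(&snap_lock);
	if (snap_count == SNAP_MAX) {
		error = ENOMEM;
	} else {
		snap_tab[snap_count].prepare = prepare;
		snap_tab[snap_count].parent = parent;
		snap_count++;
	}
	pthread_mutex_unlock(&snap_lock);
	return error;
}

/* Models the parent side of one fork(); no process is actually created. */
void
snap_fork_model(void)
{
	size_t i, n;

	pthread_mutex_lock(&snap_lock);
	n = snap_count;		/* snapshot: later registrations are ignored */
	pthread_mutex_unlock(&snap_lock);
	assert(n <= SNAP_MAX);

	/* Entries below n are never modified, so no lock is held here. */
	for (i = n; i-- > 0;)		/* prefork: reverse order */
		if (snap_tab[i].prepare != NULL)
			(*snap_tab[i].prepare)();
	/* the real fork would happen here */
	for (i = 0; i < n; i++)		/* postfork: registration order */
		if (snap_tab[i].parent != NULL)
			(*snap_tab[i].parent)();
}

/* Example handlers: the prefork handler registers a postfork handler. */
static int snap_late_calls;

void
snap_late_parent(void)
{
	snap_late_calls++;
}

void
snap_first_prepare(void)
{
	(void)snap_atfork(NULL, snap_late_parent);
}
```

With snap_first_prepare registered, the first snap_fork_model call leaves snap_late_calls at 0; the handler it installed runs only on the second call.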
>How-To-Repeat:
#include <sys/wait.h>

#include <err.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static void
prefork(void)
{
	int error;

	fprintf(stderr, "prefork handler\n");
	/* pthread_atfork returns an error number; it does not set errno. */
	error = pthread_atfork(NULL, NULL, NULL);
	if (error)
		errx(1, "pthread_atfork 2: %s", strerror(error));
	fprintf(stderr, "second atfork handler installed\n");
}

int
main(void)
{
	pid_t pid;
	int error, status;

	error = pthread_atfork(prefork, NULL, NULL);
	if (error)
		errx(1, "pthread_atfork 1: %s", strerror(error));
	alarm(1);	/* SIGALRM kills the process if fork deadlocks */
	fprintf(stderr, "call fork\n");
	if ((pid = fork()) == -1)
		err(1, "fork");
	if (pid == 0)
		_exit(0);
	fprintf(stderr, "fork returned\n");
	alarm(0);
	if (waitpid(pid, &status, 0) == -1)
		err(1, "waitpid");
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		errx(1, "child exited 0x%x", status);
	fflush(stdout);
	return ferror(stdout);
}
>Fix:
Whether or not we decide to `fix' this (whether or not we consider it a bug), we should at least write down the rationale for our choice in a place that is easily findable, like this gnats bug database.
>Audit-Trail:
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: lib/59137: pthread_atfork deadlocks if called in atfork handler
Date: Thu, 06 Mar 2025 06:27:08 +0700
Date: Wed, 5 Mar 2025 21:00:01 +0000 (UTC)
From: campbell+netbsd@mumble.net
Message-ID: <20250305210001.955531A923F@mollari.NetBSD.org>
| POSIX is fuzzy on the semantics (and hints that pthread_atfork
| is destined for the dustbin of history):
Yes, it will probably be removed in the next major edition, Issue 9,
but I'd guess that's something of the order of a decade or more away.
In the meantime we are stuck with it. It might be (i.e., is) marked as
Obsolete, but it is still part of the standard (but no-one is likely
to spend much, or any, effort in fixing or adding stuff).
| NetBSD's libpthread currently deadlocks in this case,
Yes, it would. However, I see no reason the lock is necessary while
running the handlers. Or, if a lock seems like a good idea to prevent
handlers being run in parallel when multiple threads all decide to
fork at the same time, then use a different lock for running the
handlers than is used for registering them (and for forking).
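A minimal sketch of the two-lock idea, under the assumption of a toy fixed-size table (all two_* names are hypothetical and this is not the libpthread implementation): one lock serializes handler execution, another protects the table and is never held while a handler runs, so a handler may register more handlers without blocking on itself.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define TWO_MAX 8	/* hypothetical fixed capacity */

/* Protects the table; held only briefly, never while a handler runs. */
static pthread_mutex_t two_list_lock = PTHREAD_MUTEX_INITIALIZER;
/* Serializes handler execution across threads that fork concurrently. */
static pthread_mutex_t two_run_lock = PTHREAD_MUTEX_INITIALIZER;
static void (*two_prepare[TWO_MAX])(void);
static size_t two_count;

/* Hypothetical stand-in for pthread_atfork (prepare handlers only). */
int
two_atfork(void (*prepare)(void))
{
	int error = 0;

	pthread_mutex_lock(&two_list_lock);
	if (two_count == TWO_MAX)
		error = ENOMEM;
	else
		two_prepare[two_count++] = prepare;
	pthread_mutex_unlock(&two_list_lock);
	return error;
}

/* Hypothetical stand-in for the prefork half of fork(). */
void
two_run_prepare(void)
{
	void (*fn)(void);
	size_t i;

	pthread_mutex_lock(&two_run_lock);
	pthread_mutex_lock(&two_list_lock);
	i = two_count;		/* handlers added later wait for the next fork */
	pthread_mutex_unlock(&two_list_lock);
	assert(i <= TWO_MAX);
	while (i-- > 0) {	/* reverse order of registration */
		pthread_mutex_lock(&two_list_lock);
		fn = two_prepare[i];
		pthread_mutex_unlock(&two_list_lock);
		if (fn != NULL)
			(*fn)();	/* may call two_atfork: list lock is free */
	}
	pthread_mutex_unlock(&two_run_lock);
}

/* Example handler: registers another handler while running. */
static int two_nested_result = -1;

void
two_nested_handler(void)
{
	two_nested_result = two_atfork(NULL);
}
```

In this sketch the nested two_atfork call returns 0 instead of hanging. A handler that itself forks would still block on two_run_lock, which the sketch does not address.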
| It could fail with EDEADLK, although the only failure POSIX
| enumerates for pthread_atfork is ENOMEM.
That would be OK. Generating errors other than those listed is
generally OK, as long as the error (errno) generated in the cases
listed is the one specified. If there's a new/different error
condition (which this would be), a different errno is appropriate
and allowed (provided it isn't generated in a case POSIX says must
work, and I don't think this is one).
| >Fix:
| Whether or not we decide to `fix' this (whether or not we consider
| it a bug), we should at least write down the rationale for our choice
| in a place that is easily findable, like this gnats bug database.
I think it is a bug, libc functions should not cause processes to deadlock.
But assuming we don't simply fix it, so nothing specific needs to be
documented, the place to document it would be the manual page, which
users can be expected to have read before using the interface (any of
them) - not in some random PR which will be forgotten completely
soon after it is closed (even if it is referenced in the commit message,
application code writers aren't expected to look through the commit logs
to try and work out how to use something).
kre