NetBSD Problem Report #56979

From www@netbsd.org  Thu Aug 25 16:27:10 2022
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 392B91A921F
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 25 Aug 2022 16:27:10 +0000 (UTC)
Message-Id: <20220825162638.791821A9239@mollari.NetBSD.org>
Date: Thu, 25 Aug 2022 16:26:38 +0000 (UTC)
From: tgl@sss.pgh.pa.us
Reply-To: tgl@sss.pgh.pa.us
To: gnats-bugs@NetBSD.org
Subject: fork(2) fails to be signal safe
X-Send-Pr-Version: www-1.0

>Number:         56979
>Category:       lib
>Synopsis:       fork(2) fails to be signal safe
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    lib-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Aug 25 16:30:00 +0000 2022
>Last-Modified:  Sun Oct 16 10:40:01 +0000 2022
>Originator:     Tom Lane
>Release:        HEAD/202208100950Z
>Organization:
PostgreSQL Global Development Group
>Environment:
NetBSD cube.sss.pgh.pa.us 9.99.99 NetBSD 9.99.99 (GENERIC) #0: Wed Aug 10 08:38:43 UTC 2022  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/macppc/compile/GENERIC macppc
>Description:
I traced down a deadlock that occurs rarely while starting a Postgres server on current NetBSD.
The stack trace is

(gdb) bt
#0  0xfdeede98 in ___lwp_park60 () from /usr/libexec/ld.elf_so
#1  0xfdee3d48 in _rtld_exclusive_enter () from /usr/libexec/ld.elf_so
#2  0xfdee5dac in __locked_fork () from /usr/libexec/ld.elf_so
#3  0xfdd0650c in fork () from /usr/lib/libc.so.12
#4  0x01c1bd18 in fork_process () at fork_process.c:59
#5  0x01c1fbb0 in StartChildProcess (type=WalWriterProcess)
    at postmaster.c:5398
#6  0x01c212b8 in reaper (postgres_signal_arg=<optimized out>)
    at postmaster.c:3076
#7  <signal handler called>
#8  0xfdee195c in _rtld_bind () from /usr/libexec/ld.elf_so
#9  0xfdee1dc0 in _rtld_bind_secureplt_start () from /usr/libexec/ld.elf_so
Backtrace stopped: frame did not save the PC

This seems to be a self-deadlock occurring because _rtld_bind() already did _rtld_shared_enter() and now __locked_fork() wants that mutex exclusively.
Although gdb fails to trace any further back, digging in the stack identified the
previous PC as 0x01c2b81c, which is

   0x1c2b804 <PostmasterMain+4544>:     lwz     r3,108(r1)
   0x1c2b808 <PostmasterMain+4548>:     mr      r7,r28
   0x1c2b80c <PostmasterMain+4552>:     mr      r4,r23
   0x1c2b810 <PostmasterMain+4556>:     li      r6,0
   0x1c2b814 <PostmasterMain+4560>:     li      r5,0
   0x1c2b818 <PostmasterMain+4564>:     bl      0x1ee6230 <__select50@plt>
-> 0x1c2b81c <PostmasterMain+4568>:     li      r5,0

Evidently, this is the first time this select(2) call has been reached in this process,
and we're trying to resolve the PLT entry, and while that is happening a SIGCHLD
signal occurs, leading the signal handler to try to fork a new child process.

I realize that calling system functions from signal handlers is generally deprecated;
but POSIX specifies that fork(2) is safe to call from a signal handler, which IMO
makes this a NetBSD bug.

>How-To-Repeat:
I've run into this a few times while running Postgres regression tests, but it's very hard to reproduce that way.  The "startup process" child process has to exit before the parent postmaster process reaches the select(2) in its idle loop for the first time, which would be very unusual timing given the relative amounts of work to be done in each process.  A bespoke test program might be a better way to make it reproducible.
>Fix:
Is there a way to not need the RTLD lock during fork()?

>Audit-Trail:
From: "David H. Gutteridge" <david@gutteridge.ca>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: lib/56979: fork(2) fails to be signal safe
Date: Thu, 25 Aug 2022 15:23:19 -0400

 Our sigaction(2) man page lists fork(2) as being signal safe, so that's
 concerning if it isn't.

 Dave

From: Joerg Sonnenberger <joerg@bec.de>
To: gnats-bugs@netbsd.org
Cc: lib-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: lib/56979: fork(2) fails to be signal safe
Date: Fri, 26 Aug 2022 00:16:48 +0200

 Am Thu, Aug 25, 2022 at 04:30:00PM +0000 schrieb tgl@sss.pgh.pa.us:
 > >Fix:
 > Is there a way to not need the RTLD lock during fork()?

 Depending on what you want to do, you can use __fork() directly. That
 skipps the locking, but also e.g. pthread_atfork(3).

 At this point, I have mostly given up on async signal safety for
 fork(2). On the one hand, way too many programs and libraries nowadays
 expect fork from multi-threaded progams to give a mostly usable state
 (e.g. beyond "I can run exec now"). Then you have all the work fork()
 has to do beyond being a simple system call (the traditional members of
 the async signal safe list all are). E.g. pthread_atfork has to be
 written to be atomic regarding any interruption by a signal handler.

 Joerg

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Thu, 25 Aug 2022 22:54:52 +0000

 This is a multi-part message in MIME format.
 --=_3LM3Qr5snjR6jwWdKR2eeC80vJ4rOpBA

 Can you try the attached patch series?  Do you have a quick reproducer
 for the problem?  Ideally we could make an automatic test for this if
 there's a cheap reproducer to trigger the race.

 --=_3LM3Qr5snjR6jwWdKR2eeC80vJ4rOpBA
 Content-Type: text/plain; charset="ISO-8859-1"; name="forksignalsafety"
 Content-Transfer-Encoding: quoted-printable
 Content-Disposition: attachment; filename="forksignalsafety.patch"

 From 14e63dd753fad250b3b82701b02a654dd6913a32 Mon Sep 17 00:00:00 2001
 From: Taylor R Campbell <riastradh@NetBSD.org>
 Date: Thu, 25 Aug 2022 22:49:00 +0000
 Subject: [PATCH 1/2] pthread_atfork(3): Block signals during the call to
  pthread_atfork.

 This doesn't affect the calls to the atfork handlers -- it only
 protects access to the lists of handlers from interruption by a
 signal, in case the signal handler calls fork(2).
 ---
  lib/libc/gen/pthread_atfork.c | 23 +++++++++++++++--------
  1 file changed, 15 insertions(+), 8 deletions(-)

 diff --git a/lib/libc/gen/pthread_atfork.c b/lib/libc/gen/pthread_atfork.c
 index 9d72261eb6cf..22f2371f44b2 100644
 --- a/lib/libc/gen/pthread_atfork.c
 +++ b/lib/libc/gen/pthread_atfork.c
 @@ -101,15 +101,20 @@ pthread_atfork(void (*prepare)(void), void (*parent)(=
 void),
      void (*child)(void))
  {
  	struct atfork_callback *newprepare, *newparent, *newchild;
 +	sigset_t mask, omask;
 +	int error;
 =20
  	newprepare =3D newparent =3D newchild =3D NULL;
 =20
 +	sigfillset(&mask);
 +	thr_sigsetmask(SIG_SETMASK, &mask, &omask);
 +
  	mutex_lock(&atfork_lock);
  	if (prepare !=3D NULL) {
  		newprepare =3D af_alloc();
  		if (newprepare =3D=3D NULL) {
 -			mutex_unlock(&atfork_lock);
 -			return ENOMEM;
 +			error =3D ENOMEM;
 +			goto out;
  		}
  		newprepare->fn =3D prepare;
  	}
 @@ -119,8 +124,8 @@ pthread_atfork(void (*prepare)(void), void (*parent)(vo=
 id),
  		if (newparent =3D=3D NULL) {
  			if (newprepare !=3D NULL)
  				af_free(newprepare);
 -			mutex_unlock(&atfork_lock);
 -			return ENOMEM;
 +			error =3D ENOMEM;
 +			goto out;
  		}
  		newparent->fn =3D parent;
  	}
 @@ -132,8 +137,8 @@ pthread_atfork(void (*prepare)(void), void (*parent)(vo=
 id),
  				af_free(newprepare);
  			if (newparent !=3D NULL)
  				af_free(newparent);
 -			mutex_unlock(&atfork_lock);
 -			return ENOMEM;
 +			error =3D ENOMEM;
 +			goto out;
  		}
  		newchild->fn =3D child;
  	}
 @@ -150,9 +155,11 @@ pthread_atfork(void (*prepare)(void), void (*parent)(v=
 oid),
  		SIMPLEQ_INSERT_TAIL(&parentq, newparent, next);
  	if (child)
  		SIMPLEQ_INSERT_TAIL(&childq, newchild, next);
 -	mutex_unlock(&atfork_lock);
 +	error =3D 0;
 =20
 -	return 0;
 +out:	mutex_unlock(&atfork_lock);
 +	thr_sigsetmask(SIG_SETMASK, &omask, NULL);
 +	return error;
  }
 =20
  pid_t

 From f21ba3adae5fddd6a991b1ba40642db1465ecc99 Mon Sep 17 00:00:00 2001
 From: Taylor R Campbell <riastradh@NetBSD.org>
 Date: Thu, 25 Aug 2022 22:50:00 +0000
 Subject: [PATCH 2/2] ld.elf_so(8): Make fork take a shared, not exclusive,
  lock.

 We only need to ensure that there are no concurrent modifications to
 the rtld data structures in flight, since the threads that began
 those modifications will not exist in the child and will therefore be
 unable to complete them in the child.

 A shared lock suffices to ensure there are no such concurrent
 modifications in flight; an exclusive lock is not necessary, and can
 cause deadlock if fork is executed from a signal handler, which is
 explicitly allowed by POSIX (and our own sigaction(2) man page) which
 marks fork as async-signal-safe.

 PR lib/56979
 ---
  libexec/ld.elf_so/rtld.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

 diff --git a/libexec/ld.elf_so/rtld.c b/libexec/ld.elf_so/rtld.c
 index 5ea978e2e61d..bb37b7f9f232 100644
 --- a/libexec/ld.elf_so/rtld.c
 +++ b/libexec/ld.elf_so/rtld.c
 @@ -1539,14 +1539,13 @@ pid_t __fork(void);
  __dso_public pid_t
  __locked_fork(int *my_errno)
  {
 -	sigset_t mask;
  	pid_t result;
 =20
 -	_rtld_exclusive_enter(&mask);
 +	_rtld_shared_enter();
  	result =3D __fork();
  	if (result =3D=3D -1)
  		*my_errno =3D errno;
 -	_rtld_exclusive_exit(&mask);
 +	_rtld_shared_exit();
 =20
  	return result;
  }

 --=_3LM3Qr5snjR6jwWdKR2eeC80vJ4rOpBA--

From: Tom Lane <tgl@sss.pgh.pa.us>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Thu, 25 Aug 2022 22:37:03 -0400

 Taylor R Campbell <riastradh@NetBSD.org> writes:
 > Can you try the attached patch series?  Do you have a quick reproducer
 > for the problem?  Ideally we could make an automatic test for this if
 > there's a cheap reproducer to trigger the race.

 I've built the system with that patch added, and will start using it
 tomorrow.  I do not have a reliable reproducer, and on reflection it's
 hard to see how to make one -- unless you know of an operation that
 would hold the rtld lock for a decently long period of time.  However,
 I'm finding that my macppc Postgres buildfarm instance [1] is hitting
 this in perhaps one out of five or ten runs, so if it can survive for
 a week or two then I'll feel pretty confident that this patch fixes
 it for me.

 FWIW, the second patch certainly looks like it will alleviate the
 symptom I'm seeing.  I suppose it would not fix cases where the
 signal interrupts an operation holding the rtld lock exclusively;
 but I do not think that will be a problem for my use-case.  PLT
 resolution is pretty much the only dynamic-loader operation that
 I can foresee happening with signals enabled in the Postgres
 postmaster --- indeed, before I saw this failure I'd have sworn
 up and down that we do *nothing* interesting with signals enabled.
 That select(2) call and the adjacent sigprocmask calls are the
 entire extent of the window for trouble.

 I have no opinion about the first patch -- Postgres doesn't use
 pthread_atfork.

 I'll report back in a week or two.

 			regards, tom lane

 [1] https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=mamba&br=HEAD

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Fri, 26 Aug 2022 08:38:10 +0000

 > Date: Thu, 25 Aug 2022 22:37:03 -0400
 > From: Tom Lane <tgl@sss.pgh.pa.us>
 > 
 > I've built the system with that patch added, and will start using it
 > tomorrow.  I do not have a reliable reproducer, and on reflection it's
 > hard to see how to make one -- unless you know of an operation that
 > would hold the rtld lock for a decently long period of time.  However,
 > I'm finding that my macppc Postgres buildfarm instance [1] is hitting
 > this in perhaps one out of five or ten runs, so if it can survive for
 > a week or two then I'll feel pretty confident that this patch fixes
 > it for me.

 Cool, thanks!

 > FWIW, the second patch certainly looks like it will alleviate the
 > symptom I'm seeing.  I suppose it would not fix cases where the
 > signal interrupts an operation holding the rtld lock exclusively;
 > but I do not think that will be a problem for my use-case.

 The rtld exclusive lock blocks signals (except for SIGTRAP, which
 might be a bug...).

 > I have no opinion about the first patch -- Postgres doesn't use
 > pthread_atfork.

 Perhaps not, but if you did, and if a signal interrupted
 pthread_atfork triggering a signal handler that calls fork, then
 fork's attempts to invoke the atfork handlers might see the data
 structures in the middle of an update.

From: Tom Lane <tgl@sss.pgh.pa.us>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Fri, 26 Aug 2022 09:23:54 -0400

 Taylor R Campbell <riastradh@NetBSD.org> writes:
 > From: Tom Lane <tgl@sss.pgh.pa.us>
 >> FWIW, the second patch certainly looks like it will alleviate the
 >> symptom I'm seeing.  I suppose it would not fix cases where the
 >> signal interrupts an operation holding the rtld lock exclusively;
 >> but I do not think that will be a problem for my use-case.

 > The rtld exclusive lock blocks signals (except for SIGTRAP, which
 > might be a bug...).

 D'oh, I knew that, having looked at the code a few weeks ago...
 but yeah, I find the SIGTRAP exception troubling.

 On the whole, the notion that something as simple as a C function call
 can result in behind-your-back taking of a lock is pretty scary.
 Maybe it's only of concern to code directly associated with the
 dynamic loader, but I'm not very convinced.

 			regards, tom lane

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Sat, 27 Aug 2022 21:28:23 +0000

 On Fri, Aug 26, 2022 at 01:25:02PM +0000, Tom Lane wrote:
  >  > The rtld exclusive lock blocks signals (except for SIGTRAP, which
  >  > might be a bug...).
  >  
  >  D'oh, I knew that, having looked at the code a few weeks ago...
  >  but yeah, I find the SIGTRAP exception troubling.
  >  
  >  On the whole, the notion that something as simple as a C function call
  >  can result in behind-your-back taking of a lock is pretty scary.
  >  Maybe it's only of concern to code directly associated with the
  >  dynamic loader, but I'm not very convinced.

 Dynamic linkers are inherently evil :-|

 I suppose we could in principle set up to use different dynamic linker
 images for threaded and unthreaded programs; but in addition to being
 a fairly invasive change, we've found that such measures have a
 tendency to break linux code that assumes everything is always
 threaded.

 -- 
 David A. Holland
 dholland@netbsd.org

From: Tom Lane <tgl@sss.pgh.pa.us>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Sun, 28 Aug 2022 13:03:46 -0400

 I wrote:
 > I'll report back in a week or two.

 Didn't take long to find out that there's still a problem.  With
 this patch, it gets past the fork() all right, but there's still
 a risk of the child process getting stuck on the RTLD lock later:

 #0  0xfdeede4c in ___lwp_park60 () from /usr/libexec/ld.elf_so
 #1  0xfdee3e08 in _rtld_exclusive_enter () from /usr/libexec/ld.elf_so
 #2  0xfdee59e4 in dlopen () from /usr/libexec/ld.elf_so
 #3  0x01e6a4c0 in internal_load_library (
     libname=3Dlibname@entry=3D0xfde3cbf8 "/home/tgl/testversion/lib/postgr=
 esql/libpqwalreceiver.so") at dfmgr.c:239
 #4  0x01e6b2a0 in load_file (
     filename=3Dfilename@entry=3D0x1fc643c "libpqwalreceiver", =

     restricted=3Drestricted@entry=3Dfalse) at dfmgr.c:156
 #5  0x01c6d640 in WalReceiverMain () at walreceiver.c:278
 #6  0x01c19ba0 in AuxiliaryProcessMain (
     auxtype=3Dauxtype@entry=3DWalReceiverProcess) at auxprocess.c:161
 #7  0x01c21778 in StartChildProcess (type=3DWalReceiverProcess)
     at postmaster.c:5412
 #8  0x01c23214 in MaybeStartWalReceiver () at postmaster.c:5577
 #9  MaybeStartWalReceiver () at postmaster.c:5570
 #10 sigusr1_handler (postgres_signal_arg=3D<optimized out>) at postmaster.=
 c:5229
 #11 <signal handler called>
 #12 0xfdee195c in _rtld_bind () from /usr/libexec/ld.elf_so
 #13 0xfdee1dc0 in _rtld_bind_secureplt_start () from /usr/libexec/ld.elf_s=
 o
 Backtrace stopped: frame did not save the PC

 Again, manual investigation shows that the _rtld_bind is trying
 to resolve the select(2) call in the postmaster's main loop, and it's
 plausible that this happened during our first arrival at that call.

 I think Taylor's fix may still be a good idea for pro-forma spec complianc=
 e,
 but I despair of getting to a reliable Postgres build this way.  (BTW,
 is the RTLD lock business new in v10?  I'm surprised that we've not heard
 field reports of Postgres getting stuck at startup on NetBSD.)

 What I'm wondering about now is whether there is a way to force resolution
 of that PLT entry, or even all of the program's PLT entries, before we
 enable signals.  If there are multiple select(2) calls in the same source
 file, will they share a PLT entry?  If so, I could arrange to run a dummy
 select() call somewhere early in startup.

 			regards, tom lane

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Sun, 28 Aug 2022 19:52:22 +0000

 On Sun, Aug 28, 2022 at 05:05:01PM +0000, Tom Lane wrote:
  > I think Taylor's fix may still be a good idea for pro-forma spec
  > compliance, but I despair of getting to a reliable Postgres build
  > this way.  (BTW, is the RTLD lock business new in v10?  I'm
  > surprised that we've not heard field reports of Postgres getting
  > stuck at startup on NetBSD.)

 I don't think so; no idea why it would have broken now.

 The expedient path may be to use posix_spawn instead of fork; it's
 kernel-level on netbsd so the various specific issues that pthreads
 causes with fork disappear.

  >  What I'm wondering about now is whether there is a way to force resolution
  >  of that PLT entry, or even all of the program's PLT entries, before we
  >  enable signals.  If there are multiple select(2) calls in the same source
  >  file, will they share a PLT entry?  If so, I could arrange to run a dummy
  >  select() call somewhere early in startup.

 There's LD_BIND_NOW, but I'm not sure if there's a good way to request
 that behavior from inside a program :-(

 -- 
 David A. Holland
 dholland@netbsd.org

From: Tom Lane <tgl@sss.pgh.pa.us>
To: gnats-bugs@netbsd.org, David Holland <dholland-bugs@netbsd.org>
Cc: lib-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Mon, 29 Aug 2022 14:41:35 -0400

 David Holland <dholland-bugs@netbsd.org> writes:
 >  On Sun, Aug 28, 2022 at 05:05:01PM +0000, Tom Lane wrote:
 >>>> What I'm wondering about now is whether there is a way to force resol=
 ution
 >>>> of that PLT entry, or even all of the program's PLT entries, before w=
 e
 >>>> enable signals.  If there are multiple select(2) calls in the same so=
 urce
 >>>> file, will they share a PLT entry?  If so, I could arrange to run a d=
 ummy
 >>>> select() call somewhere early in startup.
  =

 >  There's LD_BIND_NOW, but I'm not sure if there's a good way to request
 >  that behavior from inside a program :-(

 Local testing seems to confirm the idea that issuing a dummy select()
 before unblocking signals eliminates the issue for Postgres.  I can't
 be totally sure given the narrowness of the failure window, but my
 test ran without failure for considerably longer than it had managed
 before.  So I'm thinking of going with that solution.

 I now suspect that the reason we'd not seen this before could be
 (aside from the timing improbability) that there was previously
 another select() call somewhere in postmaster startup, and that
 that one went away in seemingly-unrelated refactoring.  This'd be
 substantially more plausible if the PLT entry for select() is
 shared across the whole main executable and not just one .c file
 --- could anyone confirm for me which way it works?

 			regards, tom lane

From: Tom Lane <tgl@sss.pgh.pa.us>
To: gnats-bugs@netbsd.org, David Holland <dholland-bugs@netbsd.org>
Cc: lib-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Tue, 30 Aug 2022 14:28:56 -0400

 David Holland <dholland-bugs@netbsd.org> writes:
 >> There's LD_BIND_NOW, but I'm not sure if there's a good way to request
 >> that behavior from inside a program :-(

 Actually, it was pointed out to me that we can easily request that
 behavior at link time, just by adding -Wl,-z,now to the command.
 So we'll probably go with that fix rather than hacking any code.

 			regards, tom lane

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56979 CVS commit: src/libexec/ld.elf_so
Date: Tue, 13 Sep 2022 10:18:58 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Tue Sep 13 10:18:58 UTC 2022

 Modified Files:
 	src/libexec/ld.elf_so: rtld.c

 Log Message:
 ld.elf_so(8): Make fork take a shared, not exclusive, lock.

 We only need to ensure that there are no concurrent modifications to
 the rtld data structures in flight, since the threads that began
 those modifications will not exist in the child and will therefore be
 unable to complete them in the child.

 A shared lock suffices to ensure there are no such concurrent
 modifications in flight; an exclusive lock is not necessary, and can
 cause deadlock if fork is executed from a signal handler, which is
 explicitly allowed by POSIX (and our own sigaction(2) man page) which
 marks fork as async-signal-safe.

 PR lib/56979


 To generate a diff of this commit:
 cvs rdiff -u -r1.211 -r1.212 src/libexec/ld.elf_so/rtld.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Sun, 16 Oct 2022 00:24:38 +0000

 > Date: Sun, 28 Aug 2022 13:03:46 -0400
 > From: Tom Lane <tgl@sss.pgh.pa.us>
 >=20
 > Didn't take long to find out that there's still a problem.  With
 > this patch, it gets past the fork() all right, but there's still
 > a risk of the child process getting stuck on the RTLD lock later:
 >=20
 > #0  0xfdeede4c in ___lwp_park60 () from /usr/libexec/ld.elf_so
 > #1  0xfdee3e08 in _rtld_exclusive_enter () from /usr/libexec/ld.elf_so
 > #2  0xfdee59e4 in dlopen () from /usr/libexec/ld.elf_so
 > [...]
 > #11 <signal handler called>
 > #12 0xfdee195c in _rtld_bind () from /usr/libexec/ld.elf_so
 > #13 0xfdee1dc0 in _rtld_bind_secureplt_start () from /usr/libexec/ld.elf_=
 so

 Do I understand correctly that this means you're trying to call dlopen
 from a signal handler?

 Although fork is documented (in NetBSD and in POSIX) as
 async-signal-safe, dlopen is definitely not and never has been and
 probably never will be -- and I would expect if it works anywhere,
 it's by accident and not remotely guaranteed to be reliable.

 I didn't follow exactly what you're doing, but I suspect it would be
 much more reliable to have the signal handler set a flag or write a
 flag to a pipe and cause select(2) to fail with EINTR and process the
 flag -- then you can safely dlopen (and malloc and fopen and whatever
 else) with wild abandon.

 > (BTW, is the RTLD lock business new in v10?  I'm surprised that
 > we've not heard field reports of Postgres getting stuck at startup
 > on NetBSD.)

 Yes.

 > What I'm wondering about now is whether there is a way to force resolution
 > of that PLT entry, or even all of the program's PLT entries, before we
 > enable signals.  If there are multiple select(2) calls in the same source
 > file, will they share a PLT entry?  If so, I could arrange to run a dummy
 > select() call somewhere early in startup.

 Same source file, or even .so (or executable), probably shared;
 different .so, not likely.

From: Tom Lane <tgl@sss.pgh.pa.us>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Sat, 15 Oct 2022 21:17:37 -0400

 Taylor R Campbell <riastradh@NetBSD.org> writes:
 >> From: Tom Lane <tgl@sss.pgh.pa.us>
 >> Didn't take long to find out that there's still a problem.  With
 >> this patch, it gets past the fork() all right, but there's still
 >> a risk of the child process getting stuck on the RTLD lock later:

 > Do I understand correctly that this means you're trying to call dlopen
 > from a signal handler?

 Well, it *was* a signal handler, but once it issues fork() I wouldn't
 personally regard it as a signal handler anymore.  The child process
 is certainly never going to return control to the interrupted code.
 The parent process (the Postgres "postmaster") runs with signals blocked
 everywhere except this one select() call in its wait loop, so it's safer
 than it sounds.  The postmaster has been coded like that since the
 nineties, and AFAIR this is the first bit of trouble we've had with it.

 > I didn't follow exactly what you're doing, but I suspect it would be
 > much more reliable to have the signal handler set a flag or write a
 > flag to a pipe and cause select(2) to fail with EINTR and process the
 > flag

 People have been talking about changing the postmaster to not do anything
 interesting in signal handlers since the nineties, too, and it's not
 gotten done yet.  The effort-to-reward ratio is just not very good.
 It might happen sometime, but I'm not holding my breath.

 >> (BTW, is the RTLD lock business new in v10?  I'm surprised that
 >> we've not heard field reports of Postgres getting stuck at startup
 >> on NetBSD.)

 > Yes.

 OK, thanks for confirming that.  What we've done about this for the
 moment is to force linking with -Wl,-z,now on NetBSD, which fixes
 this particular problem --- at least, we've not seen it since then
 on two different NetBSD test machines that previously did exhibit
 the failure intermittently --- and it seems like generally a good
 idea anyway.

 			regards, tom lane

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Sun, 16 Oct 2022 08:51:14 +0700

     Date:        Sun, 16 Oct 2022 01:20:01 +0000 (UTC)
     From:        Tom Lane <tgl@sss.pgh.pa.us>
     Message-ID:  <20221016012001.C22DB1A9239@mollari.NetBSD.org>

   |  Well, it *was* a signal handler, but once it issues fork() I wouldn't
   |  personally regard it as a signal handler anymore.

 Sorry, but that is utter nonsense.

   |  The child process
   |  is certainly never going to return control to the interrupted code.

 Your child process might not, but someone else's might.   Neither the
 system, nor the libraries, can assume that it will not.

 There is not, and never has been, any rule that says what code can be
 run in the parent or child returns from fork().

   |  The postmaster has been coded like that since the  nineties,

 Then it has been broken since then.

   | and AFAIR this is the first bit of trouble we've had with it.

 That may be true, but "I tried it and it worked, even though the
 rules say I shouldn't do that" has never been a good excuse for
 anything.

 The next time you get pulled over for drink driving, try telling the
 police/courts that you've been doing it since the 90's, and never had
 any trouble before, see how far that gets you.

 You break the rules, things don't go as you'd hoped they would, it is
 your fault, this time and every time.

 Call anything that is not async signal safe in a signal handler, or fail
 to have shared variables correctly marked as volatile, and anything might
 happen.   That it doesn't happen today, doesn't mean that it won't tomorrow.

 kre

From: Tom Lane <tgl@sss.pgh.pa.us>
To: Robert Elz <kre@munnari.OZ.AU>
Cc: lib-bug-people@netbsd.org, gnats-admin@netbsd.org, gnats-bugs@netbsd.org,
        netbsd-bugs@netbsd.org
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Sat, 15 Oct 2022 22:46:52 -0400

 Robert Elz <kre@munnari.OZ.AU> writes:
 >    |  Well, it *was* a signal handler, but once it issues fork() I wouldn't
 >    |  personally regard it as a signal handler anymore.

 >  Sorry, but that is utter nonsense.

 Well, here is the actual problem: with this implementation, the mere
 act of invoking a C function is not guaranteed to be async-signal-safe,
 depending on whether it crosses a not-terribly-well-defined linkage
 boundary.

 POSIX forgot to specify that that's not okay, because who would think
 you need to mention it?  They certainly aren't going to write down
 that addition is async-signal-safe, for example; but there's nothing
 in either POSIX or the C standard that distinguishes some bare-C
 operations from others.  Indeed, depending on platform and data type
 involved, I wouldn't be too surprised if addition is sometimes *not*
 async-signal-safe on NetBSD 10.  Which nominally-primitive C operations
 get implemented by calls to libgcc_s.so?

 I also suspect that it's possible to construct examples that violate the
 explicit letter of the POSIX spec, in that they list a bunch of functions
 that are required to be async-signal-safe but are in libc.so and therefore
 require RTLD resolution before the first call.  Consider, say, select()
 in the mainline, and while we're resolving that we get an interrupt and
 the signal handler tries to invoke lseek() and that requires resolution.
 I'm failing to see how the resulting deadlock is allowed per POSIX.

 Anyway, linking with "-z now" solves this problem sufficiently for my
 purposes.

 			regards, tom lane

From: Tom Lane <tgl@sss.pgh.pa.us>
To: Robert Elz <kre@munnari.OZ.AU>
Cc: lib-bug-people@netbsd.org, gnats-admin@netbsd.org, gnats-bugs@netbsd.org,
        netbsd-bugs@netbsd.org
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Sat, 15 Oct 2022 23:03:04 -0400

 I wrote:
 > I also suspect that it's possible to construct examples that violate the
 > explicit letter of the POSIX spec, in that they list a bunch of functions
 > that are required to be async-signal-safe but are in libc.so and therefore
 > require RTLD resolution before the first call.  Consider, say, select()
 > in the mainline, and while we're resolving that we get an interrupt and
 > the signal handler tries to invoke lseek() and that requires resolution.

 Ah, sorry, I wasn't thinking very clearly there: the mainline would need
 to be doing something that takes the RTLD lock exclusively.  Not that
 there's any shortage of cases that do that.  Then, if a signal handler
 interrupts that and invokes one of the required-to-be-safe libc
 functions for the first time in the program, you have a deadlock
 in a situation that absolutely should be legal per POSIX.

 The Postgres case I started with does not exhibit this, because
 we don't do anything in the signals-enabled part of mainline that
 would involve an exclusive RTLD lock.  Nonetheless, taking a shared
 RTLD lock in a signal handler is clearly a hazard for POSIX compliance,
 and this system seems like it will do that in cases that POSIX requires
 to be legal.

 			regards, tom lane

From: Robert Elz <kre@munnari.OZ.AU>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: lib-bug-people@netbsd.org, gnats-admin@netbsd.org, gnats-bugs@netbsd.org,
        netbsd-bugs@netbsd.org
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Sun, 16 Oct 2022 15:18:58 +0700

     Date:        Sat, 15 Oct 2022 22:46:52 -0400
     From:        Tom Lane <tgl@sss.pgh.pa.us>
     Message-ID:  <1628442.1665888412@sss.pgh.pa.us>

   | Well, here is the actual problem: with this implementation, the mere
   | act of invoking a C function is not guaranteed to be async-signal-safe,
   | depending on whether it crosses a not-terribly-well-defined linkage
   | boundary.

 For that I'd agree, it should be possible to call any async signal safe
 function from within a signal handler, regardless of where the signal
 occurs, and if that isn't working for us, we should fix that.

 But the stack trace that Taylor showed in his message had the signal handler
 calling dlopen() - that's linking to a whole new library (loading a new
 external file).   That is certainly not async signal safe.

 kre

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Sun, 16 Oct 2022 10:11:52 +0000

 > Date: Sat, 15 Oct 2022 21:17:37 -0400
 > From: Tom Lane <tgl@sss.pgh.pa.us>
 > 
 > Taylor R Campbell <riastradh@NetBSD.org> writes:
 > >> From: Tom Lane <tgl@sss.pgh.pa.us>
 > >> Didn't take long to find out that there's still a problem.  With
 > >> this patch, it gets past the fork() all right, but there's still
 > >> a risk of the child process getting stuck on the RTLD lock later:
 > 
 > > Do I understand correctly that this means you're trying to call dlopen
 > > from a signal handler?
 > 
 > Well, it *was* a signal handler, but once it issues fork() I wouldn't
 > personally regard it as a signal handler anymore.

 It's still in a signal handler because it's interrupting previously
 running code somewhere in the middle.  You have some control over what
 code has been running, but not much:

 >                                                    The child process
 > is certainly never going to return control to the interrupted code.
 > The parent process (the Postgres "postmaster") runs with signals blocked
 > everywhere except this one select() call in its wait loop, so it's safer
 > than it sounds.  The postmaster has been coded like that since the
 > nineties, and AFAIR this is the first bit of trouble we've had with it.

 If the signal handler is blocked except during select, that might be
 OK except for the trouble you're seeing with rtld lazy binding.  But
 even with LD_BIND_NOW or whatever, I'm not going to make any promises
 about it!  E.g., in a threaded program, I wouldn't put it past some
 rtld to do RCU-style tidying in a background thread or something.

 If you used pselect you could at least ensure that the signal is only
 delivered during the system call, not in userland -- and as a bonus
 you would also avoid possible deadlock if the signal delivery races
 with the non-atomic sigprocmask/select sequence!  (I don't know if
 it's definitely a problem in postgres, but it is a very easy problem
 to have without noticing for a while.)  Personally I'd be more
 comfortable relying on that, and I suspect it's a much smaller change
 than moving logic from signal handlers to the select loop.

 > OK, thanks for confirming that.  What we've done about this for the
 > moment is to force linking with -Wl,-z,now on NetBSD, which fixes
 > this particular problem --- at least, we've not seen it since then
 > on two different NetBSD test machines that previously did exhibit
 > the failure intermittently --- and it seems like generally a good
 > idea anyway.

 That sounds like it might work for onw, although it's still skating on
 thin ice and it would be safer to avoid dlopen in a signal handler
 altogether or at least confine it to a signal delivered during a
 system call of your choice as with pselect.

 Given all that, I'm inclined to close this as fixed for the fork part,
 and not-a-bug for the dlopen part.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/56979: fork(2) fails to be signal safe
Date: Sun, 16 Oct 2022 10:37:54 +0000

 > Date: Sat, Sat, 15 Oct 2022 22:46:52 -0400
 > From: Tom Lane <tgl@sss.pgh.pa.us>
 > 
 > Well, here is the actual problem: with this implementation, the mere
 > act of invoking a C function is not guaranteed to be async-signal-safe,
 > depending on whether it crosses a not-terribly-well-defined linkage
 > boundary.

 This is not accurate.  Symbol binding _is_ async-signal-safe because
 all operations that _change_ rtld state -- which, if interrupted by a
 signal, might lead to rtld symbol binding logic seeing inconsistent
 states of the data structures -- block signals while they hold an
 exclusive thread lock, so symbol binding can't happen in a signal
 _while_ the rtld state is being changed.

 > Which nominally-primitive C operations get implemented by calls to
 > libgcc_s.so?

 As far as I know, none of these affect or are affected by global state
 (except possibly the floating-point exception state in softfloat,
 which is a known bug), so they should all be async-signal-safe.  If
 you find an exception, please let us know!

 > Date: Sat, 15 Oct 2022 23:03:04 -0400
 > From: Tom Lane <tgl@sss.pgh.pa.us>
 > 
 > Ah, sorry, I wasn't thinking very clearly there: the mainline would
 > need to be doing something that takes the RTLD lock exclusively.
 > Not that there's any shortage of cases that do that.  Then, if a
 > signal handler interrupts that and invokes one of the
 > required-to-be-safe libc functions for the first time in the
 > program, you have a deadlock in a situation that absolutely should
 > be legal per POSIX.

 No deadlock because the exclusive lock blocks signals and other
 threads from taking the shared lock, so there's no way to take the
 shared lock while the exclusive lock is held.

 The problem in postmaster is that a signal interrupted the _shared_
 lock during symbol binding, and then took the forbidden action --
 calling dlopen in a signal handler, which tries to take the exclusive
 lock, which waits for symbol binding to finish, which will never
 happen because symbol binding is waiting for the signal handler to
 return.

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2022 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.