NetBSD Problem Report #59784

From www@netbsd.org  Sat Nov 22 16:18:07 2025
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
	 client-signature RSA-PSS (2048 bits) client-digest SHA256)
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id E529C1A923A
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 22 Nov 2025 16:18:06 +0000 (UTC)
Message-Id: <20251122161804.D1BED1A923C@mollari.NetBSD.org>
Date: Sat, 22 Nov 2025 16:18:04 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: dlopening and dlclosing libpthread is broken
X-Send-Pr-Version: www-1.0

>Number:         59784
>Category:       lib
>Synopsis:       dlopening and dlclosing libpthread is broken
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    riastradh
>State:          needs-pullups
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Nov 22 16:20:00 +0000 2025
>Closed-Date:    
>Last-Modified:  Sat Nov 29 16:10:01 +0000 2025
>Originator:     Taylor R Campbell
>Release:        current, 11, 10, 9, ...
>Organization:
Locked and Unloaded LLC
>Environment:
>Description:

	A program that dlopens (a library linked against) libpthread
	and then dlcloses it can find itself in a pretty pickle with
	mysterious symptoms like this:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000079bbe310cccc in ?? ()
#0  0x000079bbe310cccc in ?? ()
#1  0x000079bbe2e9847c in __deregister_frame_info_bases () from /usr/lib/libgcc_s.so.1
#2  0x000079bbe2e86365 in __do_global_dtors_aux () from /usr/lib/libgcc_s.so.1
#3  0x000079bbe311ac00 in ?? ()
#4  0x000079bbe2e99a79 in _fini () from /usr/lib/libgcc_s.so.1
#5  0x000079bbe3585120 in atexit_handler_stack () from /usr/lib/libc.so.12
#6  0x00007f7ff709fbe1 in _rtld_call_initfini_function (mask=0x7f7fff539130, func=0x79bbe2e99a70 <_fini>) at /home/riastradh/netbsd/11/src/libexec/ld.elf_so/rtld.c:152
#7  _rtld_call_fini_function (obj=0x79bbe2e9ddf0, mask=0x7f7fff539130, cur_objgen=4) at /home/riastradh/netbsd/11/src/libexec/ld.elf_so/rtld.c:167
#8  0x00007f7ff70a06a6 in _rtld_call_fini_functions (force=1, mask=0x7f7fff539130) at /home/riastradh/netbsd/11/src/libexec/ld.elf_so/rtld.c:213
#9  _rtld_exit () at /home/riastradh/netbsd/11/src/libexec/ld.elf_so/rtld.c:431
#10 0x000079bbe32c895f in __cxa_finalize (dso=dso@entry=0x0) at /home/riastradh/netbsd/11/src/lib/libc/stdlib/atexit.c:222
#11 0x000079bbe32c853b in exit (status=status@entry=0) at /home/riastradh/netbsd/11/src/lib/libc/stdlib/exit.c:60
#12 0x000079bbe3592b90 in pass (ctx=0x79bbe359e860 <Current>) at /home/riastradh/netbsd/11/src/external/bsd/atf/dist/atf-c/tc.c:337
#13 0x000079bbe35931d5 in atf_tc_run (tc=0x792168 <atfu_dlopen_tc>, resfile=<optimized out>) at /home/riastradh/netbsd/11/src/external/bsd/atf/dist/atf-c/tc.c:1041
#14 0x000079bbe359000e in atf_tp_run (tp=tp@entry=0x7f7fff5392c0, tcname=<optimized out>, resfile=<optimized out>) at /home/riastradh/netbsd/11/src/external/bsd/atf/dist/atf-c/tp.c:205
#15 0x000079bbe358fb95 in run_tc (exitcode=<synthetic pointer>, p=0x7f7fff5392e0, tp=0x7f7fff5392c0) at /home/riastradh/netbsd/11/src/external/bsd/atf/dist/atf-c/detail/tp_main.c:510
#16 controlled_main (exitcode=<synthetic pointer>, add_tcs_hook=0x78fad8 <atfu_tp_add_tcs>, argv=<optimized out>, argc=<optimized out>) at /home/riastradh/netbsd/11/src/external/bsd/atf/dist/atf-c/detail/tp_main.c:580
#17 atf_tp_main (argc=<optimized out>, argv=<optimized out>, add_tcs_hook=add_tcs_hook@entry=0x78fad8 <atfu_tp_add_tcs>) at /home/riastradh/netbsd/11/src/external/bsd/atf/dist/atf-c/detail/tp_main.c:610
#18 0x000000000078fcb6 in main (argc=<optimized out>, argv=<optimized out>) at /home/riastradh/netbsd/11/src/tests/lib/libpthread/dlopen/t_dlopen.c:163
#19 0x000000000078f4eb in ___start (cleanup=<optimized out>, ps_strings=0x7f7fff539fe0) at /home/riastradh/netbsd/11/src/lib/csu/common/crt0-common.c:375
#20 0x00007f7ff70a68d0 in ?? () from /usr/libexec/ld.elf_so
#21 0x0000000000000005 in ?? ()
#22 0x00007f7fff539968 in ?? ()
#23 0x00007f7fff539971 in ?? ()
#24 0x00007f7fff53998b in ?? ()
#25 0x00007f7fff5399ae in ?? ()
#26 0x00007f7fff5399c9 in ?? ()
#27 0x0000000000000000 in ?? ()

	Setting a breakpoint on __deregister_frame_info_bases and
	single-stepping through it reveals that the crash comes from
	jumping, via the PLT entry for __libc_mutex_lock, into code in
	libpthread.so that is no longer mapped after dlclose.  Why is
	it trying to jump there?

	What happened is:

	1. The program dlopened (a library linked against) libpthread.

	2. The program called pthread_mutex_lock -- or rather,
	   __libc_mutex_lock, renamed via #define in <pthread.h>.

	3. The symbol __libc_mutex_lock has two definitions:

	   (a) A weak definition in libc.so -- the no-op thread stub.
	   (b) A strong definition in libpthread.so -- the real one.

	   Lazy binding of the symbol chooses the strong one, so the
	   entry for __libc_mutex_lock in the .got.plt is bound to
	   libpthread.so's definition, as shown by `info proc mappings'
	   and single-stepping in gdb:

(gdb) info proc mappings
...
      0x7ee838cfb000     0x7ee838d03000     0x8000     0x7000  r-x CNPD /lib/libpthread.so.1.5
...
(gdb) display/i $pc
1: x/i $pc
=> 0x7ee838a8a402 <__deregister_frame_info_bases+4>:    push   %r12
(gdb) si
...
(gdb) si
0x00007ee838a8a477 in __deregister_frame_info_bases ()
   from /usr/lib/libgcc_s.so.1
1: x/i $pc
=> 0x7ee838a8a477 <__deregister_frame_info_bases+121>:
    call   0x7ee838a78150 <__libc_mutex_lock@plt>
(gdb) si
0x00007ee838a78150 in __libc_mutex_lock@plt () from /usr/lib/libgcc_s.so.1
1: x/i $pc
=> 0x7ee838a78150 <__libc_mutex_lock@plt>:
    jmp    *0x17f42(%rip)        # 0x7ee838a90098 <__libc_mutex_lock@got.plt>
(gdb) x/xg $rip + 6 + 0x17f42
0x7ee838a90098 <__libc_mutex_lock@got.plt>:     0x00007ee838cfeccc
(gdb) si
pthread_mutex_lock (ptm=0x7ee838a90400 <object_mutex>)
    at /home/riastradh/netbsd/11/src/lib/libpthread/pthread_mutex.c:204
1: x/i $pc
=> 0x7ee838cfeccc <pthread_mutex_lock>:
    mov    0x92b5(%rip),%rax        # 0x7ee838d07f88

	   Note that 0x7ee838cfeccc lies in the interval
	   [0x7ee838cfb000,0x7ee838d03000) where libpthread.so is
	   mapped.

	4. dlclose unmapped everything in libpthread.so -- including the
	   pages of instructions that the .got.plt entry for
	   __libc_mutex_lock now points to -- and dlclose has no
	   mechanism to _unbind_ that entry.

	5. The next thing that tried to call __libc_mutex_lock jumped
	   into oblivion where libpthread.so used to be.  In the test
	   case above, that happened to be in some mysterious code path
	   at program exit, but it could just as well have been, say,
	   one of the stdio(3) functions taking a FILE lock.

(gdb) si
0x00007ee838a8a477 in __deregister_frame_info_bases ()
   from /usr/lib/libgcc_s.so.1
1: x/i $pc
=> 0x7ee838a8a477 <__deregister_frame_info_bases+121>:
    call   0x7ee838a78150 <__libc_mutex_lock@plt>
(gdb) si
0x00007ee838a78150 in __libc_mutex_lock@plt () from /usr/lib/libgcc_s.so.1
1: x/i $pc
=> 0x7ee838a78150 <__libc_mutex_lock@plt>:
    jmp    *0x17f42(%rip)        # 0x7ee838a90098 <__libc_mutex_lock@got.plt>
(gdb) si
0x00007ee838cfeccc in ?? ()
1: x/i $pc
=> 0x7ee838cfeccc:      <error: Cannot access memory at address 0x7ee838cfeccc>

	Why doesn't RTLD_LOCAL limit the scope of libpthread.so's
	__libc_mutex_lock definition so only those .got.plt entries for
	objects that dlclose is unloading will point to the
	libpthread.so one, and any .got.plt entries for objects in the
	global namespace will get the libc.so weak one?

	=> Because the library that the test dlopens, which is linked
	   against libpthread.so, is _also_ linked against libgcc_s.so,
	   which is already marked with -Wl,-z,nodelete -- and
	   libgcc_s.so's .got.plt entry for __libc_mutex_lock is
	   resolved in the RTLD_LOCAL scope and bound to
	   libpthread.so's __libc_mutex_lock.  If we remove libgcc_s.so
	   (by not using LIBISCXX=yes in the test library -- not sure
	   why we're using that anyway), the symptom goes away.

>How-To-Repeat:

	cd /usr/tests/lib/libpthread/dlopen
	atf-run | atf-report

	Caveat: This no longer works as a test case for this particular
	bug in HEAD, because __deregister_frame_info_bases has changed
	to avoid taking a lock with __libc_mutex_lock.  Need to
	construct a test case that still works in HEAD in spite of
	those changes.

>Fix:

	Add to lib/libpthread/Makefile:

	LDADD+=		-Wl,-z,nodelete

	This prevents rtld from actually unloading libpthread, so
	.got.plt entries already bound to it remain valid after
	dlclose.

	The same is probably needed for any library that provides
	strong definitions of a symbol that is still used when the
	library isn't loaded, via a weak definition from some other
	source -- like __libc_mutex_lock.

	It's a dark corner of ELF wizardry that we probably don't use
	much outside of libpthread.so, but I can't rule out the
	possibility that someone has dabbled in such nefarious magic
	elsewhere.

>Release-Note:

>Audit-Trail:
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/59784 CVS commit: src/tests/lib/libpthread/dlopen
Date: Sat, 22 Nov 2025 20:04:02 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Nov 22 20:04:02 UTC 2025

 Modified Files:
 	src/tests/lib/libpthread/dlopen: t_dlopen.c

 Log Message:
 tests/lib/libpthread: Test unloading libpthread after lazy binding.

 If you dlopen libpthread and dlclose it again, the thread stubs like
 pthread_mutex_lock need to continue working -- a library might have
 calls to it in order to support thread-safety for threaded
 applications, but that library needs to continue working even in
 non-threaded applications after lazy binding of the libpthread symbol
 instead of the libc stub.

 PR lib/59784: dlopening and dlclosing libpthread is broken


 To generate a diff of this commit:
 cvs rdiff -u -r1.1 -r1.2 src/tests/lib/libpthread/dlopen/t_dlopen.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/59784 CVS commit: src/tests/lib/libpthread/dlopen
Date: Sat, 22 Nov 2025 20:05:21 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Nov 22 20:05:20 UTC 2025

 Modified Files:
 	src/tests/lib/libpthread/dlopen: t_dso_pthread_create.c

 Log Message:
 tests/lib/libpthread: Don't abuse xfail.

 Use a signal handler to check for SIGABRT, rather than
 atf_tc_expect_signal.

 xfail is for when there is a bug that we haven't fixed yet and the
 test manifests a symptom of that bug -- a list of xfails is a list of
 open bugs to be fixed.  In this case, we are verifying that
 pthread_create _correctly_ raises SIGABRT (or fails with nonzero
 return code -- both are acceptable outcomes, really), and there is no
 bug here at the moment.

 Prompted by (but unrelated to):

 PR lib/59784: dlopening and dlclosing libpthread is broken


 To generate a diff of this commit:
 cvs rdiff -u -r1.1 -r1.2 \
     src/tests/lib/libpthread/dlopen/t_dso_pthread_create.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/59784 CVS commit: src
Date: Sun, 23 Nov 2025 22:11:42 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sun Nov 23 22:11:42 UTC 2025

 Modified Files:
 	src: UPDATING
 	src/lib/libpthread: Makefile
 	src/tests/lib/libpthread/dlopen: t_dlopen.c

 Log Message:
 libpthread: Link with -Wl,-z,nodelete.

 Can't safely unload libpthread because of the interaction with libc
 thread stubs.

 PR lib/59784: dlopening and dlclosing libpthread is broken


 To generate a diff of this commit:
 cvs rdiff -u -r1.386 -r1.387 src/UPDATING
 cvs rdiff -u -r1.102 -r1.103 src/lib/libpthread/Makefile
 cvs rdiff -u -r1.2 -r1.3 src/tests/lib/libpthread/dlopen/t_dlopen.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

Responsible-Changed-From-To: lib-bug-people->riastradh
Responsible-Changed-By: riastradh@NetBSD.org
Responsible-Changed-When: Sun, 23 Nov 2025 22:15:42 +0000
Responsible-Changed-Why:
mine


State-Changed-From-To: open->needs-pullups
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Sun, 23 Nov 2025 22:15:42 +0000
State-Changed-Why:
fixed in HEAD with tests, needs pullup-11, pullup-10, pullup-9


From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/59784 CVS commit: src
Date: Sat, 29 Nov 2025 14:39:36 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Nov 29 14:39:36 UTC 2025

 Modified Files:
 	src: UPDATING
 	src/lib/libpthread: shlib_version

 Log Message:
 libpthread: Touch comment in shlib_version for recent LDADD.

 This provokes relinking libpthread.so with the new arguments, without
 needing manual intervention to follow a note in UPDATING.

 PR lib/59784: dlopening and dlclosing libpthread is broken


 To generate a diff of this commit:
 cvs rdiff -u -r1.387 -r1.388 src/UPDATING
 cvs rdiff -u -r1.23 -r1.24 src/lib/libpthread/shlib_version

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/59784 CVS commit: [netbsd-11] src
Date: Sat, 29 Nov 2025 16:08:02 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Sat Nov 29 16:08:01 UTC 2025

 Modified Files:
 	src/lib/libpthread [netbsd-11]: Makefile shlib_version
 	src/tests/lib/libpthread/dlopen [netbsd-11]: t_dlopen.c
 	    t_dso_pthread_create.c

 Log Message:
 Pull up following revision(s) (requested by riastradh in ticket #110):

 	lib/libpthread/Makefile: revision 1.103
 	tests/lib/libpthread/dlopen/t_dso_pthread_create.c: revision 1.2
 	tests/lib/libpthread/dlopen/t_dlopen.c: revision 1.2
 	tests/lib/libpthread/dlopen/t_dlopen.c: revision 1.3
 	lib/libpthread/shlib_version: revision 1.24

 tests/lib/libpthread: Test unloading libpthread after lazy binding.

 If you dlopen libpthread and dlclose it again, the thread stubs like
 pthread_mutex_lock need to continue working -- a library might have
 calls to it in order to support thread-safety for threaded
 applications, but that library needs to continue working even in
 non-threaded applications after lazy binding of the libpthread symbol
 instead of the libc stub.

 PR lib/59784: dlopening and dlclosing libpthread is broken

 tests/lib/libpthread: Don't abuse xfail.

 Use a signal handler to check for SIGABRT, rather than
 atf_tc_expect_signal.

 xfail is for when there is a bug that we haven't fixed yet and the
 test manifests a symptom of that bug -- a list of xfails is a list of
 open bugs to be fixed.  In this case, we are verifying that
 pthread_create _correctly_ raises SIGABRT (or fails with nonzero
 return code -- both are acceptable outcomes, really), and there is no
 bug here at the moment.

 Prompted by (but unrelated to):

 PR lib/59784: dlopening and dlclosing libpthread is broken

 libpthread: Link with -Wl,-z,nodelete.

 Can't safely unload libpthread because of the interaction with libc
 thread stubs.

 PR lib/59784: dlopening and dlclosing libpthread is broken

 libpthread: Touch comment in shlib_version for recent LDADD.

 This provokes relinking libpthread.so with the new arguments, without
 needing manual intervention to follow a note in UPDATING.

 PR lib/59784: dlopening and dlclosing libpthread is broken


 To generate a diff of this commit:
 cvs rdiff -u -r1.100 -r1.100.2.1 src/lib/libpthread/Makefile
 cvs rdiff -u -r1.20.2.1 -r1.20.2.2 src/lib/libpthread/shlib_version
 cvs rdiff -u -r1.1 -r1.1.48.1 src/tests/lib/libpthread/dlopen/t_dlopen.c \
     src/tests/lib/libpthread/dlopen/t_dso_pthread_create.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

>Unformatted:
