NetBSD Problem Report #49791
From www@NetBSD.org Thu Mar 26 23:44:16 2015
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id D51EFA5858
for <gnats-bugs@gnats.NetBSD.org>; Thu, 26 Mar 2015 23:44:16 +0000 (UTC)
Message-Id: <20150326234415.60676A6552@mollari.NetBSD.org>
Date: Thu, 26 Mar 2015 23:44:15 +0000 (UTC)
From: prlw1@cam.ac.uk
Reply-To: prlw1@cam.ac.uk
To: gnats-bugs@NetBSD.org
Subject: dlopen(0, and dlopened libraries
X-Send-Pr-Version: www-1.0
>Number: 49791
>Category: lib
>Synopsis: dlopen(0, and dlopened libraries
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: lib-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Mar 26 23:45:00 +0000 2015
>Last-Modified: Wed Jun 30 14:30:02 +0000 2021
>Originator: Patrick Welche
>Release: NetBSD-7.99.7/amd64
>Organization:
>Environment:
>Description:
Back in 2004, Julio Merino pointed out in a glib bug
https://bugzilla.gnome.org/show_bug.cgi?id=140329
that our dlopen() behaves as follows (cf. dlfcn(3)):
    If the first argument is NULL, dlopen() returns a handle on the global
    symbol object. This object provides access to all symbols from an
    ordered set of objects consisting of the original program image and any
    dependencies loaded during startup.
whereas according to POSIX in
http://pubs.opengroup.org/onlinepubs/007904975/functions/dlopen.html
    If the value of file is 0, dlopen() shall provide a handle on a global
    symbol object. This object shall provide access to the symbols from an
    ordered set of objects consisting of the original program image file,
    together with any objects loaded at program start-up as specified by
    that process image file (for example, shared libraries), and the set
    of objects loaded using a dlopen() operation together with the
    RTLD_GLOBAL flag. As the latter set of objects can change during
    execution, the set identified by handle can also change dynamically.
glib assumes the POSIX variant, and we have a patch in pkgsrc to detect ours as broken.
>How-To-Repeat:
Try:
#include <err.h>
#include <dlfcn.h>
#include <stdio.h>

int main()
{
    void *handle;

    handle = dlopen ("libm.so", RTLD_GLOBAL | RTLD_LAZY);
    if (handle == NULL)
        errx(1, "dlopen of libm failed (%s)", dlerror());

    handle = dlopen (NULL, 0);
    if (handle == NULL)
        errx(1, "dlopen of global symbol object failed (%s)", dlerror());

    handle = dlsym (handle, "sin");
    if (handle == NULL)
        errx(1, "sin() not found in libm (%s)", dlerror());

    return 0;
}
$ ./dltest
dltest: sin() not found in libm (Undefined symbol "sin")
>Fix:
>Audit-Trail:
From: David Holland <dholland@eecs.harvard.edu>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: lib/49791: dlopen(0, and dlopened libraries
Date: Fri, 27 Mar 2015 13:45:52 -0400
RTLD_GLOBAL is a bug; don't use it :-)
--
- David A. Holland / dholland@eecs.harvard.edu
From: Joerg Sonnenberger <joerg@britannica.bec.de>
To: gnats-bugs@NetBSD.org
Cc: lib-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: lib/49791: dlopen(0, and dlopened libraries
Date: Fri, 27 Mar 2015 18:47:11 +0100
On Thu, Mar 26, 2015 at 11:45:00PM +0000, prlw1@cam.ac.uk wrote:
> that our dlopen() does (cf dlfcn(3))
>
> If the first argument is NULL, dlopen() returns a handle on the global
> symbol object. This object provides access to all symbols from an
> ordered set of objects consisting of the original program image and any
> dependencies loaded during startup.
This is the same behavior documented on glibc
> whereas according to posix in
>
> http://pubs.opengroup.org/onlinepubs/007904975/functions/dlopen.html
>
> If the value of file is 0, dlopen() shall provide a handle on a global
> symbol object. This object shall provide access to the symbols from an
> ordered set of objects consisting of the original program image file,
> together with any objects loaded at program start-up as specified by
> that process image file (for example, shared libraries), and the set
> of objects loaded using a dlopen() operation together with the
> RTLD_GLOBAL flag. As the latter set of objects can change during
> execution, the set identified by handle can also change dynamically.
This, frankly, doesn't make sense.
> glib assumes the posix variant, and we have a patch in pkgsrc to detect ours as broken.
How can it? It doesn't seem like glibc provides the same:
If filename is NULL, then the returned handle is for the main program.
> #include <err.h>
> #include <dlfcn.h>
> #include <stdio.h>
>
> int main()
> {
> void *handle;
>
> handle = dlopen ("libm.so", RTLD_GLOBAL | RTLD_LAZY);
> if (handle == NULL)
> errx(1, "dlopen of libm failed (%s)", dlerror());
>
> handle = dlopen (NULL, 0);
> if (handle == NULL)
> errx(1, "dlopen of global symbol object failed (%s)", dlerror());
>
> handle = dlsym (handle, "sin");
> if (handle == NULL)
> errx(1, "sin() not found in libm (%s)", dlerror());
I think the "correct" behavior here is:
dlsym(RTLD_DEFAULT, "sin");
Joerg
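
The suggestion above can be sketched as a small helper. This is only an
illustration, not code from NetBSD or glib: lookup_global() is a hypothetical
name, and the fallback branch assumes that on a platform where RTLD_DEFAULT
is not visible, a handle from dlopen(NULL, ...) is the closest substitute.
It also checks dlerror() rather than the returned pointer, since a symbol's
value may legitimately be NULL.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* glibc exposes RTLD_DEFAULT only with this */
#endif
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: resolve `name` in the global symbol scope.
 * Returns the symbol address, or NULL if the lookup failed.  If errp
 * is non-NULL it receives the dlerror() text (NULL on success).
 */
static void *
lookup_global(const char *name, const char **errp)
{
	void *scope, *sym;
	const char *err;

	assert(name != NULL);
#ifdef RTLD_DEFAULT
	scope = RTLD_DEFAULT;		/* no dlopen()/dlclose() pairing needed */
#else
	/* Assumed fallback: handle for the main program. */
	scope = dlopen(NULL, RTLD_LAZY);
#endif
	(void)dlerror();		/* discard any stale error state */
	sym = dlsym(scope, name);
	err = dlerror();		/* non-NULL only if dlsym() failed */
	if (errp != NULL)
		*errp = err;
	return err != NULL ? NULL : sym;
}
```

A caller would use lookup_global("sin", &err) in place of the
dlopen(NULL, 0) / dlsym() pair in the test program, and report err on
failure.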
From: David Laight <david@l8s.co.uk>
To: gnats-bugs@NetBSD.org
Cc: lib-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, prlw1@cam.ac.uk
Subject: Re: lib/49791: dlopen(0, and dlopened libraries
Date: Fri, 27 Mar 2015 19:54:25 +0000
On Fri, Mar 27, 2015 at 05:50:00PM +0000, David Holland wrote:
> The following reply was made to PR lib/49791; it has been noted by GNATS.
>
> From: David Holland <dholland@eecs.harvard.edu>
> To: gnats-bugs@netbsd.org
> Cc:
> Subject: Re: lib/49791: dlopen(0, and dlopened libraries
> Date: Fri, 27 Mar 2015 13:45:52 -0400
>
> RTLD_GLOBAL is a bug; don't use it :-)
And if you do use it, don't even think about calling dlclose().
David
--
David Laight: david@l8s.co.uk
From: Patrick Welche <prlw1@cam.ac.uk>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: lib/49791: dlopen(0, and dlopened libraries
Date: Sat, 4 Apr 2015 19:32:34 +0100
On Fri, Mar 27, 2015 at 05:50:01PM +0000, Joerg Sonnenberger wrote:
> On Thu, Mar 26, 2015 at 11:45:00PM +0000, prlw1@cam.ac.uk wrote:
> > that our dlopen() does (cf dlfcn(3))
> >
> > If the first argument is NULL, dlopen() returns a handle on the global
> > symbol object. This object provides access to all symbols from an
> > ordered set of objects consisting of the original program image and any
> > dependencies loaded during startup.
>
> This is the same behavior documented on glibc
According to Ubuntu 14.04.2 dlopen(3):
    If filename is a NULL pointer, then the returned handle is for the main
    program. When given to dlsym(), this handle causes a search for a symbol
    in the main program, followed by all shared libraries loaded at
    program startup, and then all shared libraries loaded by dlopen() with
    the flag RTLD_GLOBAL.
So this has the 3rd clause missing from our description, yet present
in the posix description:
> > whereas according to posix in
> >
> > http://pubs.opengroup.org/onlinepubs/007904975/functions/dlopen.html
> >
> > If the value of file is 0, dlopen() shall provide a handle on a global
> > symbol object. This object shall provide access to the symbols from an
> > ordered set of objects consisting of the original program image file,
> > together with any objects loaded at program start-up as specified by
> > that process image file (for example, shared libraries), and the set
> > of objects loaded using a dlopen() operation together with the
> > RTLD_GLOBAL flag. As the latter set of objects can change during
> > execution, the set identified by handle can also change dynamically.
>
> This, frankly, doesn't make sense.
>
> > glib assumes the posix variant, and we have a patch in pkgsrc to detect ours as broken.
>
> How can it? It doesn't seem like glibc provides the same:
>
> If filename is NULL, then the returned handle is for the main program.
>
> > #include <err.h>
> > #include <dlfcn.h>
> > #include <stdio.h>
> >
> > int main()
> > {
> > void *handle;
> >
> > handle = dlopen ("libm.so", RTLD_GLOBAL | RTLD_LAZY);
> > if (handle == NULL)
> > errx(1, "dlopen of libm failed (%s)", dlerror());
> >
> > handle = dlopen (NULL, 0);
> > if (handle == NULL)
> > errx(1, "dlopen of global symbol object failed (%s)", dlerror());
> >
> > handle = dlsym (handle, "sin");
> > if (handle == NULL)
> > errx(1, "sin() not found in libm (%s)", dlerror());
>
> I think the "correct" behavior here is:
>
> dlsym(RTLD_DEFAULT, "sin");
Interestingly, glib didn't worry about autoconfigury when android support
came along, and committed:
#ifdef __BIONIC__
    handle = RTLD_DEFAULT;
#else
    handle = dlopen (NULL, RTLD_GLOBAL | RTLD_LAZY);
#endif
So maybe this is the way forward. How do you decide on "correct"?
Could you give me some pointers to why RTLD_GLOBAL is a bug, and the
problems with subsequent dlclose()?
From: Joerg Sonnenberger <joerg@britannica.bec.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: lib/49791: dlopen(0, and dlopened libraries
Date: Sat, 4 Apr 2015 20:49:57 +0200
On Sat, Apr 04, 2015 at 06:35:01PM +0000, Patrick Welche wrote:
> So maybe this is the way forward. How do you decide on "correct"?
> Could you give me some pointers to why RTLD_GLOBAL is a bug, and the
> problems with subsequent dlclose()?
The POSIX behavior makes no sense in light of multi-threaded programs.
Returning a handle gives the illusion that the result of two consecutive
calls will be the same, when a different thread might have done a
dlopen or dlclose in the meantime. If the return value is supposed to
be a magic global object, RTLD_DEFAULT is much saner as it doesn't
pretend dlopen/dlclose is needed.
The problem with RTLD_GLOBAL is that it is (a) expensive and (b) dangerous.
It is expensive because it adds work for every look-up the same way
LD_PRELOAD does. It is dangerous because the meaning of a symbol changes
over time. Especially noticeable is that it can interact with lazy
binding in surprising ways. There is broken software depending on such
behavior (XFree86 and successors, I look at you!), but that is generally
a sign of a very badly designed module system.
Joerg
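
The point that the meaning of a symbol changes over time can be observed
with a sketch like the following. It is an illustration under stated
assumptions, not a statement about any particular rtld: track_symbol(),
visible() and struct visibility are hypothetical names, the library
candidates are supplied by the caller, and whether the symbol is still
visible after dlclose() depends on reference counts and on the platform.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* glibc exposes RTLD_DEFAULT only with this */
#endif
#include <assert.h>
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>

struct visibility {
	bool loaded;	/* did one of the candidate libraries load? */
	bool before;	/* symbol resolvable before the dlopen()? */
	bool during;	/* ... while the library is open? */
	bool after;	/* ... after dlclose()? */
};

/* Is `name` resolvable in the global scope right now? */
static bool
visible(const char *name)
{
	void *scope;

#ifdef RTLD_DEFAULT
	scope = RTLD_DEFAULT;
#else
	scope = dlopen(NULL, RTLD_LAZY);	/* assumed fallback */
#endif
	(void)dlerror();			/* discard stale errors */
	(void)dlsym(scope, name);
	return dlerror() == NULL;
}

/*
 * Load the first library from the NULL-terminated `candidates` list
 * that opens with RTLD_GLOBAL, and record whether `name` is visible
 * before the load, while loaded, and after dlclose().
 */
static struct visibility
track_symbol(const char *const *candidates, const char *name)
{
	struct visibility v = { false, false, false, false };
	void *lib = NULL;

	assert(candidates != NULL && name != NULL);
	v.before = visible(name);
	for (size_t i = 0; lib == NULL && candidates[i] != NULL; i++)
		lib = dlopen(candidates[i], RTLD_GLOBAL | RTLD_LAZY);
	if (lib == NULL)
		return v;
	v.loaded = true;
	v.during = visible(name);
	(void)dlclose(lib);
	v.after = visible(name);
	return v;
}
```

Calling track_symbol() with a list such as { "libm.so", "libm.so.6", NULL }
and the name "sin" from a program not linked against libm would be one way
to see the three states differ. Any function pointer saved while the
library was open must be treated as invalid once after is false.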
From: "Patrick Welche" <prlw1@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/49791 CVS commit: pkgsrc/devel/glib2
Date: Mon, 8 Oct 2018 10:12:06 +0000
Module Name: pkgsrc
Committed By: prlw1
Date: Mon Oct 8 10:12:06 UTC 2018
Modified Files:
pkgsrc/devel/glib2: Makefile distinfo
pkgsrc/devel/glib2/patches: patch-aa patch-ak
Added Files:
pkgsrc/devel/glib2/patches: patch-gmodule_gmodule-ar.c
patch-gmodule_gmodule-dl.c patch-gmodule_gmodule-dyld.c
patch-gmodule_gmodule-win32.c patch-gmodule_gmodule.c
Removed Files:
pkgsrc/devel/glib2/patches: patch-ab patch-ac patch-ae
Log Message:
glib2's gobject subsystem is essentially a wrapper for dlopen. In
view of comments in PR lib/49791 which can be summarised as
"RTLD_GLOBAL is a bug", make gobject use RTLD_DEFAULT instead.
To generate a diff of this commit:
cvs rdiff -u -r1.242 -r1.243 pkgsrc/devel/glib2/Makefile
cvs rdiff -u -r1.235 -r1.236 pkgsrc/devel/glib2/distinfo
cvs rdiff -u -r1.61 -r1.62 pkgsrc/devel/glib2/patches/patch-aa
cvs rdiff -u -r1.14 -r0 pkgsrc/devel/glib2/patches/patch-ab
cvs rdiff -u -r1.10 -r0 pkgsrc/devel/glib2/patches/patch-ac
cvs rdiff -u -r1.5 -r0 pkgsrc/devel/glib2/patches/patch-ae
cvs rdiff -u -r1.20 -r1.21 pkgsrc/devel/glib2/patches/patch-ak
cvs rdiff -u -r0 -r1.1 pkgsrc/devel/glib2/patches/patch-gmodule_gmodule-ar.c \
pkgsrc/devel/glib2/patches/patch-gmodule_gmodule-dl.c \
pkgsrc/devel/glib2/patches/patch-gmodule_gmodule-dyld.c \
pkgsrc/devel/glib2/patches/patch-gmodule_gmodule-win32.c \
pkgsrc/devel/glib2/patches/patch-gmodule_gmodule.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Patrick Welche" <prlw1@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/49791 CVS commit: pkgsrc/devel/glib2
Date: Wed, 30 Jun 2021 14:26:11 +0000
Module Name: pkgsrc
Committed By: prlw1
Date: Wed Jun 30 14:26:11 UTC 2021
Modified Files:
pkgsrc/devel/glib2: Makefile distinfo
Added Files:
pkgsrc/devel/glib2/patches: patch-gmodule_gmodule-dl.c
patch-gmodule_gmodule.c
Log Message:
Re-add patches I wrote in October 2018:
glib2's gobject subsystem is essentially a wrapper for dlopen. In
view of comments in PR lib/49791 which can be summarised as
"RTLD_GLOBAL is a bug", make gobject use RTLD_DEFAULT instead.
This should fix PR pkg/56212
The upstream merge request
https://gitlab.gnome.org/GNOME/glib/-/merge_requests/2171
has been updated - feel free to add a description of the problems you
experienced without this patch to it.
To generate a diff of this commit:
cvs rdiff -u -r1.281 -r1.282 pkgsrc/devel/glib2/Makefile
cvs rdiff -u -r1.287 -r1.288 pkgsrc/devel/glib2/distinfo
cvs rdiff -u -r0 -r1.3 pkgsrc/devel/glib2/patches/patch-gmodule_gmodule-dl.c \
pkgsrc/devel/glib2/patches/patch-gmodule_gmodule.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.