NetBSD Problem Report #49791

From www@NetBSD.org  Thu Mar 26 23:44:16 2015
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id D51EFA5858
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 26 Mar 2015 23:44:16 +0000 (UTC)
Message-Id: <20150326234415.60676A6552@mollari.NetBSD.org>
Date: Thu, 26 Mar 2015 23:44:15 +0000 (UTC)
From: prlw1@cam.ac.uk
Reply-To: prlw1@cam.ac.uk
To: gnats-bugs@NetBSD.org
Subject: dlopen(0, and dlopened libraries
X-Send-Pr-Version: www-1.0

>Number:         49791
>Category:       lib
>Synopsis:       dlopen(0, and dlopened libraries
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    lib-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 26 23:45:00 +0000 2015
>Last-Modified:  Wed Jun 30 14:30:02 +0000 2021
>Originator:     Patrick Welche
>Release:        NetBSD-7.99.7/amd64
>Organization:
>Environment:
>Description:
Back in 2004, Julio Merino pointed out in a glib bug

  https://bugzilla.gnome.org/show_bug.cgi?id=140329

that our dlopen() behaves as follows (cf. dlfcn(3)):

     If the first argument is NULL, dlopen() returns a handle on the global
     symbol object.  This object provides access to all symbols from an
     ordered set of objects consisting of the original program image and any
     dependencies loaded during startup.

whereas according to POSIX in

  http://pubs.opengroup.org/onlinepubs/007904975/functions/dlopen.html

     If the value of file is 0, dlopen() shall provide a handle on a global
     symbol object. This object shall provide access to the symbols from an
     ordered set of objects consisting of the original program image file,
     together with any objects loaded at program start-up as specified by
     that process image file (for example, shared libraries), and the set
     of objects loaded using a dlopen() operation together with the
     RTLD_GLOBAL flag. As the latter set of objects can change during
     execution, the set identified by handle can also change dynamically.


glib assumes the POSIX variant, and we have a patch in pkgsrc to detect ours as broken.
>How-To-Repeat:
Try:

#include <err.h>
#include <dlfcn.h>
#include <stdio.h>

int main()
{
    void *handle;

    handle = dlopen ("libm.so", RTLD_GLOBAL | RTLD_LAZY);
    if (handle == NULL)
        errx(1, "dlopen of libm failed (%s)", dlerror());

    handle = dlopen (NULL, 0);
    if (handle == NULL)
        errx(1, "dlopen of global symbol object failed (%s)", dlerror());

    handle = dlsym (handle, "sin");
    if (handle == NULL)
        errx(1, "sin() not found in libm (%s)", dlerror());

    return 0;
}


$ ./dltest
dltest: sin() not found in libm (Undefined symbol "sin")

>Fix:

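A possible application-side workaround, following the RTLD_DEFAULT
suggestion made in the audit trail below, can be sketched as follows.
This is a minimal sketch and not a fix to dlopen() itself:
lookup_global() is a hypothetical helper name, and the _GNU_SOURCE
define is an assumption that only matters on glibc, where RTLD_DEFAULT
is not exposed without it.

```c
/* Assumption: glibc needs this to expose RTLD_DEFAULT; it should be
 * harmless on other systems. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>

/*
 * lookup_global() is a hypothetical helper, not part of any library.
 * It searches the default global scope directly instead of going
 * through a handle obtained from dlopen(NULL, ...).
 * Returns the symbol address, or NULL if the lookup failed.
 */
void *
lookup_global(const char *name)
{
    void *addr;

    (void)dlerror();                    /* clear any stale error state */
    addr = dlsym(RTLD_DEFAULT, name);
    if (dlerror() != NULL)              /* lookup failed */
        return NULL;
    return addr;
}
```

In the test program above, the dlopen (NULL, 0) / dlsym (handle, "sin")
pair would be replaced by a call such as lookup_global("sin") made after
libm has been loaded.
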
>Audit-Trail:
From: David Holland <dholland@eecs.harvard.edu>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: lib/49791: dlopen(0, and dlopened libraries
Date: Fri, 27 Mar 2015 13:45:52 -0400

 RTLD_GLOBAL is a bug; don't use it :-)

 -- 
    - David A. Holland / dholland@eecs.harvard.edu

From: Joerg Sonnenberger <joerg@britannica.bec.de>
To: gnats-bugs@NetBSD.org
Cc: lib-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: lib/49791: dlopen(0, and dlopened libraries
Date: Fri, 27 Mar 2015 18:47:11 +0100

 On Thu, Mar 26, 2015 at 11:45:00PM +0000, prlw1@cam.ac.uk wrote:
 > that our dlopen() does (cf dlfcn(3))
 > 
 >      If the first argument is NULL, dlopen() returns a handle on the global
 >      symbol object.  This object provides access to all symbols from an
 >      ordered set of objects consisting of the original program image and any
 >      dependencies loaded during startup.

 This is the same behavior as documented for glibc.

 > whereas accoding to posix in
 > 
 >   http://pubs.opengroup.org/onlinepubs/007904975/functions/dlopen.html
 > 
 > If the value of file is 0, dlopen() shall provide a handle on a global
 > symbol object. This object shall provide access to the symbols from an
 > ordered set of objects consisting of the original program image file,
 > together with any objects loaded at program start-up as specified by
 > that process image file (for example, shared libraries), and the set
 > of objects loaded using a dlopen() operation together with the
 > RTLD_GLOBAL flag. As the latter set of objects can change during
 > execution, the set identified by handle can also change dynamically.

 This, frankly, doesn't make sense.

 > glib assumes the posix variant, and we have a patch in pkgsrc to detect ours as broken.

 How can it? It doesn't seem like glibc provides the same:

   If filename is NULL, then the returned handle is for the main program.

 > #include <err.h>
 > #include <dlfcn.h>
 > #include <stdio.h>
 > 
 > int main()
 > {
 >     void *handle;
 > 
 >     handle = dlopen ("libm.so", RTLD_GLOBAL | RTLD_LAZY);
 >     if (handle == NULL)
 >         errx(1, "dlopen of libm failed (%s)", dlerror());
 > 
 >     handle = dlopen (NULL, 0);
 >     if (handle == NULL)
 >         errx(1, "dlopen of global symbol object failed (%s)", dlerror());
 > 
 >     handle = dlsym (handle, "sin");
 >     if (handle == NULL)
 >         errx(1, "sin() not found in libm (%s)", dlerror());

 I think the "correct" behavior here is:

 dlsym(RTLD_DEFAULT, "sin");

 Joerg

From: David Laight <david@l8s.co.uk>
To: gnats-bugs@NetBSD.org
Cc: lib-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, prlw1@cam.ac.uk
Subject: Re: lib/49791: dlopen(0, and dlopened libraries
Date: Fri, 27 Mar 2015 19:54:25 +0000

 On Fri, Mar 27, 2015 at 05:50:00PM +0000, David Holland wrote:
 > The following reply was made to PR lib/49791; it has been noted by GNATS.
 > 
 > From: David Holland <dholland@eecs.harvard.edu>
 > To: gnats-bugs@netbsd.org
 > Cc: 
 > Subject: Re: lib/49791: dlopen(0, and dlopened libraries
 > Date: Fri, 27 Mar 2015 13:45:52 -0400
 > 
 >  RTLD_GLOBAL is a bug; don't use it :-)

 And if you do use it, don't even think about calling dlclose().

 	David

 -- 
 David Laight: david@l8s.co.uk

From: Patrick Welche <prlw1@cam.ac.uk>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: lib/49791: dlopen(0, and dlopened libraries
Date: Sat, 4 Apr 2015 19:32:34 +0100

 On Fri, Mar 27, 2015 at 05:50:01PM +0000, Joerg Sonnenberger wrote:
 >  On Thu, Mar 26, 2015 at 11:45:00PM +0000, prlw1@cam.ac.uk wrote:
 >  > that our dlopen() does (cf dlfcn(3))
 >  > 
 >  >      If the first argument is NULL, dlopen() returns a handle on the global
 >  >      symbol object.  This object provides access to all symbols from an
 >  >      ordered set of objects consisting of the original program image and any
 >  >      dependencies loaded during startup.
 >  
 >  This is the same behavior documented on glibc

 According to Ubuntu 14.04.2 dlopen(3):

    If filename is a NULL pointer, then the returned handle is for the main
    program.  When given to dlsym(), this handle causes a search for a
    symbol in the main program, followed by all shared libraries loaded at
    program startup, and then all shared libraries loaded by dlopen() with
    the flag RTLD_GLOBAL.

 So the glibc description includes the third clause, which is missing
 from our description yet present in the POSIX description:

 >  > whereas accoding to posix in
 >  > 
 >  >   http://pubs.opengroup.org/onlinepubs/007904975/functions/dlopen.html
 >  > 
 >  > If the value of file is 0, dlopen() shall provide a handle on a global
 >  > symbol object. This object shall provide access to the symbols from an
 >  > ordered set of objects consisting of the original program image file,
 >  > together with any objects loaded at program start-up as specified by
 >  > that process image file (for example, shared libraries), and the set
 >  > of objects loaded using a dlopen() operation together with the
 >  > RTLD_GLOBAL flag. As the latter set of objects can change during
 >  > execution, the set identified by handle can also change dynamically.
 >  
 >  This, frankly, doesn't make sense.
 >  
 >  > glib assumes the posix variant, and we have a patch in pkgsrc to detect ours as broken.
 >  
 >  How can it? It doesn't seem like glibc provides the same:
 >  
 >    If filename is NULL, then the returned handle is for the main program.
 >  
 >  > #include <err.h>
 >  > #include <dlfcn.h>
 >  > #include <stdio.h>
 >  > 
 >  > int main()
 >  > {
 >  >     void *handle;
 >  > 
 >  >     handle = dlopen ("libm.so", RTLD_GLOBAL | RTLD_LAZY);
 >  >     if (handle == NULL)
 >  >         errx(1, "dlopen of libm failed (%s)", dlerror());
 >  > 
 >  >     handle = dlopen (NULL, 0);
 >  >     if (handle == NULL)
 >  >         errx(1, "dlopen of global symbol object failed (%s)", dlerror());
 >  > 
 >  >     handle = dlsym (handle, "sin");
 >  >     if (handle == NULL)
 >  >         errx(1, "sin() not found in libm (%s)", dlerror());
 >  
 >  I think the "correct" behavior here is:
 >  
 >  dlsym(RTLD_DEFAULT, "sin");

 Interestingly, glib didn't worry about autoconfigury when android support
 came along, and committed:

 #ifdef __BIONIC__
   handle = RTLD_DEFAULT;
 #else
   handle = dlopen (NULL, RTLD_GLOBAL | RTLD_LAZY);
 #endif

 So maybe this is the way forward. How do you decide on "correct"?
 Could you give me some pointers to why RTLD_GLOBAL is a bug, and the
 problems with subsequent dlclose()?

From: Joerg Sonnenberger <joerg@britannica.bec.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: lib/49791: dlopen(0, and dlopened libraries
Date: Sat, 4 Apr 2015 20:49:57 +0200

 On Sat, Apr 04, 2015 at 06:35:01PM +0000, Patrick Welche wrote:
 >  So maybe this is the way forward. How do you decide on "correct"?
 >  Could you give me some pointers to why RTLD_GLOBAL is a bug, and the
 >  problems with subsequent dlclose()?

 The POSIX behavior makes no sense in light of multi-threaded programs.
 Returning a handle gives the illusion that the result of two consecutive
 calls will be the same, when a different thread might have done a
 dlopen or dlclose in the meantime. If the return value is supposed to
 be a magic global object, RTLD_DEFAULT is much saner as it doesn't
 pretend dlopen/dlclose is needed.

 The problem with RTLD_GLOBAL is that it is (a) expensive and (b) dangerous.
 It is expensive because it adds work for every look-up the same way
 LD_PRELOAD does. It is dangerous because the meaning of a symbol changes
 over time. Especially noticeable is that it can interact with lazy
 binding in surprising ways. There is broken software depending on such
 behavior (XFree86 and successors, I look at you!), but that is generally
 a sign of a very badly designed module system.

 Joerg
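
The point that the meaning of a symbol can change over time can be
illustrated with a short sketch. It is an illustration under stated
assumptions rather than a description of any particular rtld:
scope_effect() and resolves() are hypothetical names, and the
_GNU_SOURCE define is an assumption needed only on glibc to expose
RTLD_DEFAULT.

```c
/* Assumption: glibc needs this to expose RTLD_DEFAULT. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>

/* Returns 1 if name currently resolves through RTLD_DEFAULT, else 0. */
static int
resolves(const char *name)
{
    (void)dlerror();                    /* clear any stale error state */
    (void)dlsym(RTLD_DEFAULT, name);
    return dlerror() == NULL;
}

/*
 * scope_effect() is a hypothetical helper. It loads lib with
 * RTLD_GLOBAL and reports how the global lookup of name is affected:
 *   -1  lib could not be loaded
 *    0  name does not resolve before or after the load
 *    1  name resolves only after the load (the global scope changed)
 *    2  name already resolved before the load
 */
int
scope_effect(const char *lib, const char *name)
{
    int before = resolves(name);

    /* The handle is deliberately never passed to dlclose(), in line
     * with the warning earlier in this thread. */
    if (dlopen(lib, RTLD_GLOBAL | RTLD_LAZY) == NULL)
        return -1;
    if (before)
        return 2;
    return resolves(name) ? 1 : 0;
}
```

A result of 1 means that the same lookup gave a different answer before
and after the load, which is the kind of time-dependent behaviour
described above. Which value a given library and symbol produce depends
on the platform and on what the program is already linked against.
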

From: "Patrick Welche" <prlw1@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/49791 CVS commit: pkgsrc/devel/glib2
Date: Mon, 8 Oct 2018 10:12:06 +0000

 Module Name:	pkgsrc
 Committed By:	prlw1
 Date:		Mon Oct  8 10:12:06 UTC 2018

 Modified Files:
 	pkgsrc/devel/glib2: Makefile distinfo
 	pkgsrc/devel/glib2/patches: patch-aa patch-ak
 Added Files:
 	pkgsrc/devel/glib2/patches: patch-gmodule_gmodule-ar.c
 	    patch-gmodule_gmodule-dl.c patch-gmodule_gmodule-dyld.c
 	    patch-gmodule_gmodule-win32.c patch-gmodule_gmodule.c
 Removed Files:
 	pkgsrc/devel/glib2/patches: patch-ab patch-ac patch-ae

 Log Message:
 glib2's gobject subsystem is essentially a wrapper for dlopen. In
 view of comments in PR lib/49791 which can be summarised as
 "RTLD_GLOBAL is a bug", make gobject use RTLD_DEFAULT instead.


 To generate a diff of this commit:
 cvs rdiff -u -r1.242 -r1.243 pkgsrc/devel/glib2/Makefile
 cvs rdiff -u -r1.235 -r1.236 pkgsrc/devel/glib2/distinfo
 cvs rdiff -u -r1.61 -r1.62 pkgsrc/devel/glib2/patches/patch-aa
 cvs rdiff -u -r1.14 -r0 pkgsrc/devel/glib2/patches/patch-ab
 cvs rdiff -u -r1.10 -r0 pkgsrc/devel/glib2/patches/patch-ac
 cvs rdiff -u -r1.5 -r0 pkgsrc/devel/glib2/patches/patch-ae
 cvs rdiff -u -r1.20 -r1.21 pkgsrc/devel/glib2/patches/patch-ak
 cvs rdiff -u -r0 -r1.1 pkgsrc/devel/glib2/patches/patch-gmodule_gmodule-ar.c \
     pkgsrc/devel/glib2/patches/patch-gmodule_gmodule-dl.c \
     pkgsrc/devel/glib2/patches/patch-gmodule_gmodule-dyld.c \
     pkgsrc/devel/glib2/patches/patch-gmodule_gmodule-win32.c \
     pkgsrc/devel/glib2/patches/patch-gmodule_gmodule.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Patrick Welche" <prlw1@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/49791 CVS commit: pkgsrc/devel/glib2
Date: Wed, 30 Jun 2021 14:26:11 +0000

 Module Name:	pkgsrc
 Committed By:	prlw1
 Date:		Wed Jun 30 14:26:11 UTC 2021

 Modified Files:
 	pkgsrc/devel/glib2: Makefile distinfo
 Added Files:
 	pkgsrc/devel/glib2/patches: patch-gmodule_gmodule-dl.c
 	    patch-gmodule_gmodule.c

 Log Message:
 Re-add patches I wrote in October 2018:

     glib2's gobject subsystem is essentially a wrapper for dlopen. In
     view of comments in PR lib/49791 which can be summarised as
     "RTLD_GLOBAL is a bug", make gobject use RTLD_DEFAULT instead.

 This should fix PR pkg/56212

 The upstream merge request

     https://gitlab.gnome.org/GNOME/glib/-/merge_requests/2171

 has been updated - feel free to add a description of the problems you
 experienced without this patch to it.


 To generate a diff of this commit:
 cvs rdiff -u -r1.281 -r1.282 pkgsrc/devel/glib2/Makefile
 cvs rdiff -u -r1.287 -r1.288 pkgsrc/devel/glib2/distinfo
 cvs rdiff -u -r0 -r1.3 pkgsrc/devel/glib2/patches/patch-gmodule_gmodule-dl.c \
     pkgsrc/devel/glib2/patches/patch-gmodule_gmodule.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.