NetBSD Problem Report #58865

From www@netbsd.org  Sat Nov 30 03:31:41 2024
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
	 client-signature RSA-PSS (2048 bits) client-digest SHA256)
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id D1E001A9238
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 30 Nov 2024 03:31:41 +0000 (UTC)
Message-Id: <20241130033140.1F8F91A923E@mollari.NetBSD.org>
Date: Sat, 30 Nov 2024 03:31:40 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: static and dynamic dl_iterate_phdr disagree on main object name
X-Send-Pr-Version: www-1.0

>Number:         58865
>Category:       lib
>Synopsis:       static and dynamic dl_iterate_phdr disagree on main object name
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    lib-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Nov 30 03:35:00 +0000 2024
>Last-Modified:  Sat Nov 30 20:00:01 +0000 2024
>Originator:     Taylor R Campbell
>Release:        current, 10, 9, ...
>Organization:
The AT_NETBSD_EXECFOUNDATION
>Environment:
>Description:
Static dl_iterate_phdr (lib/libc/dlfcn/dlfcn_elf.c) gives AT_SUN_EXECNAME as the main object's struct dl_phdr_info::dlpi_name:

    174 		case AT_SUN_EXECNAME:
    175 			dlpi_name = (void *)aux->a_v;
    176 			break;
...
    216 	phdr_info.dlpi_name = dlpi_name;
    217 
    218 	return callback(&phdr_info, sizeof(phdr_info), data);

https://nxr.netbsd.org/xref/src/lib/libc/dlfcn/dlfcn_elf.c?r=1.17#174

Dynamic dl_iterate_phdr (libexec/ld.elf_so/rtld.c) instead gives argv[0] as the main object's struct dl_phdr_info::dlpi_name:

    682 		_rtld_objmain->path = xstrdup(argv[0] ? argv[0] :
    683 		    "main program");
...
   1467 	/* XXX: wrong but not fixing it yet */
   1468 	phdr_info->dlpi_name = obj->path;

https://nxr.netbsd.org/xref/src/libexec/ld.elf_so/rtld.c?r=1.217#682

ld.elf_so does read out AT_SUN_EXECNAME, but only uses it for $ORIGIN.

It is not a priori clear which one is correct, but I lean toward AT_SUN_EXECNAME, since there is otherwise no way to obtain it without going through the undocumented _dlauxinfo().
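
One possible selection rule, sketched under the assumption that both implementations should prefer AT_SUN_EXECNAME and fall back the way rtld.c already does. The helper name main_object_name is hypothetical; nothing by that name exists in libc or ld.elf_so:

```c
#include <stddef.h>

/*
 * Hypothetical helper (not in libc or ld.elf_so): choose the name to
 * report for the main object.  Prefer the AT_SUN_EXECNAME value, then
 * argv[0], then the "main program" placeholder that rtld.c uses today.
 */
static const char *
main_object_name(const char *execname, const char *argv0)
{
	if (execname != NULL)
		return execname;
	if (argv0 != NULL)
		return argv0;
	return "main program";
}
```

Under this rule a program started as ./dlx from /tmp/riastradh would report /tmp/riastradh/./dlx whether it is linked statically or dynamically.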
>How-To-Repeat:
$ pwd
/tmp/riastradh
$ cat dlx.c
#include <dlfcn.h>
#include <elf.h>
#include <errno.h>
#include <link.h>
#include <stdio.h>

static int
callback(struct dl_phdr_info *dlpi, size_t size, void *cookie)
{

	printf("dl_iterate_phdr name=%s\n", dlpi->dlpi_name);
	return 1;
}

int
main(void)
{
	const AuxInfo *aux;

	for (aux = _dlauxinfo(); aux->a_type != AT_NULL; aux++) {
		switch (aux->a_type) {
		case AT_SUN_EXECNAME:
			printf("AT_SUN_EXECNAME=%s\n", (char *)aux->a_v);
			break;
		}
	}

	dl_iterate_phdr(&callback, NULL);
	return 0;
}
$ rm -f dlx && make dlx DBG=-g\ -O2\ -Wall\ -Werror && ./dlx
cc -g -O2 -Wall -Werror   -o dlx dlx.c 
AT_SUN_EXECNAME=/tmp/riastradh/./dlx
dl_iterate_phdr name=./dlx
$ rm -f dlx && make dlx DBG=-g\ -O2\ -Wall\ -Werror\ -static && ./dlx
cc -g -O2 -Wall -Werror -static   -o dlx dlx.c 
AT_SUN_EXECNAME=/tmp/riastradh/./dlx
dl_iterate_phdr name=/tmp/riastradh/./dlx
>Fix:
Yes, please!

>Audit-Trail:
From: Christos Zoulas <christos@zoulas.com>
To: gnats-bugs@netbsd.org
Cc: lib-bug-people@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org
Subject: Re: lib/58865: static and dynamic dl_iterate_phdr disagree on main
 object name
Date: Sat, 30 Nov 2024 13:32:17 -0500

 --Apple-Mail=_AFD46184-104A-4358-AB84-5875042323C1
 Content-Type: multipart/mixed;
 	boundary="Apple-Mail=_7E2AA65F-A7B7-4570-9B65-7FB4110A2775"


 --Apple-Mail=_7E2AA65F-A7B7-4570-9B65-7FB4110A2775
 Content-Transfer-Encoding: quoted-printable
 Content-Type: text/plain;
 	charset=us-ascii

 FreeBSD does not process AT_SUN_EXECNAME at all. But I think you are
 right, might as well use the full path for the name. The Dl_info will be
 different as returned from dladdr(3), the linkmap entry as returned by
 dlinfo(3), and the error messages from the linker itself. It will look
 non-familiar to the user (why I typed ./foo and the error message is
 /path/foo?), but I think it is better (since it provides an absolute
 path) and should not break anything.

 christos



 --Apple-Mail=_7E2AA65F-A7B7-4570-9B65-7FB4110A2775
 Content-Disposition: attachment;
 	filename=objmain_name.diff
 Content-Type: application/octet-stream;
 	name=objmain_name.diff;
 	x-unix-mode=0664
 Content-Transfer-Encoding: 7bit

 Index: rtld.c
 ===================================================================
 RCS file: /cvsroot/src/libexec/ld.elf_so/rtld.c,v
 retrieving revision 1.217
 diff -u -p -u -r1.217 rtld.c
 --- rtld.c	19 Jan 2024 19:21:34 -0000	1.217
 +++ rtld.c	30 Nov 2024 18:23:08 -0000
 @@ -467,7 +467,7 @@ _rtld(Elf_Addr *sp, Elf_Addr relocbase)
  	bool            bind_now = 0;
  	const char     *ld_bind_now, *ld_preload, *ld_library_path;
  	const char    **argv;
 -	const char     *execname;
 +	const char     *execname, *objmain_name;
  	long		argc;
  	const char **real___progname;
  	const Obj_Entry **real___mainprog_obj;
 @@ -656,11 +656,12 @@ _rtld(Elf_Addr *sp, Elf_Addr relocbase)
           * Load the main program, or process its program header if it is
           * already loaded.
           */
 +	objmain_name = execname ? execname :
 +	    (argv[0] ? argv[0] : "main program");
  	if (pAUX_execfd != NULL) {	/* Load the main program. */
  		int             fd = pAUX_execfd->a_v;
 -		const char *obj_name = argv[0] ? argv[0] : "main program";
  		dbg(("loading main program"));
 -		_rtld_objmain = _rtld_map_object(obj_name, fd, NULL);
 +		_rtld_objmain = _rtld_map_object(objmain_name, fd, NULL);
  		close(fd);
  		if (_rtld_objmain == NULL)
  			_rtld_die();
 @@ -679,8 +680,7 @@ _rtld(Elf_Addr *sp, Elf_Addr relocbase)
  		assert(pAUX_entry != NULL);
  		entry = (caddr_t) pAUX_entry->a_v;
  		_rtld_objmain = _rtld_digest_phdr(phdr, phnum, entry);
 -		_rtld_objmain->path = xstrdup(argv[0] ? argv[0] :
 -		    "main program");
 +		_rtld_objmain->path = xstrdup(objmain_name);
  		_rtld_objmain->pathlen = strlen(_rtld_objmain->path);
  	}


 --Apple-Mail=_7E2AA65F-A7B7-4570-9B65-7FB4110A2775--

 --Apple-Mail=_AFD46184-104A-4358-AB84-5875042323C1--

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: lib/58865: static and dynamic dl_iterate_phdr disagree on main
	object name
Date: Sat, 30 Nov 2024 19:59:17 +0000

 > Date: Sat, 30 Nov 2024 13:32:17 -0500
 > From: Christos Zoulas <christos@zoulas.com>
 >  
 > FreeBSD does not process AT_SUN_EXECNAME at all.

 FreeBSD does have AT_EXECPATH, with slightly different semantics:

 - Our AT_SUN_EXECNAME is essentially:

 	$(pwd)/${path}

   which will include ./ or symlink components in ${path}, e.g. `cd
   /tmp && ./foo' will pass `/tmp/./foo' and `cd /tmp && ln -s .
   symlink && ./symlink/foo' will pass `/tmp/./symlink/foo'.

 - FreeBSD's AT_EXECPATH is ${path} if absolute, or essentially:

 	$(cd "$(dirname "${path}")" && pwd)/$(basename "${path}")

   (or if that fails, ${path} even if relative), which will usually
   avoid any ./ components and symlink components in ${path} when the
   argument to execve(2) is relative.

 No opinion here about which approach is right and which approach is
 wrong -- just observing what is implemented.
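
The two constructions described above can be approximated in userland as follows. This is a sketch only, not the code of either kernel: the function names are hypothetical, an absolute path is assumed to pass through unchanged in both cases, and realpath(3) stands in for the directory lookup on the FreeBSD side.

```c
#define _XOPEN_SOURCE 700
#include <libgen.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Copy src into buf; return 0 on success, -1 if it does not fit. */
static int
copy_path(char *buf, size_t len, const char *src)
{
	int n = snprintf(buf, len, "%s", src);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * AT_SUN_EXECNAME-like: $(pwd)/${path}, keeping any ./ and symlink
 * components.  Assumption: an absolute path is passed through as-is.
 */
static int
execname_style(const char *path, char *buf, size_t len)
{
	char cwd[PATH_MAX];
	int n;

	if (path[0] == '/')
		return copy_path(buf, len, path);
	if (getcwd(cwd, sizeof(cwd)) == NULL)
		return -1;
	n = snprintf(buf, len, "%s/%s", cwd, path);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * AT_EXECPATH-like: ${path} if absolute, else the resolved directory
 * plus the basename; if resolution fails, ${path} even if relative.
 */
static int
execpath_style(const char *path, char *buf, size_t len)
{
	char dbuf[PATH_MAX], bbuf[PATH_MAX], dir[PATH_MAX];
	const char *base;
	int n;

	if (path[0] == '/')
		return copy_path(buf, len, path);
	/* dirname(3) and basename(3) may modify their argument. */
	if (copy_path(dbuf, sizeof(dbuf), path) == -1 ||
	    copy_path(bbuf, sizeof(bbuf), path) == -1)
		return -1;
	if (realpath(dirname(dbuf), dir) == NULL)
		return copy_path(buf, len, path);
	base = basename(bbuf);
	/* Avoid a doubled slash when the directory is the root. */
	n = snprintf(buf, len, "%s%s%s", dir,
	    dir[1] == '\0' ? "" : "/", base);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For a relative argument such as ./foo, the first function keeps the ./ component while the second drops it, which is the difference described in the two items above.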

 It might be nice if the exact path passed to execve(2) were provided
 in some way -- not necessarily via AT_SUN_EXECNAME -- for the program,
 which can always prepend $(pwd) or filter through dirname/pwd/basename
 if it wants.  As far as I can tell there's no way to get this right
 now since $(pwd) might change between when the kernel adds it and when
 userland might try to strip it from AT_SUN_EXECNAME, and there's no
 guarantee argv[0] has any connection to the execve(2) path.  But the
 execve(2) path is always a suffix of AT_SUN_EXECNAME at the moment.
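
The suffix relationship can be checked with a small string helper. This is a sketch only: has_suffix is a hypothetical name, and a match against argv[0] is merely a heuristic, since argv[0] need not be the path that was passed to execve(2).

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if `suffix` is a trailing
 * substring of `s`, e.g. "./dlx" in "/tmp/riastradh/./dlx".
 */
static bool
has_suffix(const char *s, const char *suffix)
{
	size_t n = strlen(s);
	size_t m = strlen(suffix);

	return m <= n && memcmp(s + n - m, suffix, m) == 0;
}
```

A program could call has_suffix(execname, argv[0]) with the AT_SUN_EXECNAME string as execname to see whether the two are at least consistent. That still does not say how much of the prefix came from $(pwd).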

 > But I think you are right, might as well use the full path for the
 > name.  The Dl_info will be different as returned from dladdr(3), the
 > linkmap entry as returned by dlinfo(3), and the error messages from
 > the linker itself.  It will look non-familiar to the user (why I
 > typed ./foo and the error message is /path/foo?), but I think it is
 > better (since it provides an absolute path) and should not break
 > anything.

 FreeBSD appears to use AT_EXECPATH for these, not argv[0] unless
 AT_EXECPATH is missing -- so it will get the fully resolved path,
 closer to what AT_SUN_EXECNAME gives than argv[0].  (But OpenBSD uses
 argv[0] -- doesn't seem to have anything like AT_EXECPATH or
 AT_SUN_EXECNAME.)
