NetBSD Problem Report #58865
From www@netbsd.org Sat Nov 30 03:31:41 2024
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
client-signature RSA-PSS (2048 bits) client-digest SHA256)
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id D1E001A9238
for <gnats-bugs@gnats.NetBSD.org>; Sat, 30 Nov 2024 03:31:41 +0000 (UTC)
Message-Id: <20241130033140.1F8F91A923E@mollari.NetBSD.org>
Date: Sat, 30 Nov 2024 03:31:40 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: static and dynamic dl_iterate_phdr disagree on main object name
X-Send-Pr-Version: www-1.0
>Number: 58865
>Category: lib
>Synopsis: static and dynamic dl_iterate_phdr disagree on main object name
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: lib-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Nov 30 03:35:00 +0000 2024
>Last-Modified: Sat Nov 30 20:00:01 +0000 2024
>Originator: Taylor R Campbell
>Release: current, 10, 9, ...
>Organization:
The AT_NETBSD_EXECFOUNDATION
>Environment:
>Description:
Static dl_iterate_phdr (lib/libc/dlfcn/dlfcn_elf.c) gives AT_SUN_EXECNAME as the main object's struct dl_phdr_info::dlpi_name:
174		case AT_SUN_EXECNAME:
175			dlpi_name = (void *)aux->a_v;
176			break;
...
216	phdr_info.dlpi_name = dlpi_name;
217
218	return callback(&phdr_info, sizeof(phdr_info), data);
https://nxr.netbsd.org/xref/src/lib/libc/dlfcn/dlfcn_elf.c?r=1.17#174
Dynamic dl_iterate_phdr (libexec/ld.elf_so/rtld.c) instead gives argv[0] as the main object's struct dl_phdr_info::dlpi_name:
682		_rtld_objmain->path = xstrdup(argv[0] ? argv[0] :
683		    "main program");
...
1467	/* XXX: wrong but not fixing it yet */
1468	phdr_info->dlpi_name = obj->path;
https://nxr.netbsd.org/xref/src/libexec/ld.elf_so/rtld.c?r=1.217#682
ld.elf_so does read out AT_SUN_EXECNAME, but only uses it for $ORIGIN.
It is not a priori clear which one is correct, but I lean toward AT_SUN_EXECNAME, since there is otherwise no way to obtain it without going through the undocumented _dlauxinfo().
>How-To-Repeat:
$ pwd
/tmp/riastradh
$ cat dlx.c
#include <dlfcn.h>
#include <elf.h>
#include <errno.h>
#include <link.h>
#include <stdio.h>

static int
callback(struct dl_phdr_info *dlpi, size_t size, void *cookie)
{

	printf("dl_iterate_phdr name=%s\n", dlpi->dlpi_name);
	return 1;
}

int
main(void)
{
	const AuxInfo *aux;

	for (aux = _dlauxinfo(); aux->a_type != AT_NULL; aux++) {
		switch (aux->a_type) {
		case AT_SUN_EXECNAME:
			printf("AT_SUN_EXECNAME=%s\n", (char *)aux->a_v);
			break;
		}
	}
	dl_iterate_phdr(&callback, NULL);
	return 0;
}
$ rm -f dlx && make dlx DBG=-g\ -O2\ -Wall\ -Werror && ./dlx
cc -g -O2 -Wall -Werror -o dlx dlx.c
AT_SUN_EXECNAME=/tmp/riastradh/./dlx
dl_iterate_phdr name=./dlx
$ rm -f dlx && make dlx DBG=-g\ -O2\ -Wall\ -Werror\ -static && ./dlx
cc -g -O2 -Wall -Werror -static -o dlx dlx.c
AT_SUN_EXECNAME=/tmp/riastradh/./dlx
dl_iterate_phdr name=/tmp/riastradh/./dlx
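The test program above depends on the callback returning nonzero, which makes dl_iterate_phdr stop after the first object it visits (expected to be the main program on the systems discussed here). A minimal sketch of that pattern, written to capture the name rather than print it; struct first_object, record_first and first_object_name are hypothetical names, and the _GNU_SOURCE define is only an assumption to make the same code build against glibc:

```c
#define _GNU_SOURCE	/* assumption: glibc declares dl_iterate_phdr only with this */

#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record of what the first callback invocation saw. */
struct first_object {
	int calls;		/* number of callback invocations */
	char name[1024];	/* copy of dlpi_name, possibly truncated */
};

static int
record_first(struct dl_phdr_info *dlpi, size_t size, void *cookie)
{
	struct first_object *fo = cookie;

	(void)size;
	fo->calls++;
	snprintf(fo->name, sizeof(fo->name), "%s",
	    dlpi->dlpi_name != NULL ? dlpi->dlpi_name : "");
	return 1;	/* nonzero: stop after the first object */
}

/* Fill *fo from the first object visited; returns the callback's value. */
int
first_object_name(struct first_object *fo)
{
	assert(fo != NULL);
	fo->calls = 0;
	fo->name[0] = '\0';
	return dl_iterate_phdr(record_first, fo);
}
```

Comparing fo.name from a static and a dynamic build of the same program would show the same disagreement as the two runs above.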
>Fix:
Yes, please!
>Audit-Trail:
From: Christos Zoulas <christos@zoulas.com>
To: gnats-bugs@netbsd.org
Cc: lib-bug-people@netbsd.org,
gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: lib/58865: static and dynamic dl_iterate_phdr disagree on main
object name
Date: Sat, 30 Nov 2024 13:32:17 -0500
FreeBSD does not process AT_SUN_EXECNAME at all. But I think you are
right, might as well use the full path for the name. The Dl_info will be
different as returned from dladdr(3), the linkmap entry as returned by
dlinfo(3), and the error messages from the linker itself. It will look
non-familiar to the user (why I typed ./foo and the error message is
/path/foo?), but I think it is better (since it provides an absolute
path) and should not break anything.
christos
Attachment: objmain_name.diff
? align.cc
? elf_hash.c
? o
? symbol.c.debug
Index: rtld.c
===================================================================
RCS file: /cvsroot/src/libexec/ld.elf_so/rtld.c,v
retrieving revision 1.217
diff -u -p -u -r1.217 rtld.c
--- rtld.c 19 Jan 2024 19:21:34 -0000 1.217
+++ rtld.c 30 Nov 2024 18:23:08 -0000
@@ -467,7 +467,7 @@ _rtld(Elf_Addr *sp, Elf_Addr relocbase)
 	bool bind_now = 0;
 	const char *ld_bind_now, *ld_preload, *ld_library_path;
 	const char **argv;
-	const char *execname;
+	const char *execname, *objmain_name;
 	long argc;
 	const char **real___progname;
 	const Obj_Entry **real___mainprog_obj;
@@ -656,11 +656,12 @@ _rtld(Elf_Addr *sp, Elf_Addr relocbase)
 	 * Load the main program, or process its program header if it is
 	 * already loaded.
 	 */
+	objmain_name = execname ? execname :
+	    (argv[0] ? argv[0] : "main program");
 	if (pAUX_execfd != NULL) {	/* Load the main program. */
 		int fd = pAUX_execfd->a_v;
-		const char *obj_name = argv[0] ? argv[0] : "main program";
 		dbg(("loading main program"));
-		_rtld_objmain = _rtld_map_object(obj_name, fd, NULL);
+		_rtld_objmain = _rtld_map_object(objmain_name, fd, NULL);
 		close(fd);
 		if (_rtld_objmain == NULL)
 			_rtld_die();
@@ -679,8 +680,7 @@ _rtld(Elf_Addr *sp, Elf_Addr relocbase)
 		assert(pAUX_entry != NULL);
 		entry = (caddr_t) pAUX_entry->a_v;
 		_rtld_objmain = _rtld_digest_phdr(phdr, phnum, entry);
-		_rtld_objmain->path = xstrdup(argv[0] ? argv[0] :
-		    "main program");
+		_rtld_objmain->path = xstrdup(objmain_name);
 		_rtld_objmain->pathlen = strlen(_rtld_objmain->path);
 	}
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: lib/58865: static and dynamic dl_iterate_phdr disagree on main
object name
Date: Sat, 30 Nov 2024 19:59:17 +0000
> Date: Sat, 30 Nov 2024 13:32:17 -0500
> From: Christos Zoulas <christos@zoulas.com>
>
> FreeBSD does not process AT_SUN_EXECNAME at all.
FreeBSD does have AT_EXECPATH, with slightly different semantics:

- Our AT_SUN_EXECNAME is essentially:

	$(pwd)/${path}

  which will include ./ or symlink components in ${path}, e.g. `cd
  /tmp && ./foo' will pass `/tmp/./foo' and `cd /tmp && ln -s .
  symlink && ./symlink/foo' will pass `/tmp/./symlink/foo'.

- FreeBSD's AT_EXECPATH is ${path} if absolute, or essentially:

	$(cd "$(dirname "${path}")" && pwd)/$(basename "${path}")

  (or if that fails, ${path} even if relative), which will usually
  avoid any ./ components and symlink components in ${path} when the
  argument to execve(2) is relative.
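
The two constructions above can be imitated in userland to compare their results. This is only a sketch, not the kernel code of either system: the function names are hypothetical, realpath(3) stands in for the `cd ... && pwd' step, and passing an absolute ${path} through unchanged in the NetBSD-like case is an assumption.

```c
#define _XOPEN_SOURCE 700	/* realpath(3), strdup(3) */

#include <assert.h>
#include <libgen.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Join dir and name with exactly one '/' between them; caller frees. */
char *
join_path(const char *dir, const char *name)
{
	size_t dlen = strlen(dir);
	const char *sep = (dlen > 0 && dir[dlen - 1] == '/') ? "" : "/";
	size_t n = dlen + strlen(sep) + strlen(name) + 1;
	char *out = malloc(n);

	if (out != NULL)
		snprintf(out, n, "%s%s%s", dir, sep, name);
	return out;
}

/*
 * NetBSD-like: $(pwd)/${path}, keeping "./" and symlink components.
 * Assumption: an absolute ${path} is passed through unchanged.
 */
char *
execname_netbsd_like(const char *cwd, const char *path)
{
	assert(cwd != NULL && path != NULL);
	return path[0] == '/' ? strdup(path) : join_path(cwd, path);
}

/*
 * FreeBSD-like: ${path} if absolute, otherwise the resolved directory
 * of ${path} joined with its last component; if resolution fails,
 * ${path} as given.
 */
char *
execpath_freebsd_like(const char *path)
{
	char *dirbuf, *basebuf, *dir, *out;

	assert(path != NULL);
	if (path[0] == '/')
		return strdup(path);
	dirbuf = strdup(path);	/* dirname(3) and basename(3) may modify */
	basebuf = strdup(path);	/* their argument, so work on copies */
	if (dirbuf == NULL || basebuf == NULL) {
		free(dirbuf);
		free(basebuf);
		return NULL;
	}
	dir = realpath(dirname(dirbuf), NULL);
	out = dir != NULL ? join_path(dir, basename(basebuf)) : strdup(path);
	free(dir);
	free(dirbuf);
	free(basebuf);
	return out;
}
```

With cwd "/tmp" and path "./foo", the NetBSD-like function yields "/tmp/./foo" as in the example above, while the FreeBSD-like function yields a path with no "./" component.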
No opinion here about which approach is right and which approach is
wrong -- just observing what is implemented.
It might be nice if the exact path passed to execve(2) were provided
to the program in some way -- not necessarily via AT_SUN_EXECNAME --
so that the program can prepend $(pwd) or filter it through
dirname/pwd/basename itself if it wants. As far as I can tell there is
currently no way to get it, since $(pwd) might change between when the
kernel adds it and when userland might try to strip it from
AT_SUN_EXECNAME, and there's no guarantee argv[0] has any connection
to the execve(2) path. But the execve(2) path is always a suffix of
AT_SUN_EXECNAME at the moment.
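
That suffix relationship can be checked mechanically when both strings are in hand. A minimal sketch; the helper name is hypothetical, and requiring the match to start at a path-component boundary is a choice made here, stricter than a plain string suffix:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: report whether `path' (e.g. "./dlx") is a
 * trailing part of `execname' (e.g. "/tmp/riastradh/./dlx"), starting
 * at a path-component boundary.
 */
bool
execname_ends_with(const char *execname, const char *path)
{
	size_t elen, plen;

	assert(execname != NULL && path != NULL);
	elen = strlen(execname);
	plen = strlen(path);
	if (plen == 0 || plen > elen)
		return false;
	if (strcmp(execname + (elen - plen), path) != 0)
		return false;
	/* Either the whole string matched, or the match starts after '/'. */
	return plen == elen || execname[elen - plen - 1] == '/';
}
```

A true result shows only that the strings are consistent; it does not recover the execve(2) path, since several suffixes of AT_SUN_EXECNAME can pass the check.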
> But I think you are right, might as well use the full path for the
> name. The Dl_info will be different as returned from dladdr(3), the
> linkmap entry as returned by dlinfo(3), and the error messages from
> the linker itself. It will look non-familiar to the user (why I
> typed ./foo and the error message is /path/foo?), but I think it is
> better (since it provides an absolute path) and should not break
> anything.
FreeBSD appears to use AT_EXECPATH for these, not argv[0] unless
AT_EXECPATH is missing -- so it will get the fully resolved path,
closer to what AT_SUN_EXECNAME gives than argv[0]. (But OpenBSD uses
argv[0] -- doesn't seem to have anything like AT_EXECPATH or
AT_SUN_EXECNAME.)
Copyright © 1994-2024
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.