NetBSD Problem Report #59333
From martin@aprisoft.de Sun Apr 20 10:29:29 2025
Return-Path: <martin@aprisoft.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
key-exchange X25519 server-signature RSA-PSS (2048 bits)
client-signature RSA-PSS (2048 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id B1C601A9245
for <gnats-bugs@gnats.NetBSD.org>; Sun, 20 Apr 2025 10:29:29 +0000 (UTC)
Message-Id: <20250420102919.AB1C05CC7A2@emmas.aprisoft.de>
Date: Sun, 20 Apr 2025 12:29:19 +0200 (CEST)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: perl install fails with sparc userland on sparc64 kernel
X-Send-Pr-Version: 3.95
>Number: 59333
>Category: port-sparc
>Synopsis: perl install fails with sparc userland on sparc64 kernel
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: port-sparc-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Apr 20 10:30:01 +0000 2025
>Last-Modified: Fri May 02 19:40:00 +0000 2025
>Originator: Martin Husemann
>Release: NetBSD 10.99.14
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD nelly.aprisoft.de 10.99.14 NetBSD 10.99.14 (NELLY) #82: Sat Apr 19 18:41:20 CEST 2025 martin@seven-days-to-the-wolves.aprisoft.de:/work/src/sys/arch/sparc64/compile/NELLY sparc
Architecture: sparc
Machine: sparc
>Description:
Running a -current sparc userland on a -current sparc64 kernel, trying to
build & install pkgsrc/lang/perl5 fails during the install phase with:
LD_LIBRARY_PATH=/usr/pkgobj/lang/perl5/work/perl-5.40.2 ./miniperl -Ilib make_ext.pl dist/Thread-Semaphore/pm_to_blib MAKE="/usr/bin/make" LIBPERL_A=libperl.so
*** Signal 11
No useful backtrace, but it tries loading a halfword from a properly aligned
address that is not mapped - guessing something went wrong with TLS.
/usr/tests/libexec/ld.elf_so tests all work fine:
Summary for 15 test programs:
56 passed test cases.
0 failed test cases.
0 expected failed test cases.
0 skipped test cases.
same for /usr/tests/lib/libc/tls:
Summary for 3 test programs:
3 passed test cases.
0 failed test cases.
0 expected failed test cases.
0 skipped test cases.
Core was generated by `miniperl'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00042028 in perl_run ()
(gdb) bt
#0 0x00042028 in perl_run ()
#1 0x001cfd7c in main ()
(gdb) x/16i $pc
=> 0x42028 <perl_run+996>: lduh [ %g3 + 0x66 ], %g1
0x4202c <perl_run+1000>: st %g2, [ %g3 + 0x208 ]
0x42030 <perl_run+1004>: b 0x42040 <perl_run+1020>
0x42034 <perl_run+1008>: sth %g1, [ %fp + -10 ]
0x42038 <perl_run+1012>: call 0x176fdc <Perl_pop_scope>
(gdb) p/x $g3
$1 = 0x20000000
(gdb) x/x $g3
0x20000000: Cannot access memory at address 0x20000000
(gdb) x/16i $pc-48
0x41ff8 <perl_run+948>: cmp %o2, 0
0x41ffc <perl_run+952>: be 0x41dd8 <perl_run+404>
0x42000 <perl_run+956>: mov 3, %g1
0x42004 <perl_run+960>: ld [ %fp + -88 ], %o1
0x42008 <perl_run+964>: mov %g2, %o0
0x4200c <perl_run+968>: call 0x3bf7c <Perl_call_list>
0x42010 <perl_run+972>: st %g1, [ %g2 + 0x518 ]
0x42014 <perl_run+976>: b 0x41ddc <perl_run+408>
0x42018 <perl_run+980>: ld [ %fp + -84 ], %g3
0x4201c <perl_run+984>: add %fp, -80, %g2
0x42020 <perl_run+988>: st %o0, [ %fp + -16 ]
0x42024 <perl_run+992>: clrb [ %fp + -12 ]
=> 0x42028 <perl_run+996>: lduh [ %g3 + 0x66 ], %g1
0x4202c <perl_run+1000>: st %g2, [ %g3 + 0x208 ]
0x42030 <perl_run+1004>: b 0x42040 <perl_run+1020>
0x42034 <perl_run+1008>: sth %g1, [ %fp + -10 ]
The issue is reproducible. Will try to reproduce on 32bit sparc hardware.
>How-To-Repeat:
s/a
>Fix:
n/a
>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-sparc/59333: perl install fails with sparc userland on
sparc64 kernel
Date: Sun, 20 Apr 2025 16:21:45 +0200
Rebuilding with symbols I get lots of crashes like this:
Program terminated with signal SIGBUS, Bus error.
#0 0x00042330 in perl_run (my_perl=0x293a2000) at perl.c:2773
2773 JMPENV_PUSH(ret);
(gdb) bt
#0 0x00042330 in perl_run (my_perl=0x293a2000) at perl.c:2773
#1 0x001d6884 in main (argc=<optimized out>, argv=<optimized out>,
env=<optimized out>) at miniperlmain.c:133
(gdb) x/16i $pc-16
0x42320 <perl_run+1260>: mov 2, %g1
0x42324 <perl_run+1264>: st %g1, [ %fp + -16 ]
0x42328 <perl_run+1268>: add %fp, -80, %g1
0x4232c <perl_run+1272>: ld [ %fp + -84 ], %g2
=> 0x42330 <perl_run+1276>: st %g1, [ %g2 + 0x208 ]
0x42334 <perl_run+1280>: clrb [ %fp + -12 ]
0x42338 <perl_run+1284>: lduh [ %g2 + 0x66 ], %g1
0x4233c <perl_run+1288>: b 0x42138 <perl_run+772>
0x42340 <perl_run+1292>: sth %g1, [ %fp + -10 ]
0x42344 <perl_run+1296>: call 0x2f2384 <__stack_chk_fail@got.plt>
0x42348 <perl_run+1300>: nop
0x4234c <perl_run+1304>: nop
0x42350 <Perl_my_failure_exit>: save %sp, -96, %sp
0x42354 <Perl_my_failure_exit+4>: sethi %hi(0x2ad000), %l7
0x42358 <Perl_my_failure_exit+8>:
call 0x1d6a90 <__sparc_get_pc_thunk.l7>
(gdb) p/x $g2
$1 = 0x1
... which looks like a Perl problem - will dig deeper.
Martin
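
The two crashes reported so far differ in signal (SIGSEGV in the stripped
binary, SIGBUS in the build with symbols). A minimal sketch of the address
arithmetic behind both faulting instructions may help when reading the
disassembly. The helper names below are made up for illustration and are not
part of perl or NetBSD; the register values are the ones printed by gdb above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Effective address of a SPARC "[reg + imm]" operand in a 32-bit process. */
static uint32_t effective_address(uint32_t base, int32_t imm)
{
    return base + (uint32_t)imm;   /* wraps modulo 2^32 */
}

/* True when addr is a multiple of the access size (size must be a power of two). */
static bool naturally_aligned(uint32_t addr, uint32_t size)
{
    return (addr & (size - 1u)) == 0;
}
```

With %g3 = 0x20000000, the halfword load `lduh [ %g3 + 0x66 ]` targets
0x20000066, which is 2-byte aligned, so the fault is due to the unmapped page.
With %g2 = 0x1, the word store `st %g1, [ %g2 + 0x208 ]` targets 0x209, which
is not 4-byte aligned, which is consistent with the SIGBUS. In both cases the
pointer was loaded from [ %fp + -84 ].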
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-sparc/59333: perl install fails with sparc userland on
sparc64 kernel
Date: Sun, 20 Apr 2025 20:12:21 +0200
The same perl pkg built & installed just fine on 32bit hardware.
Martin
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-sparc/59333: perl install fails with sparc userland on
sparc64 kernel
Date: Wed, 30 Apr 2025 20:59:28 +0200
I have seen core dumps from other programs (e.g. /bin/sh) in this environment
too, in the "restore" at the end of _longjmp(). There must be something
strange going on here, maybe something is broken in the window fill handling
for 32bit stacks.
Martin
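
If the window fill path for 32bit stacks is the suspect, a small stress loop
that recurses deeper than the number of hardware register windows and then
jumps back out might be one way to exercise it outside of perl or sh. The
following is only a sketch under that assumption: it is not a known
reproducer, the function names are hypothetical, and it uses plain
setjmp/longjmp rather than the _longjmp variant seen in the core dumps.

```c
#include <setjmp.h>

static jmp_buf env;

/* Recurse `depth` frames down, then jump back to the setjmp site. */
static void dive(volatile int depth)
{
    volatile char pad[64];          /* give every frame some stack to own */

    pad[0] = (char)depth;
    if (depth == 0)
        longjmp(env, 1);            /* unwinds across all frames at once */
    dive(depth - 1);
    pad[1] = pad[0];                /* work after the call, so no tail call */
}

/* Perform `rounds` setjmp/longjmp round trips; returns the number completed. */
static int stress_longjmp(int rounds, int depth)
{
    volatile int done = 0;          /* volatile: must survive the longjmp */

    while (done < rounds) {
        if (setjmp(env) == 0)
            dive(depth);            /* comes back via longjmp, not a return */
        done = done + 1;
    }
    return done;
}
```

Calling `stress_longjmp(1000, 32)` should return 1000 on a healthy system; a
crash in the restore path would show up as a signal instead.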
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-sparc/59333: perl install fails with sparc userland on
sparc64 kernel
Date: Fri, 2 May 2025 21:35:26 +0200
I added some instrumentation and all failures I see have a bogus return
address (%pc) recorded in jmp_buf and come via a call like this:
(gdb) x/16i 0x000185f0
0x185f0 <evalcommand+5036>: sethi %hi(0x50000), %g1
0x185f4 <evalcommand+5040>: call 0x4f8a0 <__siglongjmp14@got.plt>
0x185f8 <evalcommand+5044>:
st %o0, [ %g1 + 0x280 ] ! 0x50280 <handler>
and the jmp_buf looks like:
(gdb) x/16x 0x2860b580
0x2860b580: 0x00000000 0x86823000 0x284fc000 0x0000e000
0x2860b590: 0x00000000 0x00000000 0x00000000 0x00000015
0x2860b5a0: 0x00000000 0x00000000 0x00000000 0x00000000
0x2860b5b0: 0x2860b580 0x2860b580 0x00000000 0x00000000
where 0x86823000 is the bogus return address.
I would have hoped "savemask" to be 1 and the problem could have been
the sigprocmask implementation for compat_netbsd32, but that doesn't seem
to be the case - digging deeper.
Martin
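
The instrumentation itself is not shown in this report. As an illustration
only, a plausibility check on a saved return address could look like the
sketch below. The function name and the text segment bounds are assumptions
(callers have to supply bounds for the binary in question); the only facts
used are that SPARC instructions are 4 bytes long and word aligned.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: does a saved pc look like it could point into the
 * program text?  text_start/text_end describe the half-open range
 * [text_start, text_end) and must be provided by the caller.
 */
static bool plausible_return_pc(uint32_t pc, uint32_t text_start,
                                uint32_t text_end)
{
    if ((pc & 3u) != 0)             /* instructions are word aligned */
        return false;
    return pc >= text_start && pc < text_end;
}
```

Assuming, for example, a text range of 0x10000 to 0x50000 for the sh binary
above, the call site 0x185f0 passes this check while 0x86823000 does not.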