NetBSD Problem Report #59333

From martin@aprisoft.de  Sun Apr 20 10:29:29 2025
Return-Path: <martin@aprisoft.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits)
	 client-signature RSA-PSS (2048 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id B1C601A9245
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 20 Apr 2025 10:29:29 +0000 (UTC)
Message-Id: <20250420102919.AB1C05CC7A2@emmas.aprisoft.de>
Date: Sun, 20 Apr 2025 12:29:19 +0200 (CEST)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: perl install fails with sparc userland on sparc64 kernel
X-Send-Pr-Version: 3.95

>Number:         59333
>Category:       port-sparc
>Synopsis:       perl install fails with sparc userland on sparc64 kernel
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    port-sparc-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Apr 20 10:30:01 +0000 2025
>Last-Modified:  Fri May 02 19:40:00 +0000 2025
>Originator:     Martin Husemann
>Release:        NetBSD 10.99.14
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD nelly.aprisoft.de 10.99.14 NetBSD 10.99.14 (NELLY) #82: Sat Apr 19 18:41:20 CEST 2025 martin@seven-days-to-the-wolves.aprisoft.de:/work/src/sys/arch/sparc64/compile/NELLY sparc
Architecture: sparc
Machine: sparc
>Description:

Running a -current sparc userland on a -current sparc64 kernel, trying to
build & install pkgsrc/lang/perl5 fails during the install phase with:

LD_LIBRARY_PATH=/usr/pkgobj/lang/perl5/work/perl-5.40.2 ./miniperl -Ilib make_ext.pl dist/Thread-Semaphore/pm_to_blib  MAKE="/usr/bin/make" LIBPERL_A=libperl.so
*** Signal 11

No useful backtrace, but it tries loading a halfword from a properly aligned
address that is not mapped - guessing something went wrong with TLS.

/usr/tests/libexec/ld.elf_so tests all work fine:

Summary for 15 test programs:
    56 passed test cases.
    0 failed test cases.
    0 expected failed test cases.
    0 skipped test cases.

same for /usr/tests/lib/libc/tls:

Summary for 3 test programs:
    3 passed test cases.
    0 failed test cases.
    0 expected failed test cases.
    0 skipped test cases.

Core was generated by `miniperl'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00042028 in perl_run ()
(gdb) bt
#0  0x00042028 in perl_run ()
#1  0x001cfd7c in main ()
(gdb) x/16i $pc
=> 0x42028 <perl_run+996>:      lduh  [ %g3 + 0x66 ], %g1
   0x4202c <perl_run+1000>:     st  %g2, [ %g3 + 0x208 ]
   0x42030 <perl_run+1004>:     b  0x42040 <perl_run+1020>
   0x42034 <perl_run+1008>:     sth  %g1, [ %fp + -10 ]
   0x42038 <perl_run+1012>:     call  0x176fdc <Perl_pop_scope>
(gdb) p/x $g3
$1 = 0x20000000
(gdb) x/x $g3
0x20000000:     Cannot access memory at address 0x20000000
(gdb) x/16i $pc-48
   0x41ff8 <perl_run+948>:      cmp  %o2, 0
   0x41ffc <perl_run+952>:      be  0x41dd8 <perl_run+404>
   0x42000 <perl_run+956>:      mov  3, %g1
   0x42004 <perl_run+960>:      ld  [ %fp + -88 ], %o1
   0x42008 <perl_run+964>:      mov  %g2, %o0
   0x4200c <perl_run+968>:      call  0x3bf7c <Perl_call_list>
   0x42010 <perl_run+972>:      st  %g1, [ %g2 + 0x518 ]
   0x42014 <perl_run+976>:      b  0x41ddc <perl_run+408>
   0x42018 <perl_run+980>:      ld  [ %fp + -84 ], %g3
   0x4201c <perl_run+984>:      add  %fp, -80, %g2
   0x42020 <perl_run+988>:      st  %o0, [ %fp + -16 ]
   0x42024 <perl_run+992>:      clrb  [ %fp + -12 ]
=> 0x42028 <perl_run+996>:      lduh  [ %g3 + 0x66 ], %g1
   0x4202c <perl_run+1000>:     st  %g2, [ %g3 + 0x208 ]
   0x42030 <perl_run+1004>:     b  0x42040 <perl_run+1020>
   0x42034 <perl_run+1008>:     sth  %g1, [ %fp + -10 ]


The issue is reproducible. Will try to reproduce on 32bit sparc hardware.

>How-To-Repeat:
See above.

>Fix:
n/a

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-sparc/59333: perl install fails with sparc userland on
 sparc64 kernel
Date: Sun, 20 Apr 2025 16:21:45 +0200

 Rebuilding with symbols I get lots of crashes like this:

 Program terminated with signal SIGBUS, Bus error.
 #0  0x00042330 in perl_run (my_perl=0x293a2000) at perl.c:2773
 2773        JMPENV_PUSH(ret);
 (gdb) bt
 #0  0x00042330 in perl_run (my_perl=0x293a2000) at perl.c:2773
 #1  0x001d6884 in main (argc=<optimized out>, argv=<optimized out>, 
     env=<optimized out>) at miniperlmain.c:133
 (gdb) x/16i $pc-16
    0x42320 <perl_run+1260>:     mov  2, %g1
    0x42324 <perl_run+1264>:     st  %g1, [ %fp + -16 ]
    0x42328 <perl_run+1268>:     add  %fp, -80, %g1
    0x4232c <perl_run+1272>:     ld  [ %fp + -84 ], %g2
 => 0x42330 <perl_run+1276>:     st  %g1, [ %g2 + 0x208 ]
    0x42334 <perl_run+1280>:     clrb  [ %fp + -12 ]
    0x42338 <perl_run+1284>:     lduh  [ %g2 + 0x66 ], %g1
    0x4233c <perl_run+1288>:     b  0x42138 <perl_run+772>
    0x42340 <perl_run+1292>:     sth  %g1, [ %fp + -10 ]
    0x42344 <perl_run+1296>:     call  0x2f2384 <__stack_chk_fail@got.plt>
    0x42348 <perl_run+1300>:     nop 
    0x4234c <perl_run+1304>:     nop 
    0x42350 <Perl_my_failure_exit>:      save  %sp, -96, %sp
    0x42354 <Perl_my_failure_exit+4>:    sethi  %hi(0x2ad000), %l7
    0x42358 <Perl_my_failure_exit+8>:    call  0x1d6a90 <__sparc_get_pc_thunk.l7>
 (gdb) p/x $g2
 $1 = 0x1


 ... which looks like a Perl problem - will dig deeper.

 Martin
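
Both faults shown so far are consistent with plain address arithmetic on the
register values gdb printed. The first is "lduh [ %g3 + 0x66 ]" with
%g3 = 0x20000000: the target is halfword aligned, so an unmapped page
(SIGSEGV) is the plausible cause. The second is "st %g1, [ %g2 + 0x208 ]"
with %g2 = 0x1: the target is odd, so an alignment fault would be expected,
which matches the SIGBUS. A minimal sketch of that check follows; the helper
names are made up for illustration and are not part of perl or NetBSD.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of a SPARC "[ base + imm ]" operand.
 * 32-bit userland, so the sum wraps at 32 bits. */
static uint32_t effective_address(uint32_t base, int32_t imm)
{
    return base + (uint32_t)imm;
}

/* Non-zero if addr is naturally aligned for an access of `size` bytes.
 * size must be a power of two (1, 2, 4 or 8). */
static int is_naturally_aligned(uint32_t addr, uint32_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (size - 1)) == 0;
}
```

For the first crash, effective_address(0x20000000, 0x66) is 0x20000066 and
passes the 2-byte check; for the second, effective_address(0x1, 0x208) is
0x209 and fails the 4-byte check that a word store needs.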

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-sparc/59333: perl install fails with sparc userland on
 sparc64 kernel
Date: Sun, 20 Apr 2025 20:12:21 +0200

 The same perl pkg built & installed just fine on 32bit hardware.

 Martin

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-sparc/59333: perl install fails with sparc userland on
 sparc64 kernel
Date: Wed, 30 Apr 2025 20:59:28 +0200

 I have seen core dumps from other programs (e.g. /bin/sh) in this environment
 too, in the "restore" at the end of _longjmp(). There must be something
 strange going on here, maybe something is broken in the window fill handling
 for 32bit stacks.

 Martin
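
If the window fill path is the suspect, a small stress test that jumps back
across many stack frames might help narrow it down. The following is only a
sketch under assumptions: it uses the standard setjmp()/longjmp() pair rather
than _longjmp(), the function names and the canary value are invented for
illustration, and it has not been shown to trigger the failure described
here.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;

/* Recurse `depth` frames deep, then jump straight back to the setjmp()
 * in unwind_once(). On SPARC each call normally takes a register window,
 * so a deep enough recursion forces windows to be spilled to the stack
 * and the target frame's window has to be refilled after the jump. */
static int dive(int depth)
{
    volatile int pad = depth;       /* keep a real frame per call */

    if (depth == 0)
        longjmp(env, 1);
    return dive(depth - 1) + pad;   /* discourages tail-call optimisation */
}

/* One round trip: returns 1 if this frame's local survived the jump. */
static int unwind_once(int depth)
{
    volatile unsigned canary = 0xC0FFEEu;

    if (setjmp(env) == 0) {
        dive(depth);
        return 0;                   /* dive() never returns normally */
    }
    return canary == 0xC0FFEEu;
}

/* Repeat the round trip; returns the number of failed iterations. */
static int stress(int rounds, int depth)
{
    int failures = 0;

    assert(rounds > 0 && depth >= 0);
    for (int i = 0; i < rounds; i++)
        if (!unwind_once(depth))
            failures++;
    return failures;
}
```

Calling stress(1000, 64) from main() should return 0 on a healthy system; a
crash or a non-zero count in the sparc-on-sparc64 environment would point at
the unwind path rather than at perl or sh.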

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-sparc/59333: perl install fails with sparc userland on
 sparc64 kernel
Date: Fri, 2 May 2025 21:35:26 +0200

 I added some instrumentation and all failures I see have a bogus return
 address (%pc) recorded in jmp_buf and come via a call like this:

 (gdb) x/16i 0x000185f0
    0x185f0 <evalcommand+5036>:  sethi  %hi(0x50000), %g1
    0x185f4 <evalcommand+5040>:  call  0x4f8a0 <__siglongjmp14@got.plt>
    0x185f8 <evalcommand+5044>:  st  %o0, [ %g1 + 0x280 ]    ! 0x50280 <handler>

 and the jmp_buf looks like:

 (gdb) x/16x 0x2860b580
 0x2860b580:     0x00000000      0x86823000      0x284fc000      0x0000e000
 0x2860b590:     0x00000000      0x00000000      0x00000000      0x00000015
 0x2860b5a0:     0x00000000      0x00000000      0x00000000      0x00000000
 0x2860b5b0:     0x2860b580      0x2860b580      0x00000000      0x00000000


 where 0x86823000 is the bogus return address.

 I had hoped "savemask" would be 1, in which case the problem could have been
 the sigprocmask implementation for compat_netbsd32, but that doesn't seem
 to be the case - digging deeper.

 Martin
