NetBSD Problem Report #55248

From www@netbsd.org  Sat May  9 00:26:31 2020
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 6A3151A9213
	for <gnats-bugs@gnats.NetBSD.org>; Sat,  9 May 2020 00:26:31 +0000 (UTC)
Message-Id: <20200509002630.3C5901A921E@mollari.NetBSD.org>
Date: Sat,  9 May 2020 00:26:30 +0000 (UTC)
From: tnn@nygren.pp.se
Reply-To: tnn@nygren.pp.se
To: gnats-bugs@NetBSD.org
Subject: ld.elf_so on 9.0/aarch64 might need reduced optimization for rtld.c
X-Send-Pr-Version: www-1.0

>Number:         55248
>Category:       port-evbarm
>Synopsis:       ld.elf_so on 9.0/aarch64 might need reduced optimization for rtld.c
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-evbarm-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat May 09 00:30:00 +0000 2020
>Closed-Date:    
>Last-Modified:  Mon May 16 13:28:44 +0000 2022
>Originator:     Tobias Nygren
>Release:        9.0_STABLE
>Organization:
>Environment:
>Description:
pkgsrc/lang/openjdk11 segfaults on NetBSD-9.0_STABLE-aarch64.
If /libexec/ld.elf_so is replaced with a copy from -current, it works fine.

This might be due to the different GCC major versions (7.4.0 on 9.0 vs 8.4.0 on -current).
ld.elf_so is built with optimization level -O3.
I suspect GCC 7 generates bad code for rtld.c at that level.

>How-To-Repeat:
run "java -version"
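
A later follow-up in this report describes the aarch64 crash as a random segfault, so a single run may not trigger it. The following is a minimal sh sketch for repeating a command and counting the failed runs. The helper name `count_failures` and the run count in the example are illustrative assumptions, not part of the original report.

```sh
#!/bin/sh
# count_failures N CMD [ARGS...]
# Runs CMD N times and prints how many runs ended with a non-zero
# exit status. A process killed by a signal (such as SIGSEGV) is
# reported by the shell as a non-zero status, so crashes are counted.
count_failures() {
	runs=$1; shift
	fails=0
	i=0
	while [ "$i" -lt "$runs" ]; do
		"$@" >/dev/null 2>&1 || fails=$((fails + 1))
		i=$((i + 1))
	done
	echo "$fails"
}

# Example (hypothetical run count):
#   count_failures 20 java -version
```

A result of 0 after many runs with the replacement ld.elf_so, against a non-zero count with the stock one, would support the observation above more strongly than a single run.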

>Fix:
Not sure how to request this change for NetBSD-9. It should not be applied to -current.

--- libexec/ld.elf_so/Makefile  26 Nov 2019 08:12:26 -0000      1.141.2.1
+++ libexec/ld.elf_so/Makefile  8 May 2020 20:46:00 -0000
@@ -129,6 +129,10 @@ COPTS.symbol.c+=-Wno-stack-protector
 COPTS.rtld.c+= -O0
 .endif

+.if ${MACHINE_CPU} == "aarch64"
+COPTS.rtld.c+= -O2
+.endif
+

>Release-Note:

>Audit-Trail:
From: "Tobias Nygren" <tnn@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55248 CVS commit: pkgsrc/lang/openjdk11
Date: Sat, 9 May 2020 00:55:44 +0000

 Module Name:	pkgsrc
 Committed By:	tnn
 Date:		Sat May  9 00:55:44 UTC 2020

 Modified Files:
 	pkgsrc/lang/openjdk11: bootstrap.mk distinfo

 Log Message:
 openjdk11: enable support for NetBSD-*-aarch64. Add bootstrap binaries.

 Only works on -current. See PR port-evbarm/55248 for 9.0 caveats.


 To generate a diff of this commit:
 cvs rdiff -u -r1.3 -r1.4 pkgsrc/lang/openjdk11/bootstrap.mk
 cvs rdiff -u -r1.15 -r1.16 pkgsrc/lang/openjdk11/distinfo

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Tobias Nygren" <tnn@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55248 CVS commit: pkgsrc/lang
Date: Sun, 15 May 2022 22:44:24 +0000

 Module Name:	pkgsrc
 Committed By:	tnn
 Date:		Sun May 15 22:44:24 UTC 2022

 Modified Files:
 	pkgsrc/lang/openjdk11: bootstrap.mk
 	pkgsrc/lang/openjdk17: bootstrap.mk

 Log Message:
 openjdk11 & 17: preemptively set PKG_FAIL_REASON when PR 55248 applies

 That is, NetBSD before 9.98.83. This is around when we switched to GCC 10,
 which is suspected to be related.


 To generate a diff of this commit:
 cvs rdiff -u -r1.4 -r1.5 pkgsrc/lang/openjdk11/bootstrap.mk
 cvs rdiff -u -r1.4 -r1.5 pkgsrc/lang/openjdk17/bootstrap.mk

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
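
The commit above gates the packages on "NetBSD before 9.98.83". The actual condition in bootstrap.mk is not shown in this report; the following is only a shell sketch of the same version comparison, useful for checking a given host by hand. The helper name `netbsd_before` is hypothetical, and the sketch assumes release strings of the form printed by `uname -r` (for example `9.0_STABLE` or `9.99.96`).

```sh
#!/bin/sh
# netbsd_before REF VER
# Succeeds if release string VER is strictly older than REF.
# Any suffix starting at the first underscore (_STABLE, _BETA, ...)
# is ignored; missing numeric fields are treated as 0.
netbsd_before() {
	awk -v ref="${1%%_*}" -v ver="${2%%_*}" 'BEGIN {
		split(ref, r, "."); split(ver, v, ".")
		for (i = 1; i <= 3; i++) {
			if (v[i] + 0 < r[i] + 0) exit 0
			if (v[i] + 0 > r[i] + 0) exit 1
		}
		exit 1	# equal versions are not "before"
	}'
}

# Example:
#   netbsd_before 9.98.83 "$(uname -r)" && echo "PR 55248 may apply"
```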

State-Changed-From-To: open->feedback
State-Changed-By: jmcneill@NetBSD.org
State-Changed-When: Mon, 16 May 2022 10:21:08 +0000
State-Changed-Why:
You can send your patch to pullup-9@ - see https://www.netbsd.org/developers/releng/pullups.html


From: Tobias Nygren <tnn@nygren.pp.se>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-evbarm/55248
Date: Mon, 16 May 2022 15:05:16 +0200

 On NetBSD 9.2 earmv6hf, ld.elf_so is also broken, but the error is
 different. Instead of the random late-game segfault we get on aarch64,
 we consistently get:

 ./work/bootstrap/bin/java -version
 Error: dl failure on line 562
 Error: failed /work/bootstrap/lib/server/libjvm.so, because /work/bootstrap/lib/server/libjvm.so: Shared object has no run-time symbol table

 But there is nothing obviously wrong with the solib.
 Again, dropping in ld.elf_so from -current solves the problem.
 This should be easier to debug. With luck it is the same issue.
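
One way to back up the statement that the shared object itself looks sane is to list its dynamic section and confirm that the symbol table and hash entries are present. The following is a minimal sketch, assuming binutils `readelf` is available. The helper name `has_dyn_tag` is hypothetical, and which dynamic tags ld.elf_so actually requires before printing this error is not established in this report.

```sh
#!/bin/sh
# has_dyn_tag TAG
# Reads `readelf -d` output on stdin and succeeds if a dynamic entry
# of type TAG (for example SYMTAB, STRTAB, HASH, GNU_HASH) is listed.
# The tag is matched literally including the parentheses, so HASH
# does not match a GNU_HASH line.
has_dyn_tag() {
	grep -qF "($1)"
}

# Example (path taken from the error message above):
#   for t in SYMTAB STRTAB HASH GNU_HASH; do
#       if readelf -d /work/bootstrap/lib/server/libjvm.so | has_dyn_tag "$t"; then
#           echo "$t present"
#       else
#           echo "$t missing"
#       fi
#   done
```

If the same file shows identical entries on both systems, that points at the loader rather than the library, which matches the observation that the -current ld.elf_so loads it.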

State-Changed-From-To: feedback->open
State-Changed-By: tnn@NetBSD.org
State-Changed-When: Mon, 16 May 2022 13:28:44 +0000
State-Changed-Why:
I'll reopen this since more platforms are affected and IIRC joerg
objected off-list to lowering the optimization level without
bisecting which function gets miscompiled.
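
The bisection asked for here can be sketched as a binary search over candidate function names. The following sh sketch is not taken from any NetBSD tooling. It assumes a single culprit, a deterministic test, and a caller-supplied test command (hypothetical) that rebuilds ld.elf_so with only the listed functions at reduced optimization, for instance via GCC's `optimize` function attribute, and succeeds when the failure disappears.

```sh
#!/bin/sh
# bisect_culprit TEST NAME...
# TEST is a command that receives a list of candidate names and
# succeeds when the culprit is among them (the failure goes away
# with those candidates de-optimized). Prints the single remaining
# candidate. Names must not contain whitespace.
bisect_culprit() {
	test_cmd=$1; shift
	while [ "$#" -gt 1 ]; do
		half=$(( $# / 2 ))
		first=
		i=0
		# Collect the first half of the remaining candidates.
		for name in "$@"; do
			[ "$i" -lt "$half" ] && first="$first $name"
			i=$((i + 1))
		done
		if $test_cmd $first; then
			set -- $first		# culprit is in the first half
		else
			shift "$half"		# culprit is in the second half
		fi
	done
	echo "$1"
}

# Example (try_with_O0 is a hypothetical rebuild-and-test script):
#   bisect_culprit try_with_O0 $(cat candidate_functions.txt)
```

The consistent earmv6hf failure is the better target for this, since the random aarch64 segfault would need many runs per step to give a trustworthy answer.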


>Unformatted:
