NetBSD Problem Report #55837

From www@netbsd.org  Wed Dec  2 08:59:42 2020
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 0A2EA1A9217
	for <gnats-bugs@gnats.NetBSD.org>; Wed,  2 Dec 2020 08:59:42 +0000 (UTC)
Message-Id: <20201202085940.DAD301A923A@mollari.NetBSD.org>
Date: Wed,  2 Dec 2020 08:59:40 +0000 (UTC)
From: rokuyama.rk@gmail.com
Reply-To: rokuyama.rk@gmail.com
To: gnats-bugs@NetBSD.org
Subject: GCC 9.3 g++ generates wrong codes for hard-float arm
X-Send-Pr-Version: www-1.0

>Number:         55837
>Category:       toolchain
>Synopsis:       GCC 9.3 g++ generates wrong codes for hard-float arm
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    toolchain-manager
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Dec 02 09:00:00 +0000 2020
>Last-Modified:  Fri Apr 30 12:35:01 +0000 2021
>Originator:     Rin Okuyama
>Release:        9.99.76
>Organization:
Department of Physics, Meiji University
>Environment:
NetBSD rpi0w 9.99.76 NetBSD 9.99.76 (RPI0EB) #12: Tue Dec  1 11:02:35 JST 2020  rin@latipes:/build/work/src/sys/arch/evbarm/compile/RPI0EB evbarm
>Description:
g++ from GCC 9.3 generates wrong code for hard-float arm variants, i.e.,
earmv[5-7]hf{,eb}. The situation is much more serious on v[5-6]hf{,eb},
but v7hf{,eb} is also affected. On the other hand, the soft-float variants,
earmv[4-7], do not seem to be affected as far as I can see.

For GDB on v[5-6]hf{,eb}, trying to place a breakpoint on a non-existent
symbol causes an abort, whereas v7hf{,eb} handles it correctly:

on v7hf{,eb}:

|(gdb) b non_existent_symbol
|Function "non_existent_symbol" not defined.
|Make breakpoint pending on future shared library load? (y or [n])

on v[5-6]hf{,eb}:

|(gdb) b non_existent_symbol
|terminate called after throwing an instance of 'gdb_exception_RETURN_MASK_ERROR'
|[1]   Abort trap (core dumped) gdb ./hello

Or, a backtrace longer than one screen cannot be quit with ``q'':

on v7hf{,eb}:

|(gdb) bt
|...
|--Type <RET> for more, q to quit, c to continue without paging--q
|Quit
|(gdb)

on v[5-6]hf{,eb}:

|(gdb) bt
|...
|--Type <RET> for more, q to quit, c to continue without paging--q
|terminate called after throwing an instance of 'gdb_exception_RETURN_MASK_QUIT'
|[1]   Abort trap (core dumped) gdb atf-run atf-run.core

Note that even v7hf{,eb} is affected; we already work around a similar
failure in exception handling on v7hf{,eb}. See doc/HACKS:

	https://nxr.netbsd.org/xref/src/doc/HACKS#978

or toolchain/54820:

	http://gnats.netbsd.org/54820

The failures do not occur on soft-float arm variants as far as I can see.

On hard-float arm variants, the failures do not occur if the whole of GDB
is compiled with -O0. Alternatively, compiling GDB with GCC 8.4 mitigates
the problems considerably, although GCC8 is not perfect either; the hack
described above was initially introduced for GCC8.

Even if the other parts of userland (shared libraries, rtld, etc.) are
compiled with GCC8, the problems occur when GDB itself is compiled with GCC9.
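
A small standalone test along the following lines might help to narrow
this down outside of GDB. It is only a sketch and has not been confirmed
to reproduce the failure; all function names are hypothetical. It assumes
that the problem involves values held in callee-saved VFP registers while
an exception unwinds through an optimized frame, which is a guess based on
the symptoms above and not an established cause.

```cpp
// Hypothetical reproducer sketch; none of these names come from GDB or ATF.
#include <stdexcept>

// Kept out of line so that the throw unwinds through real call frames.
__attribute__((noinline)) static void
thrower(bool fail)
{
	if (fail)
		throw std::runtime_error("unwind me");
}

// Holds two doubles live across the throwing call. An optimizing
// hard-float build may keep them in callee-saved VFP registers, which
// this frame then has to save and the unwinder has to restore.
__attribute__((noinline)) static double
middle(double x, bool fail)
{
	double p = x * 2.0, q = x * 3.0;
	thrower(fail);
	return p + q;
}

// Returns true if the exception (when requested) was caught and the
// floating-point values that were live across the call are intact.
bool
fp_survives_unwind(bool fail)
{
	volatile double seed = 1.5;	// volatile: defeat constant folding
	double a = seed * 2.0, b = seed * 3.0;
	bool caught = false;

	try {
		// middle() returns 3.0 + 4.5 when it does not throw.
		a += middle(seed, fail) - 7.5;
	} catch (const std::runtime_error &) {
		caught = true;
	}
	return caught == fail && a == 3.0 && b == 4.5;
}
```

Calling fp_survives_unwind(true) and fp_survives_unwind(false) from main()
and comparing builds with -O0 and -O2 on earmv6hf and earmv6 would show
whether this pattern alone is enough to trigger the abort.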

For ATF on v[5-6]hf{,eb}, atf-run aborts like this:

| # atf-run | atf-report
| ...
| atf-c/detail/fs_test (3/68): 24 test cases
|    eaccess: [0.021425s] Passed.
|    exists: terminate called after throwing an instance of 'tools::parser::parse_error'
|  what():  LONELY PARSE ERROR: 237: Unexpected token `<<EOF>>'; expected end of
|test case or test case's stdout/stderr line
|[1]   Segmentation fault (core dumped) atf-run |
|      Abort trap (core dumped) atf-report
| #

Many tests in /usr/tests/atf fail due to ``Failed: Test program received
signal 6 (core dumped)''.

These failures do not occur on soft-float variants. They also do not occur
if all ATF sources are compiled with -O0.
>How-To-Repeat:
On earmv[5-6]hf{,eb} machines,

# gdb hello
...
(gdb) b non-existent-symbol

or

# cd /usr/tests/atf && atf-run | atf-report

The problems also occur for COMPAT_NETBSD32 on aarch64{,eb}.
>Fix:
Exception handling is suspect, but I still haven't figured out what
actually happens. Note that while other OSes use ARM-specific exception
handling, NetBSD uses normal DWARF exception handling based on libunwind.
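
One way to separate the unwinder's frame walk from the C++ personality
routine might be to call the unwinder directly, without throwing anything.
The following is a sketch under that assumption; count_frame() and
count_frames() are hypothetical names, and it relies only on
_Unwind_Backtrace() as declared in <unwind.h>.

```cpp
// Hypothetical sketch: walk the stack with the unwinder alone, so that
// no C++ exception object or personality routine is involved.
#include <unwind.h>

// Callback invoked once per frame; arg points to the running count.
static _Unwind_Reason_Code
count_frame(struct _Unwind_Context *, void *arg)
{
	++*static_cast<int *>(arg);
	return _URC_NO_REASON;	// keep walking
}

// Returns the number of frames the unwinder could step through,
// starting from this function.
__attribute__((noinline)) int
count_frames()
{
	int n = 0;

	_Unwind_Backtrace(count_frame, &n);
	return n;
}
```

If count_frames() returns a plausible number on earmv[5-6]hf{,eb} while
throwing still aborts, the frame walk itself is less likely to be at fault
than the personality routine or the register restore at the landing pad.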

>Audit-Trail:
From: Rin Okuyama <rokuyama.rk@gmail.com>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@NetBSD.org>
Cc: 
Subject: Re: toolchain/55837 (GCC 9.3 g++ generates wrong codes for hard-float
 arm)
Date: Fri, 30 Apr 2021 21:29:36 +0900

 I tried compiling sys/lib/libunwind with VFP instructions forcibly
 disabled, using this patch:

 http://www.netbsd.org/~rin/arm_unwind_20210430.patch

 Unfortunately, the situation does not change...

 Note that a patch for GCC is necessary because -mgeneral-regs-only does
 not work for some C++ headers, for example:

 ----
 /build/gcc9/dest/evbarm-earmv6hfeb/usr/include/g++/bits/std_abs.h: In function 'constexpr double std::abs(double)':
 /build/gcc9/dest/evbarm-earmv6hfeb/usr/include/g++/bits/std_abs.h:71:17: error: argument of type 'double' not permitted with -mgeneral-regs-only
     71 |   abs(double __x)
 ----

 We may need more -mgeneral-regs-only for libstdc++, cf.:

 ---
 % find /usr/src/external/gpl3/gcc/dist/libstdc++-v3 -name '*.cc' | \
 xargs grep general-regs-only
 /usr/src/external/gpl3/gcc/dist/libstdc++-v3/libsupc++/eh_personality.cc:__attribute__((target ("general-regs-only")))
 /usr/src/external/gpl3/gcc/dist/libstdc++-v3/libsupc++/eh_personality.cc:__attribute__((target ("general-regs-only")))
 ---

 Thanks,
 rin
