NetBSD Problem Report #56839

From www@netbsd.org  Mon May 16 15:13:18 2022
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id CA1881A921F
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 16 May 2022 15:13:18 +0000 (UTC)
Message-Id: <20220516151317.5F00B1A923A@mollari.NetBSD.org>
Date: Mon, 16 May 2022 15:13:17 +0000 (UTC)
From: rokuyama.rk@gmail.com
Reply-To: rokuyama.rk@gmail.com
To: gnats-bugs@NetBSD.org
Subject: GCC emits wrong codes for compare_and_swap_1 bultins on armv5 (el & eb)
X-Send-Pr-Version: www-1.0

>Number:         56839
>Category:       port-arm
>Synopsis:       GCC emits wrong codes for compare_and_swap_1 bultins on armv5 (el & eb)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-arm-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon May 16 15:15:00 +0000 2022
>Last-Modified:  Thu Mar 13 14:30:01 +0000 2025
>Originator:     Rin Okuyama
>Release:        9.99.96
>Organization:
Department of Physics, Meiji University
>Environment:
NetBSD hdlg 9.99.96 NetBSD 9.99.96 (HDL_G) #5: Sun May 15 19:24:35 JST 2022  rin@latipes:/build/src/sys/arch/evbarm/compile/HDL_G evbarm
>Description:
Tests for __sync_{bool,val}_compare_and_swap_1() fail on armv5:

https://www.netbsd.org/~martin/evbarm-atf/402_atf.html#lib_libc_atomic_t___sync_compare_and_swap___sync_bool_compare_and_swap_1

These failures are observed both for little- and big-endian.

This is because GCC emits wrong codes for these built-in functions.
For some cases, GCC uses "mvn" insn to generate 2nd ("expected") and
3rd ("new") arguments for these builtins.

For example, it emits "mvn reg,#0x77" for 0x88, which results in
0xffffff88. This is wrong as these arguments are passed via registers;
inappropriate 0xff's in upper 3 bytes for "expected" argument make
comparison with uint8_t value of "*ptr" unconditionally fail.

This kind of wrong immediate generations are not observed for other
functions that take uint8_t arguments. Therefore, something is wrong
for treatments peculiar to these builtins.

Note that Linux version of __sync_bool_compare_and_swap_N() takes
signed integers as the 2nd and 3rd arguments on arm, whereas these are
unsigned for us. This may be related to this wrong "sign-extension".

The above-mentioned scenario is confirmed by a minimal stripped-down
version of the test for __sync_bool_compare_and_swap_1() (and
_atomic_cas_8(), which is used internally for this builtin):

https://gist.github.com/rokuyama/f469cbf96774db1f4658f0bb932c7a23

It fails for __sync_bool_compare_and_swap_1(), and succeeds for
_atomic_cas_8():

----
$ cc -g -O2 -c arm_builtin_compare_and_swap_1.c
$ cc arm_builtin_compare_and_swap_1.o
$ ./a.out
__sync_bool_compare_and_swap_1: val expects 0xf0 but 0x88
----

(Here, "-O2" is only for readability of objdump (see below).
The same failure occurs even with "-O0".)

objdump for this test case reads:

https://gist.github.com/rokuyama/eaf499d1e57767dbbe3e7bf30b2e0fce

Here, GCC generates 0x88 by "mvn" and "mov" for arguments for
__sync_bool_compare_and_swap_1() and _atomic_cas_8(), respectively.
>How-To-Repeat:
cd /usr/tests/lib/libc/atomic && atf-run t___sync_compare_and_swap

The stripped-down version of this test is provided above.
>Fix:
N/A

>Audit-Trail:
From: Nick Hudson <nick.hudson@gmx.co.uk>
To: gnats-bugs@netbsd.org, port-arm-maintainer@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: port-arm/56839: GCC emits wrong codes for compare_and_swap_1
 bultins on armv5 (el & eb)
Date: Wed, 18 May 2022 12:48:54 +0100

 I think this fixes things. It's a bit ugly...

 https://www.netbsd.org/~skrll/pr56839.diff

 Nick

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: Nick Hudson <nick.hudson@gmx.co.uk>, gnats-bugs@netbsd.org,
 port-arm-maintainer@netbsd.org, gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org
Cc: 
Subject: Re: port-arm/56839: GCC emits wrong codes for compare_and_swap_1
 bultins on armv5 (el & eb)
Date: Fri, 20 May 2022 10:32:32 +0900

 Hi,

 On 2022/05/18 20:48, Nick Hudson wrote:
 > I think this fixes things. It's a bit ugly...
 > 
 > https://www.netbsd.org/~skrll/pr56839.diff

 This does not work:

 (1) GCC expands __sync_bool_compare_and_swap_N() into
      __sync_val_compare_and_swap_N() as optimization.

 (2) The patch does not cover __sync_val_compare_and_swap_N()
      ( == _atomic_cas_8N() ).

 On the other hand, this patch for _atomic_cas_8N() fixes the problem:

 https://gist.github.com/rokuyama/3d791f2ceb2757dc389a95910162c9cb

 However, IMO, this is not the real fix:

 (a) I've found that MI part, not arm-dependent part, of GCC carries
      out sign-extension; something is wrong in
      expand_builtin_compare_and_swap() @ dist/gcc/gcc/builtins.c,
      but I've never fully understand what goes wrong yet...
      (Strangely enough, GCC itself recognizes that these builtins
      takes unsigned integers...)

 (b) Therefore, archs other than arm that do not have 1- and 2-byte
      atomic cas builtins (== missing equivalent instructions) should
      be affected.

 (c) It should be better to fix GCC rather than adding workaround to
      libc routines.

 Thanks,
 rin

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: port-arm/56839 (GCC emits wrong codes for compare_and_swap_1
 bultins on armv5 (el & eb))
Date: Tue, 15 Aug 2023 13:02:24 +0900

 This still fails in the same manner even for GCC 12.3.0 and 10.5.0.

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-arm/56839: GCC emits wrong codes for compare_and_swap_1
 builtins
Date: Wed, 5 Feb 2025 11:50:18 +0100

 I am not convinced GCC is wrong here. The ABI says the caller needs to
 0-extend or sign-extend the sub-word value and pass it as a 32bit register
 value. So it all boils down to what the signature of the called function
 is - and unfortunately that seems to be poorly or not documented anywhere.

 But: gcc/dist/libgcc/config/arm/ implements the functions as taking
 signed values - so that would match the gcc generated call sites.

 #define SUBWORD_VAL_CAS(TYPE, WIDTH)                                    \
   TYPE HIDDEN                                                           \
   __sync_val_compare_and_swap_##WIDTH (TYPE *ptr, TYPE oldval,          \
                                        TYPE newval)                     \

 SUBWORD_VAL_CAS (short,       2)
 SUBWORD_VAL_CAS (signed char, 1)


 I think we need to make our functions match that (basically Nick's patch,
 plus maybe extensions to a few other functions).

 Martin

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56839 CVS commit: src/tests/lib/libc/atomic
Date: Thu, 13 Mar 2025 14:00:51 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Thu Mar 13 14:00:51 UTC 2025

 Modified Files:
 	src/tests/lib/libc/atomic: t___sync_compare_and_swap.c

 Log Message:
 tests/lib/libc/atomic: Test subword compare-and-swap explicitly.

 Make sure the sign doesn't bleed into an adjacent word.

 We already had failing tests, but this should make the failures a
 little more obvious and perhaps check for more adjacent problems.

 PR port-arm/56839: GCC emits wrong codes for compare_and_swap_1
 bultins on armv5 (el & eb)


 To generate a diff of this commit:
 cvs rdiff -u -r1.1 -r1.2 \
     src/tests/lib/libc/atomic/t___sync_compare_and_swap.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Rin Okuyama <rokuyama.rk@gmail.com>,
	Nick Hudson <nick.hudson@gmx.co.uk>,
	Martin Husemann <martin@duskware.de>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: port-arm/56839: GCC emits wrong codes for compare_and_swap_1 bultins on armv5 (el & eb)
Date: Thu, 13 Mar 2025 14:29:40 +0000

 It looks like gcc is _inconsistent_ about the signedness of the
 __sync_* builtins.  Here, for example, in gcc-14.2.0, it appears to
 use unsigned arguments in libgcc:

  134 typedef unsigned int UQItype __attribute__((mode (QI)));
  135 DEFINE (FN, 1, UQItype)
 ...
  139 typedef unsigned int UHItype __attribute__((mode (HI)));
  140 DEFINE (FN, 2, UHItype)
 ...
  144 typedef unsigned int USItype __attribute__((mode (SI)));
  145 DEFINE (FN, 4, USItype)
 ...

 https://gcc.gnu.org/git/?p=3Dgcc.git;a=3Dblob;f=3Dlibgcc/sync.c;h=3D6ffe88a=
 ac0f08d1adbc0b02f7f10d50d95d4c6eb;hb=3D04696df09633baf97cdbbdd6e9929b9d4721=
 61d3#l125

 And, for _FreeBSD_ on arm32, it also uses unsigned arguments:

  220 EMIT_ALL_OPS_N (1, unsigned char, "ldrb", "strb", "streqb")
  221 EMIT_ALL_OPS_N (2, unsigned short, "ldrh", "strh", "streqh")
  222 EMIT_ALL_OPS_N (4, unsigned int, "ldr", "str", "streq")

 https://gcc.gnu.org/git/?p=3Dgcc.git;a=3Dblob;f=3Dlibgcc/config/arm/freebsd=
 -atomic.c;h=3D7cf00b1e75d4759b983401797f0224764a2ce82d;hb=3D04696df09633baf=
 97cdbbdd6e9929b9d472161d3#l220

 But elsewhere, e.g. for _Linux_ on arm32, it uses signed arguments:

  249 SUBWORD_VAL_CAS (short,       2)
  250 SUBWORD_VAL_CAS (signed char, 1)

 https://gcc.gnu.org/git/?p=3Dgcc.git;a=3Dblob;f=3Dlibgcc/config/arm/linux-a=
 tomic.c;h=3D6d6683194aff0404967b28030bc7677b0f2e5949;hb=3D04696df09633baf97=
 cdbbdd6e9929b9d472161d3#l249

 And I have no idea about the code generation parts of things.

 We should file a bug upstream, perhaps.

 But until we do, I think it should be sufficient to write all of our
 __sync_* routines -- including both __sync_bool_* and __sync_val_*, of
 course -- with signed arguments and pass them to the unsigned
 _atomic_* functions, like Nick suggested.

 This should paper over the gcc issue, by forcing gcc to generate code
 that always zero-extends the subword when passing arguments in
 registers so the comparison will succeed (at least, for ABIs where
 that is relevant), without requiring changes to our _atomic_*
 definitions which are perfectly good for their signatures.

 I guess it is conceivable that there is an ABI where this will do the
 wrong thing but I doubt it.  Even on riscv64, where 8/16-bit units are
 type-extended to 32 bit and then sign-extended to 64 bits (no matter
 the signedness of the type), I think this will produce the correct
 result.  (Also I think these stubs won't be used on riscv anyway.)=20

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2025 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.