NetBSD Problem Report #50940

From gson@gson.org  Fri Mar 11 14:40:57 2016
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 639DE7ABE5
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 11 Mar 2016 14:40:57 +0000 (UTC)
Message-Id: <20160311144045.AF002744641@guava.gson.org>
Date: Fri, 11 Mar 2016 16:40:45 +0200 (EET)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: cc -ffast-math fails to disable denormals on amd64
X-Send-Pr-Version: 3.95

>Number:         50940
>Category:       toolchain
>Synopsis:       cc -ffast-math fails to disable denormals on amd64
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    toolchain-manager
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Mar 11 14:45:00 +0000 2016
>Closed-Date:    Wed Jul 21 05:14:48 +0000 2021
>Last-Modified:  Wed Jul 21 05:14:48 +0000 2021
>Originator:     Andreas Gustafsson
>Release:        NetBSD 7.0
>Organization:

>Environment:
System: NetBSD 7.0
Architecture: x86_64
Machine: amd64
>Description:

When the -ffast-math option is enabled, gcc generates x86_64 floating
point code that assumes that the FTZ (Flush To Zero) and DAZ
(Denormals Are Zero) bits in the MXCSR register are set.

On at least some non-NetBSD platforms, -ffast-math also causes the
program to link in a runtime support module "crtfastmath.o" that sets
FTZ and DAZ, but this does not happen on NetBSD.  As a result, the
generated code that is relying on FTZ and DAZ being set can yield
incorrect results for denormal input.

The specific issue I ran into was calculating an array of square roots
of single-precision floats.  For this, gcc -O3 -ffast-math managed to
generate vectorized SSE code using the rsqrtps instruction followed by
a Newton-Raphson step to calculate four square roots in one go.  This
code returns incorrect results (-infinity rather than zero) for denormal
inputs when DAZ is not set.

I have not checked if NetBSD ports other than amd64 are also affected.

A work-around is to execute the following early in the program, before
doing floating-point math:

    fenv_t fe;
    fegetenv(&fe);
    fe.mxcsr |= 0x8040; // Set DAZ and FTZ
    fesetenv(&fe);

>How-To-Repeat:

Compile the following test program on a NetBSD/amd64 7.0 system with

   cc -O3 -ffast-math test_denormal.c -lm -o test_denormal

and run it:

#include <math.h>
#include <stdlib.h>
#include <stdio.h>

#define N 4 // Vector size

void test() {
    int i;
    float a[N], b[N];
    for (i = 0; i < N; i++) {
	a[i] = pow(10, -36 - i);
    }
    for (i = 0; i < N; i++) {
	b[i] = sqrtf(a[i]);
    }
    for (i = 0; i < N; i++) {
	printf("sqrt(%g) = %g\n", a[i], b[i]);
    }
    if (isinf(b[3])) {
	printf("FAIL: square root of a small number returned infinity\n");
	exit(1);
    }
}

int main(int argc, char **argv) {
    test();
    printf("PASS\n");
    return 0;
}

Compiled without -ffast-math, the output is correct:

  sqrt(1e-36) = 1e-18
  sqrt(1e-37) = 3.16228e-19
  sqrt(1e-38) = 1e-19
  sqrt(1e-39) = 3.16228e-20
  PASS

Compiled with -ffast-math on a non-NetBSD system, the denormals are
flushed to zero as expected, and the results are now slightly less
accurate, but still within 1e-19 of the exact result:

  sqrt(1e-36) = 1e-18
  sqrt(1e-37) = 3.16228e-19
  sqrt(0) = 0
  sqrt(0) = 0
  PASS

Compiled on NetBSD/amd64 with "-O3 -ffast-math", you get the following
output, where the last two square roots are not even close:

  sqrt(1e-36) = 1e-18
  sqrt(1e-37) = 3.16228e-19
  sqrt(1e-38) = -inf
  sqrt(1e-39) = -inf
  FAIL: square root of a small number returned infinity

>Fix:

>Release-Note:

>Audit-Trail:
From: Joerg Sonnenberger <joerg@britannica.bec.de>
To: gnats-bugs@NetBSD.org
Cc: toolchain-manager@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: toolchain/50940: cc -ffast-math fails to disable denormals on
 amd64
Date: Fri, 11 Mar 2016 16:28:38 +0100

 On Fri, Mar 11, 2016 at 02:45:00PM +0000, Andreas Gustafsson wrote:
 > When the -ffast-math option is enabled, gcc generates x86_64 floating
 > point code that assumes that the FTZ (Flush To Zero) and DAZ
 > (Denormals Are Zero) bits in the MXCSR register are set.

 That's not really true. The assumption with -ffast-math is that
 denormals and infinites don't happen at all. Whether the CPU generates
 them or not is a secondary question -- at that point you are already in
 Undefined Behavior land for this mode.

 Joerg

From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc: toolchain-manager@netbsd.org,
    gnats-admin@netbsd.org
    gson@gson.org (Andreas Gustafsson)
Subject: Re: toolchain/50940: cc -ffast-math fails to disable denormals on
 amd64
Date: Fri, 11 Mar 2016 18:07:56 +0200

 Joerg Sonnenberger wrote:
 >  > When the -ffast-math option is enabled, gcc generates x86_64 floating
 >  > point code that assumes that the FTZ (Flush To Zero) and DAZ
 >  > (Denormals Are Zero) bits in the MXCSR register are set.
 >  
 >  That's not really true. The assumption with -ffast-math is that
 >  denormals and infinites don't happen at all.

 That may be - I haven't found any documention on what the exact
 assumptions are.  If you know of some, I would appreciate a pointer.

 >  Whether the CPU generates them or not is a secondary question -- at
 >  that point you are already in Undefined Behavior land for this
 >  mode.

 Do you at least agree that the current NetBSD behavior is incorrect?
 -- 
 Andreas Gustafsson, gson@gson.org

From: Joerg Sonnenberger <joerg@britannica.bec.de>
To: gnats-bugs@NetBSD.org
Cc: toolchain-manager@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, Andreas Gustafsson <gson@gson.org>
Subject: Re: toolchain/50940: cc -ffast-math fails to disable denormals on
 amd64
Date: Sat, 12 Mar 2016 01:13:16 +0100

 On Fri, Mar 11, 2016 at 04:10:00PM +0000, Andreas Gustafsson wrote:
 >  >  Whether the CPU generates them or not is a secondary question -- at
 >  >  that point you are already in Undefined Behavior land for this
 >  >  mode.
 >  
 >  Do you at least agree that the current NetBSD behavior is incorrect?

 No. Just because some parts of the code is compiled with -ffast-math, it
 doesn't mean that other parts should get incorrect behavior. That
 includes strtod and other parts of the run time library. As such,
 changing the FP state automatically on program execution is much more
 harmful.

 Joerg

From: Andreas Gustafsson <gson@gson.org>
To: Joerg Sonnenberger <joerg@britannica.bec.de>
Cc: gnats-bugs@NetBSD.org,
    toolchain-manager@netbsd.org
Subject: Re: toolchain/50940: cc -ffast-math fails to disable denormals on
 amd64
Date: Fri, 18 Mar 2016 11:48:52 +0200

 Last week, Joerg Sonnenberger wrote:
 > >  Do you at least agree that the current NetBSD behavior is incorrect?
 > 
 > No. Just because some parts of the code is compiled with -ffast-math, it
 > doesn't mean that other parts should get incorrect behavior. That
 > includes strtod and other parts of the run time library. As such,
 > changing the FP state automatically on program execution is much more
 > harmful.

 That is a good point, and I'm actually inclined to agree.  In addition
 to the potential harm to unsuspecting libraries, there is also the
 fact that the technique of disabling denormals via crtfastmath.o
 doesn't work if -ffast-math is not specified in the link phase, which
 is likely to be the case when the code compiled with -ffast-math is
 itself a library.  For example, the following build recipe reproduces
 the bug even on the non-NetBSD system that gave correct results when
 the compiling and linking were done in a single cc command:

     cc -c -O3 -ffast-math test_denormal.c
     cc test_denormal.o -lm -o test_denormal

 Now, if we do resolve that NetBSD is correct in not disabling
 denormals, then we have a different bug instead, that of gcc
 generating code that yields infinity when taking the square root of a
 denormal.  I guess I should just report that upstream, with the above
 recipe.

 Earlier, you said:
 > The assumption with -ffast-math is that denormals and infinites
 > don't happen at all.

 I still haven't seen any evidence supporting that assertion when it
 comes to denormals.  The -ffast-math option is shorthand for turning
 on multiple options including -ffinite-math-only, which does what you
 say for NaNs and Infs, but it does not mention denormals:

    -ffinite-math-only
        Allow optimizations for floating-point arithmetic that assume that
        arguments and results are not NaNs or +-Infs.

 -- 
 Andreas Gustafsson, gson@gson.org

From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: toolchain/50940: cc -ffast-math fails to disable denormals on amd64
Date: Fri, 1 Dec 2017 13:33:41 +0200

 Reported as a gcc bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83240
 -- 
 Andreas Gustafsson, gson@gson.org

From: coypu@sdf.org
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: toolchain/50940: cc -ffast-math fails to disable denormals on
 amd64
Date: Fri, 1 Dec 2017 14:33:24 +0000

  No. Just because some parts of the code is compiled with -ffast-math, it
  doesn't mean that other parts should get incorrect behavior. That
  includes strtod and other parts of the run time library. As such,
  changing the FP state automatically on program execution is much more
  harmful.

 FWIW, we do this behaviour in alpha: crtfm is a hint to the kernel to
 map underflow and denormals to zero in that process.

State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Wed, 21 Jul 2021 05:14:48 +0000
State-Changed-Why:
-ffast-math is just a fancy way of making your program's behavior undefined.


>Unformatted:
Home
PR Database Search
(Contact us) $NetBSD: query-full-pr,v 1.46 2020/01/03 16:35:01 leot Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.