public inbox for gcc-bugs@sourceware.org
* [Bug target/96373] New: SVE miscompilation on vectorized division loop, leading to FP exception
@ 2020-07-29 15:27 matz at gcc dot gnu.org
From: matz at gcc dot gnu.org @ 2020-07-29 15:27 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96373

            Bug ID: 96373
           Summary: SVE miscompilation on vectorized division loop,
                    leading to FP exception
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: matz at gcc dot gnu.org
  Target Milestone: ---

I believe gcc-10 miscompiles the following program when SVE and vectorization
are enabled.  Reproducing it requires glibc (for feenableexcept), or some other
way to enable traps on floating-point exceptions:

% cat x.c
#define _GNU_SOURCE
#include <fenv.h>
void __attribute__((noinline, noclone)) div (double *d, double *s, int n)
{
  for (;n; n--, d++, s++)
    *d = *d / *s;
}

extern int printf(const char*, ...);

int main()
{
  int i;
  double d[] = {1,2,3,4,5,6,7,8,9,10,11};
  double s[] = {11,10,9,8,7,6,5,4,3,2,1};
  //fesetenv(FE_NOMASK_ENV);
  feenableexcept(FE_DIVBYZERO|FE_INVALID);
  div(d, s, 11);
  for (i = 0; i < 11; i++)
    printf(" %f", d[i]);
  printf("\n");
  return 0;
}

% gcc-10 --version
gcc-10 (SUSE Linux) 10.2.1 20200723 [revision 677b80db41f5345b32ce18cd000e45ea39b80d8f]

% gcc-10 -g -march=armv8.2-a -O2 -ftree-vectorize x.c -lm && ./a.out
 0.090909 0.200000 0.333333 0.500000 0.714286 1.000000 1.400000 2.000000 3.000000 5.000000 11.000000

% gcc-10 -g -march=armv8.2-a+sve -O2 -ftree-vectorize  x.c -lm && ./a.out 
Floating point exception (core dumped)

I think the code speaks for itself; here is an excerpt from div():

        whilelo p0.d, wzr, w2
        ptrue   p1.b, all
        .p2align 3,,7
.L4:
        ld1d    z0.d, p0/z, [x0, x3, lsl 3]
        ld1d    z1.d, p0/z, [x1, x3, lsl 3]
        fdiv    z0.d, p1/m, z0.d, z1.d
        st1d    z0.d, p0, [x0, x3, lsl 3]
        incd    x3
        whilelo p0.d, w3, w2
        b.any   .L4

So the code enables all lanes in p1, while the lanes active in the loop are
tracked in p0.  In particular, the loads (p0/z) zero the inactive lanes.  The
division is governed by p1 and hence divides all lanes, including the zeroed
ones, where it computes 0/0.
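
A scalar model may make the lane bookkeeping easier to follow.  This is only
a sketch, not what GCC emits: LANES, load_zeroing, zero_divisor_lanes and
tail_zero_divisions are hypothetical names, and the 8-lane width is assumed
from the gdb dump rather than required by SVE.  It models the second
(tail) iteration of div(d, s, 11), where only three elements remain:

```c
#include <stdbool.h>

#define LANES 8 /* assumption: 512-bit vectors, i.e. 8 doubles per register */

/* Scalar model of "ld1d z, p/z, [base]": inactive lanes are zeroed and
   memory behind them is never read. */
static void load_zeroing(double z[LANES], const bool p[LANES],
                         const double *base)
{
  for (int i = 0; i < LANES; i++)
    z[i] = p[i] ? base[i] : 0.0;
}

/* Count the lanes in which "fdiv z0, p/m, z0, z1" would see a zero divisor
   when governed by predicate p. */
static int zero_divisor_lanes(const double z1[LANES], const bool p[LANES])
{
  int n = 0;
  for (int i = 0; i < LANES; i++)
    if (p[i] && z1[i] == 0.0)
      n++;
  return n;
}

/* Tail iteration: elements 8..10 of s[] from x.c are left to process. */
int tail_zero_divisions(bool governed_by_loop_predicate)
{
  const double s_tail[3] = {3, 2, 1};
  bool p0[LANES], p1[LANES];
  double z1[LANES];

  for (int i = 0; i < LANES; i++) {
    p0[i] = i < 3;  /* whilelo p0.d, w3, w2 with 3 elements left */
    p1[i] = true;   /* ptrue p1.b, all */
  }
  load_zeroing(z1, p0, s_tail);
  return zero_divisor_lanes(z1, governed_by_loop_predicate ? p0 : p1);
}
```

In this model tail_zero_divisions(false) corresponds to the generated code
(fdiv under p1) and tail_zero_divisions(true) to a division governed by the
loop predicate p0; only the former ever sees a zero divisor.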

Indeed that's what happens when the exception is thrown:

% gdb ./a.out
...
Program received signal SIGFPE, Arithmetic exception.
(gdb) x/i $pc
=> 0x400848 <div+56>:   fdiv    z0.d, p1/m, z0.d, z1.d
(gdb) p $p1
$1 = {255, 255, 255, 255, 255, 255, 255, 255}
(gdb) p $z1.d.f
$2 = {3, 2, 1, 0, 0, 0, 0, 0}

When traps aren't enabled (which is the default), these 0/0 divisions simply
produce NaNs in the respective lanes.  That is harmless, because the following
instructions use the p0 predicate and so ignore those lanes.

But if traps are enabled this leads to an incorrect FPE trap.
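
The masked case can be sketched in scalar C as well.  Again this is an
illustration under assumptions, not the compiler's output: LANES and
masked_tail_nan_lanes are hypothetical names, 8 lanes are assumed, and traps
are left at their default (disabled).  All lanes are divided, as under p1,
but only the lanes active in p0 are stored:

```c
#include <stdbool.h>

#define LANES 8 /* assumption: 8 doubles per vector register */

/* Models the tail iteration with traps masked.  d_tail holds d[8..10] and is
   updated in place; the return value is the number of NaN lanes left in the
   register, none of which reach memory. */
int masked_tail_nan_lanes(double d_tail[3])
{
  const double s_tail[3] = {3, 2, 1};  /* s[8..10] from x.c */
  /* volatile keeps the divisions as real run-time operations */
  volatile double z0[LANES], z1[LANES];
  int nans = 0;

  for (int i = 0; i < LANES; i++) {    /* ld1d ..., p0/z: zero inactive lanes */
    bool active = i < 3;
    z0[i] = active ? d_tail[i] : 0.0;
    z1[i] = active ? s_tail[i] : 0.0;
  }
  for (int i = 0; i < LANES; i++)      /* fdiv under all-true p1 */
    z0[i] = z0[i] / z1[i];
  for (int i = 0; i < LANES; i++) {    /* st1d ..., p0: store active lanes only */
    double r = z0[i];
    if (r != r)                        /* NaN compares unequal to itself */
      nans++;
    if (i < 3)
      d_tail[i] = r;
  }
  return nans;
}
```

With d_tail = {9, 10, 11} the stored results are the expected quotients, and
the NaNs stay confined to register lanes that are never written back, which is
why the problem only becomes visible once the invalid-operation trap is
enabled.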

The same behaviour occurs already with gcc-9.  I haven't tested master.

We noticed this in OpenFOAM on SVE-capable hardware, but divisions in
vectorizable contexts should occur often enough for this to be a serious
problem.  (Traps on exceptions are rarely enabled, though, so this bug will
often stay hidden.)

Thread overview: 23+ messages
2020-07-29 15:27 [Bug target/96373] New: SVE miscompilation on vectorized division loop, leading to FP exception matz at gcc dot gnu.org
2020-08-04 13:41 ` [Bug target/96373] " rsandifo at gcc dot gnu.org
2020-08-04 13:49 ` rguenth at gcc dot gnu.org
2020-08-04 14:38 ` matz at gcc dot gnu.org
2020-08-04 14:59 ` rsandifo at gcc dot gnu.org
2020-08-04 15:46 ` schwab@linux-m68k.org
2020-08-05 10:08 ` rsandifo at gcc dot gnu.org
2020-08-05 10:15 ` rguenther at suse dot de
2020-08-05 10:28 ` rsandifo at gcc dot gnu.org
2020-08-05 11:09 ` rguenther at suse dot de
2020-08-05 12:24 ` matz at gcc dot gnu.org
2020-08-05 13:02 ` matz at gcc dot gnu.org
2023-01-11 23:50 ` pinskia at gcc dot gnu.org
2023-01-11 23:54 ` pinskia at gcc dot gnu.org
2023-01-27 17:04 ` cvs-commit at gcc dot gnu.org
2023-02-14  2:05 ` cvs-commit at gcc dot gnu.org
2023-02-14  9:18 ` cvs-commit at gcc dot gnu.org
2023-02-27  2:50 ` cvs-commit at gcc dot gnu.org
2023-02-27  2:57 ` cvs-commit at gcc dot gnu.org
2023-04-03  8:58 ` cvs-commit at gcc dot gnu.org
2023-04-14  8:19 ` [Bug target/96373] [10/11 Regression] " rguenth at gcc dot gnu.org
2023-05-29 10:03 ` jakub at gcc dot gnu.org
2024-02-29  5:33 ` [Bug target/96373] [11 " pinskia at gcc dot gnu.org
