public inbox for gcc-bugs@sourceware.org
From: "matz at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/96373] New: SVE miscompilation on vectorized division loop, leading to FP exception
Date: Wed, 29 Jul 2020 15:27:58 +0000
Message-ID: <bug-96373-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96373

            Bug ID: 96373
           Summary: SVE miscompilation on vectorized division loop, leading to
                    FP exception
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: matz at gcc dot gnu.org
  Target Milestone: ---

I believe gcc-10 miscompiles the following program when SVE and
vectorization are enabled.  You need glibc to show this, or a different
way to enable traps on floating point exceptions:

% cat x.c
#define _GNU_SOURCE
#include <fenv.h>

void __attribute__((noinline, noclone)) div (double *d, double *s, int n)
{
  for (;n; n--, d++, s++)
    *d = *d / *s;
}

extern int printf(const char*, ...);

int main()
{
  int i;
  double d[] = {1,2,3,4,5,6,7,8,9,10,11};
  double s[] = {11,10,9,8,7,6,5,4,3,2,1};

  //fesetenv(FE_NOMASK_ENV);
  feenableexcept(FE_DIVBYZERO|FE_INVALID);
  div(d, s, 11);
  for (i = 0; i < 11; i++)
    printf(" %f", d[i]);
  printf("\n");

  return 0;
}

% gcc-10 --version
gcc-10 (SUSE Linux) 10.2.1 20200723 [revision 677b80db41f5345b32ce18cd000e45ea39b80d8f]

% gcc-10 -g -march=armv8.2-a -O2 -ftree-vectorize x.c -lm && ./a.out
 0.090909 0.200000 0.333333 0.500000 0.714286 1.000000 1.400000 2.000000 3.000000 5.000000 11.000000

% gcc-10 -g -march=armv8.2-a+sve -O2 -ftree-vectorize x.c -lm && ./a.out
Floating point exception (core dumped)

I think the code speaks for itself, excerpt from div():

        whilelo p0.d, wzr, w2
        ptrue   p1.b, all
        .p2align 3,,7
.L4:
        ld1d    z0.d, p0/z, [x0, x3, lsl 3]
        ld1d    z1.d, p0/z, [x1, x3, lsl 3]
        fdiv    z0.d, p1/m, z0.d, z1.d
        st1d    z0.d, p0, [x0, x3, lsl 3]
        incd    x3
        whilelo p0.d, w3, w2
        b.any   .L4

So, it enables all lanes in p1, while the active lanes in the loop are
tracked in p0.  In particular non-active lanes from the load are zeroed.
The division uses p1 and hence divides all lanes, including those that
were zeroed.  Indeed that's what happens when the exception is thrown:

% gdb ./a.out
...
Program received signal SIGFPE, Arithmetic exception.
(gdb) x/i $pc
=> 0x400848 <div+56>:   fdiv    z0.d, p1/m, z0.d, z1.d
(gdb) p $p1
$1 = {255, 255, 255, 255, 255, 255, 255, 255}
(gdb) p $z1.d.f
$2 = {3, 2, 1, 0, 0, 0, 0, 0}

When traps aren't enabled (the default is disabled) then these zero
divisions simply lead to NaNs in the respective lanes, and as in further
instructions the p0 predicate is used that's of no issue as those are
ignored then.  But if traps are enabled this leads to an incorrect FPE
trap.

The same behaviour occurs already with gcc-9.  I haven't tested master.

We noticed this within OpenFOAM on SVE capable hardware, but divisions in
vectorizable contexts should occur reasonably often for this to be a
serious problem.  (traps on exceptions aren't enabled very often, though,
so this bug will be hidden often).
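
The predicate mismatch described above can be reproduced without SVE
hardware by modelling the loop lane by lane in plain C.  The sketch below
is only an illustration under stated assumptions, not GCC's or the
hardware's actual behaviour: the names model_sve_div, gov_mode,
GOV_ALL_TRUE, GOV_LOOP_PRED and MAX_LANES are hypothetical, and the
vector length is passed in as a lane count.  Instead of executing a
division in an inactive lane, the model counts it, so the program never
raises the exception itself.

```c
#include <assert.h>

/* Hypothetical upper bound for this model: 2048-bit vectors hold 32 doubles. */
#define MAX_LANES 32

/* Which predicate governs the division in the model (hypothetical names). */
enum gov_mode {
    GOV_ALL_TRUE,   /* like "fdiv z0.d, p1/m, ..." with p1 = ptrue */
    GOV_LOOP_PRED   /* division governed by the loop predicate p0  */
};

/* Scalar model of the vectorized loop from the report.
   Performs d[i] = d[i] / s[i] for i in [0, n) in chunks of vl lanes and
   returns how many inactive (zero-filled) lanes the division would have
   touched, i.e. how many spurious 0.0 / 0.0 operations would occur.  */
int model_sve_div(double *d, const double *s, int n, int vl, enum gov_mode mode)
{
    int spurious = 0;

    assert(vl > 0 && vl <= MAX_LANES);

    for (int i = 0; i < n; i += vl) {
        double z0[MAX_LANES], z1[MAX_LANES];
        int p0[MAX_LANES];

        for (int l = 0; l < vl; l++) {
            p0[l] = (i + l < n);              /* whilelo: lane is in range   */
            z0[l] = p0[l] ? d[i + l] : 0.0;   /* ld1d p0/z: inactive -> 0.0  */
            z1[l] = p0[l] ? s[i + l] : 0.0;
        }

        for (int l = 0; l < vl; l++) {
            int governed = (mode == GOV_ALL_TRUE) ? 1 : p0[l];
            if (!governed)
                continue;                     /* merging: lane left as is    */
            if (!p0[l] && z1[l] == 0.0) {
                spurious++;                   /* would be 0.0 / 0.0 (invalid) */
                continue;                     /* counted, not executed       */
            }
            z0[l] = z0[l] / z1[l];
        }

        for (int l = 0; l < vl; l++)
            if (p0[l])
                d[i + l] = z0[l];             /* st1d under p0               */
    }
    return spurious;
}
```

Called with the d and s arrays from x.c, n = 11 and vl = 8 (the gdb dump
above shows eight double lanes), the model returns 5 for GOV_ALL_TRUE,
matching the five zero lanes in $z1.d.f, and 0 for GOV_LOOP_PRED.  The
stored results are the same in both modes, which fits the observation
that the problem only becomes visible once traps are enabled.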
Thread overview: 23+ messages
2020-07-29 15:27 matz at gcc dot gnu.org [this message]
2020-08-04 13:41 ` [Bug target/96373] " rsandifo at gcc dot gnu.org
2020-08-04 13:49 ` rguenth at gcc dot gnu.org
2020-08-04 14:38 ` matz at gcc dot gnu.org
2020-08-04 14:59 ` rsandifo at gcc dot gnu.org
2020-08-04 15:46 ` schwab@linux-m68k.org
2020-08-05 10:08 ` rsandifo at gcc dot gnu.org
2020-08-05 10:15 ` rguenther at suse dot de
2020-08-05 10:28 ` rsandifo at gcc dot gnu.org
2020-08-05 11:09 ` rguenther at suse dot de
2020-08-05 12:24 ` matz at gcc dot gnu.org
2020-08-05 13:02 ` matz at gcc dot gnu.org
2023-01-11 23:50 ` pinskia at gcc dot gnu.org
2023-01-11 23:54 ` pinskia at gcc dot gnu.org
2023-01-27 17:04 ` cvs-commit at gcc dot gnu.org
2023-02-14  2:05 ` cvs-commit at gcc dot gnu.org
2023-02-14  9:18 ` cvs-commit at gcc dot gnu.org
2023-02-27  2:50 ` cvs-commit at gcc dot gnu.org
2023-02-27  2:57 ` cvs-commit at gcc dot gnu.org
2023-04-03  8:58 ` cvs-commit at gcc dot gnu.org
2023-04-14  8:19 ` [Bug target/96373] [10/11 Regression] " rguenth at gcc dot gnu.org
2023-05-29 10:03 ` jakub at gcc dot gnu.org
2024-02-29  5:33 ` [Bug target/96373] [11 " pinskia at gcc dot gnu.org