public inbox for gcc-bugs@sourceware.org help / color / mirror / Atom feed
From: "amonakov at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> To: gcc-bugs@gcc.gnu.org Subject: [Bug target/106902] [11/12/13 Regression] Program compiled with -O3 -mfma produces different result Date: Wed, 14 Sep 2022 15:20:13 +0000 [thread overview] Message-ID: <bug-106902-4-h85nPs5LeF@http.gcc.gnu.org/bugzilla/> (raw) In-Reply-To: <bug-106902-4@http.gcc.gnu.org/bugzilla/> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902 Alexander Monakov <amonakov at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |amonakov at gcc dot gnu.org --- Comment #6 from Alexander Monakov <amonakov at gcc dot gnu.org> --- This is a lovely showcase how optimizations cooperatively produce something unexpected. TL;DR: SLP introduces redundant computations and then fma formation contracts some (but not all) of those, dramatically reducing numerical stability. In principle that's similar to incorrectly "optimizing" double f(double x) { double y = x * x; return y - y; } (which is guaranteed to return either NaN or 0) to double f(double x) { return fma(x, x, -(x * x)); } which returns the round-off tail of x * x (or NaN). I think there's already another bug with a similar root cause. In this bug, we begin with (note, all following examples are supposed to be compiled without fma contraction, i.e. -O0, plain -O2, or -O2 -ffp-contract=off if your target has fma): #include <math.h> #include <stdio.h> double one = 1; double b1 = 0x1.70e906b54fe4fp+1; double b2 = -0x1.62adb4752c14ep+1; double b3 = 0x1.c7001a6f3bd8p-1; double B = 0x1.29c9034e7cp-13; void f1(void) { double x1 = 0, x2 = 0, x3 = 0; for (int i = 0; i < 99; ) { double t = B * one + x1 * b1 + x2 * b2 + x3 * b3; printf("%d %g\t%a\n", i++, t, t); x3 = x2, x2 = x1, x1 = t; } } predcom unrolls by 3 to get rid of moves: void f2(void) { double x1 = 0, x2 = 0, x3 = 0; for (int i = 0; i < 99; ) { x3 = B * one + x1 * b1 + x2 * b2 + x3 * b3; printf("%d %g\t%a\n", i++, x3, x3); x2 = B * one + x3 * b1 + x1 * b2 + x2 * b3; printf("%d %g\t%a\n", i++, x2, x2); x1 = B * one + x2 * b1 + x3 * b2 + x1 * b3; printf("%d %g\t%a\n", i++, x1, x1); } } SLP introduces some redundant vector computations: typedef double f64v2 __attribute__((vector_size(16))); void f3(void) { double x1 = 0, x2 = 0, x3 = 0; f64v2 x32 = { 0 }, x21 = { 0 }; for (int i = 0; i < 99; ) { x3 = B * one + x21[1] * b1 + x2 * b2 + x3 * b3; f64v2 x13b1 = { x21[1] * b1, x3 * b1 }; x32 = B * one + x13b1 + x21 * b2 + x32 * b3; x2 = B * one + x3 * b1 + x1 * b2 + x2 * b3; f64v2 x13b2 = { b2 * x1, b2 * x32[0] }; x21 = B * one + x32 * b1 + x13b2 + x21 * b3; x1 = B * one + x2 * b1 + x32[0] * b2 + x1 * b3; printf("%d %g\t%a\n", i++, x32[0], x32[0]); printf("%d %g\t%a\n", i++, x32[1], x32[1]); printf("%d %g\t%a\n", i++, x21[1], x21[1]); } } Note that this is still bit-identical to the initial function. But then tree-ssa-math-opts "randomly" forms some FMAs: f64v2 vfma(f64v2 x, f64v2 y, f64v2 z) { return (f64v2){ fma(x[0], y[0], z[0]), fma(x[1], y[1], z[1]) }; } void f4(void) { f64v2 vone = { one, one }, vB = { B, B }; f64v2 vb1 = { b1, b1 }, vb2 = { b2, b2 }, vb3 = { b3, b3 }; double x1 = 0, x2 = 0, x3 = 0; f64v2 x32 = { 0 }, x21 = { 0 }; for (int i = 0; i < 99; ) { x3 = fma(b3, x3, fma(b2, x2, fma(B, one, x21[1] * b1))); f64v2 x13b1 = { x21[1] * b1, x3 * b1 }; x32 = vfma(vb3, x32, vfma(vb2, x21, vfma(vB, vone, x13b1))); x2 = fma(b3, x2, b2 * x1 + fma(B, one, x3 * b1)); f64v2 x13b2 = { b2 * x1, b2 * x32[0] }; x21 = vfma(vb3, x21, x13b2 + vfma(vB, vone, x32 * vb1)); x1 = fma(b3, x1, b2 * x32[0] + fma(B, one, b1 * x2)); printf("%d %g\t%a\n", i++, x32[0], x32[0]); printf("%d %g\t%a\n", i++, x32[1], x32[1]); printf("%d %g\t%a\n", i++, x21[1], x21[1]); } } and here some of the redundantly computed values are computed differently depending on where rounding after multiplication was omitted. Somehow this is enough to make the computation explode numerically.
next prev parent reply other threads:[~2022-09-14 15:20 UTC|newest] Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-09-10 18:58 [Bug tree-optimization/106902] New: Program compiled with -O3 -fmfa " jhllawrence963 at gmail dot com 2022-09-10 19:07 ` [Bug target/106902] Program compiled with -O3 -mfma " pinskia at gcc dot gnu.org 2022-09-12 8:01 ` [Bug target/106902] [11/12/13 Regression] " rguenth at gcc dot gnu.org 2022-09-12 14:08 ` marxin at gcc dot gnu.org 2022-09-12 14:10 ` [Bug target/106902] [11/12 " marxin at gcc dot gnu.org 2022-09-13 7:06 ` [Bug target/106902] [11/12/13 " rguenth at gcc dot gnu.org 2022-09-14 15:20 ` amonakov at gcc dot gnu.org [this message] 2022-09-15 9:33 ` amonakov at gcc dot gnu.org 2022-09-17 18:19 ` jhllawrence963 at gmail dot com 2022-09-17 18:23 ` jhllawrence963 at gmail dot com 2022-09-19 7:08 ` rguenth at gcc dot gnu.org 2022-09-19 7:14 ` amonakov at gcc dot gnu.org 2022-09-19 7:25 ` rguenth at gcc dot gnu.org 2022-09-19 8:14 ` amonakov at gcc dot gnu.org 2022-09-19 9:44 ` rguenth at gcc dot gnu.org 2022-09-27 18:31 ` amonakov at gcc dot gnu.org 2022-09-29 6:41 ` rguenth at gcc dot gnu.org 2022-09-29 11:28 ` amonakov at gcc dot gnu.org 2022-09-29 13:39 ` rguenther at suse dot de 2022-09-30 6:17 ` amonakov at gcc dot gnu.org 2023-05-11 17:32 ` [Bug target/106902] [11/12/13/14 " amonakov at gcc dot gnu.org 2023-05-12 6:27 ` rguenth at gcc dot gnu.org 2023-05-17 18:49 ` amonakov at gcc dot gnu.org 2023-05-17 18:54 ` pinskia at gcc dot gnu.org 2023-05-18 5:53 ` rguenth at gcc dot gnu.org 2023-05-18 8:31 ` amonakov at gcc dot gnu.org 2023-05-18 16:03 ` amonakov at gcc dot gnu.org 2023-05-18 16:52 ` rguenther at suse dot de 2023-05-29 10:07 ` jakub at gcc dot gnu.org
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=bug-106902-4-h85nPs5LeF@http.gcc.gnu.org/bugzilla/ \ --to=gcc-bugzilla@gcc.gnu.org \ --cc=gcc-bugs@gcc.gnu.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).