From: "amonakov at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/106902] [11/12/13 Regression] Program compiled with -O3 -mfma produces different result
Date: Thu, 15 Sep 2022 09:33:43 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902

--- Comment #7 from Alexander Monakov ---
Lawrence, thank you for the nice work reducing the testcase.

For RawTherapee the recommended course of action would be to compile
everything with -ffp-contract=off, then manually reintroduce use of fma in
performance-sensitive places by testing the FP_FAST_FMA macro to know if
hardware fma is available.
This way you'll know that all systems without fma get the same results, and
all systems with fma also get the same results (but different from the
former).

For example, my function 'f1' could be adapted like this:

void f1(void)
{
    double x1 = 0, x2 = 0, x3 = 0;
    for (int i = 0; i < 99; ) {
        double t;
#ifdef FP_FAST_FMA
        t = fma(x1, b1, fma(x2, b2, fma(x3, b3, B * one)));
#else
        t = B * one + x1 * b1 + x2 * b2 + x3 * b3;
#endif
        printf("%d %g\t%a\n", i++, t, t);
        x3 = x2, x2 = x1, x1 = t;
    }
}
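
The same pattern can be factored into a small helper so that the choice
between fused and unfused arithmetic is made in one place. The following is
only a sketch: the names 'mul_add' and 'dot' are hypothetical and do not come
from the testcase or from RawTherapee, and it assumes the translation unit is
built with -ffp-contract=off so that the compiler does not fuse the fallback
branch on its own.

```c
#include <math.h>
#include <stdio.h>

/* Hypothetical helper (not from the bug report): returns a * b + c.
   fma() is used only when <math.h> defines FP_FAST_FMA, i.e. when the
   implementation advertises fma as about as fast as a multiply and an add.
   Otherwise the product is rounded first and then added.
   Assumes a build with -ffp-contract=off. */
static double mul_add(double a, double b, double c)
{
#ifdef FP_FAST_FMA
    return fma(a, b, c);   /* single rounding */
#else
    return a * b + c;      /* two roundings */
#endif
}

/* Hypothetical example user: dot product accumulated through mul_add,
   so every multiply-add in the loop follows the same rounding policy. */
static double dot(const double *x, const double *y, int n)
{
    double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc = mul_add(x[i], y[i], acc);
    return acc;
}
```

With a helper like this, a build for hardware without fma and a build for
hardware with fma each take one well-defined path, which is the property the
advice above is after.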