From: "jhllawrence963 at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/106902] New: Program compiled with -O3 -fmfa produces different result
Date: Sat, 10 Sep 2022 18:58:12 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902

            Bug ID: 106902
           Summary: Program compiled with -O3 -fmfa produces different result
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jhllawrence963 at gmail dot com
  Target Milestone: ---

Created attachment 53560 -->
https://gcc.gnu.org/bugzilla/attachment.cgi?id=53560&action=edit
Sample C++ program

Compiling the attached sample program with g++ -mfma -O3 and executing it
produces the wrong output starting with GCC version 11.1. The expected output
is approximately 0.905017, but the actual output is -415762. GCC 10.4 and
earlier work as expected. Compiling with other optimization flags, or with
-mno-fma, works as expected too.

About the program:
It starts with an array of 1s, performs a local average for each element, then
prints one result from the middle of the array. The algorithm has been reduced
to remove code that is not needed to reproduce the bug, which is why the
expected output is not exactly 1. The sample still contains some code that is
not relevant to the bug, but removing it makes the bug no longer reproducible.
The relevant parts have been commented with "FIXME". I'm not 100% certain, but
there appears to be some loss of precision, which gets compounded because the
result of one loop iteration is used as an input to the following iterations.
The program output becomes more incorrect as the input array size increases.

GCC Version:
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/12.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --enable-bootstrap --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --with-build-config=bootstrap-lto --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-werror
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.2.0 (GCC)