From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/105453] load introduced by ce1 for conditional loads at -O1, might cause issues with the C/C++ memory model
Date: Tue, 03 May 2022 06:45:49 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 11.3.0
X-Bugzilla-Keywords: missed-optimization, needs-bisection, wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Changed-Fields: everconfirmed cf_reconfirmed_on cf_known_to_work keywords bug_status

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105453

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-05-03
      Known to work|                            |12.0
           Keywords|                            |missed-optimization,
                   |                            |needs-bisection
             Status|UNCONFIRMED                 |NEW

--- Comment #4 from Richard Biener ---
With gcc 12 we get

_Z6func_1v:
.LFB0:
        .cfi_startproc
        movl    g_6(%rip), %eax
        movl    %eax, g_10(%rip)
        cmpl    $0, g_6+4(%rip)
        movl    $1, %edx
        cmove   %edx, %eax
        ret

the difference is that we somehow un-CSE the g_6 load at RTL expansion with
GCC 11 and then things go downhill later:

;; Generating RTL for gimple basic block 2

;; _1 = g_6[0];

(insn 6 5 7 (set (reg/f:DI 84)
        (symbol_ref:DI ("g_6") [flags 0x2] )) "t.c":7:20 -1
     (nil))

(insn 7 6 0 (set (reg:SI 83 [ ])
        (mem/c:SI (reg/f:DI 84) [1 g_6[0]+0 S4 A64])) "t.c":7:20 -1
     (nil))

;; g_10 = _1;

(insn 8 7 0 (set (mem/c:SI (symbol_ref:DI ("g_10") [flags 0x2] ) [1 g_10+0 S4 A32])
        (reg:SI 83 [ ])) "t.c":7:12 -1
     (nil))

;; if (_2 != 0)

(insn 9 8 10 (set (reg/f:DI 85)
        (symbol_ref:DI ("g_6") [flags 0x2] )) "t.c":8:14 -1
     (nil))

(insn 10 9 11 (set (reg:CCZ 17 flags)
        (compare:CCZ (mem/c:SI (plus:DI (reg/f:DI 85)
                    (const_int 4 [0x4])) [1 g_6[1]+0 S4 A32])
            (const_int 0 [0]))) "t.c":8:5 -1
     (nil))

the same happens with GCC 12.  CSE cleans this up so it's maybe not important
but in GCC 11 forwprop then does

    4: NOTE_INSN_BASIC_BLOCK 2
    2: NOTE_INSN_FUNCTION_BEG
-   6: r84:DI=`g_6'
-   7: r83:SI=[r84:DI]
-      REG_DEAD r84:DI
+   7: r83:SI=[`g_6']
    8: [`g_10']=r83:SI
-   9: r85:DI=`g_6'
-  10: flags:CCZ=cmp([r84:DI+0x4],0)
-      REG_DEAD r85:DI
+  10: flags:CCZ=cmp([const(`g_6'+0x4)],0)

which seems to confuse CE enough to emit the extra load:

    4: NOTE_INSN_BASIC_BLOCK 2
    2: NOTE_INSN_FUNCTION_BEG
    7: r83:SI=[`g_6']
    8: [`g_10']=r83:SI
-  10: flags:CCZ=cmp([const(`g_6'+0x4)],0)
-  11: pc={(flags:CCZ==0)?L24:pc}
-      REG_DEAD flags:CCZ
-      REG_BR_PROB 708669604
-      ; pc falls through to BB 4
-  24: L24:
-  23: NOTE_INSN_BASIC_BLOCK 3
-   3: r83:SI=0x1
-  17: L17:
-  20: NOTE_INSN_BASIC_BLOCK 4
+  25: r88:SI=[`g_6']
+  26: r87:SI=0x1
+  27: flags:CCZ=cmp([const(`g_6'+0x4)],0)
+  28: r83:SI={(flags:CCZ==0)?r87:SI:r88:SI}
   18: ax:SI=r83:SI

disabling fwprop1 produces

+   6: r84:DI=`g_6'
+   7: r83:SI=[r84:DI]
+      REG_DEAD r84:DI
+   8: [`g_10']=r83:SI
+  26: r87:SI=0x1
+  25: flags:CCZ=cmp([r84:DI+0x4],0)
+  27: r83:SI={(flags:CCZ!=0)?r83:SI:r87:SI}
   18: ax:SI=r83:SI

again.

Likely the new SSA based forwprop "fixed" this, but maybe only accidentally.
On trunk fwprop1 does

-   6: r84:DI=`g_6'
-   7: r83:SI=[r84:DI]
-      REG_DEAD r84:DI
+   7: r83:SI=[`g_6']
    8: [`g_10']=r83:SI
-   9: r85:DI=`g_6'
-  10: flags:CCZ=cmp([r84:DI+0x4],0)
-      REG_DEAD r85:DI
+  10: flags:CCZ=cmp([const(`g_6'+0x4)],0)
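
For readers without the original test case at hand, the dumps in this comment can be read back into source form roughly as follows. This is only a sketch: the names g_6, g_10 and func_1 come from the dumps, but the element type, array size, initial values and the helper names func_1_gcc11_ce1 and func_1_no_fwprop1 are assumptions made for illustration, and the actual t.c may look different.

```cpp
// Hypothetical reconstruction; only the names g_6, g_10 and func_1 come from
// the dumps. Types, array size and initial values are assumptions.
int g_6[2] = {7, 0};
int g_10;

// Source-level shape implied by the GIMPLE comments in the expand dump.
int func_1()
{
    int v = g_6[0];      // _1 = g_6[0];
    g_10 = v;            // g_10 = _1;
    if (g_6[1] == 0)     // insn 11 jumps to L24 when the compare is zero
        v = 1;           // insn 3: r83 = 1
    return v;
}

// Rough C++ rendering of the GCC 11 ce1 result (insns 25-28): g_6[0] is read
// a second time instead of reusing the value already held in r83.
int func_1_gcc11_ce1()
{
    int r83 = g_6[0];
    g_10 = r83;
    int r88 = g_6[0];    // the extra load (insn 25)
    int r87 = 1;
    r83 = (g_6[1] == 0) ? r87 : r88;
    return r83;
}

// Rough rendering of the result with fwprop1 disabled: no second load,
// the conditional move reuses r83.
int func_1_no_fwprop1()
{
    int r83 = g_6[0];
    g_10 = r83;
    int r87 = 1;
    r83 = (g_6[1] != 0) ? r83 : r87;
    return r83;
}
```

In a single thread all three functions return the same value; the only difference is how many times g_6[0] is read, which is the point the bug title raises with respect to the C/C++ memory model.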