From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 87B7D385841A; Fri, 26 Nov 2021 00:30:07 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 87B7D385841A From: "wilson at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug target/103302] wrong code with -fharden-compares Date: Fri, 26 Nov 2021 00:30:07 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: target X-Bugzilla-Version: 12.0 X-Bugzilla-Keywords: wrong-code X-Bugzilla-Severity: normal X-Bugzilla-Who: wilson at gcc dot gnu.org X-Bugzilla-Status: ASSIGNED X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: aoliva at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: cc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: gcc-bugs@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-bugs mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Nov 2021 00:30:07 -0000 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D103302 Jim Wilson changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |wilson at gcc dot gnu.org --- Comment #2 from Jim Wilson --- It is the second reversed comparison that is wrong. This is the u32_0 <=3D (...) on the first line of foo0. In the assembly file, this ends up as mv a0,a6 mv a1,a7 xor a6,a0,a6 xor a7,a1,a7 or a6,a6,a7 seqz a6,a6 and note that it is comparing a value against itself when it should be comparing two different values. The harden compare pass is generating RTL insn 156 152 155 6 (set (reg:TI 201) (asm_operands:TI ("") ("=3Dg") 0 [ (reg:TI 77 [ _8 ]) ] [ (asm_input:TI ("0")) ] [])) -1 (nil)) (insn 155 156 153 6 (clobber (reg:TI 77 [ _8 ])) -1 (nil)) (insn 153 155 154 6 (set (subreg:DI (reg:TI 77 [ _8 ]) 0) (subreg:DI (reg:TI 201) 0)) -1 (nil)) (insn 154 153 160 6 (set (subreg:DI (reg:TI 77 [ _8 ]) 8) (subreg:DI (reg:TI 201) 8)) -1 (nil)) Then the asmcons pass is changing this to (insn 851 152 849 5 (clobber (reg:TI 201)) -1 (nil)) (insn 849 851 850 5 (set (subreg:DI (reg:TI 201) 0) (subreg:DI (reg:TI 77 [ _8 ]) 0)) -1 (nil)) (insn 850 849 156 5 (set (subreg:DI (reg:TI 201) 8) (subreg:DI (reg:TI 77 [ _8 ]) 8)) -1 (nil)) (insn 156 850 155 5 (set (reg:TI 201) (asm_operands:TI ("") ("=3Dg") 0 [ (reg:TI 201) ] [ (asm_input:TI ("0")) ] [])) -1 (expr_list:REG_DEAD (reg:TI 77 [ _8 ]) (nil))) (insn 155 156 153 5 (clobber (reg:TI 77 [ _8 ])) -1 (nil)) (insn 153 155 154 5 (set (subreg:DI (reg:TI 77 [ _8 ]) 0) (subreg:DI (reg:TI 201) 0)) 135 {*movdi_64bit} (nil)) (insn 154 153 854 5 (set (subreg:DI (reg:TI 77 [ _8 ]) 8) (subreg:DI (reg:TI 201) 8)) 135 {*movdi_64bit} (expr_list:REG_DEAD (reg:TI 201) (nil))) Then the register allocator puts both 77 and 201 in the same register, which means we are now clobbering values we need. In the reload dump I see (insn 851 152 849 5 (clobber (reg:TI 16 a6 [201])) -1 (nil)) (insn 849 851 850 5 (set (reg:DI 16 a6 [201]) (reg:DI 10 a0 [orig:77 _8 ] [77])) 135 {*movdi_64bit} (nil)) (insn 850 849 907 5 (set (reg:DI 17 a7 [+8 ]) (reg:DI 11 a1 [ _8+8 ])) 135 {*movdi_64bit} (nil)) (insn 907 850 1014 5 (clobber (reg:TI 16 a6 [201])) -1 (nil)) so the insns 849 and 850 get optimized away, but we need them. Also, we ha= ve (insn 854 154 852 5 (clobber (reg:TI 16 a6 [202])) -1 (nil)) (insn 852 854 853 5 (set (reg:DI 16 a6 [202]) (reg:DI 6 t1 [orig:86 _39 ] [86])) 135 {*movdi_64bit} (nil)) (insn 853 852 913 5 (set (reg:DI 17 a7 [+8 ]) (reg:DI 7 t2 [ _39+8 ])) 135 {*movdi_64bit} (nil)) (insn 913 853 1010 5 (clobber (reg:TI 16 a6 [202])) -1 (nil)) and the insns 852 and 853 get optimized away, but we need them. The compar= ison is supposed to be a0/a1 versus t1/t2, but we end up with comparing a6/a7 against itself. asmcons is calling emit_move_insn to copy the asm source to the asm dest so= it can simplify the asm. Since this is a multiword mode, and the riscv backend doesn't have a movti pattern, this ends up calling emit_move_multi_word whi= ch emits the extra clobber that causes the problem. I suppose we could fix this by adding a movti pattern to the riscv backend = to avoid the clobbers but we shouldn't have to. Though it would be interestin= g to see if this maybe results in better code optimization. It isn't clear exactly where the problem is. Maybe asmcons shouldn't try to fix an asm when the mode is larger than the word mode? This could be left = to the register allocator to fix. Or maybe harden compare shouldn't generate = RTL like this? This could be a harden compare issue, or maybe an issue with the RTL expander to emit the rtl differently. Looks like the same issue with t= he RTL expander calling emit_move_multi_word which generates the clobber. Or maybe a movti pattern is actually required now? I did verify that disabling asmcons fixes the problem for this testcase. I= had to hack the code in function.c to do that as there is no option to disable = it.=