From mboxrd@z Thu Jan 1 00:00:00 1970
Received: by sourceware.org (Postfix, from userid 48)
	id CADC0385AC2A; Tue, 30 Nov 2021 05:28:03 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org CADC0385AC2A
From: "luoxhu at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/102239] powerpc suboptimal boolean test of contiguous bits
Date: Tue, 30 Nov 2021 05:28:03 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 11.2.1
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: luoxhu at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
X-List-Received-Date: Tue, 30 Nov 2021 05:28:03 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102239

--- Comment #9 from luoxhu at gcc dot gnu.org ---
(In reply to Segher Boessenkool from comment #8)
> (In reply to luoxhu from comment #6)
> > > > foo:
> > > > .LFB0:
> > > >         .cfi_startproc
> > > >         rldicr. 3,3,29,1
> > > >         beq 0,.L2
> > >
> > > This is fine, but only because it tests the EQ bit (not the LT or GT bits).
> > > So the generated RTL for this insn (the 2insn one) is not correct.
> >
> > The generated RTL in pr102239.c.300r.split2 is:
> >
> > (insn 32 8 33 2 (parallel [
> >             (set (reg:CC 100 0 [123])
> >                 (compare:CC (and:DI (ashift:DI (reg:DI 3 3 [124])
> >                             (const_int 29 [0x1d]))
> >                         (const_int -4611686018427387904 [0xc000000000000000]))
> >                     (const_int 0 [0])))
> >             (clobber (reg:DI 3 3 [125]))
> >         ]) "pr102239.c":4:6 238 {*rotldi3_mask_dot}
> >      (nil))
> > (insn 33 32 10 2 (set (reg:DI 3 3 [125])
> >         (lshiftrt:DI (reg:DI 3 3 [125])
> >             (const_int 29 [0x1d]))) "pr102239.c":4:6 278 {lshrdi3}
> >      (nil))
> > (jump_insn 10 33 11 2 (set (pc)
> >         (if_then_else (eq (reg:CC 100 0 [123])
> >                 (const_int 0 [0]))
> >             (label_ref 15)
> >             (pc))) "pr102239.c":4:6 868 {*cbranch}
> >      (int_list:REG_BR_PROB 536870916 (nil))
> >  -> 15)
>
> So combine will have to look at insn 10 as well when it does the combination
> (it often already does, via "other_insn") -- but also it does have to know
> an "eq" is okay here, and that requires a new pattern.
>
> > rotldi3_mask_dot is what you mentioned in c#1, it is a shifted result and
> > not matter for comparing to 0:
>
> It does matter, if what you are want to see is if it is smaller than zero or
> greater than zero.  CCmode includes those things.  There is a CCEQmode for
> if only the EQ bit is set correctly.

Got it, thanks.  As in the example in c#7: if the comparison in CCmode is an
LT test, rotating the data to the highest bits can give a negative result and
set LT in CR0, which is unexpected.

>
> > > *rotl3_mask_dot cannot do this either; the base and the dot2 of that
> > > cannot be done, they return a shifted result, but that doesn't matter for
> > > comparing it to 0.  So we should add a specialised version.
> >
> > What specialized version to add?
>
> Some pattern that just does this as an rldicr, as a single insn.  It will
> have to be excluded by the 2insn thing (it is only a single insn itself!),
> and it will have to have comparison mode CCEQ only.
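
To make the CCmode point concrete for myself, here is a small standalone C
sketch.  It is not GCC code: the helper names are made up for illustration,
and the mask and shift constants are simply the ones visible in the RTL dumps
in this comment (bits 33 and 34, i.e. 0x600000000).  It models the LT/EQ/GT
outcome of a signed compare against zero for the original masked value, for
the left-shifted form, and, for comparison, for a right-shifted form:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the tested field is bits 33 and 34, inferred from the RTL.  */
#define FIELD_MASK UINT64_C(0x600000000)

/* Hypothetical helper: -1, 0 or 1, mirroring LT, EQ or GT of a signed
   compare with zero.  The unsigned-to-signed conversion is
   implementation-defined in ISO C; GCC reduces it modulo 2^64.  */
static inline int cr0_sign (uint64_t v)
{
  int64_t s = (int64_t) v;
  return (s > 0) - (s < 0);
}

/* What the source-level test means: compare (x & mask) with zero.  */
static inline int cmp_original (uint64_t x)
{
  return cr0_sign (x & FIELD_MASK);
}

/* Shift left by 29 and mask the top two bits: the field can reach the
   sign bit, so LT/GT may differ from cmp_original.  */
static inline int cmp_shift_left (uint64_t x)
{
  return cr0_sign ((x << 29) & UINT64_C(0xC000000000000000));
}

/* Shift right by 33 and mask the low two bits: never negative.  */
static inline int cmp_shift_right (uint64_t x)
{
  return cr0_sign ((x >> 33) & UINT64_C(3));
}

/* Self-check with bit 34 set: EQ agrees everywhere, LT/GT does not.  */
static inline void check_ccmode_sketch (void)
{
  uint64_t bit34 = UINT64_C(1) << 34;
  assert (cmp_original (bit34) == 1);
  assert (cmp_shift_left (bit34) == -1);
  assert (cmp_shift_right (bit34) == 1);
  assert (cmp_original (0) == 0 && cmp_shift_left (0) == 0
          && cmp_shift_right (0) == 0);
}
```

So in this model all three forms agree on EQ, but only the left-shifted form
can report LT where the original value is positive, which, as far as I
understand it, is why that form would need CCEQ.
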

Motivated by the code clang generates, I tried rotating the data to the LSBs
instead.  That should not suffer from the CCmode issue, right?  Would this be
simpler than the combine & new pattern solution?


diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index c9ce0550df1..d2a5b916b1d 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -11747,11 +11747,11 @@ rs6000_emit_2insn_and (machine_mode mode, rtx *operands, bool expand, int dot)
        }
       else
        {
-         rtx tmp = gen_rtx_ASHIFT (mode, operands[1], GEN_INT (shift));
-         tmp = gen_rtx_AND (mode, tmp, GEN_INT (val << shift));
-         emit_move_insn (operands[0], tmp);
-         tmp = gen_rtx_LSHIFTRT (mode, operands[0], GEN_INT (shift));
+         rtx tmp = gen_rtx_LSHIFTRT (mode, operands[1], GEN_INT (ne));
+         tmp = gen_rtx_AND (mode, tmp, GEN_INT (val >> ne));
          rs6000_emit_dot_insn (operands[0], tmp, dot, dot ? operands[3] : 0);
+         tmp = gen_rtx_ASHIFT (mode, operands[0], GEN_INT (ne));
+         emit_move_insn (operands[0], tmp);
        }
       return;


RTL pr102239.c.300r.split2:

(insn 32 8 33 2 (parallel [
            (set (reg:CC 100 0 [123])
                (compare:CC (and:DI (lshiftrt:DI (reg:DI 3 3 [124])
                            (const_int 33 [0x21]))
                        (const_int 3 [0x3]))
                    (const_int 0 [0])))
            (clobber (reg:DI 3 3 [125]))
        ]) "pr102239.c":4:6 238 {*rotldi3_mask_dot}
     (nil))
(insn 33 32 10 2 (set (reg:DI 3 3 [125])
        (ashift:DI (reg:DI 3 3 [125])
            (const_int 33 [0x21]))) "pr102239.c":4:6 268 {ashldi3}
     (nil))
(jump_insn 10 33 11 2 (set (pc)
        (if_then_else (eq (reg:CC 100 0 [123])
                (const_int 0 [0]))
            (label_ref 15)
            (pc))) "pr102239.c":4:6 868 {*cbranch}
     (int_list:REG_BR_PROB 536870916 (nil))
 -> 15)


ASM pr102239.s:

foo:
.LFB0:
        .cfi_startproc
        rldicl. 3,3,31,62
        beq 0,.L2
#APP
 # 5 "pr102239.c" 1
        # if
 # 0 "" 2
#NO_APP
        blr
        .p2align 4,,15
.L2:
#APP