From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 20AF53858C2C; Tue, 23 Nov 2021 19:44:18 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 20AF53858C2C From: "segher at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug target/93453] PPC: rldimi not taken into account to avoid shift+or Date: Tue, 23 Nov 2021 19:44:17 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: target X-Bugzilla-Version: 8.3.0 X-Bugzilla-Keywords: missed-optimization X-Bugzilla-Severity: normal X-Bugzilla-Who: segher at gcc dot gnu.org X-Bugzilla-Status: NEW X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: gcc-bugs@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-bugs mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Nov 2021 19:44:18 -0000 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D93453 --- Comment #7 from Segher Boessenkool --- (In reply to HaoChen Gui from comment #6) > Yes, I found that the nonzero_bits doesn't return exact value in other > pass. It returns a different value. Neither is "exact". The version used by combine uses what combine keeps track of in reg_stat. = What that records depends on the order combine tries its combinations, and on wh= ich succeed as well: it is suboptimal. The version used outside of combine does not keep track of known values at = all. This is worse. > So calling nonzero_bits in md file is bad as it can't be recognized in > other pass.=20 An insn that matches in one pass is required to match in the next as well, = yes. This does not mean you cannot use nonzero_bits at all, but yes it is close. > Right now I want to convert a single pseudo to the pseudo AND with a ma= sk > in combine pass if its nonzero_bits is less than its mode mask and the ou= ter > operation is plus/ior/xor and its one of inner operation is > ashift/lshiftrt/and. Thus it is possible to match rotate and insert patte= rn. > What's your opinion? Thanks a lot. >=20 > (ior:DI (ashift:DI (reg:DI 125) > (const_int 32 [0x20])) > (reg:DI 126))) >=20 > is converted to >=20 > (ior:DI (ashift:DI (reg:DI 125) > (const_int 32 [0x20])) > (and:DI (reg:DI 126) > (const_int 4294967295 [0xfffffff])))) In this example it isn't invalid RTL. > diff --git a/gcc/combine.c b/gcc/combine.c > index 892c834a160..8b72a5ec831 100644 > --- a/gcc/combine.c > +++ b/gcc/combine.c > @@ -11539,6 +11539,26 @@ change_zero_ext (rtx pat) > return changed; > } >=20 > +/* Convert a psuedo to psuedo AND with a mask if its nonzero_bits is less > + than its mode mask. */ > +static bool > +pseudo_and_with_mask (rtx *reg) > +{ > + bool changed =3D false; > + gcc_assert (REG_P (*reg)); > + > + machine_mode mode =3D GET_MODE (*reg); > + unsigned HOST_WIDE_INT nonzero =3D nonzero_bits (*reg, mode); > + if (nonzero < GET_MODE_MASK (mode)) > + { > + rtx x =3D gen_rtx_AND (mode, *reg, GEN_INT (nonzero)); > + SUBST (*reg, x); > + changed =3D true; > + //fprintf (stdout, "PIX optimization\n"); > + } > + return changed; > +} > + > /* Like recog, but we receive the address of a pointer to a new pattern. > We try to match the rtx that the pointer points to. > If that fails, we may try to modify or replace the pattern, > @@ -11565,9 +11585,34 @@ recog_for_combine (rtx *pnewpat, rtx_insn *insn, > rtx *pnotes) >=20 > void *marker =3D get_undo_marker (); > bool changed =3D false; > + //bool PIX_opt =3D false; >=20 > if (GET_CODE (pat) =3D=3D SET) > - changed =3D change_zero_ext (pat); > + { > + rtx src =3D SET_SRC (pat); > + if ((GET_CODE (src) =3D=3D IOR > + || GET_CODE (src) =3D=3D XOR > + || GET_CODE (src) =3D=3D PLUS) > + && (((GET_CODE (XEXP (src, 0)) =3D=3D ASHIFT > + || GET_CODE (XEXP (src, 0)) =3D=3D LSHIFTRT > + || GET_CODE (XEXP (src, 0)) =3D=3D AND) > + && REG_P (XEXP (src, 1))) > + || ((GET_CODE (XEXP (src, 1)) =3D=3D ASHIFT > + || GET_CODE (XEXP (src, 1)) =3D=3D LSHIFTRT > + || GET_CODE (XEXP (src, 1)) =3D=3D AND) > + && REG_P (XEXP (src, 0))))) > + { > + changed =3D REG_P (XEXP (src, 0)) > + ? pseudo_and_with_mask (&XEXP (SET_SRC (pat), 0)) > + : pseudo_and_with_mask (&XEXP (SET_SRC (pat), 1)); > + if (changed) > + { > + maybe_swap_commutative_operands (SET_SRC (pat)); > + //PIX_opt =3D true; > + } > + } > + changed |=3D change_zero_ext (pat); > + } > else if (GET_CODE (pat) =3D=3D PARALLEL) > { > int i; > @@ -11585,6 +11630,8 @@ recog_for_combine (rtx *pnewpat, rtx_insn *insn, = rtx > *pnotes) >=20 > if (insn_code_number < 0) > undo_to_marker (marker); > + //else if (PIX_opt) > + //fprintf (stdout, "PIX applied\n"); > } >=20 > return insn_code_number; I don't like this at all. Maybe it will look better if you split this out to a new helper like change_zero_ext? That will make it clearer if and how these change things interact: they should not make us try out an exponential number of things (you don't cause that here, good), but also they should not negatively affect each other, and that isn't clear at all as written. Another problem is this is kind of giving up and trying to solve a problem in combine itself with this band-aid, while the existing change_* stuff is battling a problem in the design of RTL itself. That can be solved by clarifying things in a comment of course :-)=