From: Uros Bizjak <ubizjak@gmail.com>
To: Roger Sayle <roger@nextmovesoftware.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [x86 PATCH take #2] Double word logical operation clean-ups in i386.md.
Date: Thu, 30 Jun 2022 14:01:02 +0200
Message-ID: <CAFULd4Ywc13zcWTb435A20Sf4AuU6XXN8M4p+T8jb_tEw7kqHg@mail.gmail.com>
In-Reply-To: <006b01d88c70$0dbc7530$29355f90$@nextmovesoftware.com>

On Thu, Jun 30, 2022 at 12:56 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> Hi Uros,
> Many thanks for your review of the "double word logical operation clean-up" patch.
> The revision below incorporates the majority of your feedback, but with one or two
> exceptions (required to allow the patch to bootstrap) that I thought I'd double check
> with you before pushing.
>
> Firstly, great catch that we no longer need to test rtx_equal (operands[0], operands[1])
> when moving a splitter from before reload to after reload, as this is guaranteed by the
> "0" constraints.  I've cleaned this up in all the doubleword splitters (including the
> <any_or> case that's now moved).  Also, as you've suggested, this patch uses
> a pair of define_insn_and_split for ANDN, one for TARGET_BMI (split post-reload)
> and the other for !TARGET_BMI (that's lowered rather than split, pre-reload/post-STV).
>
> Unfortunately, the "force_reg of tricky immediate constants" checks really are
> required for these expanders.  I agree normally the predicate is checked/guaranteed
> for a define_insn, but in this case the gen_iordi3 function and related expanders are
> frequently called directly by the middle-end or from i386-expand, which bypasses
> the checks made by the later RTL passes.  When given arbitrary immediate constants,
> this results in ICEs from insns not matching their predicates soon after expand
> (breaking bootstrap with an ICE).  It's only "standard name" expanders that require
> this treatment, define_insn{_and_split} templates do enforce their predicates.
>
> And finally, we can't/shouldn't use <general_szext_operand> in the actual
> doubleword splitters, as the mode being iterated over is DWIH (not DWI),
> where we require the predicate for the corresponding <DWI> mode.  It turns
> out that it's always appropriate to use x86_64_hilo_general_operand wherever
> we use the "r<di>" constraint, and that's used consistently in this patch.
>
> I hope these exceptions are acceptable.  The attached revised patch has
> been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check
> both with and without --target_board=unix{-m32} with no new failures.
> Are these revisions OK for mainline?

Thanks for your explanation of the particularities of the patch!

Yes, the patch is OK.

Thanks,
Uros.

>
> 2022-06-30  Roger Sayle  <roger@nextmovesoftware.com>
>             Uroš Bizjak  <ubizjak@gmail.com>
>
> gcc/ChangeLog
>         * config/i386/i386.md (general_szext_operand): Add TImode
>         support using x86_64_hilo_general_operand predicate.
>         (*cmp<dwi>_doubleword): Use x86_64_hilo_general_operand predicate.
>         (*add<dwi>3_doubleword): Improved optimization of zero addition.
>         (and<mode>3): Use SDWIM mode iterator to add support for double
>         word bit-wise AND in TImode.  Use force_reg when double word
>         immediate operand isn't x86_64_hilo_general_operand.
>         (and<dwi>3_doubleword): Generalized from anddi3_doubleword and
>         converted into a post-reload splitter.
>         (*andndi3_doubleword): Old define_insn deleted.
>         (*andn<mode>3_doubleword_bmi): New define_insn_and_split for
>         TARGET_BMI that splits post-reload.
>         (*andn<mode>3_doubleword): New define_insn_and_split for
>         !TARGET_BMI, that lowers/splits before reload.
>         (<any_or><mode>3): Use SDWIM mode iterator to add support for
>         double word bit-wise XOR and bit-wise IOR in TImode.  Use
>         force_reg when double word immediate operand isn't
>         x86_64_hilo_general_operand.
>         (*<any_or>di3_doubleword): Generalized from <any_or>di3_doubleword.
>         (one_cmpl<mode>2): Use SDWIM mode iterator to add support for
>         double word bit-wise NOT in TImode.
>         (one_cmpl<dwi>2_doubleword): Generalized from one_cmpldi2_doubleword
>         and converted into a post-reload splitter.
>
>
> Thanks again,
> Roger
> --
>
> > -----Original Message-----
> > From: Uros Bizjak <ubizjak@gmail.com>
> > Sent: 28 June 2022 16:38
> > To: Roger Sayle <roger@nextmovesoftware.com>
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [x86 PATCH] Double word logical operation clean-ups in i386.md.
> >
> > On Tue, Jun 28, 2022 at 1:34 PM Roger Sayle <roger@nextmovesoftware.com>
> > wrote:
> > >
> > >
> > > Hi Uros,
> > > As you've requested/suggested, here's a patch that tidies up and
> > > unifies doubleword handling in i386.md; converting all doubleword
> > > splitters for logic operations to post-reload form, generalizing their
> > > define_insn_and_split templates to <dwi> form (supporting TARGET_64BIT
> > > ? TImode : DImode), and where required tweaking the corresponding
> > > expanders to use SDWIM to support TImode doubleword operations.  These
> > > changes incorporate your feedback from
> > > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596205.html
> > > where I included many/several of these clean-ups, in a patch to add a
> > > new optimization.  I agree, it's better to split these out (this
> > > patch), and I'll resubmit the (smaller) optimization patch as a
> > > follow-up.
> > >
> > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > > and make -k check, both with and without --target_board=unix{-m32},
> > > with no new failures.  Ok for mainline?
> > >
> > >
> > > 2022-06-28  Roger Sayle  <roger@nextmovesoftware.com>
> > >
> > > gcc/ChangeLog
> > >         * config/i386/i386.md (general_szext_operand): Add TImode
> > >         support using x86_64_hilo_general_operand predicate.
> > >         (*cmp<dwi>_doubleword): Use x86_64_hilo_general_operand predicate.
> > >         (*add<dwi>3_doubleword): Improved optimization of zero addition.
> > >         (and<mode>3): Use SDWIM mode iterator to add support for double
> > >         word bit-wise AND in TImode.  Use force_reg when double word
> > >         immediate operand isn't x86_64_hilo_general_operand.
> > >         (and<dwi>3_doubleword): Generalized from anddi3_doubleword and
> > >         converted into a post-reload splitter.
> > >         (*andn<mode>3_doubleword): Generalized from *andndi3_doubleword.
> > >         (define_split): Generalize DImode splitters for andn to <DWI>.
> > >         One splitter for TARGET_BMI, the other for !TARGET_BMI.
> > >         (<any_or><mode>3): Use SDWIM mode iterator to add support for
> > >         double word bit-wise XOR and bit-wise IOR in TImode.  Use
> > >         force_reg when double word immediate operand isn't
> > >         x86_64_hilo_general_operand.
> > >         (*<any_or>di3_doubleword): Generalized from <any_or>di3_doubleword.
> > >         (one_cmpl<mode>2): Use SDWIM mode iterator to add support for
> > >         double word bit-wise NOT in TImode.
> > >         (one_cmpl<dwi>2_doubleword): Generalized from one_cmpldi2_doubleword
> > >         and converted into a post-reload splitter.
> >
> >
> >  (define_expand "and<mode>3"
> > -  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
> > -    (and:SWIM1248x (match_operand:SWIM1248x 1 "nonimmediate_operand")
> > -               (match_operand:SWIM1248x 2 "<general_szext_operand>")))]
> > +  [(set (match_operand:SDWIM 0 "nonimmediate_operand")
> > +    (and:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")
> > +           (match_operand:SDWIM 2 "<general_szext_operand>")))]
> >    ""
> >  {
> >    machine_mode mode = <MODE>mode;
> >
> > -  if (<MODE>mode == DImode && !TARGET_64BIT)
> > -    ;
> > -  else if (const_int_operand (operands[2], <MODE>mode)
> > -       && register_operand (operands[0], <MODE>mode)
> > -       && !(TARGET_ZERO_EXTEND_WITH_AND
> > -        && optimize_function_for_speed_p (cfun)))
> > +  if (GET_MODE_SIZE (<MODE>mode) > UNITS_PER_WORD
> > +      && !x86_64_hilo_general_operand (operands[2], <MODE>mode))
> > +    operands[2] = force_reg (<MODE>mode, operands[2]);
> >
> > You don't have to do that - when the predicate can't be satisfied, the middle-end
> > pushes the value to a register as a last resort by default.
> >
> > +  bool emit_insn_deleted_note_p = false;
> > +
> > +  split_double_mode (<DWI>mode, &operands[0], 3, &operands[0], &operands[3]);
> >
> >    if (operands[2] == const0_rtx)
> >      emit_move_insn (operands[0], const0_rtx);
> >    else if (operands[2] == constm1_rtx)
> > -    emit_move_insn (operands[0], operands[1]);
> > +    {
> > +      if (!rtx_equal_p (operands[0], operands[1]))
> > +    emit_move_insn (operands[0], operands[1]);
> > +      else
> > +    emit_insn_deleted_note_p = true;
> > +    }
> >
> > Please note that when operands[2] is an immediate, constraints after reload
> > *guarantee* that operands[1] matches operands[0]. So, the insn should always be
> > deleted (I think that this functionality was in your <any_or> patch - it is
> > unneeded there, too).
> >
> > +(define_insn "*andn<mode>3_doubleword"
> > +  [(set (match_operand:DWI 0 "register_operand")
> > +    (and:DWI
> > +      (not:DWI (match_operand:DWI 1 "register_operand"))
> > +      (match_operand:DWI 2 "nonimmediate_operand")))
> >     (clobber (reg:CC FLAGS_REG))]
> > -  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2
> > -   && ix86_pre_reload_split ()"
> > +  "ix86_pre_reload_split ()"
> >    "#")
> >
> > Please introduce two ANDN double-word insn-and-split patterns, one for BMI
> > and one for !BMI. The one for BMI should be moved to a post-reload splitter,
> > too. As we figured out, *all* double-word patterns should either be of pre-
> > reload or of post-reload type.
> >
> >  (define_split
> > -  [(set (match_operand:DI 0 "register_operand")
> > -    (and:DI
> > -      (not:DI (match_operand:DI 1 "register_operand"))
> > -      (match_operand:DI 2 "nonimmediate_operand")))
> > +  [(set (match_operand:DWI 0 "register_operand")
> > +    (and:DWI
> > +      (not:DWI (match_operand:DWI 1 "register_operand"))
> > +      (match_operand:DWI 2 "nonimmediate_operand")))
> >     (clobber (reg:CC FLAGS_REG))]
> > -  "!TARGET_64BIT && !TARGET_BMI && TARGET_STV && TARGET_SSE2
> > +  "!TARGET_BMI
> >
> > Without BMI, the ANDN should be split to a double-word NOT + AND before
> > reload (and these two insns are split to single-word operations after reload).
> > This simplifies splitting logic quite a bit.
> >
> >  (define_expand "<code><mode>3"
> > -  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
> > -    (any_or:SWIM1248x (match_operand:SWIM1248x 1 "nonimmediate_operand")
> > -              (match_operand:SWIM1248x 2 "<general_operand>")))]
> > +  [(set (match_operand:SDWIM 0 "nonimmediate_operand")
> > +    (any_or:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")
> > +              (match_operand:SDWIM 2 "<general_operand>")))]
> >
> > Use <general_szext_operand> here ...
> >
> >    ""
> > -  "ix86_expand_binary_operator (<CODE>, <MODE>mode, operands); DONE;")
> > +{
> >
> > -(define_insn_and_split "*<code>di3_doubleword"
> > -  [(set (match_operand:DI 0 "nonimmediate_operand" "=ro,r")
> > -    (any_or:DI
> > -     (match_operand:DI 1 "nonimmediate_operand" "0,0")
> > -     (match_operand:DI 2 "x86_64_szext_general_operand" "re,o")))
> > +  if (GET_MODE_SIZE (<MODE>mode) > UNITS_PER_WORD
> > +      && !x86_64_hilo_general_operand (operands[2], <MODE>mode))
> > +    operands[2] = force_reg (<MODE>mode, operands[2]);
> >
> > ... to avoid the above fixup.
> >
> > +(define_insn_and_split "*<code><mode>3_doubleword"
> > +  [(set (match_operand:<DWI> 0 "nonimmediate_operand" "=ro,r")
> > +    (any_or:<DWI>
> > +     (match_operand:<DWI> 1 "nonimmediate_operand" "%0,0")
> > +     (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "r<di>,o")))
> >
> > <general_szext_operand> for consistency.
> >
> > Otherwise OK.
> >
> > Uros.
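
For readers following the thread, here is a rough illustration of the point
about post-reload doubleword AND splitting.  It is a stand-alone C model, not
GCC code: the names dword, split_result and and_doubleword_const are
hypothetical, and a 32-bit word is assumed (the !TARGET_64BIT, DImode case).
It only mirrors the idea that, once the "0" constraint ties the destination to
the first source, an all-ones immediate half needs no instruction at all and
an all-zeros half becomes a plain move of zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: a double-word value as two 32-bit halves. */
typedef struct { uint32_t lo, hi; } dword;

/* Result value plus a count of the word-sized insns the split would emit. */
typedef struct { dword value; int and_insns; int move_insns; } split_result;

/* AND one half with an immediate.  The destination is assumed to already
   hold the source (what the "0" constraint guarantees after reload).  */
static uint32_t and_word_const(uint32_t dst_src, uint32_t imm, split_result *r)
{
  assert(r != NULL);
  if (imm == 0) {               /* x & 0 == 0: emit a move of zero */
    r->move_insns++;
    return 0;
  }
  if (imm == UINT32_MAX)        /* x & -1 == x: dst == src, emit nothing */
    return dst_src;
  r->and_insns++;               /* general case: one word-sized AND */
  return dst_src & imm;
}

/* Split a double-word AND with a 64-bit immediate into two half operations. */
split_result and_doubleword_const(dword x, uint64_t imm)
{
  split_result r = { { 0, 0 }, 0, 0 };
  r.value.lo = and_word_const(x.lo, (uint32_t) imm, &r);
  r.value.hi = and_word_const(x.hi, (uint32_t) (imm >> 32), &r);
  return r;
}
```

In this model, a mask whose high half is all ones costs a single word-sized
AND, which is why no rtx_equal_p test is needed in the constm1_rtx branch.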
