public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [x86 PATCH take #2] Double word logical operation clean-ups in i386.md.
@ 2022-06-30 10:56 Roger Sayle
  2022-06-30 12:01 ` Uros Bizjak
  0 siblings, 1 reply; 2+ messages in thread
From: Roger Sayle @ 2022-06-30 10:56 UTC (permalink / raw)
  To: 'Uros Bizjak'; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 11349 bytes --]


Hi Uros,
Many thanks for your review of the "double word logical operation clean-up" patch.
The revision below incorporates the majority of your feedback, but with one or two
exceptions (required to allow the patch to bootstrap) that I thought I'd double check
with you before pushing.

Firstly, great catch that we no longer need to test rtx_equal (operands[0], operands[1])
when moving a splitter from before reload to after reload, as this is guaranteed by the
"0" constraints.  I've cleaned this up in all the doubleword splitters (including the
<any_or> case that's now moved).  Also, as you've suggested, this patch uses
a pair of define_insn_and_split for ANDN, one for TARGET_BMI (split post-reload)
and the other for !TARGET_BMI (that's lowered rather than split, pre-load/post-STV).

Unfortunately, the "force_reg of tricky immediate constants" checks really are
required for these expanders.  I agree normally the predicate is checked/guaranteed
for a define_insn, but in this case the gen_iordi3 function and related expanders are
frequently called directly by the middle-end or from i386-expand, which bypasses
the checks made by the later RTL passes.  When given arbitrary immediate constants,
this results in ICEs from insns not matching their predicates soon after expand
(breaking bootstrap with an ICE).  It's only "standard name" expanders that require
this treatment, define_insn{_and_split} templates do enforce their predicates.

And finally, we can't/shouldn't use <general_szext_operand> in the actual
doubleword splitters, as the mode being iterated over is DWIH (not DWI),
where we require the predicate for the corresponding <DWI> mode.  It turns
out that it's always appropriate to use x86_64_hilo_general_operand wherever
we use the "r<di>" constraint, and that's used consistently in this patch.

I hope these exceptions are acceptable.  The attached revised patch has
been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check
both with and with --target_board=unix{-m32} with no new failures.
Are these revisions OK for mainline?

2022-06-30  Roger Sayle  <roger@nextmovesoftware.com>
            Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog
        * config/i386/i386.md (general_szext_operand): Add TImode
        support using x86_64_hilo_general_operand predicate.
        (*cmp<dwi>_doubleword): Use x86_64_hilo_general_operand predicate.
        (*add<dwi>3_doubleword): Improved optimization of zero addition.
        (and<mode>3): Use SDWIM mode iterator to add support for double
        word bit-wise AND in TImode.  Use force_reg when double word
        immediate operand isn't x86_64_hilo_general_operand.
        (and<dwi>3_doubleword): Generalized from anddi3_doubleword and
        converted into a post-reload splitter.
        (*andndi3_doubleword): Old define_insn deleted.
        (*andn<mode>3_doubleword_bmi): New define_insn_and_split for
        TARGET_BMI that splits post-reload.
        (*andn<mode>3_doubleword): New define_insn_and_split for
        !TARGET_BMI, that lowers/splits before reload.
        (<any_or><mode>3): Use SDWIM mode iterator to add suppport for
        double word bit-wise XOR and bit-wise IOR in TImode.  Use
        force_reg when double word immediate operand isn't
        x86_64_hilo_general_operand.
        (*<any_or>di3_doubleword): Generalized from <any_or>di3_doubleword.
        (one_cmpl<mode>2): Use SDWIM mode iterator to add support for
        double word bit-wise NOT in TImode.
        (one_cmpl<dwi>2_doubleword): Generalize from one_cmpldi2_doubleword
        and converted into a post-reload splitter.


Thanks again,
Roger
--

> -----Original Message-----
> From: Uros Bizjak <ubizjak@gmail.com>
> Sent: 28 June 2022 16:38
> To: Roger Sayle <roger@nextmovesoftware.com>
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [x86 PATCH] Double word logical operation clean-ups in i386.md.
> 
> On Tue, Jun 28, 2022 at 1:34 PM Roger Sayle <roger@nextmovesoftware.com>
> wrote:
> >
> >
> > Hi Uros,
> > As you've requested/suggested, here's a patch that tidies up and
> > unifies doubleword handling in i386.md; converting all doubleword
> > splitters for logic operations to post-reload form, generalizing their
> > define_insn_and_split templates to <dwi> form (supporting TARGET_64BIT
> > ? TImode : DImode), and where required tweaking the corresponding
> > expanders to use SDWIM to support TImode doubleword operations.  These
> > changes incorporate your feedback from
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596205.html
> > where I included many/several of these clean-ups, in a patch to add a
> > new optimization.  I agree, it's better to split these out (this
> > patch), and I'll resubmit the (smaller) optimization patch as a
> > follow-up.
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32},
> > with no new failures.  Ok for mainline?
> >
> >
> > 2022-06-28  Roger Sayle  <roger@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >         * config/i386/i386.md (general_szext_operand): Add TImode
> >         support using x86_64_hilo_general_operand predicate.
> >         (*cmp<dwi>_doubleword): Use x86_64_hilo_general_operand predicate.
> >         (*add<dwi>3_doubleword): Improved optimization of zero addition.
> >         (and<mode>3): Use SDWIM mode iterator to add support for double
> >         word bit-wise AND in TImode.  Use force_reg when double word
> >         immediate operand isn't x86_64_hilo_general_operand.
> >         (and<dwi>3_doubleword): Generalized from anddi3_doubleword and
> >         converted into a post-reload splitter.
> >         (*andn<mode>3_doubleword): Generalized from *andndi3_doubleword.
> >         (define_split): Generalize DImode splitters for andn to <DWI>.
> >         One splitter for TARGET_BMI, the other for !TARGET_BMI.
> >         (<any_or><mode>3): Use SDWIM mode iterator to add suppport for
> >         double word bit-wise XOR and bit-wise IOR in TImode.  Use
> >         force_reg when double word immediate operand isn't
> >         x86_64_hilo_general_operand.
> >         (*<any_or>di3_doubleword): Generalized from <any_or>di3_doubleword.
> >         (one_cmpl<mode>2): Use SDWIM mode iterator to add support for
> >         double word bit-wise NOT in TImode.
> >         (one_cmpl<dwi>2_doubleword): Generalize from
> one_cmpldi2_doubleword
> >         and converted into a post-reload splitter.
> 
> 
>  (define_expand "and<mode>3"
> -  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
> -    (and:SWIM1248x (match_operand:SWIM1248x 1 "nonimmediate_operand")
> -               (match_operand:SWIM1248x 2 "<general_szext_operand>")))]
> +  [(set (match_operand:SDWIM 0 "nonimmediate_operand")
> +    (and:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")
> +           (match_operand:SDWIM 2 "<general_szext_operand>")))]
>    ""
>  {
>    machine_mode mode = <MODE>mode;
> 
> -  if (<MODE>mode == DImode && !TARGET_64BIT)
> -    ;
> -  else if (const_int_operand (operands[2], <MODE>mode)
> -       && register_operand (operands[0], <MODE>mode)
> -       && !(TARGET_ZERO_EXTEND_WITH_AND
> -        && optimize_function_for_speed_p (cfun)))
> +  if (GET_MODE_SIZE (<MODE>mode) > UNITS_PER_WORD
> +      && !x86_64_hilo_general_operand (operands[2], <MODE>mode))
> +    operands[2] = force_reg (<MODE>mode, operands[2]);
> 
> You don't have to do that - when the predicate can't be satisfied, the middle-end
> pushes the value to a register as a last resort by default.
> 
> +  bool emit_insn_deleted_note_p = false;
> +
> +  split_double_mode (<DWI>mode, &operands[0], 3, &operands[0],
> + &operands[3]);
> 
>    if (operands[2] == const0_rtx)
>      emit_move_insn (operands[0], const0_rtx);
>    else if (operands[2] == constm1_rtx)
> -    emit_move_insn (operands[0], operands[1]);
> +    {
> +      if (!rtx_equal_p (operands[0], operands[1]))
> +    emit_move_insn (operands[0], operands[1]);
> +      else
> +    emit_insn_deleted_note_p = true;
> +    }
> 
> Please note that when operands[2] is an immediate, constraints after reload
> *guarantee* that operands[1] match operands[0]. So, the insn should always be
> deleted (I think that this functionality was in your <any_or> patch - it is
> unneeded there, too).
> 
> +(define_insn "*andn<mode>3_doubleword"
> +  [(set (match_operand:DWI 0 "register_operand")
> +    (and:DWI
> +      (not:DWI (match_operand:DWI 1 "register_operand"))
> +      (match_operand:DWI 2 "nonimmediate_operand")))
>     (clobber (reg:CC FLAGS_REG))]
> -  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2
> -   && ix86_pre_reload_split ()"
> +  "ix86_pre_reload_split ()"
>    "#")
> 
> Please introduce two ANDN double-word insn-and-split patterns, one for BMI
> and one for !BMI. The one for BMI should be moved to a post-reload splitter,
> too. As we figured out, *all* double-word patterns should either be of pre-
> reload or of post-reload type.
> 
>  (define_split
> -  [(set (match_operand:DI 0 "register_operand")
> -    (and:DI
> -      (not:DI (match_operand:DI 1 "register_operand"))
> -      (match_operand:DI 2 "nonimmediate_operand")))
> +  [(set (match_operand:DWI 0 "register_operand")
> +    (and:DWI
> +      (not:DWI (match_operand:DWI 1 "register_operand"))
> +      (match_operand:DWI 2 "nonimmediate_operand")))
>     (clobber (reg:CC FLAGS_REG))]
> -  "!TARGET_64BIT && !TARGET_BMI && TARGET_STV && TARGET_SSE2
> +  "!TARGET_BMI
> 
> Without BMI, the ANDN should be split to a double-word NOT + AND before
> reload (and these two insns are split to single-word operations after reload).
> This simplifies splitting logic quite a bit.
> 
>  (define_expand "<code><mode>3"
> -  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
> -    (any_or:SWIM1248x (match_operand:SWIM1248x 1
> "nonimmediate_operand")
> -              (match_operand:SWIM1248x 2 "<general_operand>")))]
> +  [(set (match_operand:SDWIM 0 "nonimmediate_operand")
> +    (any_or:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")
> +              (match_operand:SDWIM 2 "<general_operand>")))]
> 
> Use <general_szext_operand> here ...
> 
>    ""
> -  "ix86_expand_binary_operator (<CODE>, <MODE>mode, operands); DONE;")
> +{
> 
> -(define_insn_and_split "*<code>di3_doubleword"
> -  [(set (match_operand:DI 0 "nonimmediate_operand" "=ro,r")
> -    (any_or:DI
> -     (match_operand:DI 1 "nonimmediate_operand" "0,0")
> -     (match_operand:DI 2 "x86_64_szext_general_operand" "re,o")))
> +  if (GET_MODE_SIZE (<MODE>mode) > UNITS_PER_WORD
> +      && !x86_64_hilo_general_operand (operands[2], <MODE>mode))
> +    operands[2] = force_reg (<MODE>mode, operands[2]);
> 
> ... to avoid the above fixup.
> 
> +(define_insn_and_split "*<code><mode>3_doubleword"
> +  [(set (match_operand:<DWI> 0 "nonimmediate_operand" "=ro,r")
> +    (any_or:<DWI>
> +     (match_operand:<DWI> 1 "nonimmediate_operand" "%0,0")
> +     (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "r<di>,o")))
> 
> <general_szext_operand> for consistency.
> 
> Otherwise OK.
> 
> Uros.

[-- Attachment #2: patchdw2.txt --]
[-- Type: text/plain, Size: 12058 bytes --]

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 125a3b4..3947b05 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1192,7 +1192,8 @@
 	[(QI "general_operand")
 	 (HI "general_operand")
 	 (SI "x86_64_szext_general_operand")
-	 (DI "x86_64_szext_general_operand")])
+	 (DI "x86_64_szext_general_operand")
+	 (TI "x86_64_hilo_general_operand")])
 
 (define_mode_attr nonmemory_szext_operand
 	[(QI "nonmemory_operand")
@@ -1509,7 +1510,7 @@
 (define_insn_and_split "*cmp<dwi>_doubleword"
   [(set (reg:CCZ FLAGS_REG)
 	(compare:CCZ (match_operand:<DWI> 0 "nonimmediate_operand")
-		     (match_operand:<DWI> 1 "x86_64_general_operand")))]
+		     (match_operand:<DWI> 1 "x86_64_hilo_general_operand")))]
   "ix86_pre_reload_split ()"
   "#"
   "&& 1"
@@ -5802,7 +5803,12 @@
   split_double_mode (<DWI>mode, &operands[0], 3, &operands[0], &operands[3]);
   if (operands[2] == const0_rtx)
     {
-      ix86_expand_binary_operator (PLUS, <MODE>mode, &operands[3]);
+      if (operands[5] != const0_rtx)
+	ix86_expand_binary_operator (PLUS, <MODE>mode, &operands[3]);
+      else if (!rtx_equal_p (operands[3], operands[4]))
+	emit_move_insn (operands[3], operands[4]);
+      else
+	emit_note (NOTE_INSN_DELETED);
       DONE;
     }
 })
@@ -9846,19 +9852,22 @@
 ;; it should be done with splitters.
 
 (define_expand "and<mode>3"
-  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
-	(and:SWIM1248x (match_operand:SWIM1248x 1 "nonimmediate_operand")
-		       (match_operand:SWIM1248x 2 "<general_szext_operand>")))]
+  [(set (match_operand:SDWIM 0 "nonimmediate_operand")
+	(and:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")
+		   (match_operand:SDWIM 2 "<general_szext_operand>")))]
   ""
 {
   machine_mode mode = <MODE>mode;
 
-  if (<MODE>mode == DImode && !TARGET_64BIT)
-    ;
-  else if (const_int_operand (operands[2], <MODE>mode)
-	   && register_operand (operands[0], <MODE>mode)
-	   && !(TARGET_ZERO_EXTEND_WITH_AND
-		&& optimize_function_for_speed_p (cfun)))
+  if (GET_MODE_SIZE (<MODE>mode) > UNITS_PER_WORD
+      && !x86_64_hilo_general_operand (operands[2], <MODE>mode))
+    operands[2] = force_reg (<MODE>mode, operands[2]);
+
+  if (GET_MODE_SIZE (<MODE>mode) <= UNITS_PER_WORD
+      && const_int_operand (operands[2], <MODE>mode)
+      && register_operand (operands[0], <MODE>mode)
+      && !(TARGET_ZERO_EXTEND_WITH_AND
+	   && optimize_function_for_speed_p (cfun)))
     {
       unsigned HOST_WIDE_INT ival = UINTVAL (operands[2]);
 
@@ -9880,34 +9889,37 @@
   DONE;
 })
 
-(define_insn_and_split "*anddi3_doubleword"
-  [(set (match_operand:DI 0 "nonimmediate_operand")
-	(and:DI
-	 (match_operand:DI 1 "nonimmediate_operand")
-	 (match_operand:DI 2 "x86_64_szext_general_operand")))
+(define_insn_and_split "*and<dwi>3_doubleword"
+  [(set (match_operand:<DWI> 0 "nonimmediate_operand" "=ro,r")
+	(and:<DWI>
+	 (match_operand:<DWI> 1 "nonimmediate_operand" "%0,0")
+	 (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "r<di>,o")))
    (clobber (reg:CC FLAGS_REG))]
-  "!TARGET_64BIT
-   && ix86_binary_operator_ok (AND, DImode, operands)
-   && ix86_pre_reload_split ()"
+  "ix86_binary_operator_ok (AND, <DWI>mode, operands)"
   "#"
-  "&& 1"
-  [(const_int 0)]
+  "&& reload_completed"
+  [(const_int:DWIH 0)]
 {
-  split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);
+  bool emit_insn_deleted_note_p = false;
+
+  split_double_mode (<DWI>mode, &operands[0], 3, &operands[0], &operands[3]);
 
   if (operands[2] == const0_rtx)
     emit_move_insn (operands[0], const0_rtx);
   else if (operands[2] == constm1_rtx)
-    emit_move_insn (operands[0], operands[1]);
+    emit_insn_deleted_note_p = true;
   else
-    emit_insn (gen_andsi3 (operands[0], operands[1], operands[2]));
+    ix86_expand_binary_operator (AND, <MODE>mode, &operands[0]);
 
   if (operands[5] == const0_rtx)
     emit_move_insn (operands[3], const0_rtx);
   else if (operands[5] == constm1_rtx)
-    emit_move_insn (operands[3], operands[4]);
+    {
+      if (emit_insn_deleted_note_p)
+	emit_note (NOTE_INSN_DELETED);
+    }
   else
-    emit_insn (gen_andsi3 (operands[3], operands[4], operands[5]));
+    ix86_expand_binary_operator (AND, <MODE>mode, &operands[3]);
 
   DONE;
 })
@@ -10391,54 +10403,38 @@
   operands[2] = gen_int_mode (INTVAL (operands[2]), QImode);
 })
 
-(define_insn "*andndi3_doubleword"
-  [(set (match_operand:DI 0 "register_operand")
-	(and:DI
-	  (not:DI (match_operand:DI 1 "register_operand"))
-	  (match_operand:DI 2 "nonimmediate_operand")))
-   (clobber (reg:CC FLAGS_REG))]
-  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2
-   && ix86_pre_reload_split ()"
-  "#")
-
-(define_split
-  [(set (match_operand:DI 0 "register_operand")
-	(and:DI
-	  (not:DI (match_operand:DI 1 "register_operand"))
-	  (match_operand:DI 2 "nonimmediate_operand")))
+(define_insn_and_split "*andn<mode>3_doubleword_bmi"
+  [(set (match_operand:<DWI> 0 "register_operand")
+	(and:<DWI>
+	  (not:<DWI> (match_operand:<DWI> 1 "register_operand"))
+	  (match_operand:<DWI> 2 "nonimmediate_operand")))
    (clobber (reg:CC FLAGS_REG))]
-  "!TARGET_64BIT && TARGET_BMI && TARGET_STV && TARGET_SSE2
-   && can_create_pseudo_p ()"
+  "TARGET_BMI"
+  "#"
+  "&& reload_completed"
   [(parallel [(set (match_dup 0)
-		   (and:SI (not:SI (match_dup 1)) (match_dup 2)))
+		   (and:DWIH (not:DWIH (match_dup 1)) (match_dup 2)))
 	      (clobber (reg:CC FLAGS_REG))])
    (parallel [(set (match_dup 3)
-		   (and:SI (not:SI (match_dup 4)) (match_dup 5)))
+		   (and:DWIH (not:DWIH (match_dup 4)) (match_dup 5)))
 	      (clobber (reg:CC FLAGS_REG))])]
-  "split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);")
+  "split_double_mode (<DWI>mode, &operands[0], 3, &operands[0], &operands[3]);")
 
-(define_split
-  [(set (match_operand:DI 0 "register_operand")
-	(and:DI
-	  (not:DI (match_operand:DI 1 "register_operand"))
-	  (match_operand:DI 2 "nonimmediate_operand")))
+(define_insn_and_split "*andn<mode>3_doubleword"
+  [(set (match_operand:DWI 0 "register_operand")
+	(and:DWI
+	  (not:DWI (match_operand:DWI 1 "register_operand"))
+	  (match_operand:DWI 2 "nonimmediate_operand")))
    (clobber (reg:CC FLAGS_REG))]
-  "!TARGET_64BIT && !TARGET_BMI && TARGET_STV && TARGET_SSE2
-   && can_create_pseudo_p ()"
-  [(set (match_dup 6) (not:SI (match_dup 1)))
+  "!TARGET_BMI
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(set (match_dup 3) (not:DWI (match_dup 1)))
    (parallel [(set (match_dup 0)
-		   (and:SI (match_dup 6) (match_dup 2)))
-	      (clobber (reg:CC FLAGS_REG))])
-   (set (match_dup 7) (not:SI (match_dup 4)))
-   (parallel [(set (match_dup 3)
-		   (and:SI (match_dup 7) (match_dup 5)))
+		   (and:DWI (match_dup 3) (match_dup 2)))
 	      (clobber (reg:CC FLAGS_REG))])]
-{
-  operands[6] = gen_reg_rtx (SImode);
-  operands[7] = gen_reg_rtx (SImode);
-
-  split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);
-})
+  "operands[3] = gen_reg_rtx (<MODE>mode);")
 
 (define_insn "*andn<mode>_1"
   [(set (match_operand:SWI48 0 "register_operand" "=r,r,?k")
@@ -10532,52 +10528,51 @@
 ;; If this is considered useful, it should be done with splitters.
 
 (define_expand "<code><mode>3"
-  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
-	(any_or:SWIM1248x (match_operand:SWIM1248x 1 "nonimmediate_operand")
-			  (match_operand:SWIM1248x 2 "<general_operand>")))]
+  [(set (match_operand:SDWIM 0 "nonimmediate_operand")
+	(any_or:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")
+		      (match_operand:SDWIM 2 "<general_operand>")))]
   ""
-  "ix86_expand_binary_operator (<CODE>, <MODE>mode, operands); DONE;")
+{
+  if (GET_MODE_SIZE (<MODE>mode) > UNITS_PER_WORD
+      && !x86_64_hilo_general_operand (operands[2], <MODE>mode))
+    operands[2] = force_reg (<MODE>mode, operands[2]);
 
-(define_insn_and_split "*<code>di3_doubleword"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=ro,r")
-	(any_or:DI
-	 (match_operand:DI 1 "nonimmediate_operand" "0,0")
-	 (match_operand:DI 2 "x86_64_szext_general_operand" "re,o")))
+  ix86_expand_binary_operator (<CODE>, <MODE>mode, operands);
+  DONE;
+})
+
+(define_insn_and_split "*<code><mode>3_doubleword"
+  [(set (match_operand:<DWI> 0 "nonimmediate_operand" "=ro,r")
+	(any_or:<DWI>
+	 (match_operand:<DWI> 1 "nonimmediate_operand" "%0,0")
+	 (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "r<di>,o")))
    (clobber (reg:CC FLAGS_REG))]
-  "!TARGET_64BIT
-   && ix86_binary_operator_ok (<CODE>, DImode, operands)"
+  "ix86_binary_operator_ok (<CODE>, <DWI>mode, operands)"
   "#"
   "&& reload_completed"
-  [(const_int 0)]
+  [(const_int:DWIH 0)]
 {
   /* This insn may disappear completely when operands[2] == const0_rtx
      and operands[0] == operands[1], which requires a NOTE_INSN_DELETED.  */
   bool emit_insn_deleted_note_p = false;
 
-  split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);
+  split_double_mode (<DWI>mode, &operands[0], 3, &operands[0], &operands[3]);
 
   if (operands[2] == const0_rtx)
-    {
-      if (!rtx_equal_p (operands[0], operands[1]))
-	emit_move_insn (operands[0], operands[1]);
-      else
-	emit_insn_deleted_note_p = true;
-    }
+    emit_insn_deleted_note_p = true;
   else if (operands[2] == constm1_rtx)
     {
       if (<CODE> == IOR)
 	emit_move_insn (operands[0], constm1_rtx);
       else
-	ix86_expand_unary_operator (NOT, SImode, &operands[0]);
+	ix86_expand_unary_operator (NOT, <MODE>mode, &operands[0]);
     }
   else
-    ix86_expand_binary_operator (<CODE>, SImode, &operands[0]);
+    ix86_expand_binary_operator (<CODE>, <MODE>mode, &operands[0]);
 
   if (operands[5] == const0_rtx)
     {
-      if (!rtx_equal_p (operands[3], operands[4]))
-	emit_move_insn (operands[3], operands[4]);
-      else if (emit_insn_deleted_note_p)
+      if (emit_insn_deleted_note_p)
 	emit_note (NOTE_INSN_DELETED);
     }
   else if (operands[5] == constm1_rtx)
@@ -10585,10 +10580,10 @@
       if (<CODE> == IOR)
 	emit_move_insn (operands[3], constm1_rtx);
       else
-	ix86_expand_unary_operator (NOT, SImode, &operands[3]);
+	ix86_expand_unary_operator (NOT, <MODE>mode, &operands[3]);
     }
   else
-    ix86_expand_binary_operator (<CODE>, SImode, &operands[3]);
+    ix86_expand_binary_operator (<CODE>, <MODE>mode, &operands[3]);
 
   DONE;
 })
@@ -11727,24 +11722,22 @@
 ;; One complement instructions
 
 (define_expand "one_cmpl<mode>2"
-  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
-	(not:SWIM1248x (match_operand:SWIM1248x 1 "nonimmediate_operand")))]
+  [(set (match_operand:SDWIM 0 "nonimmediate_operand")
+	(not:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")))]
   ""
   "ix86_expand_unary_operator (NOT, <MODE>mode, operands); DONE;")
 
-(define_insn_and_split "*one_cmpldi2_doubleword"
-  [(set (match_operand:DI 0 "nonimmediate_operand")
-	(not:DI (match_operand:DI 1 "nonimmediate_operand")))]
-  "!TARGET_64BIT
-   && ix86_unary_operator_ok (NOT, DImode, operands)
-   && ix86_pre_reload_split ()"
+(define_insn_and_split "*one_cmpl<dwi>2_doubleword"
+  [(set (match_operand:<DWI> 0 "nonimmediate_operand" "=ro")
+	(not:<DWI> (match_operand:<DWI> 1 "nonimmediate_operand" "0")))]
+  "ix86_unary_operator_ok (NOT, <DWI>mode, operands)"
   "#"
-  "&& 1"
+  "&& reload_completed"
   [(set (match_dup 0)
-	(not:SI (match_dup 1)))
+	(not:DWIH (match_dup 1)))
    (set (match_dup 2)
-	(not:SI (match_dup 3)))]
-  "split_double_mode (DImode, &operands[0], 2, &operands[0], &operands[2]);")
+	(not:DWIH (match_dup 3)))]
+  "split_double_mode (<DWI>mode, &operands[0], 2, &operands[0], &operands[2]);")
 
 (define_insn "*one_cmpl<mode>2_1"
   [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,?k")

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [x86 PATCH take #2] Double word logical operation clean-ups in i386.md.
  2022-06-30 10:56 [x86 PATCH take #2] Double word logical operation clean-ups in i386.md Roger Sayle
@ 2022-06-30 12:01 ` Uros Bizjak
  0 siblings, 0 replies; 2+ messages in thread
From: Uros Bizjak @ 2022-06-30 12:01 UTC (permalink / raw)
  To: Roger Sayle; +Cc: gcc-patches

On Thu, Jun 30, 2022 at 12:56 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> Hi Uros,
> Many thanks for your review of the "double word logical operation clean-up" patch.
> The revision below incorporates the majority of your feedback, but with one or two
> exceptions (required to allow the patch to bootstrap) that I thought I'd double check
> with you before pushing.
>
> Firstly, great catch that we no longer need to test rtx_equal (operands[0], operands[1])
> when moving a splitter from before reload to after reload, as this is guaranteed by the
> "0" constraints.  I've cleaned this up in all the doubleword splitters (including the
> <any_or> case that's now moved).  Also, as you've suggested, this patch uses
> a pair of define_insn_and_split for ANDN, one for TARGET_BMI (split post-reload)
> and the other for !TARGET_BMI (that's lowered rather than split, pre-load/post-STV).
>
> Unfortunately, the "force_reg of tricky immediate constants" checks really are
> required for these expanders.  I agree normally the predicate is checked/guaranteed
> for a define_insn, but in this case the gen_iordi3 function and related expanders are
> frequently called directly by the middle-end or from i386-expand, which bypasses
> the checks made by the later RTL passes.  When given arbitrary immediate constants,
> this results in ICEs from insns not matching their predicates soon after expand
> (breaking bootstrap with an ICE).  It's only "standard name" expanders that require
> this treatment, define_insn{_and_split} templates do enforce their predicates.
>
> And finally, we can't/shouldn't use <general_szext_operand> in the actual
> doubleword splitters, as the mode being iterated over is DWIH (not DWI),
> where we require the predicate for the corresponding <DWI> mode.  It turns
> out that it's always appropriate to use x86_64_hilo_general_operand wherever
> we use the "r<di>" constraint, and that's used consistently in this patch.
>
> I hope these exceptions are acceptable.  The attached revised patch has
> been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check
> both with and with --target_board=unix{-m32} with no new failures.
> Are these revisions OK for mainline?

Thanks for your explanation of the particularities of the patch!

Yes, the patch is OK.

Thanks,
Uros.

>
> 2022-06-30  Roger Sayle  <roger@nextmovesoftware.com>
>             Uroš Bizjak  <ubizjak@gmail.com>
>
> gcc/ChangeLog
>         * config/i386/i386.md (general_szext_operand): Add TImode
>         support using x86_64_hilo_general_operand predicate.
>         (*cmp<dwi>_doubleword): Use x86_64_hilo_general_operand predicate.
>         (*add<dwi>3_doubleword): Improved optimization of zero addition.
>         (and<mode>3): Use SDWIM mode iterator to add support for double
>         word bit-wise AND in TImode.  Use force_reg when double word
>         immediate operand isn't x86_64_hilo_general_operand.
>         (and<dwi>3_doubleword): Generalized from anddi3_doubleword and
>         converted into a post-reload splitter.
>         (*andndi3_doubleword): Old define_insn deleted.
>         (*andn<mode>3_doubleword_bmi): New define_insn_and_split for
>         TARGET_BMI that splits post-reload.
>         (*andn<mode>3_doubleword): New define_insn_and_split for
>         !TARGET_BMI, that lowers/splits before reload.
>         (<any_or><mode>3): Use SDWIM mode iterator to add suppport for
>         double word bit-wise XOR and bit-wise IOR in TImode.  Use
>         force_reg when double word immediate operand isn't
>         x86_64_hilo_general_operand.
>         (*<any_or>di3_doubleword): Generalized from <any_or>di3_doubleword.
>         (one_cmpl<mode>2): Use SDWIM mode iterator to add support for
>         double word bit-wise NOT in TImode.
>         (one_cmpl<dwi>2_doubleword): Generalize from one_cmpldi2_doubleword
>         and converted into a post-reload splitter.
>
>
> Thanks again,
> Roger
> --
>
> > -----Original Message-----
> > From: Uros Bizjak <ubizjak@gmail.com>
> > Sent: 28 June 2022 16:38
> > To: Roger Sayle <roger@nextmovesoftware.com>
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [x86 PATCH] Double word logical operation clean-ups in i386.md.
> >
> > On Tue, Jun 28, 2022 at 1:34 PM Roger Sayle <roger@nextmovesoftware.com>
> > wrote:
> > >
> > >
> > > Hi Uros,
> > > As you've requested/suggested, here's a patch that tidies up and
> > > unifies doubleword handling in i386.md; converting all doubleword
> > > splitters for logic operations to post-reload form, generalizing their
> > > define_insn_and_split templates to <dwi> form (supporting TARGET_64BIT
> > > ? TImode : DImode), and where required tweaking the corresponding
> > > expanders to use SDWIM to support TImode doubleword operations.  These
> > > changes incorporate your feedback from
> > > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596205.html
> > > where I included many/several of these clean-ups, in a patch to add a
> > > new optimization.  I agree, it's better to split these out (this
> > > patch), and I'll resubmit the (smaller) optimization patch as a
> > > follow-up.
> > >
> > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > > and make -k check, both with and without --target_board=unix{-m32},
> > > with no new failures.  Ok for mainline?
> > >
> > >
> > > 2022-06-28  Roger Sayle  <roger@nextmovesoftware.com>
> > >
> > > gcc/ChangeLog
> > >         * config/i386/i386.md (general_szext_operand): Add TImode
> > >         support using x86_64_hilo_general_operand predicate.
> > >         (*cmp<dwi>_doubleword): Use x86_64_hilo_general_operand predicate.
> > >         (*add<dwi>3_doubleword): Improved optimization of zero addition.
> > >         (and<mode>3): Use SDWIM mode iterator to add support for double
> > >         word bit-wise AND in TImode.  Use force_reg when double word
> > >         immediate operand isn't x86_64_hilo_general_operand.
> > >         (and<dwi>3_doubleword): Generalized from anddi3_doubleword and
> > >         converted into a post-reload splitter.
> > >         (*andn<mode>3_doubleword): Generalized from *andndi3_doubleword.
> > >         (define_split): Generalize DImode splitters for andn to <DWI>.
> > >         One splitter for TARGET_BMI, the other for !TARGET_BMI.
> > >         (<any_or><mode>3): Use SDWIM mode iterator to add suppport for
> > >         double word bit-wise XOR and bit-wise IOR in TImode.  Use
> > >         force_reg when double word immediate operand isn't
> > >         x86_64_hilo_general_operand.
> > >         (*<any_or>di3_doubleword): Generalized from <any_or>di3_doubleword.
> > >         (one_cmpl<mode>2): Use SDWIM mode iterator to add support for
> > >         double word bit-wise NOT in TImode.
> > >         (one_cmpl<dwi>2_doubleword): Generalize from
> > one_cmpldi2_doubleword
> > >         and converted into a post-reload splitter.
> >
> >
> >  (define_expand "and<mode>3"
> > -  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
> > -    (and:SWIM1248x (match_operand:SWIM1248x 1 "nonimmediate_operand")
> > -               (match_operand:SWIM1248x 2 "<general_szext_operand>")))]
> > +  [(set (match_operand:SDWIM 0 "nonimmediate_operand")
> > +    (and:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")
> > +           (match_operand:SDWIM 2 "<general_szext_operand>")))]
> >    ""
> >  {
> >    machine_mode mode = <MODE>mode;
> >
> > -  if (<MODE>mode == DImode && !TARGET_64BIT)
> > -    ;
> > -  else if (const_int_operand (operands[2], <MODE>mode)
> > -       && register_operand (operands[0], <MODE>mode)
> > -       && !(TARGET_ZERO_EXTEND_WITH_AND
> > -        && optimize_function_for_speed_p (cfun)))
> > +  if (GET_MODE_SIZE (<MODE>mode) > UNITS_PER_WORD
> > +      && !x86_64_hilo_general_operand (operands[2], <MODE>mode))
> > +    operands[2] = force_reg (<MODE>mode, operands[2]);
> >
> > You don't have to do that - when the predicate can't be satisfied, the middle-end
> > pushes the value to a register as a last resort by default.
> >
> > +  bool emit_insn_deleted_note_p = false;
> > +
> > +  split_double_mode (<DWI>mode, &operands[0], 3, &operands[0],
> > + &operands[3]);
> >
> >    if (operands[2] == const0_rtx)
> >      emit_move_insn (operands[0], const0_rtx);
> >    else if (operands[2] == constm1_rtx)
> > -    emit_move_insn (operands[0], operands[1]);
> > +    {
> > +      if (!rtx_equal_p (operands[0], operands[1]))
> > +    emit_move_insn (operands[0], operands[1]);
> > +      else
> > +    emit_insn_deleted_note_p = true;
> > +    }
> >
> > Please note that when operands[2] is an immediate, constraints after reload
> > *guarantee* that operands[1] match operands[0]. So, the insn should always be
> > deleted (I think that this functionality was in your <any_or> patch - it is
> > unneeded there, too).
> >
> > +(define_insn "*andn<mode>3_doubleword"
> > +  [(set (match_operand:DWI 0 "register_operand")
> > +    (and:DWI
> > +      (not:DWI (match_operand:DWI 1 "register_operand"))
> > +      (match_operand:DWI 2 "nonimmediate_operand")))
> >     (clobber (reg:CC FLAGS_REG))]
> > -  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2
> > -   && ix86_pre_reload_split ()"
> > +  "ix86_pre_reload_split ()"
> >    "#")
> >
> > Please introduce two ANDN double-word insn-and-split patterns, one for BMI
> > and one for !BMI. The one for BMI should be moved to a post-reload splitter,
> > too. As we figured out, *all* double-word patterns should either be of pre-
> > reload or of post-reload type.
> >
> >  (define_split
> > -  [(set (match_operand:DI 0 "register_operand")
> > -    (and:DI
> > -      (not:DI (match_operand:DI 1 "register_operand"))
> > -      (match_operand:DI 2 "nonimmediate_operand")))
> > +  [(set (match_operand:DWI 0 "register_operand")
> > +    (and:DWI
> > +      (not:DWI (match_operand:DWI 1 "register_operand"))
> > +      (match_operand:DWI 2 "nonimmediate_operand")))
> >     (clobber (reg:CC FLAGS_REG))]
> > -  "!TARGET_64BIT && !TARGET_BMI && TARGET_STV && TARGET_SSE2
> > +  "!TARGET_BMI
> >
> > Without BMI, the ANDN should be split to a double-word NOT + AND before
> > reload (and these two insns are split to single-word operations after reload).
> > This simplifies splitting logic quite a bit.
> >
> >  (define_expand "<code><mode>3"
> > -  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
> > -    (any_or:SWIM1248x (match_operand:SWIM1248x 1
> > "nonimmediate_operand")
> > -              (match_operand:SWIM1248x 2 "<general_operand>")))]
> > +  [(set (match_operand:SDWIM 0 "nonimmediate_operand")
> > +    (any_or:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")
> > +              (match_operand:SDWIM 2 "<general_operand>")))]
> >
> > Use <general_szext_operand> here ...
> >
> >    ""
> > -  "ix86_expand_binary_operator (<CODE>, <MODE>mode, operands); DONE;")
> > +{
> >
> > -(define_insn_and_split "*<code>di3_doubleword"
> > -  [(set (match_operand:DI 0 "nonimmediate_operand" "=ro,r")
> > -    (any_or:DI
> > -     (match_operand:DI 1 "nonimmediate_operand" "0,0")
> > -     (match_operand:DI 2 "x86_64_szext_general_operand" "re,o")))
> > +  if (GET_MODE_SIZE (<MODE>mode) > UNITS_PER_WORD
> > +      && !x86_64_hilo_general_operand (operands[2], <MODE>mode))
> > +    operands[2] = force_reg (<MODE>mode, operands[2]);
> >
> > ... to avoid the above fixup.
> >
> > +(define_insn_and_split "*<code><mode>3_doubleword"
> > +  [(set (match_operand:<DWI> 0 "nonimmediate_operand" "=ro,r")
> > +    (any_or:<DWI>
> > +     (match_operand:<DWI> 1 "nonimmediate_operand" "%0,0")
> > +     (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "r<di>,o")))
> >
> > <general_szext_operand> for consistency.
> >
> > Otherwise OK.
> >
> > Uros.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2022-06-30 12:01 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-30 10:56 [x86 PATCH take #2] Double word logical operation clean-ups in i386.md Roger Sayle
2022-06-30 12:01 ` Uros Bizjak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).