public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
@ 2022-09-23  9:24 Tamar Christina
  2022-09-23  9:25 ` [PATCH 2/2]AArch64 Extend tbz pattern to allow SI to SI extensions Tamar Christina
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Tamar Christina @ 2022-09-23  9:24 UTC (permalink / raw)
  To: gcc-patches; +Cc: nd, rguenther, jeffreyalaw, richard.sandiford

[-- Attachment #1: Type: text/plain, Size: 2762 bytes --]

Hi All,

This is an RFC to figure out how to deal with targets that don't have native
comparisons against QImode values.

Booleans, at least in C99 and higher are 0-1 valued.  This means that we only
really need to test a single bit.  However in RTL we no longer have this
information available and just have an SImode value (due to the promotion of
QImode to SImode).

This RFC fixes it by emitting an explicit & 1 during the expansion of the
conditional branch.

However it's unlikely that we want to do this unconditionally.  Most targets
I've tested seem to have harmless code changes, like x86 changes from testb to
andl, $1.

So I have two questions:

1. Should I limit this behind a target macro? Or should I just leave it for all
   targets and deal with the fallout.
2. How can I tell whether the C99 0-1 values bools are being used or the older
   0, non-0 variant?

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
However there are some benign codegen changes on x86, testb changed to andl $1.

This pattern occurs more than 120,000 times in SPECCPU 2017 and so is quite common.

Thanks,
Tamar

gcc/ChangeLog:

	* tree.h (tree_zero_one_valued_p): New.
	* dojump.cc (do_jump): Add & 1 if truth type.

--- inline copy of patch -- 
diff --git a/gcc/dojump.cc b/gcc/dojump.cc
index 2af0cd1aca3b6af13d5d8799094ee93f18022296..8eaf1be49cd12298e61c6946ae79ca9de6197864 100644
--- a/gcc/dojump.cc
+++ b/gcc/dojump.cc
@@ -605,7 +605,17 @@ do_jump (tree exp, rtx_code_label *if_false_label,
       /* Fall through and generate the normal code.  */
     default:
     normal:
-      temp = expand_normal (exp);
+      tree cmp = exp;
+      /* If the expression is a truth type then explicitly generate an & 1
+	 to indicate to the target that it's a zero-one values type.  This
+	 allows the target to further optimize the comparison should it
+	 choose to.  */
+      if (tree_zero_one_valued_p (exp))
+	{
+	  type = TREE_TYPE (exp);
+	  cmp = build2 (BIT_AND_EXPR, type, exp, build_int_cstu (type, 1));
+	}
+      temp = expand_normal (cmp);
       do_pending_stack_adjust ();
       /* The RTL optimizers prefer comparisons against pseudos.  */
       if (GET_CODE (temp) == SUBREG)
diff --git a/gcc/tree.h b/gcc/tree.h
index 8f8a9660c9e0605eb516de194640b8c1b531b798..be3d2dee82f692e81082cf21c878c10f9fe9e1f1 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -4690,6 +4690,7 @@ extern tree signed_or_unsigned_type_for (int, tree);
 extern tree signed_type_for (tree);
 extern tree unsigned_type_for (tree);
 extern bool is_truth_type_for (tree, tree);
+extern bool tree_zero_one_valued_p (tree);
 extern tree truth_type_for (tree);
 extern tree build_pointer_type_for_mode (tree, machine_mode, bool);
 extern tree build_pointer_type (tree);




-- 

[-- Attachment #2: rb16248.patch --]
[-- Type: text/plain, Size: 1501 bytes --]

diff --git a/gcc/dojump.cc b/gcc/dojump.cc
index 2af0cd1aca3b6af13d5d8799094ee93f18022296..8eaf1be49cd12298e61c6946ae79ca9de6197864 100644
--- a/gcc/dojump.cc
+++ b/gcc/dojump.cc
@@ -605,7 +605,17 @@ do_jump (tree exp, rtx_code_label *if_false_label,
       /* Fall through and generate the normal code.  */
     default:
     normal:
-      temp = expand_normal (exp);
+      tree cmp = exp;
+      /* If the expression is a truth type then explicitly generate an & 1
+	 to indicate to the target that it's a zero-one values type.  This
+	 allows the target to further optimize the comparison should it
+	 choose to.  */
+      if (tree_zero_one_valued_p (exp))
+	{
+	  type = TREE_TYPE (exp);
+	  cmp = build2 (BIT_AND_EXPR, type, exp, build_int_cstu (type, 1));
+	}
+      temp = expand_normal (cmp);
       do_pending_stack_adjust ();
       /* The RTL optimizers prefer comparisons against pseudos.  */
       if (GET_CODE (temp) == SUBREG)
diff --git a/gcc/tree.h b/gcc/tree.h
index 8f8a9660c9e0605eb516de194640b8c1b531b798..be3d2dee82f692e81082cf21c878c10f9fe9e1f1 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -4690,6 +4690,7 @@ extern tree signed_or_unsigned_type_for (int, tree);
 extern tree signed_type_for (tree);
 extern tree unsigned_type_for (tree);
 extern bool is_truth_type_for (tree, tree);
+extern bool tree_zero_one_valued_p (tree);
 extern tree truth_type_for (tree);
 extern tree build_pointer_type_for_mode (tree, machine_mode, bool);
 extern tree build_pointer_type (tree);




^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 2/2]AArch64 Extend tbz pattern to allow SI to SI extensions.
  2022-09-23  9:24 [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend Tamar Christina
@ 2022-09-23  9:25 ` Tamar Christina
  2022-09-23  9:42   ` Richard Sandiford
  2022-09-26 10:35 ` [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend Richard Biener
  2022-10-27  3:22 ` Andrew Pinski
  2 siblings, 1 reply; 30+ messages in thread
From: Tamar Christina @ 2022-09-23  9:25 UTC (permalink / raw)
  To: gcc-patches
  Cc: nd, Richard.Earnshaw, Marcus.Shawcroft, Kyrylo.Tkachov,
	richard.sandiford

[-- Attachment #1: Type: text/plain, Size: 3840 bytes --]

Hi All,

This adds additional recognition of & 1 into the tbz/tbnz pattern.

Concretely with the mid-end changes this changes

void g1(bool x)
{
  if (__builtin_expect (x, 0))
    h ();
}

from

        tst     w0, 255
        bne     .L7

to

        tbnz    w0, #0, .L5

This pattern occurs ~120,000 times in SPECCPU 20127, basically on
every boolean comparison.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64.md (*tb<optab><mode>1): Renamed this ...
	(*tb<optab><GPI2:mode><GPI:mode>1): ... To this.
	* config/aarch64/iterators.md (GPI2): New.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/tbz_1.c: New test.

--- inline copy of patch -- 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 6aa1fb4be003f2027d63ac69fd314c2bbc876258..3faa03f453c94665d9d82225f180d8afdcd0b5fe 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -943,31 +943,33 @@ (define_insn "*cb<optab><mode>1"
 		      (const_int 1)))]
 )
 
-(define_insn "*tb<optab><mode>1"
+(define_insn "*tb<optab><GPI2:mode><GPI:mode>1"
   [(set (pc) (if_then_else
-	      (EQL (zero_extract:DI (match_operand:GPI 0 "register_operand" "r")
-				    (const_int 1)
-				    (match_operand 1
-				      "aarch64_simd_shift_imm_<mode>" "n"))
+	      (EQL (zero_extract:GPI2
+		     (match_operand:GPI 0 "register_operand" "r")
+		     (const_int 1)
+		     (match_operand 1 "aarch64_simd_shift_imm_<GPI:mode>" "n"))
 		   (const_int 0))
 	     (label_ref (match_operand 2 "" ""))
 	     (pc)))
    (clobber (reg:CC CC_REGNUM))]
-  "!aarch64_track_speculation"
+  "!aarch64_track_speculation
+    && known_ge (GET_MODE_SIZE (<GPI2:MODE>mode),
+		 GET_MODE_SIZE (<GPI:MODE>mode))"
   {
     if (get_attr_length (insn) == 8)
       {
 	if (get_attr_far_branch (insn) == 1)
 	  return aarch64_gen_far_branch (operands, 2, "Ltb",
-					 "<inv_tb>\\t%<w>0, %1, ");
+					 "<inv_tb>\\t%<GPI:w>0, %1, ");
 	else
 	  {
 	    operands[1] = GEN_INT (HOST_WIDE_INT_1U << UINTVAL (operands[1]));
-	    return "tst\t%<w>0, %1\;<bcond>\t%l2";
+	    return "tst\t%<GPI:w>0, %1\;<bcond>\t%l2";
 	  }
       }
     else
-      return "<tbz>\t%<w>0, %1, %l2";
+      return "<tbz>\t%<GPI:w>0, %1, %l2";
   }
   [(set_attr "type" "branch")
    (set (attr "length")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 89ca66fd291b60a28979785706ecc5345ea86744..f6b2e7a83c63cab73947b6bd61b499b4b57d14ac 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -28,6 +28,8 @@ (define_mode_iterator CCFP_CCFPE [CCFP CCFPE])
 
 ;; Iterator for General Purpose Integer registers (32- and 64-bit modes)
 (define_mode_iterator GPI [SI DI])
+;; Copy of the above iterator
+(define_mode_iterator GPI2 [SI DI])
 
 ;; Iterator for HI, SI, DI, some instructions can only work on these modes.
 (define_mode_iterator GPI_I16 [(HI "AARCH64_ISA_F16") SI DI])
diff --git a/gcc/testsuite/gcc.target/aarch64/tbz_1.c b/gcc/testsuite/gcc.target/aarch64/tbz_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..6a75eb4e7aedbfa3ae329358c6ee4d675704a074
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/tbz_1.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -std=c99  -fno-unwind-tables -fno-asynchronous-unwind-tables" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+#include <stdbool.h>// Type your code here, or load an example.
+
+void h(void);
+
+/*
+** g1:
+** 	tbnz	w[0-9], #?0, .L([0-9]+)
+** 	ret
+**	...
+*/
+void g1(bool x)
+{
+  if (__builtin_expect (x, 0))
+    h ();
+}
+
+/*
+** g2:
+** 	tbz	w[0-9]+, #?0, .L([0-9]+)
+** 	b	h
+**	...
+*/
+void g2(bool x)
+{
+  if (__builtin_expect (x, 1))
+    h ();
+}
+




-- 

[-- Attachment #2: rb16249.patch --]
[-- Type: text/plain, Size: 3098 bytes --]

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 6aa1fb4be003f2027d63ac69fd314c2bbc876258..3faa03f453c94665d9d82225f180d8afdcd0b5fe 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -943,31 +943,33 @@ (define_insn "*cb<optab><mode>1"
 		      (const_int 1)))]
 )
 
-(define_insn "*tb<optab><mode>1"
+(define_insn "*tb<optab><GPI2:mode><GPI:mode>1"
   [(set (pc) (if_then_else
-	      (EQL (zero_extract:DI (match_operand:GPI 0 "register_operand" "r")
-				    (const_int 1)
-				    (match_operand 1
-				      "aarch64_simd_shift_imm_<mode>" "n"))
+	      (EQL (zero_extract:GPI2
+		     (match_operand:GPI 0 "register_operand" "r")
+		     (const_int 1)
+		     (match_operand 1 "aarch64_simd_shift_imm_<GPI:mode>" "n"))
 		   (const_int 0))
 	     (label_ref (match_operand 2 "" ""))
 	     (pc)))
    (clobber (reg:CC CC_REGNUM))]
-  "!aarch64_track_speculation"
+  "!aarch64_track_speculation
+    && known_ge (GET_MODE_SIZE (<GPI2:MODE>mode),
+		 GET_MODE_SIZE (<GPI:MODE>mode))"
   {
     if (get_attr_length (insn) == 8)
       {
 	if (get_attr_far_branch (insn) == 1)
 	  return aarch64_gen_far_branch (operands, 2, "Ltb",
-					 "<inv_tb>\\t%<w>0, %1, ");
+					 "<inv_tb>\\t%<GPI:w>0, %1, ");
 	else
 	  {
 	    operands[1] = GEN_INT (HOST_WIDE_INT_1U << UINTVAL (operands[1]));
-	    return "tst\t%<w>0, %1\;<bcond>\t%l2";
+	    return "tst\t%<GPI:w>0, %1\;<bcond>\t%l2";
 	  }
       }
     else
-      return "<tbz>\t%<w>0, %1, %l2";
+      return "<tbz>\t%<GPI:w>0, %1, %l2";
   }
   [(set_attr "type" "branch")
    (set (attr "length")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 89ca66fd291b60a28979785706ecc5345ea86744..f6b2e7a83c63cab73947b6bd61b499b4b57d14ac 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -28,6 +28,8 @@ (define_mode_iterator CCFP_CCFPE [CCFP CCFPE])
 
 ;; Iterator for General Purpose Integer registers (32- and 64-bit modes)
 (define_mode_iterator GPI [SI DI])
+;; Copy of the above iterator
+(define_mode_iterator GPI2 [SI DI])
 
 ;; Iterator for HI, SI, DI, some instructions can only work on these modes.
 (define_mode_iterator GPI_I16 [(HI "AARCH64_ISA_F16") SI DI])
diff --git a/gcc/testsuite/gcc.target/aarch64/tbz_1.c b/gcc/testsuite/gcc.target/aarch64/tbz_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..6a75eb4e7aedbfa3ae329358c6ee4d675704a074
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/tbz_1.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -std=c99  -fno-unwind-tables -fno-asynchronous-unwind-tables" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+#include <stdbool.h>// Type your code here, or load an example.
+
+void h(void);
+
+/*
+** g1:
+** 	tbnz	w[0-9], #?0, .L([0-9]+)
+** 	ret
+**	...
+*/
+void g1(bool x)
+{
+  if (__builtin_expect (x, 0))
+    h ();
+}
+
+/*
+** g2:
+** 	tbz	w[0-9]+, #?0, .L([0-9]+)
+** 	b	h
+**	...
+*/
+void g2(bool x)
+{
+  if (__builtin_expect (x, 1))
+    h ();
+}
+




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2]AArch64 Extend tbz pattern to allow SI to SI extensions.
  2022-09-23  9:25 ` [PATCH 2/2]AArch64 Extend tbz pattern to allow SI to SI extensions Tamar Christina
@ 2022-09-23  9:42   ` Richard Sandiford
  2022-09-23  9:48     ` Tamar Christina
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Sandiford @ 2022-09-23  9:42 UTC (permalink / raw)
  To: Tamar Christina
  Cc: gcc-patches, nd, Richard.Earnshaw, Marcus.Shawcroft, Kyrylo.Tkachov

Tamar Christina <tamar.christina@arm.com> writes:
> Hi All,
>
> This adds additional recognition of & 1 into the tbz/tbnz pattern.
>
> Concretely with the mid-end changes this changes
>
> void g1(bool x)
> {
>   if (__builtin_expect (x, 0))
>     h ();
> }
>
> from
>
>         tst     w0, 255
>         bne     .L7
>
> to
>
>         tbnz    w0, #0, .L5
>
> This pattern occurs ~120,000 times in SPECCPU 20127, basically on
> every boolean comparison.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> 	* config/aarch64/aarch64.md (*tb<optab><mode>1): Renamed this ...
> 	(*tb<optab><GPI2:mode><GPI:mode>1): ... To this.
> 	* config/aarch64/iterators.md (GPI2): New.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/aarch64/tbz_1.c: New test.
>
> --- inline copy of patch -- 
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 6aa1fb4be003f2027d63ac69fd314c2bbc876258..3faa03f453c94665d9d82225f180d8afdcd0b5fe 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -943,31 +943,33 @@ (define_insn "*cb<optab><mode>1"
>  		      (const_int 1)))]
>  )
>  
> -(define_insn "*tb<optab><mode>1"
> +(define_insn "*tb<optab><GPI2:mode><GPI:mode>1"
>    [(set (pc) (if_then_else
> -	      (EQL (zero_extract:DI (match_operand:GPI 0 "register_operand" "r")
> -				    (const_int 1)
> -				    (match_operand 1
> -				      "aarch64_simd_shift_imm_<mode>" "n"))
> +	      (EQL (zero_extract:GPI2
> +		     (match_operand:GPI 0 "register_operand" "r")
> +		     (const_int 1)
> +		     (match_operand 1 "aarch64_simd_shift_imm_<GPI:mode>" "n"))
>  		   (const_int 0))
>  	     (label_ref (match_operand 2 "" ""))
>  	     (pc)))
>     (clobber (reg:CC CC_REGNUM))]
> -  "!aarch64_track_speculation"
> +  "!aarch64_track_speculation
> +    && known_ge (GET_MODE_SIZE (<GPI2:MODE>mode),
> +		 GET_MODE_SIZE (<GPI:MODE>mode))"

Is this check necessary?  The extraction evaluates to 0 or 1,
so it shouldn't matter whether it is interpreted as DI or SI.

OK without the check if you agree.

Thanks,
Richard

>    {
>      if (get_attr_length (insn) == 8)
>        {
>  	if (get_attr_far_branch (insn) == 1)
>  	  return aarch64_gen_far_branch (operands, 2, "Ltb",
> -					 "<inv_tb>\\t%<w>0, %1, ");
> +					 "<inv_tb>\\t%<GPI:w>0, %1, ");
>  	else
>  	  {
>  	    operands[1] = GEN_INT (HOST_WIDE_INT_1U << UINTVAL (operands[1]));
> -	    return "tst\t%<w>0, %1\;<bcond>\t%l2";
> +	    return "tst\t%<GPI:w>0, %1\;<bcond>\t%l2";
>  	  }
>        }
>      else
> -      return "<tbz>\t%<w>0, %1, %l2";
> +      return "<tbz>\t%<GPI:w>0, %1, %l2";
>    }
>    [(set_attr "type" "branch")
>     (set (attr "length")
> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index 89ca66fd291b60a28979785706ecc5345ea86744..f6b2e7a83c63cab73947b6bd61b499b4b57d14ac 100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -28,6 +28,8 @@ (define_mode_iterator CCFP_CCFPE [CCFP CCFPE])
>  
>  ;; Iterator for General Purpose Integer registers (32- and 64-bit modes)
>  (define_mode_iterator GPI [SI DI])
> +;; Copy of the above iterator
> +(define_mode_iterator GPI2 [SI DI])
>  
>  ;; Iterator for HI, SI, DI, some instructions can only work on these modes.
>  (define_mode_iterator GPI_I16 [(HI "AARCH64_ISA_F16") SI DI])
> diff --git a/gcc/testsuite/gcc.target/aarch64/tbz_1.c b/gcc/testsuite/gcc.target/aarch64/tbz_1.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..6a75eb4e7aedbfa3ae329358c6ee4d675704a074
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/tbz_1.c
> @@ -0,0 +1,32 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2 -std=c99  -fno-unwind-tables -fno-asynchronous-unwind-tables" } */
> +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
> +
> +#include <stdbool.h>// Type your code here, or load an example.
> +
> +void h(void);
> +
> +/*
> +** g1:
> +** 	tbnz	w[0-9], #?0, .L([0-9]+)
> +** 	ret
> +**	...
> +*/
> +void g1(bool x)
> +{
> +  if (__builtin_expect (x, 0))
> +    h ();
> +}
> +
> +/*
> +** g2:
> +** 	tbz	w[0-9]+, #?0, .L([0-9]+)
> +** 	b	h
> +**	...
> +*/
> +void g2(bool x)
> +{
> +  if (__builtin_expect (x, 1))
> +    h ();
> +}
> +

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH 2/2]AArch64 Extend tbz pattern to allow SI to SI extensions.
  2022-09-23  9:42   ` Richard Sandiford
@ 2022-09-23  9:48     ` Tamar Christina
  0 siblings, 0 replies; 30+ messages in thread
From: Tamar Christina @ 2022-09-23  9:48 UTC (permalink / raw)
  To: Richard Sandiford
  Cc: gcc-patches, nd, Richard Earnshaw, Marcus Shawcroft, Kyrylo Tkachov

> -----Original Message-----
> From: Richard Sandiford <richard.sandiford@arm.com>
> Sent: Friday, September 23, 2022 5:43 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; Richard Earnshaw
> <Richard.Earnshaw@arm.com>; Marcus Shawcroft
> <Marcus.Shawcroft@arm.com>; Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> Subject: Re: [PATCH 2/2]AArch64 Extend tbz pattern to allow SI to SI
> extensions.
> 
> Tamar Christina <tamar.christina@arm.com> writes:
> > Hi All,
> >
> > This adds additional recognition of & 1 into the tbz/tbnz pattern.
> >
> > Concretely with the mid-end changes this changes
> >
> > void g1(bool x)
> > {
> >   if (__builtin_expect (x, 0))
> >     h ();
> > }
> >
> > from
> >
> >         tst     w0, 255
> >         bne     .L7
> >
> > to
> >
> >         tbnz    w0, #0, .L5
> >
> > This pattern occurs ~120,000 times in SPECCPU 20127, basically on
> > every boolean comparison.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > 	* config/aarch64/aarch64.md (*tb<optab><mode>1): Renamed this
> ...
> > 	(*tb<optab><GPI2:mode><GPI:mode>1): ... To this.
> > 	* config/aarch64/iterators.md (GPI2): New.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 	* gcc.target/aarch64/tbz_1.c: New test.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/config/aarch64/aarch64.md
> > b/gcc/config/aarch64/aarch64.md index
> >
> 6aa1fb4be003f2027d63ac69fd314c2bbc876258..3faa03f453c94665d9d82225f18
> 0
> > d8afdcd0b5fe 100644
> > --- a/gcc/config/aarch64/aarch64.md
> > +++ b/gcc/config/aarch64/aarch64.md
> > @@ -943,31 +943,33 @@ (define_insn "*cb<optab><mode>1"
> >  		      (const_int 1)))]
> >  )
> >
> > -(define_insn "*tb<optab><mode>1"
> > +(define_insn "*tb<optab><GPI2:mode><GPI:mode>1"
> >    [(set (pc) (if_then_else
> > -	      (EQL (zero_extract:DI (match_operand:GPI 0 "register_operand"
> "r")
> > -				    (const_int 1)
> > -				    (match_operand 1
> > -				      "aarch64_simd_shift_imm_<mode>" "n"))
> > +	      (EQL (zero_extract:GPI2
> > +		     (match_operand:GPI 0 "register_operand" "r")
> > +		     (const_int 1)
> > +		     (match_operand 1
> "aarch64_simd_shift_imm_<GPI:mode>" "n"))
> >  		   (const_int 0))
> >  	     (label_ref (match_operand 2 "" ""))
> >  	     (pc)))
> >     (clobber (reg:CC CC_REGNUM))]
> > -  "!aarch64_track_speculation"
> > +  "!aarch64_track_speculation
> > +    && known_ge (GET_MODE_SIZE (<GPI2:MODE>mode),
> > +		 GET_MODE_SIZE (<GPI:MODE>mode))"
> 
> Is this check necessary?  The extraction evaluates to 0 or 1, so it shouldn't
> matter whether it is interpreted as DI or SI.
> 

Ah yes, fair point.

I will remove the check,

Thanks,
Tamar.

> OK without the check if you agree.
> 
> Thanks,
> Richard
> 
> >    {
> >      if (get_attr_length (insn) == 8)
> >        {
> >  	if (get_attr_far_branch (insn) == 1)
> >  	  return aarch64_gen_far_branch (operands, 2, "Ltb",
> > -					 "<inv_tb>\\t%<w>0, %1, ");
> > +					 "<inv_tb>\\t%<GPI:w>0, %1, ");
> >  	else
> >  	  {
> >  	    operands[1] = GEN_INT (HOST_WIDE_INT_1U << UINTVAL
> (operands[1]));
> > -	    return "tst\t%<w>0, %1\;<bcond>\t%l2";
> > +	    return "tst\t%<GPI:w>0, %1\;<bcond>\t%l2";
> >  	  }
> >        }
> >      else
> > -      return "<tbz>\t%<w>0, %1, %l2";
> > +      return "<tbz>\t%<GPI:w>0, %1, %l2";
> >    }
> >    [(set_attr "type" "branch")
> >     (set (attr "length")
> > diff --git a/gcc/config/aarch64/iterators.md
> > b/gcc/config/aarch64/iterators.md index
> >
> 89ca66fd291b60a28979785706ecc5345ea86744..f6b2e7a83c63cab73947b6bd6
> 1b4
> > 99b4b57d14ac 100644
> > --- a/gcc/config/aarch64/iterators.md
> > +++ b/gcc/config/aarch64/iterators.md
> > @@ -28,6 +28,8 @@ (define_mode_iterator CCFP_CCFPE [CCFP CCFPE])
> >
> >  ;; Iterator for General Purpose Integer registers (32- and 64-bit
> > modes)  (define_mode_iterator GPI [SI DI])
> > +;; Copy of the above iterator
> > +(define_mode_iterator GPI2 [SI DI])
> >
> >  ;; Iterator for HI, SI, DI, some instructions can only work on these modes.
> >  (define_mode_iterator GPI_I16 [(HI "AARCH64_ISA_F16") SI DI]) diff
> > --git a/gcc/testsuite/gcc.target/aarch64/tbz_1.c
> > b/gcc/testsuite/gcc.target/aarch64/tbz_1.c
> > new file mode 100644
> > index
> >
> 0000000000000000000000000000000000000000..6a75eb4e7aedbfa3ae329358c
> 6ee
> > 4d675704a074
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/tbz_1.c
> > @@ -0,0 +1,32 @@
> > +/* { dg-do compile } */
> > +/* { dg-additional-options "-O2 -std=c99  -fno-unwind-tables
> > +-fno-asynchronous-unwind-tables" } */
> > +/* { dg-final { check-function-bodies "**" "" "" { target { le } } }
> > +} */
> > +
> > +#include <stdbool.h>// Type your code here, or load an example.
> > +
> > +void h(void);
> > +
> > +/*
> > +** g1:
> > +** 	tbnz	w[0-9], #?0, .L([0-9]+)
> > +** 	ret
> > +**	...
> > +*/
> > +void g1(bool x)
> > +{
> > +  if (__builtin_expect (x, 0))
> > +    h ();
> > +}
> > +
> > +/*
> > +** g2:
> > +** 	tbz	w[0-9]+, #?0, .L([0-9]+)
> > +** 	b	h
> > +**	...
> > +*/
> > +void g2(bool x)
> > +{
> > +  if (__builtin_expect (x, 1))
> > +    h ();
> > +}
> > +

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-23  9:24 [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend Tamar Christina
  2022-09-23  9:25 ` [PATCH 2/2]AArch64 Extend tbz pattern to allow SI to SI extensions Tamar Christina
@ 2022-09-26 10:35 ` Richard Biener
  2022-09-26 11:05   ` Tamar Christina
  2022-10-27  3:22 ` Andrew Pinski
  2 siblings, 1 reply; 30+ messages in thread
From: Richard Biener @ 2022-09-26 10:35 UTC (permalink / raw)
  To: Tamar Christina; +Cc: gcc-patches, nd, jeffreyalaw, richard.sandiford

On Fri, 23 Sep 2022, Tamar Christina wrote:

> Hi All,
> 
> This is an RFC to figure out how to deal with targets that don't have native
> comparisons against QImode values.
> 
> Booleans, at least in C99 and higher are 0-1 valued.  This means that we only
> really need to test a single bit.  However in RTL we no longer have this
> information available and just have an SImode value (due to the promotion of
> QImode to SImode).
>
> This RFC fixes it by emitting an explicit & 1 during the expansion of the
> conditional branch.
> 
> However it's unlikely that we want to do this unconditionally.  Most targets
> I've tested seem to have harmless code changes, like x86 changes from testb to
> andl, $1.
> 
> So I have two questions:
> 
> 1. Should I limit this behind a target macro? Or should I just leave it for all
>    targets and deal with the fallout.
> 2. How can I tell whether the C99 0-1 values bools are being used or the older
>    0, non-0 variant?
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> However there are some benign codegen changes on x86, testb changed to andl $1.
> 
> This pattern occurs more than 120,000 times in SPECCPU 2017 and so is quite common.

How does this help a target?  Why does RTL nonzerop bits not recover this
information and the desired optimization done later during combine
for example?  Why's a SImode compare not OK if there's no QImode compare?
We have undocumented addcc, negcc, etc. patterns, should we have a
andcc pattern for this indicating support for andcc + jump as opposed
to cmpcc + jump?

So - what's the target and what's a testcase?

Thanks,
Richard.
 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 	* tree.h (tree_zero_one_valued_p): New.
> 	* dojump.cc (do_jump): Add & 1 if truth type.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/dojump.cc b/gcc/dojump.cc
> index 2af0cd1aca3b6af13d5d8799094ee93f18022296..8eaf1be49cd12298e61c6946ae79ca9de6197864 100644
> --- a/gcc/dojump.cc
> +++ b/gcc/dojump.cc
> @@ -605,7 +605,17 @@ do_jump (tree exp, rtx_code_label *if_false_label,
>        /* Fall through and generate the normal code.  */
>      default:
>      normal:
> -      temp = expand_normal (exp);
> +      tree cmp = exp;
> +      /* If the expression is a truth type then explicitly generate an & 1
> +	 to indicate to the target that it's a zero-one values type.  This
> +	 allows the target to further optimize the comparison should it
> +	 choose to.  */
> +      if (tree_zero_one_valued_p (exp))
> +	{
> +	  type = TREE_TYPE (exp);
> +	  cmp = build2 (BIT_AND_EXPR, type, exp, build_int_cstu (type, 1));
> +	}
> +      temp = expand_normal (cmp);
>        do_pending_stack_adjust ();
>        /* The RTL optimizers prefer comparisons against pseudos.  */
>        if (GET_CODE (temp) == SUBREG)
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 8f8a9660c9e0605eb516de194640b8c1b531b798..be3d2dee82f692e81082cf21c878c10f9fe9e1f1 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -4690,6 +4690,7 @@ extern tree signed_or_unsigned_type_for (int, tree);
>  extern tree signed_type_for (tree);
>  extern tree unsigned_type_for (tree);
>  extern bool is_truth_type_for (tree, tree);
> +extern bool tree_zero_one_valued_p (tree);
>  extern tree truth_type_for (tree);
>  extern tree build_pointer_type_for_mode (tree, machine_mode, bool);
>  extern tree build_pointer_type (tree);
> 
> 
> 
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-26 10:35 ` [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend Richard Biener
@ 2022-09-26 11:05   ` Tamar Christina
  2022-09-26 11:32     ` Richard Biener
  0 siblings, 1 reply; 30+ messages in thread
From: Tamar Christina @ 2022-09-26 11:05 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, nd, jeffreyalaw, Richard Sandiford

[-- Attachment #1: Type: text/plain, Size: 5629 bytes --]

> > This pattern occurs more than 120,000 times in SPECCPU 2017 and so is quite common.

> How does this help a target?

So the idea is to communicate that only the bottom bit of the value is relevant and not the entire value itself.

> Why does RTL nonzerop bits not recover thisinformation and the desired optimization done later during combinefor example?

I'm not sure it works here. We (AArch64) don't have QImode integer registers, so our apcs says that the top bits of the 32-bit registers it's passed in are undefined.

We have to zero extend the value first if we really want it as an 8-bit value. So the problem is if you e.g. Pass a boolean as an argument of a function I don't think nonzero bits will return anything useful.

> Why's a SImode compare not OK if there's no QImode compare?

The mode then becomes irrelevant because we're telling the backend that only a single bit matters. And we have instructions to test and branch on the value of a single bit. See https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602090.html for the testcases

> We have undocumented addcc, negcc, etc. patterns, should we have aandcc pattern for this indicating support for andcc + jump as opposedto cmpcc + jump?

This could work yeah. I didn't know these existed.

> So - what's the target and what's a testcase?

See https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602090.html :)

Thanks,
Tamar

________________________________
From: Richard Biener <rguenther@suse.de>
Sent: Monday, September 26, 2022 11:35 AM
To: Tamar Christina <Tamar.Christina@arm.com>
Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; jeffreyalaw@gmail.com <jeffreyalaw@gmail.com>; Richard Sandiford <Richard.Sandiford@arm.com>
Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend

On Fri, 23 Sep 2022, Tamar Christina wrote:

> Hi All,
>
> This is an RFC to figure out how to deal with targets that don't have native
> comparisons against QImode values.
>
> Booleans, at least in C99 and higher are 0-1 valued.  This means that we only
> really need to test a single bit.  However in RTL we no longer have this
> information available and just have an SImode value (due to the promotion of
> QImode to SImode).
>
> This RFC fixes it by emitting an explicit & 1 during the expansion of the
> conditional branch.
>
> However it's unlikely that we want to do this unconditionally.  Most targets
> I've tested seem to have harmless code changes, like x86 changes from testb to
> andl, $1.
>
> So I have two questions:
>
> 1. Should I limit this behind a target macro? Or should I just leave it for all
>    targets and deal with the fallout.
> 2. How can I tell whether the C99 0-1 values bools are being used or the older
>    0, non-0 variant?
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> However there are some benign codegen changes on x86, testb changed to andl $1.
>
> This pattern occurs more than 120,000 times in SPECCPU 2017 and so is quite common.

How does this help a target?  Why does RTL nonzerop bits not recover this
information and the desired optimization done later during combine
for example?  Why's a SImode compare not OK if there's no QImode compare?
We have undocumented addcc, negcc, etc. patterns, should we have a
andcc pattern for this indicating support for andcc + jump as opposed
to cmpcc + jump?

So - what's the target and what's a testcase?

Thanks,
Richard.

> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>        * tree.h (tree_zero_one_valued_p): New.
>        * dojump.cc (do_jump): Add & 1 if truth type.
>
> --- inline copy of patch --
> diff --git a/gcc/dojump.cc b/gcc/dojump.cc
> index 2af0cd1aca3b6af13d5d8799094ee93f18022296..8eaf1be49cd12298e61c6946ae79ca9de6197864 100644
> --- a/gcc/dojump.cc
> +++ b/gcc/dojump.cc
> @@ -605,7 +605,17 @@ do_jump (tree exp, rtx_code_label *if_false_label,
>        /* Fall through and generate the normal code.  */
>      default:
>      normal:
> -      temp = expand_normal (exp);
> +      tree cmp = exp;
> +      /* If the expression is a truth type then explicitly generate an & 1
> +      to indicate to the target that it's a zero-one values type.  This
> +      allows the target to further optimize the comparison should it
> +      choose to.  */
> +      if (tree_zero_one_valued_p (exp))
> +     {
> +       type = TREE_TYPE (exp);
> +       cmp = build2 (BIT_AND_EXPR, type, exp, build_int_cstu (type, 1));
> +     }
> +      temp = expand_normal (cmp);
>        do_pending_stack_adjust ();
>        /* The RTL optimizers prefer comparisons against pseudos.  */
>        if (GET_CODE (temp) == SUBREG)
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 8f8a9660c9e0605eb516de194640b8c1b531b798..be3d2dee82f692e81082cf21c878c10f9fe9e1f1 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -4690,6 +4690,7 @@ extern tree signed_or_unsigned_type_for (int, tree);
>  extern tree signed_type_for (tree);
>  extern tree unsigned_type_for (tree);
>  extern bool is_truth_type_for (tree, tree);
> +extern bool tree_zero_one_valued_p (tree);
>  extern tree truth_type_for (tree);
>  extern tree build_pointer_type_for_mode (tree, machine_mode, bool);
>  extern tree build_pointer_type (tree);
>
>
>
>
>

--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-26 11:05   ` Tamar Christina
@ 2022-09-26 11:32     ` Richard Biener
  2022-09-26 11:46       ` Tamar Christina
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Biener @ 2022-09-26 11:32 UTC (permalink / raw)
  To: Tamar Christina; +Cc: gcc-patches, nd, jeffreyalaw, Richard Sandiford

On Mon, 26 Sep 2022, Tamar Christina wrote:

> > > This pattern occurs more than 120,000 times in SPECCPU 2017 and so is quite common.
> 
> > How does this help a target?
> 
> So the idea is to communicate that only the bottom bit of the value is relevant and not the entire value itself.
> 
> > Why does RTL nonzerop bits not recover thisinformation and the desired optimization done later during combinefor example?
> 
> I'm not sure it works here. We (AArch64) don't have QImode integer registers, so our apcs says that the top bits of the 32-bit registers it's passed in are undefined.
> 
> We have to zero extend the value first if we really want it as an 8-bit value. So the problem is if you e.g. Pass a boolean as an argument of a function I don't think nonzero bits will return anything useful.
> 
> > Why's a SImode compare not OK if there's no QImode compare?
> 
> The mode then becomes irrelevant because we're telling the backend that only a single bit matters. And we have instructions to test and branch on the value of a single bit. See https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602090.html for the testcases

Maybe the target could use (subreg:SI (reg:BI ...)) as argument.  Heh.

> > We have undocumented addcc, negcc, etc. patterns, should we have aandcc pattern for this indicating support for andcc + jump as opposedto cmpcc + jump?
> 
> This could work yeah. I didn't know these existed.

Ah, so they are conditional add, not add setting CC, so andcc wouldn't
be appropriate.

So I'm not sure how we'd handle such situation - maybe looking at
REG_DECL and recognizing a _Bool PARM_DECL is OK?

Richard.

> > So - what's the target and what's a testcase?
> 
> See https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602090.html :)
> 
> Thanks,
> Tamar
> 
> ________________________________
> From: Richard Biener <rguenther@suse.de>
> Sent: Monday, September 26, 2022 11:35 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; jeffreyalaw@gmail.com <jeffreyalaw@gmail.com>; Richard Sandiford <Richard.Sandiford@arm.com>
> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
> 
> On Fri, 23 Sep 2022, Tamar Christina wrote:
> 
> > Hi All,
> >
> > This is an RFC to figure out how to deal with targets that don't have native
> > comparisons against QImode values.
> >
> > Booleans, at least in C99 and higher are 0-1 valued.  This means that we only
> > really need to test a single bit.  However in RTL we no longer have this
> > information available and just have an SImode value (due to the promotion of
> > QImode to SImode).
> >
> > This RFC fixes it by emitting an explicit & 1 during the expansion of the
> > conditional branch.
> >
> > However it's unlikely that we want to do this unconditionally.  Most targets
> > I've tested seem to have harmless code changes, like x86 changes from testb to
> > andl, $1.
> >
> > So I have two questions:
> >
> > 1. Should I limit this behind a target macro? Or should I just leave it for all
> >    targets and deal with the fallout.
> > 2. How can I tell whether the C99 0-1 values bools are being used or the older
> >    0, non-0 variant?
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > However there are some benign codegen changes on x86, testb changed to andl $1.
> >
> > This pattern occurs more than 120,000 times in SPECCPU 2017 and so is quite common.
> 
> How does this help a target?  Why does RTL nonzerop bits not recover this
> information and the desired optimization done later during combine
> for example?  Why's a SImode compare not OK if there's no QImode compare?
> We have undocumented addcc, negcc, etc. patterns, should we have a
> andcc pattern for this indicating support for andcc + jump as opposed
> to cmpcc + jump?
> 
> So - what's the target and what's a testcase?
> 
> Thanks,
> Richard.
> 
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> >        * tree.h (tree_zero_one_valued_p): New.
> >        * dojump.cc (do_jump): Add & 1 if truth type.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/dojump.cc b/gcc/dojump.cc
> > index 2af0cd1aca3b6af13d5d8799094ee93f18022296..8eaf1be49cd12298e61c6946ae79ca9de6197864 100644
> > --- a/gcc/dojump.cc
> > +++ b/gcc/dojump.cc
> > @@ -605,7 +605,17 @@ do_jump (tree exp, rtx_code_label *if_false_label,
> >        /* Fall through and generate the normal code.  */
> >      default:
> >      normal:
> > -      temp = expand_normal (exp);
> > +      tree cmp = exp;
> > +      /* If the expression is a truth type then explicitly generate an & 1
> > +      to indicate to the target that it's a zero-one values type.  This
> > +      allows the target to further optimize the comparison should it
> > +      choose to.  */
> > +      if (tree_zero_one_valued_p (exp))
> > +     {
> > +       type = TREE_TYPE (exp);
> > +       cmp = build2 (BIT_AND_EXPR, type, exp, build_int_cstu (type, 1));
> > +     }
> > +      temp = expand_normal (cmp);
> >        do_pending_stack_adjust ();
> >        /* The RTL optimizers prefer comparisons against pseudos.  */
> >        if (GET_CODE (temp) == SUBREG)
> > diff --git a/gcc/tree.h b/gcc/tree.h
> > index 8f8a9660c9e0605eb516de194640b8c1b531b798..be3d2dee82f692e81082cf21c878c10f9fe9e1f1 100644
> > --- a/gcc/tree.h
> > +++ b/gcc/tree.h
> > @@ -4690,6 +4690,7 @@ extern tree signed_or_unsigned_type_for (int, tree);
> >  extern tree signed_type_for (tree);
> >  extern tree unsigned_type_for (tree);
> >  extern bool is_truth_type_for (tree, tree);
> > +extern bool tree_zero_one_valued_p (tree);
> >  extern tree truth_type_for (tree);
> >  extern tree build_pointer_type_for_mode (tree, machine_mode, bool);
> >  extern tree build_pointer_type (tree);
> >
> >
> >
> >
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> HRB 36809 (AG Nuernberg)
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-26 11:32     ` Richard Biener
@ 2022-09-26 11:46       ` Tamar Christina
  2022-09-26 12:34         ` Richard Biener
  2022-09-28 15:04         ` Richard Sandiford
  0 siblings, 2 replies; 30+ messages in thread
From: Tamar Christina @ 2022-09-26 11:46 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, nd, jeffreyalaw, Richard Sandiford

[-- Attachment #1: Type: text/plain, Size: 7870 bytes --]

> Maybe the target could use (subreg:SI (reg:BI ...)) as argument. Heh.

But then I'd still need to change the expansion code. I suppose this could prevent the issue with changes to code on other targets.

> > > We have undocumented addcc, negcc, etc. patterns, should we have aandcc pattern for this indicating support for andcc + jump as opposedto cmpcc + jump?
> >
> > This could work yeah. I didn't know these existed.

> Ah, so they are conditional add, not add setting CC, so andcc wouldn't
> be appropriate.

> So I'm not sure how we'd handle such situation - maybe looking at
> REG_DECL and recognizing a _Bool PARM_DECL is OK?

I have a slight suspicion that Richard Sandiford would likely reject this though.. The additional AND seemed less hacky as it's just communicating range.

I still need to also figure out which representation of bool is being used, because only the 0-1 variant works. Is there a way to check that?

Thanks,
Tamar.

________________________________
From: Richard Biener <rguenther@suse.de>
Sent: Monday, September 26, 2022 12:32 PM
To: Tamar Christina <Tamar.Christina@arm.com>
Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; jeffreyalaw@gmail.com <jeffreyalaw@gmail.com>; Richard Sandiford <Richard.Sandiford@arm.com>
Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend

On Mon, 26 Sep 2022, Tamar Christina wrote:

> > > This pattern occurs more than 120,000 times in SPECCPU 2017 and so is quite common.
>
> > How does this help a target?
>
> So the idea is to communicate that only the bottom bit of the value is relevant and not the entire value itself.
>
> > Why does RTL nonzerop bits not recover thisinformation and the desired optimization done later during combinefor example?
>
> I'm not sure it works here. We (AArch64) don't have QImode integer registers, so our apcs says that the top bits of the 32-bit registers it's passed in are undefined.
>
> We have to zero extend the value first if we really want it as an 8-bit value. So the problem is if you e.g. Pass a boolean as an argument of a function I don't think nonzero bits will return anything useful.
>
> > Why's a SImode compare not OK if there's no QImode compare?
>
> The mode then becomes irrelevant because we're telling the backend that only a single bit matters. And we have instructions to test and branch on the value of a single bit. See https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602090.html for the testcases

Maybe the target could use (subreg:SI (reg:BI ...)) as argument.  Heh.

> > We have undocumented addcc, negcc, etc. patterns, should we have aandcc pattern for this indicating support for andcc + jump as opposedto cmpcc + jump?
>
> This could work yeah. I didn't know these existed.

Ah, so they are conditional add, not add setting CC, so andcc wouldn't
be appropriate.

So I'm not sure how we'd handle such situation - maybe looking at
REG_DECL and recognizing a _Bool PARM_DECL is OK?

Richard.

> > So - what's the target and what's a testcase?
>
> See https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602090.html :)
>
> Thanks,
> Tamar
>
> ________________________________
> From: Richard Biener <rguenther@suse.de>
> Sent: Monday, September 26, 2022 11:35 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; jeffreyalaw@gmail.com <jeffreyalaw@gmail.com>; Richard Sandiford <Richard.Sandiford@arm.com>
> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
>
> On Fri, 23 Sep 2022, Tamar Christina wrote:
>
> > Hi All,
> >
> > This is an RFC to figure out how to deal with targets that don't have native
> > comparisons against QImode values.
> >
> > Booleans, at least in C99 and higher are 0-1 valued.  This means that we only
> > really need to test a single bit.  However in RTL we no longer have this
> > information available and just have an SImode value (due to the promotion of
> > QImode to SImode).
> >
> > This RFC fixes it by emitting an explicit & 1 during the expansion of the
> > conditional branch.
> >
> > However it's unlikely that we want to do this unconditionally.  Most targets
> > I've tested seem to have harmless code changes, like x86 changes from testb to
> > andl, $1.
> >
> > So I have two questions:
> >
> > 1. Should I limit this behind a target macro? Or should I just leave it for all
> >    targets and deal with the fallout.
> > 2. How can I tell whether the C99 0-1 values bools are being used or the older
> >    0, non-0 variant?
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > However there are some benign codegen changes on x86, testb changed to andl $1.
> >
> > This pattern occurs more than 120,000 times in SPECCPU 2017 and so is quite common.
>
> How does this help a target?  Why does RTL nonzerop bits not recover this
> information and the desired optimization done later during combine
> for example?  Why's a SImode compare not OK if there's no QImode compare?
> We have undocumented addcc, negcc, etc. patterns, should we have a
> andcc pattern for this indicating support for andcc + jump as opposed
> to cmpcc + jump?
>
> So - what's the target and what's a testcase?
>
> Thanks,
> Richard.
>
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> >        * tree.h (tree_zero_one_valued_p): New.
> >        * dojump.cc (do_jump): Add & 1 if truth type.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/dojump.cc b/gcc/dojump.cc
> > index 2af0cd1aca3b6af13d5d8799094ee93f18022296..8eaf1be49cd12298e61c6946ae79ca9de6197864 100644
> > --- a/gcc/dojump.cc
> > +++ b/gcc/dojump.cc
> > @@ -605,7 +605,17 @@ do_jump (tree exp, rtx_code_label *if_false_label,
> >        /* Fall through and generate the normal code.  */
> >      default:
> >      normal:
> > -      temp = expand_normal (exp);
> > +      tree cmp = exp;
> > +      /* If the expression is a truth type then explicitly generate an & 1
> > +      to indicate to the target that it's a zero-one values type.  This
> > +      allows the target to further optimize the comparison should it
> > +      choose to.  */
> > +      if (tree_zero_one_valued_p (exp))
> > +     {
> > +       type = TREE_TYPE (exp);
> > +       cmp = build2 (BIT_AND_EXPR, type, exp, build_int_cstu (type, 1));
> > +     }
> > +      temp = expand_normal (cmp);
> >        do_pending_stack_adjust ();
> >        /* The RTL optimizers prefer comparisons against pseudos.  */
> >        if (GET_CODE (temp) == SUBREG)
> > diff --git a/gcc/tree.h b/gcc/tree.h
> > index 8f8a9660c9e0605eb516de194640b8c1b531b798..be3d2dee82f692e81082cf21c878c10f9fe9e1f1 100644
> > --- a/gcc/tree.h
> > +++ b/gcc/tree.h
> > @@ -4690,6 +4690,7 @@ extern tree signed_or_unsigned_type_for (int, tree);
> >  extern tree signed_type_for (tree);
> >  extern tree unsigned_type_for (tree);
> >  extern bool is_truth_type_for (tree, tree);
> > +extern bool tree_zero_one_valued_p (tree);
> >  extern tree truth_type_for (tree);
> >  extern tree build_pointer_type_for_mode (tree, machine_mode, bool);
> >  extern tree build_pointer_type (tree);
> >
> >
> >
> >
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> HRB 36809 (AG Nuernberg)
>

--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-26 11:46       ` Tamar Christina
@ 2022-09-26 12:34         ` Richard Biener
  2022-09-26 12:43           ` Richard Biener
  2022-09-28 15:04         ` Richard Sandiford
  1 sibling, 1 reply; 30+ messages in thread
From: Richard Biener @ 2022-09-26 12:34 UTC (permalink / raw)
  To: Tamar Christina; +Cc: gcc-patches, nd, jeffreyalaw, Richard Sandiford

On Mon, 26 Sep 2022, Tamar Christina wrote:

> > Maybe the target could use (subreg:SI (reg:BI ...)) as argument. Heh.
> 
> But then I'd still need to change the expansion code. I suppose this could prevent the issue with changes to code on other targets.
> 
> > > > We have undocumented addcc, negcc, etc. patterns, should we have aandcc pattern for this indicating support for andcc + jump as opposedto cmpcc + jump?
> > >
> > > This could work yeah. I didn't know these existed.
> 
> > Ah, so they are conditional add, not add setting CC, so andcc wouldn't
> > be appropriate.
> 
> > So I'm not sure how we'd handle such situation - maybe looking at
> > REG_DECL and recognizing a _Bool PARM_DECL is OK?
> 
> I have a slight suspicion that Richard Sandiford would likely reject this though.. The additional AND seemed less hacky as it's just communicating range.
> 
> I still need to also figure out which representation of bool is being used, because only the 0-1 variant works. Is there a way to check that?

So another option would be, in case you have (subreg:SI (reg:QI)),
if we expand

 if (b != 0)

expand that to

 !((b & 255) == 0)

basically invert the comparison and the leverage the paradoxical subreg
to specify a narrower immediate to AND with?  Just hoping that arm
can do 255 as immediate and still efficiently handle this?

Wouldn't this transform be possible in combine with the appropriate
backend pattern and combine synthesizing the and for paradoxical subregs?

Richard.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-26 12:34         ` Richard Biener
@ 2022-09-26 12:43           ` Richard Biener
  2022-09-26 14:02             ` Tamar Christina
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Biener @ 2022-09-26 12:43 UTC (permalink / raw)
  To: Tamar Christina; +Cc: gcc-patches, nd, jeffreyalaw, Richard Sandiford

On Mon, 26 Sep 2022, Richard Biener wrote:

> On Mon, 26 Sep 2022, Tamar Christina wrote:
> 
> > > Maybe the target could use (subreg:SI (reg:BI ...)) as argument. Heh.
> > 
> > But then I'd still need to change the expansion code. I suppose this could prevent the issue with changes to code on other targets.
> > 
> > > > > We have undocumented addcc, negcc, etc. patterns, should we have aandcc pattern for this indicating support for andcc + jump as opposedto cmpcc + jump?
> > > >
> > > > This could work yeah. I didn't know these existed.
> > 
> > > Ah, so they are conditional add, not add setting CC, so andcc wouldn't
> > > be appropriate.
> > 
> > > So I'm not sure how we'd handle such situation - maybe looking at
> > > REG_DECL and recognizing a _Bool PARM_DECL is OK?
> > 
> > I have a slight suspicion that Richard Sandiford would likely reject this though.. The additional AND seemed less hacky as it's just communicating range.
> > 
> > I still need to also figure out which representation of bool is being used, because only the 0-1 variant works. Is there a way to check that?
> 
> So another option would be, in case you have (subreg:SI (reg:QI)),
> if we expand
> 
>  if (b != 0)
> 
> expand that to
> 
>  !((b & 255) == 0)
> 
> basically invert the comparison and the leverage the paradoxical subreg
> to specify a narrower immediate to AND with?  Just hoping that arm
> can do 255 as immediate and still efficiently handle this?
> 
> Wouldn't this transform be possible in combine with the appropriate
> backend pattern and combine synthesizing the and for paradoxical subregs?

Looking at what we produce on aarch64 it seems 'bool' is using
an SImode register but your characterization that the upper 24 bits
have undefined content suggests that is a wrong representation?
If the ABI doesn't say anything about the upper bits we should
reflect that somehow?

Richard.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-26 12:43           ` Richard Biener
@ 2022-09-26 14:02             ` Tamar Christina
  0 siblings, 0 replies; 30+ messages in thread
From: Tamar Christina @ 2022-09-26 14:02 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, nd, jeffreyalaw, Richard Sandiford

> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Monday, September 26, 2022 1:43 PM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jeffreyalaw@gmail.com;
> Richard Sandiford <Richard.Sandiford@arm.com>
> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> branches, give hint if argument is a truth type to backend
> 
> On Mon, 26 Sep 2022, Richard Biener wrote:
> 
> > On Mon, 26 Sep 2022, Tamar Christina wrote:
> >
> > > > Maybe the target could use (subreg:SI (reg:BI ...)) as argument. Heh.
> > >
> > > But then I'd still need to change the expansion code. I suppose this could
> prevent the issue with changes to code on other targets.
> > >
> > > > > > We have undocumented addcc, negcc, etc. patterns, should we
> have aandcc pattern for this indicating support for andcc + jump as
> opposedto cmpcc + jump?
> > > > >
> > > > > This could work yeah. I didn't know these existed.
> > >
> > > > Ah, so they are conditional add, not add setting CC, so andcc
> > > > wouldn't be appropriate.
> > >
> > > > So I'm not sure how we'd handle such situation - maybe looking at
> > > > REG_DECL and recognizing a _Bool PARM_DECL is OK?
> > >
> > > I have a slight suspicion that Richard Sandiford would likely reject this
> though.. The additional AND seemed less hacky as it's just communicating
> range.
> > >
> > > I still need to also figure out which representation of bool is being used,
> because only the 0-1 variant works. Is there a way to check that?
> >
> > So another option would be, in case you have (subreg:SI (reg:QI)), if
> > we expand
> >
> >  if (b != 0)
> >
> > expand that to
> >
> >  !((b & 255) == 0)
> >
> > basically invert the comparison and the leverage the paradoxical
> > subreg to specify a narrower immediate to AND with?  Just hoping that
> > arm can do 255 as immediate and still efficiently handle this?

We can and already do, and don't need that representation to do so.
The problem is, handling 255 is already inefficient. It requires us to use an additional
Instruction to test the value. Whereas we have a fused test single bit and branch instruction.

> >
> > Wouldn't this transform be possible in combine with the appropriate
> > backend pattern and combine synthesizing the and for paradoxical
> subregs?

Not unless we have enough range information in RTL to know that whatever value has
been fed into the cbranch has a range of 1 bit. A range of 8 bits we already have and isn't value useful.

The idea was to transform what we currently have:

        tst     w0, 255
        bne     .L4
        ret

i.e. test the bottom 8 bits, into

        tbnz    w0, #0, .L4
        ret

i.e. test only bit 0 and branch based on that bit. We cannot do this when all we know is that the range is 8 bits.

> 
> Looking at what we produce on aarch64 it seems 'bool' is using an SImode
> register but your characterization that the upper 24 bits have undefined
> content suggests that is a wrong representation?
> If the ABI doesn't say anything about the upper bits we should reflect that
> somehow?

It does. And no "bool" is using QImode. The expansion of

extern void h ();

void g1(bool x)
{
  if (__builtin_expect (x, 0))
    h ();
}

Shows that the argument x is passed as a QI mode, but like many RISC targets (and even i386) we promote the argument during expansion:

(insn 2 4 3 2 (set (reg/v:SI 92 [ x ])
        (zero_extend:SI (reg:QI 0 x0 [ x ]))) "/app/example.cpp":4:1 -1
     (nil))

But the value is passed as QImode.

We use this fact to know that the range is 8 bits in the cbanch instruction.  If no operation was done that requires a bigger
range then combine will push the zero extend into the cbranch and we have various patterns to handle different forms of this.

For instance:

void g1(bool *x)
{
  if (__builtin_expect (*x, 0))
    h ();
}

Because of the load of x we generate:

        ldrb    w0, [x0]
        cbnz    w0, .L7
        ret

because we know the top bits are defined to 0 in this case and can just test the entire register.

The reason for this promotion for us and many other backends is one of efficiency. If we don't promote to something
we have native instructions for we would have to promote and demote the value at *every* instruction in RTL.

This causes significant noise in the RTL.  So we can't do anything different here.  I have plans to try to fix this, but not in GCC 13.

But even then it won't help with this case, because we explicitly need to know that the range is a single bit. Not 8 bits.

Regards,
Tamar

> 
> Richard.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-26 11:46       ` Tamar Christina
  2022-09-26 12:34         ` Richard Biener
@ 2022-09-28 15:04         ` Richard Sandiford
  2022-09-28 17:23           ` Jeff Law
  1 sibling, 1 reply; 30+ messages in thread
From: Richard Sandiford @ 2022-09-28 15:04 UTC (permalink / raw)
  To: Tamar Christina; +Cc: Richard Biener, gcc-patches, nd, jeffreyalaw

Tamar Christina <Tamar.Christina@arm.com> writes:
>> Maybe the target could use (subreg:SI (reg:BI ...)) as argument. Heh.
>
> But then I'd still need to change the expansion code. I suppose this could
> prevent the issue with changes to code on other targets. 
>
>> > > We have undocumented addcc, negcc, etc. patterns, should we have aandcc
> pattern for this indicating support for andcc + jump as opposedto cmpcc + jump?
>> >
>> > This could work yeah. I didn't know these existed.
>
>> Ah, so they are conditional add, not add setting CC, so andcc wouldn't
>> be appropriate.
>
>> So I'm not sure how we'd handle such situation - maybe looking at
>> REG_DECL and recognizing a _Bool PARM_DECL is OK?
>
> I have a slight suspicion that Richard Sandiford would likely reject this
> though..

Good guess :-P  We shouldn't rely on something like that for correctness.

Would it help if we promoted the test-and-branch instructions to optabs,
alongside cbranch?  The jump expanders could then target it directly.

IMO that'd be a reasonable thing to do if it does help.  It's a relatively
common operation, especially on CISCy targets.

Thanks,
Richard

> The additional AND seemed less hacky as it's just communicating range.
>
> I still need to also figure out which representation of bool is being used,
> because only the 0-1 variant works. Is there a way to check that?
>
> Thanks, 
> Tamar. 
>
> -------------------------------------------------------------------------------
> From: Richard Biener <rguenther@suse.de>
> Sent: Monday, September 26, 2022 12:32 PM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>;
> jeffreyalaw@gmail.com <jeffreyalaw@gmail.com>; Richard Sandiford
> <Richard.Sandiford@arm.com>
> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches,
> give hint if argument is a truth type to backend
>  
> On Mon, 26 Sep 2022, Tamar Christina wrote:
>
>> > > This pattern occurs more than 120,000 times in SPECCPU 2017 and so is
> quite common.
>>
>> > How does this help a target?
>>
>> So the idea is to communicate that only the bottom bit of the value is
> relevant and not the entire value itself.
>>
>> > Why does RTL nonzerop bits not recover thisinformation and the desired
> optimization done later during combinefor example?
>>
>> I'm not sure it works here. We (AArch64) don't have QImode integer registers,
> so our apcs says that the top bits of the 32-bit registers it's passed in are
> undefined.
>>
>> We have to zero extend the value first if we really want it as an 8-bit
> value. So the problem is if you e.g. Pass a boolean as an argument of a
> function I don't think nonzero bits will return anything useful.
>>
>> > Why's a SImode compare not OK if there's no QImode compare?
>>
>> The mode then becomes irrelevant because we're telling the backend that only
> a single bit matters. And we have instructions to test and branch on the value
> of a single bit. See https://gcc.gnu.org/pipermail/gcc-patches/2022-September/
> 602090.html for the testcases
>
> Maybe the target could use (subreg:SI (reg:BI ...)) as argument.  Heh.
>
>> > We have undocumented addcc, negcc, etc. patterns, should we have aandcc
> pattern for this indicating support for andcc + jump as opposedto cmpcc + jump?
>>
>> This could work yeah. I didn't know these existed.
>
> Ah, so they are conditional add, not add setting CC, so andcc wouldn't
> be appropriate.
>
> So I'm not sure how we'd handle such situation - maybe looking at
> REG_DECL and recognizing a _Bool PARM_DECL is OK?
>
> Richard.
>
>> > So - what's the target and what's a testcase?
>>
>> See https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602090.html :)
>>
>> Thanks,
>> Tamar
>>
>> ________________________________
>> From: Richard Biener <rguenther@suse.de>
>> Sent: Monday, September 26, 2022 11:35 AM
>> To: Tamar Christina <Tamar.Christina@arm.com>
>> Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>;
> jeffreyalaw@gmail.com <jeffreyalaw@gmail.com>; Richard Sandiford
> <Richard.Sandiford@arm.com>
>> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> branches, give hint if argument is a truth type to backend
>>
>> On Fri, 23 Sep 2022, Tamar Christina wrote:
>>
>> > Hi All,
>> >
>> > This is an RFC to figure out how to deal with targets that don't have
> native
>> > comparisons against QImode values.
>> >
>> > Booleans, at least in C99 and higher are 0-1 valued.  This means that we
> only
>> > really need to test a single bit.  However in RTL we no longer have this
>> > information available and just have an SImode value (due to the promotion
> of
>> > QImode to SImode).
>> >
>> > This RFC fixes it by emitting an explicit & 1 during the expansion of the
>> > conditional branch.
>> >
>> > However it's unlikely that we want to do this unconditionally.  Most
> targets
>> > I've tested seem to have harmless code changes, like x86 changes from testb
> to
>> > andl, $1.
>> >
>> > So I have two questions:
>> >
>> > 1. Should I limit this behind a target macro? Or should I just leave it for
> all
>> >    targets and deal with the fallout.
>> > 2. How can I tell whether the C99 0-1 values bools are being used or the
> older
>> >    0, non-0 variant?
>> >
>> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>> > However there are some benign codegen changes on x86, testb changed to andl
> $1.
>> >
>> > This pattern occurs more than 120,000 times in SPECCPU 2017 and so is quite
> common.
>>
>> How does this help a target?  Why does RTL nonzerop bits not recover this
>> information and the desired optimization done later during combine
>> for example?  Why's a SImode compare not OK if there's no QImode compare?
>> We have undocumented addcc, negcc, etc. patterns, should we have a
>> andcc pattern for this indicating support for andcc + jump as opposed
>> to cmpcc + jump?
>>
>> So - what's the target and what's a testcase?
>>
>> Thanks,
>> Richard.
>>
>> > Thanks,
>> > Tamar
>> >
>> > gcc/ChangeLog:
>> >
>> >        * tree.h (tree_zero_one_valued_p): New.
>> >        * dojump.cc (do_jump): Add & 1 if truth type.
>> >
>> > --- inline copy of patch --
>> > diff --git a/gcc/dojump.cc b/gcc/dojump.cc
>> > index
> 2af0cd1aca3b6af13d5d8799094ee93f18022296..8eaf1be49cd12298e61c6946ae79ca9de6197864
> 100644
>> > --- a/gcc/dojump.cc
>> > +++ b/gcc/dojump.cc
>> > @@ -605,7 +605,17 @@ do_jump (tree exp, rtx_code_label *if_false_label,
>> >        /* Fall through and generate the normal code.  */
>> >      default:
>> >      normal:
>> > -      temp = expand_normal (exp);
>> > +      tree cmp = exp;
>> > +      /* If the expression is a truth type then explicitly generate an & 1
>> > +      to indicate to the target that it's a zero-one values type.  This
>> > +      allows the target to further optimize the comparison should it
>> > +      choose to.  */
>> > +      if (tree_zero_one_valued_p (exp))
>> > +     {
>> > +       type = TREE_TYPE (exp);
>> > +       cmp = build2 (BIT_AND_EXPR, type, exp, build_int_cstu (type, 1));
>> > +     }
>> > +      temp = expand_normal (cmp);
>> >        do_pending_stack_adjust ();
>> >        /* The RTL optimizers prefer comparisons against pseudos.  */
>> >        if (GET_CODE (temp) == SUBREG)
>> > diff --git a/gcc/tree.h b/gcc/tree.h
>> > index
> 8f8a9660c9e0605eb516de194640b8c1b531b798..be3d2dee82f692e81082cf21c878c10f9fe9e1f1
> 100644
>> > --- a/gcc/tree.h
>> > +++ b/gcc/tree.h
>> > @@ -4690,6 +4690,7 @@ extern tree signed_or_unsigned_type_for (int, tree);
>> >  extern tree signed_type_for (tree);
>> >  extern tree unsigned_type_for (tree);
>> >  extern bool is_truth_type_for (tree, tree);
>> > +extern bool tree_zero_one_valued_p (tree);
>> >  extern tree truth_type_for (tree);
>> >  extern tree build_pointer_type_for_mode (tree, machine_mode, bool);
>> >  extern tree build_pointer_type (tree);
>> >
>> >
>> >
>> >
>> >
>>
>> --
>> Richard Biener <rguenther@suse.de>
>> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
>> Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
>> HRB 36809 (AG Nuernberg)
>>
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-28 15:04         ` Richard Sandiford
@ 2022-09-28 17:23           ` Jeff Law
  2022-09-29  9:37             ` Richard Sandiford
  0 siblings, 1 reply; 30+ messages in thread
From: Jeff Law @ 2022-09-28 17:23 UTC (permalink / raw)
  To: Tamar Christina, Richard Biener, gcc-patches, nd, richard.sandiford


On 9/28/22 09:04, Richard Sandiford wrote:
> Tamar Christina <Tamar.Christina@arm.com> writes:
>>> Maybe the target could use (subreg:SI (reg:BI ...)) as argument. Heh.
>> But then I'd still need to change the expansion code. I suppose this could
>> prevent the issue with changes to code on other targets.
>>
>>>>> We have undocumented addcc, negcc, etc. patterns, should we have aandcc
>> pattern for this indicating support for andcc + jump as opposedto cmpcc + jump?
>>>> This could work yeah. I didn't know these existed.
>>> Ah, so they are conditional add, not add setting CC, so andcc wouldn't
>>> be appropriate.
>>> So I'm not sure how we'd handle such situation - maybe looking at
>>> REG_DECL and recognizing a _Bool PARM_DECL is OK?
>> I have a slight suspicion that Richard Sandiford would likely reject this
>> though..
> Good guess :-P  We shouldn't rely on something like that for correctness.
>
> Would it help if we promoted the test-and-branch instructions to optabs,
> alongside cbranch?  The jump expanders could then target it directly.
>
> IMO that'd be a reasonable thing to do if it does help.  It's a relatively
> common operation, especially on CISCy targets.

But don't we represent these single bit tests using zero_extract as the 
condition of the branch?  I guess if we can generate them directly 
rather than waiting for combine to deduce that we're dealing with a 
single bit test and constructing the zero_extract form would be an 
improvement and might help aarch at the same time.


jeff



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-28 17:23           ` Jeff Law
@ 2022-09-29  9:37             ` Richard Sandiford
  2022-09-29  9:40               ` Richard Biener
  2022-09-29 20:49               ` Jeff Law
  0 siblings, 2 replies; 30+ messages in thread
From: Richard Sandiford @ 2022-09-29  9:37 UTC (permalink / raw)
  To: Jeff Law; +Cc: Tamar Christina, Richard Biener, gcc-patches, nd

Jeff Law <jeffreyalaw@gmail.com> writes:
> On 9/28/22 09:04, Richard Sandiford wrote:
>> Tamar Christina <Tamar.Christina@arm.com> writes:
>>>> Maybe the target could use (subreg:SI (reg:BI ...)) as argument. Heh.
>>> But then I'd still need to change the expansion code. I suppose this could
>>> prevent the issue with changes to code on other targets.
>>>
>>>>>> We have undocumented addcc, negcc, etc. patterns, should we have aandcc
>>> pattern for this indicating support for andcc + jump as opposedto cmpcc + jump?
>>>>> This could work yeah. I didn't know these existed.
>>>> Ah, so they are conditional add, not add setting CC, so andcc wouldn't
>>>> be appropriate.
>>>> So I'm not sure how we'd handle such situation - maybe looking at
>>>> REG_DECL and recognizing a _Bool PARM_DECL is OK?
>>> I have a slight suspicion that Richard Sandiford would likely reject this
>>> though..
>> Good guess :-P  We shouldn't rely on something like that for correctness.
>>
>> Would it help if we promoted the test-and-branch instructions to optabs,
>> alongside cbranch?  The jump expanders could then target it directly.
>>
>> IMO that'd be a reasonable thing to do if it does help.  It's a relatively
>> common operation, especially on CISCy targets.
>
> But don't we represent these single bit tests using zero_extract as the 
> condition of the branch?  I guess if we can generate them directly 
> rather than waiting for combine to deduce that we're dealing with a 
> single bit test and constructing the zero_extract form would be an 
> improvement and might help aarch at the same time.

Do you mean that the promote_mode stuff should use ext(z)v rather than
zero_extend to promote a bool, where available?  If so, I agree that
might help.  But it sounds like it would have downsides too.  Currently
a bool memory can be zero-extended on the fly using a load, but if we
used the zero_extract form instead, we'd have to extract the bit after
the load.  And (as an alternative) choosing different behaviour based
on whether expand sees a REG or a MEM sounds like it could still cause
problems, since REGs could be replaced by MEMs (or vice versa) later in
the RTL passes.

ISTM that the original patch was inserting an extra operation in the
branch expansion in order to target a specific instruction.  Targeting
the instruction in expand seems good, but IMO we should do it directly,
based on knowledge of whether the instruction actually exists.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-29  9:37             ` Richard Sandiford
@ 2022-09-29  9:40               ` Richard Biener
  2022-09-29 10:21                 ` Tamar Christina
  2022-09-29 20:49               ` Jeff Law
  1 sibling, 1 reply; 30+ messages in thread
From: Richard Biener @ 2022-09-29  9:40 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Jeff Law, Tamar Christina, gcc-patches, nd

[-- Attachment #1: Type: text/plain, Size: 2908 bytes --]

On Thu, 29 Sep 2022, Richard Sandiford wrote:

> Jeff Law <jeffreyalaw@gmail.com> writes:
> > On 9/28/22 09:04, Richard Sandiford wrote:
> >> Tamar Christina <Tamar.Christina@arm.com> writes:
> >>>> Maybe the target could use (subreg:SI (reg:BI ...)) as argument. Heh.
> >>> But then I'd still need to change the expansion code. I suppose this could
> >>> prevent the issue with changes to code on other targets.
> >>>
> >>>>>> We have undocumented addcc, negcc, etc. patterns, should we have aandcc
> >>> pattern for this indicating support for andcc + jump as opposedto cmpcc + jump?
> >>>>> This could work yeah. I didn't know these existed.
> >>>> Ah, so they are conditional add, not add setting CC, so andcc wouldn't
> >>>> be appropriate.
> >>>> So I'm not sure how we'd handle such situation - maybe looking at
> >>>> REG_DECL and recognizing a _Bool PARM_DECL is OK?
> >>> I have a slight suspicion that Richard Sandiford would likely reject this
> >>> though..
> >> Good guess :-P  We shouldn't rely on something like that for correctness.
> >>
> >> Would it help if we promoted the test-and-branch instructions to optabs,
> >> alongside cbranch?  The jump expanders could then target it directly.
> >>
> >> IMO that'd be a reasonable thing to do if it does help.  It's a relatively
> >> common operation, especially on CISCy targets.
> >
> > But don't we represent these single bit tests using zero_extract as the 
> > condition of the branch?  I guess if we can generate them directly 
> > rather than waiting for combine to deduce that we're dealing with a 
> > single bit test and constructing the zero_extract form would be an 
> > improvement and might help aarch at the same time.
> 
> Do you mean that the promote_mode stuff should use ext(z)v rather than
> zero_extend to promote a bool, where available?  If so, I agree that
> might help.  But it sounds like it would have downsides too.  Currently
> a bool memory can be zero-extended on the fly using a load, but if we
> used the zero_extract form instead, we'd have to extract the bit after
> the load.  And (as an alternative) choosing different behaviour based
> on whether expand sees a REG or a MEM sounds like it could still cause
> problems, since REGs could be replaced by MEMs (or vice versa) later in
> the RTL passes.
> 
> ISTM that the original patch was inserting an extra operation in the
> branch expansion in order to target a specific instruction.  Targeting
> the instruction in expand seems good, but IMO we should do it directly,
> based on knowledge of whether the instruction actually exists.

Yes, I think a compare-and-branch pattern is the best fit here.  Note
on GIMPLE we'd rely on the fact this is a BOOLEAN_TYPE (so
even 8 bit precision bools only have 1 and 0 as meaningful values).
So the 'compare-' bit in compare-and-branch would be interpreting
a BOOLEAN_TYPE, not so much a general compare.

Richard.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-29  9:40               ` Richard Biener
@ 2022-09-29 10:21                 ` Tamar Christina
  2022-09-29 11:09                   ` Richard Biener
  0 siblings, 1 reply; 30+ messages in thread
From: Tamar Christina @ 2022-09-29 10:21 UTC (permalink / raw)
  To: Richard Biener, Richard Sandiford; +Cc: Jeff Law, gcc-patches, nd

> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Thursday, September 29, 2022 10:41 AM
> To: Richard Sandiford <Richard.Sandiford@arm.com>
> Cc: Jeff Law <jeffreyalaw@gmail.com>; Tamar Christina
> <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org; nd <nd@arm.com>
> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> branches, give hint if argument is a truth type to backend
> 
> On Thu, 29 Sep 2022, Richard Sandiford wrote:
> 
> > Jeff Law <jeffreyalaw@gmail.com> writes:
> > > On 9/28/22 09:04, Richard Sandiford wrote:
> > >> Tamar Christina <Tamar.Christina@arm.com> writes:
> > >>>> Maybe the target could use (subreg:SI (reg:BI ...)) as argument. Heh.
> > >>> But then I'd still need to change the expansion code. I suppose
> > >>> this could prevent the issue with changes to code on other targets.
> > >>>
> > >>>>>> We have undocumented addcc, negcc, etc. patterns, should we
> > >>>>>> have aandcc
> > >>> pattern for this indicating support for andcc + jump as opposedto
> cmpcc + jump?
> > >>>>> This could work yeah. I didn't know these existed.
> > >>>> Ah, so they are conditional add, not add setting CC, so andcc
> > >>>> wouldn't be appropriate.
> > >>>> So I'm not sure how we'd handle such situation - maybe looking at
> > >>>> REG_DECL and recognizing a _Bool PARM_DECL is OK?
> > >>> I have a slight suspicion that Richard Sandiford would likely
> > >>> reject this though..
> > >> Good guess :-P  We shouldn't rely on something like that for
> correctness.
> > >>
> > >> Would it help if we promoted the test-and-branch instructions to
> > >> optabs, alongside cbranch?  The jump expanders could then target it
> directly.
> > >>
> > >> IMO that'd be a reasonable thing to do if it does help.  It's a
> > >> relatively common operation, especially on CISCy targets.
> > >
> > > But don't we represent these single bit tests using zero_extract as
> > > the condition of the branch?  I guess if we can generate them
> > > directly rather than waiting for combine to deduce that we're
> > > dealing with a single bit test and constructing the zero_extract
> > > form would be an improvement and might help aarch at the same time.
> >
> > Do you mean that the promote_mode stuff should use ext(z)v rather than
> > zero_extend to promote a bool, where available?  If so, I agree that
> > might help.  But it sounds like it would have downsides too.
> > Currently a bool memory can be zero-extended on the fly using a load,
> > but if we used the zero_extract form instead, we'd have to extract the
> > bit after the load.  And (as an alternative) choosing different
> > behaviour based on whether expand sees a REG or a MEM sounds like it
> > could still cause problems, since REGs could be replaced by MEMs (or
> > vice versa) later in the RTL passes.
> >
> > ISTM that the original patch was inserting an extra operation in the
> > branch expansion in order to target a specific instruction.  Targeting
> > the instruction in expand seems good, but IMO we should do it
> > directly, based on knowledge of whether the instruction actually exists.
> 
> Yes, I think a compare-and-branch pattern is the best fit here.  Note on
> GIMPLE we'd rely on the fact this is a BOOLEAN_TYPE (so even 8 bit precision
> bools only have 1 and 0 as meaningful values).
> So the 'compare-' bit in compare-and-branch would be interpreting a
> BOOLEAN_TYPE, not so much a general compare.

Oh, I was thinking of adding a constant argument representing the precision that
is relevant for the compare in order to make this a bit more general/future proof.

Are you thinking I should instead just make the optab implicitly only work for 1-bit
precision comparisons?

Thanks,
Tamar

> 
> Richard.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-29 10:21                 ` Tamar Christina
@ 2022-09-29 11:09                   ` Richard Biener
  2022-09-30  8:00                     ` Tamar Christina
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Biener @ 2022-09-29 11:09 UTC (permalink / raw)
  To: Tamar Christina via Gcc-patches; +Cc: Richard Sandiford, nd



> Am 29.09.2022 um 12:23 schrieb Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>:
> 
> 
>> 
>> -----Original Message-----
>> From: Richard Biener <rguenther@suse.de>
>> Sent: Thursday, September 29, 2022 10:41 AM
>> To: Richard Sandiford <Richard.Sandiford@arm.com>
>> Cc: Jeff Law <jeffreyalaw@gmail.com>; Tamar Christina
>> <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org; nd <nd@arm.com>
>> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
>> branches, give hint if argument is a truth type to backend
>> 
>>> On Thu, 29 Sep 2022, Richard Sandiford wrote:
>>> 
>>> Jeff Law <jeffreyalaw@gmail.com> writes:
>>>> On 9/28/22 09:04, Richard Sandiford wrote:
>>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
>>>>>>> Maybe the target could use (subreg:SI (reg:BI ...)) as argument. Heh.
>>>>>> But then I'd still need to change the expansion code. I suppose
>>>>>> this could prevent the issue with changes to code on other targets.
>>>>>> 
>>>>>>>>> We have undocumented addcc, negcc, etc. patterns, should we
>>>>>>>>> have aandcc
>>>>>> pattern for this indicating support for andcc + jump as opposedto
>> cmpcc + jump?
>>>>>>>> This could work yeah. I didn't know these existed.
>>>>>>> Ah, so they are conditional add, not add setting CC, so andcc
>>>>>>> wouldn't be appropriate.
>>>>>>> So I'm not sure how we'd handle such situation - maybe looking at
>>>>>>> REG_DECL and recognizing a _Bool PARM_DECL is OK?
>>>>>> I have a slight suspicion that Richard Sandiford would likely
>>>>>> reject this though..
>>>>> Good guess :-P  We shouldn't rely on something like that for
>> correctness.
>>>>> 
>>>>> Would it help if we promoted the test-and-branch instructions to
>>>>> optabs, alongside cbranch?  The jump expanders could then target it
>> directly.
>>>>> 
>>>>> IMO that'd be a reasonable thing to do if it does help.  It's a
>>>>> relatively common operation, especially on CISCy targets.
>>>> 
>>>> But don't we represent these single bit tests using zero_extract as
>>>> the condition of the branch?  I guess if we can generate them
>>>> directly rather than waiting for combine to deduce that we're
>>>> dealing with a single bit test and constructing the zero_extract
>>>> form would be an improvement and might help aarch at the same time.
>>> 
>>> Do you mean that the promote_mode stuff should use ext(z)v rather than
>>> zero_extend to promote a bool, where available?  If so, I agree that
>>> might help.  But it sounds like it would have downsides too.
>>> Currently a bool memory can be zero-extended on the fly using a load,
>>> but if we used the zero_extract form instead, we'd have to extract the
>>> bit after the load.  And (as an alternative) choosing different
>>> behaviour based on whether expand sees a REG or a MEM sounds like it
>>> could still cause problems, since REGs could be replaced by MEMs (or
>>> vice versa) later in the RTL passes.
>>> 
>>> ISTM that the original patch was inserting an extra operation in the
>>> branch expansion in order to target a specific instruction.  Targeting
>>> the instruction in expand seems good, but IMO we should do it
>>> directly, based on knowledge of whether the instruction actually exists.
>> 
>> Yes, I think a compare-and-branch pattern is the best fit here.  Note on
>> GIMPLE we'd rely on the fact this is a BOOLEAN_TYPE (so even 8 bit precision
>> bools only have 1 and 0 as meaningful values).
>> So the 'compare-' bit in compare-and-branch would be interpreting a
>> BOOLEAN_TYPE, not so much a general compare.
> 
> Oh, I was thinking of adding a constant argument representing the precision that
> is relevant for the compare in order to make this a bit more general/future proof.
> 
> Are you thinking I should instead just make the optab implicitly only work for 1-bit
> precision comparisons?

What’s the optab you propose (cite also the documentation part)?

> 
> Thanks,
> Tamar
> 
>> 
>> Richard.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-29  9:37             ` Richard Sandiford
  2022-09-29  9:40               ` Richard Biener
@ 2022-09-29 20:49               ` Jeff Law
  1 sibling, 0 replies; 30+ messages in thread
From: Jeff Law @ 2022-09-29 20:49 UTC (permalink / raw)
  To: Tamar Christina, Richard Biener, gcc-patches, nd, richard.sandiford


On 9/29/22 03:37, Richard Sandiford wrote:
> Jeff Law <jeffreyalaw@gmail.com> writes:
>> On 9/28/22 09:04, Richard Sandiford wrote:
>>> Tamar Christina <Tamar.Christina@arm.com> writes:
>>>>> Maybe the target could use (subreg:SI (reg:BI ...)) as argument. Heh.
>>>> But then I'd still need to change the expansion code. I suppose this could
>>>> prevent the issue with changes to code on other targets.
>>>>
>>>>>>> We have undocumented addcc, negcc, etc. patterns, should we have aandcc
>>>> pattern for this indicating support for andcc + jump as opposedto cmpcc + jump?
>>>>>> This could work yeah. I didn't know these existed.
>>>>> Ah, so they are conditional add, not add setting CC, so andcc wouldn't
>>>>> be appropriate.
>>>>> So I'm not sure how we'd handle such situation - maybe looking at
>>>>> REG_DECL and recognizing a _Bool PARM_DECL is OK?
>>>> I have a slight suspicion that Richard Sandiford would likely reject this
>>>> though..
>>> Good guess :-P  We shouldn't rely on something like that for correctness.
>>>
>>> Would it help if we promoted the test-and-branch instructions to optabs,
>>> alongside cbranch?  The jump expanders could then target it directly.
>>>
>>> IMO that'd be a reasonable thing to do if it does help.  It's a relatively
>>> common operation, especially on CISCy targets.
>> But don't we represent these single bit tests using zero_extract as the
>> condition of the branch?  I guess if we can generate them directly
>> rather than waiting for combine to deduce that we're dealing with a
>> single bit test and constructing the zero_extract form would be an
>> improvement and might help aarch at the same time.
> Do you mean that the promote_mode stuff should use ext(z)v rather than
> zero_extend to promote a bool, where available?

No, just that if we're doing a single bit test that the way to handle 
that is with a zero_extract and the earlier we can generate that form, 
the better.


Jeff


^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-29 11:09                   ` Richard Biener
@ 2022-09-30  8:00                     ` Tamar Christina
  2022-09-30  8:28                       ` Richard Sandiford
  0 siblings, 1 reply; 30+ messages in thread
From: Tamar Christina @ 2022-09-30  8:00 UTC (permalink / raw)
  To: Richard Biener, Tamar Christina via Gcc-patches
  Cc: Richard Sandiford, nd, Jeff Law

> -----Original Message-----
> From: Gcc-patches <gcc-patches-
> bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of Richard
> Biener via Gcc-patches
> Sent: Thursday, September 29, 2022 12:09 PM
> To: Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>
> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; nd <nd@arm.com>
> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> branches, give hint if argument is a truth type to backend
> 
> 
> 
> > Am 29.09.2022 um 12:23 schrieb Tamar Christina via Gcc-patches <gcc-
> patches@gcc.gnu.org>:
> >
> > 
> >>
> >> -----Original Message-----
> >> From: Richard Biener <rguenther@suse.de>
> >> Sent: Thursday, September 29, 2022 10:41 AM
> >> To: Richard Sandiford <Richard.Sandiford@arm.com>
> >> Cc: Jeff Law <jeffreyalaw@gmail.com>; Tamar Christina
> >> <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org; nd
> <nd@arm.com>
> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> >> branches, give hint if argument is a truth type to backend
> >>
> >>> On Thu, 29 Sep 2022, Richard Sandiford wrote:
> >>>
> >>> Jeff Law <jeffreyalaw@gmail.com> writes:
> >>>> On 9/28/22 09:04, Richard Sandiford wrote:
> >>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
> >>>>>>> Maybe the target could use (subreg:SI (reg:BI ...)) as argument.
> Heh.
> >>>>>> But then I'd still need to change the expansion code. I suppose
> >>>>>> this could prevent the issue with changes to code on other targets.
> >>>>>>
> >>>>>>>>> We have undocumented addcc, negcc, etc. patterns, should we
> >>>>>>>>> have aandcc
> >>>>>> pattern for this indicating support for andcc + jump as opposedto
> >> cmpcc + jump?
> >>>>>>>> This could work yeah. I didn't know these existed.
> >>>>>>> Ah, so they are conditional add, not add setting CC, so andcc
> >>>>>>> wouldn't be appropriate.
> >>>>>>> So I'm not sure how we'd handle such situation - maybe looking
> >>>>>>> at REG_DECL and recognizing a _Bool PARM_DECL is OK?
> >>>>>> I have a slight suspicion that Richard Sandiford would likely
> >>>>>> reject this though..
> >>>>> Good guess :-P  We shouldn't rely on something like that for
> >> correctness.
> >>>>>
> >>>>> Would it help if we promoted the test-and-branch instructions to
> >>>>> optabs, alongside cbranch?  The jump expanders could then target
> >>>>> it
> >> directly.
> >>>>>
> >>>>> IMO that'd be a reasonable thing to do if it does help.  It's a
> >>>>> relatively common operation, especially on CISCy targets.
> >>>>
> >>>> But don't we represent these single bit tests using zero_extract as
> >>>> the condition of the branch?  I guess if we can generate them
> >>>> directly rather than waiting for combine to deduce that we're
> >>>> dealing with a single bit test and constructing the zero_extract
> >>>> form would be an improvement and might help aarch at the same time.
> >>>
> >>> Do you mean that the promote_mode stuff should use ext(z)v rather
> >>> than zero_extend to promote a bool, where available?  If so, I agree
> >>> that might help.  But it sounds like it would have downsides too.
> >>> Currently a bool memory can be zero-extended on the fly using a
> >>> load, but if we used the zero_extract form instead, we'd have to
> >>> extract the bit after the load.  And (as an alternative) choosing
> >>> different behaviour based on whether expand sees a REG or a MEM
> >>> sounds like it could still cause problems, since REGs could be
> >>> replaced by MEMs (or vice versa) later in the RTL passes.
> >>>
> >>> ISTM that the original patch was inserting an extra operation in the
> >>> branch expansion in order to target a specific instruction.
> >>> Targeting the instruction in expand seems good, but IMO we should do
> >>> it directly, based on knowledge of whether the instruction actually
> exists.
> >>
> >> Yes, I think a compare-and-branch pattern is the best fit here.  Note
> >> on GIMPLE we'd rely on the fact this is a BOOLEAN_TYPE (so even 8 bit
> >> precision bools only have 1 and 0 as meaningful values).
> >> So the 'compare-' bit in compare-and-branch would be interpreting a
> >> BOOLEAN_TYPE, not so much a general compare.
> >
> > Oh, I was thinking of adding a constant argument representing the
> > precision that is relevant for the compare in order to make this a bit more
> general/future proof.
> >
> > Are you thinking I should instead just make the optab implicitly only
> > work for 1-bit precision comparisons?
> 
> What’s the optab you propose (cite also the documentation part)?

tbranchmode5
  Conditional branch instruction combined with a bit test instruction. Operand 0 is a comparison operator.
  Operand 1 and Operand 2 are the first and second operands of the comparison, respectively.
  Operand 3 is the number of low-order bits that are relevant for the comparison.
  Operand 4 is the code_label to jump to.

Specifically this representation would allow us to emit all our different conditional branching instructions
without needing to rely on combine.  We have some cases that happen during optimization that sometimes prevent
the optimal sequence from being generated. This would also solve that as we would expand to what we want to
start with.

Tamar.

> 
> >
> > Thanks,
> > Tamar
> >
> >>
> >> Richard.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-30  8:00                     ` Tamar Christina
@ 2022-09-30  8:28                       ` Richard Sandiford
  2022-09-30  8:38                         ` Tamar Christina
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Sandiford @ 2022-09-30  8:28 UTC (permalink / raw)
  To: Tamar Christina
  Cc: Richard Biener, Tamar Christina via Gcc-patches, nd, Jeff Law

Tamar Christina <Tamar.Christina@arm.com> writes:
>> -----Original Message-----
>> From: Gcc-patches <gcc-patches-
>> bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of Richard
>> Biener via Gcc-patches
>> Sent: Thursday, September 29, 2022 12:09 PM
>> To: Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>
>> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; nd <nd@arm.com>
>> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
>> branches, give hint if argument is a truth type to backend
>>
>>
>>
>> > Am 29.09.2022 um 12:23 schrieb Tamar Christina via Gcc-patches <gcc-
>> patches@gcc.gnu.org>:
>> >
>> >
>> >>
>> >> -----Original Message-----
>> >> From: Richard Biener <rguenther@suse.de>
>> >> Sent: Thursday, September 29, 2022 10:41 AM
>> >> To: Richard Sandiford <Richard.Sandiford@arm.com>
>> >> Cc: Jeff Law <jeffreyalaw@gmail.com>; Tamar Christina
>> >> <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org; nd
>> <nd@arm.com>
>> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
>> >> branches, give hint if argument is a truth type to backend
>> >>
>> >>> On Thu, 29 Sep 2022, Richard Sandiford wrote:
>> >>>
>> >>> Jeff Law <jeffreyalaw@gmail.com> writes:
>> >>>> On 9/28/22 09:04, Richard Sandiford wrote:
>> >>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
>> >>>>>>> Maybe the target could use (subreg:SI (reg:BI ...)) as argument.
>> Heh.
>> >>>>>> But then I'd still need to change the expansion code. I suppose
>> >>>>>> this could prevent the issue with changes to code on other targets.
>> >>>>>>
>> >>>>>>>>> We have undocumented addcc, negcc, etc. patterns, should we
>> >>>>>>>>> have aandcc
>> >>>>>> pattern for this indicating support for andcc + jump as opposedto
>> >> cmpcc + jump?
>> >>>>>>>> This could work yeah. I didn't know these existed.
>> >>>>>>> Ah, so they are conditional add, not add setting CC, so andcc
>> >>>>>>> wouldn't be appropriate.
>> >>>>>>> So I'm not sure how we'd handle such situation - maybe looking
>> >>>>>>> at REG_DECL and recognizing a _Bool PARM_DECL is OK?
>> >>>>>> I have a slight suspicion that Richard Sandiford would likely
>> >>>>>> reject this though..
>> >>>>> Good guess :-P  We shouldn't rely on something like that for
>> >> correctness.
>> >>>>>
>> >>>>> Would it help if we promoted the test-and-branch instructions to
>> >>>>> optabs, alongside cbranch?  The jump expanders could then target
>> >>>>> it
>> >> directly.
>> >>>>>
>> >>>>> IMO that'd be a reasonable thing to do if it does help.  It's a
>> >>>>> relatively common operation, especially on CISCy targets.
>> >>>>
>> >>>> But don't we represent these single bit tests using zero_extract as
>> >>>> the condition of the branch?  I guess if we can generate them
>> >>>> directly rather than waiting for combine to deduce that we're
>> >>>> dealing with a single bit test and constructing the zero_extract
>> >>>> form would be an improvement and might help aarch at the same time.
>> >>>
>> >>> Do you mean that the promote_mode stuff should use ext(z)v rather
>> >>> than zero_extend to promote a bool, where available?  If so, I agree
>> >>> that might help.  But it sounds like it would have downsides too.
>> >>> Currently a bool memory can be zero-extended on the fly using a
>> >>> load, but if we used the zero_extract form instead, we'd have to
>> >>> extract the bit after the load.  And (as an alternative) choosing
>> >>> different behaviour based on whether expand sees a REG or a MEM
>> >>> sounds like it could still cause problems, since REGs could be
>> >>> replaced by MEMs (or vice versa) later in the RTL passes.
>> >>>
>> >>> ISTM that the original patch was inserting an extra operation in the
>> >>> branch expansion in order to target a specific instruction.
>> >>> Targeting the instruction in expand seems good, but IMO we should do
>> >>> it directly, based on knowledge of whether the instruction actually
>> exists.
>> >>
>> >> Yes, I think a compare-and-branch pattern is the best fit here.  Note
>> >> on GIMPLE we'd rely on the fact this is a BOOLEAN_TYPE (so even 8 bit
>> >> precision bools only have 1 and 0 as meaningful values).
>> >> So the 'compare-' bit in compare-and-branch would be interpreting a
>> >> BOOLEAN_TYPE, not so much a general compare.
>> >
>> > Oh, I was thinking of adding a constant argument representing the
>> > precision that is relevant for the compare in order to make this a bit more
>> general/future proof.
>> >
>> > Are you thinking I should instead just make the optab implicitly only
>> > work for 1-bit precision comparisons?
>>
>> What’s the optab you propose (cite also the documentation part)?
>
> tbranchmode5
>   Conditional branch instruction combined with a bit test instruction. Operand 0 is a comparison operator.
>   Operand 1 and Operand 2 are the first and second operands of the comparison, respectively.
>   Operand 3 is the number of low-order bits that are relevant for the comparison.
>   Operand 4 is the code_label to jump to.

For the TB instructions (and for other similar instructions that I've
seen on other architectures) it would be more useful to have a single-bit
test, with operand 4 specifying the bit position.  Arguably it might then
be better to have separate eq and ne optabs, to avoid the awkward doubling
of the operands (operand 1 contains operands 2 and 3).

I guess a more general way of achieving the same thing would be to make
operand 4 in the optab above a mask rather than a bit count.  But that
might be overly general, if there are no known architectures that have
such an instruction.

Thanks,
Richard

> Specifically this representation would allow us to emit all our different conditional branching instructions
> without needing to rely on combine.  We have some cases that happen during optimization that sometimes prevent
> the optimal sequence from being generated. This would also solve that as we would expand to what we want to
> start with.
>
> Tamar.
>
>>
>> >
>> > Thanks,
>> > Tamar
>> >
>> >>
>> >> Richard.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-30  8:28                       ` Richard Sandiford
@ 2022-09-30  8:38                         ` Tamar Christina
  2022-09-30  8:48                           ` Richard Sandiford
  0 siblings, 1 reply; 30+ messages in thread
From: Tamar Christina @ 2022-09-30  8:38 UTC (permalink / raw)
  To: Richard Sandiford
  Cc: Richard Biener, Tamar Christina via Gcc-patches, nd, Jeff Law

> -----Original Message-----
> From: Richard Sandiford <richard.sandiford@arm.com>
> Sent: Friday, September 30, 2022 9:29 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via Gcc-patches
> <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff Law
> <jeffreyalaw@gmail.com>
> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> branches, give hint if argument is a truth type to backend
> 
> Tamar Christina <Tamar.Christina@arm.com> writes:
> >> -----Original Message-----
> >> From: Gcc-patches <gcc-patches-
> >> bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of Richard
> >> Biener via Gcc-patches
> >> Sent: Thursday, September 29, 2022 12:09 PM
> >> To: Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>
> >> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; nd <nd@arm.com>
> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> >> branches, give hint if argument is a truth type to backend
> >>
> >>
> >>
> >> > Am 29.09.2022 um 12:23 schrieb Tamar Christina via Gcc-patches
> >> > <gcc-
> >> patches@gcc.gnu.org>:
> >> >
> >> >
> >> >>
> >> >> -----Original Message-----
> >> >> From: Richard Biener <rguenther@suse.de>
> >> >> Sent: Thursday, September 29, 2022 10:41 AM
> >> >> To: Richard Sandiford <Richard.Sandiford@arm.com>
> >> >> Cc: Jeff Law <jeffreyalaw@gmail.com>; Tamar Christina
> >> >> <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org; nd
> >> <nd@arm.com>
> >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
> >> >> conditional branches, give hint if argument is a truth type to
> >> >> backend
> >> >>
> >> >>> On Thu, 29 Sep 2022, Richard Sandiford wrote:
> >> >>>
> >> >>> Jeff Law <jeffreyalaw@gmail.com> writes:
> >> >>>> On 9/28/22 09:04, Richard Sandiford wrote:
> >> >>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
> >> >>>>>>> Maybe the target could use (subreg:SI (reg:BI ...)) as argument.
> >> Heh.
> >> >>>>>> But then I'd still need to change the expansion code. I
> >> >>>>>> suppose this could prevent the issue with changes to code on
> other targets.
> >> >>>>>>
> >> >>>>>>>>> We have undocumented addcc, negcc, etc. patterns, should
> we
> >> >>>>>>>>> have aandcc
> >> >>>>>> pattern for this indicating support for andcc + jump as
> >> >>>>>> opposedto
> >> >> cmpcc + jump?
> >> >>>>>>>> This could work yeah. I didn't know these existed.
> >> >>>>>>> Ah, so they are conditional add, not add setting CC, so andcc
> >> >>>>>>> wouldn't be appropriate.
> >> >>>>>>> So I'm not sure how we'd handle such situation - maybe
> >> >>>>>>> looking at REG_DECL and recognizing a _Bool PARM_DECL is OK?
> >> >>>>>> I have a slight suspicion that Richard Sandiford would likely
> >> >>>>>> reject this though..
> >> >>>>> Good guess :-P  We shouldn't rely on something like that for
> >> >> correctness.
> >> >>>>>
> >> >>>>> Would it help if we promoted the test-and-branch instructions
> >> >>>>> to optabs, alongside cbranch?  The jump expanders could then
> >> >>>>> target it
> >> >> directly.
> >> >>>>>
> >> >>>>> IMO that'd be a reasonable thing to do if it does help.  It's a
> >> >>>>> relatively common operation, especially on CISCy targets.
> >> >>>>
> >> >>>> But don't we represent these single bit tests using zero_extract
> >> >>>> as the condition of the branch?  I guess if we can generate them
> >> >>>> directly rather than waiting for combine to deduce that we're
> >> >>>> dealing with a single bit test and constructing the zero_extract
> >> >>>> form would be an improvement and might help aarch at the same
> time.
> >> >>>
> >> >>> Do you mean that the promote_mode stuff should use ext(z)v rather
> >> >>> than zero_extend to promote a bool, where available?  If so, I
> >> >>> agree that might help.  But it sounds like it would have downsides
> too.
> >> >>> Currently a bool memory can be zero-extended on the fly using a
> >> >>> load, but if we used the zero_extract form instead, we'd have to
> >> >>> extract the bit after the load.  And (as an alternative) choosing
> >> >>> different behaviour based on whether expand sees a REG or a MEM
> >> >>> sounds like it could still cause problems, since REGs could be
> >> >>> replaced by MEMs (or vice versa) later in the RTL passes.
> >> >>>
> >> >>> ISTM that the original patch was inserting an extra operation in
> >> >>> the branch expansion in order to target a specific instruction.
> >> >>> Targeting the instruction in expand seems good, but IMO we should
> >> >>> do it directly, based on knowledge of whether the instruction
> >> >>> actually
> >> exists.
> >> >>
> >> >> Yes, I think a compare-and-branch pattern is the best fit here.
> >> >> Note on GIMPLE we'd rely on the fact this is a BOOLEAN_TYPE (so
> >> >> even 8 bit precision bools only have 1 and 0 as meaningful values).
> >> >> So the 'compare-' bit in compare-and-branch would be interpreting
> >> >> a BOOLEAN_TYPE, not so much a general compare.
> >> >
> >> > Oh, I was thinking of adding a constant argument representing the
> >> > precision that is relevant for the compare in order to make this a
> >> > bit more
> >> general/future proof.
> >> >
> >> > Are you thinking I should instead just make the optab implicitly
> >> > only work for 1-bit precision comparisons?
> >>
> >> What’s the optab you propose (cite also the documentation part)?
> >
> > tbranchmode5
> >   Conditional branch instruction combined with a bit test instruction.
> Operand 0 is a comparison operator.
> >   Operand 1 and Operand 2 are the first and second operands of the
> comparison, respectively.
> >   Operand 3 is the number of low-order bits that are relevant for the
> comparison.
> >   Operand 4 is the code_label to jump to.
> 
> For the TB instructions (and for other similar instructions that I've seen on
> other architectures) it would be more useful to have a single-bit test, with
> operand 4 specifying the bit position.  Arguably it might then be better to
> have separate eq and ne optabs, to avoid the awkward doubling of the
> operands (operand 1 contains operands 2 and 3).
> 
> I guess a more general way of achieving the same thing would be to make
> operand 4 in the optab above a mask rather than a bit count.  But that might
> be overly general, if there are no known architectures that have such an
> instruction.

One of the reasons I wanted a range rather than a single bit is that I can the use
this to generate cbz/cbnz early on as well.  This would mean we could use my earlier
patch that tried to drop the QI/HI promotions without needing the any_extend additional
pass if we wanted to.

We'd also no longer need to rely on seeing a paradoxical subreg for a tst.

Tamar.

> 
> Thanks,
> Richard
> 
> > Specifically this representation would allow us to emit all our
> > different conditional branching instructions without needing to rely
> > on combine.  We have some cases that happen during optimization that
> > sometimes prevent the optimal sequence from being generated. This
> would also solve that as we would expand to what we want to start with.
> >
> > Tamar.
> >
> >>
> >> >
> >> > Thanks,
> >> > Tamar
> >> >
> >> >>
> >> >> Richard.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-30  8:38                         ` Tamar Christina
@ 2022-09-30  8:48                           ` Richard Sandiford
  2022-09-30  9:15                             ` Tamar Christina
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Sandiford @ 2022-09-30  8:48 UTC (permalink / raw)
  To: Tamar Christina
  Cc: Richard Biener, Tamar Christina via Gcc-patches, nd, Jeff Law

Tamar Christina <Tamar.Christina@arm.com> writes:
>> -----Original Message-----
>> From: Richard Sandiford <richard.sandiford@arm.com>
>> Sent: Friday, September 30, 2022 9:29 AM
>> To: Tamar Christina <Tamar.Christina@arm.com>
>> Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via Gcc-patches
>> <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff Law
>> <jeffreyalaw@gmail.com>
>> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
>> branches, give hint if argument is a truth type to backend
>> 
>> Tamar Christina <Tamar.Christina@arm.com> writes:
>> >> -----Original Message-----
>> >> From: Gcc-patches <gcc-patches-
>> >> bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of Richard
>> >> Biener via Gcc-patches
>> >> Sent: Thursday, September 29, 2022 12:09 PM
>> >> To: Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>
>> >> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; nd <nd@arm.com>
>> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
>> >> branches, give hint if argument is a truth type to backend
>> >>
>> >>
>> >>
>> >> > Am 29.09.2022 um 12:23 schrieb Tamar Christina via Gcc-patches
>> >> > <gcc-
>> >> patches@gcc.gnu.org>:
>> >> >
>> >> >
>> >> >>
>> >> >> -----Original Message-----
>> >> >> From: Richard Biener <rguenther@suse.de>
>> >> >> Sent: Thursday, September 29, 2022 10:41 AM
>> >> >> To: Richard Sandiford <Richard.Sandiford@arm.com>
>> >> >> Cc: Jeff Law <jeffreyalaw@gmail.com>; Tamar Christina
>> >> >> <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org; nd
>> >> <nd@arm.com>
>> >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
>> >> >> conditional branches, give hint if argument is a truth type to
>> >> >> backend
>> >> >>
>> >> >>> On Thu, 29 Sep 2022, Richard Sandiford wrote:
>> >> >>>
>> >> >>> Jeff Law <jeffreyalaw@gmail.com> writes:
>> >> >>>> On 9/28/22 09:04, Richard Sandiford wrote:
>> >> >>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
>> >> >>>>>>> Maybe the target could use (subreg:SI (reg:BI ...)) as argument.
>> >> Heh.
>> >> >>>>>> But then I'd still need to change the expansion code. I
>> >> >>>>>> suppose this could prevent the issue with changes to code on
>> other targets.
>> >> >>>>>>
>> >> >>>>>>>>> We have undocumented addcc, negcc, etc. patterns, should
>> we
>> >> >>>>>>>>> have aandcc
>> >> >>>>>> pattern for this indicating support for andcc + jump as
>> >> >>>>>> opposedto
>> >> >> cmpcc + jump?
>> >> >>>>>>>> This could work yeah. I didn't know these existed.
>> >> >>>>>>> Ah, so they are conditional add, not add setting CC, so andcc
>> >> >>>>>>> wouldn't be appropriate.
>> >> >>>>>>> So I'm not sure how we'd handle such situation - maybe
>> >> >>>>>>> looking at REG_DECL and recognizing a _Bool PARM_DECL is OK?
>> >> >>>>>> I have a slight suspicion that Richard Sandiford would likely
>> >> >>>>>> reject this though..
>> >> >>>>> Good guess :-P  We shouldn't rely on something like that for
>> >> >> correctness.
>> >> >>>>>
>> >> >>>>> Would it help if we promoted the test-and-branch instructions
>> >> >>>>> to optabs, alongside cbranch?  The jump expanders could then
>> >> >>>>> target it
>> >> >> directly.
>> >> >>>>>
>> >> >>>>> IMO that'd be a reasonable thing to do if it does help.  It's a
>> >> >>>>> relatively common operation, especially on CISCy targets.
>> >> >>>>
>> >> >>>> But don't we represent these single bit tests using zero_extract
>> >> >>>> as the condition of the branch?  I guess if we can generate them
>> >> >>>> directly rather than waiting for combine to deduce that we're
>> >> >>>> dealing with a single bit test and constructing the zero_extract
>> >> >>>> form would be an improvement and might help aarch at the same
>> time.
>> >> >>>
>> >> >>> Do you mean that the promote_mode stuff should use ext(z)v rather
>> >> >>> than zero_extend to promote a bool, where available?  If so, I
>> >> >>> agree that might help.  But it sounds like it would have downsides
>> too.
>> >> >>> Currently a bool memory can be zero-extended on the fly using a
>> >> >>> load, but if we used the zero_extract form instead, we'd have to
>> >> >>> extract the bit after the load.  And (as an alternative) choosing
>> >> >>> different behaviour based on whether expand sees a REG or a MEM
>> >> >>> sounds like it could still cause problems, since REGs could be
>> >> >>> replaced by MEMs (or vice versa) later in the RTL passes.
>> >> >>>
>> >> >>> ISTM that the original patch was inserting an extra operation in
>> >> >>> the branch expansion in order to target a specific instruction.
>> >> >>> Targeting the instruction in expand seems good, but IMO we should
>> >> >>> do it directly, based on knowledge of whether the instruction
>> >> >>> actually
>> >> exists.
>> >> >>
>> >> >> Yes, I think a compare-and-branch pattern is the best fit here.
>> >> >> Note on GIMPLE we'd rely on the fact this is a BOOLEAN_TYPE (so
>> >> >> even 8 bit precision bools only have 1 and 0 as meaningful values).
>> >> >> So the 'compare-' bit in compare-and-branch would be interpreting
>> >> >> a BOOLEAN_TYPE, not so much a general compare.
>> >> >
>> >> > Oh, I was thinking of adding a constant argument representing the
>> >> > precision that is relevant for the compare in order to make this a
>> >> > bit more
>> >> general/future proof.
>> >> >
>> >> > Are you thinking I should instead just make the optab implicitly
>> >> > only work for 1-bit precision comparisons?
>> >>
>> >> What’s the optab you propose (cite also the documentation part)?
>> >
>> > tbranchmode5
>> >   Conditional branch instruction combined with a bit test instruction.
>> Operand 0 is a comparison operator.
>> >   Operand 1 and Operand 2 are the first and second operands of the
>> comparison, respectively.
>> >   Operand 3 is the number of low-order bits that are relevant for the
>> comparison.
>> >   Operand 4 is the code_label to jump to.
>> 
>> For the TB instructions (and for other similar instructions that I've seen on
>> other architectures) it would be more useful to have a single-bit test, with
>> operand 4 specifying the bit position.  Arguably it might then be better to
>> have separate eq and ne optabs, to avoid the awkward doubling of the
>> operands (operand 1 contains operands 2 and 3).
>> 
>> I guess a more general way of achieving the same thing would be to make
>> operand 4 in the optab above a mask rather than a bit count.  But that might
>> be overly general, if there are no known architectures that have such an
>> instruction.
>
> One of the reasons I wanted a range rather than a single bit is that I can the use
> this to generate cbz/cbnz early on as well.

We already have the opportunity to do that via cbranch<mode>4.
But at the moment aarch64.md always forces the separate comparison
instead.  (Not sure why TBH.  Does it enable more ifcvt opportunities?)

If we change the body of cbranch<mode>4 to:

  if ((GET_CODE (operands[0]) != EQ && GET_CODE (operands[0]) != NE)
      || operands[2] != const0_rtx)
    {
      operands[1] = aarch64_gen_compare_reg (GET_CODE (operands[0]),
					     operands[1], operands[2]);
      operands[2] = const0_rtx;
    }

then we generate the cbz/cbnz directly.

Thanks,
Richard


> This would mean we could use my earlier
> patch that tried to drop the QI/HI promotions without needing the any_extend additional
> pass if we wanted to.
>
> We'd also no longer need to rely on seeing a paradoxical subreg for a tst.
>
> Tamar.
>
>> 
>> Thanks,
>> Richard
>> 
>> > Specifically this representation would allow us to emit all our
>> > different conditional branching instructions without needing to rely
>> > on combine.  We have some cases that happen during optimization that
>> > sometimes prevent the optimal sequence from being generated. This
>> would also solve that as we would expand to what we want to start with.
>> >
>> > Tamar.
>> >
>> >>
>> >> >
>> >> > Thanks,
>> >> > Tamar
>> >> >
>> >> >>
>> >> >> Richard.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-30  8:48                           ` Richard Sandiford
@ 2022-09-30  9:15                             ` Tamar Christina
  2022-09-30 10:16                               ` Richard Biener
  0 siblings, 1 reply; 30+ messages in thread
From: Tamar Christina @ 2022-09-30  9:15 UTC (permalink / raw)
  To: Richard Sandiford
  Cc: Richard Biener, Tamar Christina via Gcc-patches, nd, Jeff Law



> -----Original Message-----
> From: Richard Sandiford <richard.sandiford@arm.com>
> Sent: Friday, September 30, 2022 9:49 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via Gcc-patches
> <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff Law
> <jeffreyalaw@gmail.com>
> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> branches, give hint if argument is a truth type to backend
> 
> Tamar Christina <Tamar.Christina@arm.com> writes:
> >> -----Original Message-----
> >> From: Richard Sandiford <richard.sandiford@arm.com>
> >> Sent: Friday, September 30, 2022 9:29 AM
> >> To: Tamar Christina <Tamar.Christina@arm.com>
> >> Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via
> >> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff Law
> >> <jeffreyalaw@gmail.com>
> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> >> branches, give hint if argument is a truth type to backend
> >>
> >> Tamar Christina <Tamar.Christina@arm.com> writes:
> >> >> -----Original Message-----
> >> >> From: Gcc-patches <gcc-patches-
> >> >> bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of
> Richard
> >> >> Biener via Gcc-patches
> >> >> Sent: Thursday, September 29, 2022 12:09 PM
> >> >> To: Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>
> >> >> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; nd
> <nd@arm.com>
> >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
> >> >> conditional branches, give hint if argument is a truth type to
> >> >> backend
> >> >>
> >> >>
> >> >>
> >> >> > Am 29.09.2022 um 12:23 schrieb Tamar Christina via Gcc-patches
> >> >> > <gcc-
> >> >> patches@gcc.gnu.org>:
> >> >> >
> >> >> >
> >> >> >>
> >> >> >> -----Original Message-----
> >> >> >> From: Richard Biener <rguenther@suse.de>
> >> >> >> Sent: Thursday, September 29, 2022 10:41 AM
> >> >> >> To: Richard Sandiford <Richard.Sandiford@arm.com>
> >> >> >> Cc: Jeff Law <jeffreyalaw@gmail.com>; Tamar Christina
> >> >> >> <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org; nd
> >> >> <nd@arm.com>
> >> >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
> >> >> >> conditional branches, give hint if argument is a truth type to
> >> >> >> backend
> >> >> >>
> >> >> >>> On Thu, 29 Sep 2022, Richard Sandiford wrote:
> >> >> >>>
> >> >> >>> Jeff Law <jeffreyalaw@gmail.com> writes:
> >> >> >>>> On 9/28/22 09:04, Richard Sandiford wrote:
> >> >> >>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
> >> >> >>>>>>> Maybe the target could use (subreg:SI (reg:BI ...)) as
> argument.
> >> >> Heh.
> >> >> >>>>>> But then I'd still need to change the expansion code. I
> >> >> >>>>>> suppose this could prevent the issue with changes to code
> >> >> >>>>>> on
> >> other targets.
> >> >> >>>>>>
> >> >> >>>>>>>>> We have undocumented addcc, negcc, etc. patterns,
> should
> >> we
> >> >> >>>>>>>>> have aandcc
> >> >> >>>>>> pattern for this indicating support for andcc + jump as
> >> >> >>>>>> opposedto
> >> >> >> cmpcc + jump?
> >> >> >>>>>>>> This could work yeah. I didn't know these existed.
> >> >> >>>>>>> Ah, so they are conditional add, not add setting CC, so
> >> >> >>>>>>> andcc wouldn't be appropriate.
> >> >> >>>>>>> So I'm not sure how we'd handle such situation - maybe
> >> >> >>>>>>> looking at REG_DECL and recognizing a _Bool PARM_DECL is
> OK?
> >> >> >>>>>> I have a slight suspicion that Richard Sandiford would
> >> >> >>>>>> likely reject this though..
> >> >> >>>>> Good guess :-P  We shouldn't rely on something like that for
> >> >> >> correctness.
> >> >> >>>>>
> >> >> >>>>> Would it help if we promoted the test-and-branch
> >> >> >>>>> instructions to optabs, alongside cbranch?  The jump
> >> >> >>>>> expanders could then target it
> >> >> >> directly.
> >> >> >>>>>
> >> >> >>>>> IMO that'd be a reasonable thing to do if it does help.
> >> >> >>>>> It's a relatively common operation, especially on CISCy targets.
> >> >> >>>>
> >> >> >>>> But don't we represent these single bit tests using
> >> >> >>>> zero_extract as the condition of the branch?  I guess if we
> >> >> >>>> can generate them directly rather than waiting for combine to
> >> >> >>>> deduce that we're dealing with a single bit test and
> >> >> >>>> constructing the zero_extract form would be an improvement
> >> >> >>>> and might help aarch at the same
> >> time.
> >> >> >>>
> >> >> >>> Do you mean that the promote_mode stuff should use ext(z)v
> >> >> >>> rather than zero_extend to promote a bool, where available?
> >> >> >>> If so, I agree that might help.  But it sounds like it would
> >> >> >>> have downsides
> >> too.
> >> >> >>> Currently a bool memory can be zero-extended on the fly using
> >> >> >>> a load, but if we used the zero_extract form instead, we'd
> >> >> >>> have to extract the bit after the load.  And (as an
> >> >> >>> alternative) choosing different behaviour based on whether
> >> >> >>> expand sees a REG or a MEM sounds like it could still cause
> >> >> >>> problems, since REGs could be replaced by MEMs (or vice versa)
> later in the RTL passes.
> >> >> >>>
> >> >> >>> ISTM that the original patch was inserting an extra operation
> >> >> >>> in the branch expansion in order to target a specific instruction.
> >> >> >>> Targeting the instruction in expand seems good, but IMO we
> >> >> >>> should do it directly, based on knowledge of whether the
> >> >> >>> instruction actually
> >> >> exists.
> >> >> >>
> >> >> >> Yes, I think a compare-and-branch pattern is the best fit here.
> >> >> >> Note on GIMPLE we'd rely on the fact this is a BOOLEAN_TYPE (so
> >> >> >> even 8 bit precision bools only have 1 and 0 as meaningful values).
> >> >> >> So the 'compare-' bit in compare-and-branch would be
> >> >> >> interpreting a BOOLEAN_TYPE, not so much a general compare.
> >> >> >
> >> >> > Oh, I was thinking of adding a constant argument representing
> >> >> > the precision that is relevant for the compare in order to make
> >> >> > this a bit more
> >> >> general/future proof.
> >> >> >
> >> >> > Are you thinking I should instead just make the optab implicitly
> >> >> > only work for 1-bit precision comparisons?
> >> >>
> >> >> What’s the optab you propose (cite also the documentation part)?
> >> >
> >> > tbranchmode5
> >> >   Conditional branch instruction combined with a bit test instruction.
> >> Operand 0 is a comparison operator.
> >> >   Operand 1 and Operand 2 are the first and second operands of the
> >> comparison, respectively.
> >> >   Operand 3 is the number of low-order bits that are relevant for
> >> > the
> >> comparison.
> >> >   Operand 4 is the code_label to jump to.
> >>
> >> For the TB instructions (and for other similar instructions that I've
> >> seen on other architectures) it would be more useful to have a
> >> single-bit test, with operand 4 specifying the bit position.
> >> Arguably it might then be better to have separate eq and ne optabs,
> >> to avoid the awkward doubling of the operands (operand 1 contains
> operands 2 and 3).
> >>
> >> I guess a more general way of achieving the same thing would be to
> >> make operand 4 in the optab above a mask rather than a bit count.
> >> But that might be overly general, if there are no known architectures
> >> that have such an instruction.
> >
> > One of the reasons I wanted a range rather than a single bit is that I
> > can the use this to generate cbz/cbnz early on as well.
> 
> We already have the opportunity to do that via cbranch<mode>4.
> But at the moment aarch64.md always forces the separate comparison
> instead.  (Not sure why TBH.  Does it enable more ifcvt opportunities?)
> 
> If we change the body of cbranch<mode>4 to:
> 
>   if ((GET_CODE (operands[0]) != EQ && GET_CODE (operands[0]) != NE)
>       || operands[2] != const0_rtx)
>     {
>       operands[1] = aarch64_gen_compare_reg (GET_CODE (operands[0]),
> 					     operands[1], operands[2]);
>       operands[2] = const0_rtx;
>     }
> 
> then we generate the cbz/cbnz directly.
>

Ah ok, then if Richi agrees, bitpos it is then instead of bit count.

Tamar
 

> Thanks,
> Richard
> 
> 
> > This would mean we could use my earlier patch that tried to drop the
> > QI/HI promotions without needing the any_extend additional pass if we
> > wanted to.
> >
> > We'd also no longer need to rely on seeing a paradoxical subreg for a tst.
> >
> > Tamar.
> >
> >>
> >> Thanks,
> >> Richard
> >>
> >> > Specifically this representation would allow us to emit all our
> >> > different conditional branching instructions without needing to
> >> > rely on combine.  We have some cases that happen during
> >> > optimization that sometimes prevent the optimal sequence from being
> >> > generated. This
> >> would also solve that as we would expand to what we want to start with.
> >> >
> >> > Tamar.
> >> >
> >> >>
> >> >> >
> >> >> > Thanks,
> >> >> > Tamar
> >> >> >
> >> >> >>
> >> >> >> Richard.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-30  9:15                             ` Tamar Christina
@ 2022-09-30 10:16                               ` Richard Biener
  2022-09-30 11:11                                 ` Tamar Christina
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Biener @ 2022-09-30 10:16 UTC (permalink / raw)
  To: Tamar Christina
  Cc: Richard Sandiford, Tamar Christina via Gcc-patches, nd, Jeff Law

On Fri, 30 Sep 2022, Tamar Christina wrote:

> 
> 
> > -----Original Message-----
> > From: Richard Sandiford <richard.sandiford@arm.com>
> > Sent: Friday, September 30, 2022 9:49 AM
> > To: Tamar Christina <Tamar.Christina@arm.com>
> > Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via Gcc-patches
> > <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff Law
> > <jeffreyalaw@gmail.com>
> > Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> > branches, give hint if argument is a truth type to backend
> > 
> > Tamar Christina <Tamar.Christina@arm.com> writes:
> > >> -----Original Message-----
> > >> From: Richard Sandiford <richard.sandiford@arm.com>
> > >> Sent: Friday, September 30, 2022 9:29 AM
> > >> To: Tamar Christina <Tamar.Christina@arm.com>
> > >> Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via
> > >> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff Law
> > >> <jeffreyalaw@gmail.com>
> > >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> > >> branches, give hint if argument is a truth type to backend
> > >>
> > >> Tamar Christina <Tamar.Christina@arm.com> writes:
> > >> >> -----Original Message-----
> > >> >> From: Gcc-patches <gcc-patches-
> > >> >> bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of
> > Richard
> > >> >> Biener via Gcc-patches
> > >> >> Sent: Thursday, September 29, 2022 12:09 PM
> > >> >> To: Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>
> > >> >> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; nd
> > <nd@arm.com>
> > >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
> > >> >> conditional branches, give hint if argument is a truth type to
> > >> >> backend
> > >> >>
> > >> >>
> > >> >>
> > >> >> > Am 29.09.2022 um 12:23 schrieb Tamar Christina via Gcc-patches
> > >> >> > <gcc-
> > >> >> patches@gcc.gnu.org>:
> > >> >> >
> > >> >> >
> > >> >> >>
> > >> >> >> -----Original Message-----
> > >> >> >> From: Richard Biener <rguenther@suse.de>
> > >> >> >> Sent: Thursday, September 29, 2022 10:41 AM
> > >> >> >> To: Richard Sandiford <Richard.Sandiford@arm.com>
> > >> >> >> Cc: Jeff Law <jeffreyalaw@gmail.com>; Tamar Christina
> > >> >> >> <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org; nd
> > >> >> <nd@arm.com>
> > >> >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
> > >> >> >> conditional branches, give hint if argument is a truth type to
> > >> >> >> backend
> > >> >> >>
> > >> >> >>> On Thu, 29 Sep 2022, Richard Sandiford wrote:
> > >> >> >>>
> > >> >> >>> Jeff Law <jeffreyalaw@gmail.com> writes:
> > >> >> >>>> On 9/28/22 09:04, Richard Sandiford wrote:
> > >> >> >>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
> > >> >> >>>>>>> Maybe the target could use (subreg:SI (reg:BI ...)) as
> > argument.
> > >> >> Heh.
> > >> >> >>>>>> But then I'd still need to change the expansion code. I
> > >> >> >>>>>> suppose this could prevent the issue with changes to code
> > >> >> >>>>>> on
> > >> other targets.
> > >> >> >>>>>>
> > >> >> >>>>>>>>> We have undocumented addcc, negcc, etc. patterns,
> > should
> > >> we
> > >> >> >>>>>>>>> have aandcc
> > >> >> >>>>>> pattern for this indicating support for andcc + jump as
> > >> >> >>>>>> opposedto
> > >> >> >> cmpcc + jump?
> > >> >> >>>>>>>> This could work yeah. I didn't know these existed.
> > >> >> >>>>>>> Ah, so they are conditional add, not add setting CC, so
> > >> >> >>>>>>> andcc wouldn't be appropriate.
> > >> >> >>>>>>> So I'm not sure how we'd handle such situation - maybe
> > >> >> >>>>>>> looking at REG_DECL and recognizing a _Bool PARM_DECL is
> > OK?
> > >> >> >>>>>> I have a slight suspicion that Richard Sandiford would
> > >> >> >>>>>> likely reject this though..
> > >> >> >>>>> Good guess :-P  We shouldn't rely on something like that for
> > >> >> >> correctness.
> > >> >> >>>>>
> > >> >> >>>>> Would it help if we promoted the test-and-branch
> > >> >> >>>>> instructions to optabs, alongside cbranch?  The jump
> > >> >> >>>>> expanders could then target it
> > >> >> >> directly.
> > >> >> >>>>>
> > >> >> >>>>> IMO that'd be a reasonable thing to do if it does help.
> > >> >> >>>>> It's a relatively common operation, especially on CISCy targets.
> > >> >> >>>>
> > >> >> >>>> But don't we represent these single bit tests using
> > >> >> >>>> zero_extract as the condition of the branch?  I guess if we
> > >> >> >>>> can generate them directly rather than waiting for combine to
> > >> >> >>>> deduce that we're dealing with a single bit test and
> > >> >> >>>> constructing the zero_extract form would be an improvement
> > >> >> >>>> and might help aarch at the same
> > >> time.
> > >> >> >>>
> > >> >> >>> Do you mean that the promote_mode stuff should use ext(z)v
> > >> >> >>> rather than zero_extend to promote a bool, where available?
> > >> >> >>> If so, I agree that might help.  But it sounds like it would
> > >> >> >>> have downsides
> > >> too.
> > >> >> >>> Currently a bool memory can be zero-extended on the fly using
> > >> >> >>> a load, but if we used the zero_extract form instead, we'd
> > >> >> >>> have to extract the bit after the load.  And (as an
> > >> >> >>> alternative) choosing different behaviour based on whether
> > >> >> >>> expand sees a REG or a MEM sounds like it could still cause
> > >> >> >>> problems, since REGs could be replaced by MEMs (or vice versa)
> > later in the RTL passes.
> > >> >> >>>
> > >> >> >>> ISTM that the original patch was inserting an extra operation
> > >> >> >>> in the branch expansion in order to target a specific instruction.
> > >> >> >>> Targeting the instruction in expand seems good, but IMO we
> > >> >> >>> should do it directly, based on knowledge of whether the
> > >> >> >>> instruction actually
> > >> >> exists.
> > >> >> >>
> > >> >> >> Yes, I think a compare-and-branch pattern is the best fit here.
> > >> >> >> Note on GIMPLE we'd rely on the fact this is a BOOLEAN_TYPE (so
> > >> >> >> even 8 bit precision bools only have 1 and 0 as meaningful values).
> > >> >> >> So the 'compare-' bit in compare-and-branch would be
> > >> >> >> interpreting a BOOLEAN_TYPE, not so much a general compare.
> > >> >> >
> > >> >> > Oh, I was thinking of adding a constant argument representing
> > >> >> > the precision that is relevant for the compare in order to make
> > >> >> > this a bit more
> > >> >> general/future proof.
> > >> >> >
> > >> >> > Are you thinking I should instead just make the optab implicitly
> > >> >> > only work for 1-bit precision comparisons?
> > >> >>
> > >> >> What?s the optab you propose (cite also the documentation part)?
> > >> >
> > >> > tbranchmode5
> > >> >   Conditional branch instruction combined with a bit test instruction.
> > >> Operand 0 is a comparison operator.
> > >> >   Operand 1 and Operand 2 are the first and second operands of the
> > >> comparison, respectively.
> > >> >   Operand 3 is the number of low-order bits that are relevant for
> > >> > the
> > >> comparison.
> > >> >   Operand 4 is the code_label to jump to.
> > >>
> > >> For the TB instructions (and for other similar instructions that I've
> > >> seen on other architectures) it would be more useful to have a
> > >> single-bit test, with operand 4 specifying the bit position.
> > >> Arguably it might then be better to have separate eq and ne optabs,
> > >> to avoid the awkward doubling of the operands (operand 1 contains
> > operands 2 and 3).
> > >>
> > >> I guess a more general way of achieving the same thing would be to
> > >> make operand 4 in the optab above a mask rather than a bit count.
> > >> But that might be overly general, if there are no known architectures
> > >> that have such an instruction.
> > >
> > > One of the reasons I wanted a range rather than a single bit is that I
> > > can the use this to generate cbz/cbnz early on as well.
> > 
> > We already have the opportunity to do that via cbranch<mode>4.
> > But at the moment aarch64.md always forces the separate comparison
> > instead.  (Not sure why TBH.  Does it enable more ifcvt opportunities?)
> > 
> > If we change the body of cbranch<mode>4 to:
> > 
> >   if ((GET_CODE (operands[0]) != EQ && GET_CODE (operands[0]) != NE)
> >       || operands[2] != const0_rtx)
> >     {
> >       operands[1] = aarch64_gen_compare_reg (GET_CODE (operands[0]),
> > 					     operands[1], operands[2]);
> >       operands[2] = const0_rtx;
> >     }
> > 
> > then we generate the cbz/cbnz directly.
> >
> 
> Ah ok, then if Richi agrees, bitpos it is then instead of bit count.

Somehow I understood that cbranch<>4 is already fully capable of the
optimization?

On your earlier proposal I'd have commented that if it wouldn't make
sense to instead have a CCmode setter instead of an expander with
a branch label?  That would be a bit test, like {sign,zero}_extract
compared against zero which can then be combined with a branch?

Of course if the ISA is really bit-test-and-branch then cbranch<>4
itself or a variant of it might be more useful.  Maybe
cbranchbi4 would be "abused" for this?

> Tamar
>  
> 
> > Thanks,
> > Richard
> > 
> > 
> > > This would mean we could use my earlier patch that tried to drop the
> > > QI/HI promotions without needing the any_extend additional pass if we
> > > wanted to.
> > >
> > > We'd also no longer need to rely on seeing a paradoxical subreg for a tst.
> > >
> > > Tamar.
> > >
> > >>
> > >> Thanks,
> > >> Richard
> > >>
> > >> > Specifically this representation would allow us to emit all our
> > >> > different conditional branching instructions without needing to
> > >> > rely on combine.  We have some cases that happen during
> > >> > optimization that sometimes prevent the optimal sequence from being
> > >> > generated. This
> > >> would also solve that as we would expand to what we want to start with.
> > >> >
> > >> > Tamar.
> > >> >
> > >> >>
> > >> >> >
> > >> >> > Thanks,
> > >> >> > Tamar
> > >> >> >
> > >> >> >>
> > >> >> >> Richard.
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-30 10:16                               ` Richard Biener
@ 2022-09-30 11:11                                 ` Tamar Christina
  2022-09-30 11:52                                   ` Richard Biener
  0 siblings, 1 reply; 30+ messages in thread
From: Tamar Christina @ 2022-09-30 11:11 UTC (permalink / raw)
  To: Richard Biener
  Cc: Richard Sandiford, Tamar Christina via Gcc-patches, nd, Jeff Law

> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Friday, September 30, 2022 11:17 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina via
> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff Law
> <jeffreyalaw@gmail.com>
> Subject: RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> branches, give hint if argument is a truth type to backend
> 
> On Fri, 30 Sep 2022, Tamar Christina wrote:
> 
> >
> >
> > > -----Original Message-----
> > > From: Richard Sandiford <richard.sandiford@arm.com>
> > > Sent: Friday, September 30, 2022 9:49 AM
> > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via
> > > Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff Law
> > > <jeffreyalaw@gmail.com>
> > > Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> > > branches, give hint if argument is a truth type to backend
> > >
> > > Tamar Christina <Tamar.Christina@arm.com> writes:
> > > >> -----Original Message-----
> > > >> From: Richard Sandiford <richard.sandiford@arm.com>
> > > >> Sent: Friday, September 30, 2022 9:29 AM
> > > >> To: Tamar Christina <Tamar.Christina@arm.com>
> > > >> Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via
> > > >> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff
> Law
> > > >> <jeffreyalaw@gmail.com>
> > > >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
> > > >> conditional branches, give hint if argument is a truth type to
> > > >> backend
> > > >>
> > > >> Tamar Christina <Tamar.Christina@arm.com> writes:
> > > >> >> -----Original Message-----
> > > >> >> From: Gcc-patches <gcc-patches-
> > > >> >> bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of
> > > Richard
> > > >> >> Biener via Gcc-patches
> > > >> >> Sent: Thursday, September 29, 2022 12:09 PM
> > > >> >> To: Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>
> > > >> >> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; nd
> > > <nd@arm.com>
> > > >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
> > > >> >> conditional branches, give hint if argument is a truth type to
> > > >> >> backend
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> > Am 29.09.2022 um 12:23 schrieb Tamar Christina via
> > > >> >> > Gcc-patches
> > > >> >> > <gcc-
> > > >> >> patches@gcc.gnu.org>:
> > > >> >> >
> > > >> >> >
> > > >> >> >>
> > > >> >> >> -----Original Message-----
> > > >> >> >> From: Richard Biener <rguenther@suse.de>
> > > >> >> >> Sent: Thursday, September 29, 2022 10:41 AM
> > > >> >> >> To: Richard Sandiford <Richard.Sandiford@arm.com>
> > > >> >> >> Cc: Jeff Law <jeffreyalaw@gmail.com>; Tamar Christina
> > > >> >> >> <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org; nd
> > > >> >> <nd@arm.com>
> > > >> >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
> > > >> >> >> conditional branches, give hint if argument is a truth type
> > > >> >> >> to backend
> > > >> >> >>
> > > >> >> >>> On Thu, 29 Sep 2022, Richard Sandiford wrote:
> > > >> >> >>>
> > > >> >> >>> Jeff Law <jeffreyalaw@gmail.com> writes:
> > > >> >> >>>> On 9/28/22 09:04, Richard Sandiford wrote:
> > > >> >> >>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
> > > >> >> >>>>>>> Maybe the target could use (subreg:SI (reg:BI ...)) as
> > > argument.
> > > >> >> Heh.
> > > >> >> >>>>>> But then I'd still need to change the expansion code. I
> > > >> >> >>>>>> suppose this could prevent the issue with changes to
> > > >> >> >>>>>> code on
> > > >> other targets.
> > > >> >> >>>>>>
> > > >> >> >>>>>>>>> We have undocumented addcc, negcc, etc. patterns,
> > > should
> > > >> we
> > > >> >> >>>>>>>>> have aandcc
> > > >> >> >>>>>> pattern for this indicating support for andcc + jump as
> > > >> >> >>>>>> opposedto
> > > >> >> >> cmpcc + jump?
> > > >> >> >>>>>>>> This could work yeah. I didn't know these existed.
> > > >> >> >>>>>>> Ah, so they are conditional add, not add setting CC,
> > > >> >> >>>>>>> so andcc wouldn't be appropriate.
> > > >> >> >>>>>>> So I'm not sure how we'd handle such situation - maybe
> > > >> >> >>>>>>> looking at REG_DECL and recognizing a _Bool PARM_DECL
> > > >> >> >>>>>>> is
> > > OK?
> > > >> >> >>>>>> I have a slight suspicion that Richard Sandiford would
> > > >> >> >>>>>> likely reject this though..
> > > >> >> >>>>> Good guess :-P  We shouldn't rely on something like that
> > > >> >> >>>>> for
> > > >> >> >> correctness.
> > > >> >> >>>>>
> > > >> >> >>>>> Would it help if we promoted the test-and-branch
> > > >> >> >>>>> instructions to optabs, alongside cbranch?  The jump
> > > >> >> >>>>> expanders could then target it
> > > >> >> >> directly.
> > > >> >> >>>>>
> > > >> >> >>>>> IMO that'd be a reasonable thing to do if it does help.
> > > >> >> >>>>> It's a relatively common operation, especially on CISCy
> targets.
> > > >> >> >>>>
> > > >> >> >>>> But don't we represent these single bit tests using
> > > >> >> >>>> zero_extract as the condition of the branch?  I guess if
> > > >> >> >>>> we can generate them directly rather than waiting for
> > > >> >> >>>> combine to deduce that we're dealing with a single bit
> > > >> >> >>>> test and constructing the zero_extract form would be an
> > > >> >> >>>> improvement and might help aarch at the same
> > > >> time.
> > > >> >> >>>
> > > >> >> >>> Do you mean that the promote_mode stuff should use ext(z)v
> > > >> >> >>> rather than zero_extend to promote a bool, where available?
> > > >> >> >>> If so, I agree that might help.  But it sounds like it
> > > >> >> >>> would have downsides
> > > >> too.
> > > >> >> >>> Currently a bool memory can be zero-extended on the fly
> > > >> >> >>> using a load, but if we used the zero_extract form
> > > >> >> >>> instead, we'd have to extract the bit after the load.  And
> > > >> >> >>> (as an
> > > >> >> >>> alternative) choosing different behaviour based on whether
> > > >> >> >>> expand sees a REG or a MEM sounds like it could still
> > > >> >> >>> cause problems, since REGs could be replaced by MEMs (or
> > > >> >> >>> vice versa)
> > > later in the RTL passes.
> > > >> >> >>>
> > > >> >> >>> ISTM that the original patch was inserting an extra
> > > >> >> >>> operation in the branch expansion in order to target a specific
> instruction.
> > > >> >> >>> Targeting the instruction in expand seems good, but IMO we
> > > >> >> >>> should do it directly, based on knowledge of whether the
> > > >> >> >>> instruction actually
> > > >> >> exists.
> > > >> >> >>
> > > >> >> >> Yes, I think a compare-and-branch pattern is the best fit here.
> > > >> >> >> Note on GIMPLE we'd rely on the fact this is a BOOLEAN_TYPE
> > > >> >> >> (so even 8 bit precision bools only have 1 and 0 as meaningful
> values).
> > > >> >> >> So the 'compare-' bit in compare-and-branch would be
> > > >> >> >> interpreting a BOOLEAN_TYPE, not so much a general compare.
> > > >> >> >
> > > >> >> > Oh, I was thinking of adding a constant argument
> > > >> >> > representing the precision that is relevant for the compare
> > > >> >> > in order to make this a bit more
> > > >> >> general/future proof.
> > > >> >> >
> > > >> >> > Are you thinking I should instead just make the optab
> > > >> >> > implicitly only work for 1-bit precision comparisons?
> > > >> >>
> > > >> >> What?s the optab you propose (cite also the documentation part)?
> > > >> >
> > > >> > tbranchmode5
> > > >> >   Conditional branch instruction combined with a bit test instruction.
> > > >> Operand 0 is a comparison operator.
> > > >> >   Operand 1 and Operand 2 are the first and second operands of
> > > >> > the
> > > >> comparison, respectively.
> > > >> >   Operand 3 is the number of low-order bits that are relevant
> > > >> > for the
> > > >> comparison.
> > > >> >   Operand 4 is the code_label to jump to.
> > > >>
> > > >> For the TB instructions (and for other similar instructions that
> > > >> I've seen on other architectures) it would be more useful to have
> > > >> a single-bit test, with operand 4 specifying the bit position.
> > > >> Arguably it might then be better to have separate eq and ne
> > > >> optabs, to avoid the awkward doubling of the operands (operand 1
> > > >> contains
> > > operands 2 and 3).
> > > >>
> > > >> I guess a more general way of achieving the same thing would be
> > > >> to make operand 4 in the optab above a mask rather than a bit count.
> > > >> But that might be overly general, if there are no known
> > > >> architectures that have such an instruction.
> > > >
> > > > One of the reasons I wanted a range rather than a single bit is
> > > > that I can the use this to generate cbz/cbnz early on as well.
> > >
> > > We already have the opportunity to do that via cbranch<mode>4.
> > > But at the moment aarch64.md always forces the separate comparison
> > > instead.  (Not sure why TBH.  Does it enable more ifcvt
> > > opportunities?)
> > >
> > > If we change the body of cbranch<mode>4 to:
> > >
> > >   if ((GET_CODE (operands[0]) != EQ && GET_CODE (operands[0]) != NE)
> > >       || operands[2] != const0_rtx)
> > >     {
> > >       operands[1] = aarch64_gen_compare_reg (GET_CODE (operands[0]),
> > > 					     operands[1], operands[2]);
> > >       operands[2] = const0_rtx;
> > >     }
> > >
> > > then we generate the cbz/cbnz directly.
> > >
> >
> > Ah ok, then if Richi agrees, bitpos it is then instead of bit count.
> 
> Somehow I understood that cbranch<>4 is already fully capable of the
> optimization?
> 
> On your earlier proposal I'd have commented that if it wouldn't make sense
> to instead have a CCmode setter instead of an expander with a branch label?
> That would be a bit test, like {sign,zero}_extract compared against zero which
> can then be combined with a branch?
>

I missed that part, that could work too.

> Of course if the ISA is really bit-test-and-branch then cbranch<>4 itself or a
> variant of it might be more useful.  Maybe
> cbranchbi4 would be "abused" for this?

The instruction is an actual bit-test-and-branch with any arbitrary bitpos.
Yes we can abuse cbranchbi4 for this, but then it also means we can't e.g.
use this to optimize a < 0 where a is a signed value.  With the new optab
this would just be a bit-test-and-branch of the sign bit.

But also I'm not entirely convinced that using `BImode` and assuming a single
bit is safe here. What happens if I compile my source with -std=c89?

So I personally think the new optab makes more sense here. The CC setter would work too.

I guess my question is, between you folks, which approach would you like. It seems that Richi
You'd like a CC setter. Richard do you have a preference of one over the other?

Tamar

> 
> > Tamar
> >
> >
> > > Thanks,
> > > Richard
> > >
> > >
> > > > This would mean we could use my earlier patch that tried to drop
> > > > the QI/HI promotions without needing the any_extend additional
> > > > pass if we wanted to.
> > > >
> > > > We'd also no longer need to rely on seeing a paradoxical subreg for a
> tst.
> > > >
> > > > Tamar.
> > > >
> > > >>
> > > >> Thanks,
> > > >> Richard
> > > >>
> > > >> > Specifically this representation would allow us to emit all our
> > > >> > different conditional branching instructions without needing to
> > > >> > rely on combine.  We have some cases that happen during
> > > >> > optimization that sometimes prevent the optimal sequence from
> > > >> > being generated. This
> > > >> would also solve that as we would expand to what we want to start
> with.
> > > >> >
> > > >> > Tamar.
> > > >> >
> > > >> >>
> > > >> >> >
> > > >> >> > Thanks,
> > > >> >> > Tamar
> > > >> >> >
> > > >> >> >>
> > > >> >> >> Richard.
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461
> Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald,
> Boudien Moerman; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-30 11:11                                 ` Tamar Christina
@ 2022-09-30 11:52                                   ` Richard Biener
  2022-09-30 12:48                                     ` Tamar Christina
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Biener @ 2022-09-30 11:52 UTC (permalink / raw)
  To: Tamar Christina
  Cc: Richard Sandiford, Tamar Christina via Gcc-patches, nd, Jeff Law

On Fri, 30 Sep 2022, Tamar Christina wrote:

> > -----Original Message-----
> > From: Richard Biener <rguenther@suse.de>
> > Sent: Friday, September 30, 2022 11:17 AM
> > To: Tamar Christina <Tamar.Christina@arm.com>
> > Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina via
> > Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff Law
> > <jeffreyalaw@gmail.com>
> > Subject: RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> > branches, give hint if argument is a truth type to backend
> > 
> > On Fri, 30 Sep 2022, Tamar Christina wrote:
> > 
> > >
> > >
> > > > -----Original Message-----
> > > > From: Richard Sandiford <richard.sandiford@arm.com>
> > > > Sent: Friday, September 30, 2022 9:49 AM
> > > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > > Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via
> > > > Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff Law
> > > > <jeffreyalaw@gmail.com>
> > > > Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> > > > branches, give hint if argument is a truth type to backend
> > > >
> > > > Tamar Christina <Tamar.Christina@arm.com> writes:
> > > > >> -----Original Message-----
> > > > >> From: Richard Sandiford <richard.sandiford@arm.com>
> > > > >> Sent: Friday, September 30, 2022 9:29 AM
> > > > >> To: Tamar Christina <Tamar.Christina@arm.com>
> > > > >> Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via
> > > > >> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff
> > Law
> > > > >> <jeffreyalaw@gmail.com>
> > > > >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
> > > > >> conditional branches, give hint if argument is a truth type to
> > > > >> backend
> > > > >>
> > > > >> Tamar Christina <Tamar.Christina@arm.com> writes:
> > > > >> >> -----Original Message-----
> > > > >> >> From: Gcc-patches <gcc-patches-
> > > > >> >> bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of
> > > > Richard
> > > > >> >> Biener via Gcc-patches
> > > > >> >> Sent: Thursday, September 29, 2022 12:09 PM
> > > > >> >> To: Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>
> > > > >> >> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; nd
> > > > <nd@arm.com>
> > > > >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
> > > > >> >> conditional branches, give hint if argument is a truth type to
> > > > >> >> backend
> > > > >> >>
> > > > >> >>
> > > > >> >>
> > > > >> >> > Am 29.09.2022 um 12:23 schrieb Tamar Christina via
> > > > >> >> > Gcc-patches
> > > > >> >> > <gcc-
> > > > >> >> patches@gcc.gnu.org>:
> > > > >> >> >
> > > > >> >> >
> > > > >> >> >>
> > > > >> >> >> -----Original Message-----
> > > > >> >> >> From: Richard Biener <rguenther@suse.de>
> > > > >> >> >> Sent: Thursday, September 29, 2022 10:41 AM
> > > > >> >> >> To: Richard Sandiford <Richard.Sandiford@arm.com>
> > > > >> >> >> Cc: Jeff Law <jeffreyalaw@gmail.com>; Tamar Christina
> > > > >> >> >> <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org; nd
> > > > >> >> <nd@arm.com>
> > > > >> >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
> > > > >> >> >> conditional branches, give hint if argument is a truth type
> > > > >> >> >> to backend
> > > > >> >> >>
> > > > >> >> >>> On Thu, 29 Sep 2022, Richard Sandiford wrote:
> > > > >> >> >>>
> > > > >> >> >>> Jeff Law <jeffreyalaw@gmail.com> writes:
> > > > >> >> >>>> On 9/28/22 09:04, Richard Sandiford wrote:
> > > > >> >> >>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
> > > > >> >> >>>>>>> Maybe the target could use (subreg:SI (reg:BI ...)) as
> > > > argument.
> > > > >> >> Heh.
> > > > >> >> >>>>>> But then I'd still need to change the expansion code. I
> > > > >> >> >>>>>> suppose this could prevent the issue with changes to
> > > > >> >> >>>>>> code on
> > > > >> other targets.
> > > > >> >> >>>>>>
> > > > >> >> >>>>>>>>> We have undocumented addcc, negcc, etc. patterns,
> > > > should
> > > > >> we
> > > > >> >> >>>>>>>>> have aandcc
> > > > >> >> >>>>>> pattern for this indicating support for andcc + jump as
> > > > >> >> >>>>>> opposedto
> > > > >> >> >> cmpcc + jump?
> > > > >> >> >>>>>>>> This could work yeah. I didn't know these existed.
> > > > >> >> >>>>>>> Ah, so they are conditional add, not add setting CC,
> > > > >> >> >>>>>>> so andcc wouldn't be appropriate.
> > > > >> >> >>>>>>> So I'm not sure how we'd handle such situation - maybe
> > > > >> >> >>>>>>> looking at REG_DECL and recognizing a _Bool PARM_DECL
> > > > >> >> >>>>>>> is
> > > > OK?
> > > > >> >> >>>>>> I have a slight suspicion that Richard Sandiford would
> > > > >> >> >>>>>> likely reject this though..
> > > > >> >> >>>>> Good guess :-P  We shouldn't rely on something like that
> > > > >> >> >>>>> for
> > > > >> >> >> correctness.
> > > > >> >> >>>>>
> > > > >> >> >>>>> Would it help if we promoted the test-and-branch
> > > > >> >> >>>>> instructions to optabs, alongside cbranch?  The jump
> > > > >> >> >>>>> expanders could then target it
> > > > >> >> >> directly.
> > > > >> >> >>>>>
> > > > >> >> >>>>> IMO that'd be a reasonable thing to do if it does help.
> > > > >> >> >>>>> It's a relatively common operation, especially on CISCy
> > targets.
> > > > >> >> >>>>
> > > > >> >> >>>> But don't we represent these single bit tests using
> > > > >> >> >>>> zero_extract as the condition of the branch?  I guess if
> > > > >> >> >>>> we can generate them directly rather than waiting for
> > > > >> >> >>>> combine to deduce that we're dealing with a single bit
> > > > >> >> >>>> test and constructing the zero_extract form would be an
> > > > >> >> >>>> improvement and might help aarch at the same
> > > > >> time.
> > > > >> >> >>>
> > > > >> >> >>> Do you mean that the promote_mode stuff should use ext(z)v
> > > > >> >> >>> rather than zero_extend to promote a bool, where available?
> > > > >> >> >>> If so, I agree that might help.  But it sounds like it
> > > > >> >> >>> would have downsides
> > > > >> too.
> > > > >> >> >>> Currently a bool memory can be zero-extended on the fly
> > > > >> >> >>> using a load, but if we used the zero_extract form
> > > > >> >> >>> instead, we'd have to extract the bit after the load.  And
> > > > >> >> >>> (as an
> > > > >> >> >>> alternative) choosing different behaviour based on whether
> > > > >> >> >>> expand sees a REG or a MEM sounds like it could still
> > > > >> >> >>> cause problems, since REGs could be replaced by MEMs (or
> > > > >> >> >>> vice versa)
> > > > later in the RTL passes.
> > > > >> >> >>>
> > > > >> >> >>> ISTM that the original patch was inserting an extra
> > > > >> >> >>> operation in the branch expansion in order to target a specific
> > instruction.
> > > > >> >> >>> Targeting the instruction in expand seems good, but IMO we
> > > > >> >> >>> should do it directly, based on knowledge of whether the
> > > > >> >> >>> instruction actually
> > > > >> >> exists.
> > > > >> >> >>
> > > > >> >> >> Yes, I think a compare-and-branch pattern is the best fit here.
> > > > >> >> >> Note on GIMPLE we'd rely on the fact this is a BOOLEAN_TYPE
> > > > >> >> >> (so even 8 bit precision bools only have 1 and 0 as meaningful
> > values).
> > > > >> >> >> So the 'compare-' bit in compare-and-branch would be
> > > > >> >> >> interpreting a BOOLEAN_TYPE, not so much a general compare.
> > > > >> >> >
> > > > >> >> > Oh, I was thinking of adding a constant argument
> > > > >> >> > representing the precision that is relevant for the compare
> > > > >> >> > in order to make this a bit more
> > > > >> >> general/future proof.
> > > > >> >> >
> > > > >> >> > Are you thinking I should instead just make the optab
> > > > >> >> > implicitly only work for 1-bit precision comparisons?
> > > > >> >>
> > > > >> >> What?s the optab you propose (cite also the documentation part)?
> > > > >> >
> > > > >> > tbranchmode5
> > > > >> >   Conditional branch instruction combined with a bit test instruction.
> > > > >> Operand 0 is a comparison operator.
> > > > >> >   Operand 1 and Operand 2 are the first and second operands of
> > > > >> > the
> > > > >> comparison, respectively.
> > > > >> >   Operand 3 is the number of low-order bits that are relevant
> > > > >> > for the
> > > > >> comparison.
> > > > >> >   Operand 4 is the code_label to jump to.
> > > > >>
> > > > >> For the TB instructions (and for other similar instructions that
> > > > >> I've seen on other architectures) it would be more useful to have
> > > > >> a single-bit test, with operand 4 specifying the bit position.
> > > > >> Arguably it might then be better to have separate eq and ne
> > > > >> optabs, to avoid the awkward doubling of the operands (operand 1
> > > > >> contains
> > > > operands 2 and 3).
> > > > >>
> > > > >> I guess a more general way of achieving the same thing would be
> > > > >> to make operand 4 in the optab above a mask rather than a bit count.
> > > > >> But that might be overly general, if there are no known
> > > > >> architectures that have such an instruction.
> > > > >
> > > > > One of the reasons I wanted a range rather than a single bit is
> > > > > that I can the use this to generate cbz/cbnz early on as well.
> > > >
> > > > We already have the opportunity to do that via cbranch<mode>4.
> > > > But at the moment aarch64.md always forces the separate comparison
> > > > instead.  (Not sure why TBH.  Does it enable more ifcvt
> > > > opportunities?)
> > > >
> > > > If we change the body of cbranch<mode>4 to:
> > > >
> > > >   if ((GET_CODE (operands[0]) != EQ && GET_CODE (operands[0]) != NE)
> > > >       || operands[2] != const0_rtx)
> > > >     {
> > > >       operands[1] = aarch64_gen_compare_reg (GET_CODE (operands[0]),
> > > > 					     operands[1], operands[2]);
> > > >       operands[2] = const0_rtx;
> > > >     }
> > > >
> > > > then we generate the cbz/cbnz directly.
> > > >
> > >
> > > Ah ok, then if Richi agrees, bitpos it is then instead of bit count.
> > 
> > Somehow I understood that cbranch<>4 is already fully capable of the
> > optimization?
> > 
> > On your earlier proposal I'd have commented that if it wouldn't make sense
> > to instead have a CCmode setter instead of an expander with a branch label?
> > That would be a bit test, like {sign,zero}_extract compared against zero which
> > can then be combined with a branch?
> >
> 
> I missed that part, that could work too.
> 
> > Of course if the ISA is really bit-test-and-branch then cbranch<>4 itself or a
> > variant of it might be more useful.  Maybe
> > cbranchbi4 would be "abused" for this?
> 
> The instruction is an actual bit-test-and-branch with any arbitrary bitpos.
> Yes we can abuse cbranchbi4 for this, but then it also means we can't e.g.
> use this to optimize a < 0 where a is a signed value.  With the new optab
> this would just be a bit-test-and-branch of the sign bit.
> 
> But also I'm not entirely convinced that using `BImode` and assuming a single
> bit is safe here. What happens if I compile my source with -std=c89?
> 
> So I personally think the new optab makes more sense here. The CC setter would work too.
> 
> I guess my question is, between you folks, which approach would you like. It seems that Richi
> You'd like a CC setter. Richard do you have a preference of one over the other?

My order of preference is

a) an existing pattern, if possible
b) something usable by N > 1 targets we know of, even if it requires
some combine magic
c) something that fits the actual ISA

For b), x86 has BEXTR which performs a zero_extract from reg/mem
and sets ZF when the result is zero.  For a branch on sign bit there's
easier ways, so it's probably not very useful for general compare and 
branch optimization and if (a & 0x30) is probably handled by combine(?).
It also seems that if (a & 1) is handled for aarch64 already and
it's just a lack of an optab that allows RTL expansion to expand
if (bool) as if (bool & 1)?

Richard.

> Tamar
> 
> > 
> > > Tamar
> > >
> > >
> > > > Thanks,
> > > > Richard
> > > >
> > > >
> > > > > This would mean we could use my earlier patch that tried to drop
> > > > > the QI/HI promotions without needing the any_extend additional
> > > > > pass if we wanted to.
> > > > >
> > > > > We'd also no longer need to rely on seeing a paradoxical subreg for a
> > tst.
> > > > >
> > > > > Tamar.
> > > > >
> > > > >>
> > > > >> Thanks,
> > > > >> Richard
> > > > >>
> > > > >> > Specifically this representation would allow us to emit all our
> > > > >> > different conditional branching instructions without needing to
> > > > >> > rely on combine.  We have some cases that happen during
> > > > >> > optimization that sometimes prevent the optimal sequence from
> > > > >> > being generated. This
> > > > >> would also solve that as we would expand to what we want to start
> > with.
> > > > >> >
> > > > >> > Tamar.
> > > > >> >
> > > > >> >>
> > > > >> >> >
> > > > >> >> > Thanks,
> > > > >> >> > Tamar
> > > > >> >> >
> > > > >> >> >>
> > > > >> >> >> Richard.
> > >
> > 
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461
> > Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald,
> > Boudien Moerman; HRB 36809 (AG Nuernberg)
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-30 11:52                                   ` Richard Biener
@ 2022-09-30 12:48                                     ` Tamar Christina
  2022-09-30 14:28                                       ` Richard Sandiford
  0 siblings, 1 reply; 30+ messages in thread
From: Tamar Christina @ 2022-09-30 12:48 UTC (permalink / raw)
  To: Richard Biener
  Cc: Richard Sandiford, Tamar Christina via Gcc-patches, nd, Jeff Law

> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Friday, September 30, 2022 12:53 PM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina via
> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff Law
> <jeffreyalaw@gmail.com>
> Subject: RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> branches, give hint if argument is a truth type to backend
> 
> On Fri, 30 Sep 2022, Tamar Christina wrote:
> 
> > > -----Original Message-----
> > > From: Richard Biener <rguenther@suse.de>
> > > Sent: Friday, September 30, 2022 11:17 AM
> > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina
> > > via Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff
> Law
> > > <jeffreyalaw@gmail.com>
> > > Subject: RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional
> > > branches, give hint if argument is a truth type to backend
> > >
> > > On Fri, 30 Sep 2022, Tamar Christina wrote:
> > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Richard Sandiford <richard.sandiford@arm.com>
> > > > > Sent: Friday, September 30, 2022 9:49 AM
> > > > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > > > Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via
> > > > > Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff
> Law
> > > > > <jeffreyalaw@gmail.com>
> > > > > Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
> > > > > conditional branches, give hint if argument is a truth type to
> > > > > backend
> > > > >
> > > > > Tamar Christina <Tamar.Christina@arm.com> writes:
> > > > > >> -----Original Message-----
> > > > > >> From: Richard Sandiford <richard.sandiford@arm.com>
> > > > > >> Sent: Friday, September 30, 2022 9:29 AM
> > > > > >> To: Tamar Christina <Tamar.Christina@arm.com>
> > > > > >> Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via
> > > > > >> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff
> > > Law
> > > > > >> <jeffreyalaw@gmail.com>
> > > > > >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
> > > > > >> conditional branches, give hint if argument is a truth type
> > > > > >> to backend
> > > > > >>
> > > > > >> Tamar Christina <Tamar.Christina@arm.com> writes:
> > > > > >> >> -----Original Message-----
> > > > > >> >> From: Gcc-patches <gcc-patches-
> > > > > >> >> bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of
> > > > > Richard
> > > > > >> >> Biener via Gcc-patches
> > > > > >> >> Sent: Thursday, September 29, 2022 12:09 PM
> > > > > >> >> To: Tamar Christina via Gcc-patches
> > > > > >> >> <gcc-patches@gcc.gnu.org>
> > > > > >> >> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; nd
> > > > > <nd@arm.com>
> > > > > >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
> > > > > >> >> conditional branches, give hint if argument is a truth
> > > > > >> >> type to backend
> > > > > >> >>
> > > > > >> >>
> > > > > >> >>
> > > > > >> >> > Am 29.09.2022 um 12:23 schrieb Tamar Christina via
> > > > > >> >> > Gcc-patches
> > > > > >> >> > <gcc-
> > > > > >> >> patches@gcc.gnu.org>:
> > > > > >> >> >
> > > > > >> >> >
> > > > > >> >> >>
> > > > > >> >> >> -----Original Message-----
> > > > > >> >> >> From: Richard Biener <rguenther@suse.de>
> > > > > >> >> >> Sent: Thursday, September 29, 2022 10:41 AM
> > > > > >> >> >> To: Richard Sandiford <Richard.Sandiford@arm.com>
> > > > > >> >> >> Cc: Jeff Law <jeffreyalaw@gmail.com>; Tamar Christina
> > > > > >> >> >> <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org; nd
> > > > > >> >> <nd@arm.com>
> > > > > >> >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion
> > > > > >> >> >> of conditional branches, give hint if argument is a
> > > > > >> >> >> truth type to backend
> > > > > >> >> >>
> > > > > >> >> >>> On Thu, 29 Sep 2022, Richard Sandiford wrote:
> > > > > >> >> >>>
> > > > > >> >> >>> Jeff Law <jeffreyalaw@gmail.com> writes:
> > > > > >> >> >>>> On 9/28/22 09:04, Richard Sandiford wrote:
> > > > > >> >> >>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
> > > > > >> >> >>>>>>> Maybe the target could use (subreg:SI (reg:BI
> > > > > >> >> >>>>>>> ...)) as
> > > > > argument.
> > > > > >> >> Heh.
> > > > > >> >> >>>>>> But then I'd still need to change the expansion
> > > > > >> >> >>>>>> code. I suppose this could prevent the issue with
> > > > > >> >> >>>>>> changes to code on
> > > > > >> other targets.
> > > > > >> >> >>>>>>
> > > > > >> >> >>>>>>>>> We have undocumented addcc, negcc, etc.
> > > > > >> >> >>>>>>>>> patterns,
> > > > > should
> > > > > >> we
> > > > > >> >> >>>>>>>>> have aandcc
> > > > > >> >> >>>>>> pattern for this indicating support for andcc +
> > > > > >> >> >>>>>> jump as opposedto
> > > > > >> >> >> cmpcc + jump?
> > > > > >> >> >>>>>>>> This could work yeah. I didn't know these existed.
> > > > > >> >> >>>>>>> Ah, so they are conditional add, not add setting
> > > > > >> >> >>>>>>> CC, so andcc wouldn't be appropriate.
> > > > > >> >> >>>>>>> So I'm not sure how we'd handle such situation -
> > > > > >> >> >>>>>>> maybe looking at REG_DECL and recognizing a _Bool
> > > > > >> >> >>>>>>> PARM_DECL is
> > > > > OK?
> > > > > >> >> >>>>>> I have a slight suspicion that Richard Sandiford
> > > > > >> >> >>>>>> would likely reject this though..
> > > > > >> >> >>>>> Good guess :-P  We shouldn't rely on something like
> > > > > >> >> >>>>> that for
> > > > > >> >> >> correctness.
> > > > > >> >> >>>>>
> > > > > >> >> >>>>> Would it help if we promoted the test-and-branch
> > > > > >> >> >>>>> instructions to optabs, alongside cbranch?  The jump
> > > > > >> >> >>>>> expanders could then target it
> > > > > >> >> >> directly.
> > > > > >> >> >>>>>
> > > > > >> >> >>>>> IMO that'd be a reasonable thing to do if it does help.
> > > > > >> >> >>>>> It's a relatively common operation, especially on
> > > > > >> >> >>>>> CISCy
> > > targets.
> > > > > >> >> >>>>
> > > > > >> >> >>>> But don't we represent these single bit tests using
> > > > > >> >> >>>> zero_extract as the condition of the branch?  I guess
> > > > > >> >> >>>> if we can generate them directly rather than waiting
> > > > > >> >> >>>> for combine to deduce that we're dealing with a
> > > > > >> >> >>>> single bit test and constructing the zero_extract
> > > > > >> >> >>>> form would be an improvement and might help aarch at
> > > > > >> >> >>>> the same
> > > > > >> time.
> > > > > >> >> >>>
> > > > > >> >> >>> Do you mean that the promote_mode stuff should use
> > > > > >> >> >>> ext(z)v rather than zero_extend to promote a bool, where
> available?
> > > > > >> >> >>> If so, I agree that might help.  But it sounds like it
> > > > > >> >> >>> would have downsides
> > > > > >> too.
> > > > > >> >> >>> Currently a bool memory can be zero-extended on the
> > > > > >> >> >>> fly using a load, but if we used the zero_extract form
> > > > > >> >> >>> instead, we'd have to extract the bit after the load.
> > > > > >> >> >>> And (as an
> > > > > >> >> >>> alternative) choosing different behaviour based on
> > > > > >> >> >>> whether expand sees a REG or a MEM sounds like it
> > > > > >> >> >>> could still cause problems, since REGs could be
> > > > > >> >> >>> replaced by MEMs (or vice versa)
> > > > > later in the RTL passes.
> > > > > >> >> >>>
> > > > > >> >> >>> ISTM that the original patch was inserting an extra
> > > > > >> >> >>> operation in the branch expansion in order to target a
> > > > > >> >> >>> specific
> > > instruction.
> > > > > >> >> >>> Targeting the instruction in expand seems good, but
> > > > > >> >> >>> IMO we should do it directly, based on knowledge of
> > > > > >> >> >>> whether the instruction actually
> > > > > >> >> exists.
> > > > > >> >> >>
> > > > > >> >> >> Yes, I think a compare-and-branch pattern is the best fit
> here.
> > > > > >> >> >> Note on GIMPLE we'd rely on the fact this is a
> > > > > >> >> >> BOOLEAN_TYPE (so even 8 bit precision bools only have 1
> > > > > >> >> >> and 0 as meaningful
> > > values).
> > > > > >> >> >> So the 'compare-' bit in compare-and-branch would be
> > > > > >> >> >> interpreting a BOOLEAN_TYPE, not so much a general
> compare.
> > > > > >> >> >
> > > > > >> >> > Oh, I was thinking of adding a constant argument
> > > > > >> >> > representing the precision that is relevant for the
> > > > > >> >> > compare in order to make this a bit more
> > > > > >> >> general/future proof.
> > > > > >> >> >
> > > > > >> >> > Are you thinking I should instead just make the optab
> > > > > >> >> > implicitly only work for 1-bit precision comparisons?
> > > > > >> >>
> > > > > >> >> What?s the optab you propose (cite also the documentation
> part)?
> > > > > >> >
> > > > > >> > tbranchmode5
> > > > > >> >   Conditional branch instruction combined with a bit test
> instruction.
> > > > > >> Operand 0 is a comparison operator.
> > > > > >> >   Operand 1 and Operand 2 are the first and second operands
> > > > > >> > of the
> > > > > >> comparison, respectively.
> > > > > >> >   Operand 3 is the number of low-order bits that are
> > > > > >> > relevant for the
> > > > > >> comparison.
> > > > > >> >   Operand 4 is the code_label to jump to.
> > > > > >>
> > > > > >> For the TB instructions (and for other similar instructions
> > > > > >> that I've seen on other architectures) it would be more
> > > > > >> useful to have a single-bit test, with operand 4 specifying the bit
> position.
> > > > > >> Arguably it might then be better to have separate eq and ne
> > > > > >> optabs, to avoid the awkward doubling of the operands
> > > > > >> (operand 1 contains
> > > > > operands 2 and 3).
> > > > > >>
> > > > > >> I guess a more general way of achieving the same thing would
> > > > > >> be to make operand 4 in the optab above a mask rather than a bit
> count.
> > > > > >> But that might be overly general, if there are no known
> > > > > >> architectures that have such an instruction.
> > > > > >
> > > > > > One of the reasons I wanted a range rather than a single bit
> > > > > > is that I can the use this to generate cbz/cbnz early on as well.
> > > > >
> > > > > We already have the opportunity to do that via cbranch<mode>4.
> > > > > But at the moment aarch64.md always forces the separate
> > > > > comparison instead.  (Not sure why TBH.  Does it enable more
> > > > > ifcvt
> > > > > opportunities?)
> > > > >
> > > > > If we change the body of cbranch<mode>4 to:
> > > > >
> > > > >   if ((GET_CODE (operands[0]) != EQ && GET_CODE (operands[0]) !=
> NE)
> > > > >       || operands[2] != const0_rtx)
> > > > >     {
> > > > >       operands[1] = aarch64_gen_compare_reg (GET_CODE
> (operands[0]),
> > > > > 					     operands[1], operands[2]);
> > > > >       operands[2] = const0_rtx;
> > > > >     }
> > > > >
> > > > > then we generate the cbz/cbnz directly.
> > > > >
> > > >
> > > > Ah ok, then if Richi agrees, bitpos it is then instead of bit count.
> > >
> > > Somehow I understood that cbranch<>4 is already fully capable of the
> > > optimization?
> > >
> > > On your earlier proposal I'd have commented that if it wouldn't make
> > > sense to instead have a CCmode setter instead of an expander with a
> branch label?
> > > That would be a bit test, like {sign,zero}_extract compared against
> > > zero which can then be combined with a branch?
> > >
> >
> > I missed that part, that could work too.
> >
> > > Of course if the ISA is really bit-test-and-branch then cbranch<>4
> > > itself or a variant of it might be more useful.  Maybe
> > > cbranchbi4 would be "abused" for this?
> >
> > The instruction is an actual bit-test-and-branch with any arbitrary bitpos.
> > Yes we can abuse cbranchbi4 for this, but then it also means we can't e.g.
> > use this to optimize a < 0 where a is a signed value.  With the new
> > optab this would just be a bit-test-and-branch of the sign bit.
> >
> > But also I'm not entirely convinced that using `BImode` and assuming a
> > single bit is safe here. What happens if I compile my source with -std=c89?
> >
> > So I personally think the new optab makes more sense here. The CC setter
> would work too.
> >
> > I guess my question is, between you folks, which approach would you
> > like. It seems that Richi You'd like a CC setter. Richard do you have a
> preference of one over the other?
> 
> My order of preference is
> 
> a) an existing pattern, if possible
> b) something usable by N > 1 targets we know of, even if it requires some
> combine magic
> c) something that fits the actual ISA
>
> For b), x86 has BEXTR which performs a zero_extract from reg/mem and sets
> ZF when the result is zero.  For a branch on sign bit there's easier ways, so it's
> probably not very useful for general compare and branch optimization and if
> (a & 0x30) is probably handled by combine(?).

Agreed, I was more giving an example of other uses. But ok.

> It also seems that if (a & 1) is handled for aarch64 already and it's just a lack of
> an optab that allows RTL expansion to expand if (bool) as if (bool & 1)?

Yes this was what I was originally after.  Your original suggestion was to abuse BImode
and cbranch for this.  That could work, but I worry about introducing BImode in the RTL,
as I don't think we currently use it at all and wonder about new missed optimizations.

Richard Sandiford, would this be OK with you? I also don't want to emit an extract here as I think
that's gonna have a bigger effect on other targets than & 1 did.

I would rather have something I can explicitly check for target support for. I also wonder if just a
target hook asking the target how to expand a given tree Boolean argument would work.

Tamar

> 
> Richard.
> 
> > Tamar
> >
> > >
> > > > Tamar
> > > >
> > > >
> > > > > Thanks,
> > > > > Richard
> > > > >
> > > > >
> > > > > > This would mean we could use my earlier patch that tried to
> > > > > > drop the QI/HI promotions without needing the any_extend
> > > > > > additional pass if we wanted to.
> > > > > >
> > > > > > We'd also no longer need to rely on seeing a paradoxical
> > > > > > subreg for a
> > > tst.
> > > > > >
> > > > > > Tamar.
> > > > > >
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Richard
> > > > > >>
> > > > > >> > Specifically this representation would allow us to emit all
> > > > > >> > our different conditional branching instructions without
> > > > > >> > needing to rely on combine.  We have some cases that happen
> > > > > >> > during optimization that sometimes prevent the optimal
> > > > > >> > sequence from being generated. This
> > > > > >> would also solve that as we would expand to what we want to
> > > > > >> start
> > > with.
> > > > > >> >
> > > > > >> > Tamar.
> > > > > >> >
> > > > > >> >>
> > > > > >> >> >
> > > > > >> >> > Thanks,
> > > > > >> >> > Tamar
> > > > > >> >> >
> > > > > >> >> >>
> > > > > >> >> >> Richard.
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461
> > > Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald,
> > > Boudien Moerman; HRB 36809 (AG Nuernberg)
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461
> Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald,
> Boudien Moerman; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-30 12:48                                     ` Tamar Christina
@ 2022-09-30 14:28                                       ` Richard Sandiford
  2022-09-30 14:33                                         ` Richard Biener
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Sandiford @ 2022-09-30 14:28 UTC (permalink / raw)
  To: Tamar Christina
  Cc: Richard Biener, Tamar Christina via Gcc-patches, nd, Jeff Law

Tamar Christina <Tamar.Christina@arm.com> writes:
>> -----Original Message-----
>> From: Richard Biener <rguenther@suse.de>
>> Sent: Friday, September 30, 2022 12:53 PM
>> To: Tamar Christina <Tamar.Christina@arm.com>
>> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina via
>> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff Law
>> <jeffreyalaw@gmail.com>
>> Subject: RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional
>> branches, give hint if argument is a truth type to backend
>>
>> On Fri, 30 Sep 2022, Tamar Christina wrote:
>>
>> > > -----Original Message-----
>> > > From: Richard Biener <rguenther@suse.de>
>> > > Sent: Friday, September 30, 2022 11:17 AM
>> > > To: Tamar Christina <Tamar.Christina@arm.com>
>> > > Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina
>> > > via Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff
>> Law
>> > > <jeffreyalaw@gmail.com>
>> > > Subject: RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional
>> > > branches, give hint if argument is a truth type to backend
>> > >
>> > > On Fri, 30 Sep 2022, Tamar Christina wrote:
>> > >
>> > > >
>> > > >
>> > > > > -----Original Message-----
>> > > > > From: Richard Sandiford <richard.sandiford@arm.com>
>> > > > > Sent: Friday, September 30, 2022 9:49 AM
>> > > > > To: Tamar Christina <Tamar.Christina@arm.com>
>> > > > > Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via
>> > > > > Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff
>> Law
>> > > > > <jeffreyalaw@gmail.com>
>> > > > > Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
>> > > > > conditional branches, give hint if argument is a truth type to
>> > > > > backend
>> > > > >
>> > > > > Tamar Christina <Tamar.Christina@arm.com> writes:
>> > > > > >> -----Original Message-----
>> > > > > >> From: Richard Sandiford <richard.sandiford@arm.com>
>> > > > > >> Sent: Friday, September 30, 2022 9:29 AM
>> > > > > >> To: Tamar Christina <Tamar.Christina@arm.com>
>> > > > > >> Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via
>> > > > > >> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff
>> > > Law
>> > > > > >> <jeffreyalaw@gmail.com>
>> > > > > >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
>> > > > > >> conditional branches, give hint if argument is a truth type
>> > > > > >> to backend
>> > > > > >>
>> > > > > >> Tamar Christina <Tamar.Christina@arm.com> writes:
>> > > > > >> >> -----Original Message-----
>> > > > > >> >> From: Gcc-patches <gcc-patches-
>> > > > > >> >> bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of
>> > > > > Richard
>> > > > > >> >> Biener via Gcc-patches
>> > > > > >> >> Sent: Thursday, September 29, 2022 12:09 PM
>> > > > > >> >> To: Tamar Christina via Gcc-patches
>> > > > > >> >> <gcc-patches@gcc.gnu.org>
>> > > > > >> >> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; nd
>> > > > > <nd@arm.com>
>> > > > > >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
>> > > > > >> >> conditional branches, give hint if argument is a truth
>> > > > > >> >> type to backend
>> > > > > >> >>
>> > > > > >> >>
>> > > > > >> >>
>> > > > > >> >> > Am 29.09.2022 um 12:23 schrieb Tamar Christina via
>> > > > > >> >> > Gcc-patches
>> > > > > >> >> > <gcc-
>> > > > > >> >> patches@gcc.gnu.org>:
>> > > > > >> >> >
>> > > > > >> >> >
>> > > > > >> >> >>
>> > > > > >> >> >> -----Original Message-----
>> > > > > >> >> >> From: Richard Biener <rguenther@suse.de>
>> > > > > >> >> >> Sent: Thursday, September 29, 2022 10:41 AM
>> > > > > >> >> >> To: Richard Sandiford <Richard.Sandiford@arm.com>
>> > > > > >> >> >> Cc: Jeff Law <jeffreyalaw@gmail.com>; Tamar Christina
>> > > > > >> >> >> <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org; nd
>> > > > > >> >> <nd@arm.com>
>> > > > > >> >> >> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion
>> > > > > >> >> >> of conditional branches, give hint if argument is a
>> > > > > >> >> >> truth type to backend
>> > > > > >> >> >>
>> > > > > >> >> >>> On Thu, 29 Sep 2022, Richard Sandiford wrote:
>> > > > > >> >> >>>
>> > > > > >> >> >>> Jeff Law <jeffreyalaw@gmail.com> writes:
>> > > > > >> >> >>>> On 9/28/22 09:04, Richard Sandiford wrote:
>> > > > > >> >> >>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
>> > > > > >> >> >>>>>>> Maybe the target could use (subreg:SI (reg:BI
>> > > > > >> >> >>>>>>> ...)) as
>> > > > > argument.
>> > > > > >> >> Heh.
>> > > > > >> >> >>>>>> But then I'd still need to change the expansion
>> > > > > >> >> >>>>>> code. I suppose this could prevent the issue with
>> > > > > >> >> >>>>>> changes to code on
>> > > > > >> other targets.
>> > > > > >> >> >>>>>>
>> > > > > >> >> >>>>>>>>> We have undocumented addcc, negcc, etc.
>> > > > > >> >> >>>>>>>>> patterns,
>> > > > > should
>> > > > > >> we
>> > > > > >> >> >>>>>>>>> have aandcc
>> > > > > >> >> >>>>>> pattern for this indicating support for andcc +
>> > > > > >> >> >>>>>> jump as opposedto
>> > > > > >> >> >> cmpcc + jump?
>> > > > > >> >> >>>>>>>> This could work yeah. I didn't know these existed.
>> > > > > >> >> >>>>>>> Ah, so they are conditional add, not add setting
>> > > > > >> >> >>>>>>> CC, so andcc wouldn't be appropriate.
>> > > > > >> >> >>>>>>> So I'm not sure how we'd handle such situation -
>> > > > > >> >> >>>>>>> maybe looking at REG_DECL and recognizing a _Bool
>> > > > > >> >> >>>>>>> PARM_DECL is
>> > > > > OK?
>> > > > > >> >> >>>>>> I have a slight suspicion that Richard Sandiford
>> > > > > >> >> >>>>>> would likely reject this though..
>> > > > > >> >> >>>>> Good guess :-P  We shouldn't rely on something like
>> > > > > >> >> >>>>> that for
>> > > > > >> >> >> correctness.
>> > > > > >> >> >>>>>
>> > > > > >> >> >>>>> Would it help if we promoted the test-and-branch
>> > > > > >> >> >>>>> instructions to optabs, alongside cbranch?  The jump
>> > > > > >> >> >>>>> expanders could then target it
>> > > > > >> >> >> directly.
>> > > > > >> >> >>>>>
>> > > > > >> >> >>>>> IMO that'd be a reasonable thing to do if it does help.
>> > > > > >> >> >>>>> It's a relatively common operation, especially on
>> > > > > >> >> >>>>> CISCy
>> > > targets.
>> > > > > >> >> >>>>
>> > > > > >> >> >>>> But don't we represent these single bit tests using
>> > > > > >> >> >>>> zero_extract as the condition of the branch?  I guess
>> > > > > >> >> >>>> if we can generate them directly rather than waiting
>> > > > > >> >> >>>> for combine to deduce that we're dealing with a
>> > > > > >> >> >>>> single bit test and constructing the zero_extract
>> > > > > >> >> >>>> form would be an improvement and might help aarch at
>> > > > > >> >> >>>> the same
>> > > > > >> time.
>> > > > > >> >> >>>
>> > > > > >> >> >>> Do you mean that the promote_mode stuff should use
>> > > > > >> >> >>> ext(z)v rather than zero_extend to promote a bool, where
>> available?
>> > > > > >> >> >>> If so, I agree that might help.  But it sounds like it
>> > > > > >> >> >>> would have downsides
>> > > > > >> too.
>> > > > > >> >> >>> Currently a bool memory can be zero-extended on the
>> > > > > >> >> >>> fly using a load, but if we used the zero_extract form
>> > > > > >> >> >>> instead, we'd have to extract the bit after the load.
>> > > > > >> >> >>> And (as an
>> > > > > >> >> >>> alternative) choosing different behaviour based on
>> > > > > >> >> >>> whether expand sees a REG or a MEM sounds like it
>> > > > > >> >> >>> could still cause problems, since REGs could be
>> > > > > >> >> >>> replaced by MEMs (or vice versa)
>> > > > > later in the RTL passes.
>> > > > > >> >> >>>
>> > > > > >> >> >>> ISTM that the original patch was inserting an extra
>> > > > > >> >> >>> operation in the branch expansion in order to target a
>> > > > > >> >> >>> specific
>> > > instruction.
>> > > > > >> >> >>> Targeting the instruction in expand seems good, but
>> > > > > >> >> >>> IMO we should do it directly, based on knowledge of
>> > > > > >> >> >>> whether the instruction actually
>> > > > > >> >> exists.
>> > > > > >> >> >>
>> > > > > >> >> >> Yes, I think a compare-and-branch pattern is the best fit
>> here.
>> > > > > >> >> >> Note on GIMPLE we'd rely on the fact this is a
>> > > > > >> >> >> BOOLEAN_TYPE (so even 8 bit precision bools only have 1
>> > > > > >> >> >> and 0 as meaningful
>> > > values).
>> > > > > >> >> >> So the 'compare-' bit in compare-and-branch would be
>> > > > > >> >> >> interpreting a BOOLEAN_TYPE, not so much a general
>> compare.
>> > > > > >> >> >
>> > > > > >> >> > Oh, I was thinking of adding a constant argument
>> > > > > >> >> > representing the precision that is relevant for the
>> > > > > >> >> > compare in order to make this a bit more
>> > > > > >> >> general/future proof.
>> > > > > >> >> >
>> > > > > >> >> > Are you thinking I should instead just make the optab
>> > > > > >> >> > implicitly only work for 1-bit precision comparisons?
>> > > > > >> >>
>> > > > > >> >> What?s the optab you propose (cite also the documentation
>> part)?
>> > > > > >> >
>> > > > > >> > tbranchmode5
>> > > > > >> >   Conditional branch instruction combined with a bit test
>> instruction.
>> > > > > >> Operand 0 is a comparison operator.
>> > > > > >> >   Operand 1 and Operand 2 are the first and second operands
>> > > > > >> > of the
>> > > > > >> comparison, respectively.
>> > > > > >> >   Operand 3 is the number of low-order bits that are
>> > > > > >> > relevant for the
>> > > > > >> comparison.
>> > > > > >> >   Operand 4 is the code_label to jump to.
>> > > > > >>
>> > > > > >> For the TB instructions (and for other similar instructions
>> > > > > >> that I've seen on other architectures) it would be more
>> > > > > >> useful to have a single-bit test, with operand 4 specifying the bit
>> position.
>> > > > > >> Arguably it might then be better to have separate eq and ne
>> > > > > >> optabs, to avoid the awkward doubling of the operands
>> > > > > >> (operand 1 contains
>> > > > > operands 2 and 3).
>> > > > > >>
>> > > > > >> I guess a more general way of achieving the same thing would
>> > > > > >> be to make operand 4 in the optab above a mask rather than a bit
>> count.
>> > > > > >> But that might be overly general, if there are no known
>> > > > > >> architectures that have such an instruction.
>> > > > > >
>> > > > > > One of the reasons I wanted a range rather than a single bit
>> > > > > > is that I can the use this to generate cbz/cbnz early on as well.
>> > > > >
>> > > > > We already have the opportunity to do that via cbranch<mode>4.
>> > > > > But at the moment aarch64.md always forces the separate
>> > > > > comparison instead.  (Not sure why TBH.  Does it enable more
>> > > > > ifcvt
>> > > > > opportunities?)
>> > > > >
>> > > > > If we change the body of cbranch<mode>4 to:
>> > > > >
>> > > > >   if ((GET_CODE (operands[0]) != EQ && GET_CODE (operands[0]) !=
>> NE)
>> > > > >       || operands[2] != const0_rtx)
>> > > > >     {
>> > > > >       operands[1] = aarch64_gen_compare_reg (GET_CODE
>> (operands[0]),
>> > > > >                                            operands[1], operands[2]);
>> > > > >       operands[2] = const0_rtx;
>> > > > >     }
>> > > > >
>> > > > > then we generate the cbz/cbnz directly.
>> > > > >
>> > > >
>> > > > Ah ok, then if Richi agrees, bitpos it is then instead of bit count.
>> > >
>> > > Somehow I understood that cbranch<>4 is already fully capable of the
>> > > optimization?
>> > >
>> > > On your earlier proposal I'd have commented that if it wouldn't make
>> > > sense to instead have a CCmode setter instead of an expander with a
>> branch label?
>> > > That would be a bit test, like {sign,zero}_extract compared against
>> > > zero which can then be combined with a branch?
>> > >
>> >
>> > I missed that part, that could work too.
>> >
>> > > Of course if the ISA is really bit-test-and-branch then cbranch<>4
>> > > itself or a variant of it might be more useful.  Maybe
>> > > cbranchbi4 would be "abused" for this?
>> >
>> > The instruction is an actual bit-test-and-branch with any arbitrary bitpos.
>> > Yes we can abuse cbranchbi4 for this, but then it also means we can't e.g.
>> > use this to optimize a < 0 where a is a signed value.  With the new
>> > optab this would just be a bit-test-and-branch of the sign bit.
>> >
>> > But also I'm not entirely convinced that using `BImode` and assuming a
>> > single bit is safe here. What happens if I compile my source with -std=c89?
>> >
>> > So I personally think the new optab makes more sense here. The CC setter
>> would work too.
>> >
>> > I guess my question is, between you folks, which approach would you
>> > like. It seems that Richi You'd like a CC setter. Richard do you have a
>> preference of one over the other?
>>
>> My order of preference is
>>
>> a) an existing pattern, if possible
>> b) something usable by N > 1 targets we know of, even if it requires some
>> combine magic
>> c) something that fits the actual ISA
>>
>> For b), x86 has BEXTR which performs a zero_extract from reg/mem and sets
>> ZF when the result is zero.  For a branch on sign bit there's easier ways, so it's
>> probably not very useful for general compare and branch optimization and if
>> (a & 0x30) is probably handled by combine(?).
>
> Agreed, I was more giving an example of other uses. But ok.
>
>> It also seems that if (a & 1) is handled for aarch64 already and it's just a lack of
>> an optab that allows RTL expansion to expand if (bool) as if (bool & 1)?
>
> Yes this was what I was originally after.  Your original suggestion was to abuse BImode
> and cbranch for this.  That could work, but I worry about introducing BImode in the RTL,
> as I don't think we currently use it at all and wonder about new missed optimizations.
>
> Richard Sandiford, would this be OK with you? I also don't want to emit an extract here as I think
> that's gonna have a bigger effect on other targets than & 1 did.
>
> I would rather have something I can explicitly check for target support for. I also wonder if just a
> target hook asking the target how to expand a given tree Boolean argument would work.

Personally I'd prefer the test-and-branch optab.  We used to have
separate compare optabs and branch optabs in the cc0 days, but having
the combined pattern turned out to be much easier.  We don't currently
expose CC registers at the optabs level and I think that's a good thing.

We could have a test-bit-and-set-pseudo optab.  But using that optab
here would reintroduce the problem that I had with the original patch,
which is:

- Currently, the branch on the boolean value expands to a single optab
  (cbranch<mode>4).  That optab could easily expand to a single cb(n)z
  instruction on aarch64 (even though the port chooses not to do that
  currently).  So taken on its own, the branch is already maximally
  efficient.

  The patch instead adds an extra instruction to the branch expansion,
  so that it uses two optabs rather than one.  On the face of it,
  this extra instruction is entirely redundant.  But emitting it can
  sometimes allow profitable combines.  That is, rather than expand to:

    and reg, reg, 0xFF (from promote_mode)
    cbranch on reg     (from the branch expansion)

  the patch expanded to:

    and reg, reg, 0xFF (from promote_mode)
    and reg, reg, 0x1  (from the branch expansion)
    cbranch on reg     (from the branch expansion)

  The first two ANDs come from separate sources.  When both ANDs exist,
  combine can get rid of the redundancy and can then see that the
  retained AND 0x1 + cbranch optimise to a test-and-branch instruction.

  But if the target can't in fact use a test-and-branch in a particular
  situation, we'd be left with the AND 0x1 and the cbranch.  That's not
  too bad if the AND from promote_mode and the AND 0x1 can be combined.
  But if the AND with 0xff happens "too far" from the AND with 0x1
  (e.g. because it's in a loop preheader), or if there's intermediate
  logic, we might end up keeping all three instructions.

Emitting an extra bit test instruction as part of the branch expansion
IMO has the same problem.  We're relying on combine combining the
(redundant) bit test with the cbranch.  If combine doesn't do that
(e.g. because the target doesn't support the combination in certain
cases) then we could be left with three instructions rather than the
original two.

That's why I think the fix is either for promote_mode to use a different
expansion for booleans (which could also have knock-on effects) or for
the branch expanders to target the test-and-branch instruction directly.

The optab wouldn't be an aarch64 special.  arc, avr, and h8300sx have
similar instructions.  There are problem others too: h8300sx was from
memory and I stopped looking after avr. :-)

Thanks,
Richard

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-30 14:28                                       ` Richard Sandiford
@ 2022-09-30 14:33                                         ` Richard Biener
  0 siblings, 0 replies; 30+ messages in thread
From: Richard Biener @ 2022-09-30 14:33 UTC (permalink / raw)
  To: Richard Sandiford via Gcc-patches; +Cc: Tamar Christina, nd



> Am 30.09.2022 um 16:29 schrieb Richard Sandiford via Gcc-patches <gcc-patches@gcc.gnu.org>:
> 
> Tamar Christina <Tamar.Christina@arm.com> writes:
>>> -----Original Message-----
>>> From: Richard Biener <rguenther@suse.de>
>>> Sent: Friday, September 30, 2022 12:53 PM
>>> To: Tamar Christina <Tamar.Christina@arm.com>
>>> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina via
>>> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff Law
>>> <jeffreyalaw@gmail.com>
>>> Subject: RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional
>>> branches, give hint if argument is a truth type to backend
>>> 
>>>> On Fri, 30 Sep 2022, Tamar Christina wrote:
>>> 
>>>>> -----Original Message-----
>>>>> From: Richard Biener <rguenther@suse.de>
>>>>> Sent: Friday, September 30, 2022 11:17 AM
>>>>> To: Tamar Christina <Tamar.Christina@arm.com>
>>>>> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina
>>>>> via Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff
>>> Law
>>>>> <jeffreyalaw@gmail.com>
>>>>> Subject: RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional
>>>>> branches, give hint if argument is a truth type to backend
>>>>> 
>>>>> On Fri, 30 Sep 2022, Tamar Christina wrote:
>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Richard Sandiford <richard.sandiford@arm.com>
>>>>>>> Sent: Friday, September 30, 2022 9:49 AM
>>>>>>> To: Tamar Christina <Tamar.Christina@arm.com>
>>>>>>> Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via
>>>>>>> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff
>>> Law
>>>>>>> <jeffreyalaw@gmail.com>
>>>>>>> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
>>>>>>> conditional branches, give hint if argument is a truth type to
>>>>>>> backend
>>>>>>> 
>>>>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Richard Sandiford <richard.sandiford@arm.com>
>>>>>>>>> Sent: Friday, September 30, 2022 9:29 AM
>>>>>>>>> To: Tamar Christina <Tamar.Christina@arm.com>
>>>>>>>>> Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via
>>>>>>>>> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff
>>>>> Law
>>>>>>>>> <jeffreyalaw@gmail.com>
>>>>>>>>> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
>>>>>>>>> conditional branches, give hint if argument is a truth type
>>>>>>>>> to backend
>>>>>>>>> 
>>>>>>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Gcc-patches <gcc-patches-
>>>>>>>>>>> bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of
>>>>>>> Richard
>>>>>>>>>>> Biener via Gcc-patches
>>>>>>>>>>> Sent: Thursday, September 29, 2022 12:09 PM
>>>>>>>>>>> To: Tamar Christina via Gcc-patches
>>>>>>>>>>> <gcc-patches@gcc.gnu.org>
>>>>>>>>>>> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; nd
>>>>>>> <nd@arm.com>
>>>>>>>>>>> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
>>>>>>>>>>> conditional branches, give hint if argument is a truth
>>>>>>>>>>> type to backend
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> Am 29.09.2022 um 12:23 schrieb Tamar Christina via
>>>>>>>>>>>> Gcc-patches
>>>>>>>>>>>> <gcc-
>>>>>>>>>>> patches@gcc.gnu.org>:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: Richard Biener <rguenther@suse.de>
>>>>>>>>>>>>> Sent: Thursday, September 29, 2022 10:41 AM
>>>>>>>>>>>>> To: Richard Sandiford <Richard.Sandiford@arm.com>
>>>>>>>>>>>>> Cc: Jeff Law <jeffreyalaw@gmail.com>; Tamar Christina
>>>>>>>>>>>>> <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org; nd
>>>>>>>>>>> <nd@arm.com>
>>>>>>>>>>>>> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion
>>>>>>>>>>>>> of conditional branches, give hint if argument is a
>>>>>>>>>>>>> truth type to backend
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Thu, 29 Sep 2022, Richard Sandiford wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Jeff Law <jeffreyalaw@gmail.com> writes:
>>>>>>>>>>>>>>> On 9/28/22 09:04, Richard Sandiford wrote:
>>>>>>>>>>>>>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
>>>>>>>>>>>>>>>>>> Maybe the target could use (subreg:SI (reg:BI
>>>>>>>>>>>>>>>>>> ...)) as
>>>>>>> argument.
>>>>>>>>>>> Heh.
>>>>>>>>>>>>>>>>> But then I'd still need to change the expansion
>>>>>>>>>>>>>>>>> code. I suppose this could prevent the issue with
>>>>>>>>>>>>>>>>> changes to code on
>>>>>>>>> other targets.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> We have undocumented addcc, negcc, etc.
>>>>>>>>>>>>>>>>>>>> patterns,
>>>>>>> should
>>>>>>>>> we
>>>>>>>>>>>>>>>>>>>> have aandcc
>>>>>>>>>>>>>>>>> pattern for this indicating support for andcc +
>>>>>>>>>>>>>>>>> jump as opposedto
>>>>>>>>>>>>> cmpcc + jump?
>>>>>>>>>>>>>>>>>>> This could work yeah. I didn't know these existed.
>>>>>>>>>>>>>>>>>> Ah, so they are conditional add, not add setting
>>>>>>>>>>>>>>>>>> CC, so andcc wouldn't be appropriate.
>>>>>>>>>>>>>>>>>> So I'm not sure how we'd handle such situation -
>>>>>>>>>>>>>>>>>> maybe looking at REG_DECL and recognizing a _Bool
>>>>>>>>>>>>>>>>>> PARM_DECL is
>>>>>>> OK?
>>>>>>>>>>>>>>>>> I have a slight suspicion that Richard Sandiford
>>>>>>>>>>>>>>>>> would likely reject this though..
>>>>>>>>>>>>>>>> Good guess :-P  We shouldn't rely on something like
>>>>>>>>>>>>>>>> that for
>>>>>>>>>>>>> correctness.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Would it help if we promoted the test-and-branch
>>>>>>>>>>>>>>>> instructions to optabs, alongside cbranch?  The jump
>>>>>>>>>>>>>>>> expanders could then target it
>>>>>>>>>>>>> directly.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> IMO that'd be a reasonable thing to do if it does help.
>>>>>>>>>>>>>>>> It's a relatively common operation, especially on
>>>>>>>>>>>>>>>> CISCy
>>>>> targets.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> But don't we represent these single bit tests using
>>>>>>>>>>>>>>> zero_extract as the condition of the branch?  I guess
>>>>>>>>>>>>>>> if we can generate them directly rather than waiting
>>>>>>>>>>>>>>> for combine to deduce that we're dealing with a
>>>>>>>>>>>>>>> single bit test and constructing the zero_extract
>>>>>>>>>>>>>>> form would be an improvement and might help aarch at
>>>>>>>>>>>>>>> the same
>>>>>>>>> time.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Do you mean that the promote_mode stuff should use
>>>>>>>>>>>>>> ext(z)v rather than zero_extend to promote a bool, where
>>> available?
>>>>>>>>>>>>>> If so, I agree that might help.  But it sounds like it
>>>>>>>>>>>>>> would have downsides
>>>>>>>>> too.
>>>>>>>>>>>>>> Currently a bool memory can be zero-extended on the
>>>>>>>>>>>>>> fly using a load, but if we used the zero_extract form
>>>>>>>>>>>>>> instead, we'd have to extract the bit after the load.
>>>>>>>>>>>>>> And (as an
>>>>>>>>>>>>>> alternative) choosing different behaviour based on
>>>>>>>>>>>>>> whether expand sees a REG or a MEM sounds like it
>>>>>>>>>>>>>> could still cause problems, since REGs could be
>>>>>>>>>>>>>> replaced by MEMs (or vice versa)
>>>>>>> later in the RTL passes.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> ISTM that the original patch was inserting an extra
>>>>>>>>>>>>>> operation in the branch expansion in order to target a
>>>>>>>>>>>>>> specific
>>>>> instruction.
>>>>>>>>>>>>>> Targeting the instruction in expand seems good, but
>>>>>>>>>>>>>> IMO we should do it directly, based on knowledge of
>>>>>>>>>>>>>> whether the instruction actually
>>>>>>>>>>> exists.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Yes, I think a compare-and-branch pattern is the best fit
>>> here.
>>>>>>>>>>>>> Note on GIMPLE we'd rely on the fact this is a
>>>>>>>>>>>>> BOOLEAN_TYPE (so even 8 bit precision bools only have 1
>>>>>>>>>>>>> and 0 as meaningful
>>>>> values).
>>>>>>>>>>>>> So the 'compare-' bit in compare-and-branch would be
>>>>>>>>>>>>> interpreting a BOOLEAN_TYPE, not so much a general
>>> compare.
>>>>>>>>>>>> 
>>>>>>>>>>>> Oh, I was thinking of adding a constant argument
>>>>>>>>>>>> representing the precision that is relevant for the
>>>>>>>>>>>> compare in order to make this a bit more
>>>>>>>>>>> general/future proof.
>>>>>>>>>>>> 
>>>>>>>>>>>> Are you thinking I should instead just make the optab
>>>>>>>>>>>> implicitly only work for 1-bit precision comparisons?
>>>>>>>>>>> 
>>>>>>>>>>> What?s the optab you propose (cite also the documentation
>>> part)?
>>>>>>>>>> 
>>>>>>>>>> tbranchmode5
>>>>>>>>>>  Conditional branch instruction combined with a bit test
>>> instruction.
>>>>>>>>> Operand 0 is a comparison operator.
>>>>>>>>>>  Operand 1 and Operand 2 are the first and second operands
>>>>>>>>>> of the
>>>>>>>>> comparison, respectively.
>>>>>>>>>>  Operand 3 is the number of low-order bits that are
>>>>>>>>>> relevant for the
>>>>>>>>> comparison.
>>>>>>>>>>  Operand 4 is the code_label to jump to.
>>>>>>>>> 
>>>>>>>>> For the TB instructions (and for other similar instructions
>>>>>>>>> that I've seen on other architectures) it would be more
>>>>>>>>> useful to have a single-bit test, with operand 4 specifying the bit
>>> position.
>>>>>>>>> Arguably it might then be better to have separate eq and ne
>>>>>>>>> optabs, to avoid the awkward doubling of the operands
>>>>>>>>> (operand 1 contains
>>>>>>> operands 2 and 3).
>>>>>>>>> 
>>>>>>>>> I guess a more general way of achieving the same thing would
>>>>>>>>> be to make operand 4 in the optab above a mask rather than a bit
>>> count.
>>>>>>>>> But that might be overly general, if there are no known
>>>>>>>>> architectures that have such an instruction.
>>>>>>>> 
>>>>>>>> One of the reasons I wanted a range rather than a single bit
>>>>>>>> is that I can the use this to generate cbz/cbnz early on as well.
>>>>>>> 
>>>>>>> We already have the opportunity to do that via cbranch<mode>4.
>>>>>>> But at the moment aarch64.md always forces the separate
>>>>>>> comparison instead.  (Not sure why TBH.  Does it enable more
>>>>>>> ifcvt
>>>>>>> opportunities?)
>>>>>>> 
>>>>>>> If we change the body of cbranch<mode>4 to:
>>>>>>> 
>>>>>>>  if ((GET_CODE (operands[0]) != EQ && GET_CODE (operands[0]) !=
>>> NE)
>>>>>>>      || operands[2] != const0_rtx)
>>>>>>>    {
>>>>>>>      operands[1] = aarch64_gen_compare_reg (GET_CODE
>>> (operands[0]),
>>>>>>>                                           operands[1], operands[2]);
>>>>>>>      operands[2] = const0_rtx;
>>>>>>>    }
>>>>>>> 
>>>>>>> then we generate the cbz/cbnz directly.
>>>>>>> 
>>>>>> 
>>>>>> Ah ok, then if Richi agrees, bitpos it is then instead of bit count.
>>>>> 
>>>>> Somehow I understood that cbranch<>4 is already fully capable of the
>>>>> optimization?
>>>>> 
>>>>> On your earlier proposal I'd have commented that if it wouldn't make
>>>>> sense to instead have a CCmode setter instead of an expander with a
>>> branch label?
>>>>> That would be a bit test, like {sign,zero}_extract compared against
>>>>> zero which can then be combined with a branch?
>>>>> 
>>>> 
>>>> I missed that part, that could work too.
>>>> 
>>>>> Of course if the ISA is really bit-test-and-branch then cbranch<>4
>>>>> itself or a variant of it might be more useful.  Maybe
>>>>> cbranchbi4 would be "abused" for this?
>>>> 
>>>> The instruction is an actual bit-test-and-branch with any arbitrary bitpos.
>>>> Yes we can abuse cbranchbi4 for this, but then it also means we can't e.g.
>>>> use this to optimize a < 0 where a is a signed value.  With the new
>>>> optab this would just be a bit-test-and-branch of the sign bit.
>>>> 
>>>> But also I'm not entirely convinced that using `BImode` and assuming a
>>>> single bit is safe here. What happens if I compile my source with -std=c89?
>>>> 
>>>> So I personally think the new optab makes more sense here. The CC setter
>>> would work too.
>>>> 
>>>> I guess my question is, between you folks, which approach would you
>>>> like. It seems that Richi You'd like a CC setter. Richard do you have a
>>> preference of one over the other?
>>> 
>>> My order of preference is
>>> 
>>> a) an existing pattern, if possible
>>> b) something usable by N > 1 targets we know of, even if it requires some
>>> combine magic
>>> c) something that fits the actual ISA
>>> 
>>> For b), x86 has BEXTR which performs a zero_extract from reg/mem and sets
>>> ZF when the result is zero.  For a branch on sign bit there's easier ways, so it's
>>> probably not very useful for general compare and branch optimization and if
>>> (a & 0x30) is probably handled by combine(?).
>> 
>> Agreed, I was more giving an example of other uses. But ok.
>> 
>>> It also seems that if (a & 1) is handled for aarch64 already and it's just a lack of
>>> an optab that allows RTL expansion to expand if (bool) as if (bool & 1)?
>> 
>> Yes this was what I was originally after.  Your original suggestion was to abuse BImode
>> and cbranch for this.  That could work, but I worry about introducing BImode in the RTL,
>> as I don't think we currently use it at all and wonder about new missed optimizations.
>> 
>> Richard Sandiford, would this be OK with you? I also don't want to emit an extract here as I think
>> that's gonna have a bigger effect on other targets than & 1 did.
>> 
>> I would rather have something I can explicitly check for target support for. I also wonder if just a
>> target hook asking the target how to expand a given tree Boolean argument would work.
> 
> Personally I'd prefer the test-and-branch optab.  We used to have
> separate compare optabs and branch optabs in the cc0 days, but having
> the combined pattern turned out to be much easier.  We don't currently
> expose CC registers at the optabs level and I think that's a good thing.
> 
> We could have a test-bit-and-set-pseudo optab.  But using that optab
> here would reintroduce the problem that I had with the original patch,
> which is:
> 
> - Currently, the branch on the boolean value expands to a single optab
>  (cbranch<mode>4).  That optab could easily expand to a single cb(n)z
>  instruction on aarch64 (even though the port chooses not to do that
>  currently).  So taken on its own, the branch is already maximally
>  efficient.
> 
>  The patch instead adds an extra instruction to the branch expansion,
>  so that it uses two optabs rather than one.  On the face of it,
>  this extra instruction is entirely redundant.  But emitting it can
>  sometimes allow profitable combines.  That is, rather than expand to:
> 
>    and reg, reg, 0xFF (from promote_mode)
>    cbranch on reg     (from the branch expansion)
> 
>  the patch expanded to:
> 
>    and reg, reg, 0xFF (from promote_mode)
>    and reg, reg, 0x1  (from the branch expansion)
>    cbranch on reg     (from the branch expansion)
> 
>  The first two ANDs come from separate sources.  When both ANDs exist,
>  combine can get rid of the redundancy and can then see that the
>  retained AND 0x1 + cbranch optimise to a test-and-branch instruction.
> 
>  But if the target can't in fact use a test-and-branch in a particular
>  situation, we'd be left with the AND 0x1 and the cbranch.  That's not
>  too bad if the AND from promote_mode and the AND 0x1 can be combined.
>  But if the AND with 0xff happens "too far" from the AND with 0x1
>  (e.g. because it's in a loop preheader), or if there's intermediate
>  logic, we might end up keeping all three instructions.
> 
> Emitting an extra bit test instruction as part of the branch expansion
> IMO has the same problem.  We're relying on combine combining the
> (redundant) bit test with the cbranch.  If combine doesn't do that
> (e.g. because the target doesn't support the combination in certain
> cases) then we could be left with three instructions rather than the
> original two.
> 
> That's why I think the fix is either for promote_mode to use a different
> expansion for booleans (which could also have knock-on effects) or for
> the branch expanders to target the test-and-branch instruction directly.
> 
> The optab wouldn't be an aarch64 special.  arc, avr, and h8300sx have
> similar instructions.  There are problem others too: h8300sx was from
> memory and I stopped looking after avr. :-)

Fair enough, let’s go with that then.

Richard 


> Thanks,
> Richard

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
  2022-09-23  9:24 [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend Tamar Christina
  2022-09-23  9:25 ` [PATCH 2/2]AArch64 Extend tbz pattern to allow SI to SI extensions Tamar Christina
  2022-09-26 10:35 ` [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend Richard Biener
@ 2022-10-27  3:22 ` Andrew Pinski
  2 siblings, 0 replies; 30+ messages in thread
From: Andrew Pinski @ 2022-10-27  3:22 UTC (permalink / raw)
  To: Tamar Christina; +Cc: gcc-patches, richard.sandiford, nd, rguenther

On Fri, Sep 23, 2022 at 2:26 AM Tamar Christina via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi All,
>
> This is an RFC to figure out how to deal with targets that don't have native
> comparisons against QImode values.
>
> Booleans, at least in C99 and higher are 0-1 valued.  This means that we only
> really need to test a single bit.  However in RTL we no longer have this
> information available and just have an SImode value (due to the promotion of
> QImode to SImode).
>
> This RFC fixes it by emitting an explicit & 1 during the expansion of the
> conditional branch.
>
> However it's unlikely that we want to do this unconditionally.  Most targets
> I've tested seem to have harmless code changes, like x86 changes from testb to
> andl, $1.
>
> So I have two questions:
>
> 1. Should I limit this behind a target macro? Or should I just leave it for all
>    targets and deal with the fallout.
> 2. How can I tell whether the C99 0-1 values bools are being used or the older
>    0, non-0 variant?
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> However there are some benign codegen changes on x86, testb changed to andl $1.
>
> This pattern occurs more than 120,000 times in SPECCPU 2017 and so is quite common.
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>         * tree.h (tree_zero_one_valued_p): New.
>         * dojump.cc (do_jump): Add & 1 if truth type.

Just FYI, I think this fixes/improves https://gcc.gnu.org/PR101806 .

Thanks,
Andrew

>
> --- inline copy of patch --
> diff --git a/gcc/dojump.cc b/gcc/dojump.cc
> index 2af0cd1aca3b6af13d5d8799094ee93f18022296..8eaf1be49cd12298e61c6946ae79ca9de6197864 100644
> --- a/gcc/dojump.cc
> +++ b/gcc/dojump.cc
> @@ -605,7 +605,17 @@ do_jump (tree exp, rtx_code_label *if_false_label,
>        /* Fall through and generate the normal code.  */
>      default:
>      normal:
> -      temp = expand_normal (exp);
> +      tree cmp = exp;
> +      /* If the expression is a truth type then explicitly generate an & 1
> +        to indicate to the target that it's a zero-one values type.  This
> +        allows the target to further optimize the comparison should it
> +        choose to.  */
> +      if (tree_zero_one_valued_p (exp))
> +       {
> +         type = TREE_TYPE (exp);
> +         cmp = build2 (BIT_AND_EXPR, type, exp, build_int_cstu (type, 1));
> +       }
> +      temp = expand_normal (cmp);
>        do_pending_stack_adjust ();
>        /* The RTL optimizers prefer comparisons against pseudos.  */
>        if (GET_CODE (temp) == SUBREG)
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 8f8a9660c9e0605eb516de194640b8c1b531b798..be3d2dee82f692e81082cf21c878c10f9fe9e1f1 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -4690,6 +4690,7 @@ extern tree signed_or_unsigned_type_for (int, tree);
>  extern tree signed_type_for (tree);
>  extern tree unsigned_type_for (tree);
>  extern bool is_truth_type_for (tree, tree);
> +extern bool tree_zero_one_valued_p (tree);
>  extern tree truth_type_for (tree);
>  extern tree build_pointer_type_for_mode (tree, machine_mode, bool);
>  extern tree build_pointer_type (tree);
>
>
>
>
> --

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2022-10-27  3:22 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-23  9:24 [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend Tamar Christina
2022-09-23  9:25 ` [PATCH 2/2]AArch64 Extend tbz pattern to allow SI to SI extensions Tamar Christina
2022-09-23  9:42   ` Richard Sandiford
2022-09-23  9:48     ` Tamar Christina
2022-09-26 10:35 ` [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend Richard Biener
2022-09-26 11:05   ` Tamar Christina
2022-09-26 11:32     ` Richard Biener
2022-09-26 11:46       ` Tamar Christina
2022-09-26 12:34         ` Richard Biener
2022-09-26 12:43           ` Richard Biener
2022-09-26 14:02             ` Tamar Christina
2022-09-28 15:04         ` Richard Sandiford
2022-09-28 17:23           ` Jeff Law
2022-09-29  9:37             ` Richard Sandiford
2022-09-29  9:40               ` Richard Biener
2022-09-29 10:21                 ` Tamar Christina
2022-09-29 11:09                   ` Richard Biener
2022-09-30  8:00                     ` Tamar Christina
2022-09-30  8:28                       ` Richard Sandiford
2022-09-30  8:38                         ` Tamar Christina
2022-09-30  8:48                           ` Richard Sandiford
2022-09-30  9:15                             ` Tamar Christina
2022-09-30 10:16                               ` Richard Biener
2022-09-30 11:11                                 ` Tamar Christina
2022-09-30 11:52                                   ` Richard Biener
2022-09-30 12:48                                     ` Tamar Christina
2022-09-30 14:28                                       ` Richard Sandiford
2022-09-30 14:33                                         ` Richard Biener
2022-09-29 20:49               ` Jeff Law
2022-10-27  3:22 ` Andrew Pinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).