public inbox for gcc-patches@gcc.gnu.org
* [PATCH v2] match.pd: rewrite select to branchless expression
@ 2022-11-11  2:28 Michael Collison
  2022-11-11  7:44 ` Prathamesh Kulkarni
  2022-11-18 12:57 ` Richard Biener
  0 siblings, 2 replies; 7+ messages in thread
From: Michael Collison @ 2022-11-11  2:28 UTC
  To: gcc-patches; +Cc: Richard Biener

This patch transforms ((x & 0x1) == 0) ? y : z <op> y into
(-(typeof(y))(x & 0x1) & z) <op> y, where <op> is '^' or '|'. It also
transforms ((x & 0x1) != 0) ? z <op> y : y into the same branchless
expression.

Matching these patterns allows GCC to generate branchless code for one of 
the functions in coremark.
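
As a quick sanity check of the identity, here is a minimal C sketch (not part of the patch) that compares the branchy and branchless forms for both operators. The function names, the check_equivalence helper and the sample values are made up for illustration, and unsigned int stands in for typeof(y).

```c
#include <assert.h>

/* Branchy forms from the description: y when the low bit of x is clear,
   otherwise z <op> y.  */
static unsigned int select_xor (unsigned int x, unsigned int y, unsigned int z)
{
  return ((x & 1) == 0) ? y : (z ^ y);
}

static unsigned int select_ior (unsigned int x, unsigned int y, unsigned int z)
{
  return ((x & 1) == 0) ? y : (z | y);
}

/* Branchless forms: -(x & 1) is either 0 or all-ones in unsigned
   arithmetic, so it acts as a mask that switches z off or on.  */
static unsigned int branchless_xor (unsigned int x, unsigned int y, unsigned int z)
{
  return (-(x & 1) & z) ^ y;
}

static unsigned int branchless_ior (unsigned int x, unsigned int y, unsigned int z)
{
  return (-(x & 1) & z) | y;
}

/* Hypothetical helper: asserts that both forms agree on every
   combination of a few sample values and returns how many
   combinations were checked.  */
int check_equivalence (void)
{
  static const unsigned int samples[] =
    { 0u, 1u, 2u, 3u, 0x55aa55aau, 0xffffffffu };
  const int n = (int) (sizeof samples / sizeof samples[0]);
  int checked = 0;

  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      for (int k = 0; k < n; k++)
        {
          unsigned int x = samples[i], y = samples[j], z = samples[k];
          assert (select_xor (x, y, z) == branchless_xor (x, y, z));
          assert (select_ior (x, y, z) == branchless_ior (x, y, z));
          checked++;
        }
  return checked;
}
```

Calling check_equivalence () from a small driver exercises all sample combinations. The identity depends only on the low bit of x, which is why the pattern can match any zero_one_valued_p operand rather than only an explicit x & 1.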

Bootstrapped and tested on x86 and RISC-V. Okay?

Michael.

2022-11-10  Michael Collison  <collison@rivosinc.com>

     * match.pd ((x & 0x1) == 0) ? y : z <op> y
     -> (-(typeof(y))(x & 0x1) & z) <op> y.

2022-11-10  Michael Collison <collison@rivosinc.com>

     * gcc.dg/tree-ssa/branchless-cond.c: New test.

---

Changes in v2:

- Rewrite comment to use C syntax

- Guard against 1-bit types

- Simplify pattern by using zero_one_valued_p

  gcc/match.pd                                  | 24 +++++++++++++++++
  .../gcc.dg/tree-ssa/branchless-cond.c         | 26 +++++++++++++++++++
  2 files changed, 50 insertions(+)
  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 194ba8f5188..258531e9046 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3486,6 +3486,30 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    (cond (le @0 integer_zerop@1) (negate@2 @0) integer_zerop@1)
    (max @2 @1))
  
+/* ((x & 0x1) == 0) ? y : z <op> y -> (-(typeof(y))(x & 0x1) & z) <op> y */
+(for op (bit_xor bit_ior)
+ (simplify
+  (cond (eq zero_one_valued_p@0
+            integer_zerop)
+        @1
+        (op:c @2 @1))
+  (if (INTEGRAL_TYPE_P (type)
+       && TYPE_PRECISION (type) > 1
+       && (INTEGRAL_TYPE_P (TREE_TYPE (@0))))
+       (op (bit_and (negate (convert:type @0)) @2) @1))))
+
+/* ((x & 0x1) != 0) ? z <op> y : y -> (-(typeof(y))(x & 0x1) & z) <op> y */
+(for op (bit_xor bit_ior)
+ (simplify
+  (cond (ne zero_one_valued_p@0
+            integer_zerop)
+        (op:c @2 @1)
+        @1)
+  (if (INTEGRAL_TYPE_P (type)
+       && TYPE_PRECISION (type) > 1
+       && (INTEGRAL_TYPE_P (TREE_TYPE (@0))))
+       (op (bit_and (negate (convert:type @0)) @2) @1))))
+
  /* Simplifications of shift and rotates.  */
  
  (for rotate (lrotate rrotate)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
new file mode 100644
index 00000000000..68087ae6568
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int f1(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) == 0) ? y : z ^ y;
+}
+
+int f2(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) != 0) ? z ^ y : y;
+}
+
+int f3(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) == 0) ? y : z | y;
+}
+
+int f4(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) != 0) ? z | y : y;
+}
+
+/* { dg-final { scan-tree-dump-times " -" 4 "optimized" } } */
+/* { dg-final { scan-tree-dump-times " & " 8 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "if" "optimized" } } */
-- 
2.34.1



* Re: [PATCH v2] match.pd: rewrite select to branchless expression
  2022-11-11  2:28 [PATCH v2] match.pd: rewrite select to branchless expression Michael Collison
@ 2022-11-11  7:44 ` Prathamesh Kulkarni
  2022-11-11 13:00   ` Michael Collison
  2022-11-18 12:57 ` Richard Biener
  1 sibling, 1 reply; 7+ messages in thread
From: Prathamesh Kulkarni @ 2022-11-11  7:44 UTC
  To: Michael Collison; +Cc: gcc-patches, Richard Biener

On Fri, 11 Nov 2022 at 07:58, Michael Collison <collison@rivosinc.com> wrote:
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
> new file mode 100644
> index 00000000000..68087ae6568
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +
> +int f1(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) == 0) ? y : z ^ y;
> +}
> +
> +int f2(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) != 0) ? z ^ y : y;
> +}
> +
> +int f3(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) == 0) ? y : z | y;
> +}
> +
> +int f4(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) != 0) ? z | y : y;
> +}
Sorry to nitpick: since the pattern gates on INTEGRAL_TYPE_P, would it
be a good idea to add these tests for other integral types besides int,
such as {char, short, long}?

Thanks,
Prathamesh
> +
> +/* { dg-final { scan-tree-dump-times " -" 4 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " & " 8 "optimized" } } */
> +/* { dg-final { scan-tree-dump-not "if" "optimized" } } */
> --
> 2.34.1
>


* Re: [PATCH v2] match.pd: rewrite select to branchless expression
  2022-11-11  7:44 ` Prathamesh Kulkarni
@ 2022-11-11 13:00   ` Michael Collison
  2022-11-17 14:34     ` Jeff Law
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Collison @ 2022-11-11 13:00 UTC
  To: Prathamesh Kulkarni; +Cc: gcc-patches, Richard Biener

Hi Prathamesh,

It is my understanding that INTEGRAL_TYPE_P applies to the other integer 
types you mentioned (char, short, long). In fact, the test function that 
motivated this match uses a mixture of char and short, and that does not 
prevent the pattern from matching.



* Re: [PATCH v2] match.pd: rewrite select to branchless expression
  2022-11-11 13:00   ` Michael Collison
@ 2022-11-17 14:34     ` Jeff Law
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff Law @ 2022-11-17 14:34 UTC
  To: Michael Collison, Prathamesh Kulkarni; +Cc: gcc-patches, Richard Biener


On 11/11/22 06:00, Michael Collison wrote:
> Hi Prathamesh,
>
> It is my understanding that INTEGRAL_TYPE_P applies to the other 
> integer types you mentioned (char, short, long). In fact the test 
> function that motivated this match has a mixture of char and short and 
> does not restrict matching.

What I think Prathamesh is asking is whether we want to have tests 
with different types. It's less about correctness, I think, and more 
about ensuring that the testsuite covers those cases in case they 
regress in the future.
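
For illustration, additional test functions along these lines might look like the sketch below. It is an assumption, not part of the posted patch: the function names and the check_other_types helper are hypothetical, and the dg-final scan directives are left out because the expected dump counts for narrower types would have to be verified against an actual -fdump-tree-optimized dump.

```c
#include <assert.h>

/* Same shapes as f1..f4 in branchless-cond.c, but using other
   integral types.  All names here are hypothetical.  */
unsigned char f1_char (unsigned char x, unsigned char y, unsigned char z)
{
  return ((x & 1) == 0) ? y : z ^ y;
}

unsigned short f2_short (unsigned short x, unsigned short y, unsigned short z)
{
  return ((x & 1) != 0) ? z ^ y : y;
}

unsigned long f3_long (unsigned long x, unsigned long y, unsigned long z)
{
  return ((x & 1) == 0) ? y : z | y;
}

unsigned long f4_long (unsigned long x, unsigned long y, unsigned long z)
{
  return ((x & 1) != 0) ? z | y : y;
}

/* Hypothetical run-time check that the functions keep their meaning
   regardless of how the conditional is lowered.  */
void check_other_types (void)
{
  assert (f1_char (0, 5, 3) == 5);
  assert (f1_char (1, 5, 3) == 6);   /* 3 ^ 5 */
  assert (f2_short (2, 5, 3) == 5);
  assert (f2_short (3, 5, 3) == 6);  /* 3 ^ 5 */
  assert (f3_long (0, 4, 3) == 4);
  assert (f3_long (1, 4, 3) == 7);   /* 3 | 4 */
  assert (f4_long (2, 4, 3) == 4);
  assert (f4_long (1, 4, 3) == 7);   /* 3 | 4 */
}
```

Note that the char and short variants go through integer promotion, so the optimized dump may contain extra conversions compared with the unsigned int functions.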


jeff





* Re: [PATCH v2] match.pd: rewrite select to branchless expression
  2022-11-11  2:28 [PATCH v2] match.pd: rewrite select to branchless expression Michael Collison
  2022-11-11  7:44 ` Prathamesh Kulkarni
@ 2022-11-18 12:57 ` Richard Biener
  2022-12-01 18:57   ` Michael Collison
  1 sibling, 1 reply; 7+ messages in thread
From: Richard Biener @ 2022-11-18 12:57 UTC
  To: Michael Collison; +Cc: gcc-patches

On Fri, Nov 11, 2022 at 3:28 AM Michael Collison <collison@rivosinc.com> wrote:
>
> This patch transforms ((x & 0x1) == 0) ? y : z <op> y into
> (-(typeof(y))(x & 0x1) & z) <op> y, where <op> is '^' or '|'. It also
> transforms ((x & 0x1) != 0) ? z <op> y : y into the same branchless
> expression.
>
> Matching these patterns allows GCC to generate branchless code for one of
> the functions in coremark.
>
> Bootstrapped and tested on x86 and RISC-V. Okay?

OK.

Thanks,
Richard.



* Re: [PATCH v2] match.pd: rewrite select to branchless expression
  2022-11-18 12:57 ` Richard Biener
@ 2022-12-01 18:57   ` Michael Collison
  2022-12-02  8:22     ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Collison @ 2022-12-01 18:57 UTC
  To: Richard Biener; +Cc: gcc-patches

Richard,

Can you submit this patch for me while I sort out git write access?



* Re: [PATCH v2] match.pd: rewrite select to branchless expression
  2022-12-01 18:57   ` Michael Collison
@ 2022-12-02  8:22     ` Richard Biener
  0 siblings, 0 replies; 7+ messages in thread
From: Richard Biener @ 2022-12-02  8:22 UTC
  To: Michael Collison; +Cc: gcc-patches

On Thu, Dec 1, 2022 at 7:57 PM Michael Collison <collison@rivosinc.com> wrote:
>
> Richard,
>
> Can you submit this patch for me while I sort out git write access?

Done. I had to apply the patch manually. In future, please make sure
to send patches that can be applied with git am.

Thanks,
Richard.


