public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Canonicalize X u< X to UNORDERED_EXPR
@ 2016-04-30 18:45 Marc Glisse
  2016-05-02  8:37 ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Marc Glisse @ 2016-04-30 18:45 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: TEXT/PLAIN, Size: 496 bytes --]

Hello,

this case seemed to be missing in the various X cmp X transformations. It 
does not change the generated code in the testcase.

The missing :c is rather trivial. I can commit it separately if you 
prefer.

Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

2016-05-02  Marc Glisse  <marc.glisse@inria.fr>

gcc/
 	* match.pd ((A & B) OP (C & B)): Mark '&' as commutative.
 	(X u< X, X u> X): New transformations

gcc/testsuite/
 	* gcc.dg/tree-ssa/unord.c: New testcase.

-- 
Marc Glisse

[-- Attachment #2: Type: TEXT/PLAIN, Size: 2159 bytes --]

Index: trunk/gcc/match.pd
===================================================================
--- trunk/gcc/match.pd	(revision 235654)
+++ trunk/gcc/match.pd	(working copy)
@@ -783,21 +783,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   @0)
  /* (~x | y) & x -> x & y */
  /* (~x & y) | x -> x | y */
  (simplify
   (bitop:c (rbitop:c (bit_not @0) @1) @0)
   (bitop @0 @1)))
 
 /* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
 (for bitop (bit_and bit_ior bit_xor)
  (simplify
-  (bitop (bit_and:c @0 @1) (bit_and @2 @1))
+  (bitop (bit_and:c @0 @1) (bit_and:c @2 @1))
   (bit_and (bitop @0 @2) @1)))
 
 /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
 (simplify
   (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
   (bit_ior (bit_and @0 @2) (bit_and @1 @2)))
 
 /* Combine successive equal operations with constants.  */
 (for bitop (bit_and bit_ior bit_xor)
  (simplify
@@ -1914,20 +1914,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (simplify
   (cmp @0 @0)
   (if (cmp != NE_EXPR
        || ! FLOAT_TYPE_P (TREE_TYPE (@0))
        || ! HONOR_NANS (@0))
    { constant_boolean_node (false, type); })))
 (for cmp (unle unge uneq)
  (simplify
   (cmp @0 @0)
   { constant_boolean_node (true, type); }))
+(for cmp (unlt ungt)
+ (simplify
+  (cmp @0 @0)
+  (unordered @0 @0)))
 (simplify
  (ltgt @0 @0)
  (if (!flag_trapping_math)
   { constant_boolean_node (false, type); }))
 
 /* Fold ~X op ~Y as Y op X.  */
 (for cmp (simple_comparison)
  (simplify
   (cmp (bit_not@2 @0) (bit_not@3 @1))
   (if (single_use (@2) && single_use (@3))
Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c
===================================================================
--- trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c	(revision 0)
+++ trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c	(working copy)
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+int f(double a){double b=a;return !__builtin_islessequal(a,b);}
+int g(double a){double b=a;return !__builtin_isgreaterequal(a,b);}
+
+/* { dg-final { scan-tree-dump-times " unord " 2 "optimized" } } */

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Canonicalize X u< X to UNORDERED_EXPR
  2016-04-30 18:45 Canonicalize X u< X to UNORDERED_EXPR Marc Glisse
@ 2016-05-02  8:37 ` Richard Biener
  2016-05-02  9:19   ` Marc Glisse
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-05-02  8:37 UTC (permalink / raw)
  To: Marc Glisse; +Cc: GCC Patches

On Sat, Apr 30, 2016 at 8:44 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> Hello,
>
> this case seemed to be missing in the various X cmp X transformations. It
> does not change the generated code in the testcase.
>
> The missing :c is rather trivial. I can commit it separately if you prefer.

I think it's not missing.  Commutating the first one is enough to eventually
make the @1s match up.  I think you should get a diagnostic on a duplicate
pattern when adding another :c (hmm, no, it's indeed "different" patterns but
still redundant).

> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

Ok for the new pattern.

Thanks,
Richard.

> 2016-05-02  Marc Glisse  <marc.glisse@inria.fr>
>
> gcc/
>         * match.pd ((A & B) OP (C & B)): Mark '&' as commutative.
>         (X u< X, X u> X): New transformations
>
> gcc/testsuite/
>         * gcc.dg/tree-ssa/unord.c: New testcase.
>
> --
> Marc Glisse
> Index: trunk/gcc/match.pd
> ===================================================================
> --- trunk/gcc/match.pd  (revision 235654)
> +++ trunk/gcc/match.pd  (working copy)
> @@ -783,21 +783,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>    @0)
>   /* (~x | y) & x -> x & y */
>   /* (~x & y) | x -> x | y */
>   (simplify
>    (bitop:c (rbitop:c (bit_not @0) @1) @0)
>    (bitop @0 @1)))
>
>  /* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
>  (for bitop (bit_and bit_ior bit_xor)
>   (simplify
> -  (bitop (bit_and:c @0 @1) (bit_and @2 @1))
> +  (bitop (bit_and:c @0 @1) (bit_and:c @2 @1))
>    (bit_and (bitop @0 @2) @1)))
>
>  /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
>  (simplify
>    (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
>    (bit_ior (bit_and @0 @2) (bit_and @1 @2)))
>
>  /* Combine successive equal operations with constants.  */
>  (for bitop (bit_and bit_ior bit_xor)
>   (simplify
> @@ -1914,20 +1914,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (simplify
>    (cmp @0 @0)
>    (if (cmp != NE_EXPR
>         || ! FLOAT_TYPE_P (TREE_TYPE (@0))
>         || ! HONOR_NANS (@0))
>     { constant_boolean_node (false, type); })))
>  (for cmp (unle unge uneq)
>   (simplify
>    (cmp @0 @0)
>    { constant_boolean_node (true, type); }))
> +(for cmp (unlt ungt)
> + (simplify
> +  (cmp @0 @0)
> +  (unordered @0 @0)))
>  (simplify
>   (ltgt @0 @0)
>   (if (!flag_trapping_math)
>    { constant_boolean_node (false, type); }))
>
>  /* Fold ~X op ~Y as Y op X.  */
>  (for cmp (simple_comparison)
>   (simplify
>    (cmp (bit_not@2 @0) (bit_not@3 @1))
>    (if (single_use (@2) && single_use (@3))
> Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c
> ===================================================================
> --- trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (revision 0)
> +++ trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (working copy)
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +int f(double a){double b=a;return !__builtin_islessequal(a,b);}
> +int g(double a){double b=a;return !__builtin_isgreaterequal(a,b);}
> +
> +/* { dg-final { scan-tree-dump-times " unord " 2 "optimized" } } */
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Canonicalize X u< X to UNORDERED_EXPR
  2016-05-02  8:37 ` Richard Biener
@ 2016-05-02  9:19   ` Marc Glisse
  2016-05-02  9:45     ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Marc Glisse @ 2016-05-02  9:19 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On Mon, 2 May 2016, Richard Biener wrote:

> On Sat, Apr 30, 2016 at 8:44 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> Hello,
>>
>> this case seemed to be missing in the various X cmp X transformations. It
>> does not change the generated code in the testcase.
>>
>> The missing :c is rather trivial. I can commit it separately if you prefer.
>
> I think it's not missing.  Commutating the first one is enough to eventually
> make the @1s match up.  I think you should get a diagnostic on a duplicate
> pattern when adding another :c (hmm, no, it's indeed "different" patterns but
> still redundant).

Let's see. The current pattern is
   (bitop (bit_and:c @0 @1) (bit_and @2 @1))

This matches:
(X & Y) ^ (Z & Y)
(Y & X) ^ (Z & Y)

If I have for instance (Y & X) ^ (Y & Z), I don't see how that is going to 
match, and indeed we don't simplify that. On the other hand, if I have 
bit_ior instead of bit_xor, then we have another very similar 
transformation a 100 lines up in match.pd

(for op (bit_and bit_ior)
      rop (bit_ior bit_and)
  (simplify
   (op (convert? (rop:c @0 @1)) (convert? (rop @0 @2)))

That one also commutes only one, but starting from a different match. We 
should probably reorganize them a bit.

>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>
> Ok for the new pattern.
>
> Thanks,
> Richard.
>
>> 2016-05-02  Marc Glisse  <marc.glisse@inria.fr>
>>
>> gcc/
>>         * match.pd ((A & B) OP (C & B)): Mark '&' as commutative.
>>         (X u< X, X u> X): New transformations
>>
>> gcc/testsuite/
>>         * gcc.dg/tree-ssa/unord.c: New testcase.
>>
>> --
>> Marc Glisse
>> Index: trunk/gcc/match.pd
>> ===================================================================
>> --- trunk/gcc/match.pd  (revision 235654)
>> +++ trunk/gcc/match.pd  (working copy)
>> @@ -783,21 +783,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>    @0)
>>   /* (~x | y) & x -> x & y */
>>   /* (~x & y) | x -> x | y */
>>   (simplify
>>    (bitop:c (rbitop:c (bit_not @0) @1) @0)
>>    (bitop @0 @1)))
>>
>>  /* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
>>  (for bitop (bit_and bit_ior bit_xor)
>>   (simplify
>> -  (bitop (bit_and:c @0 @1) (bit_and @2 @1))
>> +  (bitop (bit_and:c @0 @1) (bit_and:c @2 @1))
>>    (bit_and (bitop @0 @2) @1)))
>>
>>  /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
>>  (simplify
>>    (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
>>    (bit_ior (bit_and @0 @2) (bit_and @1 @2)))
>>
>>  /* Combine successive equal operations with constants.  */
>>  (for bitop (bit_and bit_ior bit_xor)
>>   (simplify
>> @@ -1914,20 +1914,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>   (simplify
>>    (cmp @0 @0)
>>    (if (cmp != NE_EXPR
>>         || ! FLOAT_TYPE_P (TREE_TYPE (@0))
>>         || ! HONOR_NANS (@0))
>>     { constant_boolean_node (false, type); })))
>>  (for cmp (unle unge uneq)
>>   (simplify
>>    (cmp @0 @0)
>>    { constant_boolean_node (true, type); }))
>> +(for cmp (unlt ungt)
>> + (simplify
>> +  (cmp @0 @0)
>> +  (unordered @0 @0)))
>>  (simplify
>>   (ltgt @0 @0)
>>   (if (!flag_trapping_math)
>>    { constant_boolean_node (false, type); }))
>>
>>  /* Fold ~X op ~Y as Y op X.  */
>>  (for cmp (simple_comparison)
>>   (simplify
>>    (cmp (bit_not@2 @0) (bit_not@3 @1))
>>    (if (single_use (@2) && single_use (@3))
>> Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c
>> ===================================================================
>> --- trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (revision 0)
>> +++ trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (working copy)
>> @@ -0,0 +1,7 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O -fdump-tree-optimized" } */
>> +
>> +int f(double a){double b=a;return !__builtin_islessequal(a,b);}
>> +int g(double a){double b=a;return !__builtin_isgreaterequal(a,b);}
>> +
>> +/* { dg-final { scan-tree-dump-times " unord " 2 "optimized" } } */

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Canonicalize X u< X to UNORDERED_EXPR
  2016-05-02  9:19   ` Marc Glisse
@ 2016-05-02  9:45     ` Richard Biener
  2016-05-03  6:37       ` Marc Glisse
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-05-02  9:45 UTC (permalink / raw)
  To: Marc Glisse; +Cc: GCC Patches

On Mon, May 2, 2016 at 11:18 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Mon, 2 May 2016, Richard Biener wrote:
>
>> On Sat, Apr 30, 2016 at 8:44 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>
>>> Hello,
>>>
>>> this case seemed to be missing in the various X cmp X transformations. It
>>> does not change the generated code in the testcase.
>>>
>>> The missing :c is rather trivial. I can commit it separately if you
>>> prefer.
>>
>>
>> I think it's not missing.  Commutating the first one is enough to
>> eventually
>> make the @1s match up.  I think you should get a diagnostic on a duplicate
>> pattern when adding another :c (hmm, no, it's indeed "different" patterns
>> but
>> still redundant).
>
>
> Let's see. The current pattern is
>   (bitop (bit_and:c @0 @1) (bit_and @2 @1))
>
> This matches:
> (X & Y) ^ (Z & Y)
> (Y & X) ^ (Z & Y)
>
> If I have for instance (Y & X) ^ (Y & Z), I don't see how that is going to
> match, and indeed we don't simplify that.

Hmm, indeed.  I might have wrongly ommitted some :c then in other places
as well.  So the original patch is ok.

> On the other hand, if I have
> bit_ior instead of bit_xor, then we have another very similar transformation
> a 100 lines up in match.pd
>
> (for op (bit_and bit_ior)
>      rop (bit_ior bit_and)
>  (simplify
>   (op (convert? (rop:c @0 @1)) (convert? (rop @0 @2)))
>
> That one also commutes only one, but starting from a different match. We
> should probably reorganize them a bit.

Yeah.

Richard.

>
>>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>>
>>
>> Ok for the new pattern.
>>
>> Thanks,
>> Richard.
>>
>>> 2016-05-02  Marc Glisse  <marc.glisse@inria.fr>
>>>
>>> gcc/
>>>         * match.pd ((A & B) OP (C & B)): Mark '&' as commutative.
>>>         (X u< X, X u> X): New transformations
>>>
>>> gcc/testsuite/
>>>         * gcc.dg/tree-ssa/unord.c: New testcase.
>>>
>>> --
>>> Marc Glisse
>>> Index: trunk/gcc/match.pd
>>> ===================================================================
>>> --- trunk/gcc/match.pd  (revision 235654)
>>> +++ trunk/gcc/match.pd  (working copy)
>>> @@ -783,21 +783,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>>    @0)
>>>   /* (~x | y) & x -> x & y */
>>>   /* (~x & y) | x -> x | y */
>>>   (simplify
>>>    (bitop:c (rbitop:c (bit_not @0) @1) @0)
>>>    (bitop @0 @1)))
>>>
>>>  /* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
>>>  (for bitop (bit_and bit_ior bit_xor)
>>>   (simplify
>>> -  (bitop (bit_and:c @0 @1) (bit_and @2 @1))
>>> +  (bitop (bit_and:c @0 @1) (bit_and:c @2 @1))
>>>    (bit_and (bitop @0 @2) @1)))
>>>
>>>  /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
>>>  (simplify
>>>    (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
>>>    (bit_ior (bit_and @0 @2) (bit_and @1 @2)))
>>>
>>>  /* Combine successive equal operations with constants.  */
>>>  (for bitop (bit_and bit_ior bit_xor)
>>>   (simplify
>>> @@ -1914,20 +1914,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>>   (simplify
>>>    (cmp @0 @0)
>>>    (if (cmp != NE_EXPR
>>>         || ! FLOAT_TYPE_P (TREE_TYPE (@0))
>>>         || ! HONOR_NANS (@0))
>>>     { constant_boolean_node (false, type); })))
>>>  (for cmp (unle unge uneq)
>>>   (simplify
>>>    (cmp @0 @0)
>>>    { constant_boolean_node (true, type); }))
>>> +(for cmp (unlt ungt)
>>> + (simplify
>>> +  (cmp @0 @0)
>>> +  (unordered @0 @0)))
>>>  (simplify
>>>   (ltgt @0 @0)
>>>   (if (!flag_trapping_math)
>>>    { constant_boolean_node (false, type); }))
>>>
>>>  /* Fold ~X op ~Y as Y op X.  */
>>>  (for cmp (simple_comparison)
>>>   (simplify
>>>    (cmp (bit_not@2 @0) (bit_not@3 @1))
>>>    (if (single_use (@2) && single_use (@3))
>>> Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c
>>> ===================================================================
>>> --- trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (revision 0)
>>> +++ trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (working copy)
>>> @@ -0,0 +1,7 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O -fdump-tree-optimized" } */
>>> +
>>> +int f(double a){double b=a;return !__builtin_islessequal(a,b);}
>>> +int g(double a){double b=a;return !__builtin_isgreaterequal(a,b);}
>>> +
>>> +/* { dg-final { scan-tree-dump-times " unord " 2 "optimized" } } */
>
>
> --
> Marc Glisse

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Canonicalize X u< X to UNORDERED_EXPR
  2016-05-02  9:45     ` Richard Biener
@ 2016-05-03  6:37       ` Marc Glisse
  2016-05-03 11:03         ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Marc Glisse @ 2016-05-03  6:37 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

[-- Attachment #1: Type: TEXT/PLAIN, Size: 535 bytes --]

This removes the duplication. I also removed the case (A&B)&(A&C) which is 
handled by reassoc. And I need 2 NOP checks, for the case where @0 is a 
constant (that couldn't happen before my patch because canonicalization 
would put the constant as second operand).

Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

2016-05-03  Marc Glisse  <marc.glisse@inria.fr>

 	* match.pd ((A | B) & (A | C)): Generalize to BIT_XOR_EXPR.  Mark
 	as commutative.  Check both conversions are NOP.
 	((A & B) OP (C & B)): Remove.

-- 
Marc Glisse

[-- Attachment #2: Type: TEXT/PLAIN, Size: 2135 bytes --]

Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 235764)
+++ gcc/match.pd	(working copy)
@@ -678,25 +678,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (simplify
  (bit_xor:c (bit_and:c @0 @1) @1)
  (bit_and (bit_not @0) @1))
 
 /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
    operands are another bit-wise operation with a common input.  If so,
    distribute the bit operations to save an operation and possibly two if
    constants are involved.  For example, convert
      (A | B) & (A | C) into A | (B & C)
    Further simplification will occur if B and C are constants.  */
-(for op (bit_and bit_ior)
-     rop (bit_ior bit_and)
+(for op (bit_and bit_ior bit_xor)
+     rop (bit_ior bit_and bit_and)
  (simplify
-  (op (convert? (rop:c @0 @1)) (convert? (rop @0 @2)))
-  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+  (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2)))
+  (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
+       && tree_nop_conversion_p (type, TREE_TYPE (@2)))
    (rop (convert @0) (op (convert @1) (convert @2))))))
 
 
 (simplify
  (abs (abs@1 @0))
  @1)
 (simplify
  (abs (negate @0))
  (abs @0))
 (simplify
@@ -780,26 +781,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   /* (x & y) | x -> x */
  (simplify
   (bitop:c (rbitop:c @0 @1) @0)
   @0)
  /* (~x | y) & x -> x & y */
  /* (~x & y) | x -> x | y */
  (simplify
   (bitop:c (rbitop:c (bit_not @0) @1) @0)
   (bitop @0 @1)))
 
-/* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
-(for bitop (bit_and bit_ior bit_xor)
- (simplify
-  (bitop (bit_and:c @0 @1) (bit_and @2 @1))
-  (bit_and (bitop @0 @2) @1)))
-
 /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
 (simplify
   (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
   (bit_ior (bit_and @0 @2) (bit_and @1 @2)))
 
 /* Combine successive equal operations with constants.  */
 (for bitop (bit_and bit_ior bit_xor)
  (simplify
   (bitop (bitop @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
   (bitop @0 (bitop @1 @2))))

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Canonicalize X u< X to UNORDERED_EXPR
  2016-05-03  6:37       ` Marc Glisse
@ 2016-05-03 11:03         ` Richard Biener
  2016-05-03 13:27           ` Marc Glisse
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-05-03 11:03 UTC (permalink / raw)
  To: Marc Glisse; +Cc: GCC Patches

On Tue, May 3, 2016 at 8:36 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> This removes the duplication. I also removed the case (A&B)&(A&C) which is
> handled by reassoc. And I need 2 NOP checks, for the case where @0 is a
> constant (that couldn't happen before my patch because canonicalization
> would put the constant as second operand).

Nicely spotted.  Not sure we want to delay (A&B)&(A&C) until re-assoc.  We have
many patterns that reassoc would also catch, like (A + CST) + CST or (A + B)- A,
albeit reassoc only handles the unsigned cases.

> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

Ok.

Thanks,
Richard.

> 2016-05-03  Marc Glisse  <marc.glisse@inria.fr>
>
>         * match.pd ((A | B) & (A | C)): Generalize to BIT_XOR_EXPR.  Mark
>         as commutative.  Check both conversions are NOP.
>         ((A & B) OP (C & B)): Remove.
>
> --
> Marc Glisse
> Index: gcc/match.pd
> ===================================================================
> --- gcc/match.pd        (revision 235764)
> +++ gcc/match.pd        (working copy)
> @@ -678,25 +678,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (simplify
>   (bit_xor:c (bit_and:c @0 @1) @1)
>   (bit_and (bit_not @0) @1))
>
>  /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
>     operands are another bit-wise operation with a common input.  If so,
>     distribute the bit operations to save an operation and possibly two if
>     constants are involved.  For example, convert
>       (A | B) & (A | C) into A | (B & C)
>     Further simplification will occur if B and C are constants.  */
> -(for op (bit_and bit_ior)
> -     rop (bit_ior bit_and)
> +(for op (bit_and bit_ior bit_xor)
> +     rop (bit_ior bit_and bit_and)
>   (simplify
> -  (op (convert? (rop:c @0 @1)) (convert? (rop @0 @2)))
> -  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> +  (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2)))
> +  (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
> +       && tree_nop_conversion_p (type, TREE_TYPE (@2)))
>     (rop (convert @0) (op (convert @1) (convert @2))))))
>
>
>  (simplify
>   (abs (abs@1 @0))
>   @1)
>  (simplify
>   (abs (negate @0))
>   (abs @0))
>  (simplify
> @@ -780,26 +781,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>    /* (x & y) | x -> x */
>   (simplify
>    (bitop:c (rbitop:c @0 @1) @0)
>    @0)
>   /* (~x | y) & x -> x & y */
>   /* (~x & y) | x -> x | y */
>   (simplify
>    (bitop:c (rbitop:c (bit_not @0) @1) @0)
>    (bitop @0 @1)))
>
> -/* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
> -(for bitop (bit_and bit_ior bit_xor)
> - (simplify
> -  (bitop (bit_and:c @0 @1) (bit_and @2 @1))
> -  (bit_and (bitop @0 @2) @1)))
> -
>  /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
>  (simplify
>    (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
>    (bit_ior (bit_and @0 @2) (bit_and @1 @2)))
>
>  /* Combine successive equal operations with constants.  */
>  (for bitop (bit_and bit_ior bit_xor)
>   (simplify
>    (bitop (bitop @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
>    (bitop @0 (bitop @1 @2))))
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Canonicalize X u< X to UNORDERED_EXPR
  2016-05-03 11:03         ` Richard Biener
@ 2016-05-03 13:27           ` Marc Glisse
  2016-05-03 13:34             ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Marc Glisse @ 2016-05-03 13:27 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On Tue, 3 May 2016, Richard Biener wrote:

> On Tue, May 3, 2016 at 8:36 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> This removes the duplication. I also removed the case (A&B)&(A&C) which is
>> handled by reassoc. And I need 2 NOP checks, for the case where @0 is a
>> constant (that couldn't happen before my patch because canonicalization
>> would put the constant as second operand).
>
> Nicely spotted.  Not sure we want to delay (A&B)&(A&C) until re-assoc.  We have
> many patterns that reassoc would also catch, like (A + CST) + CST or (A + B)- A,
> albeit reassoc only handles the unsigned cases.

(A&B)&A seems simple enough for match.pd, I thought (A&B)&(A&C) was 
starting to be a bit specialized. I could easily add it back (making it 
work with | at the same time), but then I am not convinced A&(B&C) is the 
best output. If A&B or A&C have several uses, then (A&B)&C or B&(A&C) seem 
preferable (and we would still have a transformation for (A&cst1)&cst2 so 
we wouldn't lose in the case where B and C are constants). We may still 
end up having to add some :s to the transformation I just touched.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Canonicalize X u< X to UNORDERED_EXPR
  2016-05-03 13:27           ` Marc Glisse
@ 2016-05-03 13:34             ` Richard Biener
  2016-05-06 11:50               ` Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) Marc Glisse
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-05-03 13:34 UTC (permalink / raw)
  To: Marc Glisse; +Cc: GCC Patches

On Tue, May 3, 2016 at 3:26 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Tue, 3 May 2016, Richard Biener wrote:
>
>> On Tue, May 3, 2016 at 8:36 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>
>>> This removes the duplication. I also removed the case (A&B)&(A&C) which
>>> is
>>> handled by reassoc. And I need 2 NOP checks, for the case where @0 is a
>>> constant (that couldn't happen before my patch because canonicalization
>>> would put the constant as second operand).
>>
>>
>> Nicely spotted.  Not sure we want to delay (A&B)&(A&C) until re-assoc.  We
>> have
>> many patterns that reassoc would also catch, like (A + CST) + CST or (A +
>> B)- A,
>> albeit reassoc only handles the unsigned cases.
>
>
> (A&B)&A seems simple enough for match.pd, I thought (A&B)&(A&C) was starting
> to be a bit specialized. I could easily add it back (making it work with |
> at the same time), but then I am not convinced A&(B&C) is the best output.
> If A&B or A&C have several uses, then (A&B)&C or B&(A&C) seem preferable
> (and we would still have a transformation for (A&cst1)&cst2 so we wouldn't
> lose in the case where B and C are constants). We may still end up having to
> add some :s to the transformation I just touched.

Yeah, these are always tricky questions.  Note that re-assoc won't
handle the case
with multi-use A&C or A&B.  The only reason to care for the single-use case is
when we change operations for the mixed operand cases.  For the all-&| case
there is only the (usual) consideration about SSA lifetime extension.

So maybe it makes sense to split out the all-&| cases.

Richard.

> --
> Marc Glisse

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR)
  2016-05-03 13:34             ` Richard Biener
@ 2016-05-06 11:50               ` Marc Glisse
  2016-05-08 20:49                 ` Marc Glisse
                                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Marc Glisse @ 2016-05-06 11:50 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2870 bytes --]

On Tue, 3 May 2016, Richard Biener wrote:

> On Tue, May 3, 2016 at 3:26 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> On Tue, 3 May 2016, Richard Biener wrote:
>>
>>> On Tue, May 3, 2016 at 8:36 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>>
>>>> This removes the duplication. I also removed the case (A&B)&(A&C) which
>>>> is
>>>> handled by reassoc. And I need 2 NOP checks, for the case where @0 is a
>>>> constant (that couldn't happen before my patch because canonicalization
>>>> would put the constant as second operand).
>>>
>>>
>>> Nicely spotted.  Not sure we want to delay (A&B)&(A&C) until re-assoc.  We
>>> have
>>> many patterns that reassoc would also catch, like (A + CST) + CST or (A +
>>> B)- A,
>>> albeit reassoc only handles the unsigned cases.
>>
>>
>> (A&B)&A seems simple enough for match.pd, I thought (A&B)&(A&C) was starting
>> to be a bit specialized. I could easily add it back (making it work with |
>> at the same time), but then I am not convinced A&(B&C) is the best output.
>> If A&B or A&C have several uses, then (A&B)&C or B&(A&C) seem preferable
>> (and we would still have a transformation for (A&cst1)&cst2 so we wouldn't
>> lose in the case where B and C are constants). We may still end up having to
>> add some :s to the transformation I just touched.
>
> Yeah, these are always tricky questions.  Note that re-assoc won't
> handle the case
> with multi-use A&C or A&B.  The only reason to care for the single-use case is
> when we change operations for the mixed operand cases.  For the all-&| case
> there is only the (usual) consideration about SSA lifetime extension.
>
> So maybe it makes sense to split out the all-&| cases.

Here they are. I did (X&Y)&X and (X&Y)&(X&Z). The next one would be 
((X&Y)&Z)&X, but at some point we have to defer to reassoc.

I didn't add the convert?+tree_nop_conversion_p to the existing transform 
I modified. I guess at some point we should make a pass and add them to 
all the transformations on bit operations...

For (X & Y) & Y, I believe that any conversion is fine. For the others, 
tree_nop_conversion_p is probably too strict (narrowing should be fine for 
all), but I was too lazy to look for tighter conditions.

(X ^ Y) ^ Y -> X should probably have (non_lvalue ...) on its output, but 
in a simple test it didn't seem to matter. Is non_lvalue still needed?


Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

2016-05-06  Marc Glisse  <marc.glisse@inria.fr>

gcc/
 	* fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge with...
 	* match.pd ((X & Y) ^ Y): ... this.
 	((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y)
 	| (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations.

gcc/testsuite/
 	* gcc.dg/tree-ssa/bit-assoc.c: New testcase.
 	* gcc.dg/tree-ssa/pr69270.c: Adjust.
 	* gcc.dg/tree-ssa/vrp59.c: Disable forwprop.

-- 
Marc Glisse

[-- Attachment #2: Type: TEXT/PLAIN, Size: 8281 bytes --]

Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	(revision 235933)
+++ gcc/fold-const.c	(working copy)
@@ -10063,59 +10063,20 @@ fold_binary_loc (location_t loc,
 	}
       /* Fold !X & 1 as X == 0.  */
       if (TREE_CODE (arg0) == TRUTH_NOT_EXPR
 	  && integer_onep (arg1))
 	{
 	  tem = TREE_OPERAND (arg0, 0);
 	  return fold_build2_loc (loc, EQ_EXPR, type, tem,
 				  build_zero_cst (TREE_TYPE (tem)));
 	}
 
-      /* Fold (X ^ Y) & Y as ~X & Y.  */
-      if (TREE_CODE (arg0) == BIT_XOR_EXPR
-	  && operand_equal_p (TREE_OPERAND (arg0, 1), arg1, 0))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
-			      fold_convert_loc (loc, type, arg1));
-	}
-      /* Fold (X ^ Y) & X as ~Y & X.  */
-      if (TREE_CODE (arg0) == BIT_XOR_EXPR
-	  && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0)
-	  && reorder_operands_p (TREE_OPERAND (arg0, 1), arg1))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
-			      fold_convert_loc (loc, type, arg1));
-	}
-      /* Fold X & (X ^ Y) as X & ~Y.  */
-      if (TREE_CODE (arg1) == BIT_XOR_EXPR
-	  && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_convert_loc (loc, type, arg0),
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem));
-	}
-      /* Fold X & (Y ^ X) as ~Y & X.  */
-      if (TREE_CODE (arg1) == BIT_XOR_EXPR
-	  && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0)
-	  && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
-			      fold_convert_loc (loc, type, arg0));
-	}
-
       /* Fold (X * Y) & -(1 << CST) to X * Y if Y is a constant
          multiple of 1 << CST.  */
       if (TREE_CODE (arg1) == INTEGER_CST)
 	{
 	  wide_int cst1 = arg1;
 	  wide_int ncst1 = -cst1;
 	  if ((cst1 & ncst1) == ncst1
 	      && multiple_of_p (type, arg0,
 				wide_int_to_tree (TREE_TYPE (arg1), ncst1)))
 	    return fold_convert_loc (loc, type, arg0);
Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 235933)
+++ gcc/match.pd	(working copy)
@@ -667,39 +667,70 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
       && tree_nop_conversion_p (type, TREE_TYPE (@1)))
   (bit_xor (convert @0) (convert @1))))
 
 /* Convert ~X ^ C to X ^ ~C.  */
 (simplify
  (bit_xor (convert? (bit_not @0)) INTEGER_CST@1)
  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
   (bit_xor (convert @0) (bit_not @1))))
 
-/* Fold (X & Y) ^ Y as ~X & Y.  */
-(simplify
- (bit_xor:c (bit_and:c @0 @1) @1)
- (bit_and (bit_not @0) @1))
+/* Fold (X & Y) ^ Y and (X ^ Y) & Y as ~X & Y.  */
+(for opo (bit_and bit_xor)
+     opi (bit_xor bit_and)
+ (simplify
+  (opo:c (opi:c @0 @1) @1) 
+  (bit_and (bit_not @0) @1)))
 
 /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
    operands are another bit-wise operation with a common input.  If so,
    distribute the bit operations to save an operation and possibly two if
    constants are involved.  For example, convert
      (A | B) & (A | C) into A | (B & C)
    Further simplification will occur if B and C are constants.  */
 (for op (bit_and bit_ior bit_xor)
      rop (bit_ior bit_and bit_and)
  (simplify
   (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2)))
   (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
        && tree_nop_conversion_p (type, TREE_TYPE (@2)))
    (rop (convert @0) (op (convert @1) (convert @2))))))
 
+/* Some simple reassociation for bit operations, also handled in reassoc.  */
+/* (X & Y) & Y -> X & Y
+   (X | Y) | Y -> X | Y  */
+(for op (bit_and bit_ior)
+ (simplify
+  (op:c (convert?@2 (op:c @0 @1)) (convert? @1))
+  @2))
+/* (X ^ Y) ^ Y -> X  */
+(simplify
+ (bit_xor:c (convert? (bit_xor:c @0 @1)) (convert? @1))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+  (convert @0)))
+/* (X & Y) & (X & Z) -> (X & Y) & Z
+   (X | Y) | (X | Z) -> (X | Y) | Z  */
+(for op (bit_and bit_ior)
+ (simplify
+  (op:c (convert?@3 (op:c@4 @0 @1)) (convert?@5 (op:c@6 @0 @2)))
+  (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
+       && tree_nop_conversion_p (type, TREE_TYPE (@2)))
+   (if (single_use (@5) && single_use (@6))
+    (op @3 (convert @2))
+    (if (single_use (@3) && single_use (@4))
+     (op (convert @1) @5))))))
+/* (X ^ Y) ^ (X ^ Z) -> Y ^ Z  */
+(simplify
+ (bit_xor (convert? (bit_xor:c @0 @1)) (convert? (bit_xor:c @0 @2)))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
+      && tree_nop_conversion_p (type, TREE_TYPE (@2)))
+  (convert (bit_xor @1 @2))))
 
 (simplify
  (abs (abs@1 @0))
  @1)
 (simplify
  (abs (negate @0))
  (abs @0))
 (simplify
  (abs tree_expr_nonnegative_p@0)
  @0)
Index: gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c	(working copy)
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-forwprop1-details -fdump-tree-ccp1-details" } */
+
+int f1(int a, int b){
+  int c = a & b;
+  return c & b;
+}
+int f2(int a, int b){
+  int c = a | b;
+  return b | c;
+}
+int g1(int a, int b, int c){
+  int d = a & b;
+  int e = b & c;
+  return d & e;
+}
+int g2(int a, int b, int c){
+  int d = a | b;
+  int e = c | b;
+  return d | e;
+}
+int g3(int a, int b, int c){
+  int d = a ^ b;
+  int e = b ^ c;
+  return e ^ d;
+}
+
+/* { dg-final { scan-tree-dump-times "Match-and-simplified" 2 "ccp1" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified" 3 "forwprop1" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr69270.c	(revision 235933)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr69270.c	(working copy)
@@ -1,21 +1,23 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fsplit-paths -fdump-tree-dom3-details" } */
 
 /* There should be two references to bufferstep that turn into
    constants.  */
 /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with constant .0." 1 "dom3"} } */
 /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with constant .1." 1 "dom3"} } */
 
 /* And some assignments ought to fold down to constants.  */
-/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 2 "dom3"} } */
-/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 2 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -1;" 1 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -2;" 1 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 1 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 1 "dom3"} } */
 
 /* The XOR operations should have been optimized to constants.  */
 /* { dg-final { scan-tree-dump-not "bit_xor" "dom3"} } */
 
 
 extern int *stepsizeTable;
 
 void
 adpcm_coder (signed char *outdata, int len)
 {
Index: gcc/testsuite/gcc.dg/tree-ssa/vrp59.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/vrp59.c	(revision 235933)
+++ gcc/testsuite/gcc.dg/tree-ssa/vrp59.c	(working copy)
@@ -1,12 +1,12 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fno-tree-ccp -fno-tree-forwprop -fdump-tree-vrp1" } */
 
 int f(int x)
 {
   if (x >= 0 && x <= 3)
     {
       x = x ^ 3;
       x = x & 3;
     }
   return x;
 }

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR)
  2016-05-06 11:50               ` Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) Marc Glisse
@ 2016-05-08 20:49                 ` Marc Glisse
  2016-05-09 10:04                 ` Richard Biener
  2016-05-10  6:12                 ` Marc Glisse
  2 siblings, 0 replies; 22+ messages in thread
From: Marc Glisse @ 2016-05-08 20:49 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On Fri, 6 May 2016, Marc Glisse wrote:

> 2016-05-06  Marc Glisse  <marc.glisse@inria.fr>
>
> gcc/
> 	* fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge with...
> 	* match.pd ((X & Y) ^ Y): ... this.
> 	((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y)
> 	| (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations.
>
> gcc/testsuite/
> 	* gcc.dg/tree-ssa/bit-assoc.c: New testcase.
> 	* gcc.dg/tree-ssa/pr69270.c: Adjust.
> 	* gcc.dg/tree-ssa/vrp59.c: Disable forwprop.

Oups, I forgot about convert1/convert2 in match.pd, which may be relevant 
especially if @0 is a constant in the last transforms. Please ignore the 
patch for now, I'll resend after checking it.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR)
  2016-05-06 11:50               ` Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) Marc Glisse
  2016-05-08 20:49                 ` Marc Glisse
@ 2016-05-09 10:04                 ` Richard Biener
  2016-05-10  6:12                 ` Marc Glisse
  2 siblings, 0 replies; 22+ messages in thread
From: Richard Biener @ 2016-05-09 10:04 UTC (permalink / raw)
  To: Marc Glisse; +Cc: GCC Patches

On Fri, May 6, 2016 at 1:50 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Tue, 3 May 2016, Richard Biener wrote:
>
>> On Tue, May 3, 2016 at 3:26 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>
>>> On Tue, 3 May 2016, Richard Biener wrote:
>>>
>>>> On Tue, May 3, 2016 at 8:36 AM, Marc Glisse <marc.glisse@inria.fr>
>>>> wrote:
>>>>>
>>>>>
>>>>> This removes the duplication. I also removed the case (A&B)&(A&C) which
>>>>> is
>>>>> handled by reassoc. And I need 2 NOP checks, for the case where @0 is a
>>>>> constant (that couldn't happen before my patch because canonicalization
>>>>> would put the constant as second operand).
>>>>
>>>>
>>>>
>>>> Nicely spotted.  Not sure we want to delay (A&B)&(A&C) until re-assoc.
>>>> We
>>>> have
>>>> many patterns that reassoc would also catch, like (A + CST) + CST or (A
>>>> +
>>>> B)- A,
>>>> albeit reassoc only handles the unsigned cases.
>>>
>>>
>>>
>>> (A&B)&A seems simple enough for match.pd, I thought (A&B)&(A&C) was
>>> starting
>>> to be a bit specialized. I could easily add it back (making it work with
>>> |
>>> at the same time), but then I am not convinced A&(B&C) is the best
>>> output.
>>> If A&B or A&C have several uses, then (A&B)&C or B&(A&C) seem preferable
>>> (and we would still have a transformation for (A&cst1)&cst2 so we
>>> wouldn't
>>> lose in the case where B and C are constants). We may still end up having
>>> to
>>> add some :s to the transformation I just touched.
>>
>>
>> Yeah, these are always tricky questions.  Note that re-assoc won't
>> handle the case
>> with multi-use A&C or A&B.  The only reason to care for the single-use
>> case is
>> when we change operations for the mixed operand cases.  For the all-&|
>> case
>> there is only the (usual) consideration about SSA lifetime extension.
>>
>> So maybe it makes sense to split out the all-&| cases.
>
>
> Here they are. I did (X&Y)&X and (X&Y)&(X&Z). The next one would be
> ((X&Y)&Z)&X, but at some point we have to defer to reassoc.
>
> I didn't add the convert?+tree_nop_conversion_p to the existing transform I
> modified. I guess at some point we should make a pass and add them to all
> the transformations on bit operations...
>
> For (X & Y) & Y, I believe that any conversion is fine. For the others,
> tree_nop_conversion_p is probably too strict (narrowing should be fine for
> all), but I was too lazy to look for tighter conditions.
>
> (X ^ Y) ^ Y -> X should probably have (non_lvalue ...) on its output, but in
> a simple test it didn't seem to matter. Is non_lvalue still needed?

I've only added them to patterns where the fold-const.c variant had it.

In theory (hopefully) all of them are no longer necessary since C/C++ both
do delayed folding and thus hopefully figure out lvalue-ness before
entering fold ...

Richard.

>
> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>
> 2016-05-06  Marc Glisse  <marc.glisse@inria.fr>
>
> gcc/
>         * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge
> with...
>         * match.pd ((X & Y) ^ Y): ... this.
>         ((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y)
>         | (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations.
>
> gcc/testsuite/
>         * gcc.dg/tree-ssa/bit-assoc.c: New testcase.
>         * gcc.dg/tree-ssa/pr69270.c: Adjust.
>         * gcc.dg/tree-ssa/vrp59.c: Disable forwprop.
>
> --
> Marc Glisse
> Index: gcc/fold-const.c
> ===================================================================
> --- gcc/fold-const.c    (revision 235933)
> +++ gcc/fold-const.c    (working copy)
> @@ -10063,59 +10063,20 @@ fold_binary_loc (location_t loc,
>         }
>        /* Fold !X & 1 as X == 0.  */
>        if (TREE_CODE (arg0) == TRUTH_NOT_EXPR
>           && integer_onep (arg1))
>         {
>           tem = TREE_OPERAND (arg0, 0);
>           return fold_build2_loc (loc, EQ_EXPR, type, tem,
>                                   build_zero_cst (TREE_TYPE (tem)));
>         }
>
> -      /* Fold (X ^ Y) & Y as ~X & Y.  */
> -      if (TREE_CODE (arg0) == BIT_XOR_EXPR
> -         && operand_equal_p (TREE_OPERAND (arg0, 1), arg1, 0))
> -       {
> -         tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
> -         return fold_build2_loc (loc, BIT_AND_EXPR, type,
> -                             fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem),
> -                             fold_convert_loc (loc, type, arg1));
> -       }
> -      /* Fold (X ^ Y) & X as ~Y & X.  */
> -      if (TREE_CODE (arg0) == BIT_XOR_EXPR
> -         && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0)
> -         && reorder_operands_p (TREE_OPERAND (arg0, 1), arg1))
> -       {
> -         tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
> -         return fold_build2_loc (loc, BIT_AND_EXPR, type,
> -                             fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem),
> -                             fold_convert_loc (loc, type, arg1));
> -       }
> -      /* Fold X & (X ^ Y) as X & ~Y.  */
> -      if (TREE_CODE (arg1) == BIT_XOR_EXPR
> -         && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
> -       {
> -         tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
> -         return fold_build2_loc (loc, BIT_AND_EXPR, type,
> -                             fold_convert_loc (loc, type, arg0),
> -                             fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem));
> -       }
> -      /* Fold X & (Y ^ X) as ~Y & X.  */
> -      if (TREE_CODE (arg1) == BIT_XOR_EXPR
> -         && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0)
> -         && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
> -       {
> -         tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
> -         return fold_build2_loc (loc, BIT_AND_EXPR, type,
> -                             fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem),
> -                             fold_convert_loc (loc, type, arg0));
> -       }
> -
>        /* Fold (X * Y) & -(1 << CST) to X * Y if Y is a constant
>           multiple of 1 << CST.  */
>        if (TREE_CODE (arg1) == INTEGER_CST)
>         {
>           wide_int cst1 = arg1;
>           wide_int ncst1 = -cst1;
>           if ((cst1 & ncst1) == ncst1
>               && multiple_of_p (type, arg0,
>                                 wide_int_to_tree (TREE_TYPE (arg1), ncst1)))
>             return fold_convert_loc (loc, type, arg0);
> Index: gcc/match.pd
> ===================================================================
> --- gcc/match.pd        (revision 235933)
> +++ gcc/match.pd        (working copy)
> @@ -667,39 +667,70 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
>        && tree_nop_conversion_p (type, TREE_TYPE (@1)))
>    (bit_xor (convert @0) (convert @1))))
>
>  /* Convert ~X ^ C to X ^ ~C.  */
>  (simplify
>   (bit_xor (convert? (bit_not @0)) INTEGER_CST@1)
>   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>    (bit_xor (convert @0) (bit_not @1))))
>
> -/* Fold (X & Y) ^ Y as ~X & Y.  */
> -(simplify
> - (bit_xor:c (bit_and:c @0 @1) @1)
> - (bit_and (bit_not @0) @1))
> +/* Fold (X & Y) ^ Y and (X ^ Y) & Y as ~X & Y.  */
> +(for opo (bit_and bit_xor)
> +     opi (bit_xor bit_and)
> + (simplify
> +  (opo:c (opi:c @0 @1) @1)
> +  (bit_and (bit_not @0) @1)))
>
>  /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
>     operands are another bit-wise operation with a common input.  If so,
>     distribute the bit operations to save an operation and possibly two if
>     constants are involved.  For example, convert
>       (A | B) & (A | C) into A | (B & C)
>     Further simplification will occur if B and C are constants.  */
>  (for op (bit_and bit_ior bit_xor)
>       rop (bit_ior bit_and bit_and)
>   (simplify
>    (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2)))
>    (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
>         && tree_nop_conversion_p (type, TREE_TYPE (@2)))
>     (rop (convert @0) (op (convert @1) (convert @2))))))
>
> +/* Some simple reassociation for bit operations, also handled in reassoc.
> */
> +/* (X & Y) & Y -> X & Y
> +   (X | Y) | Y -> X | Y  */
> +(for op (bit_and bit_ior)
> + (simplify
> +  (op:c (convert?@2 (op:c @0 @1)) (convert? @1))
> +  @2))
> +/* (X ^ Y) ^ Y -> X  */
> +(simplify
> + (bit_xor:c (convert? (bit_xor:c @0 @1)) (convert? @1))
> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> +  (convert @0)))
> +/* (X & Y) & (X & Z) -> (X & Y) & Z
> +   (X | Y) | (X | Z) -> (X | Y) | Z  */
> +(for op (bit_and bit_ior)
> + (simplify
> +  (op:c (convert?@3 (op:c@4 @0 @1)) (convert?@5 (op:c@6 @0 @2)))
> +  (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
> +       && tree_nop_conversion_p (type, TREE_TYPE (@2)))
> +   (if (single_use (@5) && single_use (@6))
> +    (op @3 (convert @2))
> +    (if (single_use (@3) && single_use (@4))
> +     (op (convert @1) @5))))))
> +/* (X ^ Y) ^ (X ^ Z) -> Y ^ Z  */
> +(simplify
> + (bit_xor (convert? (bit_xor:c @0 @1)) (convert? (bit_xor:c @0 @2)))
> + (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
> +      && tree_nop_conversion_p (type, TREE_TYPE (@2)))
> +  (convert (bit_xor @1 @2))))
>
>  (simplify
>   (abs (abs@1 @0))
>   @1)
>  (simplify
>   (abs (negate @0))
>   (abs @0))
>  (simplify
>   (abs tree_expr_nonnegative_p@0)
>   @0)
> Index: gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c   (revision 0)
> +++ gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c   (working copy)
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-forwprop1-details -fdump-tree-ccp1-details"
> } */
> +
> +int f1(int a, int b){
> +  int c = a & b;
> +  return c & b;
> +}
> +int f2(int a, int b){
> +  int c = a | b;
> +  return b | c;
> +}
> +int g1(int a, int b, int c){
> +  int d = a & b;
> +  int e = b & c;
> +  return d & e;
> +}
> +int g2(int a, int b, int c){
> +  int d = a | b;
> +  int e = c | b;
> +  return d | e;
> +}
> +int g3(int a, int b, int c){
> +  int d = a ^ b;
> +  int e = b ^ c;
> +  return e ^ d;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Match-and-simplified" 2 "ccp1" } } */
> +/* { dg-final { scan-tree-dump-times "gimple_simplified" 3 "forwprop1" } }
> */
> Index: gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/pr69270.c     (revision 235933)
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr69270.c     (working copy)
> @@ -1,21 +1,23 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -fsplit-paths -fdump-tree-dom3-details" } */
>
>  /* There should be two references to bufferstep that turn into
>     constants.  */
>  /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with
> constant .0." 1 "dom3"} } */
>  /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with
> constant .1." 1 "dom3"} } */
>
>  /* And some assignments ought to fold down to constants.  */
> -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 2 "dom3"}
> } */
> -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 2 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -1;" 1 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -2;" 1 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 1 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 1 "dom3"}
> } */
>
>  /* The XOR operations should have been optimized to constants.  */
>  /* { dg-final { scan-tree-dump-not "bit_xor" "dom3"} } */
>
>
>  extern int *stepsizeTable;
>
>  void
>  adpcm_coder (signed char *outdata, int len)
>  {
> Index: gcc/testsuite/gcc.dg/tree-ssa/vrp59.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/vrp59.c       (revision 235933)
> +++ gcc/testsuite/gcc.dg/tree-ssa/vrp59.c       (working copy)
> @@ -1,12 +1,12 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp1" } */
> +/* { dg-options "-O2 -fno-tree-ccp -fno-tree-forwprop -fdump-tree-vrp1" }
> */
>
>  int f(int x)
>  {
>    if (x >= 0 && x <= 3)
>      {
>        x = x ^ 3;
>        x = x & 3;
>      }
>    return x;
>  }
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR)
  2016-05-06 11:50               ` Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) Marc Glisse
  2016-05-08 20:49                 ` Marc Glisse
  2016-05-09 10:04                 ` Richard Biener
@ 2016-05-10  6:12                 ` Marc Glisse
  2016-05-10  9:27                   ` Richard Biener
  2016-05-11 13:52                   ` H.J. Lu
  2 siblings, 2 replies; 22+ messages in thread
From: Marc Glisse @ 2016-05-10  6:12 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1656 bytes --]

On Fri, 6 May 2016, Marc Glisse wrote:

> Here they are. I did (X&Y)&X and (X&Y)&(X&Z). The next one would be 
> ((X&Y)&Z)&X, but at some point we have to defer to reassoc.
>
> I didn't add the convert?+tree_nop_conversion_p to the existing transform I 
> modified. I guess at some point we should make a pass and add them to all the 
> transformations on bit operations...
>
> For (X & Y) & Y, I believe that any conversion is fine. For the others, 
> tree_nop_conversion_p is probably too strict (narrowing should be fine for 
> all), but I was too lazy to look for tighter conditions.
>
> (X ^ Y) ^ Y -> X should probably have (non_lvalue ...) on its output, but in 
> a simple test it didn't seem to matter. Is non_lvalue still needed?
>
>
> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>
> 2016-05-06  Marc Glisse  <marc.glisse@inria.fr>
>
> gcc/
> 	* fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge 
> with...
> 	* match.pd ((X & Y) ^ Y): ... this.
> 	((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y)
> 	| (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations.
>
> gcc/testsuite/
> 	* gcc.dg/tree-ssa/bit-assoc.c: New testcase.
> 	* gcc.dg/tree-ssa/pr69270.c: Adjust.
> 	* gcc.dg/tree-ssa/vrp59.c: Disable forwprop.

Here it is again, I just replaced convert with convert[12] in the last 2 
transforms. This should matter for (unsigned)(si & 42) & (ui & 42u). I 
didn't change it in the other transform, because it would only matter in 
the case (T)(X & CST) & CST, which I think would be better served by 
extending the transform that currently handles (X & CST1) & CST2 (not done 
in this patch).

-- 
Marc Glisse

[-- Attachment #2: Type: TEXT/PLAIN, Size: 8285 bytes --]

Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	(revision 236047)
+++ gcc/fold-const.c	(working copy)
@@ -10064,59 +10064,20 @@ fold_binary_loc (location_t loc,
 	}
       /* Fold !X & 1 as X == 0.  */
       if (TREE_CODE (arg0) == TRUTH_NOT_EXPR
 	  && integer_onep (arg1))
 	{
 	  tem = TREE_OPERAND (arg0, 0);
 	  return fold_build2_loc (loc, EQ_EXPR, type, tem,
 				  build_zero_cst (TREE_TYPE (tem)));
 	}
 
-      /* Fold (X ^ Y) & Y as ~X & Y.  */
-      if (TREE_CODE (arg0) == BIT_XOR_EXPR
-	  && operand_equal_p (TREE_OPERAND (arg0, 1), arg1, 0))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
-			      fold_convert_loc (loc, type, arg1));
-	}
-      /* Fold (X ^ Y) & X as ~Y & X.  */
-      if (TREE_CODE (arg0) == BIT_XOR_EXPR
-	  && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0)
-	  && reorder_operands_p (TREE_OPERAND (arg0, 1), arg1))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
-			      fold_convert_loc (loc, type, arg1));
-	}
-      /* Fold X & (X ^ Y) as X & ~Y.  */
-      if (TREE_CODE (arg1) == BIT_XOR_EXPR
-	  && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_convert_loc (loc, type, arg0),
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem));
-	}
-      /* Fold X & (Y ^ X) as ~Y & X.  */
-      if (TREE_CODE (arg1) == BIT_XOR_EXPR
-	  && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0)
-	  && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
-			      fold_convert_loc (loc, type, arg0));
-	}
-
       /* Fold (X * Y) & -(1 << CST) to X * Y if Y is a constant
          multiple of 1 << CST.  */
       if (TREE_CODE (arg1) == INTEGER_CST)
 	{
 	  wide_int cst1 = arg1;
 	  wide_int ncst1 = -cst1;
 	  if ((cst1 & ncst1) == ncst1
 	      && multiple_of_p (type, arg0,
 				wide_int_to_tree (TREE_TYPE (arg1), ncst1)))
 	    return fold_convert_loc (loc, type, arg0);
Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 236047)
+++ gcc/match.pd	(working copy)
@@ -667,39 +667,70 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
       && tree_nop_conversion_p (type, TREE_TYPE (@1)))
   (bit_xor (convert @0) (convert @1))))
 
 /* Convert ~X ^ C to X ^ ~C.  */
 (simplify
  (bit_xor (convert? (bit_not @0)) INTEGER_CST@1)
  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
   (bit_xor (convert @0) (bit_not @1))))
 
-/* Fold (X & Y) ^ Y as ~X & Y.  */
-(simplify
- (bit_xor:c (bit_and:c @0 @1) @1)
- (bit_and (bit_not @0) @1))
+/* Fold (X & Y) ^ Y and (X ^ Y) & Y as ~X & Y.  */
+(for opo (bit_and bit_xor)
+     opi (bit_xor bit_and)
+ (simplify
+  (opo:c (opi:c @0 @1) @1) 
+  (bit_and (bit_not @0) @1)))
 
 /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
    operands are another bit-wise operation with a common input.  If so,
    distribute the bit operations to save an operation and possibly two if
    constants are involved.  For example, convert
      (A | B) & (A | C) into A | (B & C)
    Further simplification will occur if B and C are constants.  */
 (for op (bit_and bit_ior bit_xor)
      rop (bit_ior bit_and bit_and)
  (simplify
   (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2)))
   (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
        && tree_nop_conversion_p (type, TREE_TYPE (@2)))
    (rop (convert @0) (op (convert @1) (convert @2))))))
 
+/* Some simple reassociation for bit operations, also handled in reassoc.  */
+/* (X & Y) & Y -> X & Y
+   (X | Y) | Y -> X | Y  */
+(for op (bit_and bit_ior)
+ (simplify
+  (op:c (convert?@2 (op:c @0 @1)) (convert? @1))
+  @2))
+/* (X ^ Y) ^ Y -> X  */
+(simplify
+ (bit_xor:c (convert? (bit_xor:c @0 @1)) (convert? @1))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+  (convert @0)))
+/* (X & Y) & (X & Z) -> (X & Y) & Z
+   (X | Y) | (X | Z) -> (X | Y) | Z  */
+(for op (bit_and bit_ior)
+ (simplify
+  (op:c (convert1?@3 (op:c@4 @0 @1)) (convert2?@5 (op:c@6 @0 @2)))
+  (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
+       && tree_nop_conversion_p (type, TREE_TYPE (@2)))
+   (if (single_use (@5) && single_use (@6))
+    (op @3 (convert @2))
+    (if (single_use (@3) && single_use (@4))
+     (op (convert @1) @5))))))
+/* (X ^ Y) ^ (X ^ Z) -> Y ^ Z  */
+(simplify
+ (bit_xor (convert1? (bit_xor:c @0 @1)) (convert2? (bit_xor:c @0 @2)))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
+      && tree_nop_conversion_p (type, TREE_TYPE (@2)))
+  (convert (bit_xor @1 @2))))
 
 (simplify
  (abs (abs@1 @0))
  @1)
 (simplify
  (abs (negate @0))
  (abs @0))
 (simplify
  (abs tree_expr_nonnegative_p@0)
  @0)
Index: gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c	(working copy)
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-forwprop1-details -fdump-tree-ccp1-details" } */
+
+int f1(int a, int b){
+  int c = a & b;
+  return c & b;
+}
+int f2(int a, int b){
+  int c = a | b;
+  return b | c;
+}
+int g1(int a, int b, int c){
+  int d = a & b;
+  int e = b & c;
+  return d & e;
+}
+int g2(int a, int b, int c){
+  int d = a | b;
+  int e = c | b;
+  return d | e;
+}
+int g3(int a, int b, int c){
+  int d = a ^ b;
+  int e = b ^ c;
+  return e ^ d;
+}
+
+/* { dg-final { scan-tree-dump-times "Match-and-simplified" 2 "ccp1" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified" 3 "forwprop1" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr69270.c	(revision 236047)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr69270.c	(working copy)
@@ -1,21 +1,23 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fsplit-paths -fdump-tree-dom3-details" } */
 
 /* There should be two references to bufferstep that turn into
    constants.  */
 /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with constant .0." 1 "dom3"} } */
 /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with constant .1." 1 "dom3"} } */
 
 /* And some assignments ought to fold down to constants.  */
-/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 2 "dom3"} } */
-/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 2 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -1;" 1 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -2;" 1 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 1 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 1 "dom3"} } */
 
 /* The XOR operations should have been optimized to constants.  */
 /* { dg-final { scan-tree-dump-not "bit_xor" "dom3"} } */
 
 
 extern int *stepsizeTable;
 
 void
 adpcm_coder (signed char *outdata, int len)
 {
Index: gcc/testsuite/gcc.dg/tree-ssa/vrp59.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/vrp59.c	(revision 236047)
+++ gcc/testsuite/gcc.dg/tree-ssa/vrp59.c	(working copy)
@@ -1,12 +1,12 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fno-tree-ccp -fno-tree-forwprop -fdump-tree-vrp1" } */
 
 int f(int x)
 {
   if (x >= 0 && x <= 3)
     {
       x = x ^ 3;
       x = x & 3;
     }
   return x;
 }

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR)
  2016-05-10  6:12                 ` Marc Glisse
@ 2016-05-10  9:27                   ` Richard Biener
  2016-05-11 13:52                   ` H.J. Lu
  1 sibling, 0 replies; 22+ messages in thread
From: Richard Biener @ 2016-05-10  9:27 UTC (permalink / raw)
  To: Marc Glisse; +Cc: GCC Patches

On Tue, May 10, 2016 at 8:11 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Fri, 6 May 2016, Marc Glisse wrote:
>
>> Here they are. I did (X&Y)&X and (X&Y)&(X&Z). The next one would be
>> ((X&Y)&Z)&X, but at some point we have to defer to reassoc.
>>
>> I didn't add the convert?+tree_nop_conversion_p to the existing transform
>> I modified. I guess at some point we should make a pass and add them to all
>> the transformations on bit operations...
>>
>> For (X & Y) & Y, I believe that any conversion is fine. For the others,
>> tree_nop_conversion_p is probably too strict (narrowing should be fine for
>> all), but I was too lazy to look for tighter conditions.
>>
>> (X ^ Y) ^ Y -> X should probably have (non_lvalue ...) on its output, but
>> in a simple test it didn't seem to matter. Is non_lvalue still needed?
>>
>>
>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>>
>> 2016-05-06  Marc Glisse  <marc.glisse@inria.fr>
>>
>> gcc/
>>         * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge
>> with...
>>         * match.pd ((X & Y) ^ Y): ... this.
>>         ((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y)
>>         | (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations.
>>
>> gcc/testsuite/
>>         * gcc.dg/tree-ssa/bit-assoc.c: New testcase.
>>         * gcc.dg/tree-ssa/pr69270.c: Adjust.
>>         * gcc.dg/tree-ssa/vrp59.c: Disable forwprop.
>
>
> Here it is again, I just replaced convert with convert[12] in the last 2
> transforms. This should matter for (unsigned)(si & 42) & (ui & 42u). I
> didn't change it in the other transform, because it would only matter in the
> case (T)(X & CST) & CST, which I think would be better served by extending
> the transform that currently handles (X & CST1) & CST2 (not done in this
> patch).

Ok.

Thanks!
Richard.

> --
> Marc Glisse
> Index: gcc/fold-const.c
> ===================================================================
> --- gcc/fold-const.c    (revision 236047)
> +++ gcc/fold-const.c    (working copy)
> @@ -10064,59 +10064,20 @@ fold_binary_loc (location_t loc,
>         }
>        /* Fold !X & 1 as X == 0.  */
>        if (TREE_CODE (arg0) == TRUTH_NOT_EXPR
>           && integer_onep (arg1))
>         {
>           tem = TREE_OPERAND (arg0, 0);
>           return fold_build2_loc (loc, EQ_EXPR, type, tem,
>                                   build_zero_cst (TREE_TYPE (tem)));
>         }
>
> -      /* Fold (X ^ Y) & Y as ~X & Y.  */
> -      if (TREE_CODE (arg0) == BIT_XOR_EXPR
> -         && operand_equal_p (TREE_OPERAND (arg0, 1), arg1, 0))
> -       {
> -         tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
> -         return fold_build2_loc (loc, BIT_AND_EXPR, type,
> -                             fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem),
> -                             fold_convert_loc (loc, type, arg1));
> -       }
> -      /* Fold (X ^ Y) & X as ~Y & X.  */
> -      if (TREE_CODE (arg0) == BIT_XOR_EXPR
> -         && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0)
> -         && reorder_operands_p (TREE_OPERAND (arg0, 1), arg1))
> -       {
> -         tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
> -         return fold_build2_loc (loc, BIT_AND_EXPR, type,
> -                             fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem),
> -                             fold_convert_loc (loc, type, arg1));
> -       }
> -      /* Fold X & (X ^ Y) as X & ~Y.  */
> -      if (TREE_CODE (arg1) == BIT_XOR_EXPR
> -         && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
> -       {
> -         tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
> -         return fold_build2_loc (loc, BIT_AND_EXPR, type,
> -                             fold_convert_loc (loc, type, arg0),
> -                             fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem));
> -       }
> -      /* Fold X & (Y ^ X) as ~Y & X.  */
> -      if (TREE_CODE (arg1) == BIT_XOR_EXPR
> -         && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0)
> -         && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
> -       {
> -         tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
> -         return fold_build2_loc (loc, BIT_AND_EXPR, type,
> -                             fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem),
> -                             fold_convert_loc (loc, type, arg0));
> -       }
> -
>        /* Fold (X * Y) & -(1 << CST) to X * Y if Y is a constant
>           multiple of 1 << CST.  */
>        if (TREE_CODE (arg1) == INTEGER_CST)
>         {
>           wide_int cst1 = arg1;
>           wide_int ncst1 = -cst1;
>           if ((cst1 & ncst1) == ncst1
>               && multiple_of_p (type, arg0,
>                                 wide_int_to_tree (TREE_TYPE (arg1), ncst1)))
>             return fold_convert_loc (loc, type, arg0);
> Index: gcc/match.pd
> ===================================================================
> --- gcc/match.pd        (revision 236047)
> +++ gcc/match.pd        (working copy)
> @@ -667,39 +667,70 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
>        && tree_nop_conversion_p (type, TREE_TYPE (@1)))
>    (bit_xor (convert @0) (convert @1))))
>
>  /* Convert ~X ^ C to X ^ ~C.  */
>  (simplify
>   (bit_xor (convert? (bit_not @0)) INTEGER_CST@1)
>   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>    (bit_xor (convert @0) (bit_not @1))))
>
> -/* Fold (X & Y) ^ Y as ~X & Y.  */
> -(simplify
> - (bit_xor:c (bit_and:c @0 @1) @1)
> - (bit_and (bit_not @0) @1))
> +/* Fold (X & Y) ^ Y and (X ^ Y) & Y as ~X & Y.  */
> +(for opo (bit_and bit_xor)
> +     opi (bit_xor bit_and)
> + (simplify
> +  (opo:c (opi:c @0 @1) @1)
> +  (bit_and (bit_not @0) @1)))
>
>  /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
>     operands are another bit-wise operation with a common input.  If so,
>     distribute the bit operations to save an operation and possibly two if
>     constants are involved.  For example, convert
>       (A | B) & (A | C) into A | (B & C)
>     Further simplification will occur if B and C are constants.  */
>  (for op (bit_and bit_ior bit_xor)
>       rop (bit_ior bit_and bit_and)
>   (simplify
>    (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2)))
>    (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
>         && tree_nop_conversion_p (type, TREE_TYPE (@2)))
>     (rop (convert @0) (op (convert @1) (convert @2))))))
>
> +/* Some simple reassociation for bit operations, also handled in reassoc.
> */
> +/* (X & Y) & Y -> X & Y
> +   (X | Y) | Y -> X | Y  */
> +(for op (bit_and bit_ior)
> + (simplify
> +  (op:c (convert?@2 (op:c @0 @1)) (convert? @1))
> +  @2))
> +/* (X ^ Y) ^ Y -> X  */
> +(simplify
> + (bit_xor:c (convert? (bit_xor:c @0 @1)) (convert? @1))
> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> +  (convert @0)))
> +/* (X & Y) & (X & Z) -> (X & Y) & Z
> +   (X | Y) | (X | Z) -> (X | Y) | Z  */
> +(for op (bit_and bit_ior)
> + (simplify
> +  (op:c (convert1?@3 (op:c@4 @0 @1)) (convert2?@5 (op:c@6 @0 @2)))
> +  (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
> +       && tree_nop_conversion_p (type, TREE_TYPE (@2)))
> +   (if (single_use (@5) && single_use (@6))
> +    (op @3 (convert @2))
> +    (if (single_use (@3) && single_use (@4))
> +     (op (convert @1) @5))))))
> +/* (X ^ Y) ^ (X ^ Z) -> Y ^ Z  */
> +(simplify
> + (bit_xor (convert1? (bit_xor:c @0 @1)) (convert2? (bit_xor:c @0 @2)))
> + (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
> +      && tree_nop_conversion_p (type, TREE_TYPE (@2)))
> +  (convert (bit_xor @1 @2))))
>
>  (simplify
>   (abs (abs@1 @0))
>   @1)
>  (simplify
>   (abs (negate @0))
>   (abs @0))
>  (simplify
>   (abs tree_expr_nonnegative_p@0)
>   @0)
> Index: gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c   (revision 0)
> +++ gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c   (working copy)
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-forwprop1-details -fdump-tree-ccp1-details"
> } */
> +
> +int f1(int a, int b){
> +  int c = a & b;
> +  return c & b;
> +}
> +int f2(int a, int b){
> +  int c = a | b;
> +  return b | c;
> +}
> +int g1(int a, int b, int c){
> +  int d = a & b;
> +  int e = b & c;
> +  return d & e;
> +}
> +int g2(int a, int b, int c){
> +  int d = a | b;
> +  int e = c | b;
> +  return d | e;
> +}
> +int g3(int a, int b, int c){
> +  int d = a ^ b;
> +  int e = b ^ c;
> +  return e ^ d;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Match-and-simplified" 2 "ccp1" } } */
> +/* { dg-final { scan-tree-dump-times "gimple_simplified" 3 "forwprop1" } }
> */
> Index: gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/pr69270.c     (revision 236047)
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr69270.c     (working copy)
> @@ -1,21 +1,23 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -fsplit-paths -fdump-tree-dom3-details" } */
>
>  /* There should be two references to bufferstep that turn into
>     constants.  */
>  /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with
> constant .0." 1 "dom3"} } */
>  /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with
> constant .1." 1 "dom3"} } */
>
>  /* And some assignments ought to fold down to constants.  */
> -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 2 "dom3"}
> } */
> -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 2 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -1;" 1 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -2;" 1 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 1 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 1 "dom3"}
> } */
>
>  /* The XOR operations should have been optimized to constants.  */
>  /* { dg-final { scan-tree-dump-not "bit_xor" "dom3"} } */
>
>
>  extern int *stepsizeTable;
>
>  void
>  adpcm_coder (signed char *outdata, int len)
>  {
> Index: gcc/testsuite/gcc.dg/tree-ssa/vrp59.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/vrp59.c       (revision 236047)
> +++ gcc/testsuite/gcc.dg/tree-ssa/vrp59.c       (working copy)
> @@ -1,12 +1,12 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp1" } */
> +/* { dg-options "-O2 -fno-tree-ccp -fno-tree-forwprop -fdump-tree-vrp1" }
> */
>
>  int f(int x)
>  {
>    if (x >= 0 && x <= 3)
>      {
>        x = x ^ 3;
>        x = x & 3;
>      }
>    return x;
>  }
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR)
  2016-05-10  6:12                 ` Marc Glisse
  2016-05-10  9:27                   ` Richard Biener
@ 2016-05-11 13:52                   ` H.J. Lu
  2016-05-11 16:17                     ` Marc Glisse
  1 sibling, 1 reply; 22+ messages in thread
From: H.J. Lu @ 2016-05-11 13:52 UTC (permalink / raw)
  To: Marc Glisse; +Cc: Richard Biener, GCC Patches

On Mon, May 9, 2016 at 11:11 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Fri, 6 May 2016, Marc Glisse wrote:
>
>> Here they are. I did (X&Y)&X and (X&Y)&(X&Z). The next one would be
>> ((X&Y)&Z)&X, but at some point we have to defer to reassoc.
>>
>> I didn't add the convert?+tree_nop_conversion_p to the existing transform
>> I modified. I guess at some point we should make a pass and add them to all
>> the transformations on bit operations...
>>
>> For (X & Y) & Y, I believe that any conversion is fine. For the others,
>> tree_nop_conversion_p is probably too strict (narrowing should be fine for
>> all), but I was too lazy to look for tighter conditions.
>>
>> (X ^ Y) ^ Y -> X should probably have (non_lvalue ...) on its output, but
>> in a simple test it didn't seem to matter. Is non_lvalue still needed?
>>
>>
>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>>
>> 2016-05-06  Marc Glisse  <marc.glisse@inria.fr>
>>
>> gcc/
>>         * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge
>> with...
>>         * match.pd ((X & Y) ^ Y): ... this.
>>         ((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y)
>>         | (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations.
>>
>> gcc/testsuite/
>>         * gcc.dg/tree-ssa/bit-assoc.c: New testcase.
>>         * gcc.dg/tree-ssa/pr69270.c: Adjust.
>>         * gcc.dg/tree-ssa/vrp59.c: Disable forwprop.
>
>
> Here it is again, I just replaced convert with convert[12] in the last 2
> transforms. This should matter for (unsigned)(si & 42) & (ui & 42u). I
> didn't change it in the other transform, because it would only matter in the
> case (T)(X & CST) & CST, which I think would be better served by extending
> the transform that currently handles (X & CST1) & CST2 (not done in this
> patch).

It caused:

FAIL: gcc.dg/tree-ssa/vrp47.c scan-tree-dump-times vrp2 " & 1;" 0

on x86.

H.J.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR)
  2016-05-11 13:52                   ` H.J. Lu
@ 2016-05-11 16:17                     ` Marc Glisse
  2016-05-11 16:26                       ` Simple bitop reassoc in match.pd Jeff Law
  0 siblings, 1 reply; 22+ messages in thread
From: Marc Glisse @ 2016-05-11 16:17 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Richard Biener, GCC Patches

On Wed, 11 May 2016, H.J. Lu wrote:

>>>         * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge with...
>>>         * match.pd ((X & Y) ^ Y): ... this.

> It caused:
>
> FAIL: gcc.dg/tree-ssa/vrp47.c scan-tree-dump-times vrp2 " & 1;" 0
>
> on x86.

Ah, yes, logical_op_short_circuit so I didn't test it. Hmm, doesn't seem 
so easy. We want to compute (int)(x==0) in a branch where x is known to be 
in [0, 1]. VRP1 gives (int)(_Bool)(x^1). forwprop3 handles the double 
conversion, which gives (x^1)&1. Here the new transform applies and gives 
(~x)&1. VRP2 comes, and with (x^1)&1 it used to notice that this is just 
x^1 (remember that x is in [0, 1] in this branch), while it doesn't know 
what to do with (~x)&1. In the end, we get
 	notl	%eax
 	andl	$1, %eax
instead of
 	xorl	$1, %eax

(andn doesn't seem to have a version with an immediate)

The transformation seems right to me, but we are then missing another 
transformation like ~X & Y -> X ^ Y when we know that X & ~Y is 0 (the 1 
bits of X are included in those of Y). That may not be easy with the 
limited bit tracking we have. A version limited to constant Y of the form 
2^n-1 (i.e. 0...01...1) where X has a range included in [0, Y] may be 
simpler.

We could also simplify (int)(_Bool)x to x using VRP information that x is 
in [0, 1], but apparently when VRP replaces x==0 with y=x^1,(_Bool)y, it 
does not compute a range for the new variable y, and by the time the next 
VRP pass comes, it is too late.

In the mean time, I propose xfailing this test...

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Simple bitop reassoc in match.pd
  2016-05-11 16:17                     ` Marc Glisse
@ 2016-05-11 16:26                       ` Jeff Law
  2016-05-11 17:56                         ` Marc Glisse
  2016-05-12  5:26                         ` Marc Glisse
  0 siblings, 2 replies; 22+ messages in thread
From: Jeff Law @ 2016-05-11 16:26 UTC (permalink / raw)
  To: Marc Glisse, H.J. Lu; +Cc: Richard Biener, GCC Patches

On 05/11/2016 10:17 AM, Marc Glisse wrote:
> The transformation seems right to me, but we are then missing another
> transformation like ~X & Y -> X ^ Y when we know that X & ~Y is 0 (the 1
> bits of X are included in those of Y). That may not be easy with the
> limited bit tracking we have. A version limited to constant Y of the
> form 2^n-1 (i.e. 0...01...1) where X has a range included in [0, Y] may
> be simpler.
While we don't have strong bit tracking, we do track [0..1] ranges 
reasonably well, so it may be worth doing.

> We could also simplify (int)(_Bool)x to x using VRP information that x
> is in [0, 1], but apparently when VRP replaces x==0 with y=x^1,(_Bool)y,
> it does not compute a range for the new variable y, and by the time the
> next VRP pass comes, it is too late.
Seems like a clear oversight.

jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Simple bitop reassoc in match.pd
  2016-05-11 16:26                       ` Simple bitop reassoc in match.pd Jeff Law
@ 2016-05-11 17:56                         ` Marc Glisse
  2016-05-11 20:44                           ` Marc Glisse
  2016-05-12  8:41                           ` Richard Biener
  2016-05-12  5:26                         ` Marc Glisse
  1 sibling, 2 replies; 22+ messages in thread
From: Marc Glisse @ 2016-05-11 17:56 UTC (permalink / raw)
  To: Jeff Law; +Cc: H.J. Lu, Richard Biener, GCC Patches

On Wed, 11 May 2016, Jeff Law wrote:

>> We could also simplify (int)(_Bool)x to x using VRP information that x
>> is in [0, 1], but apparently when VRP replaces x==0 with y=x^1,(_Bool)y,
>> it does not compute a range for the new variable y, and by the time the
>> next VRP pass comes, it is too late.
> Seems like a clear oversight.

In get_value_range, there is:
   /* If we query the range for a new SSA name return an unmodifiable VARYING.
      We should get here at most from the substitute-and-fold stage which
      will never try to change values.  */
so this is a known limitation.

We could try to change that (XRESIZEVEC, memset(0) on the new elements, 
update num_vr_values to the new num_ssa_names, at this point vr_value 
should be replaced with a vector).

We could also use set_range_info and make simplify_conversion_using_ranges 
use get_range_info instead of get_value_range. Might even move the whole 
function to match.pd then ;-)

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Simple bitop reassoc in match.pd
  2016-05-11 17:56                         ` Marc Glisse
@ 2016-05-11 20:44                           ` Marc Glisse
  2016-05-12  8:41                           ` Richard Biener
  1 sibling, 0 replies; 22+ messages in thread
From: Marc Glisse @ 2016-05-11 20:44 UTC (permalink / raw)
  To: Jeff Law; +Cc: H.J. Lu, Richard Biener, GCC Patches

[-- Attachment #1: Type: TEXT/PLAIN, Size: 339 bytes --]

On Wed, 11 May 2016, Marc Glisse wrote:

> We could also use set_range_info and make simplify_conversion_using_ranges 
> use get_range_info instead of get_value_range.

Something like this seems to fix the testcase. I'll try to submit it 
properly, but I don't know when. (I also added the ~X&C transform to my 
TODO-list)

-- 
Marc Glisse

[-- Attachment #2: Type: TEXT/PLAIN, Size: 1882 bytes --]

Index: gcc/tree-vrp.c
===================================================================
--- gcc/tree-vrp.c	(revision 236122)
+++ gcc/tree-vrp.c	(working copy)
@@ -8940,6 +8940,8 @@
       gassign *newop
 	= gimple_build_assign (tem, BIT_XOR_EXPR, op0, op1);
       gsi_insert_before (gsi, newop, GSI_SAME_STMT);
+      if (TYPE_PRECISION (TREE_TYPE (tem)) > 1)
+	set_range_info (tem, VR_RANGE, 0, 1);
       gimple_assign_set_rhs_with_ops (gsi, NOP_EXPR, tem);
     }
   /* Or without.  */
@@ -9648,7 +9650,6 @@
 {
   tree innerop, middleop, finaltype;
   gimple *def_stmt;
-  value_range *innervr;
   signop inner_sgn, middle_sgn, final_sgn;
   unsigned inner_prec, middle_prec, final_prec;
   widest_int innermin, innermed, innermax, middlemin, middlemed, middlemax;
@@ -9666,18 +9667,16 @@
       || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (innerop))
     return false;
 
-  /* Get the value-range of the inner operand.  */
-  innervr = get_value_range (innerop);
-  if (innervr->type != VR_RANGE
-      || TREE_CODE (innervr->min) != INTEGER_CST
-      || TREE_CODE (innervr->max) != INTEGER_CST)
+  /* Get the value-range of the inner operand.  Use get_range_info in
+     case innerop was created during substitute-and-fold.  */
+  wide_int imin, imax;
+  if (get_range_info (innerop, &imin, &imax) != VR_RANGE)
     return false;
+  innermin = widest_int::from (imin, TYPE_SIGN (TREE_TYPE (innerop)));
+  innermax = widest_int::from (imax, TYPE_SIGN (TREE_TYPE (innerop)));
 
   /* Simulate the conversion chain to check if the result is equal if
      the middle conversion is removed.  */
-  innermin = wi::to_widest (innervr->min);
-  innermax = wi::to_widest (innervr->max);
-
   inner_prec = TYPE_PRECISION (TREE_TYPE (innerop));
   middle_prec = TYPE_PRECISION (TREE_TYPE (middleop));
   final_prec = TYPE_PRECISION (finaltype);

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Simple bitop reassoc in match.pd
  2016-05-11 16:26                       ` Simple bitop reassoc in match.pd Jeff Law
  2016-05-11 17:56                         ` Marc Glisse
@ 2016-05-12  5:26                         ` Marc Glisse
  1 sibling, 0 replies; 22+ messages in thread
From: Marc Glisse @ 2016-05-12  5:26 UTC (permalink / raw)
  To: Jeff Law; +Cc: H.J. Lu, Richard Biener, GCC Patches

On Wed, 11 May 2016, Jeff Law wrote:

> On 05/11/2016 10:17 AM, Marc Glisse wrote:
>> The transformation seems right to me, but we are then missing another
>> transformation like ~X & Y -> X ^ Y when we know that X & ~Y is 0 (the 1
>> bits of X are included in those of Y). That may not be easy with the
>> limited bit tracking we have. A version limited to constant Y of the
>> form 2^n-1 (i.e. 0...01...1) where X has a range included in [0, Y] may
>> be simpler.
> While we don't have strong bit tracking, we do track [0..1] ranges reasonably 
> well, so it may be worth doing.

I had started writing

+/* Simplify (~X & Y) to X ^ Y if we know that (X & ~Y) is 0.  */
+#if GIMPLE
+(simplify
+ (bit_and (bit_not SSA_NAME@0) INTEGER_CST@1)
+ (if ((get_nonzero_bits (@0) & wi::bit_not (@1)) == 0)
+  (bit_xor @0 @1)))
+#endif

but then I realized that VRP does not call the simplify machinery, so this 
would have to be rewritten in simplify_bit_ops_using_ranges. The good 
point is that we could then compute:
may_be_nonzero_X & ~must_be_nonzero_Y
instead of assuming that Y is a constant (not that it would change 
anything in practice).

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Simple bitop reassoc in match.pd
  2016-05-11 17:56                         ` Marc Glisse
  2016-05-11 20:44                           ` Marc Glisse
@ 2016-05-12  8:41                           ` Richard Biener
  2016-05-12 16:03                             ` Marc Glisse
  1 sibling, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-05-12  8:41 UTC (permalink / raw)
  To: Marc Glisse; +Cc: Jeff Law, H.J. Lu, GCC Patches

On Wed, May 11, 2016 at 7:56 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Wed, 11 May 2016, Jeff Law wrote:
>
>>> We could also simplify (int)(_Bool)x to x using VRP information that x
>>> is in [0, 1], but apparently when VRP replaces x==0 with y=x^1,(_Bool)y,
>>> it does not compute a range for the new variable y, and by the time the
>>> next VRP pass comes, it is too late.
>>
>> Seems like a clear oversight.
>
>
> In get_value_range, there is:
>   /* If we query the range for a new SSA name return an unmodifiable
> VARYING.
>      We should get here at most from the substitute-and-fold stage which
>      will never try to change values.  */
> so this is a known limitation.
>
> We could try to change that (XRESIZEVEC, memset(0) on the new elements,
> update num_vr_values to the new num_ssa_names, at this point vr_value should
> be replaced with a vector).
>
> We could also use set_range_info and make simplify_conversion_using_ranges
> use get_range_info instead of get_value_range. Might even move the whole
> function to match.pd then ;-)

Yeah - note that VRP already calls set_range_info before simplifying
stmts.  It's
just that substitute_and_fold doesn't apply fold_stmt (and thus match.pd) to
all stmts but it only applies the pass specific "fold" (vrp_fold_stmt)
to all stmts.

Richard.

> --
> Marc Glisse

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Simple bitop reassoc in match.pd
  2016-05-12  8:41                           ` Richard Biener
@ 2016-05-12 16:03                             ` Marc Glisse
  2016-05-12 16:51                               ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Marc Glisse @ 2016-05-12 16:03 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On Thu, 12 May 2016, Richard Biener wrote:

> Yeah - note that VRP already calls set_range_info before simplifying 
> stmts.  It's just that substitute_and_fold doesn't apply fold_stmt (and 
> thus match.pd) to all stmts but it only applies the pass specific "fold" 
> (vrp_fold_stmt) to all stmts.

Just to be sure: is the fact that VRP doesn't apply fold_stmt on purpose? 
The restriction makes sense, it is just that it may yield a bit of 
duplication. We already indirectly use get_range_info in match.pd and may 
miss out on opportunities that only occur in branches during the VRP pass.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Simple bitop reassoc in match.pd
  2016-05-12 16:03                             ` Marc Glisse
@ 2016-05-12 16:51                               ` Richard Biener
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Biener @ 2016-05-12 16:51 UTC (permalink / raw)
  To: gcc-patches, Marc Glisse

On May 12, 2016 6:02:47 PM GMT+02:00, Marc Glisse <marc.glisse@inria.fr> wrote:
>On Thu, 12 May 2016, Richard Biener wrote:
>
>> Yeah - note that VRP already calls set_range_info before simplifying 
>> stmts.  It's just that substitute_and_fold doesn't apply fold_stmt
>(and 
>> thus match.pd) to all stmts but it only applies the pass specific
>"fold" 
>> (vrp_fold_stmt) to all stmts.
>
>Just to be sure: is the fact that VRP doesn't apply fold_stmt on
>purpose? 

The propagator only folds stmts that had operands replaced (that doesn't enable all simplifications as match.PD patterns cover more than one statement).

>The restriction makes sense, it is just that it may yield a bit of 
>duplication. We already indirectly use get_range_info in match.pd and
>may 
>miss out on opportunities that only occur in branches during the VRP
>pass.

Yes.  The pass specific fold is called on each stmt.  Maybe we can omit the propagators folding if the pass specific folding applied.

Richard.


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2016-05-12 16:51 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-30 18:45 Canonicalize X u< X to UNORDERED_EXPR Marc Glisse
2016-05-02  8:37 ` Richard Biener
2016-05-02  9:19   ` Marc Glisse
2016-05-02  9:45     ` Richard Biener
2016-05-03  6:37       ` Marc Glisse
2016-05-03 11:03         ` Richard Biener
2016-05-03 13:27           ` Marc Glisse
2016-05-03 13:34             ` Richard Biener
2016-05-06 11:50               ` Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) Marc Glisse
2016-05-08 20:49                 ` Marc Glisse
2016-05-09 10:04                 ` Richard Biener
2016-05-10  6:12                 ` Marc Glisse
2016-05-10  9:27                   ` Richard Biener
2016-05-11 13:52                   ` H.J. Lu
2016-05-11 16:17                     ` Marc Glisse
2016-05-11 16:26                       ` Simple bitop reassoc in match.pd Jeff Law
2016-05-11 17:56                         ` Marc Glisse
2016-05-11 20:44                           ` Marc Glisse
2016-05-12  8:41                           ` Richard Biener
2016-05-12 16:03                             ` Marc Glisse
2016-05-12 16:51                               ` Richard Biener
2016-05-12  5:26                         ` Marc Glisse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).