* Canonicalize X u< X to UNORDERED_EXPR
@ 2016-04-30 18:45 Marc Glisse
2016-05-02 8:37 ` Richard Biener
0 siblings, 1 reply; 22+ messages in thread
From: Marc Glisse @ 2016-04-30 18:45 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: TEXT/PLAIN, Size: 496 bytes --]
Hello,
this case seemed to be missing in the various X cmp X transformations. It
does not change the generated code in the testcase.
The missing :c is rather trivial. I can commit it separately if you
prefer.
Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
2016-05-02 Marc Glisse <marc.glisse@inria.fr>
gcc/
* match.pd ((A & B) OP (C & B)): Mark '&' as commutative.
(X u< X, X u> X): New transformations
gcc/testsuite/
* gcc.dg/tree-ssa/unord.c: New testcase.
--
Marc Glisse
[-- Attachment #2: Type: TEXT/PLAIN, Size: 2159 bytes --]
Index: trunk/gcc/match.pd
===================================================================
--- trunk/gcc/match.pd (revision 235654)
+++ trunk/gcc/match.pd (working copy)
@@ -783,21 +783,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
@0)
/* (~x | y) & x -> x & y */
/* (~x & y) | x -> x | y */
(simplify
(bitop:c (rbitop:c (bit_not @0) @1) @0)
(bitop @0 @1)))
/* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
(for bitop (bit_and bit_ior bit_xor)
(simplify
- (bitop (bit_and:c @0 @1) (bit_and @2 @1))
+ (bitop (bit_and:c @0 @1) (bit_and:c @2 @1))
(bit_and (bitop @0 @2) @1)))
/* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
(simplify
(bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
(bit_ior (bit_and @0 @2) (bit_and @1 @2)))
/* Combine successive equal operations with constants. */
(for bitop (bit_and bit_ior bit_xor)
(simplify
@@ -1914,20 +1914,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(simplify
(cmp @0 @0)
(if (cmp != NE_EXPR
|| ! FLOAT_TYPE_P (TREE_TYPE (@0))
|| ! HONOR_NANS (@0))
{ constant_boolean_node (false, type); })))
(for cmp (unle unge uneq)
(simplify
(cmp @0 @0)
{ constant_boolean_node (true, type); }))
+(for cmp (unlt ungt)
+ (simplify
+ (cmp @0 @0)
+ (unordered @0 @0)))
(simplify
(ltgt @0 @0)
(if (!flag_trapping_math)
{ constant_boolean_node (false, type); }))
/* Fold ~X op ~Y as Y op X. */
(for cmp (simple_comparison)
(simplify
(cmp (bit_not@2 @0) (bit_not@3 @1))
(if (single_use (@2) && single_use (@3))
Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c
===================================================================
--- trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (revision 0)
+++ trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (working copy)
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+int f(double a){double b=a;return !__builtin_islessequal(a,b);}
+int g(double a){double b=a;return !__builtin_isgreaterequal(a,b);}
+
+/* { dg-final { scan-tree-dump-times " unord " 2 "optimized" } } */
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Canonicalize X u< X to UNORDERED_EXPR
2016-04-30 18:45 Canonicalize X u< X to UNORDERED_EXPR Marc Glisse
@ 2016-05-02 8:37 ` Richard Biener
2016-05-02 9:19 ` Marc Glisse
0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-05-02 8:37 UTC (permalink / raw)
To: Marc Glisse; +Cc: GCC Patches
On Sat, Apr 30, 2016 at 8:44 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> Hello,
>
> this case seemed to be missing in the various X cmp X transformations. It
> does not change the generated code in the testcase.
>
> The missing :c is rather trivial. I can commit it separately if you prefer.
I think it's not missing. Commutating the first one is enough to eventually
make the @1s match up. I think you should get a diagnostic on a duplicate
pattern when adding another :c (hmm, no, it's indeed "different" patterns but
still redundant).
> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
Ok for the new pattern.
Thanks,
Richard.
> 2016-05-02 Marc Glisse <marc.glisse@inria.fr>
>
> gcc/
> * match.pd ((A & B) OP (C & B)): Mark '&' as commutative.
> (X u< X, X u> X): New transformations
>
> gcc/testsuite/
> * gcc.dg/tree-ssa/unord.c: New testcase.
>
> --
> Marc Glisse
> Index: trunk/gcc/match.pd
> ===================================================================
> --- trunk/gcc/match.pd (revision 235654)
> +++ trunk/gcc/match.pd (working copy)
> @@ -783,21 +783,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> @0)
> /* (~x | y) & x -> x & y */
> /* (~x & y) | x -> x | y */
> (simplify
> (bitop:c (rbitop:c (bit_not @0) @1) @0)
> (bitop @0 @1)))
>
> /* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
> (for bitop (bit_and bit_ior bit_xor)
> (simplify
> - (bitop (bit_and:c @0 @1) (bit_and @2 @1))
> + (bitop (bit_and:c @0 @1) (bit_and:c @2 @1))
> (bit_and (bitop @0 @2) @1)))
>
> /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
> (simplify
> (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
> (bit_ior (bit_and @0 @2) (bit_and @1 @2)))
>
> /* Combine successive equal operations with constants. */
> (for bitop (bit_and bit_ior bit_xor)
> (simplify
> @@ -1914,20 +1914,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (simplify
> (cmp @0 @0)
> (if (cmp != NE_EXPR
> || ! FLOAT_TYPE_P (TREE_TYPE (@0))
> || ! HONOR_NANS (@0))
> { constant_boolean_node (false, type); })))
> (for cmp (unle unge uneq)
> (simplify
> (cmp @0 @0)
> { constant_boolean_node (true, type); }))
> +(for cmp (unlt ungt)
> + (simplify
> + (cmp @0 @0)
> + (unordered @0 @0)))
> (simplify
> (ltgt @0 @0)
> (if (!flag_trapping_math)
> { constant_boolean_node (false, type); }))
>
> /* Fold ~X op ~Y as Y op X. */
> (for cmp (simple_comparison)
> (simplify
> (cmp (bit_not@2 @0) (bit_not@3 @1))
> (if (single_use (@2) && single_use (@3))
> Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c
> ===================================================================
> --- trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (revision 0)
> +++ trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (working copy)
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +int f(double a){double b=a;return !__builtin_islessequal(a,b);}
> +int g(double a){double b=a;return !__builtin_isgreaterequal(a,b);}
> +
> +/* { dg-final { scan-tree-dump-times " unord " 2 "optimized" } } */
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Canonicalize X u< X to UNORDERED_EXPR
2016-05-02 8:37 ` Richard Biener
@ 2016-05-02 9:19 ` Marc Glisse
2016-05-02 9:45 ` Richard Biener
0 siblings, 1 reply; 22+ messages in thread
From: Marc Glisse @ 2016-05-02 9:19 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
On Mon, 2 May 2016, Richard Biener wrote:
> On Sat, Apr 30, 2016 at 8:44 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> Hello,
>>
>> this case seemed to be missing in the various X cmp X transformations. It
>> does not change the generated code in the testcase.
>>
>> The missing :c is rather trivial. I can commit it separately if you prefer.
>
> I think it's not missing. Commutating the first one is enough to eventually
> make the @1s match up. I think you should get a diagnostic on a duplicate
> pattern when adding another :c (hmm, no, it's indeed "different" patterns but
> still redundant).
Let's see. The current pattern is
(bitop (bit_and:c @0 @1) (bit_and @2 @1))
This matches:
(X & Y) ^ (Z & Y)
(Y & X) ^ (Z & Y)
If I have for instance (Y & X) ^ (Y & Z), I don't see how that is going to
match, and indeed we don't simplify that. On the other hand, if I have
bit_ior instead of bit_xor, then we have another very similar
transformation a 100 lines up in match.pd
(for op (bit_and bit_ior)
rop (bit_ior bit_and)
(simplify
(op (convert? (rop:c @0 @1)) (convert? (rop @0 @2)))
That one also commutes only one, but starting from a different match. We
should probably reorganize them a bit.
>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>
> Ok for the new pattern.
>
> Thanks,
> Richard.
>
>> 2016-05-02 Marc Glisse <marc.glisse@inria.fr>
>>
>> gcc/
>> * match.pd ((A & B) OP (C & B)): Mark '&' as commutative.
>> (X u< X, X u> X): New transformations
>>
>> gcc/testsuite/
>> * gcc.dg/tree-ssa/unord.c: New testcase.
>>
>> --
>> Marc Glisse
>> Index: trunk/gcc/match.pd
>> ===================================================================
>> --- trunk/gcc/match.pd (revision 235654)
>> +++ trunk/gcc/match.pd (working copy)
>> @@ -783,21 +783,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>> @0)
>> /* (~x | y) & x -> x & y */
>> /* (~x & y) | x -> x | y */
>> (simplify
>> (bitop:c (rbitop:c (bit_not @0) @1) @0)
>> (bitop @0 @1)))
>>
>> /* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
>> (for bitop (bit_and bit_ior bit_xor)
>> (simplify
>> - (bitop (bit_and:c @0 @1) (bit_and @2 @1))
>> + (bitop (bit_and:c @0 @1) (bit_and:c @2 @1))
>> (bit_and (bitop @0 @2) @1)))
>>
>> /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
>> (simplify
>> (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
>> (bit_ior (bit_and @0 @2) (bit_and @1 @2)))
>>
>> /* Combine successive equal operations with constants. */
>> (for bitop (bit_and bit_ior bit_xor)
>> (simplify
>> @@ -1914,20 +1914,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>> (simplify
>> (cmp @0 @0)
>> (if (cmp != NE_EXPR
>> || ! FLOAT_TYPE_P (TREE_TYPE (@0))
>> || ! HONOR_NANS (@0))
>> { constant_boolean_node (false, type); })))
>> (for cmp (unle unge uneq)
>> (simplify
>> (cmp @0 @0)
>> { constant_boolean_node (true, type); }))
>> +(for cmp (unlt ungt)
>> + (simplify
>> + (cmp @0 @0)
>> + (unordered @0 @0)))
>> (simplify
>> (ltgt @0 @0)
>> (if (!flag_trapping_math)
>> { constant_boolean_node (false, type); }))
>>
>> /* Fold ~X op ~Y as Y op X. */
>> (for cmp (simple_comparison)
>> (simplify
>> (cmp (bit_not@2 @0) (bit_not@3 @1))
>> (if (single_use (@2) && single_use (@3))
>> Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c
>> ===================================================================
>> --- trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (revision 0)
>> +++ trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (working copy)
>> @@ -0,0 +1,7 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O -fdump-tree-optimized" } */
>> +
>> +int f(double a){double b=a;return !__builtin_islessequal(a,b);}
>> +int g(double a){double b=a;return !__builtin_isgreaterequal(a,b);}
>> +
>> +/* { dg-final { scan-tree-dump-times " unord " 2 "optimized" } } */
--
Marc Glisse
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Canonicalize X u< X to UNORDERED_EXPR
2016-05-02 9:19 ` Marc Glisse
@ 2016-05-02 9:45 ` Richard Biener
2016-05-03 6:37 ` Marc Glisse
0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-05-02 9:45 UTC (permalink / raw)
To: Marc Glisse; +Cc: GCC Patches
On Mon, May 2, 2016 at 11:18 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Mon, 2 May 2016, Richard Biener wrote:
>
>> On Sat, Apr 30, 2016 at 8:44 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>
>>> Hello,
>>>
>>> this case seemed to be missing in the various X cmp X transformations. It
>>> does not change the generated code in the testcase.
>>>
>>> The missing :c is rather trivial. I can commit it separately if you
>>> prefer.
>>
>>
>> I think it's not missing. Commutating the first one is enough to
>> eventually
>> make the @1s match up. I think you should get a diagnostic on a duplicate
>> pattern when adding another :c (hmm, no, it's indeed "different" patterns
>> but
>> still redundant).
>
>
> Let's see. The current pattern is
> (bitop (bit_and:c @0 @1) (bit_and @2 @1))
>
> This matches:
> (X & Y) ^ (Z & Y)
> (Y & X) ^ (Z & Y)
>
> If I have for instance (Y & X) ^ (Y & Z), I don't see how that is going to
> match, and indeed we don't simplify that.
Hmm, indeed. I might have wrongly ommitted some :c then in other places
as well. So the original patch is ok.
> On the other hand, if I have
> bit_ior instead of bit_xor, then we have another very similar transformation
> a 100 lines up in match.pd
>
> (for op (bit_and bit_ior)
> rop (bit_ior bit_and)
> (simplify
> (op (convert? (rop:c @0 @1)) (convert? (rop @0 @2)))
>
> That one also commutes only one, but starting from a different match. We
> should probably reorganize them a bit.
Yeah.
Richard.
>
>>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>>
>>
>> Ok for the new pattern.
>>
>> Thanks,
>> Richard.
>>
>>> 2016-05-02 Marc Glisse <marc.glisse@inria.fr>
>>>
>>> gcc/
>>> * match.pd ((A & B) OP (C & B)): Mark '&' as commutative.
>>> (X u< X, X u> X): New transformations
>>>
>>> gcc/testsuite/
>>> * gcc.dg/tree-ssa/unord.c: New testcase.
>>>
>>> --
>>> Marc Glisse
>>> Index: trunk/gcc/match.pd
>>> ===================================================================
>>> --- trunk/gcc/match.pd (revision 235654)
>>> +++ trunk/gcc/match.pd (working copy)
>>> @@ -783,21 +783,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>> @0)
>>> /* (~x | y) & x -> x & y */
>>> /* (~x & y) | x -> x | y */
>>> (simplify
>>> (bitop:c (rbitop:c (bit_not @0) @1) @0)
>>> (bitop @0 @1)))
>>>
>>> /* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
>>> (for bitop (bit_and bit_ior bit_xor)
>>> (simplify
>>> - (bitop (bit_and:c @0 @1) (bit_and @2 @1))
>>> + (bitop (bit_and:c @0 @1) (bit_and:c @2 @1))
>>> (bit_and (bitop @0 @2) @1)))
>>>
>>> /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
>>> (simplify
>>> (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
>>> (bit_ior (bit_and @0 @2) (bit_and @1 @2)))
>>>
>>> /* Combine successive equal operations with constants. */
>>> (for bitop (bit_and bit_ior bit_xor)
>>> (simplify
>>> @@ -1914,20 +1914,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>> (simplify
>>> (cmp @0 @0)
>>> (if (cmp != NE_EXPR
>>> || ! FLOAT_TYPE_P (TREE_TYPE (@0))
>>> || ! HONOR_NANS (@0))
>>> { constant_boolean_node (false, type); })))
>>> (for cmp (unle unge uneq)
>>> (simplify
>>> (cmp @0 @0)
>>> { constant_boolean_node (true, type); }))
>>> +(for cmp (unlt ungt)
>>> + (simplify
>>> + (cmp @0 @0)
>>> + (unordered @0 @0)))
>>> (simplify
>>> (ltgt @0 @0)
>>> (if (!flag_trapping_math)
>>> { constant_boolean_node (false, type); }))
>>>
>>> /* Fold ~X op ~Y as Y op X. */
>>> (for cmp (simple_comparison)
>>> (simplify
>>> (cmp (bit_not@2 @0) (bit_not@3 @1))
>>> (if (single_use (@2) && single_use (@3))
>>> Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c
>>> ===================================================================
>>> --- trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (revision 0)
>>> +++ trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (working copy)
>>> @@ -0,0 +1,7 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O -fdump-tree-optimized" } */
>>> +
>>> +int f(double a){double b=a;return !__builtin_islessequal(a,b);}
>>> +int g(double a){double b=a;return !__builtin_isgreaterequal(a,b);}
>>> +
>>> +/* { dg-final { scan-tree-dump-times " unord " 2 "optimized" } } */
>
>
> --
> Marc Glisse
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Canonicalize X u< X to UNORDERED_EXPR
2016-05-02 9:45 ` Richard Biener
@ 2016-05-03 6:37 ` Marc Glisse
2016-05-03 11:03 ` Richard Biener
0 siblings, 1 reply; 22+ messages in thread
From: Marc Glisse @ 2016-05-03 6:37 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
[-- Attachment #1: Type: TEXT/PLAIN, Size: 535 bytes --]
This removes the duplication. I also removed the case (A&B)&(A&C) which is
handled by reassoc. And I need 2 NOP checks, for the case where @0 is a
constant (that couldn't happen before my patch because canonicalization
would put the constant as second operand).
Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
2016-05-03 Marc Glisse <marc.glisse@inria.fr>
* match.pd ((A | B) & (A | C)): Generalize to BIT_XOR_EXPR. Mark
as commutative. Check both conversions are NOP.
((A & B) OP (C & B)): Remove.
--
Marc Glisse
[-- Attachment #2: Type: TEXT/PLAIN, Size: 2135 bytes --]
Index: gcc/match.pd
===================================================================
--- gcc/match.pd (revision 235764)
+++ gcc/match.pd (working copy)
@@ -678,25 +678,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(simplify
(bit_xor:c (bit_and:c @0 @1) @1)
(bit_and (bit_not @0) @1))
/* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
operands are another bit-wise operation with a common input. If so,
distribute the bit operations to save an operation and possibly two if
constants are involved. For example, convert
(A | B) & (A | C) into A | (B & C)
Further simplification will occur if B and C are constants. */
-(for op (bit_and bit_ior)
- rop (bit_ior bit_and)
+(for op (bit_and bit_ior bit_xor)
+ rop (bit_ior bit_and bit_and)
(simplify
- (op (convert? (rop:c @0 @1)) (convert? (rop @0 @2)))
- (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+ (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2)))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
+ && tree_nop_conversion_p (type, TREE_TYPE (@2)))
(rop (convert @0) (op (convert @1) (convert @2))))))
(simplify
(abs (abs@1 @0))
@1)
(simplify
(abs (negate @0))
(abs @0))
(simplify
@@ -780,26 +781,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
/* (x & y) | x -> x */
(simplify
(bitop:c (rbitop:c @0 @1) @0)
@0)
/* (~x | y) & x -> x & y */
/* (~x & y) | x -> x | y */
(simplify
(bitop:c (rbitop:c (bit_not @0) @1) @0)
(bitop @0 @1)))
-/* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
-(for bitop (bit_and bit_ior bit_xor)
- (simplify
- (bitop (bit_and:c @0 @1) (bit_and @2 @1))
- (bit_and (bitop @0 @2) @1)))
-
/* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
(simplify
(bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
(bit_ior (bit_and @0 @2) (bit_and @1 @2)))
/* Combine successive equal operations with constants. */
(for bitop (bit_and bit_ior bit_xor)
(simplify
(bitop (bitop @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
(bitop @0 (bitop @1 @2))))
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Canonicalize X u< X to UNORDERED_EXPR
2016-05-03 6:37 ` Marc Glisse
@ 2016-05-03 11:03 ` Richard Biener
2016-05-03 13:27 ` Marc Glisse
0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-05-03 11:03 UTC (permalink / raw)
To: Marc Glisse; +Cc: GCC Patches
On Tue, May 3, 2016 at 8:36 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> This removes the duplication. I also removed the case (A&B)&(A&C) which is
> handled by reassoc. And I need 2 NOP checks, for the case where @0 is a
> constant (that couldn't happen before my patch because canonicalization
> would put the constant as second operand).
Nicely spotted. Not sure we want to delay (A&B)&(A&C) until re-assoc. We have
many patterns that reassoc would also catch, like (A + CST) + CST or (A + B)- A,
albeit reassoc only handles the unsigned cases.
> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
Ok.
Thanks,
Richard.
> 2016-05-03 Marc Glisse <marc.glisse@inria.fr>
>
> * match.pd ((A | B) & (A | C)): Generalize to BIT_XOR_EXPR. Mark
> as commutative. Check both conversions are NOP.
> ((A & B) OP (C & B)): Remove.
>
> --
> Marc Glisse
> Index: gcc/match.pd
> ===================================================================
> --- gcc/match.pd (revision 235764)
> +++ gcc/match.pd (working copy)
> @@ -678,25 +678,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (simplify
> (bit_xor:c (bit_and:c @0 @1) @1)
> (bit_and (bit_not @0) @1))
>
> /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
> operands are another bit-wise operation with a common input. If so,
> distribute the bit operations to save an operation and possibly two if
> constants are involved. For example, convert
> (A | B) & (A | C) into A | (B & C)
> Further simplification will occur if B and C are constants. */
> -(for op (bit_and bit_ior)
> - rop (bit_ior bit_and)
> +(for op (bit_and bit_ior bit_xor)
> + rop (bit_ior bit_and bit_and)
> (simplify
> - (op (convert? (rop:c @0 @1)) (convert? (rop @0 @2)))
> - (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> + (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2)))
> + (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
> + && tree_nop_conversion_p (type, TREE_TYPE (@2)))
> (rop (convert @0) (op (convert @1) (convert @2))))))
>
>
> (simplify
> (abs (abs@1 @0))
> @1)
> (simplify
> (abs (negate @0))
> (abs @0))
> (simplify
> @@ -780,26 +781,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> /* (x & y) | x -> x */
> (simplify
> (bitop:c (rbitop:c @0 @1) @0)
> @0)
> /* (~x | y) & x -> x & y */
> /* (~x & y) | x -> x | y */
> (simplify
> (bitop:c (rbitop:c (bit_not @0) @1) @0)
> (bitop @0 @1)))
>
> -/* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
> -(for bitop (bit_and bit_ior bit_xor)
> - (simplify
> - (bitop (bit_and:c @0 @1) (bit_and @2 @1))
> - (bit_and (bitop @0 @2) @1)))
> -
> /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
> (simplify
> (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
> (bit_ior (bit_and @0 @2) (bit_and @1 @2)))
>
> /* Combine successive equal operations with constants. */
> (for bitop (bit_and bit_ior bit_xor)
> (simplify
> (bitop (bitop @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
> (bitop @0 (bitop @1 @2))))
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Canonicalize X u< X to UNORDERED_EXPR
2016-05-03 11:03 ` Richard Biener
@ 2016-05-03 13:27 ` Marc Glisse
2016-05-03 13:34 ` Richard Biener
0 siblings, 1 reply; 22+ messages in thread
From: Marc Glisse @ 2016-05-03 13:27 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
On Tue, 3 May 2016, Richard Biener wrote:
> On Tue, May 3, 2016 at 8:36 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> This removes the duplication. I also removed the case (A&B)&(A&C) which is
>> handled by reassoc. And I need 2 NOP checks, for the case where @0 is a
>> constant (that couldn't happen before my patch because canonicalization
>> would put the constant as second operand).
>
> Nicely spotted. Not sure we want to delay (A&B)&(A&C) until re-assoc. We have
> many patterns that reassoc would also catch, like (A + CST) + CST or (A + B)- A,
> albeit reassoc only handles the unsigned cases.
(A&B)&A seems simple enough for match.pd, I thought (A&B)&(A&C) was
starting to be a bit specialized. I could easily add it back (making it
work with | at the same time), but then I am not convinced A&(B&C) is the
best output. If A&B or A&C have several uses, then (A&B)&C or B&(A&C) seem
preferable (and we would still have a transformation for (A&cst1)&cst2 so
we wouldn't lose in the case where B and C are constants). We may still
end up having to add some :s to the transformation I just touched.
--
Marc Glisse
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Canonicalize X u< X to UNORDERED_EXPR
2016-05-03 13:27 ` Marc Glisse
@ 2016-05-03 13:34 ` Richard Biener
2016-05-06 11:50 ` Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) Marc Glisse
0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-05-03 13:34 UTC (permalink / raw)
To: Marc Glisse; +Cc: GCC Patches
On Tue, May 3, 2016 at 3:26 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Tue, 3 May 2016, Richard Biener wrote:
>
>> On Tue, May 3, 2016 at 8:36 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>
>>> This removes the duplication. I also removed the case (A&B)&(A&C) which
>>> is
>>> handled by reassoc. And I need 2 NOP checks, for the case where @0 is a
>>> constant (that couldn't happen before my patch because canonicalization
>>> would put the constant as second operand).
>>
>>
>> Nicely spotted. Not sure we want to delay (A&B)&(A&C) until re-assoc. We
>> have
>> many patterns that reassoc would also catch, like (A + CST) + CST or (A +
>> B)- A,
>> albeit reassoc only handles the unsigned cases.
>
>
> (A&B)&A seems simple enough for match.pd, I thought (A&B)&(A&C) was starting
> to be a bit specialized. I could easily add it back (making it work with |
> at the same time), but then I am not convinced A&(B&C) is the best output.
> If A&B or A&C have several uses, then (A&B)&C or B&(A&C) seem preferable
> (and we would still have a transformation for (A&cst1)&cst2 so we wouldn't
> lose in the case where B and C are constants). We may still end up having to
> add some :s to the transformation I just touched.
Yeah, these are always tricky questions. Note that re-assoc won't
handle the case
with multi-use A&C or A&B. The only reason to care for the single-use case is
when we change operations for the mixed operand cases. For the all-&| case
there is only the (usual) consideration about SSA lifetime extension.
So maybe it makes sense to split out the all-&| cases.
Richard.
> --
> Marc Glisse
^ permalink raw reply [flat|nested] 22+ messages in thread
* Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR)
2016-05-03 13:34 ` Richard Biener
@ 2016-05-06 11:50 ` Marc Glisse
2016-05-08 20:49 ` Marc Glisse
` (2 more replies)
0 siblings, 3 replies; 22+ messages in thread
From: Marc Glisse @ 2016-05-06 11:50 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
[-- Attachment #1: Type: TEXT/PLAIN, Size: 2870 bytes --]
On Tue, 3 May 2016, Richard Biener wrote:
> On Tue, May 3, 2016 at 3:26 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> On Tue, 3 May 2016, Richard Biener wrote:
>>
>>> On Tue, May 3, 2016 at 8:36 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>>
>>>> This removes the duplication. I also removed the case (A&B)&(A&C) which
>>>> is
>>>> handled by reassoc. And I need 2 NOP checks, for the case where @0 is a
>>>> constant (that couldn't happen before my patch because canonicalization
>>>> would put the constant as second operand).
>>>
>>>
>>> Nicely spotted. Not sure we want to delay (A&B)&(A&C) until re-assoc. We
>>> have
>>> many patterns that reassoc would also catch, like (A + CST) + CST or (A +
>>> B)- A,
>>> albeit reassoc only handles the unsigned cases.
>>
>>
>> (A&B)&A seems simple enough for match.pd, I thought (A&B)&(A&C) was starting
>> to be a bit specialized. I could easily add it back (making it work with |
>> at the same time), but then I am not convinced A&(B&C) is the best output.
>> If A&B or A&C have several uses, then (A&B)&C or B&(A&C) seem preferable
>> (and we would still have a transformation for (A&cst1)&cst2 so we wouldn't
>> lose in the case where B and C are constants). We may still end up having to
>> add some :s to the transformation I just touched.
>
> Yeah, these are always tricky questions. Note that re-assoc won't
> handle the case
> with multi-use A&C or A&B. The only reason to care for the single-use case is
> when we change operations for the mixed operand cases. For the all-&| case
> there is only the (usual) consideration about SSA lifetime extension.
>
> So maybe it makes sense to split out the all-&| cases.
Here they are. I did (X&Y)&X and (X&Y)&(X&Z). The next one would be
((X&Y)&Z)&X, but at some point we have to defer to reassoc.
I didn't add the convert?+tree_nop_conversion_p to the existing transform
I modified. I guess at some point we should make a pass and add them to
all the transformations on bit operations...
For (X & Y) & Y, I believe that any conversion is fine. For the others,
tree_nop_conversion_p is probably too strict (narrowing should be fine for
all), but I was too lazy to look for tighter conditions.
(X ^ Y) ^ Y -> X should probably have (non_lvalue ...) on its output, but
in a simple test it didn't seem to matter. Is non_lvalue still needed?
Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
2016-05-06 Marc Glisse <marc.glisse@inria.fr>
gcc/
* fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge with...
* match.pd ((X & Y) ^ Y): ... this.
((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y)
| (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations.
gcc/testsuite/
* gcc.dg/tree-ssa/bit-assoc.c: New testcase.
* gcc.dg/tree-ssa/pr69270.c: Adjust.
* gcc.dg/tree-ssa/vrp59.c: Disable forwprop.
--
Marc Glisse
[-- Attachment #2: Type: TEXT/PLAIN, Size: 8281 bytes --]
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c (revision 235933)
+++ gcc/fold-const.c (working copy)
@@ -10063,59 +10063,20 @@ fold_binary_loc (location_t loc,
}
/* Fold !X & 1 as X == 0. */
if (TREE_CODE (arg0) == TRUTH_NOT_EXPR
&& integer_onep (arg1))
{
tem = TREE_OPERAND (arg0, 0);
return fold_build2_loc (loc, EQ_EXPR, type, tem,
build_zero_cst (TREE_TYPE (tem)));
}
- /* Fold (X ^ Y) & Y as ~X & Y. */
- if (TREE_CODE (arg0) == BIT_XOR_EXPR
- && operand_equal_p (TREE_OPERAND (arg0, 1), arg1, 0))
- {
- tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
- return fold_build2_loc (loc, BIT_AND_EXPR, type,
- fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
- fold_convert_loc (loc, type, arg1));
- }
- /* Fold (X ^ Y) & X as ~Y & X. */
- if (TREE_CODE (arg0) == BIT_XOR_EXPR
- && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0)
- && reorder_operands_p (TREE_OPERAND (arg0, 1), arg1))
- {
- tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
- return fold_build2_loc (loc, BIT_AND_EXPR, type,
- fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
- fold_convert_loc (loc, type, arg1));
- }
- /* Fold X & (X ^ Y) as X & ~Y. */
- if (TREE_CODE (arg1) == BIT_XOR_EXPR
- && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
- {
- tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
- return fold_build2_loc (loc, BIT_AND_EXPR, type,
- fold_convert_loc (loc, type, arg0),
- fold_build1_loc (loc, BIT_NOT_EXPR, type, tem));
- }
- /* Fold X & (Y ^ X) as ~Y & X. */
- if (TREE_CODE (arg1) == BIT_XOR_EXPR
- && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0)
- && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
- {
- tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
- return fold_build2_loc (loc, BIT_AND_EXPR, type,
- fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
- fold_convert_loc (loc, type, arg0));
- }
-
/* Fold (X * Y) & -(1 << CST) to X * Y if Y is a constant
multiple of 1 << CST. */
if (TREE_CODE (arg1) == INTEGER_CST)
{
wide_int cst1 = arg1;
wide_int ncst1 = -cst1;
if ((cst1 & ncst1) == ncst1
&& multiple_of_p (type, arg0,
wide_int_to_tree (TREE_TYPE (arg1), ncst1)))
return fold_convert_loc (loc, type, arg0);
Index: gcc/match.pd
===================================================================
--- gcc/match.pd (revision 235933)
+++ gcc/match.pd (working copy)
@@ -667,39 +667,70 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(if (tree_nop_conversion_p (type, TREE_TYPE (@0))
&& tree_nop_conversion_p (type, TREE_TYPE (@1)))
(bit_xor (convert @0) (convert @1))))
/* Convert ~X ^ C to X ^ ~C. */
(simplify
(bit_xor (convert? (bit_not @0)) INTEGER_CST@1)
(if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
(bit_xor (convert @0) (bit_not @1))))
-/* Fold (X & Y) ^ Y as ~X & Y. */
-(simplify
- (bit_xor:c (bit_and:c @0 @1) @1)
- (bit_and (bit_not @0) @1))
+/* Fold (X & Y) ^ Y and (X ^ Y) & Y as ~X & Y. */
+(for opo (bit_and bit_xor)
+ opi (bit_xor bit_and)
+ (simplify
+ (opo:c (opi:c @0 @1) @1)
+ (bit_and (bit_not @0) @1)))
/* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
operands are another bit-wise operation with a common input. If so,
distribute the bit operations to save an operation and possibly two if
constants are involved. For example, convert
(A | B) & (A | C) into A | (B & C)
Further simplification will occur if B and C are constants. */
(for op (bit_and bit_ior bit_xor)
rop (bit_ior bit_and bit_and)
(simplify
(op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2)))
(if (tree_nop_conversion_p (type, TREE_TYPE (@1))
&& tree_nop_conversion_p (type, TREE_TYPE (@2)))
(rop (convert @0) (op (convert @1) (convert @2))))))
+/* Some simple reassociation for bit operations, also handled in reassoc. */
+/* (X & Y) & Y -> X & Y
+ (X | Y) | Y -> X | Y */
+(for op (bit_and bit_ior)
+ (simplify
+ (op:c (convert?@2 (op:c @0 @1)) (convert? @1))
+ @2))
+/* (X ^ Y) ^ Y -> X */
+(simplify
+ (bit_xor:c (convert? (bit_xor:c @0 @1)) (convert? @1))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+ (convert @0)))
+/* (X & Y) & (X & Z) -> (X & Y) & Z
+ (X | Y) | (X | Z) -> (X | Y) | Z */
+(for op (bit_and bit_ior)
+ (simplify
+ (op:c (convert?@3 (op:c@4 @0 @1)) (convert?@5 (op:c@6 @0 @2)))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
+ && tree_nop_conversion_p (type, TREE_TYPE (@2)))
+ (if (single_use (@5) && single_use (@6))
+ (op @3 (convert @2))
+ (if (single_use (@3) && single_use (@4))
+ (op (convert @1) @5))))))
+/* (X ^ Y) ^ (X ^ Z) -> Y ^ Z */
+(simplify
+ (bit_xor (convert? (bit_xor:c @0 @1)) (convert? (bit_xor:c @0 @2)))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
+ && tree_nop_conversion_p (type, TREE_TYPE (@2)))
+ (convert (bit_xor @1 @2))))
(simplify
(abs (abs@1 @0))
@1)
(simplify
(abs (negate @0))
(abs @0))
(simplify
(abs tree_expr_nonnegative_p@0)
@0)
Index: gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (working copy)
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-forwprop1-details -fdump-tree-ccp1-details" } */
+
+int f1(int a, int b){
+ int c = a & b;
+ return c & b;
+}
+int f2(int a, int b){
+ int c = a | b;
+ return b | c;
+}
+int g1(int a, int b, int c){
+ int d = a & b;
+ int e = b & c;
+ return d & e;
+}
+int g2(int a, int b, int c){
+ int d = a | b;
+ int e = c | b;
+ return d | e;
+}
+int g3(int a, int b, int c){
+ int d = a ^ b;
+ int e = b ^ c;
+ return e ^ d;
+}
+
+/* { dg-final { scan-tree-dump-times "Match-and-simplified" 2 "ccp1" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified" 3 "forwprop1" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (revision 235933)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (working copy)
@@ -1,21 +1,23 @@
/* { dg-do compile } */
/* { dg-options "-O2 -fsplit-paths -fdump-tree-dom3-details" } */
/* There should be two references to bufferstep that turn into
constants. */
/* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with constant .0." 1 "dom3"} } */
/* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with constant .1." 1 "dom3"} } */
/* And some assignments ought to fold down to constants. */
-/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 2 "dom3"} } */
-/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 2 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -1;" 1 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -2;" 1 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 1 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 1 "dom3"} } */
/* The XOR operations should have been optimized to constants. */
/* { dg-final { scan-tree-dump-not "bit_xor" "dom3"} } */
extern int *stepsizeTable;
void
adpcm_coder (signed char *outdata, int len)
{
Index: gcc/testsuite/gcc.dg/tree-ssa/vrp59.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (revision 235933)
+++ gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (working copy)
@@ -1,12 +1,12 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fno-tree-ccp -fno-tree-forwprop -fdump-tree-vrp1" } */
int f(int x)
{
if (x >= 0 && x <= 3)
{
x = x ^ 3;
x = x & 3;
}
return x;
}
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR)
2016-05-06 11:50 ` Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) Marc Glisse
@ 2016-05-08 20:49 ` Marc Glisse
2016-05-09 10:04 ` Richard Biener
2016-05-10 6:12 ` Marc Glisse
2 siblings, 0 replies; 22+ messages in thread
From: Marc Glisse @ 2016-05-08 20:49 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
On Fri, 6 May 2016, Marc Glisse wrote:
> 2016-05-06 Marc Glisse <marc.glisse@inria.fr>
>
> gcc/
> * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge with...
> * match.pd ((X & Y) ^ Y): ... this.
> ((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y)
> | (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations.
>
> gcc/testsuite/
> * gcc.dg/tree-ssa/bit-assoc.c: New testcase.
> * gcc.dg/tree-ssa/pr69270.c: Adjust.
> * gcc.dg/tree-ssa/vrp59.c: Disable forwprop.
Oups, I forgot about convert1/convert2 in match.pd, which may be relevant
especially if @0 is a constant in the last transforms. Please ignore the
patch for now, I'll resend after checking it.
--
Marc Glisse
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR)
2016-05-06 11:50 ` Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) Marc Glisse
2016-05-08 20:49 ` Marc Glisse
@ 2016-05-09 10:04 ` Richard Biener
2016-05-10 6:12 ` Marc Glisse
2 siblings, 0 replies; 22+ messages in thread
From: Richard Biener @ 2016-05-09 10:04 UTC (permalink / raw)
To: Marc Glisse; +Cc: GCC Patches
On Fri, May 6, 2016 at 1:50 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Tue, 3 May 2016, Richard Biener wrote:
>
>> On Tue, May 3, 2016 at 3:26 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>
>>> On Tue, 3 May 2016, Richard Biener wrote:
>>>
>>>> On Tue, May 3, 2016 at 8:36 AM, Marc Glisse <marc.glisse@inria.fr>
>>>> wrote:
>>>>>
>>>>>
>>>>> This removes the duplication. I also removed the case (A&B)&(A&C) which
>>>>> is
>>>>> handled by reassoc. And I need 2 NOP checks, for the case where @0 is a
>>>>> constant (that couldn't happen before my patch because canonicalization
>>>>> would put the constant as second operand).
>>>>
>>>>
>>>>
>>>> Nicely spotted. Not sure we want to delay (A&B)&(A&C) until re-assoc.
>>>> We
>>>> have
>>>> many patterns that reassoc would also catch, like (A + CST) + CST or (A
>>>> +
>>>> B)- A,
>>>> albeit reassoc only handles the unsigned cases.
>>>
>>>
>>>
>>> (A&B)&A seems simple enough for match.pd, I thought (A&B)&(A&C) was
>>> starting
>>> to be a bit specialized. I could easily add it back (making it work with
>>> |
>>> at the same time), but then I am not convinced A&(B&C) is the best
>>> output.
>>> If A&B or A&C have several uses, then (A&B)&C or B&(A&C) seem preferable
>>> (and we would still have a transformation for (A&cst1)&cst2 so we
>>> wouldn't
>>> lose in the case where B and C are constants). We may still end up having
>>> to
>>> add some :s to the transformation I just touched.
>>
>>
>> Yeah, these are always tricky questions. Note that re-assoc won't
>> handle the case
>> with multi-use A&C or A&B. The only reason to care for the single-use
>> case is
>> when we change operations for the mixed operand cases. For the all-&|
>> case
>> there is only the (usual) consideration about SSA lifetime extension.
>>
>> So maybe it makes sense to split out the all-&| cases.
>
>
> Here they are. I did (X&Y)&X and (X&Y)&(X&Z). The next one would be
> ((X&Y)&Z)&X, but at some point we have to defer to reassoc.
>
> I didn't add the convert?+tree_nop_conversion_p to the existing transform I
> modified. I guess at some point we should make a pass and add them to all
> the transformations on bit operations...
>
> For (X & Y) & Y, I believe that any conversion is fine. For the others,
> tree_nop_conversion_p is probably too strict (narrowing should be fine for
> all), but I was too lazy to look for tighter conditions.
>
> (X ^ Y) ^ Y -> X should probably have (non_lvalue ...) on its output, but in
> a simple test it didn't seem to matter. Is non_lvalue still needed?
I've only added them to patterns where the fold-const.c variant had it.
In theory (hopefully) all of them are no longer necessary since C/C++ both
do delayed folding and thus hopefully figure out lvalue-ness before
entering fold ...
Richard.
>
> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>
> 2016-05-06 Marc Glisse <marc.glisse@inria.fr>
>
> gcc/
> * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge
> with...
> * match.pd ((X & Y) ^ Y): ... this.
> ((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y)
> | (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations.
>
> gcc/testsuite/
> * gcc.dg/tree-ssa/bit-assoc.c: New testcase.
> * gcc.dg/tree-ssa/pr69270.c: Adjust.
> * gcc.dg/tree-ssa/vrp59.c: Disable forwprop.
>
> --
> Marc Glisse
> Index: gcc/fold-const.c
> ===================================================================
> --- gcc/fold-const.c (revision 235933)
> +++ gcc/fold-const.c (working copy)
> @@ -10063,59 +10063,20 @@ fold_binary_loc (location_t loc,
> }
> /* Fold !X & 1 as X == 0. */
> if (TREE_CODE (arg0) == TRUTH_NOT_EXPR
> && integer_onep (arg1))
> {
> tem = TREE_OPERAND (arg0, 0);
> return fold_build2_loc (loc, EQ_EXPR, type, tem,
> build_zero_cst (TREE_TYPE (tem)));
> }
>
> - /* Fold (X ^ Y) & Y as ~X & Y. */
> - if (TREE_CODE (arg0) == BIT_XOR_EXPR
> - && operand_equal_p (TREE_OPERAND (arg0, 1), arg1, 0))
> - {
> - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
> - return fold_build2_loc (loc, BIT_AND_EXPR, type,
> - fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem),
> - fold_convert_loc (loc, type, arg1));
> - }
> - /* Fold (X ^ Y) & X as ~Y & X. */
> - if (TREE_CODE (arg0) == BIT_XOR_EXPR
> - && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0)
> - && reorder_operands_p (TREE_OPERAND (arg0, 1), arg1))
> - {
> - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
> - return fold_build2_loc (loc, BIT_AND_EXPR, type,
> - fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem),
> - fold_convert_loc (loc, type, arg1));
> - }
> - /* Fold X & (X ^ Y) as X & ~Y. */
> - if (TREE_CODE (arg1) == BIT_XOR_EXPR
> - && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
> - {
> - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
> - return fold_build2_loc (loc, BIT_AND_EXPR, type,
> - fold_convert_loc (loc, type, arg0),
> - fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem));
> - }
> - /* Fold X & (Y ^ X) as ~Y & X. */
> - if (TREE_CODE (arg1) == BIT_XOR_EXPR
> - && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0)
> - && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
> - {
> - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
> - return fold_build2_loc (loc, BIT_AND_EXPR, type,
> - fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem),
> - fold_convert_loc (loc, type, arg0));
> - }
> -
> /* Fold (X * Y) & -(1 << CST) to X * Y if Y is a constant
> multiple of 1 << CST. */
> if (TREE_CODE (arg1) == INTEGER_CST)
> {
> wide_int cst1 = arg1;
> wide_int ncst1 = -cst1;
> if ((cst1 & ncst1) == ncst1
> && multiple_of_p (type, arg0,
> wide_int_to_tree (TREE_TYPE (arg1), ncst1)))
> return fold_convert_loc (loc, type, arg0);
> Index: gcc/match.pd
> ===================================================================
> --- gcc/match.pd (revision 235933)
> +++ gcc/match.pd (working copy)
> @@ -667,39 +667,70 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
> && tree_nop_conversion_p (type, TREE_TYPE (@1)))
> (bit_xor (convert @0) (convert @1))))
>
> /* Convert ~X ^ C to X ^ ~C. */
> (simplify
> (bit_xor (convert? (bit_not @0)) INTEGER_CST@1)
> (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> (bit_xor (convert @0) (bit_not @1))))
>
> -/* Fold (X & Y) ^ Y as ~X & Y. */
> -(simplify
> - (bit_xor:c (bit_and:c @0 @1) @1)
> - (bit_and (bit_not @0) @1))
> +/* Fold (X & Y) ^ Y and (X ^ Y) & Y as ~X & Y. */
> +(for opo (bit_and bit_xor)
> + opi (bit_xor bit_and)
> + (simplify
> + (opo:c (opi:c @0 @1) @1)
> + (bit_and (bit_not @0) @1)))
>
> /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
> operands are another bit-wise operation with a common input. If so,
> distribute the bit operations to save an operation and possibly two if
> constants are involved. For example, convert
> (A | B) & (A | C) into A | (B & C)
> Further simplification will occur if B and C are constants. */
> (for op (bit_and bit_ior bit_xor)
> rop (bit_ior bit_and bit_and)
> (simplify
> (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2)))
> (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
> && tree_nop_conversion_p (type, TREE_TYPE (@2)))
> (rop (convert @0) (op (convert @1) (convert @2))))))
>
> +/* Some simple reassociation for bit operations, also handled in reassoc.
> */
> +/* (X & Y) & Y -> X & Y
> + (X | Y) | Y -> X | Y */
> +(for op (bit_and bit_ior)
> + (simplify
> + (op:c (convert?@2 (op:c @0 @1)) (convert? @1))
> + @2))
> +/* (X ^ Y) ^ Y -> X */
> +(simplify
> + (bit_xor:c (convert? (bit_xor:c @0 @1)) (convert? @1))
> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> + (convert @0)))
> +/* (X & Y) & (X & Z) -> (X & Y) & Z
> + (X | Y) | (X | Z) -> (X | Y) | Z */
> +(for op (bit_and bit_ior)
> + (simplify
> + (op:c (convert?@3 (op:c@4 @0 @1)) (convert?@5 (op:c@6 @0 @2)))
> + (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
> + && tree_nop_conversion_p (type, TREE_TYPE (@2)))
> + (if (single_use (@5) && single_use (@6))
> + (op @3 (convert @2))
> + (if (single_use (@3) && single_use (@4))
> + (op (convert @1) @5))))))
> +/* (X ^ Y) ^ (X ^ Z) -> Y ^ Z */
> +(simplify
> + (bit_xor (convert? (bit_xor:c @0 @1)) (convert? (bit_xor:c @0 @2)))
> + (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
> + && tree_nop_conversion_p (type, TREE_TYPE (@2)))
> + (convert (bit_xor @1 @2))))
>
> (simplify
> (abs (abs@1 @0))
> @1)
> (simplify
> (abs (negate @0))
> (abs @0))
> (simplify
> (abs tree_expr_nonnegative_p@0)
> @0)
> Index: gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (revision 0)
> +++ gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (working copy)
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-forwprop1-details -fdump-tree-ccp1-details"
> } */
> +
> +int f1(int a, int b){
> + int c = a & b;
> + return c & b;
> +}
> +int f2(int a, int b){
> + int c = a | b;
> + return b | c;
> +}
> +int g1(int a, int b, int c){
> + int d = a & b;
> + int e = b & c;
> + return d & e;
> +}
> +int g2(int a, int b, int c){
> + int d = a | b;
> + int e = c | b;
> + return d | e;
> +}
> +int g3(int a, int b, int c){
> + int d = a ^ b;
> + int e = b ^ c;
> + return e ^ d;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Match-and-simplified" 2 "ccp1" } } */
> +/* { dg-final { scan-tree-dump-times "gimple_simplified" 3 "forwprop1" } }
> */
> Index: gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (revision 235933)
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (working copy)
> @@ -1,21 +1,23 @@
> /* { dg-do compile } */
> /* { dg-options "-O2 -fsplit-paths -fdump-tree-dom3-details" } */
>
> /* There should be two references to bufferstep that turn into
> constants. */
> /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with
> constant .0." 1 "dom3"} } */
> /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with
> constant .1." 1 "dom3"} } */
>
> /* And some assignments ought to fold down to constants. */
> -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 2 "dom3"}
> } */
> -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 2 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -1;" 1 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -2;" 1 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 1 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 1 "dom3"}
> } */
>
> /* The XOR operations should have been optimized to constants. */
> /* { dg-final { scan-tree-dump-not "bit_xor" "dom3"} } */
>
>
> extern int *stepsizeTable;
>
> void
> adpcm_coder (signed char *outdata, int len)
> {
> Index: gcc/testsuite/gcc.dg/tree-ssa/vrp59.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (revision 235933)
> +++ gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (working copy)
> @@ -1,12 +1,12 @@
> /* { dg-do compile } */
> -/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp1" } */
> +/* { dg-options "-O2 -fno-tree-ccp -fno-tree-forwprop -fdump-tree-vrp1" }
> */
>
> int f(int x)
> {
> if (x >= 0 && x <= 3)
> {
> x = x ^ 3;
> x = x & 3;
> }
> return x;
> }
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR)
2016-05-06 11:50 ` Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) Marc Glisse
2016-05-08 20:49 ` Marc Glisse
2016-05-09 10:04 ` Richard Biener
@ 2016-05-10 6:12 ` Marc Glisse
2016-05-10 9:27 ` Richard Biener
2016-05-11 13:52 ` H.J. Lu
2 siblings, 2 replies; 22+ messages in thread
From: Marc Glisse @ 2016-05-10 6:12 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1656 bytes --]
On Fri, 6 May 2016, Marc Glisse wrote:
> Here they are. I did (X&Y)&X and (X&Y)&(X&Z). The next one would be
> ((X&Y)&Z)&X, but at some point we have to defer to reassoc.
>
> I didn't add the convert?+tree_nop_conversion_p to the existing transform I
> modified. I guess at some point we should make a pass and add them to all the
> transformations on bit operations...
>
> For (X & Y) & Y, I believe that any conversion is fine. For the others,
> tree_nop_conversion_p is probably too strict (narrowing should be fine for
> all), but I was too lazy to look for tighter conditions.
>
> (X ^ Y) ^ Y -> X should probably have (non_lvalue ...) on its output, but in
> a simple test it didn't seem to matter. Is non_lvalue still needed?
>
>
> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>
> 2016-05-06 Marc Glisse <marc.glisse@inria.fr>
>
> gcc/
> * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge
> with...
> * match.pd ((X & Y) ^ Y): ... this.
> ((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y)
> | (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations.
>
> gcc/testsuite/
> * gcc.dg/tree-ssa/bit-assoc.c: New testcase.
> * gcc.dg/tree-ssa/pr69270.c: Adjust.
> * gcc.dg/tree-ssa/vrp59.c: Disable forwprop.
Here it is again, I just replaced convert with convert[12] in the last 2
transforms. This should matter for (unsigned)(si & 42) & (ui & 42u). I
didn't change it in the other transform, because it would only matter in
the case (T)(X & CST) & CST, which I think would be better served by
extending the transform that currently handles (X & CST1) & CST2 (not done
in this patch).
--
Marc Glisse
[-- Attachment #2: Type: TEXT/PLAIN, Size: 8285 bytes --]
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c (revision 236047)
+++ gcc/fold-const.c (working copy)
@@ -10064,59 +10064,20 @@ fold_binary_loc (location_t loc,
}
/* Fold !X & 1 as X == 0. */
if (TREE_CODE (arg0) == TRUTH_NOT_EXPR
&& integer_onep (arg1))
{
tem = TREE_OPERAND (arg0, 0);
return fold_build2_loc (loc, EQ_EXPR, type, tem,
build_zero_cst (TREE_TYPE (tem)));
}
- /* Fold (X ^ Y) & Y as ~X & Y. */
- if (TREE_CODE (arg0) == BIT_XOR_EXPR
- && operand_equal_p (TREE_OPERAND (arg0, 1), arg1, 0))
- {
- tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
- return fold_build2_loc (loc, BIT_AND_EXPR, type,
- fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
- fold_convert_loc (loc, type, arg1));
- }
- /* Fold (X ^ Y) & X as ~Y & X. */
- if (TREE_CODE (arg0) == BIT_XOR_EXPR
- && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0)
- && reorder_operands_p (TREE_OPERAND (arg0, 1), arg1))
- {
- tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
- return fold_build2_loc (loc, BIT_AND_EXPR, type,
- fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
- fold_convert_loc (loc, type, arg1));
- }
- /* Fold X & (X ^ Y) as X & ~Y. */
- if (TREE_CODE (arg1) == BIT_XOR_EXPR
- && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
- {
- tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
- return fold_build2_loc (loc, BIT_AND_EXPR, type,
- fold_convert_loc (loc, type, arg0),
- fold_build1_loc (loc, BIT_NOT_EXPR, type, tem));
- }
- /* Fold X & (Y ^ X) as ~Y & X. */
- if (TREE_CODE (arg1) == BIT_XOR_EXPR
- && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0)
- && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
- {
- tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
- return fold_build2_loc (loc, BIT_AND_EXPR, type,
- fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
- fold_convert_loc (loc, type, arg0));
- }
-
/* Fold (X * Y) & -(1 << CST) to X * Y if Y is a constant
multiple of 1 << CST. */
if (TREE_CODE (arg1) == INTEGER_CST)
{
wide_int cst1 = arg1;
wide_int ncst1 = -cst1;
if ((cst1 & ncst1) == ncst1
&& multiple_of_p (type, arg0,
wide_int_to_tree (TREE_TYPE (arg1), ncst1)))
return fold_convert_loc (loc, type, arg0);
Index: gcc/match.pd
===================================================================
--- gcc/match.pd (revision 236047)
+++ gcc/match.pd (working copy)
@@ -667,39 +667,70 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(if (tree_nop_conversion_p (type, TREE_TYPE (@0))
&& tree_nop_conversion_p (type, TREE_TYPE (@1)))
(bit_xor (convert @0) (convert @1))))
/* Convert ~X ^ C to X ^ ~C. */
(simplify
(bit_xor (convert? (bit_not @0)) INTEGER_CST@1)
(if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
(bit_xor (convert @0) (bit_not @1))))
-/* Fold (X & Y) ^ Y as ~X & Y. */
-(simplify
- (bit_xor:c (bit_and:c @0 @1) @1)
- (bit_and (bit_not @0) @1))
+/* Fold (X & Y) ^ Y and (X ^ Y) & Y as ~X & Y. */
+(for opo (bit_and bit_xor)
+ opi (bit_xor bit_and)
+ (simplify
+ (opo:c (opi:c @0 @1) @1)
+ (bit_and (bit_not @0) @1)))
/* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
operands are another bit-wise operation with a common input. If so,
distribute the bit operations to save an operation and possibly two if
constants are involved. For example, convert
(A | B) & (A | C) into A | (B & C)
Further simplification will occur if B and C are constants. */
(for op (bit_and bit_ior bit_xor)
rop (bit_ior bit_and bit_and)
(simplify
(op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2)))
(if (tree_nop_conversion_p (type, TREE_TYPE (@1))
&& tree_nop_conversion_p (type, TREE_TYPE (@2)))
(rop (convert @0) (op (convert @1) (convert @2))))))
+/* Some simple reassociation for bit operations, also handled in reassoc. */
+/* (X & Y) & Y -> X & Y
+ (X | Y) | Y -> X | Y */
+(for op (bit_and bit_ior)
+ (simplify
+ (op:c (convert?@2 (op:c @0 @1)) (convert? @1))
+ @2))
+/* (X ^ Y) ^ Y -> X */
+(simplify
+ (bit_xor:c (convert? (bit_xor:c @0 @1)) (convert? @1))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+ (convert @0)))
+/* (X & Y) & (X & Z) -> (X & Y) & Z
+ (X | Y) | (X | Z) -> (X | Y) | Z */
+(for op (bit_and bit_ior)
+ (simplify
+ (op:c (convert1?@3 (op:c@4 @0 @1)) (convert2?@5 (op:c@6 @0 @2)))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
+ && tree_nop_conversion_p (type, TREE_TYPE (@2)))
+ (if (single_use (@5) && single_use (@6))
+ (op @3 (convert @2))
+ (if (single_use (@3) && single_use (@4))
+ (op (convert @1) @5))))))
+/* (X ^ Y) ^ (X ^ Z) -> Y ^ Z */
+(simplify
+ (bit_xor (convert1? (bit_xor:c @0 @1)) (convert2? (bit_xor:c @0 @2)))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
+ && tree_nop_conversion_p (type, TREE_TYPE (@2)))
+ (convert (bit_xor @1 @2))))
(simplify
(abs (abs@1 @0))
@1)
(simplify
(abs (negate @0))
(abs @0))
(simplify
(abs tree_expr_nonnegative_p@0)
@0)
Index: gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (working copy)
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-forwprop1-details -fdump-tree-ccp1-details" } */
+
+int f1(int a, int b){
+ int c = a & b;
+ return c & b;
+}
+int f2(int a, int b){
+ int c = a | b;
+ return b | c;
+}
+int g1(int a, int b, int c){
+ int d = a & b;
+ int e = b & c;
+ return d & e;
+}
+int g2(int a, int b, int c){
+ int d = a | b;
+ int e = c | b;
+ return d | e;
+}
+int g3(int a, int b, int c){
+ int d = a ^ b;
+ int e = b ^ c;
+ return e ^ d;
+}
+
+/* { dg-final { scan-tree-dump-times "Match-and-simplified" 2 "ccp1" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified" 3 "forwprop1" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (revision 236047)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (working copy)
@@ -1,21 +1,23 @@
/* { dg-do compile } */
/* { dg-options "-O2 -fsplit-paths -fdump-tree-dom3-details" } */
/* There should be two references to bufferstep that turn into
constants. */
/* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with constant .0." 1 "dom3"} } */
/* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with constant .1." 1 "dom3"} } */
/* And some assignments ought to fold down to constants. */
-/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 2 "dom3"} } */
-/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 2 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -1;" 1 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -2;" 1 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 1 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 1 "dom3"} } */
/* The XOR operations should have been optimized to constants. */
/* { dg-final { scan-tree-dump-not "bit_xor" "dom3"} } */
extern int *stepsizeTable;
void
adpcm_coder (signed char *outdata, int len)
{
Index: gcc/testsuite/gcc.dg/tree-ssa/vrp59.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (revision 236047)
+++ gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (working copy)
@@ -1,12 +1,12 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fno-tree-ccp -fno-tree-forwprop -fdump-tree-vrp1" } */
int f(int x)
{
if (x >= 0 && x <= 3)
{
x = x ^ 3;
x = x & 3;
}
return x;
}
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR)
2016-05-10 6:12 ` Marc Glisse
@ 2016-05-10 9:27 ` Richard Biener
2016-05-11 13:52 ` H.J. Lu
1 sibling, 0 replies; 22+ messages in thread
From: Richard Biener @ 2016-05-10 9:27 UTC (permalink / raw)
To: Marc Glisse; +Cc: GCC Patches
On Tue, May 10, 2016 at 8:11 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Fri, 6 May 2016, Marc Glisse wrote:
>
>> Here they are. I did (X&Y)&X and (X&Y)&(X&Z). The next one would be
>> ((X&Y)&Z)&X, but at some point we have to defer to reassoc.
>>
>> I didn't add the convert?+tree_nop_conversion_p to the existing transform
>> I modified. I guess at some point we should make a pass and add them to all
>> the transformations on bit operations...
>>
>> For (X & Y) & Y, I believe that any conversion is fine. For the others,
>> tree_nop_conversion_p is probably too strict (narrowing should be fine for
>> all), but I was too lazy to look for tighter conditions.
>>
>> (X ^ Y) ^ Y -> X should probably have (non_lvalue ...) on its output, but
>> in a simple test it didn't seem to matter. Is non_lvalue still needed?
>>
>>
>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>>
>> 2016-05-06 Marc Glisse <marc.glisse@inria.fr>
>>
>> gcc/
>> * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge
>> with...
>> * match.pd ((X & Y) ^ Y): ... this.
>> ((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y)
>> | (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations.
>>
>> gcc/testsuite/
>> * gcc.dg/tree-ssa/bit-assoc.c: New testcase.
>> * gcc.dg/tree-ssa/pr69270.c: Adjust.
>> * gcc.dg/tree-ssa/vrp59.c: Disable forwprop.
>
>
> Here it is again, I just replaced convert with convert[12] in the last 2
> transforms. This should matter for (unsigned)(si & 42) & (ui & 42u). I
> didn't change it in the other transform, because it would only matter in the
> case (T)(X & CST) & CST, which I think would be better served by extending
> the transform that currently handles (X & CST1) & CST2 (not done in this
> patch).
Ok.
Thanks!
Richard.
> --
> Marc Glisse
> Index: gcc/fold-const.c
> ===================================================================
> --- gcc/fold-const.c (revision 236047)
> +++ gcc/fold-const.c (working copy)
> @@ -10064,59 +10064,20 @@ fold_binary_loc (location_t loc,
> }
> /* Fold !X & 1 as X == 0. */
> if (TREE_CODE (arg0) == TRUTH_NOT_EXPR
> && integer_onep (arg1))
> {
> tem = TREE_OPERAND (arg0, 0);
> return fold_build2_loc (loc, EQ_EXPR, type, tem,
> build_zero_cst (TREE_TYPE (tem)));
> }
>
> - /* Fold (X ^ Y) & Y as ~X & Y. */
> - if (TREE_CODE (arg0) == BIT_XOR_EXPR
> - && operand_equal_p (TREE_OPERAND (arg0, 1), arg1, 0))
> - {
> - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
> - return fold_build2_loc (loc, BIT_AND_EXPR, type,
> - fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem),
> - fold_convert_loc (loc, type, arg1));
> - }
> - /* Fold (X ^ Y) & X as ~Y & X. */
> - if (TREE_CODE (arg0) == BIT_XOR_EXPR
> - && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0)
> - && reorder_operands_p (TREE_OPERAND (arg0, 1), arg1))
> - {
> - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
> - return fold_build2_loc (loc, BIT_AND_EXPR, type,
> - fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem),
> - fold_convert_loc (loc, type, arg1));
> - }
> - /* Fold X & (X ^ Y) as X & ~Y. */
> - if (TREE_CODE (arg1) == BIT_XOR_EXPR
> - && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
> - {
> - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
> - return fold_build2_loc (loc, BIT_AND_EXPR, type,
> - fold_convert_loc (loc, type, arg0),
> - fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem));
> - }
> - /* Fold X & (Y ^ X) as ~Y & X. */
> - if (TREE_CODE (arg1) == BIT_XOR_EXPR
> - && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0)
> - && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
> - {
> - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
> - return fold_build2_loc (loc, BIT_AND_EXPR, type,
> - fold_build1_loc (loc, BIT_NOT_EXPR, type,
> tem),
> - fold_convert_loc (loc, type, arg0));
> - }
> -
> /* Fold (X * Y) & -(1 << CST) to X * Y if Y is a constant
> multiple of 1 << CST. */
> if (TREE_CODE (arg1) == INTEGER_CST)
> {
> wide_int cst1 = arg1;
> wide_int ncst1 = -cst1;
> if ((cst1 & ncst1) == ncst1
> && multiple_of_p (type, arg0,
> wide_int_to_tree (TREE_TYPE (arg1), ncst1)))
> return fold_convert_loc (loc, type, arg0);
> Index: gcc/match.pd
> ===================================================================
> --- gcc/match.pd (revision 236047)
> +++ gcc/match.pd (working copy)
> @@ -667,39 +667,70 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
> && tree_nop_conversion_p (type, TREE_TYPE (@1)))
> (bit_xor (convert @0) (convert @1))))
>
> /* Convert ~X ^ C to X ^ ~C. */
> (simplify
> (bit_xor (convert? (bit_not @0)) INTEGER_CST@1)
> (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> (bit_xor (convert @0) (bit_not @1))))
>
> -/* Fold (X & Y) ^ Y as ~X & Y. */
> -(simplify
> - (bit_xor:c (bit_and:c @0 @1) @1)
> - (bit_and (bit_not @0) @1))
> +/* Fold (X & Y) ^ Y and (X ^ Y) & Y as ~X & Y. */
> +(for opo (bit_and bit_xor)
> + opi (bit_xor bit_and)
> + (simplify
> + (opo:c (opi:c @0 @1) @1)
> + (bit_and (bit_not @0) @1)))
>
> /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
> operands are another bit-wise operation with a common input. If so,
> distribute the bit operations to save an operation and possibly two if
> constants are involved. For example, convert
> (A | B) & (A | C) into A | (B & C)
> Further simplification will occur if B and C are constants. */
> (for op (bit_and bit_ior bit_xor)
> rop (bit_ior bit_and bit_and)
> (simplify
> (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2)))
> (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
> && tree_nop_conversion_p (type, TREE_TYPE (@2)))
> (rop (convert @0) (op (convert @1) (convert @2))))))
>
> +/* Some simple reassociation for bit operations, also handled in reassoc.
> */
> +/* (X & Y) & Y -> X & Y
> + (X | Y) | Y -> X | Y */
> +(for op (bit_and bit_ior)
> + (simplify
> + (op:c (convert?@2 (op:c @0 @1)) (convert? @1))
> + @2))
> +/* (X ^ Y) ^ Y -> X */
> +(simplify
> + (bit_xor:c (convert? (bit_xor:c @0 @1)) (convert? @1))
> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> + (convert @0)))
> +/* (X & Y) & (X & Z) -> (X & Y) & Z
> + (X | Y) | (X | Z) -> (X | Y) | Z */
> +(for op (bit_and bit_ior)
> + (simplify
> + (op:c (convert1?@3 (op:c@4 @0 @1)) (convert2?@5 (op:c@6 @0 @2)))
> + (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
> + && tree_nop_conversion_p (type, TREE_TYPE (@2)))
> + (if (single_use (@5) && single_use (@6))
> + (op @3 (convert @2))
> + (if (single_use (@3) && single_use (@4))
> + (op (convert @1) @5))))))
> +/* (X ^ Y) ^ (X ^ Z) -> Y ^ Z */
> +(simplify
> + (bit_xor (convert1? (bit_xor:c @0 @1)) (convert2? (bit_xor:c @0 @2)))
> + (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
> + && tree_nop_conversion_p (type, TREE_TYPE (@2)))
> + (convert (bit_xor @1 @2))))
>
> (simplify
> (abs (abs@1 @0))
> @1)
> (simplify
> (abs (negate @0))
> (abs @0))
> (simplify
> (abs tree_expr_nonnegative_p@0)
> @0)
> Index: gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (revision 0)
> +++ gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (working copy)
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-forwprop1-details -fdump-tree-ccp1-details"
> } */
> +
> +int f1(int a, int b){
> + int c = a & b;
> + return c & b;
> +}
> +int f2(int a, int b){
> + int c = a | b;
> + return b | c;
> +}
> +int g1(int a, int b, int c){
> + int d = a & b;
> + int e = b & c;
> + return d & e;
> +}
> +int g2(int a, int b, int c){
> + int d = a | b;
> + int e = c | b;
> + return d | e;
> +}
> +int g3(int a, int b, int c){
> + int d = a ^ b;
> + int e = b ^ c;
> + return e ^ d;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Match-and-simplified" 2 "ccp1" } } */
> +/* { dg-final { scan-tree-dump-times "gimple_simplified" 3 "forwprop1" } }
> */
> Index: gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (revision 236047)
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (working copy)
> @@ -1,21 +1,23 @@
> /* { dg-do compile } */
> /* { dg-options "-O2 -fsplit-paths -fdump-tree-dom3-details" } */
>
> /* There should be two references to bufferstep that turn into
> constants. */
> /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with
> constant .0." 1 "dom3"} } */
> /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with
> constant .1." 1 "dom3"} } */
>
> /* And some assignments ought to fold down to constants. */
> -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 2 "dom3"}
> } */
> -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 2 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -1;" 1 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -2;" 1 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 1 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 1 "dom3"}
> } */
>
> /* The XOR operations should have been optimized to constants. */
> /* { dg-final { scan-tree-dump-not "bit_xor" "dom3"} } */
>
>
> extern int *stepsizeTable;
>
> void
> adpcm_coder (signed char *outdata, int len)
> {
> Index: gcc/testsuite/gcc.dg/tree-ssa/vrp59.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (revision 236047)
> +++ gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (working copy)
> @@ -1,12 +1,12 @@
> /* { dg-do compile } */
> -/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp1" } */
> +/* { dg-options "-O2 -fno-tree-ccp -fno-tree-forwprop -fdump-tree-vrp1" }
> */
>
> int f(int x)
> {
> if (x >= 0 && x <= 3)
> {
> x = x ^ 3;
> x = x & 3;
> }
> return x;
> }
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR)
2016-05-10 6:12 ` Marc Glisse
2016-05-10 9:27 ` Richard Biener
@ 2016-05-11 13:52 ` H.J. Lu
2016-05-11 16:17 ` Marc Glisse
1 sibling, 1 reply; 22+ messages in thread
From: H.J. Lu @ 2016-05-11 13:52 UTC (permalink / raw)
To: Marc Glisse; +Cc: Richard Biener, GCC Patches
On Mon, May 9, 2016 at 11:11 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Fri, 6 May 2016, Marc Glisse wrote:
>
>> Here they are. I did (X&Y)&X and (X&Y)&(X&Z). The next one would be
>> ((X&Y)&Z)&X, but at some point we have to defer to reassoc.
>>
>> I didn't add the convert?+tree_nop_conversion_p to the existing transform
>> I modified. I guess at some point we should make a pass and add them to all
>> the transformations on bit operations...
>>
>> For (X & Y) & Y, I believe that any conversion is fine. For the others,
>> tree_nop_conversion_p is probably too strict (narrowing should be fine for
>> all), but I was too lazy to look for tighter conditions.
>>
>> (X ^ Y) ^ Y -> X should probably have (non_lvalue ...) on its output, but
>> in a simple test it didn't seem to matter. Is non_lvalue still needed?
>>
>>
>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>>
>> 2016-05-06 Marc Glisse <marc.glisse@inria.fr>
>>
>> gcc/
>> * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge
>> with...
>> * match.pd ((X & Y) ^ Y): ... this.
>> ((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y)
>> | (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations.
>>
>> gcc/testsuite/
>> * gcc.dg/tree-ssa/bit-assoc.c: New testcase.
>> * gcc.dg/tree-ssa/pr69270.c: Adjust.
>> * gcc.dg/tree-ssa/vrp59.c: Disable forwprop.
>
>
> Here it is again, I just replaced convert with convert[12] in the last 2
> transforms. This should matter for (unsigned)(si & 42) & (ui & 42u). I
> didn't change it in the other transform, because it would only matter in the
> case (T)(X & CST) & CST, which I think would be better served by extending
> the transform that currently handles (X & CST1) & CST2 (not done in this
> patch).
It caused:
FAIL: gcc.dg/tree-ssa/vrp47.c scan-tree-dump-times vrp2 " & 1;" 0
on x86.
H.J.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR)
2016-05-11 13:52 ` H.J. Lu
@ 2016-05-11 16:17 ` Marc Glisse
2016-05-11 16:26 ` Simple bitop reassoc in match.pd Jeff Law
0 siblings, 1 reply; 22+ messages in thread
From: Marc Glisse @ 2016-05-11 16:17 UTC (permalink / raw)
To: H.J. Lu; +Cc: Richard Biener, GCC Patches
On Wed, 11 May 2016, H.J. Lu wrote:
>>> * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge with...
>>> * match.pd ((X & Y) ^ Y): ... this.
> It caused:
>
> FAIL: gcc.dg/tree-ssa/vrp47.c scan-tree-dump-times vrp2 " & 1;" 0
>
> on x86.
Ah, yes, logical_op_short_circuit so I didn't test it. Hmm, doesn't seem
so easy. We want to compute (int)(x==0) in a branch where x is known to be
in [0, 1]. VRP1 gives (int)(_Bool)(x^1). forwprop3 handles the double
conversion, which gives (x^1)&1. Here the new transform applies and gives
(~x)&1. VRP2 comes, and with (x^1)&1 it used to notice that this is just
x^1 (remember that x is in [0, 1] in this branch), while it doesn't know
what to do with (~x)&1. In the end, we get
notl %eax
andl $1, %eax
instead of
xorl $1, %eax
(andn doesn't seem to have a version with an immediate)
The transformation seems right to me, but we are then missing another
transformation like ~X & Y -> X ^ Y when we know that X & ~Y is 0 (the 1
bits of X are included in those of Y). That may not be easy with the
limited bit tracking we have. A version limited to constant Y of the form
2^n-1 (i.e. 0...01...1) where X has a range included in [0, Y] may be
simpler.
We could also simplify (int)(_Bool)x to x using VRP information that x is
in [0, 1], but apparently when VRP replaces x==0 with y=x^1,(_Bool)y, it
does not compute a range for the new variable y, and by the time the next
VRP pass comes, it is too late.
In the mean time, I propose xfailing this test...
--
Marc Glisse
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd
2016-05-11 16:17 ` Marc Glisse
@ 2016-05-11 16:26 ` Jeff Law
2016-05-11 17:56 ` Marc Glisse
2016-05-12 5:26 ` Marc Glisse
0 siblings, 2 replies; 22+ messages in thread
From: Jeff Law @ 2016-05-11 16:26 UTC (permalink / raw)
To: Marc Glisse, H.J. Lu; +Cc: Richard Biener, GCC Patches
On 05/11/2016 10:17 AM, Marc Glisse wrote:
> The transformation seems right to me, but we are then missing another
> transformation like ~X & Y -> X ^ Y when we know that X & ~Y is 0 (the 1
> bits of X are included in those of Y). That may not be easy with the
> limited bit tracking we have. A version limited to constant Y of the
> form 2^n-1 (i.e. 0...01...1) where X has a range included in [0, Y] may
> be simpler.
While we don't have strong bit tracking, we do track [0..1] ranges
reasonably well, so it may be worth doing.
> We could also simplify (int)(_Bool)x to x using VRP information that x
> is in [0, 1], but apparently when VRP replaces x==0 with y=x^1,(_Bool)y,
> it does not compute a range for the new variable y, and by the time the
> next VRP pass comes, it is too late.
Seems like a clear oversight.
jeff
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd
2016-05-11 16:26 ` Simple bitop reassoc in match.pd Jeff Law
@ 2016-05-11 17:56 ` Marc Glisse
2016-05-11 20:44 ` Marc Glisse
2016-05-12 8:41 ` Richard Biener
2016-05-12 5:26 ` Marc Glisse
1 sibling, 2 replies; 22+ messages in thread
From: Marc Glisse @ 2016-05-11 17:56 UTC (permalink / raw)
To: Jeff Law; +Cc: H.J. Lu, Richard Biener, GCC Patches
On Wed, 11 May 2016, Jeff Law wrote:
>> We could also simplify (int)(_Bool)x to x using VRP information that x
>> is in [0, 1], but apparently when VRP replaces x==0 with y=x^1,(_Bool)y,
>> it does not compute a range for the new variable y, and by the time the
>> next VRP pass comes, it is too late.
> Seems like a clear oversight.
In get_value_range, there is:
/* If we query the range for a new SSA name return an unmodifiable VARYING.
We should get here at most from the substitute-and-fold stage which
will never try to change values. */
so this is a known limitation.
We could try to change that (XRESIZEVEC, memset(0) on the new elements,
update num_vr_values to the new num_ssa_names, at this point vr_value
should be replaced with a vector).
We could also use set_range_info and make simplify_conversion_using_ranges
use get_range_info instead of get_value_range. Might even move the whole
function to match.pd then ;-)
--
Marc Glisse
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd
2016-05-11 17:56 ` Marc Glisse
@ 2016-05-11 20:44 ` Marc Glisse
2016-05-12 8:41 ` Richard Biener
1 sibling, 0 replies; 22+ messages in thread
From: Marc Glisse @ 2016-05-11 20:44 UTC (permalink / raw)
To: Jeff Law; +Cc: H.J. Lu, Richard Biener, GCC Patches
[-- Attachment #1: Type: TEXT/PLAIN, Size: 339 bytes --]
On Wed, 11 May 2016, Marc Glisse wrote:
> We could also use set_range_info and make simplify_conversion_using_ranges
> use get_range_info instead of get_value_range.
Something like this seems to fix the testcase. I'll try to submit it
properly, but I don't know when. (I also added the ~X&C transform to my
TODO-list)
--
Marc Glisse
[-- Attachment #2: Type: TEXT/PLAIN, Size: 1882 bytes --]
Index: gcc/tree-vrp.c
===================================================================
--- gcc/tree-vrp.c (revision 236122)
+++ gcc/tree-vrp.c (working copy)
@@ -8940,6 +8940,8 @@
gassign *newop
= gimple_build_assign (tem, BIT_XOR_EXPR, op0, op1);
gsi_insert_before (gsi, newop, GSI_SAME_STMT);
+ if (TYPE_PRECISION (TREE_TYPE (tem)) > 1)
+ set_range_info (tem, VR_RANGE, 0, 1);
gimple_assign_set_rhs_with_ops (gsi, NOP_EXPR, tem);
}
/* Or without. */
@@ -9648,7 +9650,6 @@
{
tree innerop, middleop, finaltype;
gimple *def_stmt;
- value_range *innervr;
signop inner_sgn, middle_sgn, final_sgn;
unsigned inner_prec, middle_prec, final_prec;
widest_int innermin, innermed, innermax, middlemin, middlemed, middlemax;
@@ -9666,18 +9667,16 @@
|| SSA_NAME_OCCURS_IN_ABNORMAL_PHI (innerop))
return false;
- /* Get the value-range of the inner operand. */
- innervr = get_value_range (innerop);
- if (innervr->type != VR_RANGE
- || TREE_CODE (innervr->min) != INTEGER_CST
- || TREE_CODE (innervr->max) != INTEGER_CST)
+ /* Get the value-range of the inner operand. Use get_range_info in
+ case innerop was created during substitute-and-fold. */
+ wide_int imin, imax;
+ if (get_range_info (innerop, &imin, &imax) != VR_RANGE)
return false;
+ innermin = widest_int::from (imin, TYPE_SIGN (TREE_TYPE (innerop)));
+ innermax = widest_int::from (imax, TYPE_SIGN (TREE_TYPE (innerop)));
/* Simulate the conversion chain to check if the result is equal if
the middle conversion is removed. */
- innermin = wi::to_widest (innervr->min);
- innermax = wi::to_widest (innervr->max);
-
inner_prec = TYPE_PRECISION (TREE_TYPE (innerop));
middle_prec = TYPE_PRECISION (TREE_TYPE (middleop));
final_prec = TYPE_PRECISION (finaltype);
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd
2016-05-11 16:26 ` Simple bitop reassoc in match.pd Jeff Law
2016-05-11 17:56 ` Marc Glisse
@ 2016-05-12 5:26 ` Marc Glisse
1 sibling, 0 replies; 22+ messages in thread
From: Marc Glisse @ 2016-05-12 5:26 UTC (permalink / raw)
To: Jeff Law; +Cc: H.J. Lu, Richard Biener, GCC Patches
On Wed, 11 May 2016, Jeff Law wrote:
> On 05/11/2016 10:17 AM, Marc Glisse wrote:
>> The transformation seems right to me, but we are then missing another
>> transformation like ~X & Y -> X ^ Y when we know that X & ~Y is 0 (the 1
>> bits of X are included in those of Y). That may not be easy with the
>> limited bit tracking we have. A version limited to constant Y of the
>> form 2^n-1 (i.e. 0...01...1) where X has a range included in [0, Y] may
>> be simpler.
> While we don't have strong bit tracking, we do track [0..1] ranges reasonably
> well, so it may be worth doing.
I had started writing
+/* Simplify (~X & Y) to X ^ Y if we know that (X & ~Y) is 0. */
+#if GIMPLE
+(simplify
+ (bit_and (bit_not SSA_NAME@0) INTEGER_CST@1)
+ (if ((get_nonzero_bits (@0) & wi::bit_not (@1)) == 0)
+ (bit_xor @0 @1)))
+#endif
but then I realized that VRP does not call the simplify machinery, so this
would have to be rewritten in simplify_bit_ops_using_ranges. The good
point is that we could then compute:
may_be_nonzero_X & ~must_be_nonzero_Y
instead of assuming that Y is a constant (not that it would change
anything in practice).
--
Marc Glisse
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd
2016-05-11 17:56 ` Marc Glisse
2016-05-11 20:44 ` Marc Glisse
@ 2016-05-12 8:41 ` Richard Biener
2016-05-12 16:03 ` Marc Glisse
1 sibling, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-05-12 8:41 UTC (permalink / raw)
To: Marc Glisse; +Cc: Jeff Law, H.J. Lu, GCC Patches
On Wed, May 11, 2016 at 7:56 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Wed, 11 May 2016, Jeff Law wrote:
>
>>> We could also simplify (int)(_Bool)x to x using VRP information that x
>>> is in [0, 1], but apparently when VRP replaces x==0 with y=x^1,(_Bool)y,
>>> it does not compute a range for the new variable y, and by the time the
>>> next VRP pass comes, it is too late.
>>
>> Seems like a clear oversight.
>
>
> In get_value_range, there is:
> /* If we query the range for a new SSA name return an unmodifiable
> VARYING.
> We should get here at most from the substitute-and-fold stage which
> will never try to change values. */
> so this is a known limitation.
>
> We could try to change that (XRESIZEVEC, memset(0) on the new elements,
> update num_vr_values to the new num_ssa_names, at this point vr_value should
> be replaced with a vector).
>
> We could also use set_range_info and make simplify_conversion_using_ranges
> use get_range_info instead of get_value_range. Might even move the whole
> function to match.pd then ;-)
Yeah - note that VRP already calls set_range_info before simplifying
stmts. It's
just that substitute_and_fold doesn't apply fold_stmt (and thus match.pd) to
all stmts but it only applies the pass specific "fold" (vrp_fold_stmt)
to all stmts.
Richard.
> --
> Marc Glisse
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd
2016-05-12 8:41 ` Richard Biener
@ 2016-05-12 16:03 ` Marc Glisse
2016-05-12 16:51 ` Richard Biener
0 siblings, 1 reply; 22+ messages in thread
From: Marc Glisse @ 2016-05-12 16:03 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
On Thu, 12 May 2016, Richard Biener wrote:
> Yeah - note that VRP already calls set_range_info before simplifying
> stmts. It's just that substitute_and_fold doesn't apply fold_stmt (and
> thus match.pd) to all stmts but it only applies the pass specific "fold"
> (vrp_fold_stmt) to all stmts.
Just to be sure: is the fact that VRP doesn't apply fold_stmt on purpose?
The restriction makes sense, it is just that it may yield a bit of
duplication. We already indirectly use get_range_info in match.pd and may
miss out on opportunities that only occur in branches during the VRP pass.
--
Marc Glisse
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd
2016-05-12 16:03 ` Marc Glisse
@ 2016-05-12 16:51 ` Richard Biener
0 siblings, 0 replies; 22+ messages in thread
From: Richard Biener @ 2016-05-12 16:51 UTC (permalink / raw)
To: gcc-patches, Marc Glisse
On May 12, 2016 6:02:47 PM GMT+02:00, Marc Glisse <marc.glisse@inria.fr> wrote:
>On Thu, 12 May 2016, Richard Biener wrote:
>
>> Yeah - note that VRP already calls set_range_info before simplifying
>> stmts. It's just that substitute_and_fold doesn't apply fold_stmt
>(and
>> thus match.pd) to all stmts but it only applies the pass specific
>"fold"
>> (vrp_fold_stmt) to all stmts.
>
>Just to be sure: is the fact that VRP doesn't apply fold_stmt on
>purpose?
The propagator only folds stmts that had operands replaced (that doesn't enable all simplifications as match.PD patterns cover more than one statement).
>The restriction makes sense, it is just that it may yield a bit of
>duplication. We already indirectly use get_range_info in match.pd and
>may
>miss out on opportunities that only occur in branches during the VRP
>pass.
Yes. The pass specific fold is called on each stmt. Maybe we can omit the propagators folding if the pass specific folding applied.
Richard.
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2016-05-12 16:51 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-30 18:45 Canonicalize X u< X to UNORDERED_EXPR Marc Glisse
2016-05-02 8:37 ` Richard Biener
2016-05-02 9:19 ` Marc Glisse
2016-05-02 9:45 ` Richard Biener
2016-05-03 6:37 ` Marc Glisse
2016-05-03 11:03 ` Richard Biener
2016-05-03 13:27 ` Marc Glisse
2016-05-03 13:34 ` Richard Biener
2016-05-06 11:50 ` Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) Marc Glisse
2016-05-08 20:49 ` Marc Glisse
2016-05-09 10:04 ` Richard Biener
2016-05-10 6:12 ` Marc Glisse
2016-05-10 9:27 ` Richard Biener
2016-05-11 13:52 ` H.J. Lu
2016-05-11 16:17 ` Marc Glisse
2016-05-11 16:26 ` Simple bitop reassoc in match.pd Jeff Law
2016-05-11 17:56 ` Marc Glisse
2016-05-11 20:44 ` Marc Glisse
2016-05-12 8:41 ` Richard Biener
2016-05-12 16:03 ` Marc Glisse
2016-05-12 16:51 ` Richard Biener
2016-05-12 5:26 ` Marc Glisse
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).