* Canonicalize X u< X to UNORDERED_EXPR @ 2016-04-30 18:45 Marc Glisse 2016-05-02 8:37 ` Richard Biener 0 siblings, 1 reply; 22+ messages in thread From: Marc Glisse @ 2016-04-30 18:45 UTC (permalink / raw) To: gcc-patches [-- Attachment #1: Type: TEXT/PLAIN, Size: 496 bytes --] Hello, this case seemed to be missing in the various X cmp X transformations. It does not change the generated code in the testcase. The missing :c is rather trivial. I can commit it separately if you prefer. Bootstrap+regtest on powerpc64le-unknown-linux-gnu. 2016-05-02 Marc Glisse <marc.glisse@inria.fr> gcc/ * match.pd ((A & B) OP (C & B)): Mark '&' as commutative. (X u< X, X u> X): New transformations gcc/testsuite/ * gcc.dg/tree-ssa/unord.c: New testcase. -- Marc Glisse [-- Attachment #2: Type: TEXT/PLAIN, Size: 2159 bytes --] Index: trunk/gcc/match.pd =================================================================== --- trunk/gcc/match.pd (revision 235654) +++ trunk/gcc/match.pd (working copy) @@ -783,21 +783,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) @0) /* (~x | y) & x -> x & y */ /* (~x & y) | x -> x | y */ (simplify (bitop:c (rbitop:c (bit_not @0) @1) @0) (bitop @0 @1))) /* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */ (for bitop (bit_and bit_ior bit_xor) (simplify - (bitop (bit_and:c @0 @1) (bit_and @2 @1)) + (bitop (bit_and:c @0 @1) (bit_and:c @2 @1)) (bit_and (bitop @0 @2) @1))) /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */ (simplify (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2) (bit_ior (bit_and @0 @2) (bit_and @1 @2))) /* Combine successive equal operations with constants. */ (for bitop (bit_and bit_ior bit_xor) (simplify @@ -1914,20 +1914,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (simplify (cmp @0 @0) (if (cmp != NE_EXPR || ! FLOAT_TYPE_P (TREE_TYPE (@0)) || ! HONOR_NANS (@0)) { constant_boolean_node (false, type); }))) (for cmp (unle unge uneq) (simplify (cmp @0 @0) { constant_boolean_node (true, type); })) +(for cmp (unlt ungt) + (simplify + (cmp @0 @0) + (unordered @0 @0))) (simplify (ltgt @0 @0) (if (!flag_trapping_math) { constant_boolean_node (false, type); })) /* Fold ~X op ~Y as Y op X. */ (for cmp (simple_comparison) (simplify (cmp (bit_not@2 @0) (bit_not@3 @1)) (if (single_use (@2) && single_use (@3)) Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c =================================================================== --- trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (revision 0) +++ trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (working copy) @@ -0,0 +1,7 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-optimized" } */ + +int f(double a){double b=a;return !__builtin_islessequal(a,b);} +int g(double a){double b=a;return !__builtin_isgreaterequal(a,b);} + +/* { dg-final { scan-tree-dump-times " unord " 2 "optimized" } } */ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Canonicalize X u< X to UNORDERED_EXPR 2016-04-30 18:45 Canonicalize X u< X to UNORDERED_EXPR Marc Glisse @ 2016-05-02 8:37 ` Richard Biener 2016-05-02 9:19 ` Marc Glisse 0 siblings, 1 reply; 22+ messages in thread From: Richard Biener @ 2016-05-02 8:37 UTC (permalink / raw) To: Marc Glisse; +Cc: GCC Patches On Sat, Apr 30, 2016 at 8:44 PM, Marc Glisse <marc.glisse@inria.fr> wrote: > Hello, > > this case seemed to be missing in the various X cmp X transformations. It > does not change the generated code in the testcase. > > The missing :c is rather trivial. I can commit it separately if you prefer. I think it's not missing. Commutating the first one is enough to eventually make the @1s match up. I think you should get a diagnostic on a duplicate pattern when adding another :c (hmm, no, it's indeed "different" patterns but still redundant). > Bootstrap+regtest on powerpc64le-unknown-linux-gnu. Ok for the new pattern. Thanks, Richard. > 2016-05-02 Marc Glisse <marc.glisse@inria.fr> > > gcc/ > * match.pd ((A & B) OP (C & B)): Mark '&' as commutative. > (X u< X, X u> X): New transformations > > gcc/testsuite/ > * gcc.dg/tree-ssa/unord.c: New testcase. > > -- > Marc Glisse > Index: trunk/gcc/match.pd > =================================================================== > --- trunk/gcc/match.pd (revision 235654) > +++ trunk/gcc/match.pd (working copy) > @@ -783,21 +783,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > @0) > /* (~x | y) & x -> x & y */ > /* (~x & y) | x -> x | y */ > (simplify > (bitop:c (rbitop:c (bit_not @0) @1) @0) > (bitop @0 @1))) > > /* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */ > (for bitop (bit_and bit_ior bit_xor) > (simplify > - (bitop (bit_and:c @0 @1) (bit_and @2 @1)) > + (bitop (bit_and:c @0 @1) (bit_and:c @2 @1)) > (bit_and (bitop @0 @2) @1))) > > /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */ > (simplify > (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2) > (bit_ior (bit_and @0 @2) (bit_and @1 @2))) > > /* Combine successive equal operations with constants. */ > (for bitop (bit_and bit_ior bit_xor) > (simplify > @@ -1914,20 +1914,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > (simplify > (cmp @0 @0) > (if (cmp != NE_EXPR > || ! FLOAT_TYPE_P (TREE_TYPE (@0)) > || ! HONOR_NANS (@0)) > { constant_boolean_node (false, type); }))) > (for cmp (unle unge uneq) > (simplify > (cmp @0 @0) > { constant_boolean_node (true, type); })) > +(for cmp (unlt ungt) > + (simplify > + (cmp @0 @0) > + (unordered @0 @0))) > (simplify > (ltgt @0 @0) > (if (!flag_trapping_math) > { constant_boolean_node (false, type); })) > > /* Fold ~X op ~Y as Y op X. */ > (for cmp (simple_comparison) > (simplify > (cmp (bit_not@2 @0) (bit_not@3 @1)) > (if (single_use (@2) && single_use (@3)) > Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c > =================================================================== > --- trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (revision 0) > +++ trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (working copy) > @@ -0,0 +1,7 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O -fdump-tree-optimized" } */ > + > +int f(double a){double b=a;return !__builtin_islessequal(a,b);} > +int g(double a){double b=a;return !__builtin_isgreaterequal(a,b);} > + > +/* { dg-final { scan-tree-dump-times " unord " 2 "optimized" } } */ > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Canonicalize X u< X to UNORDERED_EXPR 2016-05-02 8:37 ` Richard Biener @ 2016-05-02 9:19 ` Marc Glisse 2016-05-02 9:45 ` Richard Biener 0 siblings, 1 reply; 22+ messages in thread From: Marc Glisse @ 2016-05-02 9:19 UTC (permalink / raw) To: Richard Biener; +Cc: GCC Patches On Mon, 2 May 2016, Richard Biener wrote: > On Sat, Apr 30, 2016 at 8:44 PM, Marc Glisse <marc.glisse@inria.fr> wrote: >> Hello, >> >> this case seemed to be missing in the various X cmp X transformations. It >> does not change the generated code in the testcase. >> >> The missing :c is rather trivial. I can commit it separately if you prefer. > > I think it's not missing. Commutating the first one is enough to eventually > make the @1s match up. I think you should get a diagnostic on a duplicate > pattern when adding another :c (hmm, no, it's indeed "different" patterns but > still redundant). Let's see. The current pattern is (bitop (bit_and:c @0 @1) (bit_and @2 @1)) This matches: (X & Y) ^ (Z & Y) (Y & X) ^ (Z & Y) If I have for instance (Y & X) ^ (Y & Z), I don't see how that is going to match, and indeed we don't simplify that. On the other hand, if I have bit_ior instead of bit_xor, then we have another very similar transformation a 100 lines up in match.pd (for op (bit_and bit_ior) rop (bit_ior bit_and) (simplify (op (convert? (rop:c @0 @1)) (convert? (rop @0 @2))) That one also commutes only one, but starting from a different match. We should probably reorganize them a bit. >> Bootstrap+regtest on powerpc64le-unknown-linux-gnu. > > Ok for the new pattern. > > Thanks, > Richard. > >> 2016-05-02 Marc Glisse <marc.glisse@inria.fr> >> >> gcc/ >> * match.pd ((A & B) OP (C & B)): Mark '&' as commutative. >> (X u< X, X u> X): New transformations >> >> gcc/testsuite/ >> * gcc.dg/tree-ssa/unord.c: New testcase. >> >> -- >> Marc Glisse >> Index: trunk/gcc/match.pd >> =================================================================== >> --- trunk/gcc/match.pd (revision 235654) >> +++ trunk/gcc/match.pd (working copy) >> @@ -783,21 +783,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) >> @0) >> /* (~x | y) & x -> x & y */ >> /* (~x & y) | x -> x | y */ >> (simplify >> (bitop:c (rbitop:c (bit_not @0) @1) @0) >> (bitop @0 @1))) >> >> /* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */ >> (for bitop (bit_and bit_ior bit_xor) >> (simplify >> - (bitop (bit_and:c @0 @1) (bit_and @2 @1)) >> + (bitop (bit_and:c @0 @1) (bit_and:c @2 @1)) >> (bit_and (bitop @0 @2) @1))) >> >> /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */ >> (simplify >> (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2) >> (bit_ior (bit_and @0 @2) (bit_and @1 @2))) >> >> /* Combine successive equal operations with constants. */ >> (for bitop (bit_and bit_ior bit_xor) >> (simplify >> @@ -1914,20 +1914,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) >> (simplify >> (cmp @0 @0) >> (if (cmp != NE_EXPR >> || ! FLOAT_TYPE_P (TREE_TYPE (@0)) >> || ! HONOR_NANS (@0)) >> { constant_boolean_node (false, type); }))) >> (for cmp (unle unge uneq) >> (simplify >> (cmp @0 @0) >> { constant_boolean_node (true, type); })) >> +(for cmp (unlt ungt) >> + (simplify >> + (cmp @0 @0) >> + (unordered @0 @0))) >> (simplify >> (ltgt @0 @0) >> (if (!flag_trapping_math) >> { constant_boolean_node (false, type); })) >> >> /* Fold ~X op ~Y as Y op X. */ >> (for cmp (simple_comparison) >> (simplify >> (cmp (bit_not@2 @0) (bit_not@3 @1)) >> (if (single_use (@2) && single_use (@3)) >> Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c >> =================================================================== >> --- trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (revision 0) >> +++ trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (working copy) >> @@ -0,0 +1,7 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-O -fdump-tree-optimized" } */ >> + >> +int f(double a){double b=a;return !__builtin_islessequal(a,b);} >> +int g(double a){double b=a;return !__builtin_isgreaterequal(a,b);} >> + >> +/* { dg-final { scan-tree-dump-times " unord " 2 "optimized" } } */ -- Marc Glisse ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Canonicalize X u< X to UNORDERED_EXPR 2016-05-02 9:19 ` Marc Glisse @ 2016-05-02 9:45 ` Richard Biener 2016-05-03 6:37 ` Marc Glisse 0 siblings, 1 reply; 22+ messages in thread From: Richard Biener @ 2016-05-02 9:45 UTC (permalink / raw) To: Marc Glisse; +Cc: GCC Patches On Mon, May 2, 2016 at 11:18 AM, Marc Glisse <marc.glisse@inria.fr> wrote: > On Mon, 2 May 2016, Richard Biener wrote: > >> On Sat, Apr 30, 2016 at 8:44 PM, Marc Glisse <marc.glisse@inria.fr> wrote: >>> >>> Hello, >>> >>> this case seemed to be missing in the various X cmp X transformations. It >>> does not change the generated code in the testcase. >>> >>> The missing :c is rather trivial. I can commit it separately if you >>> prefer. >> >> >> I think it's not missing. Commutating the first one is enough to >> eventually >> make the @1s match up. I think you should get a diagnostic on a duplicate >> pattern when adding another :c (hmm, no, it's indeed "different" patterns >> but >> still redundant). > > > Let's see. The current pattern is > (bitop (bit_and:c @0 @1) (bit_and @2 @1)) > > This matches: > (X & Y) ^ (Z & Y) > (Y & X) ^ (Z & Y) > > If I have for instance (Y & X) ^ (Y & Z), I don't see how that is going to > match, and indeed we don't simplify that. Hmm, indeed. I might have wrongly ommitted some :c then in other places as well. So the original patch is ok. > On the other hand, if I have > bit_ior instead of bit_xor, then we have another very similar transformation > a 100 lines up in match.pd > > (for op (bit_and bit_ior) > rop (bit_ior bit_and) > (simplify > (op (convert? (rop:c @0 @1)) (convert? (rop @0 @2))) > > That one also commutes only one, but starting from a different match. We > should probably reorganize them a bit. Yeah. Richard. > >>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu. >> >> >> Ok for the new pattern. >> >> Thanks, >> Richard. >> >>> 2016-05-02 Marc Glisse <marc.glisse@inria.fr> >>> >>> gcc/ >>> * match.pd ((A & B) OP (C & B)): Mark '&' as commutative. >>> (X u< X, X u> X): New transformations >>> >>> gcc/testsuite/ >>> * gcc.dg/tree-ssa/unord.c: New testcase. >>> >>> -- >>> Marc Glisse >>> Index: trunk/gcc/match.pd >>> =================================================================== >>> --- trunk/gcc/match.pd (revision 235654) >>> +++ trunk/gcc/match.pd (working copy) >>> @@ -783,21 +783,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) >>> @0) >>> /* (~x | y) & x -> x & y */ >>> /* (~x & y) | x -> x | y */ >>> (simplify >>> (bitop:c (rbitop:c (bit_not @0) @1) @0) >>> (bitop @0 @1))) >>> >>> /* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */ >>> (for bitop (bit_and bit_ior bit_xor) >>> (simplify >>> - (bitop (bit_and:c @0 @1) (bit_and @2 @1)) >>> + (bitop (bit_and:c @0 @1) (bit_and:c @2 @1)) >>> (bit_and (bitop @0 @2) @1))) >>> >>> /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */ >>> (simplify >>> (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2) >>> (bit_ior (bit_and @0 @2) (bit_and @1 @2))) >>> >>> /* Combine successive equal operations with constants. */ >>> (for bitop (bit_and bit_ior bit_xor) >>> (simplify >>> @@ -1914,20 +1914,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) >>> (simplify >>> (cmp @0 @0) >>> (if (cmp != NE_EXPR >>> || ! FLOAT_TYPE_P (TREE_TYPE (@0)) >>> || ! HONOR_NANS (@0)) >>> { constant_boolean_node (false, type); }))) >>> (for cmp (unle unge uneq) >>> (simplify >>> (cmp @0 @0) >>> { constant_boolean_node (true, type); })) >>> +(for cmp (unlt ungt) >>> + (simplify >>> + (cmp @0 @0) >>> + (unordered @0 @0))) >>> (simplify >>> (ltgt @0 @0) >>> (if (!flag_trapping_math) >>> { constant_boolean_node (false, type); })) >>> >>> /* Fold ~X op ~Y as Y op X. */ >>> (for cmp (simple_comparison) >>> (simplify >>> (cmp (bit_not@2 @0) (bit_not@3 @1)) >>> (if (single_use (@2) && single_use (@3)) >>> Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c >>> =================================================================== >>> --- trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (revision 0) >>> +++ trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (working copy) >>> @@ -0,0 +1,7 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-O -fdump-tree-optimized" } */ >>> + >>> +int f(double a){double b=a;return !__builtin_islessequal(a,b);} >>> +int g(double a){double b=a;return !__builtin_isgreaterequal(a,b);} >>> + >>> +/* { dg-final { scan-tree-dump-times " unord " 2 "optimized" } } */ > > > -- > Marc Glisse ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Canonicalize X u< X to UNORDERED_EXPR 2016-05-02 9:45 ` Richard Biener @ 2016-05-03 6:37 ` Marc Glisse 2016-05-03 11:03 ` Richard Biener 0 siblings, 1 reply; 22+ messages in thread From: Marc Glisse @ 2016-05-03 6:37 UTC (permalink / raw) To: Richard Biener; +Cc: GCC Patches [-- Attachment #1: Type: TEXT/PLAIN, Size: 535 bytes --] This removes the duplication. I also removed the case (A&B)&(A&C) which is handled by reassoc. And I need 2 NOP checks, for the case where @0 is a constant (that couldn't happen before my patch because canonicalization would put the constant as second operand). Bootstrap+regtest on powerpc64le-unknown-linux-gnu. 2016-05-03 Marc Glisse <marc.glisse@inria.fr> * match.pd ((A | B) & (A | C)): Generalize to BIT_XOR_EXPR. Mark as commutative. Check both conversions are NOP. ((A & B) OP (C & B)): Remove. -- Marc Glisse [-- Attachment #2: Type: TEXT/PLAIN, Size: 2135 bytes --] Index: gcc/match.pd =================================================================== --- gcc/match.pd (revision 235764) +++ gcc/match.pd (working copy) @@ -678,25 +678,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (simplify (bit_xor:c (bit_and:c @0 @1) @1) (bit_and (bit_not @0) @1)) /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both operands are another bit-wise operation with a common input. If so, distribute the bit operations to save an operation and possibly two if constants are involved. For example, convert (A | B) & (A | C) into A | (B & C) Further simplification will occur if B and C are constants. */ -(for op (bit_and bit_ior) - rop (bit_ior bit_and) +(for op (bit_and bit_ior bit_xor) + rop (bit_ior bit_and bit_and) (simplify - (op (convert? (rop:c @0 @1)) (convert? (rop @0 @2))) - (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) + (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2))) + (if (tree_nop_conversion_p (type, TREE_TYPE (@1)) + && tree_nop_conversion_p (type, TREE_TYPE (@2))) (rop (convert @0) (op (convert @1) (convert @2)))))) (simplify (abs (abs@1 @0)) @1) (simplify (abs (negate @0)) (abs @0)) (simplify @@ -780,26 +781,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* (x & y) | x -> x */ (simplify (bitop:c (rbitop:c @0 @1) @0) @0) /* (~x | y) & x -> x & y */ /* (~x & y) | x -> x | y */ (simplify (bitop:c (rbitop:c (bit_not @0) @1) @0) (bitop @0 @1))) -/* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */ -(for bitop (bit_and bit_ior bit_xor) - (simplify - (bitop (bit_and:c @0 @1) (bit_and @2 @1)) - (bit_and (bitop @0 @2) @1))) - /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */ (simplify (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2) (bit_ior (bit_and @0 @2) (bit_and @1 @2))) /* Combine successive equal operations with constants. */ (for bitop (bit_and bit_ior bit_xor) (simplify (bitop (bitop @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2) (bitop @0 (bitop @1 @2)))) ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Canonicalize X u< X to UNORDERED_EXPR 2016-05-03 6:37 ` Marc Glisse @ 2016-05-03 11:03 ` Richard Biener 2016-05-03 13:27 ` Marc Glisse 0 siblings, 1 reply; 22+ messages in thread From: Richard Biener @ 2016-05-03 11:03 UTC (permalink / raw) To: Marc Glisse; +Cc: GCC Patches On Tue, May 3, 2016 at 8:36 AM, Marc Glisse <marc.glisse@inria.fr> wrote: > This removes the duplication. I also removed the case (A&B)&(A&C) which is > handled by reassoc. And I need 2 NOP checks, for the case where @0 is a > constant (that couldn't happen before my patch because canonicalization > would put the constant as second operand). Nicely spotted. Not sure we want to delay (A&B)&(A&C) until re-assoc. We have many patterns that reassoc would also catch, like (A + CST) + CST or (A + B)- A, albeit reassoc only handles the unsigned cases. > Bootstrap+regtest on powerpc64le-unknown-linux-gnu. Ok. Thanks, Richard. > 2016-05-03 Marc Glisse <marc.glisse@inria.fr> > > * match.pd ((A | B) & (A | C)): Generalize to BIT_XOR_EXPR. Mark > as commutative. Check both conversions are NOP. > ((A & B) OP (C & B)): Remove. > > -- > Marc Glisse > Index: gcc/match.pd > =================================================================== > --- gcc/match.pd (revision 235764) > +++ gcc/match.pd (working copy) > @@ -678,25 +678,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > (simplify > (bit_xor:c (bit_and:c @0 @1) @1) > (bit_and (bit_not @0) @1)) > > /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both > operands are another bit-wise operation with a common input. If so, > distribute the bit operations to save an operation and possibly two if > constants are involved. For example, convert > (A | B) & (A | C) into A | (B & C) > Further simplification will occur if B and C are constants. */ > -(for op (bit_and bit_ior) > - rop (bit_ior bit_and) > +(for op (bit_and bit_ior bit_xor) > + rop (bit_ior bit_and bit_and) > (simplify > - (op (convert? (rop:c @0 @1)) (convert? (rop @0 @2))) > - (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) > + (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2))) > + (if (tree_nop_conversion_p (type, TREE_TYPE (@1)) > + && tree_nop_conversion_p (type, TREE_TYPE (@2))) > (rop (convert @0) (op (convert @1) (convert @2)))))) > > > (simplify > (abs (abs@1 @0)) > @1) > (simplify > (abs (negate @0)) > (abs @0)) > (simplify > @@ -780,26 +781,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > /* (x & y) | x -> x */ > (simplify > (bitop:c (rbitop:c @0 @1) @0) > @0) > /* (~x | y) & x -> x & y */ > /* (~x & y) | x -> x | y */ > (simplify > (bitop:c (rbitop:c (bit_not @0) @1) @0) > (bitop @0 @1))) > > -/* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */ > -(for bitop (bit_and bit_ior bit_xor) > - (simplify > - (bitop (bit_and:c @0 @1) (bit_and @2 @1)) > - (bit_and (bitop @0 @2) @1))) > - > /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */ > (simplify > (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2) > (bit_ior (bit_and @0 @2) (bit_and @1 @2))) > > /* Combine successive equal operations with constants. */ > (for bitop (bit_and bit_ior bit_xor) > (simplify > (bitop (bitop @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2) > (bitop @0 (bitop @1 @2)))) > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Canonicalize X u< X to UNORDERED_EXPR 2016-05-03 11:03 ` Richard Biener @ 2016-05-03 13:27 ` Marc Glisse 2016-05-03 13:34 ` Richard Biener 0 siblings, 1 reply; 22+ messages in thread From: Marc Glisse @ 2016-05-03 13:27 UTC (permalink / raw) To: Richard Biener; +Cc: GCC Patches On Tue, 3 May 2016, Richard Biener wrote: > On Tue, May 3, 2016 at 8:36 AM, Marc Glisse <marc.glisse@inria.fr> wrote: >> This removes the duplication. I also removed the case (A&B)&(A&C) which is >> handled by reassoc. And I need 2 NOP checks, for the case where @0 is a >> constant (that couldn't happen before my patch because canonicalization >> would put the constant as second operand). > > Nicely spotted. Not sure we want to delay (A&B)&(A&C) until re-assoc. We have > many patterns that reassoc would also catch, like (A + CST) + CST or (A + B)- A, > albeit reassoc only handles the unsigned cases. (A&B)&A seems simple enough for match.pd, I thought (A&B)&(A&C) was starting to be a bit specialized. I could easily add it back (making it work with | at the same time), but then I am not convinced A&(B&C) is the best output. If A&B or A&C have several uses, then (A&B)&C or B&(A&C) seem preferable (and we would still have a transformation for (A&cst1)&cst2 so we wouldn't lose in the case where B and C are constants). We may still end up having to add some :s to the transformation I just touched. -- Marc Glisse ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Canonicalize X u< X to UNORDERED_EXPR 2016-05-03 13:27 ` Marc Glisse @ 2016-05-03 13:34 ` Richard Biener 2016-05-06 11:50 ` Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) Marc Glisse 0 siblings, 1 reply; 22+ messages in thread From: Richard Biener @ 2016-05-03 13:34 UTC (permalink / raw) To: Marc Glisse; +Cc: GCC Patches On Tue, May 3, 2016 at 3:26 PM, Marc Glisse <marc.glisse@inria.fr> wrote: > On Tue, 3 May 2016, Richard Biener wrote: > >> On Tue, May 3, 2016 at 8:36 AM, Marc Glisse <marc.glisse@inria.fr> wrote: >>> >>> This removes the duplication. I also removed the case (A&B)&(A&C) which >>> is >>> handled by reassoc. And I need 2 NOP checks, for the case where @0 is a >>> constant (that couldn't happen before my patch because canonicalization >>> would put the constant as second operand). >> >> >> Nicely spotted. Not sure we want to delay (A&B)&(A&C) until re-assoc. We >> have >> many patterns that reassoc would also catch, like (A + CST) + CST or (A + >> B)- A, >> albeit reassoc only handles the unsigned cases. > > > (A&B)&A seems simple enough for match.pd, I thought (A&B)&(A&C) was starting > to be a bit specialized. I could easily add it back (making it work with | > at the same time), but then I am not convinced A&(B&C) is the best output. > If A&B or A&C have several uses, then (A&B)&C or B&(A&C) seem preferable > (and we would still have a transformation for (A&cst1)&cst2 so we wouldn't > lose in the case where B and C are constants). We may still end up having to > add some :s to the transformation I just touched. Yeah, these are always tricky questions. Note that re-assoc won't handle the case with multi-use A&C or A&B. The only reason to care for the single-use case is when we change operations for the mixed operand cases. For the all-&| case there is only the (usual) consideration about SSA lifetime extension. So maybe it makes sense to split out the all-&| cases. Richard. > -- > Marc Glisse ^ permalink raw reply [flat|nested] 22+ messages in thread
* Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) 2016-05-03 13:34 ` Richard Biener @ 2016-05-06 11:50 ` Marc Glisse 2016-05-08 20:49 ` Marc Glisse ` (2 more replies) 0 siblings, 3 replies; 22+ messages in thread From: Marc Glisse @ 2016-05-06 11:50 UTC (permalink / raw) To: Richard Biener; +Cc: GCC Patches [-- Attachment #1: Type: TEXT/PLAIN, Size: 2870 bytes --] On Tue, 3 May 2016, Richard Biener wrote: > On Tue, May 3, 2016 at 3:26 PM, Marc Glisse <marc.glisse@inria.fr> wrote: >> On Tue, 3 May 2016, Richard Biener wrote: >> >>> On Tue, May 3, 2016 at 8:36 AM, Marc Glisse <marc.glisse@inria.fr> wrote: >>>> >>>> This removes the duplication. I also removed the case (A&B)&(A&C) which >>>> is >>>> handled by reassoc. And I need 2 NOP checks, for the case where @0 is a >>>> constant (that couldn't happen before my patch because canonicalization >>>> would put the constant as second operand). >>> >>> >>> Nicely spotted. Not sure we want to delay (A&B)&(A&C) until re-assoc. We >>> have >>> many patterns that reassoc would also catch, like (A + CST) + CST or (A + >>> B)- A, >>> albeit reassoc only handles the unsigned cases. >> >> >> (A&B)&A seems simple enough for match.pd, I thought (A&B)&(A&C) was starting >> to be a bit specialized. I could easily add it back (making it work with | >> at the same time), but then I am not convinced A&(B&C) is the best output. >> If A&B or A&C have several uses, then (A&B)&C or B&(A&C) seem preferable >> (and we would still have a transformation for (A&cst1)&cst2 so we wouldn't >> lose in the case where B and C are constants). We may still end up having to >> add some :s to the transformation I just touched. > > Yeah, these are always tricky questions. Note that re-assoc won't > handle the case > with multi-use A&C or A&B. The only reason to care for the single-use case is > when we change operations for the mixed operand cases. For the all-&| case > there is only the (usual) consideration about SSA lifetime extension. > > So maybe it makes sense to split out the all-&| cases. Here they are. I did (X&Y)&X and (X&Y)&(X&Z). The next one would be ((X&Y)&Z)&X, but at some point we have to defer to reassoc. I didn't add the convert?+tree_nop_conversion_p to the existing transform I modified. I guess at some point we should make a pass and add them to all the transformations on bit operations... For (X & Y) & Y, I believe that any conversion is fine. For the others, tree_nop_conversion_p is probably too strict (narrowing should be fine for all), but I was too lazy to look for tighter conditions. (X ^ Y) ^ Y -> X should probably have (non_lvalue ...) on its output, but in a simple test it didn't seem to matter. Is non_lvalue still needed? Bootstrap+regtest on powerpc64le-unknown-linux-gnu. 2016-05-06 Marc Glisse <marc.glisse@inria.fr> gcc/ * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge with... * match.pd ((X & Y) ^ Y): ... this. ((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y) | (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations. gcc/testsuite/ * gcc.dg/tree-ssa/bit-assoc.c: New testcase. * gcc.dg/tree-ssa/pr69270.c: Adjust. * gcc.dg/tree-ssa/vrp59.c: Disable forwprop. -- Marc Glisse [-- Attachment #2: Type: TEXT/PLAIN, Size: 8281 bytes --] Index: gcc/fold-const.c =================================================================== --- gcc/fold-const.c (revision 235933) +++ gcc/fold-const.c (working copy) @@ -10063,59 +10063,20 @@ fold_binary_loc (location_t loc, } /* Fold !X & 1 as X == 0. */ if (TREE_CODE (arg0) == TRUTH_NOT_EXPR && integer_onep (arg1)) { tem = TREE_OPERAND (arg0, 0); return fold_build2_loc (loc, EQ_EXPR, type, tem, build_zero_cst (TREE_TYPE (tem))); } - /* Fold (X ^ Y) & Y as ~X & Y. */ - if (TREE_CODE (arg0) == BIT_XOR_EXPR - && operand_equal_p (TREE_OPERAND (arg0, 1), arg1, 0)) - { - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0)); - return fold_build2_loc (loc, BIT_AND_EXPR, type, - fold_build1_loc (loc, BIT_NOT_EXPR, type, tem), - fold_convert_loc (loc, type, arg1)); - } - /* Fold (X ^ Y) & X as ~Y & X. */ - if (TREE_CODE (arg0) == BIT_XOR_EXPR - && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0) - && reorder_operands_p (TREE_OPERAND (arg0, 1), arg1)) - { - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1)); - return fold_build2_loc (loc, BIT_AND_EXPR, type, - fold_build1_loc (loc, BIT_NOT_EXPR, type, tem), - fold_convert_loc (loc, type, arg1)); - } - /* Fold X & (X ^ Y) as X & ~Y. */ - if (TREE_CODE (arg1) == BIT_XOR_EXPR - && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0)) - { - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1)); - return fold_build2_loc (loc, BIT_AND_EXPR, type, - fold_convert_loc (loc, type, arg0), - fold_build1_loc (loc, BIT_NOT_EXPR, type, tem)); - } - /* Fold X & (Y ^ X) as ~Y & X. */ - if (TREE_CODE (arg1) == BIT_XOR_EXPR - && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0) - && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0))) - { - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0)); - return fold_build2_loc (loc, BIT_AND_EXPR, type, - fold_build1_loc (loc, BIT_NOT_EXPR, type, tem), - fold_convert_loc (loc, type, arg0)); - } - /* Fold (X * Y) & -(1 << CST) to X * Y if Y is a constant multiple of 1 << CST. */ if (TREE_CODE (arg1) == INTEGER_CST) { wide_int cst1 = arg1; wide_int ncst1 = -cst1; if ((cst1 & ncst1) == ncst1 && multiple_of_p (type, arg0, wide_int_to_tree (TREE_TYPE (arg1), ncst1))) return fold_convert_loc (loc, type, arg0); Index: gcc/match.pd =================================================================== --- gcc/match.pd (revision 235933) +++ gcc/match.pd (working copy) @@ -667,39 +667,70 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (tree_nop_conversion_p (type, TREE_TYPE (@0)) && tree_nop_conversion_p (type, TREE_TYPE (@1))) (bit_xor (convert @0) (convert @1)))) /* Convert ~X ^ C to X ^ ~C. */ (simplify (bit_xor (convert? (bit_not @0)) INTEGER_CST@1) (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) (bit_xor (convert @0) (bit_not @1)))) -/* Fold (X & Y) ^ Y as ~X & Y. */ -(simplify - (bit_xor:c (bit_and:c @0 @1) @1) - (bit_and (bit_not @0) @1)) +/* Fold (X & Y) ^ Y and (X ^ Y) & Y as ~X & Y. */ +(for opo (bit_and bit_xor) + opi (bit_xor bit_and) + (simplify + (opo:c (opi:c @0 @1) @1) + (bit_and (bit_not @0) @1))) /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both operands are another bit-wise operation with a common input. If so, distribute the bit operations to save an operation and possibly two if constants are involved. For example, convert (A | B) & (A | C) into A | (B & C) Further simplification will occur if B and C are constants. */ (for op (bit_and bit_ior bit_xor) rop (bit_ior bit_and bit_and) (simplify (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2))) (if (tree_nop_conversion_p (type, TREE_TYPE (@1)) && tree_nop_conversion_p (type, TREE_TYPE (@2))) (rop (convert @0) (op (convert @1) (convert @2)))))) +/* Some simple reassociation for bit operations, also handled in reassoc. */ +/* (X & Y) & Y -> X & Y + (X | Y) | Y -> X | Y */ +(for op (bit_and bit_ior) + (simplify + (op:c (convert?@2 (op:c @0 @1)) (convert? @1)) + @2)) +/* (X ^ Y) ^ Y -> X */ +(simplify + (bit_xor:c (convert? (bit_xor:c @0 @1)) (convert? @1)) + (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) + (convert @0))) +/* (X & Y) & (X & Z) -> (X & Y) & Z + (X | Y) | (X | Z) -> (X | Y) | Z */ +(for op (bit_and bit_ior) + (simplify + (op:c (convert?@3 (op:c@4 @0 @1)) (convert?@5 (op:c@6 @0 @2))) + (if (tree_nop_conversion_p (type, TREE_TYPE (@1)) + && tree_nop_conversion_p (type, TREE_TYPE (@2))) + (if (single_use (@5) && single_use (@6)) + (op @3 (convert @2)) + (if (single_use (@3) && single_use (@4)) + (op (convert @1) @5)))))) +/* (X ^ Y) ^ (X ^ Z) -> Y ^ Z */ +(simplify + (bit_xor (convert? (bit_xor:c @0 @1)) (convert? (bit_xor:c @0 @2))) + (if (tree_nop_conversion_p (type, TREE_TYPE (@1)) + && tree_nop_conversion_p (type, TREE_TYPE (@2))) + (convert (bit_xor @1 @2)))) (simplify (abs (abs@1 @0)) @1) (simplify (abs (negate @0)) (abs @0)) (simplify (abs tree_expr_nonnegative_p@0) @0) Index: gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (working copy) @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-forwprop1-details -fdump-tree-ccp1-details" } */ + +int f1(int a, int b){ + int c = a & b; + return c & b; +} +int f2(int a, int b){ + int c = a | b; + return b | c; +} +int g1(int a, int b, int c){ + int d = a & b; + int e = b & c; + return d & e; +} +int g2(int a, int b, int c){ + int d = a | b; + int e = c | b; + return d | e; +} +int g3(int a, int b, int c){ + int d = a ^ b; + int e = b ^ c; + return e ^ d; +} + +/* { dg-final { scan-tree-dump-times "Match-and-simplified" 2 "ccp1" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified" 3 "forwprop1" } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/pr69270.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (revision 235933) +++ gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (working copy) @@ -1,21 +1,23 @@ /* { dg-do compile } */ /* { dg-options "-O2 -fsplit-paths -fdump-tree-dom3-details" } */ /* There should be two references to bufferstep that turn into constants. */ /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with constant .0." 1 "dom3"} } */ /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with constant .1." 1 "dom3"} } */ /* And some assignments ought to fold down to constants. */ -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 2 "dom3"} } */ -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 2 "dom3"} } */ +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -1;" 1 "dom3"} } */ +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -2;" 1 "dom3"} } */ +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 1 "dom3"} } */ +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 1 "dom3"} } */ /* The XOR operations should have been optimized to constants. */ /* { dg-final { scan-tree-dump-not "bit_xor" "dom3"} } */ extern int *stepsizeTable; void adpcm_coder (signed char *outdata, int len) { Index: gcc/testsuite/gcc.dg/tree-ssa/vrp59.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (revision 235933) +++ gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (working copy) @@ -1,12 +1,12 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp1" } */ +/* { dg-options "-O2 -fno-tree-ccp -fno-tree-forwprop -fdump-tree-vrp1" } */ int f(int x) { if (x >= 0 && x <= 3) { x = x ^ 3; x = x & 3; } return x; } ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) 2016-05-06 11:50 ` Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) Marc Glisse @ 2016-05-08 20:49 ` Marc Glisse 2016-05-09 10:04 ` Richard Biener 2016-05-10 6:12 ` Marc Glisse 2 siblings, 0 replies; 22+ messages in thread From: Marc Glisse @ 2016-05-08 20:49 UTC (permalink / raw) To: Richard Biener; +Cc: GCC Patches On Fri, 6 May 2016, Marc Glisse wrote: > 2016-05-06 Marc Glisse <marc.glisse@inria.fr> > > gcc/ > * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge with... > * match.pd ((X & Y) ^ Y): ... this. > ((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y) > | (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations. > > gcc/testsuite/ > * gcc.dg/tree-ssa/bit-assoc.c: New testcase. > * gcc.dg/tree-ssa/pr69270.c: Adjust. > * gcc.dg/tree-ssa/vrp59.c: Disable forwprop. Oups, I forgot about convert1/convert2 in match.pd, which may be relevant especially if @0 is a constant in the last transforms. Please ignore the patch for now, I'll resend after checking it. -- Marc Glisse ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) 2016-05-06 11:50 ` Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) Marc Glisse 2016-05-08 20:49 ` Marc Glisse @ 2016-05-09 10:04 ` Richard Biener 2016-05-10 6:12 ` Marc Glisse 2 siblings, 0 replies; 22+ messages in thread From: Richard Biener @ 2016-05-09 10:04 UTC (permalink / raw) To: Marc Glisse; +Cc: GCC Patches On Fri, May 6, 2016 at 1:50 PM, Marc Glisse <marc.glisse@inria.fr> wrote: > On Tue, 3 May 2016, Richard Biener wrote: > >> On Tue, May 3, 2016 at 3:26 PM, Marc Glisse <marc.glisse@inria.fr> wrote: >>> >>> On Tue, 3 May 2016, Richard Biener wrote: >>> >>>> On Tue, May 3, 2016 at 8:36 AM, Marc Glisse <marc.glisse@inria.fr> >>>> wrote: >>>>> >>>>> >>>>> This removes the duplication. I also removed the case (A&B)&(A&C) which >>>>> is >>>>> handled by reassoc. And I need 2 NOP checks, for the case where @0 is a >>>>> constant (that couldn't happen before my patch because canonicalization >>>>> would put the constant as second operand). >>>> >>>> >>>> >>>> Nicely spotted. Not sure we want to delay (A&B)&(A&C) until re-assoc. >>>> We >>>> have >>>> many patterns that reassoc would also catch, like (A + CST) + CST or (A >>>> + >>>> B)- A, >>>> albeit reassoc only handles the unsigned cases. >>> >>> >>> >>> (A&B)&A seems simple enough for match.pd, I thought (A&B)&(A&C) was >>> starting >>> to be a bit specialized. I could easily add it back (making it work with >>> | >>> at the same time), but then I am not convinced A&(B&C) is the best >>> output. >>> If A&B or A&C have several uses, then (A&B)&C or B&(A&C) seem preferable >>> (and we would still have a transformation for (A&cst1)&cst2 so we >>> wouldn't >>> lose in the case where B and C are constants). We may still end up having >>> to >>> add some :s to the transformation I just touched. >> >> >> Yeah, these are always tricky questions. Note that re-assoc won't >> handle the case >> with multi-use A&C or A&B. The only reason to care for the single-use >> case is >> when we change operations for the mixed operand cases. For the all-&| >> case >> there is only the (usual) consideration about SSA lifetime extension. >> >> So maybe it makes sense to split out the all-&| cases. > > > Here they are. I did (X&Y)&X and (X&Y)&(X&Z). The next one would be > ((X&Y)&Z)&X, but at some point we have to defer to reassoc. > > I didn't add the convert?+tree_nop_conversion_p to the existing transform I > modified. I guess at some point we should make a pass and add them to all > the transformations on bit operations... > > For (X & Y) & Y, I believe that any conversion is fine. For the others, > tree_nop_conversion_p is probably too strict (narrowing should be fine for > all), but I was too lazy to look for tighter conditions. > > (X ^ Y) ^ Y -> X should probably have (non_lvalue ...) on its output, but in > a simple test it didn't seem to matter. Is non_lvalue still needed? I've only added them to patterns where the fold-const.c variant had it. In theory (hopefully) all of them are no longer necessary since C/C++ both do delayed folding and thus hopefully figure out lvalue-ness before entering fold ... Richard. > > Bootstrap+regtest on powerpc64le-unknown-linux-gnu. > > 2016-05-06 Marc Glisse <marc.glisse@inria.fr> > > gcc/ > * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge > with... > * match.pd ((X & Y) ^ Y): ... this. > ((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y) > | (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations. > > gcc/testsuite/ > * gcc.dg/tree-ssa/bit-assoc.c: New testcase. > * gcc.dg/tree-ssa/pr69270.c: Adjust. > * gcc.dg/tree-ssa/vrp59.c: Disable forwprop. > > -- > Marc Glisse > Index: gcc/fold-const.c > =================================================================== > --- gcc/fold-const.c (revision 235933) > +++ gcc/fold-const.c (working copy) > @@ -10063,59 +10063,20 @@ fold_binary_loc (location_t loc, > } > /* Fold !X & 1 as X == 0. */ > if (TREE_CODE (arg0) == TRUTH_NOT_EXPR > && integer_onep (arg1)) > { > tem = TREE_OPERAND (arg0, 0); > return fold_build2_loc (loc, EQ_EXPR, type, tem, > build_zero_cst (TREE_TYPE (tem))); > } > > - /* Fold (X ^ Y) & Y as ~X & Y. */ > - if (TREE_CODE (arg0) == BIT_XOR_EXPR > - && operand_equal_p (TREE_OPERAND (arg0, 1), arg1, 0)) > - { > - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0)); > - return fold_build2_loc (loc, BIT_AND_EXPR, type, > - fold_build1_loc (loc, BIT_NOT_EXPR, type, > tem), > - fold_convert_loc (loc, type, arg1)); > - } > - /* Fold (X ^ Y) & X as ~Y & X. */ > - if (TREE_CODE (arg0) == BIT_XOR_EXPR > - && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0) > - && reorder_operands_p (TREE_OPERAND (arg0, 1), arg1)) > - { > - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1)); > - return fold_build2_loc (loc, BIT_AND_EXPR, type, > - fold_build1_loc (loc, BIT_NOT_EXPR, type, > tem), > - fold_convert_loc (loc, type, arg1)); > - } > - /* Fold X & (X ^ Y) as X & ~Y. */ > - if (TREE_CODE (arg1) == BIT_XOR_EXPR > - && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0)) > - { > - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1)); > - return fold_build2_loc (loc, BIT_AND_EXPR, type, > - fold_convert_loc (loc, type, arg0), > - fold_build1_loc (loc, BIT_NOT_EXPR, type, > tem)); > - } > - /* Fold X & (Y ^ X) as ~Y & X. */ > - if (TREE_CODE (arg1) == BIT_XOR_EXPR > - && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0) > - && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0))) > - { > - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0)); > - return fold_build2_loc (loc, BIT_AND_EXPR, type, > - fold_build1_loc (loc, BIT_NOT_EXPR, type, > tem), > - fold_convert_loc (loc, type, arg0)); > - } > - > /* Fold (X * Y) & -(1 << CST) to X * Y if Y is a constant > multiple of 1 << CST. */ > if (TREE_CODE (arg1) == INTEGER_CST) > { > wide_int cst1 = arg1; > wide_int ncst1 = -cst1; > if ((cst1 & ncst1) == ncst1 > && multiple_of_p (type, arg0, > wide_int_to_tree (TREE_TYPE (arg1), ncst1))) > return fold_convert_loc (loc, type, arg0); > Index: gcc/match.pd > =================================================================== > --- gcc/match.pd (revision 235933) > +++ gcc/match.pd (working copy) > @@ -667,39 +667,70 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > (if (tree_nop_conversion_p (type, TREE_TYPE (@0)) > && tree_nop_conversion_p (type, TREE_TYPE (@1))) > (bit_xor (convert @0) (convert @1)))) > > /* Convert ~X ^ C to X ^ ~C. */ > (simplify > (bit_xor (convert? (bit_not @0)) INTEGER_CST@1) > (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) > (bit_xor (convert @0) (bit_not @1)))) > > -/* Fold (X & Y) ^ Y as ~X & Y. */ > -(simplify > - (bit_xor:c (bit_and:c @0 @1) @1) > - (bit_and (bit_not @0) @1)) > +/* Fold (X & Y) ^ Y and (X ^ Y) & Y as ~X & Y. */ > +(for opo (bit_and bit_xor) > + opi (bit_xor bit_and) > + (simplify > + (opo:c (opi:c @0 @1) @1) > + (bit_and (bit_not @0) @1))) > > /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both > operands are another bit-wise operation with a common input. If so, > distribute the bit operations to save an operation and possibly two if > constants are involved. For example, convert > (A | B) & (A | C) into A | (B & C) > Further simplification will occur if B and C are constants. */ > (for op (bit_and bit_ior bit_xor) > rop (bit_ior bit_and bit_and) > (simplify > (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2))) > (if (tree_nop_conversion_p (type, TREE_TYPE (@1)) > && tree_nop_conversion_p (type, TREE_TYPE (@2))) > (rop (convert @0) (op (convert @1) (convert @2)))))) > > +/* Some simple reassociation for bit operations, also handled in reassoc. > */ > +/* (X & Y) & Y -> X & Y > + (X | Y) | Y -> X | Y */ > +(for op (bit_and bit_ior) > + (simplify > + (op:c (convert?@2 (op:c @0 @1)) (convert? @1)) > + @2)) > +/* (X ^ Y) ^ Y -> X */ > +(simplify > + (bit_xor:c (convert? (bit_xor:c @0 @1)) (convert? @1)) > + (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) > + (convert @0))) > +/* (X & Y) & (X & Z) -> (X & Y) & Z > + (X | Y) | (X | Z) -> (X | Y) | Z */ > +(for op (bit_and bit_ior) > + (simplify > + (op:c (convert?@3 (op:c@4 @0 @1)) (convert?@5 (op:c@6 @0 @2))) > + (if (tree_nop_conversion_p (type, TREE_TYPE (@1)) > + && tree_nop_conversion_p (type, TREE_TYPE (@2))) > + (if (single_use (@5) && single_use (@6)) > + (op @3 (convert @2)) > + (if (single_use (@3) && single_use (@4)) > + (op (convert @1) @5)))))) > +/* (X ^ Y) ^ (X ^ Z) -> Y ^ Z */ > +(simplify > + (bit_xor (convert? (bit_xor:c @0 @1)) (convert? (bit_xor:c @0 @2))) > + (if (tree_nop_conversion_p (type, TREE_TYPE (@1)) > + && tree_nop_conversion_p (type, TREE_TYPE (@2))) > + (convert (bit_xor @1 @2)))) > > (simplify > (abs (abs@1 @0)) > @1) > (simplify > (abs (negate @0)) > (abs @0)) > (simplify > (abs tree_expr_nonnegative_p@0) > @0) > Index: gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c > =================================================================== > --- gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (revision 0) > +++ gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (working copy) > @@ -0,0 +1,29 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O -fdump-tree-forwprop1-details -fdump-tree-ccp1-details" > } */ > + > +int f1(int a, int b){ > + int c = a & b; > + return c & b; > +} > +int f2(int a, int b){ > + int c = a | b; > + return b | c; > +} > +int g1(int a, int b, int c){ > + int d = a & b; > + int e = b & c; > + return d & e; > +} > +int g2(int a, int b, int c){ > + int d = a | b; > + int e = c | b; > + return d | e; > +} > +int g3(int a, int b, int c){ > + int d = a ^ b; > + int e = b ^ c; > + return e ^ d; > +} > + > +/* { dg-final { scan-tree-dump-times "Match-and-simplified" 2 "ccp1" } } */ > +/* { dg-final { scan-tree-dump-times "gimple_simplified" 3 "forwprop1" } } > */ > Index: gcc/testsuite/gcc.dg/tree-ssa/pr69270.c > =================================================================== > --- gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (revision 235933) > +++ gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (working copy) > @@ -1,21 +1,23 @@ > /* { dg-do compile } */ > /* { dg-options "-O2 -fsplit-paths -fdump-tree-dom3-details" } */ > > /* There should be two references to bufferstep that turn into > constants. */ > /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with > constant .0." 1 "dom3"} } */ > /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with > constant .1." 1 "dom3"} } */ > > /* And some assignments ought to fold down to constants. */ > -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 2 "dom3"} > } */ > -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 2 "dom3"} > } */ > +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -1;" 1 "dom3"} > } */ > +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -2;" 1 "dom3"} > } */ > +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 1 "dom3"} > } */ > +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 1 "dom3"} > } */ > > /* The XOR operations should have been optimized to constants. */ > /* { dg-final { scan-tree-dump-not "bit_xor" "dom3"} } */ > > > extern int *stepsizeTable; > > void > adpcm_coder (signed char *outdata, int len) > { > Index: gcc/testsuite/gcc.dg/tree-ssa/vrp59.c > =================================================================== > --- gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (revision 235933) > +++ gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (working copy) > @@ -1,12 +1,12 @@ > /* { dg-do compile } */ > -/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp1" } */ > +/* { dg-options "-O2 -fno-tree-ccp -fno-tree-forwprop -fdump-tree-vrp1" } > */ > > int f(int x) > { > if (x >= 0 && x <= 3) > { > x = x ^ 3; > x = x & 3; > } > return x; > } > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) 2016-05-06 11:50 ` Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) Marc Glisse 2016-05-08 20:49 ` Marc Glisse 2016-05-09 10:04 ` Richard Biener @ 2016-05-10 6:12 ` Marc Glisse 2016-05-10 9:27 ` Richard Biener 2016-05-11 13:52 ` H.J. Lu 2 siblings, 2 replies; 22+ messages in thread From: Marc Glisse @ 2016-05-10 6:12 UTC (permalink / raw) To: Richard Biener; +Cc: GCC Patches [-- Attachment #1: Type: TEXT/PLAIN, Size: 1656 bytes --] On Fri, 6 May 2016, Marc Glisse wrote: > Here they are. I did (X&Y)&X and (X&Y)&(X&Z). The next one would be > ((X&Y)&Z)&X, but at some point we have to defer to reassoc. > > I didn't add the convert?+tree_nop_conversion_p to the existing transform I > modified. I guess at some point we should make a pass and add them to all the > transformations on bit operations... > > For (X & Y) & Y, I believe that any conversion is fine. For the others, > tree_nop_conversion_p is probably too strict (narrowing should be fine for > all), but I was too lazy to look for tighter conditions. > > (X ^ Y) ^ Y -> X should probably have (non_lvalue ...) on its output, but in > a simple test it didn't seem to matter. Is non_lvalue still needed? > > > Bootstrap+regtest on powerpc64le-unknown-linux-gnu. > > 2016-05-06 Marc Glisse <marc.glisse@inria.fr> > > gcc/ > * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge > with... > * match.pd ((X & Y) ^ Y): ... this. > ((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y) > | (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations. > > gcc/testsuite/ > * gcc.dg/tree-ssa/bit-assoc.c: New testcase. > * gcc.dg/tree-ssa/pr69270.c: Adjust. > * gcc.dg/tree-ssa/vrp59.c: Disable forwprop. Here it is again, I just replaced convert with convert[12] in the last 2 transforms. This should matter for (unsigned)(si & 42) & (ui & 42u). I didn't change it in the other transform, because it would only matter in the case (T)(X & CST) & CST, which I think would be better served by extending the transform that currently handles (X & CST1) & CST2 (not done in this patch). -- Marc Glisse [-- Attachment #2: Type: TEXT/PLAIN, Size: 8285 bytes --] Index: gcc/fold-const.c =================================================================== --- gcc/fold-const.c (revision 236047) +++ gcc/fold-const.c (working copy) @@ -10064,59 +10064,20 @@ fold_binary_loc (location_t loc, } /* Fold !X & 1 as X == 0. */ if (TREE_CODE (arg0) == TRUTH_NOT_EXPR && integer_onep (arg1)) { tem = TREE_OPERAND (arg0, 0); return fold_build2_loc (loc, EQ_EXPR, type, tem, build_zero_cst (TREE_TYPE (tem))); } - /* Fold (X ^ Y) & Y as ~X & Y. */ - if (TREE_CODE (arg0) == BIT_XOR_EXPR - && operand_equal_p (TREE_OPERAND (arg0, 1), arg1, 0)) - { - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0)); - return fold_build2_loc (loc, BIT_AND_EXPR, type, - fold_build1_loc (loc, BIT_NOT_EXPR, type, tem), - fold_convert_loc (loc, type, arg1)); - } - /* Fold (X ^ Y) & X as ~Y & X. */ - if (TREE_CODE (arg0) == BIT_XOR_EXPR - && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0) - && reorder_operands_p (TREE_OPERAND (arg0, 1), arg1)) - { - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1)); - return fold_build2_loc (loc, BIT_AND_EXPR, type, - fold_build1_loc (loc, BIT_NOT_EXPR, type, tem), - fold_convert_loc (loc, type, arg1)); - } - /* Fold X & (X ^ Y) as X & ~Y. */ - if (TREE_CODE (arg1) == BIT_XOR_EXPR - && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0)) - { - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1)); - return fold_build2_loc (loc, BIT_AND_EXPR, type, - fold_convert_loc (loc, type, arg0), - fold_build1_loc (loc, BIT_NOT_EXPR, type, tem)); - } - /* Fold X & (Y ^ X) as ~Y & X. */ - if (TREE_CODE (arg1) == BIT_XOR_EXPR - && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0) - && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0))) - { - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0)); - return fold_build2_loc (loc, BIT_AND_EXPR, type, - fold_build1_loc (loc, BIT_NOT_EXPR, type, tem), - fold_convert_loc (loc, type, arg0)); - } - /* Fold (X * Y) & -(1 << CST) to X * Y if Y is a constant multiple of 1 << CST. */ if (TREE_CODE (arg1) == INTEGER_CST) { wide_int cst1 = arg1; wide_int ncst1 = -cst1; if ((cst1 & ncst1) == ncst1 && multiple_of_p (type, arg0, wide_int_to_tree (TREE_TYPE (arg1), ncst1))) return fold_convert_loc (loc, type, arg0); Index: gcc/match.pd =================================================================== --- gcc/match.pd (revision 236047) +++ gcc/match.pd (working copy) @@ -667,39 +667,70 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (tree_nop_conversion_p (type, TREE_TYPE (@0)) && tree_nop_conversion_p (type, TREE_TYPE (@1))) (bit_xor (convert @0) (convert @1)))) /* Convert ~X ^ C to X ^ ~C. */ (simplify (bit_xor (convert? (bit_not @0)) INTEGER_CST@1) (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) (bit_xor (convert @0) (bit_not @1)))) -/* Fold (X & Y) ^ Y as ~X & Y. */ -(simplify - (bit_xor:c (bit_and:c @0 @1) @1) - (bit_and (bit_not @0) @1)) +/* Fold (X & Y) ^ Y and (X ^ Y) & Y as ~X & Y. */ +(for opo (bit_and bit_xor) + opi (bit_xor bit_and) + (simplify + (opo:c (opi:c @0 @1) @1) + (bit_and (bit_not @0) @1))) /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both operands are another bit-wise operation with a common input. If so, distribute the bit operations to save an operation and possibly two if constants are involved. For example, convert (A | B) & (A | C) into A | (B & C) Further simplification will occur if B and C are constants. */ (for op (bit_and bit_ior bit_xor) rop (bit_ior bit_and bit_and) (simplify (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2))) (if (tree_nop_conversion_p (type, TREE_TYPE (@1)) && tree_nop_conversion_p (type, TREE_TYPE (@2))) (rop (convert @0) (op (convert @1) (convert @2)))))) +/* Some simple reassociation for bit operations, also handled in reassoc. */ +/* (X & Y) & Y -> X & Y + (X | Y) | Y -> X | Y */ +(for op (bit_and bit_ior) + (simplify + (op:c (convert?@2 (op:c @0 @1)) (convert? @1)) + @2)) +/* (X ^ Y) ^ Y -> X */ +(simplify + (bit_xor:c (convert? (bit_xor:c @0 @1)) (convert? @1)) + (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) + (convert @0))) +/* (X & Y) & (X & Z) -> (X & Y) & Z + (X | Y) | (X | Z) -> (X | Y) | Z */ +(for op (bit_and bit_ior) + (simplify + (op:c (convert1?@3 (op:c@4 @0 @1)) (convert2?@5 (op:c@6 @0 @2))) + (if (tree_nop_conversion_p (type, TREE_TYPE (@1)) + && tree_nop_conversion_p (type, TREE_TYPE (@2))) + (if (single_use (@5) && single_use (@6)) + (op @3 (convert @2)) + (if (single_use (@3) && single_use (@4)) + (op (convert @1) @5)))))) +/* (X ^ Y) ^ (X ^ Z) -> Y ^ Z */ +(simplify + (bit_xor (convert1? (bit_xor:c @0 @1)) (convert2? (bit_xor:c @0 @2))) + (if (tree_nop_conversion_p (type, TREE_TYPE (@1)) + && tree_nop_conversion_p (type, TREE_TYPE (@2))) + (convert (bit_xor @1 @2)))) (simplify (abs (abs@1 @0)) @1) (simplify (abs (negate @0)) (abs @0)) (simplify (abs tree_expr_nonnegative_p@0) @0) Index: gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (working copy) @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-forwprop1-details -fdump-tree-ccp1-details" } */ + +int f1(int a, int b){ + int c = a & b; + return c & b; +} +int f2(int a, int b){ + int c = a | b; + return b | c; +} +int g1(int a, int b, int c){ + int d = a & b; + int e = b & c; + return d & e; +} +int g2(int a, int b, int c){ + int d = a | b; + int e = c | b; + return d | e; +} +int g3(int a, int b, int c){ + int d = a ^ b; + int e = b ^ c; + return e ^ d; +} + +/* { dg-final { scan-tree-dump-times "Match-and-simplified" 2 "ccp1" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified" 3 "forwprop1" } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/pr69270.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (revision 236047) +++ gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (working copy) @@ -1,21 +1,23 @@ /* { dg-do compile } */ /* { dg-options "-O2 -fsplit-paths -fdump-tree-dom3-details" } */ /* There should be two references to bufferstep that turn into constants. */ /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with constant .0." 1 "dom3"} } */ /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with constant .1." 1 "dom3"} } */ /* And some assignments ought to fold down to constants. */ -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 2 "dom3"} } */ -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 2 "dom3"} } */ +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -1;" 1 "dom3"} } */ +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -2;" 1 "dom3"} } */ +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 1 "dom3"} } */ +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 1 "dom3"} } */ /* The XOR operations should have been optimized to constants. */ /* { dg-final { scan-tree-dump-not "bit_xor" "dom3"} } */ extern int *stepsizeTable; void adpcm_coder (signed char *outdata, int len) { Index: gcc/testsuite/gcc.dg/tree-ssa/vrp59.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (revision 236047) +++ gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (working copy) @@ -1,12 +1,12 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp1" } */ +/* { dg-options "-O2 -fno-tree-ccp -fno-tree-forwprop -fdump-tree-vrp1" } */ int f(int x) { if (x >= 0 && x <= 3) { x = x ^ 3; x = x & 3; } return x; } ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) 2016-05-10 6:12 ` Marc Glisse @ 2016-05-10 9:27 ` Richard Biener 2016-05-11 13:52 ` H.J. Lu 1 sibling, 0 replies; 22+ messages in thread From: Richard Biener @ 2016-05-10 9:27 UTC (permalink / raw) To: Marc Glisse; +Cc: GCC Patches On Tue, May 10, 2016 at 8:11 AM, Marc Glisse <marc.glisse@inria.fr> wrote: > On Fri, 6 May 2016, Marc Glisse wrote: > >> Here they are. I did (X&Y)&X and (X&Y)&(X&Z). The next one would be >> ((X&Y)&Z)&X, but at some point we have to defer to reassoc. >> >> I didn't add the convert?+tree_nop_conversion_p to the existing transform >> I modified. I guess at some point we should make a pass and add them to all >> the transformations on bit operations... >> >> For (X & Y) & Y, I believe that any conversion is fine. For the others, >> tree_nop_conversion_p is probably too strict (narrowing should be fine for >> all), but I was too lazy to look for tighter conditions. >> >> (X ^ Y) ^ Y -> X should probably have (non_lvalue ...) on its output, but >> in a simple test it didn't seem to matter. Is non_lvalue still needed? >> >> >> Bootstrap+regtest on powerpc64le-unknown-linux-gnu. >> >> 2016-05-06 Marc Glisse <marc.glisse@inria.fr> >> >> gcc/ >> * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge >> with... >> * match.pd ((X & Y) ^ Y): ... this. >> ((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y) >> | (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations. >> >> gcc/testsuite/ >> * gcc.dg/tree-ssa/bit-assoc.c: New testcase. >> * gcc.dg/tree-ssa/pr69270.c: Adjust. >> * gcc.dg/tree-ssa/vrp59.c: Disable forwprop. > > > Here it is again, I just replaced convert with convert[12] in the last 2 > transforms. This should matter for (unsigned)(si & 42) & (ui & 42u). I > didn't change it in the other transform, because it would only matter in the > case (T)(X & CST) & CST, which I think would be better served by extending > the transform that currently handles (X & CST1) & CST2 (not done in this > patch). Ok. Thanks! Richard. > -- > Marc Glisse > Index: gcc/fold-const.c > =================================================================== > --- gcc/fold-const.c (revision 236047) > +++ gcc/fold-const.c (working copy) > @@ -10064,59 +10064,20 @@ fold_binary_loc (location_t loc, > } > /* Fold !X & 1 as X == 0. */ > if (TREE_CODE (arg0) == TRUTH_NOT_EXPR > && integer_onep (arg1)) > { > tem = TREE_OPERAND (arg0, 0); > return fold_build2_loc (loc, EQ_EXPR, type, tem, > build_zero_cst (TREE_TYPE (tem))); > } > > - /* Fold (X ^ Y) & Y as ~X & Y. */ > - if (TREE_CODE (arg0) == BIT_XOR_EXPR > - && operand_equal_p (TREE_OPERAND (arg0, 1), arg1, 0)) > - { > - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0)); > - return fold_build2_loc (loc, BIT_AND_EXPR, type, > - fold_build1_loc (loc, BIT_NOT_EXPR, type, > tem), > - fold_convert_loc (loc, type, arg1)); > - } > - /* Fold (X ^ Y) & X as ~Y & X. */ > - if (TREE_CODE (arg0) == BIT_XOR_EXPR > - && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0) > - && reorder_operands_p (TREE_OPERAND (arg0, 1), arg1)) > - { > - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1)); > - return fold_build2_loc (loc, BIT_AND_EXPR, type, > - fold_build1_loc (loc, BIT_NOT_EXPR, type, > tem), > - fold_convert_loc (loc, type, arg1)); > - } > - /* Fold X & (X ^ Y) as X & ~Y. */ > - if (TREE_CODE (arg1) == BIT_XOR_EXPR > - && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0)) > - { > - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1)); > - return fold_build2_loc (loc, BIT_AND_EXPR, type, > - fold_convert_loc (loc, type, arg0), > - fold_build1_loc (loc, BIT_NOT_EXPR, type, > tem)); > - } > - /* Fold X & (Y ^ X) as ~Y & X. */ > - if (TREE_CODE (arg1) == BIT_XOR_EXPR > - && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0) > - && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0))) > - { > - tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0)); > - return fold_build2_loc (loc, BIT_AND_EXPR, type, > - fold_build1_loc (loc, BIT_NOT_EXPR, type, > tem), > - fold_convert_loc (loc, type, arg0)); > - } > - > /* Fold (X * Y) & -(1 << CST) to X * Y if Y is a constant > multiple of 1 << CST. */ > if (TREE_CODE (arg1) == INTEGER_CST) > { > wide_int cst1 = arg1; > wide_int ncst1 = -cst1; > if ((cst1 & ncst1) == ncst1 > && multiple_of_p (type, arg0, > wide_int_to_tree (TREE_TYPE (arg1), ncst1))) > return fold_convert_loc (loc, type, arg0); > Index: gcc/match.pd > =================================================================== > --- gcc/match.pd (revision 236047) > +++ gcc/match.pd (working copy) > @@ -667,39 +667,70 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > (if (tree_nop_conversion_p (type, TREE_TYPE (@0)) > && tree_nop_conversion_p (type, TREE_TYPE (@1))) > (bit_xor (convert @0) (convert @1)))) > > /* Convert ~X ^ C to X ^ ~C. */ > (simplify > (bit_xor (convert? (bit_not @0)) INTEGER_CST@1) > (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) > (bit_xor (convert @0) (bit_not @1)))) > > -/* Fold (X & Y) ^ Y as ~X & Y. */ > -(simplify > - (bit_xor:c (bit_and:c @0 @1) @1) > - (bit_and (bit_not @0) @1)) > +/* Fold (X & Y) ^ Y and (X ^ Y) & Y as ~X & Y. */ > +(for opo (bit_and bit_xor) > + opi (bit_xor bit_and) > + (simplify > + (opo:c (opi:c @0 @1) @1) > + (bit_and (bit_not @0) @1))) > > /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both > operands are another bit-wise operation with a common input. If so, > distribute the bit operations to save an operation and possibly two if > constants are involved. For example, convert > (A | B) & (A | C) into A | (B & C) > Further simplification will occur if B and C are constants. */ > (for op (bit_and bit_ior bit_xor) > rop (bit_ior bit_and bit_and) > (simplify > (op (convert? (rop:c @0 @1)) (convert? (rop:c @0 @2))) > (if (tree_nop_conversion_p (type, TREE_TYPE (@1)) > && tree_nop_conversion_p (type, TREE_TYPE (@2))) > (rop (convert @0) (op (convert @1) (convert @2)))))) > > +/* Some simple reassociation for bit operations, also handled in reassoc. > */ > +/* (X & Y) & Y -> X & Y > + (X | Y) | Y -> X | Y */ > +(for op (bit_and bit_ior) > + (simplify > + (op:c (convert?@2 (op:c @0 @1)) (convert? @1)) > + @2)) > +/* (X ^ Y) ^ Y -> X */ > +(simplify > + (bit_xor:c (convert? (bit_xor:c @0 @1)) (convert? @1)) > + (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) > + (convert @0))) > +/* (X & Y) & (X & Z) -> (X & Y) & Z > + (X | Y) | (X | Z) -> (X | Y) | Z */ > +(for op (bit_and bit_ior) > + (simplify > + (op:c (convert1?@3 (op:c@4 @0 @1)) (convert2?@5 (op:c@6 @0 @2))) > + (if (tree_nop_conversion_p (type, TREE_TYPE (@1)) > + && tree_nop_conversion_p (type, TREE_TYPE (@2))) > + (if (single_use (@5) && single_use (@6)) > + (op @3 (convert @2)) > + (if (single_use (@3) && single_use (@4)) > + (op (convert @1) @5)))))) > +/* (X ^ Y) ^ (X ^ Z) -> Y ^ Z */ > +(simplify > + (bit_xor (convert1? (bit_xor:c @0 @1)) (convert2? (bit_xor:c @0 @2))) > + (if (tree_nop_conversion_p (type, TREE_TYPE (@1)) > + && tree_nop_conversion_p (type, TREE_TYPE (@2))) > + (convert (bit_xor @1 @2)))) > > (simplify > (abs (abs@1 @0)) > @1) > (simplify > (abs (negate @0)) > (abs @0)) > (simplify > (abs tree_expr_nonnegative_p@0) > @0) > Index: gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c > =================================================================== > --- gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (revision 0) > +++ gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c (working copy) > @@ -0,0 +1,29 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O -fdump-tree-forwprop1-details -fdump-tree-ccp1-details" > } */ > + > +int f1(int a, int b){ > + int c = a & b; > + return c & b; > +} > +int f2(int a, int b){ > + int c = a | b; > + return b | c; > +} > +int g1(int a, int b, int c){ > + int d = a & b; > + int e = b & c; > + return d & e; > +} > +int g2(int a, int b, int c){ > + int d = a | b; > + int e = c | b; > + return d | e; > +} > +int g3(int a, int b, int c){ > + int d = a ^ b; > + int e = b ^ c; > + return e ^ d; > +} > + > +/* { dg-final { scan-tree-dump-times "Match-and-simplified" 2 "ccp1" } } */ > +/* { dg-final { scan-tree-dump-times "gimple_simplified" 3 "forwprop1" } } > */ > Index: gcc/testsuite/gcc.dg/tree-ssa/pr69270.c > =================================================================== > --- gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (revision 236047) > +++ gcc/testsuite/gcc.dg/tree-ssa/pr69270.c (working copy) > @@ -1,21 +1,23 @@ > /* { dg-do compile } */ > /* { dg-options "-O2 -fsplit-paths -fdump-tree-dom3-details" } */ > > /* There should be two references to bufferstep that turn into > constants. */ > /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with > constant .0." 1 "dom3"} } */ > /* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with > constant .1." 1 "dom3"} } */ > > /* And some assignments ought to fold down to constants. */ > -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 2 "dom3"} > } */ > -/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 2 "dom3"} > } */ > +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -1;" 1 "dom3"} > } */ > +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = -2;" 1 "dom3"} > } */ > +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 1 "dom3"} > } */ > +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 1 "dom3"} > } */ > > /* The XOR operations should have been optimized to constants. */ > /* { dg-final { scan-tree-dump-not "bit_xor" "dom3"} } */ > > > extern int *stepsizeTable; > > void > adpcm_coder (signed char *outdata, int len) > { > Index: gcc/testsuite/gcc.dg/tree-ssa/vrp59.c > =================================================================== > --- gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (revision 236047) > +++ gcc/testsuite/gcc.dg/tree-ssa/vrp59.c (working copy) > @@ -1,12 +1,12 @@ > /* { dg-do compile } */ > -/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp1" } */ > +/* { dg-options "-O2 -fno-tree-ccp -fno-tree-forwprop -fdump-tree-vrp1" } > */ > > int f(int x) > { > if (x >= 0 && x <= 3) > { > x = x ^ 3; > x = x & 3; > } > return x; > } > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) 2016-05-10 6:12 ` Marc Glisse 2016-05-10 9:27 ` Richard Biener @ 2016-05-11 13:52 ` H.J. Lu 2016-05-11 16:17 ` Marc Glisse 1 sibling, 1 reply; 22+ messages in thread From: H.J. Lu @ 2016-05-11 13:52 UTC (permalink / raw) To: Marc Glisse; +Cc: Richard Biener, GCC Patches On Mon, May 9, 2016 at 11:11 PM, Marc Glisse <marc.glisse@inria.fr> wrote: > On Fri, 6 May 2016, Marc Glisse wrote: > >> Here they are. I did (X&Y)&X and (X&Y)&(X&Z). The next one would be >> ((X&Y)&Z)&X, but at some point we have to defer to reassoc. >> >> I didn't add the convert?+tree_nop_conversion_p to the existing transform >> I modified. I guess at some point we should make a pass and add them to all >> the transformations on bit operations... >> >> For (X & Y) & Y, I believe that any conversion is fine. For the others, >> tree_nop_conversion_p is probably too strict (narrowing should be fine for >> all), but I was too lazy to look for tighter conditions. >> >> (X ^ Y) ^ Y -> X should probably have (non_lvalue ...) on its output, but >> in a simple test it didn't seem to matter. Is non_lvalue still needed? >> >> >> Bootstrap+regtest on powerpc64le-unknown-linux-gnu. >> >> 2016-05-06 Marc Glisse <marc.glisse@inria.fr> >> >> gcc/ >> * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge >> with... >> * match.pd ((X & Y) ^ Y): ... this. >> ((X & Y) & Y, (X | Y) | Y, (X ^ Y) ^ Y, (X & Y) & (X & Z), (X | Y) >> | (X | Z), (X ^ Y) ^ (X ^ Z)): New transformations. >> >> gcc/testsuite/ >> * gcc.dg/tree-ssa/bit-assoc.c: New testcase. >> * gcc.dg/tree-ssa/pr69270.c: Adjust. >> * gcc.dg/tree-ssa/vrp59.c: Disable forwprop. > > > Here it is again, I just replaced convert with convert[12] in the last 2 > transforms. This should matter for (unsigned)(si & 42) & (ui & 42u). I > didn't change it in the other transform, because it would only matter in the > case (T)(X & CST) & CST, which I think would be better served by extending > the transform that currently handles (X & CST1) & CST2 (not done in this > patch). It caused: FAIL: gcc.dg/tree-ssa/vrp47.c scan-tree-dump-times vrp2 " & 1;" 0 on x86. H.J. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) 2016-05-11 13:52 ` H.J. Lu @ 2016-05-11 16:17 ` Marc Glisse 2016-05-11 16:26 ` Simple bitop reassoc in match.pd Jeff Law 0 siblings, 1 reply; 22+ messages in thread From: Marc Glisse @ 2016-05-11 16:17 UTC (permalink / raw) To: H.J. Lu; +Cc: Richard Biener, GCC Patches On Wed, 11 May 2016, H.J. Lu wrote: >>> * fold-const.c (fold_binary_loc) [(X ^ Y) & Y]: Remove and merge with... >>> * match.pd ((X & Y) ^ Y): ... this. > It caused: > > FAIL: gcc.dg/tree-ssa/vrp47.c scan-tree-dump-times vrp2 " & 1;" 0 > > on x86. Ah, yes, logical_op_short_circuit so I didn't test it. Hmm, doesn't seem so easy. We want to compute (int)(x==0) in a branch where x is known to be in [0, 1]. VRP1 gives (int)(_Bool)(x^1). forwprop3 handles the double conversion, which gives (x^1)&1. Here the new transform applies and gives (~x)&1. VRP2 comes, and with (x^1)&1 it used to notice that this is just x^1 (remember that x is in [0, 1] in this branch), while it doesn't know what to do with (~x)&1. In the end, we get notl %eax andl $1, %eax instead of xorl $1, %eax (andn doesn't seem to have a version with an immediate) The transformation seems right to me, but we are then missing another transformation like ~X & Y -> X ^ Y when we know that X & ~Y is 0 (the 1 bits of X are included in those of Y). That may not be easy with the limited bit tracking we have. A version limited to constant Y of the form 2^n-1 (i.e. 0...01...1) where X has a range included in [0, Y] may be simpler. We could also simplify (int)(_Bool)x to x using VRP information that x is in [0, 1], but apparently when VRP replaces x==0 with y=x^1,(_Bool)y, it does not compute a range for the new variable y, and by the time the next VRP pass comes, it is too late. In the mean time, I propose xfailing this test... -- Marc Glisse ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd 2016-05-11 16:17 ` Marc Glisse @ 2016-05-11 16:26 ` Jeff Law 2016-05-11 17:56 ` Marc Glisse 2016-05-12 5:26 ` Marc Glisse 0 siblings, 2 replies; 22+ messages in thread From: Jeff Law @ 2016-05-11 16:26 UTC (permalink / raw) To: Marc Glisse, H.J. Lu; +Cc: Richard Biener, GCC Patches On 05/11/2016 10:17 AM, Marc Glisse wrote: > The transformation seems right to me, but we are then missing another > transformation like ~X & Y -> X ^ Y when we know that X & ~Y is 0 (the 1 > bits of X are included in those of Y). That may not be easy with the > limited bit tracking we have. A version limited to constant Y of the > form 2^n-1 (i.e. 0...01...1) where X has a range included in [0, Y] may > be simpler. While we don't have strong bit tracking, we do track [0..1] ranges reasonably well, so it may be worth doing. > We could also simplify (int)(_Bool)x to x using VRP information that x > is in [0, 1], but apparently when VRP replaces x==0 with y=x^1,(_Bool)y, > it does not compute a range for the new variable y, and by the time the > next VRP pass comes, it is too late. Seems like a clear oversight. jeff ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd 2016-05-11 16:26 ` Simple bitop reassoc in match.pd Jeff Law @ 2016-05-11 17:56 ` Marc Glisse 2016-05-11 20:44 ` Marc Glisse 2016-05-12 8:41 ` Richard Biener 2016-05-12 5:26 ` Marc Glisse 1 sibling, 2 replies; 22+ messages in thread From: Marc Glisse @ 2016-05-11 17:56 UTC (permalink / raw) To: Jeff Law; +Cc: H.J. Lu, Richard Biener, GCC Patches On Wed, 11 May 2016, Jeff Law wrote: >> We could also simplify (int)(_Bool)x to x using VRP information that x >> is in [0, 1], but apparently when VRP replaces x==0 with y=x^1,(_Bool)y, >> it does not compute a range for the new variable y, and by the time the >> next VRP pass comes, it is too late. > Seems like a clear oversight. In get_value_range, there is: /* If we query the range for a new SSA name return an unmodifiable VARYING. We should get here at most from the substitute-and-fold stage which will never try to change values. */ so this is a known limitation. We could try to change that (XRESIZEVEC, memset(0) on the new elements, update num_vr_values to the new num_ssa_names, at this point vr_value should be replaced with a vector). We could also use set_range_info and make simplify_conversion_using_ranges use get_range_info instead of get_value_range. Might even move the whole function to match.pd then ;-) -- Marc Glisse ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd 2016-05-11 17:56 ` Marc Glisse @ 2016-05-11 20:44 ` Marc Glisse 2016-05-12 8:41 ` Richard Biener 1 sibling, 0 replies; 22+ messages in thread From: Marc Glisse @ 2016-05-11 20:44 UTC (permalink / raw) To: Jeff Law; +Cc: H.J. Lu, Richard Biener, GCC Patches [-- Attachment #1: Type: TEXT/PLAIN, Size: 339 bytes --] On Wed, 11 May 2016, Marc Glisse wrote: > We could also use set_range_info and make simplify_conversion_using_ranges > use get_range_info instead of get_value_range. Something like this seems to fix the testcase. I'll try to submit it properly, but I don't know when. (I also added the ~X&C transform to my TODO-list) -- Marc Glisse [-- Attachment #2: Type: TEXT/PLAIN, Size: 1882 bytes --] Index: gcc/tree-vrp.c =================================================================== --- gcc/tree-vrp.c (revision 236122) +++ gcc/tree-vrp.c (working copy) @@ -8940,6 +8940,8 @@ gassign *newop = gimple_build_assign (tem, BIT_XOR_EXPR, op0, op1); gsi_insert_before (gsi, newop, GSI_SAME_STMT); + if (TYPE_PRECISION (TREE_TYPE (tem)) > 1) + set_range_info (tem, VR_RANGE, 0, 1); gimple_assign_set_rhs_with_ops (gsi, NOP_EXPR, tem); } /* Or without. */ @@ -9648,7 +9650,6 @@ { tree innerop, middleop, finaltype; gimple *def_stmt; - value_range *innervr; signop inner_sgn, middle_sgn, final_sgn; unsigned inner_prec, middle_prec, final_prec; widest_int innermin, innermed, innermax, middlemin, middlemed, middlemax; @@ -9666,18 +9667,16 @@ || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (innerop)) return false; - /* Get the value-range of the inner operand. */ - innervr = get_value_range (innerop); - if (innervr->type != VR_RANGE - || TREE_CODE (innervr->min) != INTEGER_CST - || TREE_CODE (innervr->max) != INTEGER_CST) + /* Get the value-range of the inner operand. Use get_range_info in + case innerop was created during substitute-and-fold. */ + wide_int imin, imax; + if (get_range_info (innerop, &imin, &imax) != VR_RANGE) return false; + innermin = widest_int::from (imin, TYPE_SIGN (TREE_TYPE (innerop))); + innermax = widest_int::from (imax, TYPE_SIGN (TREE_TYPE (innerop))); /* Simulate the conversion chain to check if the result is equal if the middle conversion is removed. */ - innermin = wi::to_widest (innervr->min); - innermax = wi::to_widest (innervr->max); - inner_prec = TYPE_PRECISION (TREE_TYPE (innerop)); middle_prec = TYPE_PRECISION (TREE_TYPE (middleop)); final_prec = TYPE_PRECISION (finaltype); ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd 2016-05-11 17:56 ` Marc Glisse 2016-05-11 20:44 ` Marc Glisse @ 2016-05-12 8:41 ` Richard Biener 2016-05-12 16:03 ` Marc Glisse 1 sibling, 1 reply; 22+ messages in thread From: Richard Biener @ 2016-05-12 8:41 UTC (permalink / raw) To: Marc Glisse; +Cc: Jeff Law, H.J. Lu, GCC Patches On Wed, May 11, 2016 at 7:56 PM, Marc Glisse <marc.glisse@inria.fr> wrote: > On Wed, 11 May 2016, Jeff Law wrote: > >>> We could also simplify (int)(_Bool)x to x using VRP information that x >>> is in [0, 1], but apparently when VRP replaces x==0 with y=x^1,(_Bool)y, >>> it does not compute a range for the new variable y, and by the time the >>> next VRP pass comes, it is too late. >> >> Seems like a clear oversight. > > > In get_value_range, there is: > /* If we query the range for a new SSA name return an unmodifiable > VARYING. > We should get here at most from the substitute-and-fold stage which > will never try to change values. */ > so this is a known limitation. > > We could try to change that (XRESIZEVEC, memset(0) on the new elements, > update num_vr_values to the new num_ssa_names, at this point vr_value should > be replaced with a vector). > > We could also use set_range_info and make simplify_conversion_using_ranges > use get_range_info instead of get_value_range. Might even move the whole > function to match.pd then ;-) Yeah - note that VRP already calls set_range_info before simplifying stmts. It's just that substitute_and_fold doesn't apply fold_stmt (and thus match.pd) to all stmts but it only applies the pass specific "fold" (vrp_fold_stmt) to all stmts. Richard. > -- > Marc Glisse ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd 2016-05-12 8:41 ` Richard Biener @ 2016-05-12 16:03 ` Marc Glisse 2016-05-12 16:51 ` Richard Biener 0 siblings, 1 reply; 22+ messages in thread From: Marc Glisse @ 2016-05-12 16:03 UTC (permalink / raw) To: Richard Biener; +Cc: GCC Patches On Thu, 12 May 2016, Richard Biener wrote: > Yeah - note that VRP already calls set_range_info before simplifying > stmts. It's just that substitute_and_fold doesn't apply fold_stmt (and > thus match.pd) to all stmts but it only applies the pass specific "fold" > (vrp_fold_stmt) to all stmts. Just to be sure: is the fact that VRP doesn't apply fold_stmt on purpose? The restriction makes sense, it is just that it may yield a bit of duplication. We already indirectly use get_range_info in match.pd and may miss out on opportunities that only occur in branches during the VRP pass. -- Marc Glisse ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd 2016-05-12 16:03 ` Marc Glisse @ 2016-05-12 16:51 ` Richard Biener 0 siblings, 0 replies; 22+ messages in thread From: Richard Biener @ 2016-05-12 16:51 UTC (permalink / raw) To: gcc-patches, Marc Glisse On May 12, 2016 6:02:47 PM GMT+02:00, Marc Glisse <marc.glisse@inria.fr> wrote: >On Thu, 12 May 2016, Richard Biener wrote: > >> Yeah - note that VRP already calls set_range_info before simplifying >> stmts. It's just that substitute_and_fold doesn't apply fold_stmt >(and >> thus match.pd) to all stmts but it only applies the pass specific >"fold" >> (vrp_fold_stmt) to all stmts. > >Just to be sure: is the fact that VRP doesn't apply fold_stmt on >purpose? The propagator only folds stmts that had operands replaced (that doesn't enable all simplifications as match.PD patterns cover more than one statement). >The restriction makes sense, it is just that it may yield a bit of >duplication. We already indirectly use get_range_info in match.pd and >may >miss out on opportunities that only occur in branches during the VRP >pass. Yes. The pass specific fold is called on each stmt. Maybe we can omit the propagators folding if the pass specific folding applied. Richard. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Simple bitop reassoc in match.pd 2016-05-11 16:26 ` Simple bitop reassoc in match.pd Jeff Law 2016-05-11 17:56 ` Marc Glisse @ 2016-05-12 5:26 ` Marc Glisse 1 sibling, 0 replies; 22+ messages in thread From: Marc Glisse @ 2016-05-12 5:26 UTC (permalink / raw) To: Jeff Law; +Cc: H.J. Lu, Richard Biener, GCC Patches On Wed, 11 May 2016, Jeff Law wrote: > On 05/11/2016 10:17 AM, Marc Glisse wrote: >> The transformation seems right to me, but we are then missing another >> transformation like ~X & Y -> X ^ Y when we know that X & ~Y is 0 (the 1 >> bits of X are included in those of Y). That may not be easy with the >> limited bit tracking we have. A version limited to constant Y of the >> form 2^n-1 (i.e. 0...01...1) where X has a range included in [0, Y] may >> be simpler. > While we don't have strong bit tracking, we do track [0..1] ranges reasonably > well, so it may be worth doing. I had started writing +/* Simplify (~X & Y) to X ^ Y if we know that (X & ~Y) is 0. */ +#if GIMPLE +(simplify + (bit_and (bit_not SSA_NAME@0) INTEGER_CST@1) + (if ((get_nonzero_bits (@0) & wi::bit_not (@1)) == 0) + (bit_xor @0 @1))) +#endif but then I realized that VRP does not call the simplify machinery, so this would have to be rewritten in simplify_bit_ops_using_ranges. The good point is that we could then compute: may_be_nonzero_X & ~must_be_nonzero_Y instead of assuming that Y is a constant (not that it would change anything in practice). -- Marc Glisse ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2016-05-12 16:51 UTC | newest] Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-04-30 18:45 Canonicalize X u< X to UNORDERED_EXPR Marc Glisse 2016-05-02 8:37 ` Richard Biener 2016-05-02 9:19 ` Marc Glisse 2016-05-02 9:45 ` Richard Biener 2016-05-03 6:37 ` Marc Glisse 2016-05-03 11:03 ` Richard Biener 2016-05-03 13:27 ` Marc Glisse 2016-05-03 13:34 ` Richard Biener 2016-05-06 11:50 ` Simple bitop reassoc in match.pd (was: Canonicalize X u< X to UNORDERED_EXPR) Marc Glisse 2016-05-08 20:49 ` Marc Glisse 2016-05-09 10:04 ` Richard Biener 2016-05-10 6:12 ` Marc Glisse 2016-05-10 9:27 ` Richard Biener 2016-05-11 13:52 ` H.J. Lu 2016-05-11 16:17 ` Marc Glisse 2016-05-11 16:26 ` Simple bitop reassoc in match.pd Jeff Law 2016-05-11 17:56 ` Marc Glisse 2016-05-11 20:44 ` Marc Glisse 2016-05-12 8:41 ` Richard Biener 2016-05-12 16:03 ` Marc Glisse 2016-05-12 16:51 ` Richard Biener 2016-05-12 5:26 ` Marc Glisse
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).