* RFC: Two minor optimization patterns @ 2015-01-13 22:47 Rasmus Villemoes 2015-01-13 22:56 ` Andrew Pinski 2015-01-14 13:14 ` RFC: Two minor optimization patterns Marc Glisse 0 siblings, 2 replies; 21+ messages in thread From: Rasmus Villemoes @ 2015-01-13 22:47 UTC (permalink / raw) To: gcc-patches [My first attempt at submitting a patch for gcc, so please forgive me if I'm not following the right protocol.] Sometimes rounding a variable to the next even integer is written x += x & 1. This usually means using an extra register (and hence at least an extra mov instruction) compared to the equivalent x = (x + 1) & ~1. The first pattern below tries to do this transformation. While playing with various ways of rounding down, I noticed that gcc already optimizes all of x-(x&3), x^(x&3) and x&~(x&3) to simply x&~3. In fact, x&~(x&y) is rewritten as x&~y. However, the dual of this is not handled, so I included the second pattern below. I've tested the below in the sense that gcc compiles and that trivial test cases get compiled as expected. Rasmus diff --git gcc/match.pd gcc/match.pd index 81c4ee6..04a0bc4 100644 --- gcc/match.pd +++ gcc/match.pd @@ -262,6 +262,16 @@ along with GCC; see the file COPYING3. If not see (abs tree_expr_nonnegative_p@0) @0) +/* x + (x & 1) -> (x + 1) & ~1 */ +(simplify + (plus @0 (bit_and @0 integer_onep@1)) + (bit_and (plus @0 @1) (bit_not @1))) + +/* x | ~(x | y) -> x | ~y */ +(simplify + (bit_ior @0 (bit_not (bit_ior @0 @1))) + (bit_ior @0 (bit_not @1))) + /* Try to fold (type) X op CST -> (type) (X op ((type-x) CST)) when profitable. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: RFC: Two minor optimization patterns 2015-01-13 22:47 RFC: Two minor optimization patterns Rasmus Villemoes @ 2015-01-13 22:56 ` Andrew Pinski 2015-01-14 9:52 ` Richard Biener 2015-01-14 13:14 ` RFC: Two minor optimization patterns Marc Glisse 1 sibling, 1 reply; 21+ messages in thread From: Andrew Pinski @ 2015-01-13 22:56 UTC (permalink / raw) To: Rasmus Villemoes; +Cc: GCC Patches On Tue, Jan 13, 2015 at 2:41 PM, Rasmus Villemoes <rv@rasmusvillemoes.dk> wrote: > [My first attempt at submitting a patch for gcc, so please forgive me > if I'm not following the right protocol.] There are a few things missing. For one, a testcase or two for the added optimizations. > > Sometimes rounding a variable to the next even integer is written x += x > & 1. This usually means using an extra register (and hence at least an > extra mov instruction) compared to the equivalent x = (x + 1) & ~1. The > first pattern below tries to do this transformation. > > While playing with various ways of rounding down, I noticed that gcc > already optimizes all of x-(x&3), x^(x&3) and x&~(x&3) to simply > x&~3. In fact, x&~(x&y) is rewritten as x&~y. However, the dual of this > is not handled, so I included the second pattern below. > > I've tested the below in the sense that gcc compiles and that trivial > test cases get compiled as expected. The other thing you missed is a changelog entry for the change you did. Also you mentioned you tested the patch below but did not mention which target you tested it on and you should run the full GCC testsuite. https://gcc.gnu.org/contribute.html is a good page to start with how to handle most of the items above. https://gcc.gnu.org/wiki/HowToPrepareATestcase is a good page on how to write the testcase for testing the added optimization. Thanks, Andrew Pinski > > Rasmus > > > > diff --git gcc/match.pd gcc/match.pd > index 81c4ee6..04a0bc4 100644 > --- gcc/match.pd > +++ gcc/match.pd > @@ -262,6 +262,16 @@ along with GCC; see the file COPYING3. If not see > (abs tree_expr_nonnegative_p@0) > @0) > > +/* x + (x & 1) -> (x + 1) & ~1 */ > +(simplify > + (plus @0 (bit_and @0 integer_onep@1)) > + (bit_and (plus @0 @1) (bit_not @1))) > + > +/* x | ~(x | y) -> x | ~y */ > +(simplify > + (bit_ior @0 (bit_not (bit_ior @0 @1))) > + (bit_ior @0 (bit_not @1))) > + > > /* Try to fold (type) X op CST -> (type) (X op ((type-x) CST)) > when profitable. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: RFC: Two minor optimization patterns 2015-01-13 22:56 ` Andrew Pinski @ 2015-01-14 9:52 ` Richard Biener 2015-01-14 12:45 ` Rasmus Villemoes 0 siblings, 1 reply; 21+ messages in thread From: Richard Biener @ 2015-01-14 9:52 UTC (permalink / raw) To: Andrew Pinski; +Cc: Rasmus Villemoes, GCC Patches On Tue, Jan 13, 2015 at 11:47 PM, Andrew Pinski <pinskia@gmail.com> wrote: > On Tue, Jan 13, 2015 at 2:41 PM, Rasmus Villemoes <rv@rasmusvillemoes.dk> wrote: >> [My first attempt at submitting a patch for gcc, so please forgive me >> if I'm not following the right protocol.] > > There are a few things missing. For one, a testcase or two for the > added optimizations. > >> >> Sometimes rounding a variable to the next even integer is written x += x >> & 1. This usually means using an extra register (and hence at least an >> extra mov instruction) compared to the equivalent x = (x + 1) & ~1. The >> first pattern below tries to do this transformation. >> >> While playing with various ways of rounding down, I noticed that gcc >> already optimizes all of x-(x&3), x^(x&3) and x&~(x&3) to simply >> x&~3. Does it also handle x+(x&3)? Where does it handle x - (x&3)? That is, doesn't the pattern also work for constants other than 1? Please put it before the abs simplifications after the last one handing bit_and/bit_ior. Thanks, Richard. > In fact, x&~(x&y) is rewritten as x&~y. However, the dual of this >> is not handled, so I included the second pattern below. >> >> I've tested the below in the sense that gcc compiles and that trivial >> test cases get compiled as expected. > > The other thing you missed is a changelog entry for the change you did. > Also you mentioned you tested the patch below but did not mention > which target you tested it on and you should run the full GCC > testsuite. > https://gcc.gnu.org/contribute.html is a good page to start with how > to handle most of the items above. > https://gcc.gnu.org/wiki/HowToPrepareATestcase is a good page on how > to write the testcase for testing the added optimization. > > Thanks, > Andrew Pinski > >> >> Rasmus >> >> >> >> diff --git gcc/match.pd gcc/match.pd >> index 81c4ee6..04a0bc4 100644 >> --- gcc/match.pd >> +++ gcc/match.pd >> @@ -262,6 +262,16 @@ along with GCC; see the file COPYING3. If not see >> (abs tree_expr_nonnegative_p@0) >> @0) >> >> +/* x + (x & 1) -> (x + 1) & ~1 */ >> +(simplify >> + (plus @0 (bit_and @0 integer_onep@1)) >> + (bit_and (plus @0 @1) (bit_not @1))) >> + >> +/* x | ~(x | y) -> x | ~y */ >> +(simplify >> + (bit_ior @0 (bit_not (bit_ior @0 @1))) >> + (bit_ior @0 (bit_not @1))) >> + >> >> /* Try to fold (type) X op CST -> (type) (X op ((type-x) CST)) >> when profitable. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: RFC: Two minor optimization patterns 2015-01-14 9:52 ` Richard Biener @ 2015-01-14 12:45 ` Rasmus Villemoes 2015-01-14 14:01 ` Richard Biener 0 siblings, 1 reply; 21+ messages in thread From: Rasmus Villemoes @ 2015-01-14 12:45 UTC (permalink / raw) To: Richard Biener; +Cc: Andrew Pinski, GCC Patches On Wed, Jan 14 2015, Richard Biener <richard.guenther@gmail.com> wrote: > On Tue, Jan 13, 2015 at 11:47 PM, Andrew Pinski <pinskia@gmail.com> wrote: >> On Tue, Jan 13, 2015 at 2:41 PM, Rasmus Villemoes <rv@rasmusvillemoes.dk> wrote: >>> [My first attempt at submitting a patch for gcc, so please forgive me >>> if I'm not following the right protocol.] >> >> There are a few things missing. For one, a testcase or two for the >> added optimizations. I'll see what I can come up with. Thanks for the pointers. >>> Sometimes rounding a variable to the next even integer is written x += x >>> & 1. This usually means using an extra register (and hence at least an >>> extra mov instruction) compared to the equivalent x = (x + 1) & ~1. The >>> first pattern below tries to do this transformation. >>> >>> While playing with various ways of rounding down, I noticed that gcc >>> already optimizes all of x-(x&3), x^(x&3) and x&~(x&3) to simply >>> x&~3. > > Does it also handle x+(x&3)? I'm not sure what 'it' refers to, and I'm also not sure how you think x+(x&3) could be rewritten. > Where does it handle x - (x&3)? If by 'it' you mean gcc, I tried looking for a pattern matching this, but couldn't find it, so I don't know where it is handled. I can just see by running gcc-5.0 -fdump-tree-original -O2 -c opt.c that "x - (x & 3)" is rewritten as x & -4 (which is of course the same as x & ~3). Btw, I now see that neither x&~(x&3) or x&~(x&y) are rewritten that early, but objdump -d shows that the end result is the same. > That is, doesn't the pattern also work for constants other than 1? Here I assume that 'the pattern' refers to the first pattern, and the answer is 'not immediately'. To round up a number to the next multiple of 2^k we need to add the negative of that number modulo 2^k. It just so happens that for k=1 we have x==-x for both possible values of x. So with a little tweak, this does in fact lead to an optimization opportunity, namely x + ((-x) & m) -> (x + m) & ~m whenever m is one less than a power of 2. I don't know how to check for m satisfying this in the match.pd language. > Please put it before the abs simplifications after the last one handing > bit_and/bit_ior. OK, will do. Rasmus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: RFC: Two minor optimization patterns 2015-01-14 12:45 ` Rasmus Villemoes @ 2015-01-14 14:01 ` Richard Biener 2015-01-21 10:50 ` [PATCH 0/4] A few " Rasmus Villemoes 0 siblings, 1 reply; 21+ messages in thread From: Richard Biener @ 2015-01-14 14:01 UTC (permalink / raw) To: Rasmus Villemoes; +Cc: Andrew Pinski, GCC Patches On Wed, Jan 14, 2015 at 1:23 PM, Rasmus Villemoes <rv@rasmusvillemoes.dk> wrote: > On Wed, Jan 14 2015, Richard Biener <richard.guenther@gmail.com> wrote: > >> On Tue, Jan 13, 2015 at 11:47 PM, Andrew Pinski <pinskia@gmail.com> wrote: >>> On Tue, Jan 13, 2015 at 2:41 PM, Rasmus Villemoes <rv@rasmusvillemoes.dk> wrote: >>>> [My first attempt at submitting a patch for gcc, so please forgive me >>>> if I'm not following the right protocol.] >>> >>> There are a few things missing. For one, a testcase or two for the >>> added optimizations. > > I'll see what I can come up with. Thanks for the pointers. > >>>> Sometimes rounding a variable to the next even integer is written x += x >>>> & 1. This usually means using an extra register (and hence at least an >>>> extra mov instruction) compared to the equivalent x = (x + 1) & ~1. The >>>> first pattern below tries to do this transformation. >>>> >>>> While playing with various ways of rounding down, I noticed that gcc >>>> already optimizes all of x-(x&3), x^(x&3) and x&~(x&3) to simply >>>> x&~3. >> >> Does it also handle x+(x&3)? > > I'm not sure what 'it' refers to, and I'm also not sure how you think > x+(x&3) could be rewritten. I was just guessing. >> Where does it handle x - (x&3)? > > If by 'it' you mean gcc, I tried looking for a pattern matching this, > but couldn't find it, so I don't know where it is handled. I can just > see by running gcc-5.0 -fdump-tree-original -O2 -c opt.c that "x - (x & > 3)" is rewritten as x & -4 (which is of course the same as x & ~3). That's done in fold-const.c:fold_binary_loc here: /* Fold A - (A & B) into ~B & A. */ if (!TREE_SIDE_EFFECTS (arg0) && TREE_CODE (arg1) == BIT_AND_EXPR) { ... (note that patterns are not fully moved to match.pd yet) > Btw, > I now see that neither x&~(x&3) or x&~(x&y) are rewritten that early, > but objdump -d shows that the end result is the same. > >> That is, doesn't the pattern also work for constants other than 1? > > Here I assume that 'the pattern' refers to the first pattern, and the > answer is 'not immediately'. To round up a number to the next multiple > of 2^k we need to add the negative of that number modulo 2^k. It just so > happens that for k=1 we have x==-x for both possible values of x. So > with a little tweak, this does in fact lead to an optimization > opportunity, namely x + ((-x) & m) -> (x + m) & ~m whenever m is one > less than a power of 2. I don't know how to check for m satisfying this > in the match.pd language. you'd need to write some C code involving trees in a if/with. We do have a integer_pow2p predicate but not a integer_one_less_than_pow2p one. > >> Please put it before the abs simplifications after the last one handing >> bit_and/bit_ior. > > OK, will do. Thanks, Richard. > Rasmus ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 0/4] A few optimization patterns 2015-01-14 14:01 ` Richard Biener @ 2015-01-21 10:50 ` Rasmus Villemoes 2015-01-21 10:50 ` [PATCH 1/4] match.pd: Add x + (x & 1) -> (x + 1) & ~1 pattern Rasmus Villemoes ` (3 more replies) 0 siblings, 4 replies; 21+ messages in thread From: Rasmus Villemoes @ 2015-01-21 10:50 UTC (permalink / raw) To: GCC Patches; +Cc: Richard Biener, Andrew Pinski, Rasmus Villemoes This adds four optimization patterns to match.pd, along with trivial test cases. Compiles and works as expected, and 'make check' on x86_64 gives the same number of "unexpected failures" before and after (8 from gfortran.dg/guality/pr41558.f90, 1 from failing to compile gcc.dg/plugin/ggcplug.c). I know almost nothing about the internals of gcc, so 4/4 may very well be considered ugly - it was what I could stitch together from pieces I picked up here and there. Rasmus Villemoes (4): match.pd: Add x + (x & 1) -> (x + 1) & ~1 pattern match.pd: Add x & ~(x & y) -> x & ~y pattern match.pd: Add x | ~(x | y) -> x | ~y pattern match.pd: Add x + ((-x) & m) -> (x + m) & ~m pattern gcc/match.pd | 27 ++++++++++++++++++ gcc/testsuite/gcc.dg/20150120-1.c | 51 +++++++++++++++++++++++++++++++++ gcc/testsuite/gcc.dg/20150120-2.c | 32 +++++++++++++++++++++ gcc/testsuite/gcc.dg/20150120-3.c | 32 +++++++++++++++++++++ gcc/testsuite/gcc.dg/20150120-4.c | 59 +++++++++++++++++++++++++++++++++++++++ 5 files changed, 201 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/20150120-1.c create mode 100644 gcc/testsuite/gcc.dg/20150120-2.c create mode 100644 gcc/testsuite/gcc.dg/20150120-3.c create mode 100644 gcc/testsuite/gcc.dg/20150120-4.c -- 2.1.3 ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 1/4] match.pd: Add x + (x & 1) -> (x + 1) & ~1 pattern 2015-01-21 10:50 ` [PATCH 0/4] A few " Rasmus Villemoes @ 2015-01-21 10:50 ` Rasmus Villemoes 2015-04-30 9:34 ` Richard Biener 2015-05-01 18:26 ` Jeff Law 2015-01-21 10:55 ` [PATCH 2/4] match.pd: Add x & ~(x & y) -> x & ~y pattern Rasmus Villemoes ` (2 subsequent siblings) 3 siblings, 2 replies; 21+ messages in thread From: Rasmus Villemoes @ 2015-01-21 10:50 UTC (permalink / raw) To: GCC Patches; +Cc: Richard Biener, Andrew Pinski, Rasmus Villemoes gcc.dg/20150120-1.c: New test Rounding an integer to the next even integer is sometimes written x += x & 1. The equivalent x = (x+1)&~1 usually uses one less register, and in practical cases only the new value of x will be used (making it unlikely that the subexpression x&1 has any uses). Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> --- gcc/match.pd | 6 +++++ gcc/testsuite/gcc.dg/20150120-1.c | 51 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 57 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/20150120-1.c diff --git gcc/match.pd gcc/match.pd index 81c4ee6..ecefcfb 100644 --- gcc/match.pd +++ gcc/match.pd @@ -255,6 +255,12 @@ along with GCC; see the file COPYING3. If not see (bitop @0 @0) (non_lvalue @0))) +/* x + (x & 1) -> (x + 1) & ~1 */ +(simplify + (plus:c @0 (bit_and@2 @0 integer_onep@1)) + (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) + (bit_and (plus @0 @1) (bit_not @1)))) + (simplify (abs (negate @0)) (abs @0)) diff --git gcc/testsuite/gcc.dg/20150120-1.c gcc/testsuite/gcc.dg/20150120-1.c new file mode 100644 index 0000000..18906c4 --- /dev/null +++ gcc/testsuite/gcc.dg/20150120-1.c @@ -0,0 +1,51 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-original" } */ + +/* x + (x & 1) -> (x + 1) & ~1 */ +int +fn1 (int x) +{ + return x + (x & 1); +} +int +fn2 (int x) +{ + return (x & 1) + x; +} +int +fn3 (int x) +{ + return x + (1 & x); +} +int +fn4 (int x) +{ + return (1 & x) + x; +} +unsigned int +fn5 (unsigned int x) +{ + return x + (x & 1); +} +unsigned int +fn6 (unsigned int x) +{ + return (x & 1) + x; +} +unsigned int +fn7 (unsigned int x) +{ + return x + (x % 2); +} +unsigned int +fn8 (unsigned int x) +{ + return (x % 2) + x; +} +unsigned int +fn9 (unsigned int x) +{ + return (1LL & x) + x; +} + +/* { dg-final { scan-tree-dump-times "x \\+ 1" 9 "original" } } */ -- 2.1.3 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/4] match.pd: Add x + (x & 1) -> (x + 1) & ~1 pattern 2015-01-21 10:50 ` [PATCH 1/4] match.pd: Add x + (x & 1) -> (x + 1) & ~1 pattern Rasmus Villemoes @ 2015-04-30 9:34 ` Richard Biener 2015-05-01 18:26 ` Jeff Law 1 sibling, 0 replies; 21+ messages in thread From: Richard Biener @ 2015-04-30 9:34 UTC (permalink / raw) To: Rasmus Villemoes; +Cc: GCC Patches, Andrew Pinski On Wed, Jan 21, 2015 at 11:49 AM, Rasmus Villemoes <rv@rasmusvillemoes.dk> wrote: > gcc.dg/20150120-1.c: New test > > Rounding an integer to the next even integer is sometimes written x += > x & 1. The equivalent x = (x+1)&~1 usually uses one less register, and > in practical cases only the new value of x will be used (making it > unlikely that the subexpression x&1 has any uses). Now that we are in stage1 again You are missig a ChangeLog entry and fail to state how you tested the patch. Otherwise the patch looks ok. Thanks, Richard. > Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> > --- > gcc/match.pd | 6 +++++ > gcc/testsuite/gcc.dg/20150120-1.c | 51 +++++++++++++++++++++++++++++++++++++++ > 2 files changed, 57 insertions(+) > create mode 100644 gcc/testsuite/gcc.dg/20150120-1.c > > diff --git gcc/match.pd gcc/match.pd > index 81c4ee6..ecefcfb 100644 > --- gcc/match.pd > +++ gcc/match.pd > @@ -255,6 +255,12 @@ along with GCC; see the file COPYING3. If not see > (bitop @0 @0) > (non_lvalue @0))) > > +/* x + (x & 1) -> (x + 1) & ~1 */ > +(simplify > + (plus:c @0 (bit_and@2 @0 integer_onep@1)) > + (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) > + (bit_and (plus @0 @1) (bit_not @1)))) > + > (simplify > (abs (negate @0)) > (abs @0)) > diff --git gcc/testsuite/gcc.dg/20150120-1.c gcc/testsuite/gcc.dg/20150120-1.c > new file mode 100644 > index 0000000..18906c4 > --- /dev/null > +++ gcc/testsuite/gcc.dg/20150120-1.c > @@ -0,0 +1,51 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-original" } */ > + > +/* x + (x & 1) -> (x + 1) & ~1 */ > +int > +fn1 (int x) > +{ > + return x + (x & 1); > +} > +int > +fn2 (int x) > +{ > + return (x & 1) + x; > +} > +int > +fn3 (int x) > +{ > + return x + (1 & x); > +} > +int > +fn4 (int x) > +{ > + return (1 & x) + x; > +} > +unsigned int > +fn5 (unsigned int x) > +{ > + return x + (x & 1); > +} > +unsigned int > +fn6 (unsigned int x) > +{ > + return (x & 1) + x; > +} > +unsigned int > +fn7 (unsigned int x) > +{ > + return x + (x % 2); > +} > +unsigned int > +fn8 (unsigned int x) > +{ > + return (x % 2) + x; > +} > +unsigned int > +fn9 (unsigned int x) > +{ > + return (1LL & x) + x; > +} > + > +/* { dg-final { scan-tree-dump-times "x \\+ 1" 9 "original" } } */ > -- > 2.1.3 > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/4] match.pd: Add x + (x & 1) -> (x + 1) & ~1 pattern 2015-01-21 10:50 ` [PATCH 1/4] match.pd: Add x + (x & 1) -> (x + 1) & ~1 pattern Rasmus Villemoes 2015-04-30 9:34 ` Richard Biener @ 2015-05-01 18:26 ` Jeff Law 1 sibling, 0 replies; 21+ messages in thread From: Jeff Law @ 2015-05-01 18:26 UTC (permalink / raw) To: Rasmus Villemoes, GCC Patches; +Cc: Richard Biener, Andrew Pinski On 01/21/2015 03:49 AM, Rasmus Villemoes wrote: > gcc.dg/20150120-1.c: New test > > Rounding an integer to the next even integer is sometimes written x += > x & 1. The equivalent x = (x+1)&~1 usually uses one less register, and > in practical cases only the new value of x will be used (making it > unlikely that the subexpression x&1 has any uses). I bootstrapped and regression tested this on x86_64-linux-gnu, created the appropriate ChangeLogs and installed the patch on the trunk. Jeff ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 2/4] match.pd: Add x & ~(x & y) -> x & ~y pattern 2015-01-21 10:50 ` [PATCH 0/4] A few " Rasmus Villemoes 2015-01-21 10:50 ` [PATCH 1/4] match.pd: Add x + (x & 1) -> (x + 1) & ~1 pattern Rasmus Villemoes @ 2015-01-21 10:55 ` Rasmus Villemoes 2015-05-01 18:29 ` Jeff Law 2015-01-21 10:58 ` [PATCH 3/4] match.pd: Add x | ~(x | y) -> x | " Rasmus Villemoes 2015-01-21 11:17 ` [PATCH 4/4] match.pd: Add x + ((-x) & m) -> (x + m) & ~m pattern Rasmus Villemoes 3 siblings, 1 reply; 21+ messages in thread From: Rasmus Villemoes @ 2015-01-21 10:55 UTC (permalink / raw) To: GCC Patches; +Cc: Richard Biener, Andrew Pinski, Rasmus Villemoes gcc.dg/20150120-2.c: New test Clearing a certain subset of bits, for example to round down x to a multiple of a power of 2, is sometimes written x & ~(x & y), where y may or may not be a constant. It is shorter to use x & ~y, particularly when y is a constant. gcc already does this when it is spelled x - (x & y) or x ^ (x & y). Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> --- gcc/match.pd | 6 ++++++ gcc/testsuite/gcc.dg/20150120-2.c | 32 ++++++++++++++++++++++++++++++++ 2 files changed, 38 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/20150120-2.c diff --git gcc/match.pd gcc/match.pd index ecefcfb..d25fc3a 100644 --- gcc/match.pd +++ gcc/match.pd @@ -261,6 +261,12 @@ along with GCC; see the file COPYING3. If not see (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) (bit_and (plus @0 @1) (bit_not @1)))) +/* x & ~(x & y) -> x & ~y */ +(simplify + (bit_and:c @0 (bit_not (bit_and:c@2 @0 @1))) + (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) + (bit_and @0 (bit_not @1)))) + (simplify (abs (negate @0)) (abs @0)) diff --git gcc/testsuite/gcc.dg/20150120-2.c gcc/testsuite/gcc.dg/20150120-2.c new file mode 100644 index 0000000..976b654 --- /dev/null +++ gcc/testsuite/gcc.dg/20150120-2.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-original" } */ + +/* x & ~(x & y) -> x & ~y */ +int fn1 (int x, int y) +{ + return x & ~(x & y); +} +int fn2 (int x, int y) +{ + return ~(x & y) & x; +} +int fn3 (int x, int y) +{ + return x & ~(y & x); +} +int fn4 (int x, int y) +{ + return ~(y & x) & x; +} +int fn5 (int z) +{ + return z & ~(z & 3); +} +int fn6 (int z) +{ + return ~(z & 3) & z; +} + + +/* { dg-final { scan-tree-dump-times "~y & x" 4 "original" } } */ +/* { dg-final { scan-tree-dump-times "z & -4" 2 "original" } } */ -- 2.1.3 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/4] match.pd: Add x & ~(x & y) -> x & ~y pattern 2015-01-21 10:55 ` [PATCH 2/4] match.pd: Add x & ~(x & y) -> x & ~y pattern Rasmus Villemoes @ 2015-05-01 18:29 ` Jeff Law 0 siblings, 0 replies; 21+ messages in thread From: Jeff Law @ 2015-05-01 18:29 UTC (permalink / raw) To: Rasmus Villemoes, GCC Patches; +Cc: Richard Biener, Andrew Pinski [-- Attachment #1: Type: text/plain, Size: 972 bytes --] On 01/21/2015 03:49 AM, Rasmus Villemoes wrote: > gcc.dg/20150120-2.c: New test > > Clearing a certain subset of bits, for example to round down x to a > multiple of a power of 2, is sometimes written x & ~(x & y), where y > may or may not be a constant. It is shorter to use x & ~y, > particularly when y is a constant. gcc already does this when it is > spelled x - (x & y) or x ^ (x & y). > > Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> > --- > gcc/match.pd | 6 ++++++ > gcc/testsuite/gcc.dg/20150120-2.c | 32 ++++++++++++++++++++++++++++++++ > 2 files changed, 38 insertions(+) > create mode 100644 gcc/testsuite/gcc.dg/20150120-2.c I combined this pattern with its bit-ior 'dual' into a single pattern. Then bootstrapped, regression tested, wrote the ChangeLog and installed the result onto the trunk. I'm attaching the final patch which includes the changes in #1, #2 and #3 of this series for archival purposes. jeff [-- Attachment #2: patch --] [-- Type: text/plain, Size: 4487 bytes --] commit 0a66cfe379a8217c69705535303626d12aca0c5f Author: law <law@138bc75d-0d04-0410-961f-82ee72b054a4> Date: Fri May 1 18:25:12 2015 +0000 * match.pd: New simplification patterns. (x + (x & 1)) -> ((x + 1) & ~1) (x & ~(x & y)) -> ((x & ~y)) (x | ~(x | y)) -> ((x | ~y)) * gcc.dg/20150120-1.c: New test. * gcc.dg/20150120-2.c: New test. * gcc.dg/20150120-3.c: New test. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@222697 138bc75d-0d04-0410-961f-82ee72b054a4 diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 65816c7..e006b26 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,10 @@ +2015-05-01 Rasmus Villemoes <rv@rasmusvillemoes.dk> + + * match.pd: New simplification patterns. + (x + (x & 1)) -> ((x + 1) & ~1) + (x & ~(x & y)) -> ((x & ~y)) + (x | ~(x | y)) -> ((x | ~y)) + 2015-05-01 Kyrylo Tkachov <kyrylo.tkachov@arm.com> * target.def (attribute_table): Mention that struct attribute_spec diff --git a/gcc/match.pd b/gcc/match.pd index fc374de..87ecaf1 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -255,6 +255,20 @@ along with GCC; see the file COPYING3. If not see (bitop @0 @0) (non_lvalue @0))) +/* x + (x & 1) -> (x + 1) & ~1 */ +(simplify + (plus:c @0 (bit_and@2 @0 integer_onep@1)) + (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) + (bit_and (plus @0 @1) (bit_not @1)))) + +/* x & ~(x & y) -> x & ~y */ +/* x | ~(x | y) -> x | ~y */ +(for bitop (bit_and bit_ior) + (simplify + (bitop:c @0 (bit_not (bitop:c@2 @0 @1))) + (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) + (bitop @0 (bit_not @1))))) + (simplify (abs (negate @0)) (abs @0)) diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index bfdde3b..2aedc46 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,9 @@ +2015-05-01 Rasmus Villemoes <rv@rasmusvillemoes.dk> + + * gcc.dg/20150120-1.c: New test. + * gcc.dg/20150120-2.c: New test. + * gcc.dg/20150120-3.c: New test. + 2015-05-01 David Edelsohn <dje.gcc@gmail.com> * gcc.dg/debug/pr65771.c: Add "dg-add-options tls". diff --git a/gcc/testsuite/gcc.dg/20150120-1.c b/gcc/testsuite/gcc.dg/20150120-1.c new file mode 100644 index 0000000..18906c4 --- /dev/null +++ b/gcc/testsuite/gcc.dg/20150120-1.c @@ -0,0 +1,51 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-original" } */ + +/* x + (x & 1) -> (x + 1) & ~1 */ +int +fn1 (int x) +{ + return x + (x & 1); +} +int +fn2 (int x) +{ + return (x & 1) + x; +} +int +fn3 (int x) +{ + return x + (1 & x); +} +int +fn4 (int x) +{ + return (1 & x) + x; +} +unsigned int +fn5 (unsigned int x) +{ + return x + (x & 1); +} +unsigned int +fn6 (unsigned int x) +{ + return (x & 1) + x; +} +unsigned int +fn7 (unsigned int x) +{ + return x + (x % 2); +} +unsigned int +fn8 (unsigned int x) +{ + return (x % 2) + x; +} +unsigned int +fn9 (unsigned int x) +{ + return (1LL & x) + x; +} + +/* { dg-final { scan-tree-dump-times "x \\+ 1" 9 "original" } } */ diff --git a/gcc/testsuite/gcc.dg/20150120-2.c b/gcc/testsuite/gcc.dg/20150120-2.c new file mode 100644 index 0000000..976b654 --- /dev/null +++ b/gcc/testsuite/gcc.dg/20150120-2.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-original" } */ + +/* x & ~(x & y) -> x & ~y */ +int fn1 (int x, int y) +{ + return x & ~(x & y); +} +int fn2 (int x, int y) +{ + return ~(x & y) & x; +} +int fn3 (int x, int y) +{ + return x & ~(y & x); +} +int fn4 (int x, int y) +{ + return ~(y & x) & x; +} +int fn5 (int z) +{ + return z & ~(z & 3); +} +int fn6 (int z) +{ + return ~(z & 3) & z; +} + + +/* { dg-final { scan-tree-dump-times "~y & x" 4 "original" } } */ +/* { dg-final { scan-tree-dump-times "z & -4" 2 "original" } } */ diff --git a/gcc/testsuite/gcc.dg/20150120-3.c b/gcc/testsuite/gcc.dg/20150120-3.c new file mode 100644 index 0000000..322556f --- /dev/null +++ b/gcc/testsuite/gcc.dg/20150120-3.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-original" } */ + +/* x | ~(x | y) -> x | ~y */ +int fn1 (int x, int y) +{ + return x | ~(x | y); +} +int fn2 (int x, int y) +{ + return ~(x | y) | x; +} +int fn3 (int x, int y) +{ + return x | ~(y | x); +} +int fn4 (int x, int y) +{ + return ~(y | x) | x; +} +int fn5 (int z) +{ + return z | ~(z | 3); +} +int fn6 (int z) +{ + return ~(z | 3) | z; +} + + +/* { dg-final { scan-tree-dump-times "~y \\| x" 4 "original" } } */ +/* { dg-final { scan-tree-dump-times "z \\| -4" 2 "original" } } */ ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 3/4] match.pd: Add x | ~(x | y) -> x | ~y pattern 2015-01-21 10:50 ` [PATCH 0/4] A few " Rasmus Villemoes 2015-01-21 10:50 ` [PATCH 1/4] match.pd: Add x + (x & 1) -> (x + 1) & ~1 pattern Rasmus Villemoes 2015-01-21 10:55 ` [PATCH 2/4] match.pd: Add x & ~(x & y) -> x & ~y pattern Rasmus Villemoes @ 2015-01-21 10:58 ` Rasmus Villemoes 2015-01-21 11:32 ` Marek Polacek 2015-01-21 11:17 ` [PATCH 4/4] match.pd: Add x + ((-x) & m) -> (x + m) & ~m pattern Rasmus Villemoes 3 siblings, 1 reply; 21+ messages in thread From: Rasmus Villemoes @ 2015-01-21 10:58 UTC (permalink / raw) To: GCC Patches; +Cc: Richard Biener, Andrew Pinski, Rasmus Villemoes gcc.dg/20150120-3.c: New test This is simply the 'dual' of the previous pattern, added for completeness. Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> --- gcc/match.pd | 6 ++++++ gcc/testsuite/gcc.dg/20150120-3.c | 32 ++++++++++++++++++++++++++++++++ 2 files changed, 38 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/20150120-3.c diff --git gcc/match.pd gcc/match.pd index d25fc3a..47865f1 100644 --- gcc/match.pd +++ gcc/match.pd @@ -267,6 +267,12 @@ along with GCC; see the file COPYING3. If not see (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) (bit_and @0 (bit_not @1)))) +/* x | ~(x | y) -> x | ~y */ +(simplify + (bit_ior:c @0 (bit_not (bit_ior:c@2 @0 @1))) + (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) + (bit_ior @0 (bit_not @1)))) + (simplify (abs (negate @0)) (abs @0)) diff --git gcc/testsuite/gcc.dg/20150120-3.c gcc/testsuite/gcc.dg/20150120-3.c new file mode 100644 index 0000000..322556f --- /dev/null +++ gcc/testsuite/gcc.dg/20150120-3.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-original" } */ + +/* x | ~(x | y) -> x | ~y */ +int fn1 (int x, int y) +{ + return x | ~(x | y); +} +int fn2 (int x, int y) +{ + return ~(x | y) | x; +} +int fn3 (int x, int y) +{ + return x | ~(y | x); +} +int fn4 (int x, int y) +{ + return ~(y | x) | x; +} +int fn5 (int z) +{ + return z | ~(z | 3); +} +int fn6 (int z) +{ + return ~(z | 3) | z; +} + + +/* { dg-final { scan-tree-dump-times "~y \\| x" 4 "original" } } */ +/* { dg-final { scan-tree-dump-times "z \\| -4" 2 "original" } } */ -- 2.1.3 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 3/4] match.pd: Add x | ~(x | y) -> x | ~y pattern 2015-01-21 10:58 ` [PATCH 3/4] match.pd: Add x | ~(x | y) -> x | " Rasmus Villemoes @ 2015-01-21 11:32 ` Marek Polacek 0 siblings, 0 replies; 21+ messages in thread From: Marek Polacek @ 2015-01-21 11:32 UTC (permalink / raw) To: Rasmus Villemoes; +Cc: GCC Patches, Richard Biener, Andrew Pinski On Wed, Jan 21, 2015 at 11:49:53AM +0100, Rasmus Villemoes wrote: > gcc.dg/20150120-3.c: New test > > This is simply the 'dual' of the previous pattern, added for > completeness. If this is dual, I think you could make use of (for op (bit_ior bit_and) ... and do the simplification in one hunk. Marek ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 4/4] match.pd: Add x + ((-x) & m) -> (x + m) & ~m pattern 2015-01-21 10:50 ` [PATCH 0/4] A few " Rasmus Villemoes ` (2 preceding siblings ...) 2015-01-21 10:58 ` [PATCH 3/4] match.pd: Add x | ~(x | y) -> x | " Rasmus Villemoes @ 2015-01-21 11:17 ` Rasmus Villemoes 2015-04-30 9:42 ` Richard Biener 3 siblings, 1 reply; 21+ messages in thread From: Rasmus Villemoes @ 2015-01-21 11:17 UTC (permalink / raw) To: GCC Patches; +Cc: Richard Biener, Andrew Pinski, Rasmus Villemoes Generalizing the x+(x&1) pattern, one can round up x to a multiple of a 2^k by adding the negative of x modulo 2^k. But it is fewer instructions, and presumably requires fewer registers, to do the more common (x+m)&~m where m=2^k-1. Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> --- gcc/match.pd | 9 ++++++ gcc/testsuite/gcc.dg/20150120-4.c | 59 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 68 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/20150120-4.c diff --git gcc/match.pd gcc/match.pd index 47865f1..93c2298 100644 --- gcc/match.pd +++ gcc/match.pd @@ -273,6 +273,15 @@ along with GCC; see the file COPYING3. If not see (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) (bit_ior @0 (bit_not @1)))) +/* x + ((-x) & m) -> (x + m) & ~m when m == 2^k-1. */ +(simplify + (plus:c @0 (bit_and@2 (negate @0) CONSTANT_CLASS_P@1)) + (with { tree cst = fold_binary (PLUS_EXPR, TREE_TYPE (@1), + @1, build_one_cst (TREE_TYPE (@1))); } + (if ((TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) + && cst && integer_pow2p (cst)) + (bit_and (plus @0 @1) (bit_not @1))))) + (simplify (abs (negate @0)) (abs @0)) diff --git gcc/testsuite/gcc.dg/20150120-4.c gcc/testsuite/gcc.dg/20150120-4.c new file mode 100644 index 0000000..c3552bf --- /dev/null +++ gcc/testsuite/gcc.dg/20150120-4.c @@ -0,0 +1,59 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-original" } */ + +/* x + ((-x) & m) -> (x + m) & ~m for m one less than a pow2. */ +int +fn1 (int x) +{ + return x + ((-x) & 7); +} +int +fn2 (int x) +{ + return ((-x) & 7) + x; +} +unsigned int +fn3 (unsigned int x) +{ + return x + ((-x) & 7); +} +unsigned int +fn4 (unsigned int x) +{ + return ((-x) & 7) + x; +} +unsigned int +fn5 (unsigned int x) +{ + return x + ((-x) % 8); +} +unsigned int +fn6 (unsigned int x) +{ + return ((-x) % 8) + x; +} +int +fn7 (int x) +{ + return x + ((-x) & 9); +} +int +fn8 (int x) +{ + return ((-x) & 9) + x; +} +unsigned int +fn9 (unsigned int x) +{ + return x + ((-x) & ~0U); +} +unsigned int +fn10 (unsigned int x) +{ + return ((-x) & ~0U) + x; +} + + +/* { dg-final { scan-tree-dump-times "x \\+ 7" 6 "original" } } */ +/* { dg-final { scan-tree-dump-times "-x & 9" 2 "original" } } */ +/* { dg-final { scan-tree-dump-times "return 0" 2 "original" } } */ -- 2.1.3 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 4/4] match.pd: Add x + ((-x) & m) -> (x + m) & ~m pattern 2015-01-21 11:17 ` [PATCH 4/4] match.pd: Add x + ((-x) & m) -> (x + m) & ~m pattern Rasmus Villemoes @ 2015-04-30 9:42 ` Richard Biener 2015-04-30 11:56 ` Marc Glisse 0 siblings, 1 reply; 21+ messages in thread From: Richard Biener @ 2015-04-30 9:42 UTC (permalink / raw) To: Rasmus Villemoes; +Cc: GCC Patches, Andrew Pinski On Wed, Jan 21, 2015 at 11:49 AM, Rasmus Villemoes <rv@rasmusvillemoes.dk> wrote: > Generalizing the x+(x&1) pattern, one can round up x to a multiple of > a 2^k by adding the negative of x modulo 2^k. But it is fewer > instructions, and presumably requires fewer registers, to do the more > common (x+m)&~m where m=2^k-1. > > Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> > --- > gcc/match.pd | 9 ++++++ > gcc/testsuite/gcc.dg/20150120-4.c | 59 +++++++++++++++++++++++++++++++++++++++ > 2 files changed, 68 insertions(+) > create mode 100644 gcc/testsuite/gcc.dg/20150120-4.c > > diff --git gcc/match.pd gcc/match.pd > index 47865f1..93c2298 100644 > --- gcc/match.pd > +++ gcc/match.pd > @@ -273,6 +273,15 @@ along with GCC; see the file COPYING3. If not see > (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) > (bit_ior @0 (bit_not @1)))) > > +/* x + ((-x) & m) -> (x + m) & ~m when m == 2^k-1. */ > +(simplify > + (plus:c @0 (bit_and@2 (negate @0) CONSTANT_CLASS_P@1)) I think you want to restrict this to INTEGER_CST@1 > + (with { tree cst = fold_binary (PLUS_EXPR, TREE_TYPE (@1), > + @1, build_one_cst (TREE_TYPE (@1))); } We shouldn't dispatch to fold_binary in patterns. int_const_binop would be the appropriate function to use - but what happens for @1 == INT_MAX where @1 + 1 overflows? Similar, is this also valid for negative @1 and thus signed mask types? IMHO we should check whether @1 is equal to wi::mask (TYPE_PRECISION (TREE_TYPE (@1)) - wi::clz (@1), false, TYPE_PRECISION (TREE_TYPE (@1)). As with the other patch a ChangeLog entry is missing as well as stating how you tested the patch. Thanks, Richard. > + (if ((TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) > + && cst && integer_pow2p (cst)) > + (bit_and (plus @0 @1) (bit_not @1))))) > + > (simplify > (abs (negate @0)) > (abs @0)) > diff --git gcc/testsuite/gcc.dg/20150120-4.c gcc/testsuite/gcc.dg/20150120-4.c > new file mode 100644 > index 0000000..c3552bf > --- /dev/null > +++ gcc/testsuite/gcc.dg/20150120-4.c > @@ -0,0 +1,59 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-original" } */ > + > +/* x + ((-x) & m) -> (x + m) & ~m for m one less than a pow2. */ > +int > +fn1 (int x) > +{ > + return x + ((-x) & 7); > +} > +int > +fn2 (int x) > +{ > + return ((-x) & 7) + x; > +} > +unsigned int > +fn3 (unsigned int x) > +{ > + return x + ((-x) & 7); > +} > +unsigned int > +fn4 (unsigned int x) > +{ > + return ((-x) & 7) + x; > +} > +unsigned int > +fn5 (unsigned int x) > +{ > + return x + ((-x) % 8); > +} > +unsigned int > +fn6 (unsigned int x) > +{ > + return ((-x) % 8) + x; > +} > +int > +fn7 (int x) > +{ > + return x + ((-x) & 9); > +} > +int > +fn8 (int x) > +{ > + return ((-x) & 9) + x; > +} > +unsigned int > +fn9 (unsigned int x) > +{ > + return x + ((-x) & ~0U); > +} > +unsigned int > +fn10 (unsigned int x) > +{ > + return ((-x) & ~0U) + x; > +} > + > + > +/* { dg-final { scan-tree-dump-times "x \\+ 7" 6 "original" } } */ > +/* { dg-final { scan-tree-dump-times "-x & 9" 2 "original" } } */ > +/* { dg-final { scan-tree-dump-times "return 0" 2 "original" } } */ > -- > 2.1.3 > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 4/4] match.pd: Add x + ((-x) & m) -> (x + m) & ~m pattern 2015-04-30 9:42 ` Richard Biener @ 2015-04-30 11:56 ` Marc Glisse 2015-04-30 12:25 ` Richard Biener 0 siblings, 1 reply; 21+ messages in thread From: Marc Glisse @ 2015-04-30 11:56 UTC (permalink / raw) To: Richard Biener; +Cc: Rasmus Villemoes, GCC Patches, Andrew Pinski On Thu, 30 Apr 2015, Richard Biener wrote: > On Wed, Jan 21, 2015 at 11:49 AM, Rasmus Villemoes > <rv@rasmusvillemoes.dk> wrote: >> Generalizing the x+(x&1) pattern, one can round up x to a multiple of >> a 2^k by adding the negative of x modulo 2^k. But it is fewer >> instructions, and presumably requires fewer registers, to do the more >> common (x+m)&~m where m=2^k-1. >> >> Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> >> --- >> gcc/match.pd | 9 ++++++ >> gcc/testsuite/gcc.dg/20150120-4.c | 59 +++++++++++++++++++++++++++++++++++++++ >> 2 files changed, 68 insertions(+) >> create mode 100644 gcc/testsuite/gcc.dg/20150120-4.c >> >> diff --git gcc/match.pd gcc/match.pd >> index 47865f1..93c2298 100644 >> --- gcc/match.pd >> +++ gcc/match.pd >> @@ -273,6 +273,15 @@ along with GCC; see the file COPYING3. If not see >> (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) >> (bit_ior @0 (bit_not @1)))) >> >> +/* x + ((-x) & m) -> (x + m) & ~m when m == 2^k-1. */ >> +(simplify >> + (plus:c @0 (bit_and@2 (negate @0) CONSTANT_CLASS_P@1)) > > I think you want to restrict this to INTEGER_CST@1 Is this only to make the following test easier (a good enough reason for me) or is there some fundamental reason why this transformation would be wrong for vectors? >> + (with { tree cst = fold_binary (PLUS_EXPR, TREE_TYPE (@1), >> + @1, build_one_cst (TREE_TYPE (@1))); } > > We shouldn't dispatch to fold_binary in patterns. int_const_binop would > be the appropriate function to use - but what happens for @1 == INT_MAX > where @1 + 1 overflows? Similar, is this also valid for negative @1 > and thus signed mask types? IMHO we should check whether @1 > is equal to wi::mask (TYPE_PRECISION (TREE_TYPE (@1)) - wi::clz (@1), > false, TYPE_PRECISION (TREE_TYPE (@1)). > > As with the other patch a ChangeLog entry is missing as well as stating > how you tested the patch. > > Thanks, > Richard. > >> + (if ((TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) >> + && cst && integer_pow2p (cst)) >> + (bit_and (plus @0 @1) (bit_not @1))))) -- Marc Glisse ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 4/4] match.pd: Add x + ((-x) & m) -> (x + m) & ~m pattern 2015-04-30 11:56 ` Marc Glisse @ 2015-04-30 12:25 ` Richard Biener 0 siblings, 0 replies; 21+ messages in thread From: Richard Biener @ 2015-04-30 12:25 UTC (permalink / raw) To: GCC Patches; +Cc: Rasmus Villemoes, Andrew Pinski On Thu, Apr 30, 2015 at 1:44 PM, Marc Glisse <marc.glisse@inria.fr> wrote: > On Thu, 30 Apr 2015, Richard Biener wrote: > >> On Wed, Jan 21, 2015 at 11:49 AM, Rasmus Villemoes >> <rv@rasmusvillemoes.dk> wrote: >>> >>> Generalizing the x+(x&1) pattern, one can round up x to a multiple of >>> a 2^k by adding the negative of x modulo 2^k. But it is fewer >>> instructions, and presumably requires fewer registers, to do the more >>> common (x+m)&~m where m=2^k-1. >>> >>> Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> >>> --- >>> gcc/match.pd | 9 ++++++ >>> gcc/testsuite/gcc.dg/20150120-4.c | 59 >>> +++++++++++++++++++++++++++++++++++++++ >>> 2 files changed, 68 insertions(+) >>> create mode 100644 gcc/testsuite/gcc.dg/20150120-4.c >>> >>> diff --git gcc/match.pd gcc/match.pd >>> index 47865f1..93c2298 100644 >>> --- gcc/match.pd >>> +++ gcc/match.pd >>> @@ -273,6 +273,15 @@ along with GCC; see the file COPYING3. If not see >>> (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) >>> (bit_ior @0 (bit_not @1)))) >>> >>> +/* x + ((-x) & m) -> (x + m) & ~m when m == 2^k-1. */ >>> +(simplify >>> + (plus:c @0 (bit_and@2 (negate @0) CONSTANT_CLASS_P@1)) >> >> >> I think you want to restrict this to INTEGER_CST@1 > > > Is this only to make the following test easier (a good enough reason for me) > or is there some fundamental reason why this transformation would be wrong > for vectors? Good question - I suppose it also works for vectors (well, the predicates don't). for non-ingegers or complex ints we shouldn't arrive here as we can't have bit_and for them. for pointers we can't have plus on them. So yes, it makes the following tests easier. A TODO comment for vectors might be appropriate (we'd simply need a predicate that can test for all emlements being 2^k-1). Richard. > >>> + (with { tree cst = fold_binary (PLUS_EXPR, TREE_TYPE (@1), >>> + @1, build_one_cst (TREE_TYPE (@1))); } >> >> >> We shouldn't dispatch to fold_binary in patterns. int_const_binop would >> be the appropriate function to use - but what happens for @1 == INT_MAX >> where @1 + 1 overflows? Similar, is this also valid for negative @1 >> and thus signed mask types? IMHO we should check whether @1 >> is equal to wi::mask (TYPE_PRECISION (TREE_TYPE (@1)) - wi::clz (@1), >> false, TYPE_PRECISION (TREE_TYPE (@1)). >> >> As with the other patch a ChangeLog entry is missing as well as stating >> how you tested the patch. >> >> Thanks, >> Richard. >> >>> + (if ((TREE_CODE (@2) != SSA_NAME || has_single_use (@2)) >>> + && cst && integer_pow2p (cst)) >>> + (bit_and (plus @0 @1) (bit_not @1))))) > > > -- > Marc Glisse ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: RFC: Two minor optimization patterns 2015-01-13 22:47 RFC: Two minor optimization patterns Rasmus Villemoes 2015-01-13 22:56 ` Andrew Pinski @ 2015-01-14 13:14 ` Marc Glisse 2015-01-14 13:58 ` Richard Biener 1 sibling, 1 reply; 21+ messages in thread From: Marc Glisse @ 2015-01-14 13:14 UTC (permalink / raw) To: Rasmus Villemoes; +Cc: gcc-patches On Tue, 13 Jan 2015, Rasmus Villemoes wrote: > diff --git gcc/match.pd gcc/match.pd > index 81c4ee6..04a0bc4 100644 > --- gcc/match.pd > +++ gcc/match.pd > @@ -262,6 +262,16 @@ along with GCC; see the file COPYING3. If not see > (abs tree_expr_nonnegative_p@0) > @0) > > +/* x + (x & 1) -> (x + 1) & ~1 */ > +(simplify > + (plus @0 (bit_and @0 integer_onep@1)) > + (bit_and (plus @0 @1) (bit_not @1))) > + > +/* x | ~(x | y) -> x | ~y */ > +(simplify > + (bit_ior @0 (bit_not (bit_ior @0 @1))) > + (bit_ior @0 (bit_not @1))) You may want to consider using has_single_use (see other patterns). -- Marc Glisse ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: RFC: Two minor optimization patterns 2015-01-14 13:14 ` RFC: Two minor optimization patterns Marc Glisse @ 2015-01-14 13:58 ` Richard Biener 2015-01-14 14:31 ` Marc Glisse 0 siblings, 1 reply; 21+ messages in thread From: Richard Biener @ 2015-01-14 13:58 UTC (permalink / raw) To: GCC Patches; +Cc: Rasmus Villemoes On Wed, Jan 14, 2015 at 1:45 PM, Marc Glisse <marc.glisse@inria.fr> wrote: > On Tue, 13 Jan 2015, Rasmus Villemoes wrote: > >> diff --git gcc/match.pd gcc/match.pd >> index 81c4ee6..04a0bc4 100644 >> --- gcc/match.pd >> +++ gcc/match.pd >> @@ -262,6 +262,16 @@ along with GCC; see the file COPYING3. If not see >> (abs tree_expr_nonnegative_p@0) >> @0) >> >> +/* x + (x & 1) -> (x + 1) & ~1 */ >> +(simplify >> + (plus @0 (bit_and @0 integer_onep@1)) >> + (bit_and (plus @0 @1) (bit_not @1))) >> + >> +/* x | ~(x | y) -> x | ~y */ >> +(simplify >> + (bit_ior @0 (bit_not (bit_ior @0 @1))) >> + (bit_ior @0 (bit_not @1))) > > > You may want to consider using has_single_use (see other patterns). Just to clarify, you mean on the (x | y) sub-expression? Richard. > -- > Marc Glisse ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: RFC: Two minor optimization patterns 2015-01-14 13:58 ` Richard Biener @ 2015-01-14 14:31 ` Marc Glisse 2015-01-14 14:49 ` Richard Biener 0 siblings, 1 reply; 21+ messages in thread From: Marc Glisse @ 2015-01-14 14:31 UTC (permalink / raw) To: Richard Biener; +Cc: GCC Patches, Rasmus Villemoes On Wed, 14 Jan 2015, Richard Biener wrote: > On Wed, Jan 14, 2015 at 1:45 PM, Marc Glisse <marc.glisse@inria.fr> wrote: >> On Tue, 13 Jan 2015, Rasmus Villemoes wrote: >> >>> diff --git gcc/match.pd gcc/match.pd >>> index 81c4ee6..04a0bc4 100644 >>> --- gcc/match.pd >>> +++ gcc/match.pd >>> @@ -262,6 +262,16 @@ along with GCC; see the file COPYING3. If not see >>> (abs tree_expr_nonnegative_p@0) >>> @0) >>> >>> +/* x + (x & 1) -> (x + 1) & ~1 */ >>> +(simplify >>> + (plus @0 (bit_and @0 integer_onep@1)) >>> + (bit_and (plus @0 @1) (bit_not @1))) >>> + >>> +/* x | ~(x | y) -> x | ~y */ >>> +(simplify >>> + (bit_ior @0 (bit_not (bit_ior @0 @1))) >>> + (bit_ior @0 (bit_not @1))) >> >> >> You may want to consider using has_single_use (see other patterns). > > Just to clarify, you mean on the (x | y) sub-expression? I meant on (x & 1) for the first pattern and on ~(x | y) for the second one. That is, cases where the transformation could actually increase the number of statements (we don't have a convenient interface to check if (x+1) or ~y are already available). (x | y) could sometimes be cheaper than y if it is already computed and it shortens the lifetime of y, but we are way too early in the pipeline to make an informed decision about that, so it seems better to canonicalize. Now that I think about it, some platforms probably have an instruction ornot, so my reasons above could be wrong... -- Marc Glisse ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: RFC: Two minor optimization patterns 2015-01-14 14:31 ` Marc Glisse @ 2015-01-14 14:49 ` Richard Biener 0 siblings, 0 replies; 21+ messages in thread From: Richard Biener @ 2015-01-14 14:49 UTC (permalink / raw) To: GCC Patches; +Cc: Rasmus Villemoes On Wed, Jan 14, 2015 at 3:26 PM, Marc Glisse <marc.glisse@inria.fr> wrote: > On Wed, 14 Jan 2015, Richard Biener wrote: > >> On Wed, Jan 14, 2015 at 1:45 PM, Marc Glisse <marc.glisse@inria.fr> wrote: >>> >>> On Tue, 13 Jan 2015, Rasmus Villemoes wrote: >>> >>>> diff --git gcc/match.pd gcc/match.pd >>>> index 81c4ee6..04a0bc4 100644 >>>> --- gcc/match.pd >>>> +++ gcc/match.pd >>>> @@ -262,6 +262,16 @@ along with GCC; see the file COPYING3. If not see >>>> (abs tree_expr_nonnegative_p@0) >>>> @0) >>>> >>>> +/* x + (x & 1) -> (x + 1) & ~1 */ >>>> +(simplify >>>> + (plus @0 (bit_and @0 integer_onep@1)) >>>> + (bit_and (plus @0 @1) (bit_not @1))) >>>> + >>>> +/* x | ~(x | y) -> x | ~y */ >>>> +(simplify >>>> + (bit_ior @0 (bit_not (bit_ior @0 @1))) >>>> + (bit_ior @0 (bit_not @1))) >>> >>> >>> >>> You may want to consider using has_single_use (see other patterns). >> >> >> Just to clarify, you mean on the (x | y) sub-expression? > > > I meant on (x & 1) for the first pattern and on ~(x | y) for the second one. > That is, cases where the transformation could actually increase the number > of statements (we don't have a convenient interface to check if (x+1) or ~y > are already available). > > (x | y) could sometimes be cheaper than y if it is already computed and it > shortens the lifetime of y, but we are way too early in the pipeline to make > an informed decision about that, so it seems better to canonicalize. Yeah. Note that I've not yet settled myself on how to attack the "single-use issue" generally. For example I'd like to do code generation for a specific simplification variant for value-numbering which can check the availability of expressions - here explicit has_single_use calls wil be prohibitive. It should be the responsibility of the caller to apply restrictions like this (but for the tree-ssa-forwprop.c case I've not yet come up with a reasonable way to tame things down...) Richard. > Now that I think about it, some platforms probably have an instruction > ornot, so my reasons above could be wrong... > -- > Marc Glisse ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2015-05-01 18:29 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-01-13 22:47 RFC: Two minor optimization patterns Rasmus Villemoes 2015-01-13 22:56 ` Andrew Pinski 2015-01-14 9:52 ` Richard Biener 2015-01-14 12:45 ` Rasmus Villemoes 2015-01-14 14:01 ` Richard Biener 2015-01-21 10:50 ` [PATCH 0/4] A few " Rasmus Villemoes 2015-01-21 10:50 ` [PATCH 1/4] match.pd: Add x + (x & 1) -> (x + 1) & ~1 pattern Rasmus Villemoes 2015-04-30 9:34 ` Richard Biener 2015-05-01 18:26 ` Jeff Law 2015-01-21 10:55 ` [PATCH 2/4] match.pd: Add x & ~(x & y) -> x & ~y pattern Rasmus Villemoes 2015-05-01 18:29 ` Jeff Law 2015-01-21 10:58 ` [PATCH 3/4] match.pd: Add x | ~(x | y) -> x | " Rasmus Villemoes 2015-01-21 11:32 ` Marek Polacek 2015-01-21 11:17 ` [PATCH 4/4] match.pd: Add x + ((-x) & m) -> (x + m) & ~m pattern Rasmus Villemoes 2015-04-30 9:42 ` Richard Biener 2015-04-30 11:56 ` Marc Glisse 2015-04-30 12:25 ` Richard Biener 2015-01-14 13:14 ` RFC: Two minor optimization patterns Marc Glisse 2015-01-14 13:58 ` Richard Biener 2015-01-14 14:31 ` Marc Glisse 2015-01-14 14:49 ` Richard Biener
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).