public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] match.pd: reassociate multiplications with constants
@ 2017-07-13 16:37 Alexander Monakov
  2017-07-13 18:31 ` Marc Glisse
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Monakov @ 2017-07-13 16:37 UTC (permalink / raw)
  To: gcc-patches; +Cc: Marc Glisse

Hi,

This is a followup to https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01545.html

Recently due to a fix for PR 80800 GCC has lost the ability to reassociate
signed multiplications chains to go from 'X * CST1 * Y * CST2'
to 'X * Y * (CST1 * CST2)'.  The fix to that PR prevents extract_muldiv from
introducing '(X * (CST1 * CST2)) * Y', which was wrong because it may cause
intermediate signed overflow (unexpected if Y == 0).

As mentioned in that thread, we can reassociate constants to outermost operands
instead: this is safe because CST1 cannot be 0 or -1, since those are handled
by other match.pd rules.

(in fact it's possible to reassociate negates too, and go from '(-X) * Y * CST'
to '(X * Y) * (-CST)' if (-CST) doesn't overflow (again, we know that CST != 1),
but I'm not sure how valuable that is in practice, so opted not to do that yet)

The following patch reinstates folding by adding a new match.pd rule that moves
constants to outermost operands, where they can be merged by fold_binary
machinery if their product doesn't overflow.  Bootstrapped and regtested on
amd64, OK for trunk?

	* match.pd ((X * CST) * Y): Reassociate to (X * Y) * CST.

diff --git a/gcc/match.pd b/gcc/match.pd
index 4c64b21..e49f879 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2139,6 +2139,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (mult @0 integer_minus_onep)
  (negate @0))

+/* Reassociate (X * CST) * Y to (X * Y) * CST.  This does not introduce
+   signed overflow: previous rules handle CST being -1 or 0, and for
+   the rest we know that if X * Y overflows, so does (X * CST) * Y.  */
+(if (!TYPE_OVERFLOW_SANITIZED (type) && !TYPE_SATURATING (type))
+ (simplify
+  (mult:c (mult @0 INTEGER_CST@1) @2)
+  (if (TREE_CODE (@2) != INTEGER_CST)
+   (mult (mult @0 @2) @1))))
+
 /* True if we can easily extract the real and imaginary parts of a complex
    number.  */
 (match compositional_complex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] match.pd: reassociate multiplications with constants
  2017-07-13 16:37 [PATCH] match.pd: reassociate multiplications with constants Alexander Monakov
@ 2017-07-13 18:31 ` Marc Glisse
  2017-07-13 19:42   ` Alexander Monakov
                     ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Marc Glisse @ 2017-07-13 18:31 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-patches

On Thu, 13 Jul 2017, Alexander Monakov wrote:

> This is a followup to https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01545.html
>
> Recently due to a fix for PR 80800 GCC has lost the ability to reassociate
> signed multiplications chains to go from 'X * CST1 * Y * CST2'
> to 'X * Y * (CST1 * CST2)'.  The fix to that PR prevents extract_muldiv from
> introducing '(X * (CST1 * CST2)) * Y', which was wrong because it may cause
> intermediate signed overflow (unexpected if Y == 0).
>
> As mentioned in that thread, we can reassociate constants to outermost operands
> instead: this is safe because CST1 cannot be 0 or -1, since those are handled
> by other match.pd rules.
>
> (in fact it's possible to reassociate negates too, and go from '(-X) * Y * CST'
> to '(X * Y) * (-CST)' if (-CST) doesn't overflow (again, we know that CST != 1),
> but I'm not sure how valuable that is in practice, so opted not to do that yet)
>
> The following patch reinstates folding by adding a new match.pd rule that moves
> constants to outermost operands, where they can be merged by fold_binary
> machinery if their product doesn't overflow.  Bootstrapped and regtested on
> amd64, OK for trunk?
>
> 	* match.pd ((X * CST) * Y): Reassociate to (X * Y) * CST.
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 4c64b21..e49f879 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -2139,6 +2139,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (mult @0 integer_minus_onep)
>  (negate @0))
>
> +/* Reassociate (X * CST) * Y to (X * Y) * CST.  This does not introduce
> +   signed overflow: previous rules handle CST being -1 or 0, and for
> +   the rest we know that if X * Y overflows, so does (X * CST) * Y.  */
> +(if (!TYPE_OVERFLOW_SANITIZED (type) && !TYPE_SATURATING (type))
> + (simplify
> +  (mult:c (mult @0 INTEGER_CST@1) @2)
> +  (if (TREE_CODE (@2) != INTEGER_CST)
> +   (mult (mult @0 @2) @1))))
> +
> /* True if we can easily extract the real and imaginary parts of a complex
>    number.  */
> (match compositional_complex

Thanks. I guess it makes sense as a canonicalization (there are always 
cases where those make it harder to optimize, but hopefully fewer than 
those where they help).

I notice that we do not turn (X*10)*10 into X*100 in GIMPLE. X*big*big 
where abs(big*big)>abs(INT_MIN) can be optimized to 0, the only hard case 
is when the product of the constants is -INT_MIN, which we could turn into 
X<<31 for instance (sadly loses range info), or (-X)*INT_MIN or whatever. 
That would make a nice follow-up, if you are interested.

Relying on inner expressions being folded can be slightly dangerous, 
especially for generic IIRC. It seems easy enough to check that @1 is 
neither 0 nor -1 for safety.

Probably needs :s on the inner multiplication.

Unless the test on TYPE_OVERFLOW_SANITIZED etc is shared with adjacent 
transformations, I'd rather put it inside, with the other if, but that's a 
matter of taste.

One small testcase please? Or is there already one that is currently 
failing?

(I cannot ok patches, only comment)

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] match.pd: reassociate multiplications with constants
  2017-07-13 18:31 ` Marc Glisse
@ 2017-07-13 19:42   ` Alexander Monakov
  2017-07-14  6:00     ` Marc Glisse
  2017-07-15 17:01   ` Alexander Monakov
  2017-07-15 17:16   ` Alexander Monakov
  2 siblings, 1 reply; 16+ messages in thread
From: Alexander Monakov @ 2017-07-13 19:42 UTC (permalink / raw)
  To: Marc Glisse; +Cc: gcc-patches

On Thu, 13 Jul 2017, Marc Glisse wrote:
> I notice that we do not turn (X*10)*10 into X*100 in GIMPLE.

Sorry, could you clarify what you mean here?  I think we certainly do that,
just not via match.pd, but in 'associate:' case of fold_binary_loc.

> Relying on inner expressions being folded can be slightly dangerous,
> especially for generic IIRC. It seems easy enough to check that @1 is neither
> 0 nor -1 for safety.

I wanted to add a gcc_checking_assert to that effect, but it's not used in
match.pd anywhere.  Is there a nice way to do that?

Thanks!
Alexander

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] match.pd: reassociate multiplications with constants
  2017-07-13 19:42   ` Alexander Monakov
@ 2017-07-14  6:00     ` Marc Glisse
  2017-07-18  8:23       ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Marc Glisse @ 2017-07-14  6:00 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-patches

On Thu, 13 Jul 2017, Alexander Monakov wrote:

> On Thu, 13 Jul 2017, Marc Glisse wrote:
>> I notice that we do not turn (X*10)*10 into X*100 in GIMPLE.
>
> Sorry, could you clarify what you mean here?  I think we certainly do that,
> just not via match.pd, but in 'associate:' case of fold_binary_loc.

fold_binary_loc is for GENERIC, so mostly for front-end time optimization 
of expressions.

int f(int a){
   int b=a*10;
   return b*10;
}

$ gcc-snapshot -O3 -S -fdump-tree-optimized a.c
$ cat a.c.228t.optimized
[...]
   b_2 = a_1(D) * 10;
   _3 = b_2 * 10;
   return _3;

>> Relying on inner expressions being folded can be slightly dangerous,
>> especially for generic IIRC. It seems easy enough to check that @1 is neither
>> 0 nor -1 for safety.
>
> I wanted to add a gcc_checking_assert to that effect, but it's not used in
> match.pd anywhere.  Is there a nice way to do that?

You can use (with { arbitrary C++ code } ... ) to add an assertion (you 
can use @0 in the block of C++ code).

I was more thinking of an "if" than an assertion though.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] match.pd: reassociate multiplications with constants
  2017-07-13 18:31 ` Marc Glisse
  2017-07-13 19:42   ` Alexander Monakov
@ 2017-07-15 17:01   ` Alexander Monakov
  2017-07-15 17:16   ` Alexander Monakov
  2 siblings, 0 replies; 16+ messages in thread
From: Alexander Monakov @ 2017-07-15 17:01 UTC (permalink / raw)
  To: Marc Glisse; +Cc: gcc-patches

On Thu, 13 Jul 2017, Marc Glisse wrote:
> I notice that we do not turn (X*10)*10 into X*100 in GIMPLE [...]

I've completely missed that.  Posting another patch to address that.

> Relying on inner expressions being folded can be slightly dangerous,
> especially for generic IIRC. It seems easy enough to check that @1 is neither
> 0 nor -1 for safety.

Yep - done.

> Probably needs :s on the inner multiplication.

Not 100% sure, but done in this version.

> Unless the test on TYPE_OVERFLOW_SANITIZED etc is shared with adjacent
> transformations, I'd rather put it inside, with the other if, but that's a
> matter of taste.

Done, probably better that way.
 
> One small testcase please? Or is there already one that is currently failing?

No, I'm not aware of any preexisting testcase.  Added.

	* match.pd ((X * CST) * Y): Reassociate to (X * Y) * CST.
testsuite/
	* gcc.dg/tree-ssa/assoc-2.c: New testcase.

diff --git a/gcc/match.pd b/gcc/match.pd
index 4c64b21..36045f1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2139,6 +2139,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (mult @0 integer_minus_onep)
  (negate @0))
 
+/* Reassociate (X * CST) * Y to (X * Y) * CST.  This does not introduce
+   signed overflow for CST != 0 && CST != -1.  */
+(simplify
+ (mult:c (mult:s @0 INTEGER_CST@1) @2)
+ (if (TREE_CODE (@2) != INTEGER_CST
+      && !integer_zerop (@1) && !integer_minus_onep (@1)
+      && !TYPE_OVERFLOW_SANITIZED (type) && !TYPE_SATURATING (type))
+  (mult (mult @0 @2) @1)))
+
 /* True if we can easily extract the real and imaginary parts of a complex
    number.  */
 (match compositional_complex
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
new file mode 100644
index 0000000..a92c882
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-gimple-raw -fdump-tree-optimized-raw" } */
+
+int f0(int a, int b){
+  return a * 33 * b * 55;
+}
+
+/* { dg-final { scan-tree-dump-times "mult_expr" 2 "gimple" } } */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] match.pd: reassociate multiplications with constants
  2017-07-13 18:31 ` Marc Glisse
  2017-07-13 19:42   ` Alexander Monakov
  2017-07-15 17:01   ` Alexander Monakov
@ 2017-07-15 17:16   ` Alexander Monakov
  2017-07-17 20:00     ` Marc Glisse
  2 siblings, 1 reply; 16+ messages in thread
From: Alexander Monakov @ 2017-07-15 17:16 UTC (permalink / raw)
  To: Marc Glisse; +Cc: gcc-patches

On Thu, 13 Jul 2017, Marc Glisse wrote:
> X*big*big where abs(big*big)>abs(INT_MIN) can be optimized to 0

I'm not sure that would be a win, eliminating X prevents the compiler from
deducing that X must be zero (if overflow invokes undefined behavior).

> the only hard case is when the product of the constants is -INT_MIN, which we
> could turn into X<<31 for instance (sadly loses range info), or (-X)*INT_MIN
> or whatever. That would make a nice follow-up, if you are interested.

Here's a patch that combines constants, but doesn't handle this case,
(I'm not sure we want to handle it, would it be useful in practice?)
and neither does substitute zero on overflow, per the above concern.

(slsr-4.c needs to be adjusted due to new simplifications)

	* match.pd ((X * CST1) * CST2): Simplify to X * (CST1 * CST2)
	if the product does not overflow.
testsuite:
	* gcc.dg/tree-ssa/assoc-2.c: Enhance.
	* gcc.dg/tree-ssa/slsr-4.c: Adjust.

diff --git a/gcc/match.pd b/gcc/match.pd
index 36045f1..7f384db 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -283,6 +283,17 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
         || mul != wi::min_value (TYPE_PRECISION (type), SIGNED))
      { build_zero_cst (type); })))))
 
+/* Combine successive multiplications.  Similar to above, but handling
+   overflow is different.  */
+(simplify
+ (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
+ (with {
+   bool overflow_p;
+   wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
+  }
+  (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))
+   (mult @0 {wide_int_to_tree (type, mul); }))))
+
 /* Optimize A / A to 1.0 if we don't care about
    NaNs or Infinities.  */
 (simplify
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
index a92c882..cc0e9d4 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
@@ -5,4 +5,15 @@ int f0(int a, int b){
   return a * 33 * b * 55;
 }
 
-/* { dg-final { scan-tree-dump-times "mult_expr" 2 "gimple" } } */
+int f1(int a){
+  a *= 33;
+  return a * 55;
+}
+
+int f2(int a, int b){
+  a *= 33;
+  return a * b * 55;
+}
+
+/* { dg-final { scan-tree-dump-times "mult_expr" 7 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "mult_expr" 5 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
index 17d7b4c..1e943b7 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
@@ -23,13 +23,9 @@ f (int i)
   foo (y);
 }
 
-/* { dg-final { scan-tree-dump-times "\\* 4" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\* 10" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\+ 20;" 1 "slsr" } } */
+/* { dg-final { scan-tree-dump-times "\\* 40" 1 "slsr" } } */
 /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\- 16;" 1 "slsr" } } */
 /* { dg-final { scan-tree-dump-times "\\- 160" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\* 4" 1 "optimized" } } */
-/* { dg-final { scan-tree-dump-times "\\* 10" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\* 40" 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "\\+ 40" 1 "optimized" } } */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] match.pd: reassociate multiplications with constants
  2017-07-15 17:16   ` Alexander Monakov
@ 2017-07-17 20:00     ` Marc Glisse
  2017-07-17 20:50       ` Alexander Monakov
  0 siblings, 1 reply; 16+ messages in thread
From: Marc Glisse @ 2017-07-17 20:00 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-patches

On Sat, 15 Jul 2017, Alexander Monakov wrote:

> On Thu, 13 Jul 2017, Marc Glisse wrote:
>> X*big*big where abs(big*big)>abs(INT_MIN) can be optimized to 0
>
> I'm not sure that would be a win, eliminating X prevents the compiler from
> deducing that X must be zero (if overflow invokes undefined behavior).

This issue is common to many simplifications. As far as I know, we almost 
never use an operation to deduce a range for its argument (the only case I 
can think of is dereferencing a pointer). Though this case is a bit 
special since we would determine a constant value, not just a range, so we 
could imagine walking through the uses of X that are dominated by the 
multiplication and replacing X by 0 there... but that's quite a bit of 
work for something hopefully rare. I wonder if the new VRP that's in the 
works changes anything there.

>> the only hard case is when the product of the constants is -INT_MIN, which we
>> could turn into X<<31 for instance (sadly loses range info), or (-X)*INT_MIN
>> or whatever. That would make a nice follow-up, if you are interested.
>
> Here's a patch that combines constants, but doesn't handle this case,
> (I'm not sure we want to handle it, would it be useful in practice?)

Probably not worth it indeed.

> and neither does substitute zero on overflow, per the above concern.
>
> (slsr-4.c needs to be adjusted due to new simplifications)
>
> 	* match.pd ((X * CST1) * CST2): Simplify to X * (CST1 * CST2)
> 	if the product does not overflow.
> testsuite:
> 	* gcc.dg/tree-ssa/assoc-2.c: Enhance.
> 	* gcc.dg/tree-ssa/slsr-4.c: Adjust.
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 36045f1..7f384db 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -283,6 +283,17 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>         || mul != wi::min_value (TYPE_PRECISION (type), SIGNED))
>      { build_zero_cst (type); })))))
>
> +/* Combine successive multiplications.  Similar to above, but handling
> +   overflow is different.  */
> +(simplify
> + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
> + (with {
> +   bool overflow_p;
> +   wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
> +  }
> +  (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))

I wonder if there are cases where this would cause trouble for saturating 
integers. The only case I can think of is when @2 is -1, but that's likely 
simplified to NEGATE_EXPR first.

> +   (mult @0 {wide_int_to_tree (type, mul); }))))
> +

There are a number of possible extensions (handle vectors, handle a 
conversion, etc) but I guess reviewers probably won't make those a 
requirement for this patch. Missing space before wide_int_to_tree.

> /* Optimize A / A to 1.0 if we don't care about
>    NaNs or Infinities.  */
> (simplify
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
> index a92c882..cc0e9d4 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
> @@ -5,4 +5,15 @@ int f0(int a, int b){
>   return a * 33 * b * 55;
> }
>
> -/* { dg-final { scan-tree-dump-times "mult_expr" 2 "gimple" } } */
> +int f1(int a){
> +  a *= 33;
> +  return a * 55;
> +}
> +
> +int f2(int a, int b){
> +  a *= 33;
> +  return a * b * 55;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "mult_expr" 7 "gimple" } } */
> +/* { dg-final { scan-tree-dump-times "mult_expr" 5 "optimized" } } */

Ah, so that's why you had -fdump-tree-optimized in the previous patch...

> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
> index 17d7b4c..1e943b7 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
> @@ -23,13 +23,9 @@ f (int i)
>   foo (y);
> }
>
> -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "slsr" } } */
> -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "slsr" } } */
> -/* { dg-final { scan-tree-dump-times "\\+ 20;" 1 "slsr" } } */
> +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "slsr" } } */
> /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "slsr" } } */
> -/* { dg-final { scan-tree-dump-times "\\- 16;" 1 "slsr" } } */
> /* { dg-final { scan-tree-dump-times "\\- 160" 1 "slsr" } } */
> -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "optimized" } } */
> -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "optimized" } } */
> /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "optimized" } } */
> /* { dg-final { scan-tree-dump-times "\\+ 40" 1 "optimized" } } */

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] match.pd: reassociate multiplications with constants
  2017-07-17 20:00     ` Marc Glisse
@ 2017-07-17 20:50       ` Alexander Monakov
  2017-07-18 17:08         ` Alexander Monakov
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Monakov @ 2017-07-17 20:50 UTC (permalink / raw)
  To: Marc Glisse; +Cc: gcc-patches

On Mon, 17 Jul 2017, Marc Glisse wrote:
> > +/* Combine successive multiplications.  Similar to above, but handling
> > +   overflow is different.  */
> > +(simplify
> > + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
> > + (with {
> > +   bool overflow_p;
> > +   wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
> > +  }
> > +  (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))
> 
> I wonder if there are cases where this would cause trouble for saturating
> integers. The only case I can think of is when @2 is -1, but that's likely
> simplified to NEGATE_EXPR first.

Ah, yes, I think if @2 is -1 or 0 then we should not attempt this transform for
either saturating or sanitized types, just like in the first patch. I think
wrapping the 'with' with 'if (!integer_minus_onep (@2) && !integer_zerop (@2))'
works, since as you say it should become a negate/zero anyway?

Alexander

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] match.pd: reassociate multiplications with constants
  2017-07-14  6:00     ` Marc Glisse
@ 2017-07-18  8:23       ` Richard Biener
  0 siblings, 0 replies; 16+ messages in thread
From: Richard Biener @ 2017-07-18  8:23 UTC (permalink / raw)
  To: Marc Glisse; +Cc: Alexander Monakov, GCC Patches

On Fri, Jul 14, 2017 at 7:59 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Thu, 13 Jul 2017, Alexander Monakov wrote:
>
>> On Thu, 13 Jul 2017, Marc Glisse wrote:
>>>
>>> I notice that we do not turn (X*10)*10 into X*100 in GIMPLE.
>>
>>
>> Sorry, could you clarify what you mean here?  I think we certainly do
>> that,
>> just not via match.pd, but in 'associate:' case of fold_binary_loc.
>
>
> fold_binary_loc is for GENERIC, so mostly for front-end time optimization of
> expressions.
>
> int f(int a){
>   int b=a*10;
>   return b*10;
> }
>
> $ gcc-snapshot -O3 -S -fdump-tree-optimized a.c
> $ cat a.c.228t.optimized
> [...]
>   b_2 = a_1(D) * 10;
>   _3 = b_2 * 10;
>   return _3;

Yeah, but I think we best address this by adding support to re-associate
expressions with !TYPE_OVERFLOW_WRAPS to the reassoc pass.
It's not going to be an easy task if you want to avoid re-writing everything
to unsigned arithmetic.

Simple cases might be worth doing with patterns (like the case above).

Richard.

>>> Relying on inner expressions being folded can be slightly dangerous,
>>> especially for generic IIRC. It seems easy enough to check that @1 is
>>> neither
>>> 0 nor -1 for safety.
>>
>>
>> I wanted to add a gcc_checking_assert to that effect, but it's not used in
>> match.pd anywhere.  Is there a nice way to do that?
>
>
> You can use (with { arbitrary C++ code } ... ) to add an assertion (you can
> use @0 in the block of C++ code).
>
> I was more thinking of an "if" than an assertion though.
>
> --
> Marc Glisse

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] match.pd: reassociate multiplications with constants
  2017-07-17 20:50       ` Alexander Monakov
@ 2017-07-18 17:08         ` Alexander Monakov
  2017-07-19 11:28           ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Monakov @ 2017-07-18 17:08 UTC (permalink / raw)
  To: Marc Glisse; +Cc: gcc-patches

On Mon, 17 Jul 2017, Alexander Monakov wrote:
> On Mon, 17 Jul 2017, Marc Glisse wrote:
> > > +/* Combine successive multiplications.  Similar to above, but handling
> > > +   overflow is different.  */
> > > +(simplify
> > > + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
> > > + (with {
> > > +   bool overflow_p;
> > > +   wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
> > > +  }
> > > +  (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))
> > 
> > I wonder if there are cases where this would cause trouble for saturating
> > integers. The only case I can think of is when @2 is -1, but that's likely
> > simplified to NEGATE_EXPR first.
> 
> Ah, yes, I think if @2 is -1 or 0 then we should not attempt this transform for
> either saturating or sanitized types, just like in the first patch. I think
> wrapping the 'with' with 'if (!integer_minus_onep (@2) && !integer_zerop (@2))'
> works, since as you say it should become a negate/zero anyway?

Updated patch:

	* match.pd ((X * CST1) * CST2): Simplify to X * (CST1 * CST2).
testsuite:
	* gcc.dg/tree-ssa/assoc-2.c: Enhance.
	* gcc.dg/tree-ssa/slsr-4.c: Adjust.

diff --git a/gcc/match.pd b/gcc/match.pd
index 36045f1..0bb5541 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -283,6 +283,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
         || mul != wi::min_value (TYPE_PRECISION (type), SIGNED))
      { build_zero_cst (type); })))))

+/* Combine successive multiplications.  Similar to above, but handling
+   overflow is different.  */
+(simplify
+ (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
+ /* More specific rules can handle 0 and -1; skip them here to avoid
+    wrong transformations for sanitized and saturating types.  */
+ (if (!integer_zerop (@2) && !integer_minus_onep (@2))
+  (with {
+    bool overflow_p;
+    wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
+   }
+   (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))
+    (mult @0 { wide_int_to_tree (type, mul); })))))
+
 /* Optimize A / A to 1.0 if we don't care about
    NaNs or Infinities.  */
 (simplify
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
index a92c882..cc0e9d4 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
@@ -5,4 +5,15 @@ int f0(int a, int b){
   return a * 33 * b * 55;
 }

-/* { dg-final { scan-tree-dump-times "mult_expr" 2 "gimple" } } */
+int f1(int a){
+  a *= 33;
+  return a * 55;
+}
+
+int f2(int a, int b){
+  a *= 33;
+  return a * b * 55;
+}
+
+/* { dg-final { scan-tree-dump-times "mult_expr" 7 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "mult_expr" 5 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
index 17d7b4c..1e943b7 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
@@ -23,13 +23,9 @@ f (int i)
   foo (y);
 }
 
-/* { dg-final { scan-tree-dump-times "\\* 4" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\* 10" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\+ 20;" 1 "slsr" } } */
+/* { dg-final { scan-tree-dump-times "\\* 40" 1 "slsr" } } */
 /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\- 16;" 1 "slsr" } } */
 /* { dg-final { scan-tree-dump-times "\\- 160" 1 "slsr" } } */
-/* { dg-final { scan-tree-dump-times "\\* 4" 1 "optimized" } } */
-/* { dg-final { scan-tree-dump-times "\\* 10" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\* 40" 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "\\+ 40" 1 "optimized" } } */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] match.pd: reassociate multiplications with constants
  2017-07-18 17:08         ` Alexander Monakov
@ 2017-07-19 11:28           ` Richard Biener
  2017-07-19 11:33             ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2017-07-19 11:28 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: Marc Glisse, GCC Patches

On Tue, Jul 18, 2017 at 7:07 PM, Alexander Monakov <amonakov@ispras.ru> wrote:
> On Mon, 17 Jul 2017, Alexander Monakov wrote:
>> On Mon, 17 Jul 2017, Marc Glisse wrote:
>> > > +/* Combine successive multiplications.  Similar to above, but handling
>> > > +   overflow is different.  */
>> > > +(simplify
>> > > + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
>> > > + (with {
>> > > +   bool overflow_p;
>> > > +   wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
>> > > +  }
>> > > +  (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))
>> >
>> > I wonder if there are cases where this would cause trouble for saturating
>> > integers. The only case I can think of is when @2 is -1, but that's likely
>> > simplified to NEGATE_EXPR first.
>>
>> Ah, yes, I think if @2 is -1 or 0 then we should not attempt this transform for
>> either saturating or sanitized types, just like in the first patch. I think
>> wrapping the 'with' with 'if (!integer_minus_onep (@2) && !integer_zerop (@2))'
>> works, since as you say it should become a negate/zero anyway?
>
> Updated patch:
>
>         * match.pd ((X * CST1) * CST2): Simplify to X * (CST1 * CST2).
> testsuite:
>         * gcc.dg/tree-ssa/assoc-2.c: Enhance.
>         * gcc.dg/tree-ssa/slsr-4.c: Adjust.
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 36045f1..0bb5541 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -283,6 +283,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>          || mul != wi::min_value (TYPE_PRECISION (type), SIGNED))
>       { build_zero_cst (type); })))))
>
> +/* Combine successive multiplications.  Similar to above, but handling
> +   overflow is different.  */
> +(simplify
> + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
> + /* More specific rules can handle 0 and -1; skip them here to avoid
> +    wrong transformations for sanitized and saturating types.  */
> + (if (!integer_zerop (@2) && !integer_minus_onep (@2))

I think integer_zerop (@2) should never happen here if you order the
pattern after

(simplify
 (mult @0 integer_zerop@1)
 @1)

which I think you do.  Why's @1 == -1 ok when sanitizing but not @2 == -1?
If you rely on

/* Transform x * -1 into -x.  */
(simplify
 (mult @0 integer_minus_onep)
 (negate @0))

then you need to move that pattern above yours (there seem to be a bunch
of related ones like 0 - @1 -> -@1 to move earlier as well).

That would leave us with the case of saturating types (the above transforms
doesn't care for those).

So unless you can come up with a testcase that breaks I'd remove the
integer_zerop/integer_minus_onep checks.

> +  (with {
> +    bool overflow_p;
> +    wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
> +   }
> +   (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))

I think overflow in the constant multiplication is ok unless
TYPE_OVERFLOW_SANITIZED
(and can we have a ubsan testcase for that that would otherwise fail?).
It's not that we introduce new overflow here.

Ok with those changes.

Thanks,
Richard.

> +    (mult @0 { wide_int_to_tree (type, mul); })))))
> +
>  /* Optimize A / A to 1.0 if we don't care about
>     NaNs or Infinities.  */
>  (simplify
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
> index a92c882..cc0e9d4 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
> @@ -5,4 +5,15 @@ int f0(int a, int b){
>    return a * 33 * b * 55;
>  }
>
> -/* { dg-final { scan-tree-dump-times "mult_expr" 2 "gimple" } } */
> +int f1(int a){
> +  a *= 33;
> +  return a * 55;
> +}
> +
> +int f2(int a, int b){
> +  a *= 33;
> +  return a * b * 55;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "mult_expr" 7 "gimple" } } */
> +/* { dg-final { scan-tree-dump-times "mult_expr" 5 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
> index 17d7b4c..1e943b7 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
> @@ -23,13 +23,9 @@ f (int i)
>    foo (y);
>  }
>
> -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "slsr" } } */
> -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "slsr" } } */
> -/* { dg-final { scan-tree-dump-times "\\+ 20;" 1 "slsr" } } */
> +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "slsr" } } */
>  /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "slsr" } } */
> -/* { dg-final { scan-tree-dump-times "\\- 16;" 1 "slsr" } } */
>  /* { dg-final { scan-tree-dump-times "\\- 160" 1 "slsr" } } */
> -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "optimized" } } */
> -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "optimized" } } */
>  /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "optimized" } } */
>  /* { dg-final { scan-tree-dump-times "\\+ 40" 1 "optimized" } } */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] match.pd: reassociate multiplications with constants
  2017-07-19 11:28           ` Richard Biener
@ 2017-07-19 11:33             ` Richard Biener
  2017-07-19 14:40               ` Alexander Monakov
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2017-07-19 11:33 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: Marc Glisse, GCC Patches

On Wed, Jul 19, 2017 at 1:28 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Jul 18, 2017 at 7:07 PM, Alexander Monakov <amonakov@ispras.ru> wrote:
>> On Mon, 17 Jul 2017, Alexander Monakov wrote:
>>> On Mon, 17 Jul 2017, Marc Glisse wrote:
>>> > > +/* Combine successive multiplications.  Similar to above, but handling
>>> > > +   overflow is different.  */
>>> > > +(simplify
>>> > > + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
>>> > > + (with {
>>> > > +   bool overflow_p;
>>> > > +   wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
>>> > > +  }
>>> > > +  (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))
>>> >
>>> > I wonder if there are cases where this would cause trouble for saturating
>>> > integers. The only case I can think of is when @2 is -1, but that's likely
>>> > simplified to NEGATE_EXPR first.
>>>
>>> Ah, yes, I think if @2 is -1 or 0 then we should not attempt this transform for
>>> either saturating or sanitized types, just like in the first patch. I think
>>> wrapping the 'with' with 'if (!integer_minus_onep (@2) && !integer_zerop (@2))'
>>> works, since as you say it should become a negate/zero anyway?
>>
>> Updated patch:
>>
>>         * match.pd ((X * CST1) * CST2): Simplify to X * (CST1 * CST2).
>> testsuite:
>>         * gcc.dg/tree-ssa/assoc-2.c: Enhance.
>>         * gcc.dg/tree-ssa/slsr-4.c: Adjust.
>>
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 36045f1..0bb5541 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -283,6 +283,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>          || mul != wi::min_value (TYPE_PRECISION (type), SIGNED))
>>       { build_zero_cst (type); })))))
>>
>> +/* Combine successive multiplications.  Similar to above, but handling
>> +   overflow is different.  */
>> +(simplify
>> + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
>> + /* More specific rules can handle 0 and -1; skip them here to avoid
>> +    wrong transformations for sanitized and saturating types.  */
>> + (if (!integer_zerop (@2) && !integer_minus_onep (@2))
>
> I think integer_zerop (@2) should never happen here if you order the
> pattern after
>
> (simplify
>  (mult @0 integer_zerop@1)
>  @1)
>
> which I think you do.  Why's @1 == -1 ok when sanitizing but not @2 == -1?
> If you rely on
>
> /* Transform x * -1 into -x.  */
> (simplify
>  (mult @0 integer_minus_onep)
>  (negate @0))
>
> then you need to move that pattern above yours (there seem to be a bunch
> of related ones like 0 - @1 -> -@1 to move earlier as well).
>
> That would leave us with the case of saturating types (the above transforms
> doesn't care for those).
>
> So unless you can come up with a testcase that breaks I'd remove the
> integer_zerop/integer_minus_onep checks.

So for saturating types isn't the issue when @1 and @2 have opposite
sign and the inner multiply would have saturated?  [how do saturated
types behave with sign-changing multiplication/negation anyway?]

>> +  (with {
>> +    bool overflow_p;
>> +    wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p);
>> +   }
>> +   (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type))
>
> I think overflow in the constant multiplication is ok unless
> TYPE_OVERFLOW_SANITIZED
> (and can we have a ubsan testcase for that that would otherwise fail?).
> It's not that we introduce new overflow here.
>
> Ok with those changes.
>
> Thanks,
> Richard.
>
>> +    (mult @0 { wide_int_to_tree (type, mul); })))))
>> +
>>  /* Optimize A / A to 1.0 if we don't care about
>>     NaNs or Infinities.  */
>>  (simplify
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
>> index a92c882..cc0e9d4 100644
>> --- a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c
>> @@ -5,4 +5,15 @@ int f0(int a, int b){
>>    return a * 33 * b * 55;
>>  }
>>
>> -/* { dg-final { scan-tree-dump-times "mult_expr" 2 "gimple" } } */
>> +int f1(int a){
>> +  a *= 33;
>> +  return a * 55;
>> +}
>> +
>> +int f2(int a, int b){
>> +  a *= 33;
>> +  return a * b * 55;
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-times "mult_expr" 7 "gimple" } } */
>> +/* { dg-final { scan-tree-dump-times "mult_expr" 5 "optimized" } } */
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
>> index 17d7b4c..1e943b7 100644
>> --- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c
>> @@ -23,13 +23,9 @@ f (int i)
>>    foo (y);
>>  }
>>
>> -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "slsr" } } */
>> -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "slsr" } } */
>> -/* { dg-final { scan-tree-dump-times "\\+ 20;" 1 "slsr" } } */
>> +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "slsr" } } */
>>  /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "slsr" } } */
>> -/* { dg-final { scan-tree-dump-times "\\- 16;" 1 "slsr" } } */
>>  /* { dg-final { scan-tree-dump-times "\\- 160" 1 "slsr" } } */
>> -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "optimized" } } */
>> -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "optimized" } } */
>>  /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "optimized" } } */
>>  /* { dg-final { scan-tree-dump-times "\\+ 40" 1 "optimized" } } */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] match.pd: reassociate multiplications with constants
  2017-07-19 11:33             ` Richard Biener
@ 2017-07-19 14:40               ` Alexander Monakov
  2017-07-20  8:41                 ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Monakov @ 2017-07-19 14:40 UTC (permalink / raw)
  To: Richard Biener; +Cc: Marc Glisse, GCC Patches

On Wed, 19 Jul 2017, Richard Biener wrote:
> >> --- a/gcc/match.pd
> >> +++ b/gcc/match.pd
> >> @@ -283,6 +283,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >>          || mul != wi::min_value (TYPE_PRECISION (type), SIGNED))
> >>       { build_zero_cst (type); })))))
> >>
> >> +/* Combine successive multiplications.  Similar to above, but handling
> >> +   overflow is different.  */
> >> +(simplify
> >> + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
> >> + /* More specific rules can handle 0 and -1; skip them here to avoid
> >> +    wrong transformations for sanitized and saturating types.  */
> >> + (if (!integer_zerop (@2) && !integer_minus_onep (@2))
> >
> > I think integer_zerop (@2) should never happen here if you order the
> > pattern after [...]
> > which I think you do.  Why's @1 == -1 ok when sanitizing but not @2 == -1?

Because we ruled out @2 being 0 or -1. If @2 == 1 then @1 == -1 is fine,
otherwise abs(@2) > 1, thus if X * @1 overflows, so does X * (@1 * @2)
(assuming type range is from -2**L to 2**L - 1).

@2 == -1 is not ok because with @1 == -1 it may lead to wrong result for
saturating types if their range can be asymmetrical.  It would also
conceal intermediate overflow for sanitized ops.

> > So unless you can come up with a testcase that breaks I'd remove the
> > integer_zerop/integer_minus_onep checks.

Well, initially I suggested using 'gcc_checking_assert' to ensure that such a
pattern is not triggered under wrong circumstances (such as bisecting match.pd
by #if 0'ing parts of it), but I believe Marc said he would prefer plain ifs.
I agree with him that we shouldn't just implicitly depend on ordering, but
keep the ifs or use asserts.  Do you insist?  Are there other instances already
where GCC relies on ordering within match.pd for correctness?

> So for saturating types isn't the issue when @1 and @2 have opposite
> sign and the inner multiply would have saturated?

No, I think the only special case is @1 == @2 == -1, otherwise either @2 is
0 or 1, or @1 * @2 is larger in magnitude than @1 (unless @1 == 0), and
folding doesn't conceal an intermediate saturation.

> [how do saturated
> types behave with sign-changing multiplication/negation anyway?]

Actually I'm not sure they need to be handled here, afaict only fixed-point
types can be saturating, but then wouldn't constant operands be FIXED_CST?
(but there are already some instances in match.pd that anticipate
TYPE_SATURATING in patterns that match INTEGER_CST)

> > I think overflow in the constant multiplication is ok unless
> > TYPE_OVERFLOW_SANITIZED
> > (and can we have a ubsan testcase for that that would otherwise fail?).
> > It's not that we introduce new overflow here.

This is to allow deducing that X must be zero. Otherwise, exploiting
undefined overflow allows to fold X * CST1 * CST2 to just 0 if CST1 * CST2
overflows (this is what Marc originally suggested), but shouldn't we
rather keep X in the IR to later deduce that X is zero?

Alexander

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] match.pd: reassociate multiplications with constants
  2017-07-19 14:40               ` Alexander Monakov
@ 2017-07-20  8:41                 ` Richard Biener
  2017-07-20  9:40                   ` Alexander Monakov
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2017-07-20  8:41 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: Marc Glisse, GCC Patches

On Wed, Jul 19, 2017 at 4:39 PM, Alexander Monakov <amonakov@ispras.ru> wrote:
> On Wed, 19 Jul 2017, Richard Biener wrote:
>> >> --- a/gcc/match.pd
>> >> +++ b/gcc/match.pd
>> >> @@ -283,6 +283,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>> >>          || mul != wi::min_value (TYPE_PRECISION (type), SIGNED))
>> >>       { build_zero_cst (type); })))))
>> >>
>> >> +/* Combine successive multiplications.  Similar to above, but handling
>> >> +   overflow is different.  */
>> >> +(simplify
>> >> + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
>> >> + /* More specific rules can handle 0 and -1; skip them here to avoid
>> >> +    wrong transformations for sanitized and saturating types.  */
>> >> + (if (!integer_zerop (@2) && !integer_minus_onep (@2))
>> >
>> > I think integer_zerop (@2) should never happen here if you order the
>> > pattern after [...]
>> > which I think you do.  Why's @1 == -1 ok when sanitizing but not @2 == -1?
>
> Because we ruled out @2 being 0 or -1. If @2 == 1 then @1 == -1 is fine,
> otherwise abs(@2) > 1, thus if X * @1 overflows, so does X * (@1 * @2)
> (assuming type range is from -2**L to 2**L - 1).
>
> @2 == -1 is not ok because with @1 == -1 it may lead to wrong result for
> saturating types if their range can be asymmetrical.  It would also
> conceal intermediate overflow for sanitized ops.

I'm not so much worried about false negatives for ubsan -- it's ubsan
implementations
fault that it runs so late after folding.

>> > So unless you can come up with a testcase that breaks I'd remove the
>> > integer_zerop/integer_minus_onep checks.
>
> Well, initially I suggested using 'gcc_checking_assert' to ensure that such a
> pattern is not triggered under wrong circumstances (such as bisecting match.pd
> by #if 0'ing parts of it), but I believe Marc said he would prefer plain ifs.
> I agree with him that we shouldn't just implicitly depend on ordering, but
> keep the ifs or use asserts.  Do you insist?  Are there other instances already
> where GCC relies on ordering within match.pd for correctness?

Not for correctness AFAIK but for optimization.  IIRC I implemented strict
ordering because of a bug.

>> So for saturating types isn't the issue when @1 and @2 have opposite
>> sign and the inner multiply would have saturated?
>
> No, I think the only special case is @1 == @2 == -1, otherwise either @2 is
> 0 or 1, or @1 * @2 is larger in magnitude than @1 (unless @1 == 0), and
> folding doesn't conceal an intermediate saturation.

Ok, but you don't check for @1 == @2 == -1 because you rely on
the subtree being folded (we rely on this for optimization, not sure we should
rely on this for correctness -- only checking @1 == -1 is if course
conservatively
correct).

Note that with folding X * -1 to -X a related pattern would optimize
(mult (negate X) CST)
and (negate (mult X CST)).

>> [how do saturated
>> types behave with sign-changing multiplication/negation anyway?]
>
> Actually I'm not sure they need to be handled here, afaict only fixed-point
> types can be saturating, but then wouldn't constant operands be FIXED_CST?
> (but there are already some instances in match.pd that anticipate
> TYPE_SATURATING in patterns that match INTEGER_CST)

At least

/* In a non-aggregate type, indicates a saturating type.  */
#define TYPE_SATURATING(NODE) \
  (TREE_NOT_CHECK4 (NODE, RECORD_TYPE, UNION_TYPE, QUAL_UNION_TYPE,
ARRAY_TYPE)->base.u.bits.saturating_flag)

suggests that non-fixed point saturating types are possible.  Maybe no such
ones exist though.  fixed-point math has other issues (but I think at least
has symmetric negative/positive ranges).

>> > I think overflow in the constant multiplication is ok unless
>> > TYPE_OVERFLOW_SANITIZED
>> > (and can we have a ubsan testcase for that that would otherwise fail?).
>> > It's not that we introduce new overflow here.
>
> This is to allow deducing that X must be zero. Otherwise, exploiting
> undefined overflow allows to fold X * CST1 * CST2 to just 0 if CST1 * CST2
> overflows (this is what Marc originally suggested), but shouldn't we
> rather keep X in the IR to later deduce that X is zero?

I don't think we should care here.  After all we could immediately fold
X * CST1 * CST2 to zero if CST1 * CST2 overflows and overflow invokes
undefined behavior, right in this reassoc pattern.

So -- I'd prefer to just give up for TYPE_SATURATING (as you say currently
it can't happen anyway because the only saturating types we have are
fixed-point),
thus remove the zero/minus_one checks (not worry about false positive in ubsan).

Richard.

>
> Alexander

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] match.pd: reassociate multiplications with constants
  2017-07-20  8:41                 ` Richard Biener
@ 2017-07-20  9:40                   ` Alexander Monakov
  2017-07-20 10:51                     ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Monakov @ 2017-07-20  9:40 UTC (permalink / raw)
  To: Richard Biener; +Cc: Marc Glisse, GCC Patches

On Thu, 20 Jul 2017, Richard Biener wrote:
> >> So for saturating types isn't the issue when @1 and @2 have opposite
> >> sign and the inner multiply would have saturated?
> >
> > No, I think the only special case is @1 == @2 == -1, otherwise either @2 is
> > 0 or 1, or @1 * @2 is larger in magnitude than @1 (unless @1 == 0), and
> > folding doesn't conceal an intermediate saturation.
> 
> Ok, but you don't check for @1 == @2 == -1 because you rely on
> the subtree being folded

Hm, I don't see why you say that, the patch ensures that @2 != -1 (I agree with
Marc that implicitly relying on other folding to happen is not quite ok)

(for the avoidance of doubt, "the only special case" was meant in the
context of saturating types here)

> fixed-point math has other issues (but I think at least
> has symmetric negative/positive ranges).

I don't think so, n1169.pdf calls out that fixed-point _Fract types may have
asymmetric range such that -1.0 is exactly representable but 1.0 is not.

> >> > I think overflow in the constant multiplication is ok unless
> >> > TYPE_OVERFLOW_SANITIZED
> >> > (and can we have a ubsan testcase for that that would otherwise fail?).
> >> > It's not that we introduce new overflow here.
> >
> > This is to allow deducing that X must be zero. Otherwise, exploiting
> > undefined overflow allows to fold X * CST1 * CST2 to just 0 if CST1 * CST2
> > overflows (this is what Marc originally suggested), but shouldn't we
> > rather keep X in the IR to later deduce that X is zero?
> 
> I don't think we should care here.  After all we could immediately fold
> X * CST1 * CST2 to zero if CST1 * CST2 overflows and overflow invokes
> undefined behavior, right in this reassoc pattern.

Note we cannot simply check if CST1 * CST2 overflows, because in case their
exact product is -INT_MIN (i.e. INT_MAX+1), the original expression doesn't
overflow for X==-1 (except when CST1==INT_MIN && CST2==-1).  This was previously
mentioned by Marc; sorry for neglecting to mention this special case in my
previous email.

So the motivation to not to do the transform on overflow is twofold:

- unclear how to handle CST1*CST2 == -INT_MIN, if at all;
- otherwise, folding X*CST1*CST2 to literal zero would drop X from the
  IR; but if X is an SSA_NAME, we could instead deduce that X is zero and
  replace all dominated uses of X by zero (I don't know how real-world-relevant
  this is)

> So -- I'd prefer to just give up for TYPE_SATURATING (as you say currently
> it can't happen anyway because the only saturating types we have are
> fixed-point),
> thus remove the zero/minus_one checks (not worry about false positive in ubsan).

OK, I think that can be done (I assume you mean 'false negative in ubsan'),
but in light of the above clarification, what change do you want for overflow?

Alexander

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] match.pd: reassociate multiplications with constants
  2017-07-20  9:40                   ` Alexander Monakov
@ 2017-07-20 10:51                     ` Richard Biener
  0 siblings, 0 replies; 16+ messages in thread
From: Richard Biener @ 2017-07-20 10:51 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: Marc Glisse, GCC Patches

On Thu, Jul 20, 2017 at 11:39 AM, Alexander Monakov <amonakov@ispras.ru> wrote:
> On Thu, 20 Jul 2017, Richard Biener wrote:
>> >> So for saturating types isn't the issue when @1 and @2 have opposite
>> >> sign and the inner multiply would have saturated?
>> >
>> > No, I think the only special case is @1 == @2 == -1, otherwise either @2 is
>> > 0 or 1, or @1 * @2 is larger in magnitude than @1 (unless @1 == 0), and
>> > folding doesn't conceal an intermediate saturation.
>>
>> Ok, but you don't check for @1 == @2 == -1 because you rely on
>> the subtree being folded
>
> Hm, I don't see why you say that, the patch ensures that @2 != -1 (I agree with
> Marc that implicitly relying on other folding to happen is not quite ok)

But not @1 == -1, so not both.

> (for the avoidance of doubt, "the only special case" was meant in the
> context of saturating types here)
>
>> fixed-point math has other issues (but I think at least
>> has symmetric negative/positive ranges).
>
> I don't think so, n1169.pdf calls out that fixed-point _Fract types may have
> asymmetric range such that -1.0 is exactly representable but 1.0 is not.

I stand corrected then.

>> >> > I think overflow in the constant multiplication is ok unless
>> >> > TYPE_OVERFLOW_SANITIZED
>> >> > (and can we have a ubsan testcase for that that would otherwise fail?).
>> >> > It's not that we introduce new overflow here.
>> >
>> > This is to allow deducing that X must be zero. Otherwise, exploiting
>> > undefined overflow allows to fold X * CST1 * CST2 to just 0 if CST1 * CST2
>> > overflows (this is what Marc originally suggested), but shouldn't we
>> > rather keep X in the IR to later deduce that X is zero?
>>
>> I don't think we should care here.  After all we could immediately fold
>> X * CST1 * CST2 to zero if CST1 * CST2 overflows and overflow invokes
>> undefined behavior, right in this reassoc pattern.
>
> Note we cannot simply check if CST1 * CST2 overflows, because in case their
> exact product is -INT_MIN (i.e. INT_MAX+1), the original expression doesn't
> overflow for X==-1 (except when CST1==INT_MIN && CST2==-1).  This was previously
> mentioned by Marc; sorry for neglecting to mention this special case in my
> previous email.
>
> So the motivation to not to do the transform on overflow is twofold:
>
> - unclear how to handle CST1*CST2 == -INT_MIN, if at all;
> - otherwise, folding X*CST1*CST2 to literal zero would drop X from the
>   IR; but if X is an SSA_NAME, we could instead deduce that X is zero and
>   replace all dominated uses of X by zero (I don't know how real-world-relevant
>   this is)
>
>> So -- I'd prefer to just give up for TYPE_SATURATING (as you say currently
>> it can't happen anyway because the only saturating types we have are
>> fixed-point),
>> thus remove the zero/minus_one checks (not worry about false positive in ubsan).
>
> OK, I think that can be done (I assume you mean 'false negative in ubsan'),
> but in light of the above clarification, what change do you want for overflow?

Ok, let's leave overflow handling as in your patch.  It looks like fold_binary
associate: case does the same.  It does look like an unnecessary restriction
but we follow existing practice there.

Richard.

> Alexander

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-07-20 10:51 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-13 16:37 [PATCH] match.pd: reassociate multiplications with constants Alexander Monakov
2017-07-13 18:31 ` Marc Glisse
2017-07-13 19:42   ` Alexander Monakov
2017-07-14  6:00     ` Marc Glisse
2017-07-18  8:23       ` Richard Biener
2017-07-15 17:01   ` Alexander Monakov
2017-07-15 17:16   ` Alexander Monakov
2017-07-17 20:00     ` Marc Glisse
2017-07-17 20:50       ` Alexander Monakov
2017-07-18 17:08         ` Alexander Monakov
2017-07-19 11:28           ` Richard Biener
2017-07-19 11:33             ` Richard Biener
2017-07-19 14:40               ` Alexander Monakov
2017-07-20  8:41                 ` Richard Biener
2017-07-20  9:40                   ` Alexander Monakov
2017-07-20 10:51                     ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).