public inbox for gcc-patches@gcc.gnu.org
* [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
@ 2016-02-26  3:02 kugan
  2016-02-26 11:02 ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: kugan @ 2016-02-26  3:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener

[-- Attachment #1: Type: text/plain, Size: 592 bytes --]



Hi,

This is an attempt to fix the missed optimization x + (-y * z * z) =>
x - y * z * z reported in PR40921.
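
For reference, a rough sketch of the GIMPLE this targets (SSA names
are illustrative, not taken from an actual dump):

   before:                  after the proposed transform:
     t1 = -y;                 m1 = y * z;
     t2 = t1 * z;             m2 = m1 * z;
     t3 = t2 * z;             r  = x - m2;
     r  = x + t3;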

Regression tested and bootstrapped on x86-64-linux-gnu with no new 
regressions.

Is this OK for next stage1?

Thanks,
Kugan


gcc/ChangeLog:

2016-02-26  Kugan Vivekanandarajah  <kuganv@linaro.org>

	PR middle-end/40921
	* tree-ssa-reassoc.c (propagate_neg_to_sub_or_add): New.
	(reassociate_bb): Call propagate_neg_to_sub_or_add.


gcc/testsuite/ChangeLog:

2016-02-26  Kugan Vivekanandarajah  <kuganv@linaro.org>

	PR middle-end/40921
	* gcc.dg/tree-ssa/pr40921.c: New test.

[-- Attachment #2: PR40921.txt --]
[-- Type: text/plain, Size: 3246 bytes --]

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c b/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c
index e69de29..6a3529b 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c
@@ -0,0 +1,11 @@
+
+/* PR middle-end/40921.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-reassoc1 -fno-rounding-math" } */
+
+double foo (double x, double y, double z)
+{
+    return x + (-y * z * z);
+}
+
+/* { dg-final { scan-tree-dump-times "= -" 0 "reassoc1" } } */
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index e54700e..f99635b 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -4784,6 +4784,78 @@ transform_stmt_to_multiply (gimple_stmt_iterator *gsi, gimple *stmt,
     }
 }
 
+/* Propagate NEGATE_EXPR to MINUS_EXPR/PLUS_EXPR when the negated
+   expression is multiplied and used in a MINUS_EXPR/PLUS_EXPR.  */
+static void
+propagate_neg_to_sub_or_add (gimple_stmt_iterator *gsi, gimple *stmt)
+{
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs1, rhs2, mult_lhs;
+  gimple *use_stmt;
+  gimple *use_stmt2;
+  use_operand_p use;
+  enum tree_code code;
+  gassign *g;
+
+  /* Note that -frounding-math should disable the proposed
+     optimization.  */
+  if (flag_rounding_math)
+    return;
+
+  if (!single_imm_use (lhs, &use, &use_stmt))
+    return;
+
+  if (!is_gimple_assign (use_stmt))
+    return;
+
+  code = gimple_assign_rhs_code (use_stmt);
+  if (code != MULT_EXPR)
+    return;
+  mult_lhs = gimple_assign_lhs (use_stmt);
+  while (code == MULT_EXPR)
+    {
+      if (!single_imm_use (mult_lhs, &use, &use_stmt2))
+	break;
+      if (!is_gimple_assign (use_stmt2))
+	break;
+      code = gimple_assign_rhs_code (use_stmt2);
+      mult_lhs = gimple_assign_lhs (use_stmt2);
+      use_stmt = use_stmt2;
+    }
+
+  if (code != PLUS_EXPR
+      && code != MINUS_EXPR)
+    return;
+
+  lhs = gimple_assign_lhs (use_stmt);
+  rhs1 = gimple_assign_rhs1 (use_stmt);
+  rhs2 = gimple_assign_rhs2 (use_stmt);
+
+  if (rhs1 == USE_FROM_PTR (use))
+    {
+      if (code == MINUS_EXPR)
+	return;
+      std::swap (rhs1, rhs2);
+      code = MINUS_EXPR;
+    }
+  else
+    {
+      if (code == PLUS_EXPR)
+	code = MINUS_EXPR;
+      else
+	code = PLUS_EXPR;
+    }
+
+  g = gimple_build_assign (lhs, code, rhs1, rhs2);
+  gimple_stmt_iterator gsi2 = gsi_for_stmt (use_stmt);
+  gsi_replace (&gsi2, g, true);
+
+  lhs = gimple_assign_lhs (stmt);
+  rhs1 = gimple_assign_rhs1 (stmt);
+  g = gimple_build_assign (lhs, SSA_NAME, rhs1);
+  gsi_replace (gsi, g, true);
+}
+
 /* Reassociate expressions in basic block BB and its post-dominator as
    children.
 
@@ -4809,6 +4881,11 @@ reassociate_bb (basic_block bb)
 	{
 	  tree lhs, rhs1, rhs2;
 	  enum tree_code rhs_code = gimple_assign_rhs_code (stmt);
+	  if (rhs_code == NEGATE_EXPR)
+	    {
+	      propagate_neg_to_sub_or_add (&gsi, stmt);
+	      continue;
+	    }
 
 	  /* If this is not a gimple binary expression, there is
 	     nothing for us to do with it.  */
@@ -4884,6 +4961,7 @@ reassociate_bb (basic_block bb)
 	      if (rhs_code == MULT_EXPR)
 		attempt_builtin_copysign (&ops);
 
+
 	      if (reassoc_insert_powi_p
 		  && rhs_code == MULT_EXPR
 		  && flag_unsafe_math_optimizations)


* Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
  2016-02-26  3:02 [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z kugan
@ 2016-02-26 11:02 ` Richard Biener
  2016-02-29 10:54   ` kugan
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2016-02-26 11:02 UTC (permalink / raw)
  To: kugan; +Cc: gcc-patches

On Fri, Feb 26, 2016 at 4:02 AM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> Hi,
>
> This is an attempt to fix the missed optimization x + (-y * z * z) =>
> x - y * z * z reported in PR40921.
>
> Regression tested and bootstrapped on x86-64-linux-gnu with no new
> regressions.
>
> Is this OK for next stage1?

Err.  I think the way you implement that in reassoc is ad-hoc and not
related to reassoc at all.

In fact what reassoc is missing is to handle

 -y * z * (-w) * x -> y * z * w * x

thus optimize negates as if they were additional * -1 entries in a
multiplication chain.  And
then optimize a single remaining * -1 in the result chain to a negate.

Then match.pd handles x + (-y) -> x - y (independent of -frounding-math btw).

So no, this isn't ok as-is, IMHO you want to expand the multiplication ops chain
pulling in the * -1 ops (if single-use, of course).
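
For illustration, a sketch of how the linearized ops chain would look
(operand order here is arbitrary):

  /* -y * z * (-w) * x is collected as ops = { y, -1., z, w, -1., x };
     the two -1. entries combine away when the constants in the chain
     are folded, leaving y * z * w * x.  A single leftover -1. would
     instead be turned into a NEGATE_EXPR on the result.  */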

Richard.

> Thanks,
> Kugan
>
>
> gcc/ChangeLog:
>
> 2016-02-26  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         PR middle-end/40921
>         * tree-ssa-reassoc.c (propagate_neg_to_sub_or_add): New.
>         (reassociate_bb): Call propagate_neg_to_sub_or_add.
>
>
> gcc/testsuite/ChangeLog:
>
> 2016-02-26  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         PR middle-end/40921
>         * gcc.dg/tree-ssa/pr40921.c: New test.


* Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
  2016-02-26 11:02 ` Richard Biener
@ 2016-02-29 10:54   ` kugan
  2016-04-19 11:36     ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: kugan @ 2016-02-29 10:54 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 691 bytes --]

>
> Err.  I think the way you implement that in reassoc is ad-hoc and not
> related to reassoc at all.
>
> In fact what reassoc is missing is to handle
>
>   -y * z * (-w) * x -> y * z * w * x
>
> thus optimize negates as if they were additional * -1 entries in a
> multiplication chain.  And
> then optimize a single remaining * -1 in the result chain to a negate.
>
> Then match.pd handles x + (-y) -> x - y (independent of -frounding-math btw).
>
> So no, this isn't ok as-is, IMHO you want to expand the multiplication ops chain
> pulling in the * -1 ops (if single-use, of course).
>

I agree. Here is the updated patch along the lines of what you
suggested. Does this look better?

Thanks,
Kugan

[-- Attachment #2: PR40921.txt --]
[-- Type: text/plain, Size: 1864 bytes --]

diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 17eb64f..bbb5ffb 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -4674,6 +4674,41 @@ attempt_builtin_powi (gimple *stmt, vec<operand_entry *> *ops)
   return result;
 }
 
+/* Factor out NEGATE_EXPR from the multiplication operands.  */
+static void
+factor_out_negate_expr (gimple_stmt_iterator *gsi,
+			gimple *stmt, vec<operand_entry *> *ops)
+{
+  operand_entry *oe;
+  unsigned int i;
+  int neg_count = 0;
+
+  FOR_EACH_VEC_ELT (*ops, i, oe)
+    {
+      if (TREE_CODE (oe->op) != SSA_NAME
+	  || !has_single_use (oe->op))
+	continue;
+      gimple *def_stmt = SSA_NAME_DEF_STMT (oe->op);
+      if (!is_gimple_assign (def_stmt)
+	  || gimple_assign_rhs_code (def_stmt) != NEGATE_EXPR)
+	continue;
+      oe->op = gimple_assign_rhs1 (def_stmt);
+      neg_count ++;
+    }
+
+  if (neg_count % 2)
+    {
+      tree lhs = gimple_assign_lhs (stmt);
+      tree tmp = make_temp_ssa_name (TREE_TYPE (lhs), NULL, "reassocneg");
+      gimple_set_lhs (stmt, tmp);
+      gassign *neg_stmt = gimple_build_assign (lhs, NEGATE_EXPR,
+					       tmp);
+      gimple_set_location (neg_stmt, gimple_location (stmt));
+      gimple_set_uid (neg_stmt, gimple_uid (stmt));
+      gsi_insert_after (gsi, neg_stmt, GSI_SAME_STMT);
+    }
+}
+
 /* Attempt to optimize
    CST1 * copysign (CST2, y) -> copysign (CST1 * CST2, y) if CST1 > 0, or
    CST1 * copysign (CST2, y) -> -copysign (CST1 * CST2, y) if CST1 < 0.  */
@@ -4917,6 +4952,12 @@ reassociate_bb (basic_block bb)
 	      if (rhs_code == MULT_EXPR)
 		attempt_builtin_copysign (&ops);
 
+	      if (rhs_code == MULT_EXPR)
+		{
+		  factor_out_negate_expr (&gsi, stmt, &ops);
+		  ops.qsort (sort_by_operand_rank);
+		}
+
 	      if (reassoc_insert_powi_p
 		  && rhs_code == MULT_EXPR
 		  && flag_unsafe_math_optimizations)


* Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
  2016-04-19 11:36     ` Richard Biener
@ 2016-04-19 11:36       ` Richard Biener
  2016-04-19 12:12         ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2016-04-19 11:36 UTC (permalink / raw)
  To: kugan; +Cc: gcc-patches

On Tue, Apr 19, 2016 at 1:35 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Mon, Feb 29, 2016 at 11:53 AM, kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>> Err.  I think the way you implement that in reassoc is ad-hoc and not
>>> related to reassoc at all.
>>>
>>> In fact what reassoc is missing is to handle
>>>
>>>   -y * z * (-w) * x -> y * z * w * x
>>>
>>> thus optimize negates as if they were additional * -1 entries in a
>>> multiplication chain.  And
>>> then optimize a single remaining * -1 in the result chain to a negate.
>>>
>>> Then match.pd handles x + (-y) -> x - y (independent of -frounding-math
>>> btw).
>>>
>>> So no, this isn't ok as-is, IMHO you want to expand the multiplication ops
>>> chain
>>> pulling in the * -1 ops (if single-use, of course).
>>>
>>
>> I agree. Here is the updated patch along the lines of what you
>> suggested. Does this look better?
>
> It looks better but I think you want to do factor_out_negate_expr before the
> first qsort/optimize_ops_list call to catch -1. * z * (-w) which also means you
> want to simply append a -1. to the ops list rather than adjusting the result
> with a negate stmt.
>
> You also need to guard all this with ! HONOR_SNANS (type) && (!
> HONOR_SIGNED_ZEROS (type)
> || ! COMPLEX_FLOAT_TYPE_P (type)) (see match.pd pattern transforming x
> * -1. to -x).

And please add at least one testcase.

Richard.

> Richard.
>
>> Thanks,
>> Kugan


* Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
  2016-02-29 10:54   ` kugan
@ 2016-04-19 11:36     ` Richard Biener
  2016-04-19 11:36       ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2016-04-19 11:36 UTC (permalink / raw)
  To: kugan; +Cc: gcc-patches

On Mon, Feb 29, 2016 at 11:53 AM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>>
>> Err.  I think the way you implement that in reassoc is ad-hoc and not
>> related to reassoc at all.
>>
>> In fact what reassoc is missing is to handle
>>
>>   -y * z * (-w) * x -> y * z * w * x
>>
>> thus optimize negates as if they were additional * -1 entries in a
>> multiplication chain.  And
>> then optimize a single remaining * -1 in the result chain to a negate.
>>
>> Then match.pd handles x + (-y) -> x - y (independent of -frounding-math
>> btw).
>>
>> So no, this isn't ok as-is, IMHO you want to expand the multiplication ops
>> chain
>> pulling in the * -1 ops (if single-use, of course).
>>
>
> I agree. Here is the updated patch along the lines of what you
> suggested. Does this look better?

It looks better but I think you want to do factor_out_negate_expr before the
first qsort/optimize_ops_list call to catch -1. * z * (-w) which also means you
want to simply append a -1. to the ops list rather than adjusting the result
with a negate stmt.

You also need to guard all this with ! HONOR_SNANS (type) && (!
HONOR_SIGNED_ZEROS (type)
|| ! COMPLEX_FLOAT_TYPE_P (type)) (see match.pd pattern transforming x
* -1. to -x).

Richard.

> Thanks,
> Kugan


* Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
  2016-04-19 11:36       ` Richard Biener
@ 2016-04-19 12:12         ` Richard Biener
  2016-04-21 11:13           ` kugan
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2016-04-19 12:12 UTC (permalink / raw)
  To: kugan; +Cc: gcc-patches

On Tue, Apr 19, 2016 at 1:36 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Apr 19, 2016 at 1:35 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Mon, Feb 29, 2016 at 11:53 AM, kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>
>>>> Err.  I think the way you implement that in reassoc is ad-hoc and not
>>>> related to reassoc at all.
>>>>
>>>> In fact what reassoc is missing is to handle
>>>>
>>>>   -y * z * (-w) * x -> y * z * w * x
>>>>
>>>> thus optimize negates as if they were additional * -1 entries in a
>>>> multiplication chain.  And
>>>> then optimize a single remaining * -1 in the result chain to a negate.
>>>>
>>>> Then match.pd handles x + (-y) -> x - y (independent of -frounding-math
>>>> btw).
>>>>
>>>> So no, this isn't ok as-is, IMHO you want to expand the multiplication ops
>>>> chain
>>>> pulling in the * -1 ops (if single-use, of course).
>>>>
>>>
>>> I agree. Here is the updated patch along the lines of what you
>>> suggested. Does this look better?
>>
>> It looks better but I think you want to do factor_out_negate_expr before the
>> first qsort/optimize_ops_list call to catch -1. * z * (-w) which also means you
>> want to simply append a -1. to the ops list rather than adjusting the result
>> with a negate stmt.
>>
>> You also need to guard all this with ! HONOR_SNANS (type) && (!
>> HONOR_SIGNED_ZEROS (type)
>> || ! COMPLEX_FLOAT_TYPE_P (type)) (see match.pd pattern transforming x
>> * -1. to -x).
>
> And please add at least one testcase.

And it appears to me that you could handle this in linearize_expr_tree
as well, similar
to how we handle MULT_EXPR with acceptable_pow_call there by adding -1. and
op into the ops vec.

Similar for the x + x + x -> 3 * x case we'd want to add a repeat op when seeing
x + 3 * x + x and use ->count in that patch as well.

Best split out the

  if (rhscode == MULT_EXPR
      && TREE_CODE (binrhs) == SSA_NAME
      && acceptable_pow_call (SSA_NAME_DEF_STMT (binrhs), &base, &exponent))
    {
      add_repeat_to_ops_vec (ops, base, exponent);
      gimple_set_visited (SSA_NAME_DEF_STMT (binrhs), true);
    }
  else
    add_to_ops_vec (ops, binrhs);

pattern into a helper that handles the other cases.

Richard.

> Richard.
>
>> Richard.
>>
>>> Thanks,
>>> Kugan


* Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
  2016-04-19 12:12         ` Richard Biener
@ 2016-04-21 11:13           ` kugan
  2016-04-22  8:09             ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: kugan @ 2016-04-21 11:13 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Hi Richard,

On 19/04/16 22:11, Richard Biener wrote:
> On Tue, Apr 19, 2016 at 1:36 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Tue, Apr 19, 2016 at 1:35 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Mon, Feb 29, 2016 at 11:53 AM, kugan
>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>
>>>>> Err.  I think the way you implement that in reassoc is ad-hoc and not
>>>>> related to reassoc at all.
>>>>>
>>>>> In fact what reassoc is missing is to handle
>>>>>
>>>>>    -y * z * (-w) * x -> y * z * w * x
>>>>>
>>>>> thus optimize negates as if they were additional * -1 entries in a
>>>>> multiplication chain.  And
>>>>> then optimize a single remaining * -1 in the result chain to a negate.
>>>>>
>>>>> Then match.pd handles x + (-y) -> x - y (independent of -frounding-math
>>>>> btw).
>>>>>
>>>>> So no, this isn't ok as-is, IMHO you want to expand the multiplication ops
>>>>> chain
>>>>> pulling in the * -1 ops (if single-use, of course).
>>>>>
>>>>
>>>> I agree. Here is the updated patch along the lines of what you
>>>> suggested. Does this look better?
>>>
>>> It looks better but I think you want to do factor_out_negate_expr before the
>>> first qsort/optimize_ops_list call to catch -1. * z * (-w) which also means you
>>> want to simply append a -1. to the ops list rather than adjusting the result
>>> with a negate stmt.
>>>
>>> You also need to guard all this with ! HONOR_SNANS (type) && (!
>>> HONOR_SIGNED_ZEROS (type)
>>> || ! COMPLEX_FLOAT_TYPE_P (type)) (see match.pd pattern transforming x
>>> * -1. to -x).
>>
>> And please add at least one testcase.
>
> And it appears to me that you could handle this in linearize_expr_tree
> as well, similar
> to how we handle MULT_EXPR with acceptable_pow_call there by adding -1. and
> op into the ops vec.
>


I am not sure I understand this. I tried doing it: if I add -1 and
rhs1 for the NEGATE_EXPR to the ops list, then when it comes to
rewrite_expr_tree the constant will be sorted early, which makes it
hard to generate:
  x + (-y * z * z) => x - y * z * z

Do you want to swap the constant in the MULT_EXPR chain (if present),
as in swap_ops_for_binary_stmt, and then create a NEGATE_EXPR?


Thanks,
Kugan

> Similar for the x + x + x -> 3 * x case we'd want to add a repeat op when seeing
> x + 3 * x + x and use ->count in that patch as well.
>
> Best split out the
>
>    if (rhscode == MULT_EXPR
>        && TREE_CODE (binrhs) == SSA_NAME
>        && acceptable_pow_call (SSA_NAME_DEF_STMT (binrhs), &base, &exponent))
>      {
>        add_repeat_to_ops_vec (ops, base, exponent);
>        gimple_set_visited (SSA_NAME_DEF_STMT (binrhs), true);
>      }
>    else
>      add_to_ops_vec (ops, binrhs);
>
> pattern into a helper that handles the other cases.
>
> Richard.
>
>> Richard.
>>
>>> Richard.
>>>
>>>> Thanks,
>>>> Kugan


* Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
  2016-04-21 11:13           ` kugan
@ 2016-04-22  8:09             ` Richard Biener
  2016-04-23 13:10               ` kugan
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2016-04-22  8:09 UTC (permalink / raw)
  To: kugan; +Cc: gcc-patches

On Thu, Apr 21, 2016 at 1:12 PM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> Hi Richard,
>
>
> On 19/04/16 22:11, Richard Biener wrote:
>>
>> On Tue, Apr 19, 2016 at 1:36 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>>
>>> On Tue, Apr 19, 2016 at 1:35 PM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>>
>>>> On Mon, Feb 29, 2016 at 11:53 AM, kugan
>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>>
>>>>>>
>>>>>> Err.  I think the way you implement that in reassoc is ad-hoc and not
>>>>>> related to reassoc at all.
>>>>>>
>>>>>> In fact what reassoc is missing is to handle
>>>>>>
>>>>>>    -y * z * (-w) * x -> y * z * w * x
>>>>>>
>>>>>> thus optimize negates as if they were additional * -1 entries in a
>>>>>> multiplication chain.  And
>>>>>> then optimize a single remaining * -1 in the result chain to a negate.
>>>>>>
>>>>>> Then match.pd handles x + (-y) -> x - y (independent of
>>>>>> -frounding-math
>>>>>> btw).
>>>>>>
>>>>>> So no, this isn't ok as-is, IMHO you want to expand the multiplication
>>>>>> ops
>>>>>> chain
>>>>>> pulling in the * -1 ops (if single-use, of course).
>>>>>>
>>>>>
>>>>> I agree. Here is the updated patch along the lines of what you
>>>>> suggested. Does this look better?
>>>>
>>>>
>>>> It looks better but I think you want to do factor_out_negate_expr before
>>>> the
>>>> first qsort/optimize_ops_list call to catch -1. * z * (-w) which also
>>>> means you
>>>> want to simply append a -1. to the ops list rather than adjusting the
>>>> result
>>>> with a negate stmt.
>>>>
>>>> You also need to guard all this with ! HONOR_SNANS (type) && (!
>>>> HONOR_SIGNED_ZEROS (type)
>>>> || ! COMPLEX_FLOAT_TYPE_P (type)) (see match.pd pattern transforming x
>>>> * -1. to -x).
>>>
>>>
>>> And please add at least one testcase.
>>
>>
>> And it appears to me that you could handle this in linearize_expr_tree
>> as well, similar
>> to how we handle MULT_EXPR with acceptable_pow_call there by adding -1.
>> and
>> op into the ops vec.
>>
>
>
> I am not sure I understand this. I tried doing it: if I add -1 and
> rhs1 for the NEGATE_EXPR to the ops list, then when it comes to
> rewrite_expr_tree the constant will be sorted early, which makes it
> hard to generate:
>  x + (-y * z * z) => x - y * z * z
>
> Do you want to swap the constant in the MULT_EXPR chain (if present),
> as in swap_ops_for_binary_stmt, and then create a NEGATE_EXPR?

In addition to linearize_expr handling you need to handle a -1 in the MULT_EXPR
chain specially at rewrite_expr_tree time by emitting a NEGATE_EXPR instead
of a MULT_EXPR (which also means you should sort the -1 "last").
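
A sketch of the intended rewrite (statement names made up):

  /* For x + (-y * z * z) the MULT_EXPR chain is { y, z, z, -1. }.
     With the -1. sorted last, rewrite_expr_tree would emit
       t1 = y * z;
       t2 = t1 * z;
       t3 = -t2;     <- NEGATE_EXPR instead of t2 * -1.
     and match.pd then folds x + t3 into x - t2.  */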

Richard.

>
> Thanks,
> Kugan
>
>
>> Similar for the x + x + x -> 3 * x case we'd want to add a repeat op when
>> seeing
>> x + 3 * x + x and use ->count in that patch as well.
>>
>> Best split out the
>>
>>    if (rhscode == MULT_EXPR
>>        && TREE_CODE (binrhs) == SSA_NAME
>>        && acceptable_pow_call (SSA_NAME_DEF_STMT (binrhs), &base,
>> &exponent))
>>      {
>>        add_repeat_to_ops_vec (ops, base, exponent);
>>        gimple_set_visited (SSA_NAME_DEF_STMT (binrhs), true);
>>      }
>>    else
>>      add_to_ops_vec (ops, binrhs);
>>
>> pattern into a helper that handles the other cases.
>>
>> Richard.
>>
>>> Richard.
>>>
>>>> Richard.
>>>>
>>>>> Thanks,
>>>>> Kugan


* Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
  2016-04-22  8:09             ` Richard Biener
@ 2016-04-23 13:10               ` kugan
  2016-04-27 13:41                 ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: kugan @ 2016-04-23 13:10 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1186 bytes --]


>> I am not sure I understand this. I tried doing it: if I add -1 and
>> rhs1 for the NEGATE_EXPR to the ops list, then when it comes to
>> rewrite_expr_tree the constant will be sorted early, which makes it
>> hard to generate:
>>   x + (-y * z * z) => x - y * z * z
>>
>> Do you want to swap the constant in the MULT_EXPR chain (if present),
>> as in swap_ops_for_binary_stmt, and then create a NEGATE_EXPR?
>
> In addition to linearize_expr handling you need to handle a -1 in the MULT_EXPR
> chain specially at rewrite_expr_tree time by emitting a NEGATE_EXPR instead
> of a MULT_EXPR (which also means you should sort the -1 "last").

Hi Richard,

Thanks. Here is an attempt which does this.

Regression tested and bootstrapped on x86-64-linux-gnu with no new 
regressions.

Is this OK for trunk?

Thanks,
Kugan

2016-04-23  Kugan Vivekanandarajah  <kuganv@linaro.org>

	PR middle-end/40921
	* gcc.dg/tree-ssa/pr40921.c: New test.

gcc/ChangeLog:

2016-04-23  Kugan Vivekanandarajah  <kuganv@linaro.org>

	PR middle-end/40921
	* tree-ssa-reassoc.c (try_special_add_to_ops): New.
	(linearize_expr_tree): Call try_special_add_to_ops.
	(reassociate_bb): Convert MULT_EXPR by (-1) to NEGATE_EXPR.


[-- Attachment #2: p.txt --]
[-- Type: text/plain, Size: 5263 bytes --]

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c b/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c
index e69de29..f587a8f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c
@@ -0,0 +1,20 @@
+
+/* { dg-do compile } */
+/* { dg-options "-O2  -fdump-tree-optimized -ffast-math" } */
+
+unsigned int foo (unsigned int x, unsigned int y, unsigned int z)
+{
+      return x + (-y * z * z);
+}
+
+float bar (float x, float y, float z)
+{
+      return x + (-y * z * z);
+}
+
+float bar2 (float x, float y, float z)
+{
+      return x + (-y * z * z * 5.0);
+}
+
+/* { dg-final { scan-tree-dump-times "_* = -y_" 0 "optimized" } } */
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index d23dabd..1b38207 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -4252,6 +4252,45 @@ acceptable_pow_call (gimple *stmt, tree *base, HOST_WIDE_INT *exponent)
   return true;
 }
 
+/* Try to derive and add operand entry for OP to *OPS.  Return false if
+   unsuccessful.  */
+
+static bool
+try_special_add_to_ops (vec<operand_entry *> *ops,
+			enum tree_code code,
+			tree op, gimple* def_stmt)
+{
+  tree base = NULL_TREE;
+  HOST_WIDE_INT exponent = 0;
+
+  if (TREE_CODE (op) != SSA_NAME)
+    return false;
+
+  if (code == MULT_EXPR
+      && acceptable_pow_call (def_stmt, &base, &exponent))
+    {
+      add_repeat_to_ops_vec (ops, base, exponent);
+      gimple_set_visited (def_stmt, true);
+      return true;
+    }
+  else if (code == MULT_EXPR
+	   && is_gimple_assign (def_stmt)
+	   && gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR
+	   && !HONOR_SNANS (TREE_TYPE (op))
+	   && (!HONOR_SIGNED_ZEROS (TREE_TYPE (op))
+	       || !COMPLEX_FLOAT_TYPE_P (TREE_TYPE (op))))
+    {
+      tree rhs1 = gimple_assign_rhs1 (def_stmt);
+      tree cst = build_minus_one_cst (TREE_TYPE (op));
+      add_to_ops_vec (ops, rhs1);
+      add_to_ops_vec (ops, cst);
+      gimple_set_visited (def_stmt, true);
+      return true;
+    }
+
+  return false;
+}
+
 /* Recursively linearize a binary expression that is the RHS of STMT.
    Place the operands of the expression tree in the vector named OPS.  */
 
@@ -4266,8 +4305,6 @@ linearize_expr_tree (vec<operand_entry *> *ops, gimple *stmt,
   bool binrhsisreassoc = false;
   enum tree_code rhscode = gimple_assign_rhs_code (stmt);
   struct loop *loop = loop_containing_stmt (stmt);
-  tree base = NULL_TREE;
-  HOST_WIDE_INT exponent = 0;
 
   if (set_visited)
     gimple_set_visited (stmt, true);
@@ -4303,26 +4340,11 @@ linearize_expr_tree (vec<operand_entry *> *ops, gimple *stmt,
 
       if (!binrhsisreassoc)
 	{
-	  if (rhscode == MULT_EXPR
-	      && TREE_CODE (binrhs) == SSA_NAME
-	      && acceptable_pow_call (binrhsdef, &base, &exponent))
-	    {
-	      add_repeat_to_ops_vec (ops, base, exponent);
-	      gimple_set_visited (binrhsdef, true);
-	    }
-	  else
+	  if (!try_special_add_to_ops (ops, rhscode, binrhs, binrhsdef))
 	    add_to_ops_vec (ops, binrhs);
 
-	  if (rhscode == MULT_EXPR
-	      && TREE_CODE (binlhs) == SSA_NAME
-	      && acceptable_pow_call (binlhsdef, &base, &exponent))
-	    {
-	      add_repeat_to_ops_vec (ops, base, exponent);
-	      gimple_set_visited (binlhsdef, true);
-	    }
-	  else
+	  if (!try_special_add_to_ops (ops, rhscode, binlhs, binlhsdef))
 	    add_to_ops_vec (ops, binlhs);
-
 	  return;
 	}
 
@@ -4360,14 +4382,7 @@ linearize_expr_tree (vec<operand_entry *> *ops, gimple *stmt,
   linearize_expr_tree (ops, SSA_NAME_DEF_STMT (binlhs),
 		       is_associative, set_visited);
 
-  if (rhscode == MULT_EXPR
-      && TREE_CODE (binrhs) == SSA_NAME
-      && acceptable_pow_call (SSA_NAME_DEF_STMT (binrhs), &base, &exponent))
-    {
-      add_repeat_to_ops_vec (ops, base, exponent);
-      gimple_set_visited (SSA_NAME_DEF_STMT (binrhs), true);
-    }
-  else
+  if (!try_special_add_to_ops (ops, rhscode, binrhs, binrhsdef))
     add_to_ops_vec (ops, binrhs);
 }
 
@@ -5127,6 +5142,19 @@ reassociate_bb (basic_block bb)
 		    powi_result = attempt_builtin_powi (stmt, &ops);
 		}
 
+	      int last = ops.length () - 1;
+	      bool negate_result = false;
+	      if (rhs_code == MULT_EXPR
+		  && ops.length () > 1
+		  && ((TREE_CODE (ops[last]->op) == INTEGER_CST
+		       && integer_minus_onep (ops[last]->op))
+		      || ((TREE_CODE (ops[last]->op) == REAL_CST)
+			  && real_equal (&TREE_REAL_CST (ops[last]->op), &dconstm1))))
+		{
+		  ops.unordered_remove (last);
+		  negate_result = true;
+		}
+
 	      /* If the operand vector is now empty, all operands were 
 		 consumed by the __builtin_powi optimization.  */
 	      if (ops.length () == 0)
@@ -5169,6 +5197,17 @@ reassociate_bb (basic_block bb)
 						   powi_result != NULL);
                     }
 
+		  if (negate_result)
+		    {
+		      tree tmp = make_ssa_name (TREE_TYPE (lhs));
+		      gimple_set_lhs (stmt, tmp);
+		      gassign *neg_stmt = gimple_build_assign (lhs, NEGATE_EXPR,
+							       tmp);
+		      gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+		      gsi_insert_after (&gsi, neg_stmt, GSI_NEW_STMT);
+		      update_stmt (stmt);
+		    }
+
 		  /* If we combined some repeated factors into a 
 		     __builtin_powi call, multiply that result by the
 		     reassociated operands.  */


* Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
  2016-04-23 13:10               ` kugan
@ 2016-04-27 13:41                 ` Richard Biener
  2016-05-05  0:41                   ` kugan
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2016-04-27 13:41 UTC (permalink / raw)
  To: kugan; +Cc: gcc-patches

On Sat, Apr 23, 2016 at 3:10 PM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>>> I am not sure I understand this. I tried doing it: if I add -1 and
>>> rhs1 for the NEGATE_EXPR to the ops list, then when it comes to
>>> rewrite_expr_tree the constant will be sorted early, which makes it
>>> hard to generate:
>>>   x + (-y * z * z) => x - y * z * z
>>>
>>> Do you want to swap the constant in the MULT_EXPR chain (if present),
>>> as in swap_ops_for_binary_stmt, and then create a NEGATE_EXPR?
>>
>>
>> In addition to linearize_expr handling you need to handle a -1 in the
>> MULT_EXPR
>> chain specially at rewrite_expr_tree time by emitting a NEGATE_EXPR
>> instead
>> of a MULT_EXPR (which also means you should sort the -1 "last").
>
>
> Hi Richard,
>
> Thanks. Here is an attempt which does this.
>
> Regression tested and bootstrapped on x86-64-linux-gnu with no new
> regressions.
>
> Is this OK for trunk?

+             int last = ops.length () - 1;
+             bool negate_result = false;

Do

  oe &last = ops.last ();


+             if (rhs_code == MULT_EXPR
+                 && ops.length () > 1
+                 && ((TREE_CODE (ops[last]->op) == INTEGER_CST

and last.op here and below

+                      && integer_minus_onep (ops[last]->op))
+                     || ((TREE_CODE (ops[last]->op) == REAL_CST)
+                         && real_equal (&TREE_REAL_CST
(ops[last]->op), &dconstm1))))

Here the checks !HONOR_SNANS () && (!HONOR_SIGNED_ZEROS ||
!COMPLEX_FLOAT_TYPE_P) are missing.  The * -1 might appear literally
and you are only allowed to turn it into a negate under the above
conditions.
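
(A concrete case, assuming IEEE semantics: for a signaling NaN operand
x, x * -1. raises the invalid exception and quiets the NaN, whereas a
plain negate only flips the sign bit, so the two are not
interchangeable when SNaNs are honored.)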

+               {
+                 ops.unordered_remove (last);

use ops.pop ();

+                 negate_result = true;

Please move the whole thing under the else { } case of the ops.length
== 0, ops.length == 1 test chain
as you did for the actual emit of the negate.


+                 if (negate_result)
+                   {
+                     tree tmp = make_ssa_name (TREE_TYPE (lhs));
+                     gimple_set_lhs (stmt, tmp);
+                     gassign *neg_stmt = gimple_build_assign (lhs, NEGATE_EXPR,
+                                                              tmp);
+                     gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+                     gsi_insert_after (&gsi, neg_stmt, GSI_NEW_STMT);
+                     update_stmt (stmt);

I think that if powi_result is also built you end up using the wrong
stmt so you miss a

                        stmt = SSA_NAME_DEF_STMT (lhs);

here.  Also see the new_lhs handling of the powi_result case - again
you need something similar here (its handling looks a bit fishy as
well - this all needs some comments and possibly (a lot more)
testcases).
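
A sketch of the hazard (names made up): if the powi path fired, the IL
looks like

  powi1 = __builtin_powi (z, 2);
  lhs   = t1 * powi1;   <- this is now the defining statement of lhs

so attaching the negate to the original stmt rather than to
SSA_NAME_DEF_STMT (lhs) would rewrite the wrong statement.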

So, please do the above requested changes and verify the 'lhs' issues
I pointed out by trying to add a few more testcases that also cover
the case where a powi is detected in addition to a negation.  Please
also add a testcase that catches (-y) * x * (-z).

Otherwise this now looks good.

Thanks,
Richard.


> Thanks,
> Kugan
>
> 2016-04-23  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         PR middle-end/40921
>         * gcc.dg/tree-ssa/pr40921.c: New test.
>
> gcc/ChangeLog:
>
> 2016-04-23  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         PR middle-end/40921
>         * tree-ssa-reassoc.c (try_special_add_to_ops): New.
>         (linearize_expr_tree): Call try_special_add_to_ops.
>         (reassociate_bb): Convert MULT_EXPR by (-1) to NEGATE_EXPR.
>


* Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
  2016-04-27 13:41                 ` Richard Biener
@ 2016-05-05  0:41                   ` kugan
  2016-05-09 11:10                     ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: kugan @ 2016-05-05  0:41 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3064 bytes --]

Hi Richard,

>
> +             int last = ops.length () - 1;
> +             bool negate_result = false;
>
> Do
>
>    oe &last = ops.last ();
>

Done.

>
> +             if (rhs_code == MULT_EXPR
> +                 && ops.length () > 1
> +                 && ((TREE_CODE (ops[last]->op) == INTEGER_CST
>
> and last.op here and below
>
> +                      && integer_minus_onep (ops[last]->op))
> +                     || ((TREE_CODE (ops[last]->op) == REAL_CST)
> +                         && real_equal (&TREE_REAL_CST
> (ops[last]->op), &dconstm1))))
>

Done.

> Here the checks !HONOR_SNANS () && (!HONOR_SIGNED_ZEROS ||
> !COMPLEX_FLOAT_TYPE_P) are missing.  The * -1 might appear literally
> and you are only allowed to turn it into a negate under the above
> conditions.

Done.

>
> +               {
> +                 ops.unordered_remove (last);
>
> use ops.pop ();
>
Done.

> +                 negate_result = true;
>
> Please move the whole thing under the else { } case of the ops.length
> == 0, ops.length == 1 test chain
> as you did for the actual emit of the negate.
>

I see your point. However, when we remove the (-1) from the ops list,
that can in turn result in ops.length becoming 1. Therefore, I moved
the following if (negate_result) outside the condition.
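
(For example, with a literal  t = z * -1.;  the ops list is just
{ z, -1. }; popping the -1. leaves a single operand, so the negate has
to be emitted outside the length-dependent rewrite paths.)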


>
> +                 if (negate_result)
> +                   {
> +                     tree tmp = make_ssa_name (TREE_TYPE (lhs));
> +                     gimple_set_lhs (stmt, tmp);
> +                     gassign *neg_stmt = gimple_build_assign (lhs, NEGATE_EXPR,
> +                                                              tmp);
> +                     gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
> +                     gsi_insert_after (&gsi, neg_stmt, GSI_NEW_STMT);
> +                     update_stmt (stmt);
>
> I think that if powi_result is also built you end up using the wrong
> stmt so you miss a
>
>                          stmt = SSA_NAME_DEF_STMT (lhs);

Yes, indeed. This can happen and I have added this.

>
> here.  Also see the new_lhs handling of the powi_result case - again
> you need something similar here (its handling looks a bit fishy as
> well - this all needs some comments and possibly (a lot more)
> testcases).
>
> So, please do the above requested changes and verify the 'lhs' issues
> I pointed out by trying to add a few more testcases that also cover
> the case where a powi is detected in addition to a negation.  Please
> also add a testcase that catches (-y) * x * (-z).
>

Added this to the testcase.

Does this look better now?

Thanks,
Kugan


>> 2016-04-23  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>
>>          PR middle-end/40921
>>          * gcc.dg/tree-ssa/pr40921.c: New test.
>>
>> gcc/ChangeLog:
>>
>> 2016-04-23  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>
>>          PR middle-end/40921
>>          * tree-ssa-reassoc.c (try_special_add_to_ops): New.
>>          (linearize_expr_tree): Call try_special_add_to_ops.
>>          (reassociate_bb): Convert MULT_EXPR by (-1) to NEGATE_EXPR.
>>

[-- Attachment #2: p.txt --]
[-- Type: text/plain, Size: 5379 bytes --]

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c b/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c
index e69de29..3a5a23a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c
@@ -0,0 +1,26 @@
+
+/* { dg-do compile } */
+/* { dg-options "-O2  -fdump-tree-optimized -ffast-math" } */
+
+unsigned int foo (unsigned int x, unsigned int y, unsigned int z)
+{
+      return x + (-y * z * z);
+}
+
+float bar (float x, float y, float z)
+{
+      return x + (-y * z * z);
+}
+
+float bar2 (float x, float y, float z)
+{
+      return x + (-y * z * z * 5.0f);
+}
+
+float bar3 (float x, float y, float z)
+{
+      return x + (-y * x * -z);
+}
+
+
+/* { dg-final { scan-tree-dump-times "_* = -y_" 0 "optimized" } } */
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 4e1251b..1df6681 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -4112,6 +4112,45 @@ acceptable_pow_call (gimple *stmt, tree *base, HOST_WIDE_INT *exponent)
   return true;
 }
 
+/* Try to derive and add operand entry for OP to *OPS.  Return false if
+   unsuccessful.  */
+
+static bool
+try_special_add_to_ops (vec<operand_entry *> *ops,
+			enum tree_code code,
+			tree op, gimple* def_stmt)
+{
+  tree base = NULL_TREE;
+  HOST_WIDE_INT exponent = 0;
+
+  if (TREE_CODE (op) != SSA_NAME)
+    return false;
+
+  if (code == MULT_EXPR
+      && acceptable_pow_call (def_stmt, &base, &exponent))
+    {
+      add_repeat_to_ops_vec (ops, base, exponent);
+      gimple_set_visited (def_stmt, true);
+      return true;
+    }
+  else if (code == MULT_EXPR
+	   && is_gimple_assign (def_stmt)
+	   && gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR
+	   && !HONOR_SNANS (TREE_TYPE (op))
+	   && (!HONOR_SIGNED_ZEROS (TREE_TYPE (op))
+	       || !COMPLEX_FLOAT_TYPE_P (TREE_TYPE (op))))
+    {
+      tree rhs1 = gimple_assign_rhs1 (def_stmt);
+      tree cst = build_minus_one_cst (TREE_TYPE (op));
+      add_to_ops_vec (ops, rhs1);
+      add_to_ops_vec (ops, cst);
+      gimple_set_visited (def_stmt, true);
+      return true;
+    }
+
+  return false;
+}
+
 /* Recursively linearize a binary expression that is the RHS of STMT.
    Place the operands of the expression tree in the vector named OPS.  */
 
@@ -4126,8 +4165,6 @@ linearize_expr_tree (vec<operand_entry *> *ops, gimple *stmt,
   bool binrhsisreassoc = false;
   enum tree_code rhscode = gimple_assign_rhs_code (stmt);
   struct loop *loop = loop_containing_stmt (stmt);
-  tree base = NULL_TREE;
-  HOST_WIDE_INT exponent = 0;
 
   if (set_visited)
     gimple_set_visited (stmt, true);
@@ -4163,24 +4200,10 @@ linearize_expr_tree (vec<operand_entry *> *ops, gimple *stmt,
 
       if (!binrhsisreassoc)
 	{
-	  if (rhscode == MULT_EXPR
-	      && TREE_CODE (binrhs) == SSA_NAME
-	      && acceptable_pow_call (binrhsdef, &base, &exponent))
-	    {
-	      add_repeat_to_ops_vec (ops, base, exponent);
-	      gimple_set_visited (binrhsdef, true);
-	    }
-	  else
+	  if (!try_special_add_to_ops (ops, rhscode, binrhs, binrhsdef))
 	    add_to_ops_vec (ops, binrhs);
 
-	  if (rhscode == MULT_EXPR
-	      && TREE_CODE (binlhs) == SSA_NAME
-	      && acceptable_pow_call (binlhsdef, &base, &exponent))
-	    {
-	      add_repeat_to_ops_vec (ops, base, exponent);
-	      gimple_set_visited (binlhsdef, true);
-	    }
-	  else
+	  if (!try_special_add_to_ops (ops, rhscode, binlhs, binlhsdef))
 	    add_to_ops_vec (ops, binlhs);
 
 	  return;
@@ -4220,14 +4243,7 @@ linearize_expr_tree (vec<operand_entry *> *ops, gimple *stmt,
   linearize_expr_tree (ops, SSA_NAME_DEF_STMT (binlhs),
 		       is_associative, set_visited);
 
-  if (rhscode == MULT_EXPR
-      && TREE_CODE (binrhs) == SSA_NAME
-      && acceptable_pow_call (SSA_NAME_DEF_STMT (binrhs), &base, &exponent))
-    {
-      add_repeat_to_ops_vec (ops, base, exponent);
-      gimple_set_visited (SSA_NAME_DEF_STMT (binrhs), true);
-    }
-  else
+  if (!try_special_add_to_ops (ops, rhscode, binrhs, binrhsdef))
     add_to_ops_vec (ops, binrhs);
 }
 
@@ -4980,6 +4996,22 @@ reassociate_bb (basic_block bb)
 		  && flag_unsafe_math_optimizations)
 		powi_result = attempt_builtin_powi (stmt, &ops);
 
+	      operand_entry *last = ops.last ();
+	      bool negate_result = false;
+	      if (rhs_code == MULT_EXPR
+		  && ops.length () > 1
+		  && ((TREE_CODE (last->op) == INTEGER_CST
+		       && integer_minus_onep (last->op))
+		      || ((TREE_CODE (last->op) == REAL_CST)
+			  && real_equal (&TREE_REAL_CST (last->op), &dconstm1)))
+		  && !HONOR_SNANS (TREE_TYPE (lhs))
+		  && (!HONOR_SIGNED_ZEROS (TREE_TYPE (lhs))
+		      || !COMPLEX_FLOAT_TYPE_P (TREE_TYPE (lhs))))
+		{
+		  ops.pop ();
+		  negate_result = true;
+		}
+
 	      /* If the operand vector is now empty, all operands were 
 		 consumed by the __builtin_powi optimization.  */
 	      if (ops.length () == 0)
@@ -5042,6 +5074,18 @@ reassociate_bb (basic_block bb)
 		      gsi_insert_after (&gsi, mul_stmt, GSI_NEW_STMT);
 		    }
 		}
+
+	      if (negate_result)
+		{
+		  stmt = SSA_NAME_DEF_STMT (lhs);
+		  tree tmp = make_ssa_name (TREE_TYPE (lhs));
+		  gimple_set_lhs (stmt, tmp);
+		  gassign *neg_stmt = gimple_build_assign (lhs, NEGATE_EXPR,
+							   tmp);
+		  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+		  gsi_insert_after (&gsi, neg_stmt, GSI_NEW_STMT);
+		  update_stmt (stmt);
+		}
 	    }
 	}
     }


* Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
  2016-05-05  0:41                   ` kugan
@ 2016-05-09 11:10                     ` Richard Biener
  2016-05-18  8:38                       ` Kugan Vivekanandarajah
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2016-05-09 11:10 UTC (permalink / raw)
  To: kugan; +Cc: gcc-patches

On Thu, May 5, 2016 at 2:41 AM, kugan <kugan.vivekanandarajah@linaro.org> wrote:
> Hi Richard,
>
>>
>> +             int last = ops.length () - 1;
>> +             bool negate_result = false;
>>
>> Do
>>
>>    oe &last = ops.last ();
>>
>
> Done.
>
>>
>> +             if (rhs_code == MULT_EXPR
>> +                 && ops.length () > 1
>> +                 && ((TREE_CODE (ops[last]->op) == INTEGER_CST
>>
>> and last.op here and below
>>
>> +                      && integer_minus_onep (ops[last]->op))
>> +                     || ((TREE_CODE (ops[last]->op) == REAL_CST)
>> +                         && real_equal (&TREE_REAL_CST
>> (ops[last]->op), &dconstm1))))
>>
>
> Done.
>
>> Here the checks !HONOR_SNANS () && (!HONOR_SIGNED_ZEROS ||
>> !COMPLEX_FLOAT_TYPE_P) are missing.  The * -1 might appear literally
>> and you are only allowed to turn it into a negate under the above
>> conditions.
>
>
> Done.
>
>>
>> +               {
>> +                 ops.unordered_remove (last);
>>
>> use ops.pop ();
>>
> Done.
>
>> +                 negate_result = true;
>>
>> Please move the whole thing under the else { } case of the ops.length
>> == 0, ops.length == 1 test chain
>> as you did for the actual emit of the negate.
>>
>
> I see your point. However, when we remove the (-1) from the ops list,
> that can in turn result in ops.length becoming 1. Therefore, I moved
> the following if (negate_result) outside the condition.

Ah, indeed.  But now you have to care for ops.length () == 0, and thus
the unconditional ops.last () may now trap.  So I suggest to do

+             operand_entry *last;
+             bool negate_result = false;
+             if (rhs_code == MULT_EXPR
+                 && ops.length () > 1
                   && (last = ops.last ())
+                 && ((TREE_CODE (last->op) == INTEGER_CST

to avoid this.
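
(The assignment inside the condition is only a sketch of one way to do
it; the point is that ops.last () must not be evaluated before the
ops.length () > 1 check, since last () on an empty vector is out of
bounds.  An explicit nested if achieves the same thing.)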

>
>>
>> +                 if (negate_result)
>> +                   {
>> +                     tree tmp = make_ssa_name (TREE_TYPE (lhs));
>> +                     gimple_set_lhs (stmt, tmp);
>> +                     gassign *neg_stmt = gimple_build_assign (lhs,
>> NEGATE_EXPR,
>> +                                                              tmp);
>> +                     gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
>> +                     gsi_insert_after (&gsi, neg_stmt, GSI_NEW_STMT);
>> +                     update_stmt (stmt);
>>
>> I think that if powi_result is also built you end up using the wrong
>> stmt so you miss a
>>
>>                          stmt = SSA_NAME_DEF_STMT (lhs);
>
>
> Yes, indeed. This can happen and I have added this.
>
>>
>> here.  Also see the new_lhs handling of the powi_result case - again
>> you need something similar here (its handling looks a bit fishy as
>> well - this all needs some comments and possibly (a lot more)
>> testcases).
>>
>> So, please do the above requested changes and verify the 'lhs' issues
>> I pointed out by trying to add a few more testcases that also cover
>> the case where a powi is detected in addition to a negation.  Please
>> also add a testcase that catches (-y) * x * (-z).
>>
>
> Added this to the testcase.
>
> Does this look better now?

Yes - the patch is ok with the above suggested change.

Thanks,
Richard.

>
> Thanks,
> Kugan
>
>
>>> 2016-04-23  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>
>>>          PR middle-end/40921
>>>          * gcc.dg/tree-ssa/pr40921.c: New test.
>>>
>>> gcc/ChangeLog:
>>>
>>> 2016-04-23  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>
>>>          PR middle-end/40921
>>>          * tree-ssa-reassoc.c (try_special_add_to_ops): New.
>>>          (linearize_expr_tree): Call try_special_add_to_ops.
>>>          (reassociate_bb): Convert MULT_EXPR by (-1) to NEGATE_EXPR.
>>>
>


* Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
  2016-05-09 11:10                     ` Richard Biener
@ 2016-05-18  8:38                       ` Kugan Vivekanandarajah
  2016-05-18 10:01                         ` Martin Liška
  2016-05-18 11:24                         ` Richard Biener
  0 siblings, 2 replies; 16+ messages in thread
From: Kugan Vivekanandarajah @ 2016-05-18  8:38 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1003 bytes --]

>>> Please move the whole thing under the else { } case of the ops.length
>>> == 0, ops.length == 1 test chain
>>> as you did for the actual emit of the negate.
>>>
>>
>> I see your point. However, when we remove the (-1) from the ops list,
>> that can in turn result in ops.length becoming 1. Therefore, I moved
>> the following if (negate_result) outside the condition.
>
> Ah, indeed.  But now you have to care for ops.length () == 0, and thus
> the unconditional ops.last () may now trap.  So I suggest to do

Done.

> Yes - the patch is ok with the above suggested change.


While testing on an ARM variant, I found that vector types are not
handled.  Therefore, I had to change:

+      || ((TREE_CODE (last->op) == REAL_CST)
+            && real_equal (&TREE_REAL_CST (last->op), &dconstm1))


to

+      || real_minus_onep (last->op))
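
(As far as I can see, real_minus_onep also handles VECTOR_CST and
COMPLEX_CST constants whose elements are all -1.0, whereas
TREE_REAL_CST is only valid on a plain REAL_CST, which is what the
vector cases were tripping over.)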


Is this still OK? Bootstrap and regression testing on ARM, AArch64 and
x86-64 didn't have any new regressions.

Thanks,
Kugan

[-- Attachment #2: pr40921.txt --]
[-- Type: text/plain, Size: 5318 bytes --]

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c b/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c
index e69de29..3a5a23a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c
@@ -0,0 +1,26 @@
+
+/* { dg-do compile } */
+/* { dg-options "-O2  -fdump-tree-optimized -ffast-math" } */
+
+unsigned int foo (unsigned int x, unsigned int y, unsigned int z)
+{
+      return x + (-y * z * z);
+}
+
+float bar (float x, float y, float z)
+{
+      return x + (-y * z * z);
+}
+
+float bar2 (float x, float y, float z)
+{
+      return x + (-y * z * z * 5.0f);
+}
+
+float bar3 (float x, float y, float z)
+{
+      return x + (-y * x * -z);
+}
+
+
+/* { dg-final { scan-tree-dump-times "_* = -y_" 0 "optimized" } } */
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 7408977..b54e3eb 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -4252,6 +4252,45 @@ acceptable_pow_call (gimple *stmt, tree *base, HOST_WIDE_INT *exponent)
   return true;
 }
 
+/* Try to derive and add operand entry for OP to *OPS.  Return false if
+   unsuccessful.  */
+
+static bool
+try_special_add_to_ops (vec<operand_entry *> *ops,
+			enum tree_code code,
+			tree op, gimple* def_stmt)
+{
+  tree base = NULL_TREE;
+  HOST_WIDE_INT exponent = 0;
+
+  if (TREE_CODE (op) != SSA_NAME)
+    return false;
+
+  if (code == MULT_EXPR
+      && acceptable_pow_call (def_stmt, &base, &exponent))
+    {
+      add_repeat_to_ops_vec (ops, base, exponent);
+      gimple_set_visited (def_stmt, true);
+      return true;
+    }
+  else if (code == MULT_EXPR
+	   && is_gimple_assign (def_stmt)
+	   && gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR
+	   && !HONOR_SNANS (TREE_TYPE (op))
+	   && (!HONOR_SIGNED_ZEROS (TREE_TYPE (op))
+	       || !COMPLEX_FLOAT_TYPE_P (TREE_TYPE (op))))
+    {
+      tree rhs1 = gimple_assign_rhs1 (def_stmt);
+      tree cst = build_minus_one_cst (TREE_TYPE (op));
+      add_to_ops_vec (ops, rhs1);
+      add_to_ops_vec (ops, cst);
+      gimple_set_visited (def_stmt, true);
+      return true;
+    }
+
+  return false;
+}
+
 /* Recursively linearize a binary expression that is the RHS of STMT.
    Place the operands of the expression tree in the vector named OPS.  */
 
@@ -4266,8 +4305,6 @@ linearize_expr_tree (vec<operand_entry *> *ops, gimple *stmt,
   bool binrhsisreassoc = false;
   enum tree_code rhscode = gimple_assign_rhs_code (stmt);
   struct loop *loop = loop_containing_stmt (stmt);
-  tree base = NULL_TREE;
-  HOST_WIDE_INT exponent = 0;
 
   if (set_visited)
     gimple_set_visited (stmt, true);
@@ -4303,24 +4340,10 @@ linearize_expr_tree (vec<operand_entry *> *ops, gimple *stmt,
 
       if (!binrhsisreassoc)
 	{
-	  if (rhscode == MULT_EXPR
-	      && TREE_CODE (binrhs) == SSA_NAME
-	      && acceptable_pow_call (binrhsdef, &base, &exponent))
-	    {
-	      add_repeat_to_ops_vec (ops, base, exponent);
-	      gimple_set_visited (binrhsdef, true);
-	    }
-	  else
+	  if (!try_special_add_to_ops (ops, rhscode, binrhs, binrhsdef))
 	    add_to_ops_vec (ops, binrhs);
 
-	  if (rhscode == MULT_EXPR
-	      && TREE_CODE (binlhs) == SSA_NAME
-	      && acceptable_pow_call (binlhsdef, &base, &exponent))
-	    {
-	      add_repeat_to_ops_vec (ops, base, exponent);
-	      gimple_set_visited (binlhsdef, true);
-	    }
-	  else
+	  if (!try_special_add_to_ops (ops, rhscode, binlhs, binlhsdef))
 	    add_to_ops_vec (ops, binlhs);
 
 	  return;
@@ -4360,14 +4383,7 @@ linearize_expr_tree (vec<operand_entry *> *ops, gimple *stmt,
   linearize_expr_tree (ops, SSA_NAME_DEF_STMT (binlhs),
 		       is_associative, set_visited);
 
-  if (rhscode == MULT_EXPR
-      && TREE_CODE (binrhs) == SSA_NAME
-      && acceptable_pow_call (SSA_NAME_DEF_STMT (binrhs), &base, &exponent))
-    {
-      add_repeat_to_ops_vec (ops, base, exponent);
-      gimple_set_visited (SSA_NAME_DEF_STMT (binrhs), true);
-    }
-  else
+  if (!try_special_add_to_ops (ops, rhscode, binrhs, binrhsdef))
     add_to_ops_vec (ops, binrhs);
 }
 
@@ -5127,6 +5143,24 @@ reassociate_bb (basic_block bb)
 		    powi_result = attempt_builtin_powi (stmt, &ops);
 		}
 
+	      operand_entry *last;
+	      bool negate_result = false;
+	      if (ops.length () > 1
+		  && rhs_code == MULT_EXPR)
+		{
+		  last = ops.last ();
+		  if (((TREE_CODE (last->op) == INTEGER_CST
+			&& integer_minus_onep (last->op))
+		       || real_minus_onep (last->op))
+		      && !HONOR_SNANS (TREE_TYPE (lhs))
+		      && (!HONOR_SIGNED_ZEROS (TREE_TYPE (lhs))
+			  || !COMPLEX_FLOAT_TYPE_P (TREE_TYPE (lhs))))
+		    {
+		      ops.pop ();
+		      negate_result = true;
+		    }
+		}
+
 	      /* If the operand vector is now empty, all operands were 
 		 consumed by the __builtin_powi optimization.  */
 	      if (ops.length () == 0)
@@ -5189,6 +5223,18 @@ reassociate_bb (basic_block bb)
 		      gsi_insert_after (&gsi, mul_stmt, GSI_NEW_STMT);
 		    }
 		}
+
+	      if (negate_result)
+		{
+		  stmt = SSA_NAME_DEF_STMT (lhs);
+		  tree tmp = make_ssa_name (TREE_TYPE (lhs));
+		  gimple_set_lhs (stmt, tmp);
+		  gassign *neg_stmt = gimple_build_assign (lhs, NEGATE_EXPR,
+							   tmp);
+		  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+		  gsi_insert_after (&gsi, neg_stmt, GSI_NEW_STMT);
+		  update_stmt (stmt);
+		}
 	    }
 	}
     }


* Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
  2016-05-18  8:38                       ` Kugan Vivekanandarajah
@ 2016-05-18 10:01                         ` Martin Liška
  2016-05-18 11:33                           ` Kugan
  2016-05-18 11:24                         ` Richard Biener
  1 sibling, 1 reply; 16+ messages in thread
From: Martin Liška @ 2016-05-18 10:01 UTC (permalink / raw)
  To: Kugan Vivekanandarajah, Richard Biener; +Cc: gcc-patches

On 05/18/2016 10:38 AM, Kugan Vivekanandarajah wrote:
> Is this still OK? Bootstrap and regression testing on ARM, AArch64 and
> x86-64 didn't have any new regressions.
> 
> Thanks,
> Kugan

Hello.

I see various ICEs after your commit r236356:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71170

Martin


* Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
  2016-05-18  8:38                       ` Kugan Vivekanandarajah
  2016-05-18 10:01                         ` Martin Liška
@ 2016-05-18 11:24                         ` Richard Biener
  1 sibling, 0 replies; 16+ messages in thread
From: Richard Biener @ 2016-05-18 11:24 UTC (permalink / raw)
  To: Kugan Vivekanandarajah; +Cc: gcc-patches

On Wed, May 18, 2016 at 10:38 AM, Kugan Vivekanandarajah
<kugan.vivekanandarajah@linaro.org> wrote:
>>>> Please move the whole thing under the else { } case of the ops.length
>>>> == 0, ops.length == 1 test chain
>>>> as you did for the actual emit of the negate.
>>>>
>>>
>>> I see your point. However, when we remove the (-1) from the ops list,
>>> that can in turn result in ops.length becoming 1. Therefore, I moved
>>> the following if (negate_result) outside the condition.
>>
>> Ah, indeed.  But now you have to care for ops.length () == 0, and thus
>> the unconditional ops.last () may now trap.  So I suggest to do
>
> Done.
>
>> Yes - the patch is ok with the above suggested change.
>
>
> While testing on an ARM variant, I found that vector types are not
> handled.  Therefore, I had to change:
>
> +      || ((TREE_CODE (last->op) == REAL_CST)
> +            && real_equal (&TREE_REAL_CST (last->op), &dconstm1))
>
>
> to
>
> +      || real_minus_onep (last->op))
>
>
> Is this still OK? Bootstrap and regression testing on ARM, AArch64 and
> x86-64 didn't have any new regressions.

Yes.

Thanks,
Richard.

> Thanks,
> Kugan


* Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z
  2016-05-18 10:01                         ` Martin Liška
@ 2016-05-18 11:33                           ` Kugan
  0 siblings, 0 replies; 16+ messages in thread
From: Kugan @ 2016-05-18 11:33 UTC (permalink / raw)
  To: Martin Liška, Richard Biener; +Cc: gcc-patches

Hi Martin,

> 
> I see various ICEs after your commit r236356:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71170

Sorry for the breakage. Looking into it.

Thanks,
Kugan


Thread overview: 16+ messages
2016-02-26  3:02 [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z kugan
2016-02-26 11:02 ` Richard Biener
2016-02-29 10:54   ` kugan
2016-04-19 11:36     ` Richard Biener
2016-04-19 11:36       ` Richard Biener
2016-04-19 12:12         ` Richard Biener
2016-04-21 11:13           ` kugan
2016-04-22  8:09             ` Richard Biener
2016-04-23 13:10               ` kugan
2016-04-27 13:41                 ` Richard Biener
2016-05-05  0:41                   ` kugan
2016-05-09 11:10                     ` Richard Biener
2016-05-18  8:38                       ` Kugan Vivekanandarajah
2016-05-18 10:01                         ` Martin Liška
2016-05-18 11:33                           ` Kugan
2016-05-18 11:24                         ` Richard Biener
