public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [RFC][PATCH][PR63586] Convert x+x+x+x into 4*x
@ 2016-02-26  3:03 kugan
  2016-02-26 11:07 ` Richard Biener
  0 siblings, 1 reply; 11+ messages in thread
From: kugan @ 2016-02-26  3:03 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener

[-- Attachment #1: Type: text/plain, Size: 583 bytes --]



Hi,

This is an attempt to fix missed optimization: x+x+x+x -> 4*x as 
reported in PR63586.

Regression tested and bootstrapped on x86-64-linux-gnu with no new 
regressions.

Is this OK for next stage1?

Thanks,
Kugan


gcc/testsuite/ChangeLog:

2016-02-26  Kugan Vivekanandarajah  <kuganv@linaro.org>

	PR middle-end/63586
	* gcc.dg/tree-ssa/reassoc-14.c: Fix multiply count.

gcc/ChangeLog:

2016-02-26  Kugan Vivekanandarajah  <kuganv@linaro.org>

	PR middle-end/63586
	* tree-ssa-reassoc.c (transform_add_to_multiply): New.
	(reassociate_bb): Call transform_add_to_multiply.



[-- Attachment #2: PR63586.txt --]
[-- Type: text/plain, Size: 3005 bytes --]

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
index 62802d1..16ebc86 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
@@ -19,6 +19,7 @@ unsigned int test2 (unsigned int x, unsigned int y, unsigned int z,
   return tmp1 + tmp2 + tmp3;
 }
 
-/* There should be one multiplication left in test1 and three in test2.  */
+/* There should be two multiplication left in test1 (inculding one generated
+   when converting addition to multiplication) and three in test2.  */
 
-/* { dg-final { scan-tree-dump-times "\\\*" 4 "reassoc1" } } */
+/* { dg-final { scan-tree-dump-times "\\\*" 5 "reassoc1" } } */
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index dfd0da1..2454b9d 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -1698,6 +1698,61 @@ eliminate_redundant_comparison (enum tree_code opcode,
   return false;
 }
 
+/* Recursively transoform repeated addition of same values into multiply with
+   constant.  */
+static void
+transform_add_to_multiply (gimple_stmt_iterator *gsi, gimple *stmt, vec<operand_entry *> *ops)
+{
+  operand_entry *oe;
+  tree op = NULL_TREE;
+  int i, start = -1, end = 0, count = 0;
+
+  /* Look for repeated operands.  */
+  FOR_EACH_VEC_ELT (*ops, i, oe)
+    {
+      if (start == -1)
+	{
+	  count = 1;
+	  op = oe->op;
+	  start = i;
+	}
+      else if (operand_equal_p (oe->op, op, 0))
+	{
+	  count++;
+	  end = i;
+	}
+      else if (count == 1)
+	{
+	  count = 1;
+	  op = oe->op;
+	  start = i;
+	}
+      else
+	break;
+    }
+
+  if (count > 1)
+    {
+      /* Convert repeated operand addition to multiplication.  */
+      for (i = end; i >= start; --i)
+	ops->unordered_remove (i);
+      tree tmp = make_temp_ssa_name (TREE_TYPE (op), NULL, "reassocmul");
+      gassign *mul_stmt = gimple_build_assign (tmp, MULT_EXPR,
+					       op, build_int_cst (TREE_TYPE(op), count));
+      gimple_set_location (mul_stmt, gimple_location (stmt));
+      gimple_set_uid (mul_stmt, gimple_uid (stmt));
+      gsi_insert_before (gsi, mul_stmt, GSI_SAME_STMT);
+      oe = operand_entry_pool.allocate ();
+      oe->op = tmp;
+      oe->rank = get_rank (op) * count;
+      oe->id = 0;
+      oe->count = 1;
+      ops->safe_push (oe);
+      transform_add_to_multiply (gsi, stmt, ops);
+    }
+}
+
+
 /* Perform various identities and other optimizations on the list of
    operand entries, stored in OPS.  The tree code for the binary
    operation between all the operands is OPCODE.  */
@@ -4854,6 +4909,12 @@ reassociate_bb (basic_block bb)
 		  && flag_unsafe_math_optimizations)
 		powi_result = attempt_builtin_powi (stmt, &ops);
 
+	      if (rhs_code == PLUS_EXPR)
+		{
+		  transform_add_to_multiply (&gsi, stmt, &ops);
+		  ops.qsort (sort_by_operand_rank);
+		}
+
 	      /* If the operand vector is now empty, all operands were 
 		 consumed by the __builtin_powi optimization.  */
 	      if (ops.length () == 0)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH][PR63586] Convert x+x+x+x into 4*x
  2016-02-26  3:03 [RFC][PATCH][PR63586] Convert x+x+x+x into 4*x kugan
@ 2016-02-26 11:07 ` Richard Biener
  2016-02-29  4:28   ` kugan
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Biener @ 2016-02-26 11:07 UTC (permalink / raw)
  To: kugan; +Cc: gcc-patches

On Fri, Feb 26, 2016 at 4:02 AM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> Hi,
>
> This is an attempt to fix missed optimization: x+x+x+x -> 4*x as reported in
> PR63586.
>
> Regression tested and bootstrapped on x86-64-linux-gnu with no new
> regressions.
>
> Is this OK for next stage1?

That looks better, but I think the unordered_remove will break operand sorting
and thus you probably don't handle x + x + x + x + y + y + y + y + y +
y + z + z + z + z
optimally.

I'd say you simply want to avoid the recursion and collect a vector of
[start, end] pairs
before doing any modification to the ops vector.

Richard.

> Thanks,
> Kugan
>
>
> gcc/testsuite/ChangeLog:
>
> 2016-02-26  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         PR middle-end/63586
>         * gcc.dg/tree-ssa/reassoc-14.c: Fix multiply count.
>
> gcc/ChangeLog:
>
> 2016-02-26  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         PR middle-end/63586
>         * tree-ssa-reassoc.c (transform_add_to_multiply): New.
>         (reassociate_bb): Call transform_add_to_multiply.
>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH][PR63586] Convert x+x+x+x into 4*x
  2016-02-26 11:07 ` Richard Biener
@ 2016-02-29  4:28   ` kugan
  2016-03-02 14:28     ` Christophe Lyon
  0 siblings, 1 reply; 11+ messages in thread
From: kugan @ 2016-02-29  4:28 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 400 bytes --]


> That looks better, but I think the unordered_remove will break operand sorting
> and thus you probably don't handle x + x + x + x + y + y + y + y + y +
> y + z + z + z + z
> optimally.
>
> I'd say you simply want to avoid the recursion and collect a vector of
> [start, end] pairs
> before doing any modification to the ops vector.

Hi Richard,

Is the attached patch looks better?

Thanks,
Kugan

[-- Attachment #2: PR63586.txt --]
[-- Type: text/plain, Size: 4679 bytes --]

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c b/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c
index e69de29..a002bdd 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c
@@ -0,0 +1,59 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-reassoc1" } */
+
+unsigned f1 (unsigned x)
+{
+    unsigned y = x + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    return y;
+}
+
+unsigned f2 (unsigned x, unsigned z)
+{
+    unsigned y = x + x;
+    y = y + x;
+    y = y + x;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    return y;
+}
+
+unsigned f3 (unsigned x, unsigned z, unsigned k)
+{
+    unsigned y = x + x;
+    y = y + x;
+    y = y + x;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    y = y + k;
+    return y;
+}
+
+unsigned f4 (unsigned x, unsigned z, unsigned k)
+{
+    unsigned y = k + x;
+    y = y + x;
+    y = y + x;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    y = y + x;
+    return y;
+}
+
+unsigned f5 (unsigned x, unsigned y, unsigned z)
+{
+    return x + x + x + x + y + y + y + y + y +
+      y + z + z + z + z;
+}
+
+
+/* { dg-final { scan-tree-dump-times "\\\*" 10 "reassoc1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
index 62802d1..16ebc86 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
@@ -19,6 +19,7 @@ unsigned int test2 (unsigned int x, unsigned int y, unsigned int z,
   return tmp1 + tmp2 + tmp3;
 }
 
-/* There should be one multiplication left in test1 and three in test2.  */
+/* There should be two multiplication left in test1 (inculding one generated
+   when converting addition to multiplication) and three in test2.  */
 
-/* { dg-final { scan-tree-dump-times "\\\*" 4 "reassoc1" } } */
+/* { dg-final { scan-tree-dump-times "\\\*" 5 "reassoc1" } } */
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 17eb64f..0a43faf 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -1698,6 +1698,79 @@ eliminate_redundant_comparison (enum tree_code opcode,
   return false;
 }
 
+/* Transoform repeated addition of same values into multiply with
+   constant.  */
+static void
+transform_add_to_multiply (gimple_stmt_iterator *gsi, gimple *stmt, vec<operand_entry *> *ops)
+{
+  operand_entry *oe;
+  tree op = NULL_TREE;
+  int j;
+  int i, start = -1, end = 0, count = 0;
+  vec<int>  start_inds = vNULL;
+  vec<int>  end_inds = vNULL;
+  vec<tree>  op_list = vNULL;
+
+  /* Look for repeated operands.  */
+  FOR_EACH_VEC_ELT (*ops, i, oe)
+    {
+      if (start == -1)
+	{
+	  count = 1;
+	  op = oe->op;
+	  start = i;
+	}
+      else if (operand_equal_p (oe->op, op, 0))
+	{
+	  count++;
+	  end = i;
+	}
+      else
+	{
+	  if (count > 1)
+	    {
+	      start_inds.safe_push (start);
+	      end_inds.safe_push (end);
+	      op_list.safe_push (op);
+	    }
+	  count = 1;
+	  op = oe->op;
+	  start = i;
+	}
+    }
+
+  if (count > 1)
+    {
+      start_inds.safe_push (start);
+      end_inds.safe_push (end);
+      op_list.safe_push (op);
+    }
+
+  for (j = start_inds.length () - 1; j >= 0; --j)
+    {
+      /* Convert repeated operand addition to multiplication.  */
+      start = start_inds[j];
+      end = end_inds[j];
+      op = op_list[j];
+      count = end - start + 1;
+      for (i = end; i >= start; --i)
+	ops->unordered_remove (i);
+      tree tmp = make_temp_ssa_name (TREE_TYPE (op), NULL, "reassocmul");
+      gassign *mul_stmt = gimple_build_assign (tmp, MULT_EXPR,
+					       op, build_int_cst (TREE_TYPE(op), count));
+      gimple_set_location (mul_stmt, gimple_location (stmt));
+      gimple_set_uid (mul_stmt, gimple_uid (stmt));
+      gsi_insert_before (gsi, mul_stmt, GSI_SAME_STMT);
+      oe = operand_entry_pool.allocate ();
+      oe->op = tmp;
+      oe->rank = get_rank (op) * count;
+      oe->id = 0;
+      oe->count = 1;
+      ops->safe_push (oe);
+    }
+}
+
+
 /* Perform various identities and other optimizations on the list of
    operand entries, stored in OPS.  The tree code for the binary
    operation between all the operands is OPCODE.  */
@@ -4922,6 +4995,12 @@ reassociate_bb (basic_block bb)
 		  && flag_unsafe_math_optimizations)
 		powi_result = attempt_builtin_powi (stmt, &ops);
 
+	      if (rhs_code == PLUS_EXPR)
+		{
+		  transform_add_to_multiply (&gsi, stmt, &ops);
+		  ops.qsort (sort_by_operand_rank);
+		}
+
 	      /* If the operand vector is now empty, all operands were 
 		 consumed by the __builtin_powi optimization.  */
 	      if (ops.length () == 0)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH][PR63586] Convert x+x+x+x into 4*x
  2016-02-29  4:28   ` kugan
@ 2016-03-02 14:28     ` Christophe Lyon
  2016-04-19 11:56       ` Richard Biener
  0 siblings, 1 reply; 11+ messages in thread
From: Christophe Lyon @ 2016-03-02 14:28 UTC (permalink / raw)
  To: kugan; +Cc: Richard Biener, gcc-patches

On 29 February 2016 at 05:28, kugan <kugan.vivekanandarajah@linaro.org> wrote:
>
>> That looks better, but I think the unordered_remove will break operand
>> sorting
>> and thus you probably don't handle x + x + x + x + y + y + y + y + y +
>> y + z + z + z + z
>> optimally.
>>
>> I'd say you simply want to avoid the recursion and collect a vector of
>> [start, end] pairs
>> before doing any modification to the ops vector.
>
>
> Hi Richard,
>
> Is the attached patch looks better?
>

Minor comment, I've noticed typos in your updated comment:
"There should be two multiplication left in test1 (inculding one generated"
should be
"There should be two multiplications left in test1 (including one generated"



> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH][PR63586] Convert x+x+x+x into 4*x
  2016-03-02 14:28     ` Christophe Lyon
@ 2016-04-19 11:56       ` Richard Biener
  2016-04-19 11:57         ` Richard Biener
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Biener @ 2016-04-19 11:56 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: kugan, gcc-patches

On Wed, Mar 2, 2016 at 3:28 PM, Christophe Lyon
<christophe.lyon@linaro.org> wrote:
> On 29 February 2016 at 05:28, kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>
>>> That looks better, but I think the unordered_remove will break operand
>>> sorting
>>> and thus you probably don't handle x + x + x + x + y + y + y + y + y +
>>> y + z + z + z + z
>>> optimally.
>>>
>>> I'd say you simply want to avoid the recursion and collect a vector of
>>> [start, end] pairs
>>> before doing any modification to the ops vector.
>>
>>
>> Hi Richard,
>>
>> Is the attached patch looks better?
>>
>
> Minor comment, I've noticed typos in your updated comment:
> "There should be two multiplication left in test1 (inculding one generated"
> should be
> "There should be two multiplications left in test1 (including one generated"

+/* Transoform repeated addition of same values into multiply with
+   constant.  */

Transform

+static void
+transform_add_to_multiply (gimple_stmt_iterator *gsi, gimple *stmt,
vec<operand_entry *> *ops)

split the long line

op_list looks redundant - ops[start]->op gives you the desired value
already and if you
use a vec<std::pair<int, int>> you can have a more C++ish start,end pair.

+      tree tmp = make_temp_ssa_name (TREE_TYPE (op), NULL, "reassocmul");
+      gassign *mul_stmt = gimple_build_assign (tmp, MULT_EXPR,
+                                              op, build_int_cst
(TREE_TYPE(op), count));

this won't work for floating point or complex numbers - you need to use sth like
fold_convert (TREE_TYPE (op), build_int_cst (integer_type_node, count));

For FP types you need to guard the transform with flag_unsafe_math_optimizations

+      gimple_set_location (mul_stmt, gimple_location (stmt));
+      gimple_set_uid (mul_stmt, gimple_uid (stmt));
+      gsi_insert_before (gsi, mul_stmt, GSI_SAME_STMT);

I think you do not want to set the stmt uid and you want to insert the
stmt right
after the def of op (or at the original first add - though you can't
get your hands at
that easily).  You also don't want to set the location to the last stmt of the
whole add sequence - simply leave it unset.

+      oe = operand_entry_pool.allocate ();
+      oe->op = tmp;
+      oe->rank = get_rank (op) * count;

?  Why that?  oe->rank should be get_rank (tmp).

+      oe->id = 0;

other places use next_operand_entry_id++.  I think you want to simply
use add_to_ops_vec (oe, tmp); here for all of the above.

Please return whether you did any optimization and do the
qsort of the operand vector only if you did sth.

Testcase with FP math missing.  Likewise with complex or vector math.

Thanks,
Richard.

>> Thanks,
>> Kugan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH][PR63586] Convert x+x+x+x into 4*x
  2016-04-19 11:56       ` Richard Biener
@ 2016-04-19 11:57         ` Richard Biener
  2016-04-23 22:03           ` kugan
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Biener @ 2016-04-19 11:57 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: kugan, gcc-patches

On Tue, Apr 19, 2016 at 1:56 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Wed, Mar 2, 2016 at 3:28 PM, Christophe Lyon
> <christophe.lyon@linaro.org> wrote:
>> On 29 February 2016 at 05:28, kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>>> That looks better, but I think the unordered_remove will break operand
>>>> sorting
>>>> and thus you probably don't handle x + x + x + x + y + y + y + y + y +
>>>> y + z + z + z + z
>>>> optimally.
>>>>
>>>> I'd say you simply want to avoid the recursion and collect a vector of
>>>> [start, end] pairs
>>>> before doing any modification to the ops vector.
>>>
>>>
>>> Hi Richard,
>>>
>>> Is the attached patch looks better?
>>>
>>
>> Minor comment, I've noticed typos in your updated comment:
>> "There should be two multiplication left in test1 (inculding one generated"
>> should be
>> "There should be two multiplications left in test1 (including one generated"
>
> +/* Transoform repeated addition of same values into multiply with
> +   constant.  */
>
> Transform
>
> +static void
> +transform_add_to_multiply (gimple_stmt_iterator *gsi, gimple *stmt,
> vec<operand_entry *> *ops)
>
> split the long line
>
> op_list looks redundant - ops[start]->op gives you the desired value
> already and if you
> use a vec<std::pair<int, int>> you can have a more C++ish start,end pair.
>
> +      tree tmp = make_temp_ssa_name (TREE_TYPE (op), NULL, "reassocmul");
> +      gassign *mul_stmt = gimple_build_assign (tmp, MULT_EXPR,
> +                                              op, build_int_cst
> (TREE_TYPE(op), count));
>
> this won't work for floating point or complex numbers - you need to use sth like
> fold_convert (TREE_TYPE (op), build_int_cst (integer_type_node, count));
>
> For FP types you need to guard the transform with flag_unsafe_math_optimizations
>
> +      gimple_set_location (mul_stmt, gimple_location (stmt));
> +      gimple_set_uid (mul_stmt, gimple_uid (stmt));
> +      gsi_insert_before (gsi, mul_stmt, GSI_SAME_STMT);
>
> I think you do not want to set the stmt uid and you want to insert the
> stmt right
> after the def of op (or at the original first add - though you can't
> get your hands at
> that easily).  You also don't want to set the location to the last stmt of the
> whole add sequence - simply leave it unset.
>
> +      oe = operand_entry_pool.allocate ();
> +      oe->op = tmp;
> +      oe->rank = get_rank (op) * count;
>
> ?  Why that?  oe->rank should be get_rank (tmp).
>
> +      oe->id = 0;
>
> other places use next_operand_entry_id++.  I think you want to simply
> use add_to_ops_vec (oe, tmp); here for all of the above.
>
> Please return whether you did any optimization and do the
> qsort of the operand vector only if you did sth.
>
> Testcase with FP math missing.  Likewise with complex or vector math.

Btw, does it handle associating

  x + 3 * x + x

to

  5 * x

?

Richard.

> Thanks,
> Richard.
>
>>> Thanks,
>>> Kugan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH][PR63586] Convert x+x+x+x into 4*x
  2016-04-19 11:57         ` Richard Biener
@ 2016-04-23 22:03           ` kugan
  2016-04-27 14:03             ` Richard Biener
  0 siblings, 1 reply; 11+ messages in thread
From: kugan @ 2016-04-23 22:03 UTC (permalink / raw)
  To: Richard Biener, Christophe Lyon; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3236 bytes --]

Hi Richard,

As you have said in the other email, I tried implementing with the 
add_reapeats_to_ops_vec but the whole repeat vector is designed for 
MULT_EXPR chain. I tried changing it but it turned out to be not 
straightforward without lots of re-write. Therefore I tried to implement 
based on your review here. Please tell me what you think.

>> +/* Transoform repeated addition of same values into multiply with
>> +   constant.  */
>>
>> Transform

Done.

>>
>> +static void
>> +transform_add_to_multiply (gimple_stmt_iterator *gsi, gimple *stmt,
>> vec<operand_entry *> *ops)
>>
>> split the long line

Done.

>>
>> op_list looks redundant - ops[start]->op gives you the desired value
>> already and if you
>> use a vec<std::pair<int, int>> you can have a more C++ish start,end pair.
>>
>> +      tree tmp = make_temp_ssa_name (TREE_TYPE (op), NULL, "reassocmul");
>> +      gassign *mul_stmt = gimple_build_assign (tmp, MULT_EXPR,
>> +                                              op, build_int_cst
>> (TREE_TYPE(op), count));
>>
>> this won't work for floating point or complex numbers - you need to use sth like
>> fold_convert (TREE_TYPE (op), build_int_cst (integer_type_node, count));

Done.

>>
>> For FP types you need to guard the transform with flag_unsafe_math_optimizations

Done.

>>
>> +      gimple_set_location (mul_stmt, gimple_location (stmt));
>> +      gimple_set_uid (mul_stmt, gimple_uid (stmt));
>> +      gsi_insert_before (gsi, mul_stmt, GSI_SAME_STMT);
>>
>> I think you do not want to set the stmt uid

assert in reassoc_stmt_dominates_p (gcc_assert (gimple_uid (s1) && 
gimple_uid (s2))) is failing. So I tried to add the uid of the adjacent 
stmt and it seems to work.

>> and you want to insert the
>> stmt right
>> after the def of op (or at the original first add - though you can't
>> get your hands at

Done.

>> that easily).  You also don't want to set the location to the last stmt of the
>> whole add sequence - simply leave it unset.
>>
>> +      oe = operand_entry_pool.allocate ();
>> +      oe->op = tmp;
>> +      oe->rank = get_rank (op) * count;
>>
>> ?  Why that?  oe->rank should be get_rank (tmp).
>>
>> +      oe->id = 0;
>>
>> other places use next_operand_entry_id++.  I think you want to simply
>> use add_to_ops_vec (oe, tmp); here for all of the above.

Done.

>>
>> Please return whether you did any optimization and do the
>> qsort of the operand vector only if you did sth.

Done.


>> Testcase with FP math missing.  Likewise with complex or vector math.
>
> Btw, does it handle associating
>
>    x + 3 * x + x
>
> to
>
>    5 * x
>
> ?

Added this to the testcase and verified it is working.

Regression tested and bootstrapped on x86-64-linux-gnu with no new 
regressions.

Is this OK for trunk?

Thanks,
Kugan


gcc/testsuite/ChangeLog:

2016-04-24  Kugan Vivekanandarajah  <kuganv@linaro.org>

	PR middle-end/63586
	* gcc.dg/tree-ssa/pr63586-2.c: New test.
	* gcc.dg/tree-ssa/pr63586.c: New test.
	* gcc.dg/tree-ssa/reassoc-14.c: Adjust multiplication count.

gcc/ChangeLog:

2016-04-24  Kugan Vivekanandarajah  <kuganv@linaro.org>

	PR middle-end/63586
	* tree-ssa-reassoc.c (transform_add_to_multiply): New.
	(reassociate_bb): Call transform_add_to_multiply.




[-- Attachment #2: p1.txt --]
[-- Type: text/plain, Size: 5803 bytes --]

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
index e69de29..2774fbd 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -fdump-tree-reassoc1" } */
+
+float f1_float (float x)
+{
+    float y = x + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    return y;
+}
+
+float f1_float2 (float x)
+{
+    float y = x + 3 * x + x;
+    return y;
+}
+
+int f1_int (int x)
+{
+    int y = x + 3 * x + x;
+    return y;
+}
+
+/* { dg-final { scan-tree-dump-times "\\\*" 4 "reassoc1" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c b/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c
index e69de29..a0f705b 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c
@@ -0,0 +1,59 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-reassoc1" } */
+
+unsigned f1 (unsigned x)
+{
+    unsigned y = x + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    return y;
+}
+
+unsigned f2 (unsigned x, unsigned z)
+{
+    unsigned y = x + x;
+    y = y + x;
+    y = y + x;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    return y;
+}
+
+unsigned f3 (unsigned x, unsigned z, unsigned k)
+{
+    unsigned y = x + x;
+    y = y + x;
+    y = y + x;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    y = y + k;
+    return y;
+}
+
+unsigned f4 (unsigned x, unsigned z, unsigned k)
+{
+    unsigned y = k + x;
+    y = y + x;
+    y = y + x;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    y = y + x;
+    return y;
+}
+
+unsigned f5 (unsigned x, unsigned y, unsigned z)
+{
+    return x + x + x + x + y + y + y + y + y \
+      + y + z + z + z + z;
+}
+
+/* { dg-final { scan-tree-dump-times "\\\*" 10 "reassoc1" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
index 62802d1..16ebc86 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
@@ -19,6 +19,7 @@ unsigned int test2 (unsigned int x, unsigned int y, unsigned int z,
   return tmp1 + tmp2 + tmp3;
 }
 
-/* There should be one multiplication left in test1 and three in test2.  */
+/* There should be two multiplication left in test1 (inculding one generated
+   when converting addition to multiplication) and three in test2.  */
 
-/* { dg-final { scan-tree-dump-times "\\\*" 4 "reassoc1" } } */
+/* { dg-final { scan-tree-dump-times "\\\*" 5 "reassoc1" } } */
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index d23dabd..6e57d44 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -1756,6 +1756,95 @@ eliminate_redundant_comparison (enum tree_code opcode,
   return false;
 }
 
+/* Transform repeated addition of same values into multiply with
+   constant.  */
+static bool
+transform_add_to_multiply (vec<operand_entry *> *ops)
+{
+  operand_entry *oe;
+  tree op = NULL_TREE;
+  int j;
+  int i, start = -1, end = 0, count = 0;
+  vec<std::pair <int, int> > indxs = vNULL;
+  bool changed = false;
+  gimple *stmt;
+
+  if (!INTEGRAL_TYPE_P (TREE_TYPE ((*ops)[0]->op))
+      && !flag_unsafe_math_optimizations)
+    return false;
+
+  /* Look for repeated operands.  */
+  FOR_EACH_VEC_ELT (*ops, i, oe)
+    {
+      if (start == -1)
+	{
+	  count = 1;
+	  op = oe->op;
+	  start = i;
+	}
+      else if (operand_equal_p (oe->op, op, 0))
+	{
+	  count++;
+	  end = i;
+	}
+      else
+	{
+	  if (count > 1)
+	    indxs.safe_push (std::make_pair (start, end));
+	  count = 1;
+	  op = oe->op;
+	  start = i;
+	}
+    }
+
+  if (count > 1)
+    indxs.safe_push (std::make_pair (start, end));
+
+  for (j = indxs.length () - 1; j >= 0; --j)
+    {
+      /* Convert repeated operand addition to multiplication.  */
+      start = indxs[j].first;
+      end = indxs[j].second;
+      op = (*ops)[start]->op;
+      count = end - start + 1;
+      for (i = end; i >= start; --i)
+	ops->unordered_remove (i);
+      tree tmp = make_ssa_name (TREE_TYPE (op));
+      tree cst = build_int_cst (integer_type_node, count);
+      gimple *def_stmt = SSA_NAME_DEF_STMT (op);
+      gassign *mul_stmt
+	= gimple_build_assign (tmp, MULT_EXPR,
+			       op, fold_convert (TREE_TYPE (op), cst));
+      gimple_stmt_iterator gsi;
+      if (SSA_NAME_VAR (op) != NULL
+	  && gimple_code (def_stmt) == GIMPLE_NOP)
+	{
+	  gsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)));
+	  stmt = gsi_stmt (gsi);
+	  gsi_insert_before (&gsi, mul_stmt, GSI_NEW_STMT);
+	}
+      else if (gimple_code (def_stmt) == GIMPLE_PHI)
+	{
+	  gsi = gsi_after_labels (gimple_bb (def_stmt));
+	  stmt = gsi_stmt (gsi);
+	  gsi_insert_before (&gsi, mul_stmt, GSI_NEW_STMT);
+	}
+      else
+	{
+	  gsi = gsi_for_stmt (def_stmt);
+	  stmt = gsi_stmt (gsi);
+	  gsi_insert_after (&gsi, mul_stmt, GSI_NEW_STMT);
+	}
+      gimple_set_uid (mul_stmt, gimple_uid (stmt));
+      gimple_set_visited (mul_stmt, true);
+      add_to_ops_vec (ops, tmp);
+      changed = true;
+    }
+
+  return changed;
+}
+
+
 /* Perform various identities and other optimizations on the list of
    operand entries, stored in OPS.  The tree code for the binary
    operation between all the operands is OPCODE.  */
@@ -5127,6 +5216,10 @@ reassociate_bb (basic_block bb)
 		    powi_result = attempt_builtin_powi (stmt, &ops);
 		}
 
+	      if (rhs_code == PLUS_EXPR
+		  && transform_add_to_multiply (&ops))
+		ops.qsort (sort_by_operand_rank);
+
 	      /* If the operand vector is now empty, all operands were 
 		 consumed by the __builtin_powi optimization.  */
 	      if (ops.length () == 0)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH][PR63586] Convert x+x+x+x into 4*x
  2016-04-23 22:03           ` kugan
@ 2016-04-27 14:03             ` Richard Biener
  2016-05-05  1:58               ` kugan
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Biener @ 2016-04-27 14:03 UTC (permalink / raw)
  To: kugan; +Cc: Christophe Lyon, gcc-patches

On Sun, Apr 24, 2016 at 12:02 AM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> Hi Richard,
>
> As you have said in the other email, I tried implementing with the
> add_reapeats_to_ops_vec but the whole repeat vector is designed for
> MULT_EXPR chain. I tried changing it but it turned out to be not
> straightforward without lots of re-write. Therefore I tried to implement
> based on your review here. Please tell me what you think.

Hmm, ok.

>>> +/* Transoform repeated addition of same values into multiply with
>>> +   constant.  */
>>>
>>> Transform
>
>
> Done.
>
>>>
>>> +static void
>>> +transform_add_to_multiply (gimple_stmt_iterator *gsi, gimple *stmt,
>>> vec<operand_entry *> *ops)
>>>
>>> split the long line
>
>
> Done.
>
>>>
>>> op_list looks redundant - ops[start]->op gives you the desired value
>>> already and if you
>>> use a vec<std::pair<int, int>> you can have a more C++ish start,end pair.
>>>
>>> +      tree tmp = make_temp_ssa_name (TREE_TYPE (op), NULL,
>>> "reassocmul");
>>> +      gassign *mul_stmt = gimple_build_assign (tmp, MULT_EXPR,
>>> +                                              op, build_int_cst
>>> (TREE_TYPE(op), count));
>>>
>>> this won't work for floating point or complex numbers - you need to use
>>> sth like
>>> fold_convert (TREE_TYPE (op), build_int_cst (integer_type_node, count));
>
>
> Done.
>
>>>
>>> For FP types you need to guard the transform with
>>> flag_unsafe_math_optimizations
>
>
> Done.
>
>>>
>>> +      gimple_set_location (mul_stmt, gimple_location (stmt));
>>> +      gimple_set_uid (mul_stmt, gimple_uid (stmt));
>>> +      gsi_insert_before (gsi, mul_stmt, GSI_SAME_STMT);
>>>
>>> I think you do not want to set the stmt uid
>
>
> assert in reassoc_stmt_dominates_p (gcc_assert (gimple_uid (s1) &&
> gimple_uid (s2))) is failing. So I tried to add the uid of the adjacent stmt
> and it seems to work.

Hmm, yes, other cases seem to do the same.

>>> and you want to insert the
>>> stmt right
>>> after the def of op (or at the original first add - though you can't
>>> get your hands at
>
>
> Done.

maybe instert_stmt_after will help here, I don't think you got the insertion
logic correct, thus insert_stmt_after (mul_stmt, def_stmt) which I think
misses GIMPLE_NOP handling.  At least

+      if (SSA_NAME_VAR (op) != NULL

huh?  I suppose you could have tested SSA_NAME_IS_DEFAULT_DEF
but just the GIMPLE_NOP def-stmt test should be enough.

+         && gimple_code (def_stmt) == GIMPLE_NOP)
+       {
+         gsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)));
+         stmt = gsi_stmt (gsi);
+         gsi_insert_before (&gsi, mul_stmt, GSI_NEW_STMT);

not sure if that is the best insertion point choice, it un-does all
code-sinking done
(and no further sinking is run after the last reassoc pass).  We do know we
are handling all uses of op in our chain so inserting before the plus-expr
chain root should work here (thus 'stmt' in the caller context).  I'd
use that here instead.
I think I'd use that unconditionally even if it works and not bother
finding something
more optimal.

Apart from this this now looks ok to me.

But the testcases need some work


--- a/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
...
+
+/* { dg-final { scan-tree-dump-times "\\\*" 4 "reassoc1" } } */

I would have expected 3.  Also please check for \\\* 5 for example
to be more specific (and change the cases so you get different constants
for the different functions).

That said, please make the scans more specific.

Thanks,
Richard.


>>> that easily).  You also don't want to set the location to the last stmt
>>> of the
>>> whole add sequence - simply leave it unset.
>>>
>>> +      oe = operand_entry_pool.allocate ();
>>> +      oe->op = tmp;
>>> +      oe->rank = get_rank (op) * count;
>>>
>>> ?  Why that?  oe->rank should be get_rank (tmp).
>>>
>>> +      oe->id = 0;
>>>
>>> other places use next_operand_entry_id++.  I think you want to simply
>>> use add_to_ops_vec (oe, tmp); here for all of the above.
>
>
> Done.
>
>>>
>>> Please return whether you did any optimization and do the
>>> qsort of the operand vector only if you did sth.
>
>
> Done.
>
>
>>> Testcase with FP math missing.  Likewise with complex or vector math.
>>
>>
>> Btw, does it handle associating
>>
>>    x + 3 * x + x
>>
>> to
>>
>>    5 * x
>>
>> ?
>
>
> Added this to the testcase and verified it is working.
>
> Regression tested and bootstrapped on x86-64-linux-gnu with no new
> regressions.
>
> Is this OK for trunk?
>
> Thanks,
> Kugan
>
>
> gcc/testsuite/ChangeLog:
>
> 2016-04-24  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         PR middle-end/63586
>         * gcc.dg/tree-ssa/pr63586-2.c: New test.
>         * gcc.dg/tree-ssa/pr63586.c: New test.
>         * gcc.dg/tree-ssa/reassoc-14.c: Adjust multiplication count.
>
> gcc/ChangeLog:
>
> 2016-04-24  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>
>         PR middle-end/63586
>         * tree-ssa-reassoc.c (transform_add_to_multiply): New.
>         (reassociate_bb): Call transform_add_to_multiply.
>
>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH][PR63586] Convert x+x+x+x into 4*x
  2016-04-27 14:03             ` Richard Biener
@ 2016-05-05  1:58               ` kugan
  2016-05-09 11:42                 ` Richard Biener
  2016-05-18 14:36                 ` H.J. Lu
  0 siblings, 2 replies; 11+ messages in thread
From: kugan @ 2016-05-05  1:58 UTC (permalink / raw)
  To: Richard Biener; +Cc: Christophe Lyon, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1892 bytes --]

Hi Richard,

>
> maybe instert_stmt_after will help here, I don't think you got the insertion
> logic correct, thus insert_stmt_after (mul_stmt, def_stmt) which I think
> misses GIMPLE_NOP handling.  At least
>
> +      if (SSA_NAME_VAR (op) != NULL
>
> huh?  I suppose you could have tested SSA_NAME_IS_DEFAULT_DEF
> but just the GIMPLE_NOP def-stmt test should be enough.
>
> +         && gimple_code (def_stmt) == GIMPLE_NOP)
> +       {
> +         gsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)));
> +         stmt = gsi_stmt (gsi);
> +         gsi_insert_before (&gsi, mul_stmt, GSI_NEW_STMT);
>
> not sure if that is the best insertion point choice, it un-does all
> code-sinking done
> (and no further sinking is run after the last reassoc pass).  We do know we
> are handling all uses of op in our chain so inserting before the plus-expr
> chain root should work here (thus 'stmt' in the caller context).  I'd
> use that here instead.
> I think I'd use that unconditionally even if it works and not bother
> finding something
> more optimal.
>

I now tried using instert_stmt_after with special handling for 
GIMPLE_PHI as you described.


> Apart from this this now looks ok to me.
>
> But the testcases need some work
>
>
> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> ...
> +
> +/* { dg-final { scan-tree-dump-times "\\\*" 4 "reassoc1" } } */
>
> I would have expected 3.

We now have an additional _15 = x_1(D) * 2

   Also please check for \\\* 5 for example
> to be more specific (and change the cases so you get different constants
> for the different functions).

>
> That said, please make the scans more specific.

I have now changes the test-cases to scan more specific multiplication 
scan as you wanted.


Does this now look better?


Thanks,
Kugan

[-- Attachment #2: p1.txt --]
[-- Type: text/plain, Size: 6053 bytes --]

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
index e69de29..0dcfe32 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -fdump-tree-reassoc1" } */
+
+float f1_float (float x, float z)
+{
+    float y = x + z;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    return y;
+}
+
+float f1_float2 (float x)
+{
+    float y = x + 3 * x + x;
+    return y;
+}
+
+int f1_int (int x)
+{
+    int y = x + 4 * x + x;
+    return y;
+}
+
+/* { dg-final { scan-tree-dump-times "\\\* 8\\\.0e\\\+0" 1 "reassoc1" } } */
+/* { dg-final { scan-tree-dump-times "\\\* 5\\\.0e\\\+0" 1 "reassoc1" } } */
+/* { dg-final { scan-tree-dump-times "\\\* 6" 1 "reassoc1" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c b/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c
index e69de29..470be8c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c
@@ -0,0 +1,70 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-reassoc1" } */
+
+unsigned f1 (unsigned x, unsigned z)
+{
+    unsigned y = x + z;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    return y;
+}
+
+/* { dg-final { scan-tree-dump-times "\\\* 7" 1 "reassoc1" } } */
+
+unsigned f2 (unsigned x, unsigned z)
+{
+    unsigned y = x + z;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    return y;
+}
+
+/* { dg-final { scan-tree-dump-times "\\\* 5" 1 "reassoc1" } } */
+/* { dg-final { scan-tree-dump-times "\\\* 4" 1 "reassoc1" } } */
+
+unsigned f3 (unsigned x, unsigned z, unsigned k)
+{
+    unsigned y = x + z;
+    y = y + x;
+    y = y + z;
+    y = y + z;
+    y = y + k;
+    return y;
+}
+
+/* { dg-final { scan-tree-dump-times "\\\* 2" 1 "reassoc1" } } */
+/* { dg-final { scan-tree-dump-times "\\\* 3" 1 "reassoc1" } } */
+
+unsigned f4 (unsigned x, unsigned z, unsigned k)
+{
+    unsigned y = k + x;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    return y;
+}
+/* { dg-final { scan-tree-dump-times "\\\* 8" 1 "reassoc1" } } */
+
+unsigned f5 (unsigned x, unsigned y, unsigned z)
+{
+    return x + y + y + y + y + y \
+      + y + z + z + z + z + z + z + z + z + z;
+}
+
+/* { dg-final { scan-tree-dump-times "\\\* 6" 1 "reassoc1" } } */
+/* { dg-final { scan-tree-dump-times "\\\* 9" 1 "reassoc1" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
index 62802d1..16ebc86 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
@@ -19,6 +19,7 @@ unsigned int test2 (unsigned int x, unsigned int y, unsigned int z,
   return tmp1 + tmp2 + tmp3;
 }
 
-/* There should be one multiplication left in test1 and three in test2.  */
+/* There should be two multiplication left in test1 (inculding one generated
+   when converting addition to multiplication) and three in test2.  */
 
-/* { dg-final { scan-tree-dump-times "\\\*" 4 "reassoc1" } } */
+/* { dg-final { scan-tree-dump-times "\\\*" 5 "reassoc1" } } */
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index d23dabd..cd7e588 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -1756,6 +1756,81 @@ eliminate_redundant_comparison (enum tree_code opcode,
   return false;
 }
 
+/* Transform repeated addition of same values into multiply with
+   constant.  */
+static bool
+transform_add_to_multiply (gimple *stmt, vec<operand_entry *> *ops)
+{
+  operand_entry *oe;
+  tree op = NULL_TREE;
+  int j;
+  int i, start = -1, end = 0, count = 0;
+  vec<std::pair <int, int> > indxs = vNULL;
+  bool changed = false;
+
+  if (!INTEGRAL_TYPE_P (TREE_TYPE ((*ops)[0]->op))
+      && !flag_unsafe_math_optimizations)
+    return false;
+
+  /* Look for repeated operands.  */
+  FOR_EACH_VEC_ELT (*ops, i, oe)
+    {
+      if (start == -1)
+	{
+	  count = 1;
+	  op = oe->op;
+	  start = i;
+	}
+      else if (operand_equal_p (oe->op, op, 0))
+	{
+	  count++;
+	  end = i;
+	}
+      else
+	{
+	  if (count > 1)
+	    indxs.safe_push (std::make_pair (start, end));
+	  count = 1;
+	  op = oe->op;
+	  start = i;
+	}
+    }
+
+  if (count > 1)
+    indxs.safe_push (std::make_pair (start, end));
+
+  for (j = indxs.length () - 1; j >= 0; --j)
+    {
+      /* Convert repeated operand addition to multiplication.  */
+      start = indxs[j].first;
+      end = indxs[j].second;
+      op = (*ops)[start]->op;
+      count = end - start + 1;
+      for (i = end; i >= start; --i)
+	ops->unordered_remove (i);
+      tree tmp = make_ssa_name (TREE_TYPE (op));
+      tree cst = build_int_cst (integer_type_node, count);
+      gimple *def_stmt = SSA_NAME_DEF_STMT (op);
+      gassign *mul_stmt
+	= gimple_build_assign (tmp, MULT_EXPR,
+			       op, fold_convert (TREE_TYPE (op), cst));
+      if (gimple_code (def_stmt) == GIMPLE_NOP)
+	{
+	  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+	  gimple_set_uid (mul_stmt, gimple_uid (stmt));
+	  gsi_insert_before (&gsi, mul_stmt, GSI_NEW_STMT);
+	}
+      else
+	insert_stmt_after (mul_stmt, def_stmt);
+      gimple_set_visited (mul_stmt, true);
+      add_to_ops_vec (ops, tmp);
+      changed = true;
+    }
+
+  return changed;
+}
+
+
 /* Perform various identities and other optimizations on the list of
    operand entries, stored in OPS.  The tree code for the binary
    operation between all the operands is OPCODE.  */
@@ -5118,6 +5193,10 @@ reassociate_bb (basic_block bb)
 		    optimize_range_tests (rhs_code, &ops);
 	        }
 
+	      if (rhs_code == PLUS_EXPR
+		  && transform_add_to_multiply (stmt, &ops))
+		ops.qsort (sort_by_operand_rank);
+
 	      if (rhs_code == MULT_EXPR && !is_vector)
 	        {
 		  attempt_builtin_copysign (&ops);

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH][PR63586] Convert x+x+x+x into 4*x
  2016-05-05  1:58               ` kugan
@ 2016-05-09 11:42                 ` Richard Biener
  2016-05-18 14:36                 ` H.J. Lu
  1 sibling, 0 replies; 11+ messages in thread
From: Richard Biener @ 2016-05-09 11:42 UTC (permalink / raw)
  To: kugan; +Cc: Christophe Lyon, gcc-patches

On Thu, May 5, 2016 at 3:57 AM, kugan <kugan.vivekanandarajah@linaro.org> wrote:
> Hi Richard,
>
>>
>> maybe instert_stmt_after will help here, I don't think you got the
>> insertion
>> logic correct, thus insert_stmt_after (mul_stmt, def_stmt) which I think
>> misses GIMPLE_NOP handling.  At least
>>
>> +      if (SSA_NAME_VAR (op) != NULL
>>
>> huh?  I suppose you could have tested SSA_NAME_IS_DEFAULT_DEF
>> but just the GIMPLE_NOP def-stmt test should be enough.
>>
>> +         && gimple_code (def_stmt) == GIMPLE_NOP)
>> +       {
>> +         gsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN
>> (cfun)));
>> +         stmt = gsi_stmt (gsi);
>> +         gsi_insert_before (&gsi, mul_stmt, GSI_NEW_STMT);
>>
>> not sure if that is the best insertion point choice, it un-does all
>> code-sinking done
>> (and no further sinking is run after the last reassoc pass).  We do know
>> we
>> are handling all uses of op in our chain so inserting before the plus-expr
>> chain root should work here (thus 'stmt' in the caller context).  I'd
>> use that here instead.
>> I think I'd use that unconditionally even if it works and not bother
>> finding something
>> more optimal.
>>
>
> I now tried using instert_stmt_after with special handling for GIMPLE_PHI as
> you described.

Thanks - I'd still say doing my last suggestion is likely better, at least if
'def_stmt' is not in the same basic-block as 'stmt'.  So can you do

  if (gimple_code (def_stmt) == GIMPLE_NOP
      || gimple_bb (stmt) != gimple_bb (def_stmt))
   {
...

instead?

>
>> Apart from this this now looks ok to me.
>>
>> But the testcases need some work
>>
>>
>> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
>> @@ -0,0 +1,29 @@
>> +/* { dg-do compile } */
>> ...
>> +
>> +/* { dg-final { scan-tree-dump-times "\\\*" 4 "reassoc1" } } */
>>
>> I would have expected 3.
>
>
> We now have an additional _15 = x_1(D) * 2

Ok.

>   Also please check for \\\* 5 for example
>>
>> to be more specific (and change the cases so you get different constants
>> for the different functions).
>
>
>>
>> That said, please make the scans more specific.
>
>
> I have now changes the test-cases to scan more specific multiplication scan
> as you wanted.
>
>
> Does this now look better?

Yes.  Ok with the above suggested change.

Thanks,
Richard.

>
> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH][PR63586] Convert x+x+x+x into 4*x
  2016-05-05  1:58               ` kugan
  2016-05-09 11:42                 ` Richard Biener
@ 2016-05-18 14:36                 ` H.J. Lu
  1 sibling, 0 replies; 11+ messages in thread
From: H.J. Lu @ 2016-05-18 14:36 UTC (permalink / raw)
  To: kugan; +Cc: Richard Biener, Christophe Lyon, gcc-patches

On Wed, May 4, 2016 at 6:57 PM, kugan <kugan.vivekanandarajah@linaro.org> wrote:
> Hi Richard,
>
>>
>> maybe instert_stmt_after will help here, I don't think you got the
>> insertion
>> logic correct, thus insert_stmt_after (mul_stmt, def_stmt) which I think
>> misses GIMPLE_NOP handling.  At least
>>
>> +      if (SSA_NAME_VAR (op) != NULL
>>
>> huh?  I suppose you could have tested SSA_NAME_IS_DEFAULT_DEF
>> but just the GIMPLE_NOP def-stmt test should be enough.
>>
>> +         && gimple_code (def_stmt) == GIMPLE_NOP)
>> +       {
>> +         gsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN
>> (cfun)));
>> +         stmt = gsi_stmt (gsi);
>> +         gsi_insert_before (&gsi, mul_stmt, GSI_NEW_STMT);
>>
>> not sure if that is the best insertion point choice, it un-does all
>> code-sinking done
>> (and no further sinking is run after the last reassoc pass).  We do know
>> we
>> are handling all uses of op in our chain so inserting before the plus-expr
>> chain root should work here (thus 'stmt' in the caller context).  I'd
>> use that here instead.
>> I think I'd use that unconditionally even if it works and not bother
>> finding something
>> more optimal.
>>
>
> I now tried using instert_stmt_after with special handling for GIMPLE_PHI as
> you described.
>
>
>> Apart from this this now looks ok to me.
>>
>> But the testcases need some work
>>
>>
>> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
>> @@ -0,0 +1,29 @@
>> +/* { dg-do compile } */
>> ...
>> +
>> +/* { dg-final { scan-tree-dump-times "\\\*" 4 "reassoc1" } } */
>>
>> I would have expected 3.
>
>
> We now have an additional _15 = x_1(D) * 2
>
>   Also please check for \\\* 5 for example
>>
>> to be more specific (and change the cases so you get different constants
>> for the different functions).
>
>
>>
>> That said, please make the scans more specific.
>
>
> I have now changes the test-cases to scan more specific multiplication scan
> as you wanted.
>
>
> Does this now look better?

It caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71172


-- 
H.J.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-05-18 14:36 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-26  3:03 [RFC][PATCH][PR63586] Convert x+x+x+x into 4*x kugan
2016-02-26 11:07 ` Richard Biener
2016-02-29  4:28   ` kugan
2016-03-02 14:28     ` Christophe Lyon
2016-04-19 11:56       ` Richard Biener
2016-04-19 11:57         ` Richard Biener
2016-04-23 22:03           ` kugan
2016-04-27 14:03             ` Richard Biener
2016-05-05  1:58               ` kugan
2016-05-09 11:42                 ` Richard Biener
2016-05-18 14:36                 ` H.J. Lu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).