public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Properly valueize values we value-number to
@ 2015-02-17 14:03 Richard Biener
  2015-02-17 14:58 ` Richard Biener
  2015-02-23 22:44 ` Jeff Law
  0 siblings, 2 replies; 9+ messages in thread
From: Richard Biener @ 2015-02-17 14:03 UTC (permalink / raw)
  To: gcc-patches; +Cc: law


This is something I noticed some time ago and that I remembered when
you added that looping SSA_NAME_VALUE to simplify_control_stmt_condition.
Currently DOM doesn't make sure that when setting
SSA_NAME_VALUE (x) = y that SSA_NAME_VALUE (y) == y, thus you could
get SSA_NAME_VALUE forming a chain until you reach the final value.

Thus the following patch which fixes all occurances and removes the
looping from simplify_control_stmt_condition.  Did you have any
testcase when you added that looping?

Note that this is purely by code inspection and I don't have any
testcase where a SSA_NAME_VALUE chain causes missed optimizations
(you'd have to disable a lot of other optimizations before dom1
to be able to create a reasonably simple one).

Anyway - the tree-ssa-dom.c part bootstrapped and tested ok on
x86_64-unknown-linux-gnu, the tree-ssa-threadedge.c part bootstrapped
ok ontop of that and testing is still in progress.

Ok?

Thanks,
Richard.

2015-02-17  Richard Biener  <rguenther@suse.de>

	* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
	(record_const_or_copy_1): Valueize y.
	(record_equivalences_from_stmt): Valueize rhs.
	* tree-ssa-threadedge.c (simplify_control_stmt_condition):
	Remove repeated valueization.

Index: gcc/tree-ssa-dom.c
===================================================================
--- gcc/tree-ssa-dom.c	(revision 220751)
+++ gcc/tree-ssa-dom.c	(working copy)
@@ -1223,6 +1223,13 @@ record_equivalences_from_phis (basic_blo
 	  if (lhs == t)
 	    continue;
 
+	  /* Valueize t.  */
+	  if (TREE_CODE (t) == SSA_NAME)
+	    {
+	      tree tmp = SSA_NAME_VALUE (t);
+	      t = tmp ? tmp : t;
+	    }
+
 	  /* If we have not processed an alternative yet, then set
 	     RHS to this alternative.  */
 	  if (rhs == NULL)
@@ -1566,6 +1573,13 @@ record_conditions (struct edge_info *edg
 static void
 record_const_or_copy_1 (tree x, tree y, tree prev_x)
 {
+  /* Valueize y.  */
+  if (TREE_CODE (y) == SSA_NAME)
+    {
+      tree tmp = SSA_NAME_VALUE (y);
+      y = tmp ? tmp : y;
+    }
+
   set_ssa_name_value (x, y);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -2184,18 +2198,25 @@ record_equivalences_from_stmt (gimple st
       if (may_optimize_p
 	  && (TREE_CODE (rhs) == SSA_NAME
 	      || is_gimple_min_invariant (rhs)))
-      {
-	if (dump_file && (dump_flags & TDF_DETAILS))
-	  {
-	    fprintf (dump_file, "==== ASGN ");
-	    print_generic_expr (dump_file, lhs, 0);
-	    fprintf (dump_file, " = ");
-	    print_generic_expr (dump_file, rhs, 0);
-	    fprintf (dump_file, "\n");
-	  }
+	{
+	  /* Valueize rhs.  */
+	  if (TREE_CODE (rhs) == SSA_NAME)
+	    {
+	      tree tmp = SSA_NAME_VALUE (rhs);
+	      rhs = tmp ? tmp : rhs;
+	    }
 
-	set_ssa_name_value (lhs, rhs);
-      }
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    {
+	      fprintf (dump_file, "==== ASGN ");
+	      print_generic_expr (dump_file, lhs, 0);
+	      fprintf (dump_file, " = ");
+	      print_generic_expr (dump_file, rhs, 0);
+	      fprintf (dump_file, "\n");
+	    }
+
+	  set_ssa_name_value (lhs, rhs);
+	}
     }
 
   /* Make sure we can propagate &x + CST.  */
Index: gcc/tree-ssa-threadedge.c
===================================================================
--- gcc/tree-ssa-threadedge.c	(revision 220751)
+++ gcc/tree-ssa-threadedge.c	(working copy)
@@ -591,29 +591,13 @@ simplify_control_stmt_condition (edge e,
       cond_code = gimple_cond_code (stmt);
 
       /* Get the current value of both operands.  */
-      if (TREE_CODE (op0) == SSA_NAME)
-	{
-	  for (int i = 0; i < 2; i++)
-	    {
-	      if (TREE_CODE (op0) == SSA_NAME
-		  && SSA_NAME_VALUE (op0))
-		op0 = SSA_NAME_VALUE (op0);
-	      else
-		break;
-	    }
-	}
-
-      if (TREE_CODE (op1) == SSA_NAME)
-	{
-	  for (int i = 0; i < 2; i++)
-	    {
-	      if (TREE_CODE (op1) == SSA_NAME
-		  && SSA_NAME_VALUE (op1))
-		op1 = SSA_NAME_VALUE (op1);
-	      else
-		break;
-	    }
-	}
+      if (TREE_CODE (op0) == SSA_NAME
+	  && SSA_NAME_VALUE (op0))
+	op0 = SSA_NAME_VALUE (op0);
+
+      if (TREE_CODE (op1) == SSA_NAME
+	  && SSA_NAME_VALUE (op1))
+	op1 = SSA_NAME_VALUE (op1);
 
       if (handle_dominating_asserts)
 	{
@@ -682,22 +666,11 @@ simplify_control_stmt_condition (edge e,
       tree original_lhs = cond;
       cached_lhs = cond;
 
-      /* Get the variable's current value from the equivalence chains.
-
-	 It is possible to get loops in the SSA_NAME_VALUE chains
-	 (consider threading the backedge of a loop where we have
-	 a loop invariant SSA_NAME used in the condition.  */
-      if (cached_lhs)
-	{
-	  for (int i = 0; i < 2; i++)
-	    {
-	      if (TREE_CODE (cached_lhs) == SSA_NAME
-		  && SSA_NAME_VALUE (cached_lhs))
-		cached_lhs = SSA_NAME_VALUE (cached_lhs);
-	      else
-		break;
-	    }
-	}
+      /* Get the variable's current value from the equivalence chains.  */
+      if (cached_lhs
+	  && TREE_CODE (cached_lhs) == SSA_NAME
+	  && SSA_NAME_VALUE (cached_lhs))
+	cached_lhs = SSA_NAME_VALUE (cached_lhs);
 
       /* If we're dominated by a suitable ASSERT_EXPR, then
 	 update CACHED_LHS appropriately.  */

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Properly valueize values we value-number to
  2015-02-17 14:03 [PATCH] Properly valueize values we value-number to Richard Biener
@ 2015-02-17 14:58 ` Richard Biener
  2015-04-24 19:34   ` Jeff Law
  2015-02-23 22:44 ` Jeff Law
  1 sibling, 1 reply; 9+ messages in thread
From: Richard Biener @ 2015-02-17 14:58 UTC (permalink / raw)
  To: gcc-patches; +Cc: law

On Tue, 17 Feb 2015, Richard Biener wrote:

> 
> This is something I noticed some time ago and that I remembered when
> you added that looping SSA_NAME_VALUE to simplify_control_stmt_condition.
> Currently DOM doesn't make sure that when setting
> SSA_NAME_VALUE (x) = y that SSA_NAME_VALUE (y) == y, thus you could
> get SSA_NAME_VALUE forming a chain until you reach the final value.
> 
> Thus the following patch which fixes all occurances and removes the
> looping from simplify_control_stmt_condition.  Did you have any
> testcase when you added that looping?
> 
> Note that this is purely by code inspection and I don't have any
> testcase where a SSA_NAME_VALUE chain causes missed optimizations
> (you'd have to disable a lot of other optimizations before dom1
> to be able to create a reasonably simple one).
> 
> Anyway - the tree-ssa-dom.c part bootstrapped and tested ok on
> x86_64-unknown-linux-gnu, the tree-ssa-threadedge.c part bootstrapped
> ok ontop of that and testing is still in progress.

On a closer look the record_const_or_copy_1 hunk is redundant
(record_equality is really a bit obfuscated...).

And record_equality is where the SSA_NAME_VALUEs might end up
forming chains?  At least because we might record a SSA_NAME_VALUE
for sth that might be the target of a SSA_NAME_VALUE, as in

 a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
 if (b_2 == i_3(D))
   ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)

thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
can't easily avoid that because we don't keep track of a
reverse mapping (nor would we want to update that).

Btw, in record_equality, the == part of the <= check for the loop
depth will just randomly swap x and y while we should prefer
IL canonical order.  If we can't rely on the input being in IL
canonical order we should prepend the whole function with

 if (tree_swap_operands_p (x, y, false))
   std::swap (x, y);

and change <= to < for the loop-depth check.

For PR62217 it is that loop depth check which causes the equivalence
to be recorded that ends up being harmful (well, harmful is the
actual propagation of course).  Btw, we already inhibit propagation
into IV increments (cprop_operand) - so it looks like we may
want to inhibit BIV replacement completely instead?

Index: gcc/tree-ssa-dom.c
===================================================================
--- gcc/tree-ssa-dom.c  (revision 220755)
+++ gcc/tree-ssa-dom.c  (working copy)
@@ -2291,11 +2291,16 @@ cprop_operand (gimple stmt, use_operand_
       if (!may_propagate_copy (op, val))
        return;
 
-      /* Do not propagate copies into simple IV increment statements.
-         See PR23821 for how this can disturb IV analysis.  */
-      if (TREE_CODE (val) != INTEGER_CST
-         && simple_iv_increment_p (stmt))
-       return;
+      /* Do not propagate copies into BIVs.
+         See PR23821 and PR62217 for how this can disturb IV and
+        number of iteration analysis.  */
+      if (TREE_CODE (val) != INTEGER_CST)
+       {
+         gimple def = SSA_NAME_DEF_STMT (op);
+         if (gimple_code (def) == GIMPLE_PHI
+             && gimple_bb (def)->loop_father->header == gimple_bb (def))
+           return;
+       }
 
       /* Dump details.  */
       if (dump_file && (dump_flags & TDF_DETAILS))

So it would just extend an existing heuristic.

Richard.

> Ok?
> 
> Thanks,
> Richard.
> 
> 2015-02-17  Richard Biener  <rguenther@suse.de>
> 
> 	* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
> 	(record_const_or_copy_1): Valueize y.
> 	(record_equivalences_from_stmt): Valueize rhs.
> 	* tree-ssa-threadedge.c (simplify_control_stmt_condition):
> 	Remove repeated valueization.
> 
> Index: gcc/tree-ssa-dom.c
> ===================================================================
> --- gcc/tree-ssa-dom.c	(revision 220751)
> +++ gcc/tree-ssa-dom.c	(working copy)
> @@ -1223,6 +1223,13 @@ record_equivalences_from_phis (basic_blo
>  	  if (lhs == t)
>  	    continue;
>  
> +	  /* Valueize t.  */
> +	  if (TREE_CODE (t) == SSA_NAME)
> +	    {
> +	      tree tmp = SSA_NAME_VALUE (t);
> +	      t = tmp ? tmp : t;
> +	    }
> +
>  	  /* If we have not processed an alternative yet, then set
>  	     RHS to this alternative.  */
>  	  if (rhs == NULL)
> @@ -1566,6 +1573,13 @@ record_conditions (struct edge_info *edg
>  static void
>  record_const_or_copy_1 (tree x, tree y, tree prev_x)
>  {
> +  /* Valueize y.  */
> +  if (TREE_CODE (y) == SSA_NAME)
> +    {
> +      tree tmp = SSA_NAME_VALUE (y);
> +      y = tmp ? tmp : y;
> +    }
> +
>    set_ssa_name_value (x, y);
>  
>    if (dump_file && (dump_flags & TDF_DETAILS))
> @@ -2184,18 +2198,25 @@ record_equivalences_from_stmt (gimple st
>        if (may_optimize_p
>  	  && (TREE_CODE (rhs) == SSA_NAME
>  	      || is_gimple_min_invariant (rhs)))
> -      {
> -	if (dump_file && (dump_flags & TDF_DETAILS))
> -	  {
> -	    fprintf (dump_file, "==== ASGN ");
> -	    print_generic_expr (dump_file, lhs, 0);
> -	    fprintf (dump_file, " = ");
> -	    print_generic_expr (dump_file, rhs, 0);
> -	    fprintf (dump_file, "\n");
> -	  }
> +	{
> +	  /* Valueize rhs.  */
> +	  if (TREE_CODE (rhs) == SSA_NAME)
> +	    {
> +	      tree tmp = SSA_NAME_VALUE (rhs);
> +	      rhs = tmp ? tmp : rhs;
> +	    }
>  
> -	set_ssa_name_value (lhs, rhs);
> -      }
> +	  if (dump_file && (dump_flags & TDF_DETAILS))
> +	    {
> +	      fprintf (dump_file, "==== ASGN ");
> +	      print_generic_expr (dump_file, lhs, 0);
> +	      fprintf (dump_file, " = ");
> +	      print_generic_expr (dump_file, rhs, 0);
> +	      fprintf (dump_file, "\n");
> +	    }
> +
> +	  set_ssa_name_value (lhs, rhs);
> +	}
>      }
>  
>    /* Make sure we can propagate &x + CST.  */
> Index: gcc/tree-ssa-threadedge.c
> ===================================================================
> --- gcc/tree-ssa-threadedge.c	(revision 220751)
> +++ gcc/tree-ssa-threadedge.c	(working copy)
> @@ -591,29 +591,13 @@ simplify_control_stmt_condition (edge e,
>        cond_code = gimple_cond_code (stmt);
>  
>        /* Get the current value of both operands.  */
> -      if (TREE_CODE (op0) == SSA_NAME)
> -	{
> -	  for (int i = 0; i < 2; i++)
> -	    {
> -	      if (TREE_CODE (op0) == SSA_NAME
> -		  && SSA_NAME_VALUE (op0))
> -		op0 = SSA_NAME_VALUE (op0);
> -	      else
> -		break;
> -	    }
> -	}
> -
> -      if (TREE_CODE (op1) == SSA_NAME)
> -	{
> -	  for (int i = 0; i < 2; i++)
> -	    {
> -	      if (TREE_CODE (op1) == SSA_NAME
> -		  && SSA_NAME_VALUE (op1))
> -		op1 = SSA_NAME_VALUE (op1);
> -	      else
> -		break;
> -	    }
> -	}
> +      if (TREE_CODE (op0) == SSA_NAME
> +	  && SSA_NAME_VALUE (op0))
> +	op0 = SSA_NAME_VALUE (op0);
> +
> +      if (TREE_CODE (op1) == SSA_NAME
> +	  && SSA_NAME_VALUE (op1))
> +	op1 = SSA_NAME_VALUE (op1);
>  
>        if (handle_dominating_asserts)
>  	{
> @@ -682,22 +666,11 @@ simplify_control_stmt_condition (edge e,
>        tree original_lhs = cond;
>        cached_lhs = cond;
>  
> -      /* Get the variable's current value from the equivalence chains.
> -
> -	 It is possible to get loops in the SSA_NAME_VALUE chains
> -	 (consider threading the backedge of a loop where we have
> -	 a loop invariant SSA_NAME used in the condition.  */
> -      if (cached_lhs)
> -	{
> -	  for (int i = 0; i < 2; i++)
> -	    {
> -	      if (TREE_CODE (cached_lhs) == SSA_NAME
> -		  && SSA_NAME_VALUE (cached_lhs))
> -		cached_lhs = SSA_NAME_VALUE (cached_lhs);
> -	      else
> -		break;
> -	    }
> -	}
> +      /* Get the variable's current value from the equivalence chains.  */
> +      if (cached_lhs
> +	  && TREE_CODE (cached_lhs) == SSA_NAME
> +	  && SSA_NAME_VALUE (cached_lhs))
> +	cached_lhs = SSA_NAME_VALUE (cached_lhs);
>  
>        /* If we're dominated by a suitable ASSERT_EXPR, then
>  	 update CACHED_LHS appropriately.  */
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Properly valueize values we value-number to
  2015-02-17 14:03 [PATCH] Properly valueize values we value-number to Richard Biener
  2015-02-17 14:58 ` Richard Biener
@ 2015-02-23 22:44 ` Jeff Law
  1 sibling, 0 replies; 9+ messages in thread
From: Jeff Law @ 2015-02-23 22:44 UTC (permalink / raw)
  To: Richard Biener, gcc-patches

On 02/17/15 07:03, Richard Biener wrote:
>
> This is something I noticed some time ago and that I remembered when
> you added that looping SSA_NAME_VALUE to simplify_control_stmt_condition.
> Currently DOM doesn't make sure that when setting
> SSA_NAME_VALUE (x) = y that SSA_NAME_VALUE (y) == y, thus you could
> get SSA_NAME_VALUE forming a chain until you reach the final value.
>
> Thus the following patch which fixes all occurances and removes the
> looping from simplify_control_stmt_condition.  Did you have any
> testcase when you added that looping?
pr61607, which probably looks familiar :-)





>
> Note that this is purely by code inspection and I don't have any
> testcase where a SSA_NAME_VALUE chain causes missed optimizations
> (you'd have to disable a lot of other optimizations before dom1
> to be able to create a reasonably simple one).
>
> Anyway - the tree-ssa-dom.c part bootstrapped and tested ok on
> x86_64-unknown-linux-gnu, the tree-ssa-threadedge.c part bootstrapped
> ok ontop of that and testing is still in progress.
>
> Ok?
>
> Thanks,
> Richard.
>
> 2015-02-17  Richard Biener  <rguenther@suse.de>
>
> 	* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
> 	(record_const_or_copy_1): Valueize y.
> 	(record_equivalences_from_stmt): Valueize rhs.
> 	* tree-ssa-threadedge.c (simplify_control_stmt_condition):
> 	Remove repeated valueization.
Given the regression was fixed elsewhere, let's table this until gcc-6.

I want to rip out large hunks of this code, at least on the threading 
side and replace it with a backwards walk similar to what Sebastian did 
for the FSM optimization.  As a nice side effect, that ought to 
completely disentangle DOM and the threader.

At the same time I want to see what effect we'd have by disabling these 
context sensitive propagations in DOM.  They add a lot of hair and I'm 
just not sure they're worth it.  If we really need them, perhaps we can 
get away with something simpler, outside of DOM.

jeff

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Properly valueize values we value-number to
  2015-02-17 14:58 ` Richard Biener
@ 2015-04-24 19:34   ` Jeff Law
  2015-04-27  8:32     ` Richard Biener
  0 siblings, 1 reply; 9+ messages in thread
From: Jeff Law @ 2015-04-24 19:34 UTC (permalink / raw)
  To: Richard Biener, gcc-patches

On 02/17/2015 07:58 AM, Richard Biener wrote:
[ Restarting a old thread... ]

> On a closer look the record_const_or_copy_1 hunk is redundant
> (record_equality is really a bit obfuscated...).
Agreed.  I'm not entirely sure how it got to this point.

> And record_equality is where the SSA_NAME_VALUEs might end up
> forming chains?  At least because we might record a SSA_NAME_VALUE
> for sth that might be the target of a SSA_NAME_VALUE, as in
>
>   a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
>   if (b_2 == i_3(D))
>     ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)
>
> thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
> SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
> can't easily avoid that because we don't keep track of a
> reverse mapping (nor would we want to update that).
Right.  In general, the fact that we're in SSA form means that we ought 
not get loops in the SSA_NAME_VALUE chain since there's a single static 
assignment and we'll visit it once.

That nice model breaks down in two ways.  First we derive equivalences 
from equality conditions like you've shown above.  Second, when 
threading we can thread through a loop latch and start reprocessing a 
block.  The interaction between those two can result in loops in the 
value chain.

What I'm hoping to do in GCC6 is eliminate all this stuff with a 
threader that is independent of the dominator walk (and optimizer). 
Instead of walking forward through the dominator tree recording key 
nuggets, then removing those nuggets as we pop back up the tree, we 
instead we start with the conditional and walk up the use-def chains, 
simplifying as we go, with the goal of simplifying to a constant, of course.

You can see the beginnings of that with the FSM bits from Sebastian.

Obviously until those bits are in place, we should try to clean up any 
warts in the current implementation.

>
> Btw, in record_equality, the == part of the <= check for the loop
> depth will just randomly swap x and y while we should prefer
> IL canonical order.  If we can't rely on the input being in IL
> canonical order we should prepend the whole function with
>
>   if (tree_swap_operands_p (x, y, false))
>     std::swap (x, y);
>
> and change <= to < for the loop-depth check.
Agreed.  Makes perfect sense.

jeff

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Properly valueize values we value-number to
  2015-04-24 19:34   ` Jeff Law
@ 2015-04-27  8:32     ` Richard Biener
  2015-04-27 12:47       ` Richard Biener
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Biener @ 2015-04-27  8:32 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Fri, 24 Apr 2015, Jeff Law wrote:

> On 02/17/2015 07:58 AM, Richard Biener wrote:
> [ Restarting a old thread... ]
> 
> > On a closer look the record_const_or_copy_1 hunk is redundant
> > (record_equality is really a bit obfuscated...).
> Agreed.  I'm not entirely sure how it got to this point.
> 
> > And record_equality is where the SSA_NAME_VALUEs might end up
> > forming chains?  At least because we might record a SSA_NAME_VALUE
> > for sth that might be the target of a SSA_NAME_VALUE, as in
> > 
> >   a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
> >   if (b_2 == i_3(D))
> >     ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)
> > 
> > thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
> > SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
> > can't easily avoid that because we don't keep track of a
> > reverse mapping (nor would we want to update that).
> Right.  In general, the fact that we're in SSA form means that we ought not
> get loops in the SSA_NAME_VALUE chain since there's a single static assignment
> and we'll visit it once.
> 
> That nice model breaks down in two ways.  First we derive equivalences from
> equality conditions like you've shown above.  Second, when threading we can
> thread through a loop latch and start reprocessing a block.  The interaction
> between those two can result in loops in the value chain.
> 
> What I'm hoping to do in GCC6 is eliminate all this stuff with a threader that
> is independent of the dominator walk (and optimizer). Instead of walking
> forward through the dominator tree recording key nuggets, then removing those
> nuggets as we pop back up the tree, we instead we start with the conditional
> and walk up the use-def chains, simplifying as we go, with the goal of
> simplifying to a constant, of course.
> 
> You can see the beginnings of that with the FSM bits from Sebastian.
> 
> Obviously until those bits are in place, we should try to clean up any warts
> in the current implementation.
> 
> > 
> > Btw, in record_equality, the == part of the <= check for the loop
> > depth will just randomly swap x and y while we should prefer
> > IL canonical order.  If we can't rely on the input being in IL
> > canonical order we should prepend the whole function with
> > 
> >   if (tree_swap_operands_p (x, y, false))
> >     std::swap (x, y);
> > 
> > and change <= to < for the loop-depth check.
> Agreed.  Makes perfect sense.

I'm now re-bootstrapping and testing the following.

Richard.

2015-04-27  Richard Biener  <rguenther@suse.de>

	* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
	(record_equivalences_from_stmt): Valueize rhs.
	(record_equality): Canonicalize x and y order via
	tree_swap_operands_p.  Do not swap operands for same loop depth.

Index: gcc/tree-ssa-dom.c
===================================================================
*** gcc/tree-ssa-dom.c	(revision 222360)
--- gcc/tree-ssa-dom.c	(working copy)
*************** record_equivalences_from_phis (basic_blo
*** 1519,1524 ****
--- 1519,1531 ----
  	  if (lhs == t)
  	    continue;
  
+ 	  /* Valueize t.  */
+ 	  if (TREE_CODE (t) == SSA_NAME)
+ 	    {
+ 	      tree tmp = SSA_NAME_VALUE (t);
+ 	      t = tmp ? tmp : t;
+ 	    }
+ 
  	  /* If we have not processed an alternative yet, then set
  	     RHS to this alternative.  */
  	  if (rhs == NULL)
*************** record_equality (tree x, tree y)
*** 1752,1757 ****
--- 1759,1767 ----
  {
    tree prev_x = NULL, prev_y = NULL;
  
+   if (tree_swap_operands_p (x, y, false))
+     std::swap (x, y);
+ 
    if (TREE_CODE (x) == SSA_NAME)
      prev_x = SSA_NAME_VALUE (x);
    if (TREE_CODE (y) == SSA_NAME)
*************** record_equality (tree x, tree y)
*** 1766,1772 ****
    else if (is_gimple_min_invariant (x)
  	   /* ???  When threading over backedges the following is important
  	      for correctness.  See PR61757.  */
! 	   || (loop_depth_of_name (x) <= loop_depth_of_name (y)))
      prev_x = x, x = y, y = prev_x, prev_x = prev_y;
    else if (prev_x && is_gimple_min_invariant (prev_x))
      x = y, y = prev_x, prev_x = prev_y;
--- 1776,1782 ----
    else if (is_gimple_min_invariant (x)
  	   /* ???  When threading over backedges the following is important
  	      for correctness.  See PR61757.  */
! 	   || (loop_depth_of_name (x) < loop_depth_of_name (y)))
      prev_x = x, x = y, y = prev_x, prev_x = prev_y;
    else if (prev_x && is_gimple_min_invariant (prev_x))
      x = y, y = prev_x, prev_x = prev_y;
*************** record_equivalences_from_stmt (gimple st
*** 2128,2145 ****
        if (may_optimize_p
  	  && (TREE_CODE (rhs) == SSA_NAME
  	      || is_gimple_min_invariant (rhs)))
!       {
! 	if (dump_file && (dump_flags & TDF_DETAILS))
! 	  {
! 	    fprintf (dump_file, "==== ASGN ");
! 	    print_generic_expr (dump_file, lhs, 0);
! 	    fprintf (dump_file, " = ");
! 	    print_generic_expr (dump_file, rhs, 0);
! 	    fprintf (dump_file, "\n");
! 	  }
  
! 	set_ssa_name_value (lhs, rhs);
!       }
      }
  
    /* Make sure we can propagate &x + CST.  */
--- 2138,2162 ----
        if (may_optimize_p
  	  && (TREE_CODE (rhs) == SSA_NAME
  	      || is_gimple_min_invariant (rhs)))
! 	{
! 	  /* Valueize rhs.  */
! 	  if (TREE_CODE (rhs) == SSA_NAME)
! 	    {
! 	      tree tmp = SSA_NAME_VALUE (rhs);
! 	      rhs = tmp ? tmp : rhs;
! 	    }
  
! 	  if (dump_file && (dump_flags & TDF_DETAILS))
! 	    {
! 	      fprintf (dump_file, "==== ASGN ");
! 	      print_generic_expr (dump_file, lhs, 0);
! 	      fprintf (dump_file, " = ");
! 	      print_generic_expr (dump_file, rhs, 0);
! 	      fprintf (dump_file, "\n");
! 	    }
! 
! 	  set_ssa_name_value (lhs, rhs);
! 	}
      }
  
    /* Make sure we can propagate &x + CST.  */

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Properly valueize values we value-number to
  2015-04-27  8:32     ` Richard Biener
@ 2015-04-27 12:47       ` Richard Biener
  2015-04-27 16:24         ` Jeff Law
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Biener @ 2015-04-27 12:47 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Mon, 27 Apr 2015, Richard Biener wrote:

> On Fri, 24 Apr 2015, Jeff Law wrote:
> 
> > On 02/17/2015 07:58 AM, Richard Biener wrote:
> > [ Restarting a old thread... ]
> > 
> > > On a closer look the record_const_or_copy_1 hunk is redundant
> > > (record_equality is really a bit obfuscated...).
> > Agreed.  I'm not entirely sure how it got to this point.
> > 
> > > And record_equality is where the SSA_NAME_VALUEs might end up
> > > forming chains?  At least because we might record a SSA_NAME_VALUE
> > > for sth that might be the target of a SSA_NAME_VALUE, as in
> > > 
> > >   a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
> > >   if (b_2 == i_3(D))
> > >     ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)
> > > 
> > > thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
> > > SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
> > > can't easily avoid that because we don't keep track of a
> > > reverse mapping (nor would we want to update that).
> > Right.  In general, the fact that we're in SSA form means that we ought not
> > get loops in the SSA_NAME_VALUE chain since there's a single static assignment
> > and we'll visit it once.
> > 
> > That nice model breaks down in two ways.  First we derive equivalences from
> > equality conditions like you've shown above.  Second, when threading we can
> > thread through a loop latch and start reprocessing a block.  The interaction
> > between those two can result in loops in the value chain.
> > 
> > What I'm hoping to do in GCC6 is eliminate all this stuff with a threader that
> > is independent of the dominator walk (and optimizer). Instead of walking
> > forward through the dominator tree recording key nuggets, then removing those
> > nuggets as we pop back up the tree, we instead we start with the conditional
> > and walk up the use-def chains, simplifying as we go, with the goal of
> > simplifying to a constant, of course.
> > 
> > You can see the beginnings of that with the FSM bits from Sebastian.
> > 
> > Obviously until those bits are in place, we should try to clean up any warts
> > in the current implementation.
> > 
> > > 
> > > Btw, in record_equality, the == part of the <= check for the loop
> > > depth will just randomly swap x and y while we should prefer
> > > IL canonical order.  If we can't rely on the input being in IL
> > > canonical order we should prepend the whole function with
> > > 
> > >   if (tree_swap_operands_p (x, y, false))
> > >     std::swap (x, y);
> > > 
> > > and change <= to < for the loop-depth check.
> > Agreed.  Makes perfect sense.
> 
> I'm now re-bootstrapping and testing the following.

Committed as follows, with XFAILing and re-opening PR65217.

Richard.

2015-04-27  Richard Biener  <rguenther@suse.de>

	* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
	(record_equivalences_from_stmt): Valueize rhs.
	(record_equality): Canonicalize x and y order via
	tree_swap_operands_p.  Do not swap operands for same loop depth.

	* gcc.target/i386/pr65217.c: XFAIL.

Index: gcc/tree-ssa-dom.c
===================================================================
*** gcc/tree-ssa-dom.c	(revision 222360)
--- gcc/tree-ssa-dom.c	(working copy)
*************** record_equivalences_from_phis (basic_blo
*** 1519,1524 ****
--- 1519,1531 ----
  	  if (lhs == t)
  	    continue;
  
+ 	  /* Valueize t.  */
+ 	  if (TREE_CODE (t) == SSA_NAME)
+ 	    {
+ 	      tree tmp = SSA_NAME_VALUE (t);
+ 	      t = tmp ? tmp : t;
+ 	    }
+ 
  	  /* If we have not processed an alternative yet, then set
  	     RHS to this alternative.  */
  	  if (rhs == NULL)
*************** record_equality (tree x, tree y)
*** 1752,1757 ****
--- 1759,1767 ----
  {
    tree prev_x = NULL, prev_y = NULL;
  
+   if (tree_swap_operands_p (x, y, false))
+     std::swap (x, y);
+ 
    if (TREE_CODE (x) == SSA_NAME)
      prev_x = SSA_NAME_VALUE (x);
    if (TREE_CODE (y) == SSA_NAME)
*************** record_equality (tree x, tree y)
*** 1766,1772 ****
    else if (is_gimple_min_invariant (x)
  	   /* ???  When threading over backedges the following is important
  	      for correctness.  See PR61757.  */
! 	   || (loop_depth_of_name (x) <= loop_depth_of_name (y)))
      prev_x = x, x = y, y = prev_x, prev_x = prev_y;
    else if (prev_x && is_gimple_min_invariant (prev_x))
      x = y, y = prev_x, prev_x = prev_y;
--- 1776,1782 ----
    else if (is_gimple_min_invariant (x)
  	   /* ???  When threading over backedges the following is important
  	      for correctness.  See PR61757.  */
! 	   || (loop_depth_of_name (x) < loop_depth_of_name (y)))
      prev_x = x, x = y, y = prev_x, prev_x = prev_y;
    else if (prev_x && is_gimple_min_invariant (prev_x))
      x = y, y = prev_x, prev_x = prev_y;
*************** record_equivalences_from_stmt (gimple st
*** 2128,2145 ****
        if (may_optimize_p
  	  && (TREE_CODE (rhs) == SSA_NAME
  	      || is_gimple_min_invariant (rhs)))
!       {
! 	if (dump_file && (dump_flags & TDF_DETAILS))
! 	  {
! 	    fprintf (dump_file, "==== ASGN ");
! 	    print_generic_expr (dump_file, lhs, 0);
! 	    fprintf (dump_file, " = ");
! 	    print_generic_expr (dump_file, rhs, 0);
! 	    fprintf (dump_file, "\n");
! 	  }
  
! 	set_ssa_name_value (lhs, rhs);
!       }
      }
  
    /* Make sure we can propagate &x + CST.  */
--- 2138,2162 ----
        if (may_optimize_p
  	  && (TREE_CODE (rhs) == SSA_NAME
  	      || is_gimple_min_invariant (rhs)))
! 	{
! 	  /* Valueize rhs.  */
! 	  if (TREE_CODE (rhs) == SSA_NAME)
! 	    {
! 	      tree tmp = SSA_NAME_VALUE (rhs);
! 	      rhs = tmp ? tmp : rhs;
! 	    }
  
! 	  if (dump_file && (dump_flags & TDF_DETAILS))
! 	    {
! 	      fprintf (dump_file, "==== ASGN ");
! 	      print_generic_expr (dump_file, lhs, 0);
! 	      fprintf (dump_file, " = ");
! 	      print_generic_expr (dump_file, rhs, 0);
! 	      fprintf (dump_file, "\n");
! 	    }
! 
! 	  set_ssa_name_value (lhs, rhs);
! 	}
      }
  
    /* Make sure we can propagate &x + CST.  */

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Properly valueize values we value-number to
  2015-04-27 12:47       ` Richard Biener
@ 2015-04-27 16:24         ` Jeff Law
  2015-04-27 18:29           ` Richard Biener
  0 siblings, 1 reply; 9+ messages in thread
From: Jeff Law @ 2015-04-27 16:24 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On 04/27/2015 06:46 AM, Richard Biener wrote:
> On Mon, 27 Apr 2015, Richard Biener wrote:
>
>> On Fri, 24 Apr 2015, Jeff Law wrote:
>>
>>> On 02/17/2015 07:58 AM, Richard Biener wrote:
>>> [ Restarting a old thread... ]
>>>
>>>> On a closer look the record_const_or_copy_1 hunk is redundant
>>>> (record_equality is really a bit obfuscated...).
>>> Agreed.  I'm not entirely sure how it got to this point.
>>>
>>>> And record_equality is where the SSA_NAME_VALUEs might end up
>>>> forming chains?  At least because we might record a SSA_NAME_VALUE
>>>> for sth that might be the target of a SSA_NAME_VALUE, as in
>>>>
>>>>    a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
>>>>    if (b_2 == i_3(D))
>>>>      ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)
>>>>
>>>> thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
>>>> SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
>>>> can't easily avoid that because we don't keep track of a
>>>> reverse mapping (nor would we want to update that).
>>> Right.  In general, the fact that we're in SSA form means that we ought not
>>> get loops in the SSA_NAME_VALUE chain since there's a single static assignment
>>> and we'll visit it once.
>>>
>>> That nice model breaks down in two ways.  First we derive equivalences from
>>> equality conditions like you've shown above.  Second, when threading we can
>>> thread through a loop latch and start reprocessing a block.  The interaction
>>> between those two can result in loops in the value chain.
>>>
>>> What I'm hoping to do in GCC6 is eliminate all this stuff with a threader that
>>> is independent of the dominator walk (and optimizer). Instead of walking
>>> forward through the dominator tree recording key nuggets, then removing those
>>> nuggets as we pop back up the tree, we instead we start with the conditional
>>> and walk up the use-def chains, simplifying as we go, with the goal of
>>> simplifying to a constant, of course.
>>>
>>> You can see the beginnings of that with the FSM bits from Sebastian.
>>>
>>> Obviously until those bits are in place, we should try to clean up any warts
>>> in the current implementation.
>>>
>>>>
>>>> Btw, in record_equality, the == part of the <= check for the loop
>>>> depth will just randomly swap x and y while we should prefer
>>>> IL canonical order.  If we can't rely on the input being in IL
>>>> canonical order we should prepend the whole function with
>>>>
>>>>    if (tree_swap_operands_p (x, y, false))
>>>>      std::swap (x, y);
>>>>
>>>> and change <= to < for the loop-depth check.
>>> Agreed.  Makes perfect sense.
>>
>> I'm now re-bootstrapping and testing the following.
>
> Committed as follows, with XFAILing and re-opening PR65217.
>
> Richard.
>
> 2015-04-27  Richard Biener  <rguenther@suse.de>
>
> 	* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
> 	(record_equivalences_from_stmt): Valueize rhs.
> 	(record_equality): Canonicalize x and y order via
> 	tree_swap_operands_p.  Do not swap operands for same loop depth.
>
> 	* gcc.target/i386/pr65217.c: XFAIL.
Looks good.  I'd spun the same record_equality changes over the weekend 
and have state on regressing 65217.

65217 is an interesting test.


   if ((n & -n) != n)
     __builtin_unreachable();
   return n;

n & -n will (of course) get computed into an SSA_NAME.  We then 
propagate that name for the use of "n" in the return statement rather 
than using the effectively zero cost "n".

If we put some smarts into tree_swap_operands_p to order sensibly in 
this kind of case, we end up regressing a different case that I'll be 
looking at today.

jeff


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Properly valueize values we value-number to
  2015-04-27 16:24         ` Jeff Law
@ 2015-04-27 18:29           ` Richard Biener
  2015-04-27 18:47             ` Jeff Law
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Biener @ 2015-04-27 18:29 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On April 27, 2015 6:24:47 PM GMT+02:00, Jeff Law <law@redhat.com> wrote:
>On 04/27/2015 06:46 AM, Richard Biener wrote:
>> On Mon, 27 Apr 2015, Richard Biener wrote:
>>
>>> On Fri, 24 Apr 2015, Jeff Law wrote:
>>>
>>>> On 02/17/2015 07:58 AM, Richard Biener wrote:
>>>> [ Restarting a old thread... ]
>>>>
>>>>> On a closer look the record_const_or_copy_1 hunk is redundant
>>>>> (record_equality is really a bit obfuscated...).
>>>> Agreed.  I'm not entirely sure how it got to this point.
>>>>
>>>>> And record_equality is where the SSA_NAME_VALUEs might end up
>>>>> forming chains?  At least because we might record a SSA_NAME_VALUE
>>>>> for sth that might be the target of a SSA_NAME_VALUE, as in
>>>>>
>>>>>    a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
>>>>>    if (b_2 == i_3(D))
>>>>>      ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)
>>>>>
>>>>> thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
>>>>> SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
>>>>> can't easily avoid that because we don't keep track of a
>>>>> reverse mapping (nor would we want to update that).
>>>> Right.  In general, the fact that we're in SSA form means that we
>ought not
>>>> get loops in the SSA_NAME_VALUE chain since there's a single static
>assignment
>>>> and we'll visit it once.
>>>>
>>>> That nice model breaks down in two ways.  First we derive
>equivalences from
>>>> equality conditions like you've shown above.  Second, when
>threading we can
>>>> thread through a loop latch and start reprocessing a block.  The
>interaction
>>>> between those two can result in loops in the value chain.
>>>>
>>>> What I'm hoping to do in GCC6 is eliminate all this stuff with a
>threader that
>>>> is independent of the dominator walk (and optimizer). Instead of
>walking
>>>> forward through the dominator tree recording key nuggets, then
>removing those
>>>> nuggets as we pop back up the tree, we instead we start with the
>conditional
>>>> and walk up the use-def chains, simplifying as we go, with the goal
>of
>>>> simplifying to a constant, of course.
>>>>
>>>> You can see the beginnings of that with the FSM bits from
>Sebastian.
>>>>
>>>> Obviously until those bits are in place, we should try to clean up
>any warts
>>>> in the current implementation.
>>>>
>>>>>
>>>>> Btw, in record_equality, the == part of the <= check for the loop
>>>>> depth will just randomly swap x and y while we should prefer
>>>>> IL canonical order.  If we can't rely on the input being in IL
>>>>> canonical order we should prepend the whole function with
>>>>>
>>>>>    if (tree_swap_operands_p (x, y, false))
>>>>>      std::swap (x, y);
>>>>>
>>>>> and change <= to < for the loop-depth check.
>>>> Agreed.  Makes perfect sense.
>>>
>>> I'm now re-bootstrapping and testing the following.
>>
>> Committed as follows, with XFAILing and re-opening PR65217.
>>
>> Richard.
>>
>> 2015-04-27  Richard Biener  <rguenther@suse.de>
>>
>> 	* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
>> 	(record_equivalences_from_stmt): Valueize rhs.
>> 	(record_equality): Canonicalize x and y order via
>> 	tree_swap_operands_p.  Do not swap operands for same loop depth.
>>
>> 	* gcc.target/i386/pr65217.c: XFAIL.
>Looks good.  I'd spun the same record_equality changes over the weekend
>
>and have state on regressing 65217.
>
>65217 is an interesting test.
>
>
>   if ((n & -n) != n)
>     __builtin_unreachable();
>   return n;
>
>n & -n will (of course) get computed into an SSA_NAME.  We then 
>propagate that name for the use of "n" in the return statement rather 
>than using the effectively zero cost "n".
>
>If we put some smarts into tree_swap_operands_p to order sensibly in 
>this kind of case, we end up regressing a different case that I'll be 
>looking at today.

In this case the temporary we propagate has a single use (in the comparison).  Might be interesting to disallow this case by extra checks in record equality.  I wouldn't change tree_swap_operands_p.

Richard.

>
>jeff


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Properly valueize values we value-number to
  2015-04-27 18:29           ` Richard Biener
@ 2015-04-27 18:47             ` Jeff Law
  0 siblings, 0 replies; 9+ messages in thread
From: Jeff Law @ 2015-04-27 18:47 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On 04/27/2015 12:29 PM, Richard Biener wrote:
> On April 27, 2015 6:24:47 PM GMT+02:00, Jeff Law <law@redhat.com>
>> n & -n will (of course) get computed into an SSA_NAME.  We then
>> propagate that name for the use of "n" in the return statement
>> rather than using the effectively zero cost "n".
>>
>> If we put some smarts into tree_swap_operands_p to order sensibly
>> in this kind of case, we end up regressing a different case that
>> I'll be looking at today.
>
> In this case the temporary we propagate has a single use (in the
> comparison).  Might be interesting to disallow this case by extra
> checks in record equality.  I wouldn't change tree_swap_operands_p.
>
Yea, I keep going back and forth over whether or not the heuristic 
should go in tree_swap_operand_p or in record_equality.

It's a fairly narrow issue, so maybe record_equality would be a better spot.

jeff

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-04-27 18:47 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-17 14:03 [PATCH] Properly valueize values we value-number to Richard Biener
2015-02-17 14:58 ` Richard Biener
2015-04-24 19:34   ` Jeff Law
2015-04-27  8:32     ` Richard Biener
2015-04-27 12:47       ` Richard Biener
2015-04-27 16:24         ` Jeff Law
2015-04-27 18:29           ` Richard Biener
2015-04-27 18:47             ` Jeff Law
2015-02-23 22:44 ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).