public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH][PR tree-optimization/81741] Throttle recording conditional equivalences
@ 2017-08-22 15:43 Jeff Law
  2017-08-23 10:47 ` Richard Biener
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff Law @ 2017-08-22 15:43 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2100 bytes --]

This patch addresses some issues with conditional equivalences.

First, it stops recording blindly recording conditional equivalences.
Which leads to regressions...

So for certain binary expressions (for example x - y), if we lookup the
expression in the hash table and miss, we do a second lookup to see if
we have x == y in the hash table.  If so, then even though we don't know
the exact values of x and y, we can still provide a constant result.

I considered doing this in DOM rather than in the hash table lookup
routines, but then we'd have to duplicate the functionality in the
DOM/VRP threader.  Integrating the alternate lookup in the hash table
avoids that pitfall and turned out to be easy.  I've added a variety of
new tests to verify this functionality (extensions of pr71947 testcases).

For a conditional equivalence where the cost of computing one of the
SSA_NAMEs is higher than the other (as seen with 81741), we do record
the equivalence, but arrange it that we will replace the expensive name
with the cheap name.  Obviously, I use the code from 81741 for the
testcase here.

However, there are still cases where we have a conditional equivalence
and the names have the same cost to compute.  A greatly simplified
example can be found in gcc.dg/tree-ssa/20030922-2.c.

I've simply xfailed this test.  To fix it we need to build a second set
of tables that are essentially the conditional equivalences, without
setting SSA_NAME_VALUE (SSA_NAME_VALUE is what drives const/copy
propagation in DOM).  That's actually fairly easy and not terribly
costly.  What gets ugly is we have to consult this second set of tables
when doing simplifications.  Worse yet, we have to run through each
operand's conditional equivalences, substitute it in and try to
simplify.  It just don't expect it to hit enough to justify that pain.

The net result is we should stop ping-ponging copy propagations that
arise from conditional equivalences at the loss of some minor
optimizations in code similar to 20030922-2.c.

Bootstrapped and regression tested on x86_64.  Installing on the trunk.


Jeff



[-- Attachment #2: Q --]
[-- Type: text/plain, Size: 14526 bytes --]

commit 44d01266aff5583b3b6db30158194656cfe88cae
Author: Jeff Law <law@redhat.com>
Date:   Tue Aug 22 09:10:02 2017 -0600

            PR tree-optimization/81741
            PR tree-optimization/71947
            * tree-ssa-dom.c: Include tree-inline.h.
            (record_temporary_equivalences): Only record SSA_NAME = SSA_NAME
            equivalences if one is more expensive to compute than the other.
            * tree-ssa-scopedtables.h (class const_or_copies): Make
            record_const_or_copy_raw method private.
            (class avail_exprs_stack): New method simplify_binary_operation.
            * tree-ssa-scopedtables.c (avail_exprs_stack::lookup_avail_expr): Call
            avail_exprs_stack::simplify_binary_operation as needed.
            (avail_exprs_stack::simplify_binary_operation): New function.
    
            PR tree-optimization/81741
            PR tree-optimization/71947
            * gcc.dg/tree-ssa/pr81741.c: New test.
            * gcc.dg/tree-ssa/pr71947-7.c: New test.
            * gcc.dg/tree-ssa/pr71947-8.c: New test.
            * gcc.dg/tree-ssa/pr71947-9.c: New test.
            * gcc.dg/tree-ssa/pr71941-1.c: Tweak expected output.
            * gcc.dg/tree-ssa/pr71941-2.c: Tweak expected output.
            * gcc.dg/tree-ssa/pr71941-3.c: Tweak expected output.
            * gcc.dg/tree-ssa/20030922-2.c: xfail.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index ab85c074f24..9b941af74c6 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,17 @@
+2017-08-22  Jeff Law  <law@redhat.com>
+
+	PR tree-optimization/81741
+	PR tree-optimization/71947
+	* tree-ssa-dom.c: Include tree-inline.h.
+	(record_temporary_equivalences): Only record SSA_NAME = SSA_NAME
+	equivalences if one is more expensive to compute than the other.
+	* tree-ssa-scopedtables.h (class const_or_copies): Make
+	record_const_or_copy_raw method private.
+	(class avail_exprs_stack): New method simplify_binary_operation.
+	* tree-ssa-scopedtables.c (avail_exprs_stack::lookup_avail_expr): Call
+	avail_exprs_stack::simplify_binary_operation as needed.
+	(avail_exprs_stack::simplify_binary_operation): New function.
+
 2017-08-22  Sebastian Huber  <sebastian.huber@embedded-brains.de>
 
 	* config.gcc (powerpc-*-rtems*): Add rs6000/linux64.opt.
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 17733519e0b..531d0f95ae7 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,16 @@
+2017-08-22  Jeff Law  <law@redhat.com>
+
+	PR tree-optimization/81741
+	PR tree-optimization/71947
+	* gcc.dg/tree-ssa/pr81741.c: New test.
+	* gcc.dg/tree-ssa/pr71947-7.c: New test.
+	* gcc.dg/tree-ssa/pr71947-8.c: New test.
+	* gcc.dg/tree-ssa/pr71947-9.c: New test.
+	* gcc.dg/tree-ssa/pr71941-1.c: Tweak expected output.
+	* gcc.dg/tree-ssa/pr71941-2.c: Tweak expected output.
+	* gcc.dg/tree-ssa/pr71941-3.c: Tweak expected output.
+	* gcc.dg/tree-ssa/20030922-2.c: xfail.
+
 2017-08-22  Yvan Roux  <yvan.roux@linaro.org>
 
         PR c++/80287
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
index 16c79da9521..172f203cf8e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
@@ -20,4 +20,6 @@ rgn_rank (rtx insn1, rtx insn2)
 }
 
 /* There should be two IF conditionals.  */
-/* { dg-final { scan-tree-dump-times "if " 2 "dom2" } } */
+/* We no longer record the conditional equivalence by design, thus we
+   are unable to simplify the 3rd conditional at compile time.  */
+/* { dg-final { scan-tree-dump-times "if " 2 "dom2" { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c
index b03349546fd..ac8271cc574 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c
@@ -14,6 +14,6 @@ int f(int x, int y)
    return ret;
 }
 
-/* { dg-final { scan-tree-dump "Folded to: ret_\[0-9\]+ = 0;"  "dom2" } } */
+/* { dg-final { scan-tree-dump "Replaced redundant expr \[^\r\n\]* with .0."  "dom2" } } */
 
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c
index de8f88b88d7..b2c09cbb021 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c
@@ -13,4 +13,4 @@ int f(int x, int y)
   return ret;
 }
 
-/* { dg-final { scan-tree-dump "Folded to: ret_\[0-9\]+ = 0;"  "dom2" } } */
+/* { dg-final { scan-tree-dump "Replaced redundant expr \[^\r\n\]* with .0."  "dom2" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c
index e79847f83c8..2316f7e1c04 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c
@@ -9,4 +9,5 @@ int f(int x, int y)
   return ret;
 }
 
-/* { dg-final { scan-tree-dump "Folded to: ret_\[0-9\]+ = 0;"  "dom2" } } */
+/* { dg-final { scan-tree-dump "Replaced redundant expr \[^\r\n\]* with .0."  "dom2" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-7.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-7.c
new file mode 100644
index 00000000000..b44c064aa23
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-7.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+
+
+int f(int x, int y)
+{
+  int ret;
+  if (x == y)
+    ret = x % y;
+  else
+    ret = 1;
+
+  return ret;
+}
+/* { dg-final { scan-tree-dump "Replaced redundant expr \[^\r\n\]* with .0."  "dom2" } } */
+
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-8.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-8.c
new file mode 100644
index 00000000000..26e5abbdc29
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-8.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+
+
+int f(int x, int y)
+{
+  int ret;
+  if (x == y)
+    ret = x / y;
+  else
+    ret = 0;
+
+  return ret;
+}
+/* { dg-final { scan-tree-dump "Replaced redundant expr \[^\r\n\]* with .1."  "dom2" } } */
+
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-9.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-9.c
new file mode 100644
index 00000000000..22b68d5ede0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-9.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+
+
+int f(int x, int y)
+{
+  int ret;
+  if (x == y)
+    ret = x & y;
+  else
+    ret = 0;
+
+  return ret;
+}
+/* { dg-final { scan-tree-dump "Replaced redundant expr \[^\r\n\]* with .\(x|y\)."  "dom2" } } */
+
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr81741.c b/gcc/testsuite/gcc.dg/tree-ssa/pr81741.c
new file mode 100644
index 00000000000..a162c3cf58f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr81741.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -w -fdump-tree-dom2-details" } */
+
+#include <string.h>
+
+typedef struct string_s {
+  unsigned long size, alloc;
+  char *ptr;
+} string_t[1];
+
+# define M_ASSUME(x)                                    \
+  (! __builtin_constant_p (!!(x) || !(x)) || (x) ?      \
+   (void) 0 : __builtin_unreachable())
+
+int f(string_t s)
+{
+  M_ASSUME(strlen(s->ptr) == s->size);
+  return s->size;
+}
+
+/* { dg-final { scan-assembler-not "strlen" } } */
+
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 494b472e121..407a4ef97d2 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "gimple-fold.h"
 #include "tree-eh.h"
+#include "tree-inline.h"
 #include "gimple-iterator.h"
 #include "tree-cfg.h"
 #include "tree-into-ssa.h"
@@ -776,16 +777,27 @@ record_temporary_equivalences (edge e,
 
       /* Record the simple NAME = VALUE equivalence.  */
       tree rhs = edge_info->rhs;
-      record_equality (lhs, rhs, const_and_copies);
 
-      /* We already recorded that LHS = RHS, with canonicalization,
-	 value chain following, etc.
+      /* If this is a SSA_NAME = SSA_NAME equivalence and one operand is
+	 cheaper to compute than the other, then set up the equivalence
+	 such that we replace the expensive one with the cheap one.
 
-	 We also want to record RHS = LHS, but without any canonicalization
-	 or value chain following.  */
-      if (TREE_CODE (rhs) == SSA_NAME)
-	const_and_copies->record_const_or_copy_raw (rhs, lhs,
-						    SSA_NAME_VALUE (rhs));
+	 If they are the same cost to compute, then do not record anything.  */
+      if (TREE_CODE (lhs) == SSA_NAME && TREE_CODE (rhs) == SSA_NAME)
+	{
+	  gimple *rhs_def = SSA_NAME_DEF_STMT (rhs);
+	  int rhs_cost = estimate_num_insns (rhs_def, &eni_size_weights);
+
+	  gimple *lhs_def = SSA_NAME_DEF_STMT (lhs);
+	  int lhs_cost = estimate_num_insns (lhs_def, &eni_size_weights);
+
+	  if (rhs_cost > lhs_cost)
+	    record_equality (rhs, lhs, const_and_copies);
+	  else if (rhs_cost < lhs_cost)
+	    record_equality (lhs, rhs, const_and_copies);
+	}
+      else
+	record_equality (lhs, rhs, const_and_copies);
 
       /* If LHS is an SSA_NAME and RHS is a constant integer and LHS was
 	 set via a widening type conversion, then we may be able to record
diff --git a/gcc/tree-ssa-scopedtables.c b/gcc/tree-ssa-scopedtables.c
index 7b9ca78a878..6e1fd582814 100644
--- a/gcc/tree-ssa-scopedtables.c
+++ b/gcc/tree-ssa-scopedtables.c
@@ -116,6 +116,102 @@ vuse_eq (ao_ref *, tree vuse1, unsigned int cnt, void *data)
   return NULL;
 }
 
+/* We looked for STMT in the hash table, but did not find it.
+
+   If STMT is an assignment from a binary operator, we may know something
+   about the operands relationship to each other which would allow
+   us to derive a constant value for the RHS of STMT.  */
+
+tree
+avail_exprs_stack::simplify_binary_operation (gimple *stmt,
+					      class expr_hash_elt element)
+{
+  if (is_gimple_assign (stmt))
+    {
+      struct hashable_expr *expr = element.expr ();
+      if (expr->kind == EXPR_BINARY)
+	{
+	  enum tree_code code = expr->ops.binary.op;
+
+	  switch (code)
+	    {
+	    /* For these cases, if we know the operands
+	       are equal, then we know the result.  */
+	    case MIN_EXPR:
+	    case MAX_EXPR:
+	    case BIT_IOR_EXPR:
+	    case BIT_AND_EXPR:
+	    case BIT_XOR_EXPR:
+	    case MINUS_EXPR:
+	    case TRUNC_DIV_EXPR:
+	    case CEIL_DIV_EXPR:
+	    case FLOOR_DIV_EXPR:
+	    case ROUND_DIV_EXPR:
+	    case EXACT_DIV_EXPR:
+	    case TRUNC_MOD_EXPR:
+	    case CEIL_MOD_EXPR:
+	    case FLOOR_MOD_EXPR:
+	    case ROUND_MOD_EXPR:
+	      {
+		/* Build a simple equality expr and query the hash table
+		   for it.  */
+		struct hashable_expr expr;
+		expr.type = boolean_type_node;
+		expr.kind = EXPR_BINARY;
+		expr.ops.binary.op = EQ_EXPR;
+		expr.ops.binary.opnd0 = gimple_assign_rhs1 (stmt);
+		expr.ops.binary.opnd1 = gimple_assign_rhs2 (stmt);
+		class expr_hash_elt element2 (&expr, NULL_TREE);
+		expr_hash_elt **slot
+		  = m_avail_exprs->find_slot (&element2, NO_INSERT);
+		tree result_type = TREE_TYPE (gimple_assign_lhs (stmt));
+
+		/* If the query was successful and returned a nonzero
+		   result, then we know that the operands of the binary
+		   expression are the same.  In many cases this allows
+		   us to compute a constant result of the expression
+		   at compile time, even if we do not know the exact
+		   values of the operands.  */
+		if (slot && *slot && integer_onep ((*slot)->lhs ()))
+		  {
+		    switch (code)
+		      {
+		      case MIN_EXPR:
+		      case MAX_EXPR:
+		      case BIT_IOR_EXPR:
+		      case BIT_AND_EXPR:
+			return gimple_assign_rhs1 (stmt);
+
+		      case BIT_XOR_EXPR:
+		      case MINUS_EXPR:
+		      case TRUNC_MOD_EXPR:
+		      case CEIL_MOD_EXPR:
+		      case FLOOR_MOD_EXPR:
+		      case ROUND_MOD_EXPR:
+			return build_zero_cst (result_type);
+
+		      case TRUNC_DIV_EXPR:
+		      case CEIL_DIV_EXPR:
+		      case FLOOR_DIV_EXPR:
+		      case ROUND_DIV_EXPR:
+		      case EXACT_DIV_EXPR:
+			return build_one_cst (result_type);
+
+		      default:
+			gcc_unreachable ();
+		      }
+		  }
+		break;
+	      }
+
+	      default:
+		break;
+	    }
+	}
+    }
+  return NULL_TREE;
+}
+
 /* Search for an existing instance of STMT in the AVAIL_EXPRS_STACK table.
    If found, return its LHS. Otherwise insert STMT in the table and
    return NULL_TREE.
@@ -160,6 +256,12 @@ avail_exprs_stack::lookup_avail_expr (gimple *stmt, bool insert, bool tbaa_p)
     }
   else if (*slot == NULL)
     {
+      /* If we did not find the expression in the hash table, we may still
+	 be able to produce a result for some expressions.  */
+      tree alt = avail_exprs_stack::simplify_binary_operation (stmt, element);
+      if (alt)
+	return alt;
+
       class expr_hash_elt *element2 = new expr_hash_elt (element);
       *slot = element2;
 
diff --git a/gcc/tree-ssa-scopedtables.h b/gcc/tree-ssa-scopedtables.h
index df304aedbf4..e3d7bff6913 100644
--- a/gcc/tree-ssa-scopedtables.h
+++ b/gcc/tree-ssa-scopedtables.h
@@ -156,6 +156,11 @@ class avail_exprs_stack
   vec<std::pair<expr_hash_elt_t, expr_hash_elt_t> > m_stack;
   hash_table<expr_elt_hasher> *m_avail_exprs;
 
+  /* For some assignments where the RHS is a binary operator, if we know
+     a equality relationship between the operands, we may be able to compute
+     a result, even if we don't know the exact value of the operands.  */
+  tree simplify_binary_operation (gimple *, class expr_hash_elt);
+
   /* We do not allow copying this object or initializing one
      from another.  */
   avail_exprs_stack& operator= (const avail_exprs_stack&);
@@ -185,10 +190,6 @@ class const_and_copies
      may follow the value chain for the RHS.  */
   void record_const_or_copy (tree, tree);
 
-  /* Record a single const/copy pair that can be unwound.  This version
-     does not follow the value chain for the RHS.  */
-  void record_const_or_copy_raw (tree, tree, tree);
-
   /* Special entry point when we want to provide an explicit previous
      value for the first argument.  Try to get rid of this in the future. 
 
@@ -196,6 +197,10 @@ class const_and_copies
   void record_const_or_copy (tree, tree, tree);
 
  private:
+  /* Record a single const/copy pair that can be unwound.  This version
+     does not follow the value chain for the RHS.  */
+  void record_const_or_copy_raw (tree, tree, tree);
+
   vec<tree> m_stack;
   const_and_copies& operator= (const const_and_copies&);
   const_and_copies (class const_and_copies &);

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH][PR tree-optimization/81741] Throttle recording conditional equivalences
  2017-08-22 15:43 [PATCH][PR tree-optimization/81741] Throttle recording conditional equivalences Jeff Law
@ 2017-08-23 10:47 ` Richard Biener
  2017-08-23 14:56   ` Jeff Law
  0 siblings, 1 reply; 3+ messages in thread
From: Richard Biener @ 2017-08-23 10:47 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Tue, Aug 22, 2017 at 5:13 PM, Jeff Law <law@redhat.com> wrote:
> This patch addresses some issues with conditional equivalences.
>
> First, it stops recording blindly recording conditional equivalences.
> Which leads to regressions...
>
> So for certain binary expressions (for example x - y), if we lookup the
> expression in the hash table and miss, we do a second lookup to see if
> we have x == y in the hash table.  If so, then even though we don't know
> the exact values of x and y, we can still provide a constant result.
>
> I considered doing this in DOM rather than in the hash table lookup
> routines, but then we'd have to duplicate the functionality in the
> DOM/VRP threader.  Integrating the alternate lookup in the hash table
> avoids that pitfall and turned out to be easy.  I've added a variety of
> new tests to verify this functionality (extensions of pr71947 testcases).
>
> For a conditional equivalence where the cost of computing one of the
> SSA_NAMEs is higher than the other (as seen with 81741), we do record
> the equivalence, but arrange it that we will replace the expensive name
> with the cheap name.  Obviously, I use the code from 81741 for the
> testcase here.
>
> However, there are still cases where we have a conditional equivalence
> and the names have the same cost to compute.  A greatly simplified
> example can be found in gcc.dg/tree-ssa/20030922-2.c.

So the intent is (as in the patch) to not record any conditional equivalence
for this case?

> I've simply xfailed this test.  To fix it we need to build a second set
> of tables that are essentially the conditional equivalences, without
> setting SSA_NAME_VALUE (SSA_NAME_VALUE is what drives const/copy
> propagation in DOM).  That's actually fairly easy and not terribly
> costly.  What gets ugly is we have to consult this second set of tables
> when doing simplifications.  Worse yet, we have to run through each
> operand's conditional equivalences, substitute it in and try to
> simplify.  It just don't expect it to hit enough to justify that pain.

Agreed.  Note I think the main issue is that conditional equivalences
do not play well with value-numbering as they affect values of already
visited expressions.  So you'd basically have to re-iterate value-numbering
possibly affected expressions (recording the new result only temporarily
of course).

> The net result is we should stop ping-ponging copy propagations that
> arise from conditional equivalences at the loss of some minor
> optimizations in code similar to 20030922-2.c.

Can you file a PR for the XFAIL?

> Bootstrapped and regression tested on x86_64.  Installing on the trunk.

Thanks for finally taking a stab at this.
Richard.

>
> Jeff
>
>
>
> commit 44d01266aff5583b3b6db30158194656cfe88cae
> Author: Jeff Law <law@redhat.com>
> Date:   Tue Aug 22 09:10:02 2017 -0600
>
>             PR tree-optimization/81741
>             PR tree-optimization/71947
>             * tree-ssa-dom.c: Include tree-inline.h.
>             (record_temporary_equivalences): Only record SSA_NAME = SSA_NAME
>             equivalences if one is more expensive to compute than the other.
>             * tree-ssa-scopedtables.h (class const_or_copies): Make
>             record_const_or_copy_raw method private.
>             (class avail_exprs_stack): New method simplify_binary_operation.
>             * tree-ssa-scopedtables.c (avail_exprs_stack::lookup_avail_expr): Call
>             avail_exprs_stack::simplify_binary_operation as needed.
>             (avail_exprs_stack::simplify_binary_operation): New function.
>
>             PR tree-optimization/81741
>             PR tree-optimization/71947
>             * gcc.dg/tree-ssa/pr81741.c: New test.
>             * gcc.dg/tree-ssa/pr71947-7.c: New test.
>             * gcc.dg/tree-ssa/pr71947-8.c: New test.
>             * gcc.dg/tree-ssa/pr71947-9.c: New test.
>             * gcc.dg/tree-ssa/pr71941-1.c: Tweak expected output.
>             * gcc.dg/tree-ssa/pr71941-2.c: Tweak expected output.
>             * gcc.dg/tree-ssa/pr71941-3.c: Tweak expected output.
>             * gcc.dg/tree-ssa/20030922-2.c: xfail.
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index ab85c074f24..9b941af74c6 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,17 @@
> +2017-08-22  Jeff Law  <law@redhat.com>
> +
> +       PR tree-optimization/81741
> +       PR tree-optimization/71947
> +       * tree-ssa-dom.c: Include tree-inline.h.
> +       (record_temporary_equivalences): Only record SSA_NAME = SSA_NAME
> +       equivalences if one is more expensive to compute than the other.
> +       * tree-ssa-scopedtables.h (class const_or_copies): Make
> +       record_const_or_copy_raw method private.
> +       (class avail_exprs_stack): New method simplify_binary_operation.
> +       * tree-ssa-scopedtables.c (avail_exprs_stack::lookup_avail_expr): Call
> +       avail_exprs_stack::simplify_binary_operation as needed.
> +       (avail_exprs_stack::simplify_binary_operation): New function.
> +
>  2017-08-22  Sebastian Huber  <sebastian.huber@embedded-brains.de>
>
>         * config.gcc (powerpc-*-rtems*): Add rs6000/linux64.opt.
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 17733519e0b..531d0f95ae7 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,16 @@
> +2017-08-22  Jeff Law  <law@redhat.com>
> +
> +       PR tree-optimization/81741
> +       PR tree-optimization/71947
> +       * gcc.dg/tree-ssa/pr81741.c: New test.
> +       * gcc.dg/tree-ssa/pr71947-7.c: New test.
> +       * gcc.dg/tree-ssa/pr71947-8.c: New test.
> +       * gcc.dg/tree-ssa/pr71947-9.c: New test.
> +       * gcc.dg/tree-ssa/pr71941-1.c: Tweak expected output.
> +       * gcc.dg/tree-ssa/pr71941-2.c: Tweak expected output.
> +       * gcc.dg/tree-ssa/pr71941-3.c: Tweak expected output.
> +       * gcc.dg/tree-ssa/20030922-2.c: xfail.
> +
>  2017-08-22  Yvan Roux  <yvan.roux@linaro.org>
>
>          PR c++/80287
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> index 16c79da9521..172f203cf8e 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> @@ -20,4 +20,6 @@ rgn_rank (rtx insn1, rtx insn2)
>  }
>
>  /* There should be two IF conditionals.  */
> -/* { dg-final { scan-tree-dump-times "if " 2 "dom2" } } */
> +/* We no longer record the conditional equivalence by design, thus we
> +   are unable to simplify the 3rd conditional at compile time.  */
> +/* { dg-final { scan-tree-dump-times "if " 2 "dom2" { xfail *-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c
> index b03349546fd..ac8271cc574 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c
> @@ -14,6 +14,6 @@ int f(int x, int y)
>     return ret;
>  }
>
> -/* { dg-final { scan-tree-dump "Folded to: ret_\[0-9\]+ = 0;"  "dom2" } } */
> +/* { dg-final { scan-tree-dump "Replaced redundant expr \[^\r\n\]* with .0."  "dom2" } } */
>
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c
> index de8f88b88d7..b2c09cbb021 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c
> @@ -13,4 +13,4 @@ int f(int x, int y)
>    return ret;
>  }
>
> -/* { dg-final { scan-tree-dump "Folded to: ret_\[0-9\]+ = 0;"  "dom2" } } */
> +/* { dg-final { scan-tree-dump "Replaced redundant expr \[^\r\n\]* with .0."  "dom2" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c
> index e79847f83c8..2316f7e1c04 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c
> @@ -9,4 +9,5 @@ int f(int x, int y)
>    return ret;
>  }
>
> -/* { dg-final { scan-tree-dump "Folded to: ret_\[0-9\]+ = 0;"  "dom2" } } */
> +/* { dg-final { scan-tree-dump "Replaced redundant expr \[^\r\n\]* with .0."  "dom2" } } */
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-7.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-7.c
> new file mode 100644
> index 00000000000..b44c064aa23
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-7.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
> +
> +
> +int f(int x, int y)
> +{
> +  int ret;
> +  if (x == y)
> +    ret = x % y;
> +  else
> +    ret = 1;
> +
> +  return ret;
> +}
> +/* { dg-final { scan-tree-dump "Replaced redundant expr \[^\r\n\]* with .0."  "dom2" } } */
> +
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-8.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-8.c
> new file mode 100644
> index 00000000000..26e5abbdc29
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-8.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
> +
> +
> +int f(int x, int y)
> +{
> +  int ret;
> +  if (x == y)
> +    ret = x / y;
> +  else
> +    ret = 0;
> +
> +  return ret;
> +}
> +/* { dg-final { scan-tree-dump "Replaced redundant expr \[^\r\n\]* with .1."  "dom2" } } */
> +
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-9.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-9.c
> new file mode 100644
> index 00000000000..22b68d5ede0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-9.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
> +
> +
> +int f(int x, int y)
> +{
> +  int ret;
> +  if (x == y)
> +    ret = x & y;
> +  else
> +    ret = 0;
> +
> +  return ret;
> +}
> +/* { dg-final { scan-tree-dump "Replaced redundant expr \[^\r\n\]* with .\(x|y\)."  "dom2" } } */
> +
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr81741.c b/gcc/testsuite/gcc.dg/tree-ssa/pr81741.c
> new file mode 100644
> index 00000000000..a162c3cf58f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr81741.c
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -w -fdump-tree-dom2-details" } */
> +
> +#include <string.h>
> +
> +typedef struct string_s {
> +  unsigned long size, alloc;
> +  char *ptr;
> +} string_t[1];
> +
> +# define M_ASSUME(x)                                    \
> +  (! __builtin_constant_p (!!(x) || !(x)) || (x) ?      \
> +   (void) 0 : __builtin_unreachable())
> +
> +int f(string_t s)
> +{
> +  M_ASSUME(strlen(s->ptr) == s->size);
> +  return s->size;
> +}
> +
> +/* { dg-final { scan-assembler-not "strlen" } } */
> +
> diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
> index 494b472e121..407a4ef97d2 100644
> --- a/gcc/tree-ssa-dom.c
> +++ b/gcc/tree-ssa-dom.c
> @@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "cfgloop.h"
>  #include "gimple-fold.h"
>  #include "tree-eh.h"
> +#include "tree-inline.h"
>  #include "gimple-iterator.h"
>  #include "tree-cfg.h"
>  #include "tree-into-ssa.h"
> @@ -776,16 +777,27 @@ record_temporary_equivalences (edge e,
>
>        /* Record the simple NAME = VALUE equivalence.  */
>        tree rhs = edge_info->rhs;
> -      record_equality (lhs, rhs, const_and_copies);
>
> -      /* We already recorded that LHS = RHS, with canonicalization,
> -        value chain following, etc.
> +      /* If this is a SSA_NAME = SSA_NAME equivalence and one operand is
> +        cheaper to compute than the other, then set up the equivalence
> +        such that we replace the expensive one with the cheap one.
>
> -        We also want to record RHS = LHS, but without any canonicalization
> -        or value chain following.  */
> -      if (TREE_CODE (rhs) == SSA_NAME)
> -       const_and_copies->record_const_or_copy_raw (rhs, lhs,
> -                                                   SSA_NAME_VALUE (rhs));
> +        If they are the same cost to compute, then do not record anything.  */
> +      if (TREE_CODE (lhs) == SSA_NAME && TREE_CODE (rhs) == SSA_NAME)
> +       {
> +         gimple *rhs_def = SSA_NAME_DEF_STMT (rhs);
> +         int rhs_cost = estimate_num_insns (rhs_def, &eni_size_weights);
> +
> +         gimple *lhs_def = SSA_NAME_DEF_STMT (lhs);
> +         int lhs_cost = estimate_num_insns (lhs_def, &eni_size_weights);
> +
> +         if (rhs_cost > lhs_cost)
> +           record_equality (rhs, lhs, const_and_copies);
> +         else if (rhs_cost < lhs_cost)
> +           record_equality (lhs, rhs, const_and_copies);
> +       }
> +      else
> +       record_equality (lhs, rhs, const_and_copies);
>
>        /* If LHS is an SSA_NAME and RHS is a constant integer and LHS was
>          set via a widening type conversion, then we may be able to record
> diff --git a/gcc/tree-ssa-scopedtables.c b/gcc/tree-ssa-scopedtables.c
> index 7b9ca78a878..6e1fd582814 100644
> --- a/gcc/tree-ssa-scopedtables.c
> +++ b/gcc/tree-ssa-scopedtables.c
> @@ -116,6 +116,102 @@ vuse_eq (ao_ref *, tree vuse1, unsigned int cnt, void *data)
>    return NULL;
>  }
>
> +/* We looked for STMT in the hash table, but did not find it.
> +
> +   If STMT is an assignment from a binary operator, we may know something
> +   about the operands relationship to each other which would allow
> +   us to derive a constant value for the RHS of STMT.  */
> +
> +tree
> +avail_exprs_stack::simplify_binary_operation (gimple *stmt,
> +                                             class expr_hash_elt element)
> +{
> +  if (is_gimple_assign (stmt))
> +    {
> +      struct hashable_expr *expr = element.expr ();
> +      if (expr->kind == EXPR_BINARY)
> +       {
> +         enum tree_code code = expr->ops.binary.op;
> +
> +         switch (code)
> +           {
> +           /* For these cases, if we know the operands
> +              are equal, then we know the result.  */
> +           case MIN_EXPR:
> +           case MAX_EXPR:
> +           case BIT_IOR_EXPR:
> +           case BIT_AND_EXPR:
> +           case BIT_XOR_EXPR:
> +           case MINUS_EXPR:
> +           case TRUNC_DIV_EXPR:
> +           case CEIL_DIV_EXPR:
> +           case FLOOR_DIV_EXPR:
> +           case ROUND_DIV_EXPR:
> +           case EXACT_DIV_EXPR:
> +           case TRUNC_MOD_EXPR:
> +           case CEIL_MOD_EXPR:
> +           case FLOOR_MOD_EXPR:
> +           case ROUND_MOD_EXPR:
> +             {
> +               /* Build a simple equality expr and query the hash table
> +                  for it.  */
> +               struct hashable_expr expr;
> +               expr.type = boolean_type_node;
> +               expr.kind = EXPR_BINARY;
> +               expr.ops.binary.op = EQ_EXPR;
> +               expr.ops.binary.opnd0 = gimple_assign_rhs1 (stmt);
> +               expr.ops.binary.opnd1 = gimple_assign_rhs2 (stmt);
> +               class expr_hash_elt element2 (&expr, NULL_TREE);
> +               expr_hash_elt **slot
> +                 = m_avail_exprs->find_slot (&element2, NO_INSERT);
> +               tree result_type = TREE_TYPE (gimple_assign_lhs (stmt));
> +
> +               /* If the query was successful and returned a nonzero
> +                  result, then we know that the operands of the binary
> +                  expression are the same.  In many cases this allows
> +                  us to compute a constant result of the expression
> +                  at compile time, even if we do not know the exact
> +                  values of the operands.  */
> +               if (slot && *slot && integer_onep ((*slot)->lhs ()))
> +                 {
> +                   switch (code)
> +                     {
> +                     case MIN_EXPR:
> +                     case MAX_EXPR:
> +                     case BIT_IOR_EXPR:
> +                     case BIT_AND_EXPR:
> +                       return gimple_assign_rhs1 (stmt);
> +
> +                     case BIT_XOR_EXPR:
> +                     case MINUS_EXPR:
> +                     case TRUNC_MOD_EXPR:
> +                     case CEIL_MOD_EXPR:
> +                     case FLOOR_MOD_EXPR:
> +                     case ROUND_MOD_EXPR:
> +                       return build_zero_cst (result_type);
> +
> +                     case TRUNC_DIV_EXPR:
> +                     case CEIL_DIV_EXPR:
> +                     case FLOOR_DIV_EXPR:
> +                     case ROUND_DIV_EXPR:
> +                     case EXACT_DIV_EXPR:
> +                       return build_one_cst (result_type);
> +
> +                     default:
> +                       gcc_unreachable ();
> +                     }
> +                 }
> +               break;
> +             }
> +
> +             default:
> +               break;
> +           }
> +       }
> +    }
> +  return NULL_TREE;
> +}
> +
>  /* Search for an existing instance of STMT in the AVAIL_EXPRS_STACK table.
>     If found, return its LHS. Otherwise insert STMT in the table and
>     return NULL_TREE.
> @@ -160,6 +256,12 @@ avail_exprs_stack::lookup_avail_expr (gimple *stmt, bool insert, bool tbaa_p)
>      }
>    else if (*slot == NULL)
>      {
> +      /* If we did not find the expression in the hash table, we may still
> +        be able to produce a result for some expressions.  */
> +      tree alt = avail_exprs_stack::simplify_binary_operation (stmt, element);
> +      if (alt)
> +       return alt;
> +
>        class expr_hash_elt *element2 = new expr_hash_elt (element);
>        *slot = element2;
>
> diff --git a/gcc/tree-ssa-scopedtables.h b/gcc/tree-ssa-scopedtables.h
> index df304aedbf4..e3d7bff6913 100644
> --- a/gcc/tree-ssa-scopedtables.h
> +++ b/gcc/tree-ssa-scopedtables.h
> @@ -156,6 +156,11 @@ class avail_exprs_stack
>    vec<std::pair<expr_hash_elt_t, expr_hash_elt_t> > m_stack;
>    hash_table<expr_elt_hasher> *m_avail_exprs;
>
> +  /* For some assignments where the RHS is a binary operator, if we know
> +     a equality relationship between the operands, we may be able to compute
> +     a result, even if we don't know the exact value of the operands.  */
> +  tree simplify_binary_operation (gimple *, class expr_hash_elt);
> +
>    /* We do not allow copying this object or initializing one
>       from another.  */
>    avail_exprs_stack& operator= (const avail_exprs_stack&);
> @@ -185,10 +190,6 @@ class const_and_copies
>       may follow the value chain for the RHS.  */
>    void record_const_or_copy (tree, tree);
>
> -  /* Record a single const/copy pair that can be unwound.  This version
> -     does not follow the value chain for the RHS.  */
> -  void record_const_or_copy_raw (tree, tree, tree);
> -
>    /* Special entry point when we want to provide an explicit previous
>       value for the first argument.  Try to get rid of this in the future.
>
> @@ -196,6 +197,10 @@ class const_and_copies
>    void record_const_or_copy (tree, tree, tree);
>
>   private:
> +  /* Record a single const/copy pair that can be unwound.  This version
> +     does not follow the value chain for the RHS.  */
> +  void record_const_or_copy_raw (tree, tree, tree);
> +
>    vec<tree> m_stack;
>    const_and_copies& operator= (const const_and_copies&);
>    const_and_copies (class const_and_copies &);
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH][PR tree-optimization/81741] Throttle recording conditional equivalences
  2017-08-23 10:47 ` Richard Biener
@ 2017-08-23 14:56   ` Jeff Law
  0 siblings, 0 replies; 3+ messages in thread
From: Jeff Law @ 2017-08-23 14:56 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On 08/23/2017 04:37 AM, Richard Biener wrote:
> On Tue, Aug 22, 2017 at 5:13 PM, Jeff Law <law@redhat.com> wrote:
>> However, there are still cases where we have a conditional equivalence
>> and the names have the same cost to compute.  A greatly simplified
>> example can be found in gcc.dg/tree-ssa/20030922-2.c.
> 
> So the intent is (as in the patch) to not record any conditional equivalence
> for this case?
Correct.  We're no longer recording these conditional equivalences in
the const/copies table and SSA_NAME_VALUE.  The only time we will record
them now is if one name is cheaper to compute than the other (so that we
can potentially propagate away the expensive one).

Note that we still have 1 = NAME1 eq NAME2

In the expression hash table, which is what allows us to still simplify
stuff like NAME1 ^ NAME2 -> 0 and the like.   The mechanism by which
that gets simplified now is different, hence the change in dump scanning.

But to reiterate, we're no longer recording the conditional equivalences
into the const/copies table by default and thus we no longer use
conditional equivalences for const/copy propagation by default.


> 
>> I've simply xfailed this test.  To fix it we need to build a second set
>> of tables that are essentially the conditional equivalences, without
>> setting SSA_NAME_VALUE (SSA_NAME_VALUE is what drives const/copy
>> propagation in DOM).  That's actually fairly easy and not terribly
>> costly.  What gets ugly is we have to consult this second set of tables
>> when doing simplifications.  Worse yet, we have to run through each
>> operand's conditional equivalences, substitute it in and try to
>> simplify.  It just don't expect it to hit enough to justify that pain.
> 
> Agreed.  Note I think the main issue is that conditional equivalences
> do not play well with value-numbering as they affect values of already
> visited expressions.  So you'd basically have to re-iterate value-numbering
> possibly affected expressions (recording the new result only temporarily
> of course).
Understood.  I  still want to flesh out the unwindable equivalence
classes a bit further so I can measure how often cases like 20030922-2.c
occur in practice.

My sense is there's still some cases we're going to want to handle, but
I'm going to have to build the fully generalized version to get enough
instrumentation to guide that decision.

> 
>> The net result is we should stop ping-ponging copy propagations that
>> arise from conditional equivalences at the loss of some minor
>> optimizations in code similar to 20030922-2.c.
> 
> Can you file a PR for the XFAIL?
Will do.


> 
>> Bootstrapped and regression tested on x86_64.  Installing on the trunk.
> 
> Thanks for finally taking a stab at this.
It's  been on my list for a while :-)

Jeff

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-08-23 14:19 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-22 15:43 [PATCH][PR tree-optimization/81741] Throttle recording conditional equivalences Jeff Law
2017-08-23 10:47 ` Richard Biener
2017-08-23 14:56   ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).