public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [RFA][PATCH][tree-optimization/78496] 03/03 Embed and exploit EVRP analysis within DOM
@ 2017-12-04  5:59 Jeff Law
  2017-12-04 10:42 ` Richard Biener
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff Law @ 2017-12-04  5:59 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3331 bytes --]

And finally, the meat of the fix for pr78496.  This is largely the patch
that was posed right when stage1 closed with just minor updates.

This patch embeds evrp analysis into DOM and exploits it in the minimal
way necessary to fix pr78496 without regressing other tests in the
testsuite.

The key effect we're looking for here is to pick up a slew of jump
threads in DOM2 by exploiting the range information during jump threading.

These aren't handled well in the standard tree-vrp jump threading -- we
could abuse ASSERT_EXPRs further, but it'd just be ugly and probably
computationally expensive.

The ranges we want fall out naturally from a dominator walk, hence all
the recent work around pulling out EVRP analysis into a little module
other passes could reuse.

With these changes we pick up all the missing jump thread opportunities
in DOM for pr78496.  Sadly, I've been able to pull together an automated
test that I like.  About the best I could come up with would be to
verify that a large number of jump threads were realized during DOM2.
But that still seems rather fragile.  I also looked at examining the
results of PRE, but the partial redundancies that were originally the
source of complaints in that BZ have already been fixed.  I also looked
at perhaps looking for PHIs with constant args and then perhaps trying
to filter those, but got nowhere.

So there's no direct test for pr78496.  Sigh.



Running EVRP analysis in DOM had an unintended side effect.  Namely that
SSA_NAMEs that got created after the EVRP optimization pass would have
range information attached to them.  That caused two warning regressions
with -O1 code in the C++ and Fortran testsuites.  The problem is the
ranges attached to the new SSA_NAMES were very wide and there was code
left on an unexecutable path that called an allocation routine (C++) and
a memset (Fortran) using those names as arguments.  The uber-wide ranges
in those circumstances triggered the false positive warnings.

By exploiting the EVRP data during the standard part of DOM to optimize
conditionals, we're able to prove the paths in question simply aren't
executable.  We remove them and the warnings go away.

This work compromises builtin-unreachable-6.  So the original test it
kept and -fno-tree-dominator-opts is added to keep it from being
compromised.  A new builtin-unreachable-6a test is created to very that
DOM does indeed remove all the __builtin_unreachable paths.

This work also results in us optimizing 20030922-2.c again (conditional
equivalences).  EVRP analysis will note the conditional equivalence.  We
don't propagate anything from EVRP analysis at this time, but we do use
it to try and simplify GIMPLE_CONDs to a compile-time constant which is
what happens in this case.

I plan to check this in if/when the first two patches are approved.

Jeff

ps. The x_vr_values hack is scheduled to go away as we remove
tree-vrp.c's threading implementation in gcc-9 -- we'll be able to drop
the simplification callback and instead have simplification live within
the dom_opt_dom_walker class where it'll have access to vr_values.  The
analogous version in tree-vrp.c just gets deleted at that time.

pps. I know there's a bogus propagation patch that I need to go back and
re-check based on comments from Richi.  That's definitely in the queue.



[-- Attachment #2: P3 --]
[-- Type: text/plain, Size: 11521 bytes --]

	* gimple-ssa-evrp-analyze.h
	(evrp_range_analyzer::get_vr_values): Simplify.
	* gimple-ssa-evrp-analyze.c: Corresponding changes.

	* tree-ssa-dom.c: Include alloc-pool.h, tree-vrp.h, vr-values.h
	and gimple-ssa-evrp-analyze.h.
	(dom_opt_dom_walker class): Add evrp_range_analyzer member.
	(simplify_stmt_for_jump_threading): Copy a blob of code from
	tree-vrp.c to use ranges to simplify statements.
	(dom_opt_dom_walker::before_dom_children): Call
	evrp_range_analyzer::{enter,record_ranges_from_stmt} methods.
	(dom_opt_dom_walker::after_dom_children): Similarly for
	evrp_range_analyzer::leave.
	(dom_opt_dom_walker::optimize_stmt): Use EVRP ranges to optimize
	conditionals.

	* gcc.dg/builtin-unreachable-6.c: Disable DOM.
	* gcc.dg/builtin-unreachable-6a.c: New test.
	* gcc.dg/tree-ssa/20030922-1.c: No longer XFAIL.
	* gcc.dg/ssa-dom-branch-1.c: Tweak expected output.
 
diff --git a/gcc/gimple-ssa-evrp-analyze.h b/gcc/gimple-ssa-evrp-analyze.h
index 6216a90..4783e6f 100644
--- a/gcc/gimple-ssa-evrp-analyze.h
+++ b/gcc/gimple-ssa-evrp-analyze.h
@@ -51,13 +51,7 @@ class evrp_range_analyzer
      true, then we are transferring ownership.  Else we keep ownership.
 
      This should be converted to a unique_ptr.  */
-  class vr_values *get_vr_values (bool transfer)
-    {
-      class vr_values *x = vr_values;
-      if (transfer)
-	vr_values = NULL;
-      return x;
-    }
+  class vr_values *get_vr_values (void) { return vr_values; }
 
  private:
   DISABLE_COPY_AND_ASSIGN (evrp_range_analyzer);
diff --git a/gcc/gimple-ssa-evrp.c b/gcc/gimple-ssa-evrp.c
index a554cf9..609ce38 100644
--- a/gcc/gimple-ssa-evrp.c
+++ b/gcc/gimple-ssa-evrp.c
@@ -68,7 +68,7 @@ class evrp_dom_walker : public dom_walker
 public:
   evrp_dom_walker ()
     : dom_walker (CDI_DOMINATORS),
-      evrp_folder (evrp_range_analyzer.get_vr_values (false))
+      evrp_folder (evrp_range_analyzer.get_vr_values ())
     {
       need_eh_cleanup = BITMAP_ALLOC (NULL);
     }
diff --git a/gcc/testsuite/gcc.dg/builtin-unreachable-6.c b/gcc/testsuite/gcc.dg/builtin-unreachable-6.c
index 2f8ca36..1915dd1 100644
--- a/gcc/testsuite/gcc.dg/builtin-unreachable-6.c
+++ b/gcc/testsuite/gcc.dg/builtin-unreachable-6.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-fab1" } */
+/* { dg-options "-O2 -fdump-tree-fab1 -fno-tree-dominator-opts" } */
 
 void
 foo (int b, int c)
diff --git a/gcc/testsuite/gcc.dg/builtin-unreachable-6a.c b/gcc/testsuite/gcc.dg/builtin-unreachable-6a.c
new file mode 100644
index 0000000..f527f2e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/builtin-unreachable-6a.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-fab1" } */
+
+#include "builtin-unreachable-6.c"
+
+/* { dg-final { scan-tree-dump-times "lab:" 1 "fab1" } } */
+/* { dg-final { scan-tree-dump-not "__builtin_unreachable" "fab1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
index 172f203..16c79da 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
@@ -20,6 +20,4 @@ rgn_rank (rtx insn1, rtx insn2)
 }
 
 /* There should be two IF conditionals.  */
-/* We no longer record the conditional equivalence by design, thus we
-   are unable to simplify the 3rd conditional at compile time.  */
-/* { dg-final { scan-tree-dump-times "if " 2 "dom2" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "if " 2 "dom2" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c
index 3c15296..fae5bde 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c
@@ -19,9 +19,10 @@ try_combine (rtx i1, rtx newpat)
   else if (i1 && foo ());
 }
 
-/* There should be two tests against i1.  One from the hash table
-   dumps, one in the code itself.  */
-/* { dg-final { scan-tree-dump-times "if .i1_" 2 "dom2"} } */
+/* There should be four tests against i1.  One from the hash table
+   dumps, one from the EVRP analyzer one from EVRP evaluation and one
+   in the code itself.  */
+/* { dg-final { scan-tree-dump-times "if .i1_" 4 "dom2"} } */
 
 /* There should be no actual jump threads realized by DOM.  The
    legitimize jump threads are handled in VRP and those discovered
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 916d661..59a7d87 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -46,6 +46,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify.h"
 #include "tree-cfgcleanup.h"
 #include "dbgcnt.h"
+#include "alloc-pool.h"
+#include "tree-vrp.h"
+#include "vr-values.h"
+#include "gimple-ssa-evrp-analyze.h"
 
 /* This file implements optimizations on the dominator tree.  */
 
@@ -584,6 +588,9 @@ private:
   class const_and_copies *m_const_and_copies;
   class avail_exprs_stack *m_avail_exprs_stack;
 
+  /* VRP data.  */
+  class evrp_range_analyzer evrp_range_analyzer;
+
   /* Dummy condition to avoid creating lots of throw away statements.  */
   gcond *m_dummy_cond;
 
@@ -835,6 +842,9 @@ make_pass_dominator (gcc::context *ctxt)
   return new pass_dominator (ctxt);
 }
 
+/* A hack until we remove threading from tree-vrp.c and bring the
+   simplification routine into the dom_opt_dom_walker class.  */
+static class vr_values *x_vr_values;
 
 /* A trivial wrapper so that we can present the generic jump
    threading code with a simple API for simplifying statements.  */
@@ -844,7 +854,95 @@ simplify_stmt_for_jump_threading (gimple *stmt,
 				  class avail_exprs_stack *avail_exprs_stack,
 				  basic_block bb ATTRIBUTE_UNUSED)
 {
-  return avail_exprs_stack->lookup_avail_expr (stmt, false, true);
+  /* First query our hash table to see if the the expression is available
+     there.  A non-NULL return value will be either a constant or another
+     SSA_NAME.  */
+  tree cached_lhs =  avail_exprs_stack->lookup_avail_expr (stmt, false, true);
+  if (cached_lhs)
+    return cached_lhs;
+
+  /* If the hash table query failed, query VRP information.  This is
+     essentially the same as tree-vrp's simplification routine.  The
+     copy in tree-vrp is scheduled for removal in gcc-9.  */
+  if (gcond *cond_stmt = dyn_cast <gcond *> (stmt))
+    {
+      cached_lhs
+	= x_vr_values->vrp_evaluate_conditional (gimple_cond_code (cond_stmt),
+						 gimple_cond_lhs (cond_stmt),
+						 gimple_cond_rhs (cond_stmt),
+						 within_stmt);
+      return cached_lhs;
+    }
+
+  if (gswitch *switch_stmt = dyn_cast <gswitch *> (stmt))
+    {
+      tree op = gimple_switch_index (switch_stmt);
+      if (TREE_CODE (op) != SSA_NAME)
+	return NULL_TREE;
+
+      value_range *vr = x_vr_values->get_value_range (op);
+      if ((vr->type != VR_RANGE && vr->type != VR_ANTI_RANGE)
+	  || symbolic_range_p (vr))
+	return NULL_TREE;
+
+      if (vr->type == VR_RANGE)
+	{
+	  size_t i, j;
+
+	  find_case_label_range (switch_stmt, vr->min, vr->max, &i, &j);
+
+	  if (i == j)
+	    {
+	      tree label = gimple_switch_label (switch_stmt, i);
+
+	      if (CASE_HIGH (label) != NULL_TREE
+		  ? (tree_int_cst_compare (CASE_LOW (label), vr->min) <= 0
+		     && tree_int_cst_compare (CASE_HIGH (label), vr->max) >= 0)
+		  : (tree_int_cst_equal (CASE_LOW (label), vr->min)
+		     && tree_int_cst_equal (vr->min, vr->max)))
+		return label;
+
+	      if (i > j)
+		return gimple_switch_label (switch_stmt, 0);
+	    }
+	}
+
+      if (vr->type == VR_ANTI_RANGE)
+          {
+            unsigned n = gimple_switch_num_labels (switch_stmt);
+            tree min_label = gimple_switch_label (switch_stmt, 1);
+            tree max_label = gimple_switch_label (switch_stmt, n - 1);
+
+            /* The default label will be taken only if the anti-range of the
+               operand is entirely outside the bounds of all the (non-default)
+               case labels.  */
+            if (tree_int_cst_compare (vr->min, CASE_LOW (min_label)) <= 0
+                && (CASE_HIGH (max_label) != NULL_TREE
+                    ? tree_int_cst_compare (vr->max, CASE_HIGH (max_label)) >= 0
+                    : tree_int_cst_compare (vr->max, CASE_LOW (max_label)) >= 0))
+            return gimple_switch_label (switch_stmt, 0);
+          }
+	return NULL_TREE;
+    }
+
+  if (gassign *assign_stmt = dyn_cast <gassign *> (stmt))
+    {
+      tree lhs = gimple_assign_lhs (assign_stmt);
+      if (TREE_CODE (lhs) == SSA_NAME
+	  && (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+	      || POINTER_TYPE_P (TREE_TYPE (lhs)))
+	  && stmt_interesting_for_vrp (stmt))
+	{
+	  edge dummy_e;
+	  tree dummy_tree;
+	  value_range new_vr = VR_INITIALIZER;
+	  x_vr_values->extract_range_from_stmt (stmt, &dummy_e,
+					      &dummy_tree, &new_vr);
+	  if (range_int_cst_singleton_p (&new_vr))
+	    return new_vr.min;
+	}
+    }
+  return NULL;
 }
 
 /* Valueize hook for gimple_fold_stmt_to_constant_1.  */
@@ -1310,6 +1408,8 @@ dom_opt_dom_walker::before_dom_children (basic_block bb)
   if (dump_file && (dump_flags & TDF_DETAILS))
     fprintf (dump_file, "\n\nOptimizing block #%d\n\n", bb->index);
 
+  evrp_range_analyzer.enter (bb);
+
   /* Push a marker on the stacks of local information so that we know how
      far to unwind when we finalize this block.  */
   m_avail_exprs_stack->push_marker ();
@@ -1332,7 +1432,10 @@ dom_opt_dom_walker::before_dom_children (basic_block bb)
 
   edge taken_edge = NULL;
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
-    taken_edge = this->optimize_stmt (bb, gsi);
+    {
+      evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi));
+      taken_edge = this->optimize_stmt (bb, gsi);
+    }
 
   /* Now prepare to process dominated blocks.  */
   record_edge_info (bb);
@@ -1350,13 +1453,16 @@ dom_opt_dom_walker::before_dom_children (basic_block bb)
 void
 dom_opt_dom_walker::after_dom_children (basic_block bb)
 {
+  x_vr_values = evrp_range_analyzer.get_vr_values ();
   thread_outgoing_edges (bb, m_dummy_cond, m_const_and_copies,
 			 m_avail_exprs_stack,
 			 simplify_stmt_for_jump_threading);
+  x_vr_values = NULL;
 
   /* These remove expressions local to BB from the tables.  */
   m_avail_exprs_stack->pop_to_marker ();
   m_const_and_copies->pop_to_marker ();
+  evrp_range_analyzer.leave (bb);
 }
 
 /* Search for redundant computations in STMT.  If any are found, then
@@ -1902,6 +2008,31 @@ dom_opt_dom_walker::optimize_stmt (basic_block bb, gimple_stmt_iterator si)
 						 integer_zero_node));
 	      gimple_set_modified (stmt, true);
 	    }
+	  else if (TREE_CODE (lhs) == SSA_NAME)
+	    {
+	      /* Exploiting EVRP data is not yet fully integrated into DOM
+		 but we need to do something for this case to avoid regressing
+		 udr4.f90 and new1.C which have unexecutable blocks with
+		 undefined behavior that get diagnosed if they're left in the
+		 IL because we've attached range information to new
+		 SSA_NAMES.  */
+	      edge taken_edge = NULL;
+	      evrp_range_analyzer.vrp_visit_cond_stmt (as_a <gcond *> (stmt),
+						       &taken_edge);
+	      if (taken_edge)
+		{
+		  if (taken_edge->flags & EDGE_TRUE_VALUE)
+		    gimple_cond_make_true (as_a <gcond *> (stmt));
+		  else if (taken_edge->flags & EDGE_FALSE_VALUE)
+		    gimple_cond_make_false (as_a <gcond *> (stmt));
+		  else
+		    gcc_unreachable ();
+		  gimple_set_modified (stmt, true);
+		  update_stmt (stmt);
+		  cfg_altered = true;
+		  return taken_edge;
+		}
+	    }
 	}
 
       update_stmt_if_modified (stmt);

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [RFA][PATCH][tree-optimization/78496] 03/03 Embed and exploit EVRP analysis within DOM
  2017-12-04  5:59 [RFA][PATCH][tree-optimization/78496] 03/03 Embed and exploit EVRP analysis within DOM Jeff Law
@ 2017-12-04 10:42 ` Richard Biener
  2017-12-04 23:44   ` Jeff Law
  0 siblings, 1 reply; 3+ messages in thread
From: Richard Biener @ 2017-12-04 10:42 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Mon, Dec 4, 2017 at 6:59 AM, Jeff Law <law@redhat.com> wrote:
> And finally, the meat of the fix for pr78496.  This is largely the patch
> that was posed right when stage1 closed with just minor updates.
>
> This patch embeds evrp analysis into DOM and exploits it in the minimal
> way necessary to fix pr78496 without regressing other tests in the
> testsuite.
>
> The key effect we're looking for here is to pick up a slew of jump
> threads in DOM2 by exploiting the range information during jump threading.
>
> These aren't handled well in the standard tree-vrp jump threading -- we
> could abuse ASSERT_EXPRs further, but it'd just be ugly and probably
> computationally expensive.
>
> The ranges we want fall out naturally from a dominator walk, hence all
> the recent work around pulling out EVRP analysis into a little module
> other passes could reuse.
>
> With these changes we pick up all the missing jump thread opportunities
> in DOM for pr78496.  Sadly, I've been able to pull together an automated
> test that I like.  About the best I could come up with would be to
> verify that a large number of jump threads were realized during DOM2.
> But that still seems rather fragile.  I also looked at examining the
> results of PRE, but the partial redundancies that were originally the
> source of complaints in that BZ have already been fixed.  I also looked
> at perhaps looking for PHIs with constant args and then perhaps trying
> to filter those, but got nowhere.
>
> So there's no direct test for pr78496.  Sigh.
>
>
>
> Running EVRP analysis in DOM had an unintended side effect.  Namely that
> SSA_NAMEs that got created after the EVRP optimization pass would have
> range information attached to them.  That caused two warning regressions
> with -O1 code in the C++ and Fortran testsuites.  The problem is the
> ranges attached to the new SSA_NAMES were very wide and there was code
> left on an unexecutable path that called an allocation routine (C++) and
> a memset (Fortran) using those names as arguments.  The uber-wide ranges
> in those circumstances triggered the false positive warnings.
>
> By exploiting the EVRP data during the standard part of DOM to optimize
> conditionals, we're able to prove the paths in question simply aren't
> executable.  We remove them and the warnings go away.
>
> This work compromises builtin-unreachable-6.  So the original test it
> kept and -fno-tree-dominator-opts is added to keep it from being
> compromised.  A new builtin-unreachable-6a test is created to very that
> DOM does indeed remove all the __builtin_unreachable paths.
>
> This work also results in us optimizing 20030922-2.c again (conditional
> equivalences).  EVRP analysis will note the conditional equivalence.  We
> don't propagate anything from EVRP analysis at this time, but we do use
> it to try and simplify GIMPLE_CONDs to a compile-time constant which is
> what happens in this case.
>
> I plan to check this in if/when the first two patches are approved.

Looks good to me.  Btw, did you look at adding a GIMPLE testcase
for 78496?  You'd have stable IL into DOM then...

Richard.

> Jeff
>
> ps. The x_vr_values hack is scheduled to go away as we remove
> tree-vrp.c's threading implementation in gcc-9 -- we'll be able to drop
> the simplification callback and instead have simplification live within
> the dom_opt_dom_walker class where it'll have access to vr_values.  The
> analogous version in tree-vrp.c just gets deleted at that time.
>
> pps. I know there's a bogus propagation patch that I need to go back and
> re-check based on comments from Richi.  That's definitely in the queue.
>
>
>
>         * gimple-ssa-evrp-analyze.h
>         (evrp_range_analyzer::get_vr_values): Simplify.
>         * gimple-ssa-evrp-analyze.c: Corresponding changes.
>
>         * tree-ssa-dom.c: Include alloc-pool.h, tree-vrp.h, vr-values.h
>         and gimple-ssa-evrp-analyze.h.
>         (dom_opt_dom_walker class): Add evrp_range_analyzer member.
>         (simplify_stmt_for_jump_threading): Copy a blob of code from
>         tree-vrp.c to use ranges to simplify statements.
>         (dom_opt_dom_walker::before_dom_children): Call
>         evrp_range_analyzer::{enter,record_ranges_from_stmt} methods.
>         (dom_opt_dom_walker::after_dom_children): Similarly for
>         evrp_range_analyzer::leave.
>         (dom_opt_dom_walker::optimize_stmt): Use EVRP ranges to optimize
>         conditionals.
>
>         * gcc.dg/builtin-unreachable-6.c: Disable DOM.
>         * gcc.dg/builtin-unreachable-6a.c: New test.
>         * gcc.dg/tree-ssa/20030922-1.c: No longer XFAIL.
>         * gcc.dg/ssa-dom-branch-1.c: Tweak expected output.
>
> diff --git a/gcc/gimple-ssa-evrp-analyze.h b/gcc/gimple-ssa-evrp-analyze.h
> index 6216a90..4783e6f 100644
> --- a/gcc/gimple-ssa-evrp-analyze.h
> +++ b/gcc/gimple-ssa-evrp-analyze.h
> @@ -51,13 +51,7 @@ class evrp_range_analyzer
>       true, then we are transferring ownership.  Else we keep ownership.
>
>       This should be converted to a unique_ptr.  */
> -  class vr_values *get_vr_values (bool transfer)
> -    {
> -      class vr_values *x = vr_values;
> -      if (transfer)
> -       vr_values = NULL;
> -      return x;
> -    }
> +  class vr_values *get_vr_values (void) { return vr_values; }
>
>   private:
>    DISABLE_COPY_AND_ASSIGN (evrp_range_analyzer);
> diff --git a/gcc/gimple-ssa-evrp.c b/gcc/gimple-ssa-evrp.c
> index a554cf9..609ce38 100644
> --- a/gcc/gimple-ssa-evrp.c
> +++ b/gcc/gimple-ssa-evrp.c
> @@ -68,7 +68,7 @@ class evrp_dom_walker : public dom_walker
>  public:
>    evrp_dom_walker ()
>      : dom_walker (CDI_DOMINATORS),
> -      evrp_folder (evrp_range_analyzer.get_vr_values (false))
> +      evrp_folder (evrp_range_analyzer.get_vr_values ())
>      {
>        need_eh_cleanup = BITMAP_ALLOC (NULL);
>      }
> diff --git a/gcc/testsuite/gcc.dg/builtin-unreachable-6.c b/gcc/testsuite/gcc.dg/builtin-unreachable-6.c
> index 2f8ca36..1915dd1 100644
> --- a/gcc/testsuite/gcc.dg/builtin-unreachable-6.c
> +++ b/gcc/testsuite/gcc.dg/builtin-unreachable-6.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-fab1" } */
> +/* { dg-options "-O2 -fdump-tree-fab1 -fno-tree-dominator-opts" } */
>
>  void
>  foo (int b, int c)
> diff --git a/gcc/testsuite/gcc.dg/builtin-unreachable-6a.c b/gcc/testsuite/gcc.dg/builtin-unreachable-6a.c
> new file mode 100644
> index 0000000..f527f2e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/builtin-unreachable-6a.c
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-fab1" } */
> +
> +#include "builtin-unreachable-6.c"
> +
> +/* { dg-final { scan-tree-dump-times "lab:" 1 "fab1" } } */
> +/* { dg-final { scan-tree-dump-not "__builtin_unreachable" "fab1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> index 172f203..16c79da 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> @@ -20,6 +20,4 @@ rgn_rank (rtx insn1, rtx insn2)
>  }
>
>  /* There should be two IF conditionals.  */
> -/* We no longer record the conditional equivalence by design, thus we
> -   are unable to simplify the 3rd conditional at compile time.  */
> -/* { dg-final { scan-tree-dump-times "if " 2 "dom2" { xfail *-*-* } } } */
> +/* { dg-final { scan-tree-dump-times "if " 2 "dom2" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c
> index 3c15296..fae5bde 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c
> @@ -19,9 +19,10 @@ try_combine (rtx i1, rtx newpat)
>    else if (i1 && foo ());
>  }
>
> -/* There should be two tests against i1.  One from the hash table
> -   dumps, one in the code itself.  */
> -/* { dg-final { scan-tree-dump-times "if .i1_" 2 "dom2"} } */
> +/* There should be four tests against i1.  One from the hash table
> +   dumps, one from the EVRP analyzer one from EVRP evaluation and one
> +   in the code itself.  */
> +/* { dg-final { scan-tree-dump-times "if .i1_" 4 "dom2"} } */
>
>  /* There should be no actual jump threads realized by DOM.  The
>     legitimize jump threads are handled in VRP and those discovered
> diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
> index 916d661..59a7d87 100644
> --- a/gcc/tree-ssa-dom.c
> +++ b/gcc/tree-ssa-dom.c
> @@ -46,6 +46,10 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimplify.h"
>  #include "tree-cfgcleanup.h"
>  #include "dbgcnt.h"
> +#include "alloc-pool.h"
> +#include "tree-vrp.h"
> +#include "vr-values.h"
> +#include "gimple-ssa-evrp-analyze.h"
>
>  /* This file implements optimizations on the dominator tree.  */
>
> @@ -584,6 +588,9 @@ private:
>    class const_and_copies *m_const_and_copies;
>    class avail_exprs_stack *m_avail_exprs_stack;
>
> +  /* VRP data.  */
> +  class evrp_range_analyzer evrp_range_analyzer;
> +
>    /* Dummy condition to avoid creating lots of throw away statements.  */
>    gcond *m_dummy_cond;
>
> @@ -835,6 +842,9 @@ make_pass_dominator (gcc::context *ctxt)
>    return new pass_dominator (ctxt);
>  }
>
> +/* A hack until we remove threading from tree-vrp.c and bring the
> +   simplification routine into the dom_opt_dom_walker class.  */
> +static class vr_values *x_vr_values;
>
>  /* A trivial wrapper so that we can present the generic jump
>     threading code with a simple API for simplifying statements.  */
> @@ -844,7 +854,95 @@ simplify_stmt_for_jump_threading (gimple *stmt,
>                                   class avail_exprs_stack *avail_exprs_stack,
>                                   basic_block bb ATTRIBUTE_UNUSED)
>  {
> -  return avail_exprs_stack->lookup_avail_expr (stmt, false, true);
> +  /* First query our hash table to see if the the expression is available
> +     there.  A non-NULL return value will be either a constant or another
> +     SSA_NAME.  */
> +  tree cached_lhs =  avail_exprs_stack->lookup_avail_expr (stmt, false, true);
> +  if (cached_lhs)
> +    return cached_lhs;
> +
> +  /* If the hash table query failed, query VRP information.  This is
> +     essentially the same as tree-vrp's simplification routine.  The
> +     copy in tree-vrp is scheduled for removal in gcc-9.  */
> +  if (gcond *cond_stmt = dyn_cast <gcond *> (stmt))
> +    {
> +      cached_lhs
> +       = x_vr_values->vrp_evaluate_conditional (gimple_cond_code (cond_stmt),
> +                                                gimple_cond_lhs (cond_stmt),
> +                                                gimple_cond_rhs (cond_stmt),
> +                                                within_stmt);
> +      return cached_lhs;
> +    }
> +
> +  if (gswitch *switch_stmt = dyn_cast <gswitch *> (stmt))
> +    {
> +      tree op = gimple_switch_index (switch_stmt);
> +      if (TREE_CODE (op) != SSA_NAME)
> +       return NULL_TREE;
> +
> +      value_range *vr = x_vr_values->get_value_range (op);
> +      if ((vr->type != VR_RANGE && vr->type != VR_ANTI_RANGE)
> +         || symbolic_range_p (vr))
> +       return NULL_TREE;
> +
> +      if (vr->type == VR_RANGE)
> +       {
> +         size_t i, j;
> +
> +         find_case_label_range (switch_stmt, vr->min, vr->max, &i, &j);
> +
> +         if (i == j)
> +           {
> +             tree label = gimple_switch_label (switch_stmt, i);
> +
> +             if (CASE_HIGH (label) != NULL_TREE
> +                 ? (tree_int_cst_compare (CASE_LOW (label), vr->min) <= 0
> +                    && tree_int_cst_compare (CASE_HIGH (label), vr->max) >= 0)
> +                 : (tree_int_cst_equal (CASE_LOW (label), vr->min)
> +                    && tree_int_cst_equal (vr->min, vr->max)))
> +               return label;
> +
> +             if (i > j)
> +               return gimple_switch_label (switch_stmt, 0);
> +           }
> +       }
> +
> +      if (vr->type == VR_ANTI_RANGE)
> +          {
> +            unsigned n = gimple_switch_num_labels (switch_stmt);
> +            tree min_label = gimple_switch_label (switch_stmt, 1);
> +            tree max_label = gimple_switch_label (switch_stmt, n - 1);
> +
> +            /* The default label will be taken only if the anti-range of the
> +               operand is entirely outside the bounds of all the (non-default)
> +               case labels.  */
> +            if (tree_int_cst_compare (vr->min, CASE_LOW (min_label)) <= 0
> +                && (CASE_HIGH (max_label) != NULL_TREE
> +                    ? tree_int_cst_compare (vr->max, CASE_HIGH (max_label)) >= 0
> +                    : tree_int_cst_compare (vr->max, CASE_LOW (max_label)) >= 0))
> +            return gimple_switch_label (switch_stmt, 0);
> +          }
> +       return NULL_TREE;
> +    }
> +
> +  if (gassign *assign_stmt = dyn_cast <gassign *> (stmt))
> +    {
> +      tree lhs = gimple_assign_lhs (assign_stmt);
> +      if (TREE_CODE (lhs) == SSA_NAME
> +         && (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
> +             || POINTER_TYPE_P (TREE_TYPE (lhs)))
> +         && stmt_interesting_for_vrp (stmt))
> +       {
> +         edge dummy_e;
> +         tree dummy_tree;
> +         value_range new_vr = VR_INITIALIZER;
> +         x_vr_values->extract_range_from_stmt (stmt, &dummy_e,
> +                                             &dummy_tree, &new_vr);
> +         if (range_int_cst_singleton_p (&new_vr))
> +           return new_vr.min;
> +       }
> +    }
> +  return NULL;
>  }
>
>  /* Valueize hook for gimple_fold_stmt_to_constant_1.  */
> @@ -1310,6 +1408,8 @@ dom_opt_dom_walker::before_dom_children (basic_block bb)
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      fprintf (dump_file, "\n\nOptimizing block #%d\n\n", bb->index);
>
> +  evrp_range_analyzer.enter (bb);
> +
>    /* Push a marker on the stacks of local information so that we know how
>       far to unwind when we finalize this block.  */
>    m_avail_exprs_stack->push_marker ();
> @@ -1332,7 +1432,10 @@ dom_opt_dom_walker::before_dom_children (basic_block bb)
>
>    edge taken_edge = NULL;
>    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> -    taken_edge = this->optimize_stmt (bb, gsi);
> +    {
> +      evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi));
> +      taken_edge = this->optimize_stmt (bb, gsi);
> +    }
>
>    /* Now prepare to process dominated blocks.  */
>    record_edge_info (bb);
> @@ -1350,13 +1453,16 @@ dom_opt_dom_walker::before_dom_children (basic_block bb)
>  void
>  dom_opt_dom_walker::after_dom_children (basic_block bb)
>  {
> +  x_vr_values = evrp_range_analyzer.get_vr_values ();
>    thread_outgoing_edges (bb, m_dummy_cond, m_const_and_copies,
>                          m_avail_exprs_stack,
>                          simplify_stmt_for_jump_threading);
> +  x_vr_values = NULL;
>
>    /* These remove expressions local to BB from the tables.  */
>    m_avail_exprs_stack->pop_to_marker ();
>    m_const_and_copies->pop_to_marker ();
> +  evrp_range_analyzer.leave (bb);
>  }
>
>  /* Search for redundant computations in STMT.  If any are found, then
> @@ -1902,6 +2008,31 @@ dom_opt_dom_walker::optimize_stmt (basic_block bb, gimple_stmt_iterator si)
>                                                  integer_zero_node));
>               gimple_set_modified (stmt, true);
>             }
> +         else if (TREE_CODE (lhs) == SSA_NAME)
> +           {
> +             /* Exploiting EVRP data is not yet fully integrated into DOM
> +                but we need to do something for this case to avoid regressing
> +                udr4.f90 and new1.C which have unexecutable blocks with
> +                undefined behavior that get diagnosed if they're left in the
> +                IL because we've attached range information to new
> +                SSA_NAMES.  */
> +             edge taken_edge = NULL;
> +             evrp_range_analyzer.vrp_visit_cond_stmt (as_a <gcond *> (stmt),
> +                                                      &taken_edge);
> +             if (taken_edge)
> +               {
> +                 if (taken_edge->flags & EDGE_TRUE_VALUE)
> +                   gimple_cond_make_true (as_a <gcond *> (stmt));
> +                 else if (taken_edge->flags & EDGE_FALSE_VALUE)
> +                   gimple_cond_make_false (as_a <gcond *> (stmt));
> +                 else
> +                   gcc_unreachable ();
> +                 gimple_set_modified (stmt, true);
> +                 update_stmt (stmt);
> +                 cfg_altered = true;
> +                 return taken_edge;
> +               }
> +           }
>         }
>
>        update_stmt_if_modified (stmt);
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [RFA][PATCH][tree-optimization/78496] 03/03 Embed and exploit EVRP analysis within DOM
  2017-12-04 10:42 ` Richard Biener
@ 2017-12-04 23:44   ` Jeff Law
  0 siblings, 0 replies; 3+ messages in thread
From: Jeff Law @ 2017-12-04 23:44 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On 12/04/2017 03:42 AM, Richard Biener wrote:
> On Mon, Dec 4, 2017 at 6:59 AM, Jeff Law <law@redhat.com> wrote:
>> And finally, the meat of the fix for pr78496.  This is largely the patch
>> that was posed right when stage1 closed with just minor updates.
>>
>> This patch embeds evrp analysis into DOM and exploits it in the minimal
>> way necessary to fix pr78496 without regressing other tests in the
>> testsuite.
>>
>> The key effect we're looking for here is to pick up a slew of jump
>> threads in DOM2 by exploiting the range information during jump threading.
>>
>> These aren't handled well in the standard tree-vrp jump threading -- we
>> could abuse ASSERT_EXPRs further, but it'd just be ugly and probably
>> computationally expensive.
>>
>> The ranges we want fall out naturally from a dominator walk, hence all
>> the recent work around pulling out EVRP analysis into a little module
>> other passes could reuse.
>>
>> With these changes we pick up all the missing jump thread opportunities
>> in DOM for pr78496.  Sadly, I've been able to pull together an automated
>> test that I like.  About the best I could come up with would be to
>> verify that a large number of jump threads were realized during DOM2.
>> But that still seems rather fragile.  I also looked at examining the
>> results of PRE, but the partial redundancies that were originally the
>> source of complaints in that BZ have already been fixed.  I also looked
>> at perhaps looking for PHIs with constant args and then perhaps trying
>> to filter those, but got nowhere.
>>
>> So there's no direct test for pr78496.  Sigh.
>>
>>
>>
>> Running EVRP analysis in DOM had an unintended side effect.  Namely that
>> SSA_NAMEs that got created after the EVRP optimization pass would have
>> range information attached to them.  That caused two warning regressions
>> with -O1 code in the C++ and Fortran testsuites.  The problem is the
>> ranges attached to the new SSA_NAMES were very wide and there was code
>> left on an unexecutable path that called an allocation routine (C++) and
>> a memset (Fortran) using those names as arguments.  The uber-wide ranges
>> in those circumstances triggered the false positive warnings.
>>
>> By exploiting the EVRP data during the standard part of DOM to optimize
>> conditionals, we're able to prove the paths in question simply aren't
>> executable.  We remove them and the warnings go away.
>>
>> This work compromises builtin-unreachable-6.  So the original test it
>> kept and -fno-tree-dominator-opts is added to keep it from being
>> compromised.  A new builtin-unreachable-6a test is created to very that
>> DOM does indeed remove all the __builtin_unreachable paths.
>>
>> This work also results in us optimizing 20030922-2.c again (conditional
>> equivalences).  EVRP analysis will note the conditional equivalence.  We
>> don't propagate anything from EVRP analysis at this time, but we do use
>> it to try and simplify GIMPLE_CONDs to a compile-time constant which is
>> what happens in this case.
>>
>> I plan to check this in if/when the first two patches are approved.
> 
> Looks good to me.  Btw, did you look at adding a GIMPLE testcase
> for 78496?  You'd have stable IL into DOM then...
I thought about it.  But we're still stuck doing something like counting
jump threads realized.  We have several tests of this nature and they
all feel rather fragile.  So I'm definitely looking for better ways to
handle testing of this stuff.

jeff

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-12-04 23:44 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-04  5:59 [RFA][PATCH][tree-optimization/78496] 03/03 Embed and exploit EVRP analysis within DOM Jeff Law
2017-12-04 10:42 ` Richard Biener
2017-12-04 23:44   ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).