public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/2] Jump threader refactor
@ 2021-04-28 17:12 Aldy Hernandez
  2021-04-28 17:12 ` [PATCH 1/2] " Aldy Hernandez
  2021-04-28 17:12 ` [PATCH 2/2] Refactor backward threader registry and profitability code into classes Aldy Hernandez
  0 siblings, 2 replies; 7+ messages in thread
From: Aldy Hernandez @ 2021-04-28 17:12 UTC (permalink / raw)
  To: Jeff Law, GCC patches

Hi Jeff.

This is the jump threader overhaul I sent you last year, along with
further refactors for the backwards threader.

The meat of it is in the first patch, which, IIRC, you passed through
the Fedora tester multiple times.

OK for trunk?

Aldy



* [PATCH 1/2] Jump threader refactor.
  2021-04-28 17:12 [PATCH 0/2] Jump threader refactor Aldy Hernandez
@ 2021-04-28 17:12 ` Aldy Hernandez
  2021-04-30 15:53   ` Jeff Law
  2021-04-28 17:12 ` [PATCH 2/2] Refactor backward threader registry and profitability code into classes Aldy Hernandez
  1 sibling, 1 reply; 7+ messages in thread
From: Aldy Hernandez @ 2021-04-28 17:12 UTC (permalink / raw)
  To: Jeff Law, GCC patches

This is an overall refactor of the jump threader, both for the low level
bits in tree-ssa-threadupdate.* and the high level bits in
tree-ssa-threadedge.*.

There should be no functional changes.

Some of the benefits of the refactor are:

a) Eliminates some icky global state (for example the x_vr_values hack).

b) Provides some semblance of an API for the threader.

c) Makes it clearer which parts belong to the high level threader,
and which parts belong in the low level path registry and BB
threading mechanism.

d) Avoids passing a ton of variables around.

e) Provides for easier sharing with the backward threader.

f) Merges the simplify stmt code in VRP and DOM as they were nearly
identical.

This has been bootstrapped and regression tested on x86-64 Linux.
Jeff had also been testing this patch as part of his Fedora builds
throughout the off-season.

gcc/ChangeLog:

	* tree-ssa-dom.c (class dom_jump_threader_simplifier): New.
	(class dom_opt_dom_walker): Initialize some class variables.
	(pass_dominator::execute): Pass evrp_range_analyzer and
	dom_jump_threader_simplifier to dom_opt_dom_walker.
	Adjust for some functions moving into classes.
	(simplify_stmt_for_jump_threading): Adjust and move to...
	(jump_threader_simplifier::simplify): ...here.
	(dom_opt_dom_walker::before_dom_children): Adjust for
	m_evrp_range_analyzer.
	(dom_opt_dom_walker::after_dom_children): Remove x_vr_values hack.
	(test_for_singularity): Place in dom_opt_dom_walker class.
	(dom_opt_dom_walker::optimize_stmt): The argument
	evrp_range_analyzer is now a class field.
	* tree-ssa-threadbackward.c (class thread_jumps): Add m_registry.
	(thread_jumps::thread_through_all_blocks): New.
	(thread_jumps::convert_and_register_current_path): Use m_registry.
	(pass_thread_jumps::execute): Adjust for thread_through_all_blocks
	being in the threader class.
	(pass_early_thread_jumps::execute): Same.
	* tree-ssa-threadedge.c (threadedge_initialize_values): Move...
	(jump_threader::jump_threader): ...here.
	(threadedge_finalize_values): Move...
	(jump_threader::~jump_threader): ...here.
	(jump_threader::remove_jump_threads_including): New.
	(jump_threader::thread_through_all_blocks): New.
	(record_temporary_equivalences_from_phis): Move...
	(jump_threader::record_temporary_equivalences_from_phis): ...here.
	(record_temporary_equivalences_from_stmts_at_dest): Move...
	(jump_threader::record_temporary_equivalences_from_stmts_at_dest):
	...here.
	(simplify_control_stmt_condition_1): Move to jump_threader class.
	(simplify_control_stmt_condition): Move...
	(jump_threader::simplify_control_stmt_condition): ...here.
	(thread_around_empty_blocks): Move...
	(jump_threader::thread_around_empty_blocks): ...here.
	(thread_through_normal_block): Move...
	(jump_threader::thread_through_normal_block): ...here.
	(thread_across_edge): Move...
	(jump_threader::thread_across_edge): ...here.
	(thread_outgoing_edges): Move...
	(jump_threader::thread_outgoing_edges): ...here.
	* tree-ssa-threadedge.h: Move externally facing functions...
	(class jump_threader): ...here...
	(class jump_threader_simplifier): ...and here.
	* tree-ssa-threadupdate.c (struct redirection_data): Remove comment.
	(jump_thread_path_allocator::jump_thread_path_allocator): New.
	(jump_thread_path_allocator::~jump_thread_path_allocator): New.
	(jump_thread_path_allocator::allocate_thread_edge): New.
	(jump_thread_path_allocator::allocate_thread_path): New.
	(jump_thread_path_registry::jump_thread_path_registry): New.
	(jump_thread_path_registry::~jump_thread_path_registry): New.
	(jump_thread_path_registry::allocate_thread_edge): New.
	(jump_thread_path_registry::allocate_thread_path): New.
	(dump_jump_thread_path): Make extern.
	(debug (const vec<jump_thread_edge *> &path)): New.
	(struct removed_edges): Move to tree-ssa-threadupdate.h.
	(struct thread_stats_d): Remove.
	(remove_ctrl_stmt_and_useless_edges): Make static.
	(lookup_redirection_data): Move...
	(jump_thread_path_registry::lookup_redirection_data): ...here.
	(ssa_redirect_edges): Make static.
	(thread_block_1): Move...
	(jump_thread_path_registry::thread_block_1): ...here.
	(thread_block): Move...
	(jump_thread_path_registry::thread_block): ...here.
	(thread_through_loop_header): Move...
	(jump_thread_path_registry::thread_through_loop_header): ...here.
	(mark_threaded_blocks): Move...
	(jump_thread_path_registry::mark_threaded_blocks): ...here.
	(debug_path): Move...
	(jump_thread_path_registry::debug_path): ...here.
	(debug_all_paths): Move...
	(jump_thread_path_registry::dump): ...here.
	(rewire_first_differing_edge): Move...
	(jump_thread_path_registry::rewire_first_differing_edge): ...here.
	(adjust_paths_after_duplication): Move...
	(jump_thread_path_registry::adjust_paths_after_duplication): ...here.
	(duplicate_thread_path): Move...
	(jump_thread_path_registry::duplicate_thread_path): ...here.
	(remove_jump_threads_including): Move...
	(jump_thread_path_registry::remove_jump_threads_including): ...here.
	(thread_through_all_blocks): Move to...
	(jump_thread_path_registry::thread_through_all_blocks): ...here.
	(delete_jump_thread_path): Remove.
	(register_jump_thread): Move...
	(jump_thread_path_registry::register_jump_thread): ...here.
	* tree-ssa-threadupdate.h: Move externally facing functions...
	(class jump_thread_path_allocator): ...here...
	(class jump_thread_path_registry): ...and here.
	(thread_through_all_blocks): Remove.
	(struct removed_edges): New.
	(register_jump_thread): Remove.
	(remove_jump_threads_including): Remove.
	(delete_jump_thread_path): Remove.
	(remove_ctrl_stmt_and_useless_edges): Remove.
	(free_dom_edge_info): New prototype.
	* tree-vrp.c: Remove x_vr_values hack.
	(class vrp_jump_threader_simplifier): New.
	(vrp_jump_threader_simplifier::simplify): New.
	(vrp_jump_threader::vrp_jump_threader): Adjust method signature.
	Remove m_dummy_cond.
	Instantiate m_simplifier and m_threader.
	(vrp_jump_threader::thread_through_all_blocks): New.
	(vrp_jump_threader::simplify_stmt): Remove.
	(vrp_jump_threader::after_dom_children): Do not set m_dummy_cond.
	Remove x_vr_values hack.
	(execute_vrp): Adjust for thread_through_all_blocks being in a
	class.
---
 gcc/tree-ssa-dom.c            | 183 ++++++---------
 gcc/tree-ssa-threadbackward.c |  31 ++-
 gcc/tree-ssa-threadedge.c     | 416 +++++++++++++++-------------------
 gcc/tree-ssa-threadedge.h     |  80 ++++++-
 gcc/tree-ssa-threadupdate.c   | 306 +++++++++++++------------
 gcc/tree-ssa-threadupdate.h   |  85 ++++++-
 gcc/tree-vrp.c                | 157 ++++++-------
 7 files changed, 648 insertions(+), 610 deletions(-)

diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 81abf35ac02..11b86b2a326 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -585,19 +585,48 @@ record_edge_info (basic_block bb)
     }
 }
 
+class dom_jump_threader_simplifier : public jump_threader_simplifier
+{
+public:
+  dom_jump_threader_simplifier (vr_values *v,
+				avail_exprs_stack *avails)
+    : jump_threader_simplifier (v, avails) {}
+
+private:
+  tree simplify (gimple *, gimple *, basic_block);
+};
+
+tree
+dom_jump_threader_simplifier::simplify (gimple *stmt,
+					gimple *within_stmt,
+					basic_block bb)
+{
+  /* First see if the conditional is in the hash table.  */
+  tree cached_lhs =  m_avail_exprs_stack->lookup_avail_expr (stmt,
+							     false, true);
+  if (cached_lhs)
+    return cached_lhs;
+
+  return jump_threader_simplifier::simplify (stmt, within_stmt, bb);
+}
 
 class dom_opt_dom_walker : public dom_walker
 {
 public:
   dom_opt_dom_walker (cdi_direction direction,
-		      class const_and_copies *const_and_copies,
-		      class avail_exprs_stack *avail_exprs_stack,
-		      gcond *dummy_cond)
-    : dom_walker (direction, REACHABLE_BLOCKS),
-      m_const_and_copies (const_and_copies),
-      m_avail_exprs_stack (avail_exprs_stack),
-      evrp_range_analyzer (true),
-      m_dummy_cond (dummy_cond) { }
+		      jump_threader *threader,
+		      evrp_range_analyzer *analyzer,
+		      const_and_copies *const_and_copies,
+		      avail_exprs_stack *avail_exprs_stack)
+    : dom_walker (direction, REACHABLE_BLOCKS)
+    {
+      m_evrp_range_analyzer = analyzer;
+      m_dummy_cond = gimple_build_cond (NE_EXPR, integer_zero_node,
+					integer_zero_node, NULL, NULL);
+      m_const_and_copies = const_and_copies;
+      m_avail_exprs_stack = avail_exprs_stack;
+      m_threader = threader;
+    }
 
   virtual edge before_dom_children (basic_block);
   virtual void after_dom_children (basic_block);
@@ -608,9 +637,6 @@ private:
   class const_and_copies *m_const_and_copies;
   class avail_exprs_stack *m_avail_exprs_stack;
 
-  /* VRP data.  */
-  class evrp_range_analyzer evrp_range_analyzer;
-
   /* Dummy condition to avoid creating lots of throw away statements.  */
   gcond *m_dummy_cond;
 
@@ -619,6 +645,13 @@ private:
      the statement is a conditional with a statically determined
      value.  */
   edge optimize_stmt (basic_block, gimple_stmt_iterator *, bool *);
+
+
+  void test_for_singularity (gimple *, avail_exprs_stack *);
+
+  dom_jump_threader_simplifier *m_simplifier;
+  jump_threader *m_threader;
+  evrp_range_analyzer *m_evrp_range_analyzer;
 };
 
 /* Jump threading, redundancy elimination and const/copy propagation.
@@ -697,9 +730,6 @@ pass_dominator::execute (function *fun)
      LOOPS_HAVE_PREHEADERS won't be needed here.  */
   loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
 
-  /* Initialize the value-handle array.  */
-  threadedge_initialize_values ();
-
   /* We need accurate information regarding back edges in the CFG
      for jump threading; this may include back edges that are not part of
      a single loop.  */
@@ -715,12 +745,16 @@ pass_dominator::execute (function *fun)
   FOR_EACH_BB_FN (bb, fun)
     record_edge_info (bb);
 
-  gcond *dummy_cond = gimple_build_cond (NE_EXPR, integer_zero_node,
-					 integer_zero_node, NULL, NULL);
-
   /* Recursively walk the dominator tree optimizing statements.  */
-  dom_opt_dom_walker walker (CDI_DOMINATORS, const_and_copies,
-			     avail_exprs_stack, dummy_cond);
+  evrp_range_analyzer analyzer (true);
+  dom_jump_threader_simplifier simplifier (&analyzer, avail_exprs_stack);
+  jump_threader threader (const_and_copies, avail_exprs_stack,
+			  &simplifier, &analyzer);
+  dom_opt_dom_walker walker (CDI_DOMINATORS,
+			     &threader,
+			     &analyzer,
+			     const_and_copies,
+			     avail_exprs_stack);
   walker.walk (fun->cfg->x_entry_block_ptr);
 
   /* Look for blocks where we cleared EDGE_EXECUTABLE on an outgoing
@@ -749,7 +783,7 @@ pass_dominator::execute (function *fun)
 	     containing any edge leaving BB.  */
 	  if (found)
 	    FOR_EACH_EDGE (e, ei, bb->succs)
-	      remove_jump_threads_including (e);
+	      threader.remove_jump_threads_including (e);
 	}
     }
 
@@ -773,7 +807,7 @@ pass_dominator::execute (function *fun)
   free_all_edge_infos ();
 
   /* Thread jumps, creating duplicate blocks as needed.  */
-  cfg_altered |= thread_through_all_blocks (may_peel_loop_headers_p);
+  cfg_altered |= threader.thread_through_all_blocks (may_peel_loop_headers_p);
 
   if (cfg_altered)
     free_dominance_info (CDI_DOMINATORS);
@@ -849,9 +883,6 @@ pass_dominator::execute (function *fun)
   delete avail_exprs_stack;
   delete const_and_copies;
 
-  /* Free the value-handle array.  */
-  threadedge_finalize_values ();
-
   return 0;
 }
 
@@ -863,72 +894,6 @@ make_pass_dominator (gcc::context *ctxt)
   return new pass_dominator (ctxt);
 }
 
-/* A hack until we remove threading from tree-vrp.c and bring the
-   simplification routine into the dom_opt_dom_walker class.  */
-static class vr_values *x_vr_values;
-
-/* A trivial wrapper so that we can present the generic jump
-   threading code with a simple API for simplifying statements.
-
-   ?? This should be cleaned up.  There's a virtually identical copy
-   of this function in tree-vrp.c.  */
-
-static tree
-simplify_stmt_for_jump_threading (gimple *stmt,
-				  gimple *within_stmt ATTRIBUTE_UNUSED,
-				  class avail_exprs_stack *avail_exprs_stack,
-				  basic_block bb ATTRIBUTE_UNUSED)
-{
-  /* First query our hash table to see if the expression is available
-     there.  A non-NULL return value will be either a constant or another
-     SSA_NAME.  */
-  tree cached_lhs =  avail_exprs_stack->lookup_avail_expr (stmt, false, true);
-  if (cached_lhs)
-    return cached_lhs;
-
-  /* If the hash table query failed, query VRP information.  This is
-     essentially the same as tree-vrp's simplification routine.  The
-     copy in tree-vrp is scheduled for removal in gcc-9.  */
-  if (gcond *cond_stmt = dyn_cast <gcond *> (stmt))
-    {
-      simplify_using_ranges simplifier (x_vr_values);
-      return simplifier.vrp_evaluate_conditional (gimple_cond_code (cond_stmt),
-						  gimple_cond_lhs (cond_stmt),
-						  gimple_cond_rhs (cond_stmt),
-						  within_stmt);
-    }
-
-  if (gswitch *switch_stmt = dyn_cast <gswitch *> (stmt))
-    {
-      tree op = gimple_switch_index (switch_stmt);
-      if (TREE_CODE (op) != SSA_NAME)
-	return NULL_TREE;
-
-      const value_range_equiv *vr = x_vr_values->get_value_range (op);
-      return find_case_label_range (switch_stmt, vr);
-    }
-
-  if (gassign *assign_stmt = dyn_cast <gassign *> (stmt))
-    {
-      tree lhs = gimple_assign_lhs (assign_stmt);
-      if (TREE_CODE (lhs) == SSA_NAME
-	  && (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
-	      || POINTER_TYPE_P (TREE_TYPE (lhs)))
-	  && stmt_interesting_for_vrp (stmt))
-	{
-	  edge dummy_e;
-	  tree dummy_tree;
-	  value_range_equiv new_vr;
-	  x_vr_values->extract_range_from_stmt (stmt, &dummy_e,
-						&dummy_tree, &new_vr);
-	  tree singleton;
-	  if (new_vr.singleton_p (&singleton))
-	    return singleton;
-	}
-    }
-  return NULL;
-}
-
 /* Valueize hook for gimple_fold_stmt_to_constant_1.  */
 
 static tree
@@ -1417,7 +1382,7 @@ dom_opt_dom_walker::before_dom_children (basic_block bb)
   if (dump_file && (dump_flags & TDF_DETAILS))
     fprintf (dump_file, "\n\nOptimizing block #%d\n\n", bb->index);
 
-  evrp_range_analyzer.enter (bb);
+  m_evrp_range_analyzer->enter (bb);
 
   /* Push a marker on the stacks of local information so that we know how
      far to unwind when we finalize this block.  */
@@ -1455,7 +1420,7 @@ dom_opt_dom_walker::before_dom_children (basic_block bb)
 	}
 
       /* Compute range information and optimize the stmt.  */
-      evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi), false);
+      m_evrp_range_analyzer->record_ranges_from_stmt (gsi_stmt (gsi), false);
       bool removed_p = false;
       taken_edge = this->optimize_stmt (bb, &gsi, &removed_p);
       if (!removed_p)
@@ -1500,17 +1465,10 @@ dom_opt_dom_walker::before_dom_children (basic_block bb)
 void
 dom_opt_dom_walker::after_dom_children (basic_block bb)
 {
-  x_vr_values = &evrp_range_analyzer;
-  thread_outgoing_edges (bb, m_dummy_cond, m_const_and_copies,
-			 m_avail_exprs_stack,
-			 &evrp_range_analyzer,
-			 simplify_stmt_for_jump_threading);
-  x_vr_values = NULL;
-
-  /* These remove expressions local to BB from the tables.  */
+  m_threader->thread_outgoing_edges (bb);
   m_avail_exprs_stack->pop_to_marker ();
   m_const_and_copies->pop_to_marker ();
-  evrp_range_analyzer.leave (bb);
+  m_evrp_range_analyzer->leave (bb);
 }
 
 /* Search for redundant computations in STMT.  If any are found, then
@@ -1849,9 +1807,9 @@ cprop_into_stmt (gimple *stmt, vr_values *vr_values)
 
    This is similar to code in VRP.  */
 
-static void
-test_for_singularity (gimple *stmt, gcond *dummy_cond,
-		      avail_exprs_stack *avail_exprs_stack)
+void
+dom_opt_dom_walker::test_for_singularity (gimple *stmt,
+					  avail_exprs_stack *avail_exprs_stack)
 {
   /* We want to support gimple conditionals as well as assignments
      where the RHS contains a conditional.  */
@@ -1897,11 +1855,12 @@ test_for_singularity (gimple *stmt, gcond *dummy_cond,
 	    test_code = GE_EXPR;
 
 	  /* Update the dummy statement so we can query the hash tables.  */
-	  gimple_cond_set_code (dummy_cond, test_code);
-	  gimple_cond_set_lhs (dummy_cond, lhs);
-	  gimple_cond_set_rhs (dummy_cond, rhs);
+	  gimple_cond_set_code (m_dummy_cond, test_code);
+	  gimple_cond_set_lhs (m_dummy_cond, lhs);
+	  gimple_cond_set_rhs (m_dummy_cond, rhs);
 	  tree cached_lhs
-	    = avail_exprs_stack->lookup_avail_expr (dummy_cond, false, false);
+	    = avail_exprs_stack->lookup_avail_expr (m_dummy_cond,
+						    false, false);
 
 	  /* If the lookup returned 1 (true), then the expression we
 	     queried was in the hash table.  As a result there is only
@@ -1970,7 +1929,7 @@ dom_opt_dom_walker::optimize_stmt (basic_block bb, gimple_stmt_iterator *si,
   opt_stats.num_stmts++;
 
   /* Const/copy propagate into USES, VUSES and the RHS of VDEFs.  */
-  cprop_into_stmt (stmt, &evrp_range_analyzer);
+  cprop_into_stmt (stmt, m_evrp_range_analyzer);
 
   /* If the statement has been modified with constant replacements,
      fold its RHS before checking for redundant computations.  */
@@ -2068,8 +2027,8 @@ dom_opt_dom_walker::optimize_stmt (basic_block bb, gimple_stmt_iterator *si,
 		 SSA_NAMES.  */
 	      update_stmt_if_modified (stmt);
 	      edge taken_edge = NULL;
-	      evrp_range_analyzer.vrp_visit_cond_stmt (as_a <gcond *> (stmt),
-						       &taken_edge);
+	      m_evrp_range_analyzer->vrp_visit_cond_stmt
+		(as_a <gcond *> (stmt), &taken_edge);
 	      if (taken_edge)
 		{
 		  if (taken_edge->flags & EDGE_TRUE_VALUE)
@@ -2136,7 +2095,7 @@ dom_opt_dom_walker::optimize_stmt (basic_block bb, gimple_stmt_iterator *si,
       /* If this statement was not redundant, we may still be able to simplify
 	 it, which may in turn allow other part of DOM or other passes to do
 	 a better job.  */
-      test_for_singularity (stmt, m_dummy_cond, m_avail_exprs_stack);
+      test_for_singularity (stmt, m_avail_exprs_stack);
     }
 
   /* Record any additional equivalences created by this statement.  */
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 515a94b4670..428cf0767c6 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -39,9 +39,11 @@ along with GCC; see the file COPYING3.  If not see
 
 class thread_jumps
 {
- public:
+public:
   void find_jump_threads_backwards (basic_block bb, bool speed_p);
- private:
+  bool thread_through_all_blocks ();
+
+private:
   edge profitable_jump_thread_path (basic_block bbi, tree name, tree arg,
 				    bool *creates_irreducible_loop);
   void convert_and_register_current_path (edge taken_edge);
@@ -65,8 +67,16 @@ class thread_jumps
   /* Indicate that we could increase code size to improve the
      code path.  */
   bool m_speed_p;
+
+  jump_thread_path_registry m_registry;
 };
 
+bool
+thread_jumps::thread_through_all_blocks ()
+{
+  return m_registry.thread_through_all_blocks (true);
+}
+
 /* Simple helper to get the last statement from BB, which is assumed
    to be a control statement.   Return NULL if the last statement is
    not a control statement.  */
@@ -459,7 +469,7 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name,
 void
 thread_jumps::convert_and_register_current_path (edge taken_edge)
 {
-  vec<jump_thread_edge *> *jump_thread_path = new vec<jump_thread_edge *> ();
+  vec<jump_thread_edge *> *path = m_registry.allocate_thread_path ();
 
   /* Record the edges between the blocks in PATH.  */
   for (unsigned int j = 0; j + 1 < m_path.length (); j++)
@@ -469,16 +479,17 @@ thread_jumps::convert_and_register_current_path (edge taken_edge)
 
       edge e = find_edge (bb1, bb2);
       gcc_assert (e);
-      jump_thread_edge *x = new jump_thread_edge (e, EDGE_FSM_THREAD);
-      jump_thread_path->safe_push (x);
+      jump_thread_edge *x
+	= m_registry.allocate_thread_edge (e, EDGE_FSM_THREAD);
+      path->safe_push (x);
     }
 
   /* Add the edge taken when the control variable has value ARG.  */
   jump_thread_edge *x
-    = new jump_thread_edge (taken_edge, EDGE_NO_COPY_SRC_BLOCK);
-  jump_thread_path->safe_push (x);
+    = m_registry.allocate_thread_edge (taken_edge, EDGE_NO_COPY_SRC_BLOCK);
+  path->safe_push (x);
 
-  register_jump_thread (jump_thread_path);
+  m_registry.register_jump_thread (path);
   --m_max_threaded_paths;
 }
 
@@ -827,7 +838,7 @@ pass_thread_jumps::execute (function *fun)
       if (EDGE_COUNT (bb->succs) > 1)
 	threader.find_jump_threads_backwards (bb, true);
     }
-  bool changed = thread_through_all_blocks (true);
+  bool changed = threader.thread_through_all_blocks ();
 
   loop_optimizer_finalize ();
   return changed ? TODO_cleanup_cfg : 0;
@@ -888,7 +899,7 @@ pass_early_thread_jumps::execute (function *fun)
       if (EDGE_COUNT (bb->succs) > 1)
 	threader.find_jump_threads_backwards (bb, false);
     }
-  thread_through_all_blocks (true);
+  threader.thread_through_all_blocks ();
 
   loop_optimizer_finalize ();
   return 0;
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 5e152e7608e..6ce32644aa5 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -49,10 +49,6 @@ static int stmt_count;
 /* Array to record value-handles per SSA_NAME.  */
 vec<tree> ssa_name_values;
 
-typedef tree (pfn_simplify) (gimple *, gimple *,
-			     class avail_exprs_stack *,
-			     basic_block);
-
 /* Set the value for the SSA name NAME to VALUE.  */
 
 void
@@ -65,25 +61,48 @@ set_ssa_name_value (tree name, tree value)
   ssa_name_values[SSA_NAME_VERSION (name)] = value;
 }
 
-/* Initialize the per SSA_NAME value-handles array.  Returns it.  */
-void
-threadedge_initialize_values (void)
+jump_threader::jump_threader (const_and_copies *copies,
+			      avail_exprs_stack *avails,
+			      jump_threader_simplifier *simplifier,
+			      evrp_range_analyzer *analyzer)
 {
+  /* Initialize the per SSA_NAME value-handles array.  */
   gcc_assert (!ssa_name_values.exists ());
   ssa_name_values.create (num_ssa_names);
+
+  dummy_cond = gimple_build_cond (NE_EXPR, integer_zero_node,
+				  integer_zero_node, NULL, NULL);
+
+  m_const_and_copies = copies;
+  m_avail_exprs_stack = avails;
+  m_registry = new jump_thread_path_registry ();
+  m_simplifier = simplifier;
+  m_evrp_range_analyzer = analyzer;
 }
 
-/* Free the per SSA_NAME value-handle array.  */
-void
-threadedge_finalize_values (void)
+jump_threader::~jump_threader (void)
 {
   ssa_name_values.release ();
+  ggc_free (dummy_cond);
+  delete m_registry;
+}
+
+void
+jump_threader::remove_jump_threads_including (edge_def *e)
+{
+  m_registry->remove_jump_threads_including (e);
+}
+
+bool
+jump_threader::thread_through_all_blocks (bool may_peel_loop_headers)
+{
+  return m_registry->thread_through_all_blocks (may_peel_loop_headers);
 }
 
 /* Return TRUE if we may be able to thread an incoming edge into
    BB to an outgoing edge from BB.  Return FALSE otherwise.  */
 
-bool
+static bool
 potentially_threadable_block (basic_block bb)
 {
   gimple_stmt_iterator gsi;
@@ -116,16 +135,13 @@ potentially_threadable_block (basic_block bb)
 }
 
 /* Record temporary equivalences created by PHIs at the target of the
-   edge E.  Record unwind information for the equivalences into
-   CONST_AND_COPIES and EVRP_RANGE_DATA.
+   edge E.
 
    If a PHI which prevents threading is encountered, then return FALSE
    indicating we should not thread this edge, else return TRUE.  */
 
-static bool
-record_temporary_equivalences_from_phis (edge e,
-    const_and_copies *const_and_copies,
-    evrp_range_analyzer *evrp_range_analyzer)
+bool
+jump_threader::record_temporary_equivalences_from_phis (edge e)
 {
   gphi_iterator gsi;
 
@@ -152,19 +168,19 @@ record_temporary_equivalences_from_phis (edge e,
       if (!virtual_operand_p (dst))
 	stmt_count++;
 
-      const_and_copies->record_const_or_copy (dst, src);
+      m_const_and_copies->record_const_or_copy (dst, src);
 
       /* Also update the value range associated with DST, using
 	 the range from SRC.
 
 	 Note that even if SRC is a constant we need to set a suitable
 	 output range so that VR_UNDEFINED ranges do not leak through.  */
-      if (evrp_range_analyzer)
+      if (m_evrp_range_analyzer)
 	{
 	  /* Get an empty new VR we can pass to update_value_range and save
 	     away in the VR stack.  */
 	  value_range_equiv *new_vr
-			  = evrp_range_analyzer->allocate_value_range_equiv ();
+	    = m_evrp_range_analyzer->allocate_value_range_equiv ();
 	  new (new_vr) value_range_equiv ();
 
 	  /* There are three cases to consider:
@@ -178,14 +194,14 @@ record_temporary_equivalences_from_phis (edge e,
 	       Otherwise set NEW_VR to varying.  This may be overly
 	       conservative.  */
 	  if (TREE_CODE (src) == SSA_NAME)
-	    new_vr->deep_copy (evrp_range_analyzer->get_value_range (src));
+	    new_vr->deep_copy (m_evrp_range_analyzer->get_value_range (src));
 	  else if (TREE_CODE (src) == INTEGER_CST)
 	    new_vr->set (src);
 	  else
 	    new_vr->set_varying (TREE_TYPE (src));
 
 	  /* This is a temporary range for DST, so push it.  */
-	  evrp_range_analyzer->push_value_range (dst, new_vr);
+	  m_evrp_range_analyzer->push_value_range (dst, new_vr);
 	}
     }
   return true;
@@ -210,8 +226,8 @@ threadedge_valueize (tree t)
 
    Record unwind information for temporary equivalences onto STACK.
 
-   Use SIMPLIFY (a pointer to a callback function) to further simplify
-   statements using pass specific information.
+   Uses M_SIMPLIFIER to further simplify statements using pass specific
+   information.
 
    We might consider marking just those statements which ultimately
    feed the COND_EXPR.  It's not clear if the overhead of bookkeeping
@@ -222,12 +238,8 @@ threadedge_valueize (tree t)
    a context sensitive equivalence which may help us simplify
    later statements in E->dest.  */
 
-static gimple *
-record_temporary_equivalences_from_stmts_at_dest (edge e,
-    const_and_copies *const_and_copies,
-    avail_exprs_stack *avail_exprs_stack,
-    evrp_range_analyzer *evrp_range_analyzer,
-    pfn_simplify simplify)
+gimple *
+jump_threader::record_temporary_equivalences_from_stmts_at_dest (edge e)
 {
   gimple *stmt = NULL;
   gimple_stmt_iterator gsi;
@@ -294,8 +306,8 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
 
       /* These are temporary ranges, do nto reflect them back into
 	 the global range data.  */
-      if (evrp_range_analyzer)
-	evrp_range_analyzer->record_ranges_from_stmt (stmt, true);
+      if (m_evrp_range_analyzer)
+	m_evrp_range_analyzer->record_ranges_from_stmt (stmt, true);
 
       /* If this is not a statement that sets an SSA_NAME to a new
 	 value, then do not try to simplify this statement as it will
@@ -396,7 +408,7 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
 		    SET_USE (use_p, tmp);
 		}
 
-	      cached_lhs = (*simplify) (stmt, stmt, avail_exprs_stack, e->src);
+	      cached_lhs = m_simplifier->simplify (stmt, stmt, e->src);
 
 	      /* Restore the statement's original uses/defs.  */
 	      i = 0;
@@ -410,38 +422,23 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
       if (cached_lhs
 	  && (TREE_CODE (cached_lhs) == SSA_NAME
 	      || is_gimple_min_invariant (cached_lhs)))
-	const_and_copies->record_const_or_copy (gimple_get_lhs (stmt),
-						cached_lhs);
+	m_const_and_copies->record_const_or_copy (gimple_get_lhs (stmt),
+						  cached_lhs);
     }
   return stmt;
 }
 
-static tree simplify_control_stmt_condition_1 (edge, gimple *,
-					       class avail_exprs_stack *,
-					       tree, enum tree_code, tree,
-					       gcond *, pfn_simplify,
-					       unsigned);
-
 /* Simplify the control statement at the end of the block E->dest.
 
-   To avoid allocating memory unnecessarily, a scratch GIMPLE_COND
-   is available to use/clobber in DUMMY_COND.
-
    Use SIMPLIFY (a pointer to a callback function) to further simplify
    a condition using pass specific information.
 
    Return the simplified condition or NULL if simplification could
    not be performed.  When simplifying a GIMPLE_SWITCH, we may return
-   the CASE_LABEL_EXPR that will be taken.
+   the CASE_LABEL_EXPR that will be taken.  */
 
-   The available expression table is referenced via AVAIL_EXPRS_STACK.  */
-
-static tree
-simplify_control_stmt_condition (edge e,
-				 gimple *stmt,
-				 class avail_exprs_stack *avail_exprs_stack,
-				 gcond *dummy_cond,
-				 pfn_simplify simplify)
+tree
+jump_threader::simplify_control_stmt_condition (edge e, gimple *stmt)
 {
   tree cond, cached_lhs;
   enum gimple_code code = gimple_code (stmt);
@@ -485,9 +482,7 @@ simplify_control_stmt_condition (edge e,
       const unsigned recursion_limit = 4;
 
       cached_lhs
-	= simplify_control_stmt_condition_1 (e, stmt, avail_exprs_stack,
-					     op0, cond_code, op1,
-					     dummy_cond, simplify,
+	= simplify_control_stmt_condition_1 (e, stmt, op0, cond_code, op1,
 					     recursion_limit);
 
       /* If we were testing an integer/pointer against a constant, then
@@ -557,12 +552,11 @@ simplify_control_stmt_condition (edge e,
 		 the label that is proven to be taken.  */
 	      gswitch *dummy_switch = as_a<gswitch *> (gimple_copy (stmt));
 	      gimple_switch_set_index (dummy_switch, cached_lhs);
-	      cached_lhs = (*simplify) (dummy_switch, stmt,
-					avail_exprs_stack, e->src);
+	      cached_lhs = m_simplifier->simplify (dummy_switch, stmt, e->src);
 	      ggc_free (dummy_switch);
 	    }
 	  else
-	    cached_lhs = (*simplify) (stmt, stmt, avail_exprs_stack, e->src);
+	    cached_lhs = m_simplifier->simplify (stmt, stmt, e->src);
 	}
 
       /* We couldn't find an invariant.  But, callers of this
@@ -579,16 +573,14 @@ simplify_control_stmt_condition (edge e,
 
 /* Recursive helper for simplify_control_stmt_condition.  */
 
-static tree
-simplify_control_stmt_condition_1 (edge e,
-				   gimple *stmt,
-				   class avail_exprs_stack *avail_exprs_stack,
-				   tree op0,
-				   enum tree_code cond_code,
-				   tree op1,
-				   gcond *dummy_cond,
-				   pfn_simplify simplify,
-				   unsigned limit)
+tree
+jump_threader::simplify_control_stmt_condition_1
+					(edge e,
+					 gimple *stmt,
+					 tree op0,
+					 enum tree_code cond_code,
+					 tree op1,
+					 unsigned limit)
 {
   if (limit == 0)
     return NULL_TREE;
@@ -623,9 +615,8 @@ simplify_control_stmt_condition_1 (edge e,
 
 	  /* Is A != 0 ?  */
 	  const tree res1
-	    = simplify_control_stmt_condition_1 (e, def_stmt, avail_exprs_stack,
+	    = simplify_control_stmt_condition_1 (e, def_stmt,
 						 rhs1, NE_EXPR, op1,
-						 dummy_cond, simplify,
 						 limit - 1);
 	  if (res1 == NULL_TREE)
 	    ;
@@ -650,9 +641,8 @@ simplify_control_stmt_condition_1 (edge e,
 
 	  /* Is B != 0 ?  */
 	  const tree res2
-	    = simplify_control_stmt_condition_1 (e, def_stmt, avail_exprs_stack,
+	    = simplify_control_stmt_condition_1 (e, def_stmt,
 						 rhs2, NE_EXPR, op1,
-						 dummy_cond, simplify,
 						 limit - 1);
 	  if (res2 == NULL_TREE)
 	    ;
@@ -715,9 +705,8 @@ simplify_control_stmt_condition_1 (edge e,
 	    new_cond = invert_tree_comparison (new_cond, false);
 
 	  tree res
-	    = simplify_control_stmt_condition_1 (e, def_stmt, avail_exprs_stack,
+	    = simplify_control_stmt_condition_1 (e, def_stmt,
 						 rhs1, new_cond, rhs2,
-						 dummy_cond, simplify,
 						 limit - 1);
 	  if (res != NULL_TREE && is_gimple_min_invariant (res))
 	    return res;
@@ -744,7 +733,7 @@ simplify_control_stmt_condition_1 (edge e,
      then use the pass specific callback to simplify the condition.  */
   if (!res
       || !is_gimple_min_invariant (res))
-    res = (*simplify) (dummy_cond, stmt, avail_exprs_stack, e->src);
+    res = m_simplifier->simplify (dummy_cond, stmt, e->src);
 
   return res;
 }
@@ -893,18 +882,12 @@ propagate_threaded_block_debug_into (basic_block dest, basic_block src)
    returning TRUE from the toplevel call.   Otherwise do nothing and
    return false.
 
-   DUMMY_COND, SIMPLIFY are used to try and simplify the condition at the
-   end of TAKEN_EDGE->dest.
-
    The available expression table is referenced via AVAIL_EXPRS_STACK.  */
 
-static bool
-thread_around_empty_blocks (edge taken_edge,
-			    gcond *dummy_cond,
-			    class avail_exprs_stack *avail_exprs_stack,
-			    pfn_simplify simplify,
-			    bitmap visited,
-			    vec<jump_thread_edge *> *path)
+bool
+jump_threader::thread_around_empty_blocks (vec<jump_thread_edge *> *path,
+					   edge taken_edge,
+					   bitmap visited)
 {
   basic_block bb = taken_edge->dest;
   gimple_stmt_iterator gsi;
@@ -946,15 +929,11 @@ thread_around_empty_blocks (edge taken_edge,
 	  if (!bitmap_bit_p (visited, taken_edge->dest->index))
 	    {
 	      jump_thread_edge *x
-		= new jump_thread_edge (taken_edge, EDGE_NO_COPY_SRC_BLOCK);
+		= m_registry->allocate_thread_edge (taken_edge,
+						    EDGE_NO_COPY_SRC_BLOCK);
 	      path->safe_push (x);
 	      bitmap_set_bit (visited, taken_edge->dest->index);
-	      return thread_around_empty_blocks (taken_edge,
-						 dummy_cond,
-						 avail_exprs_stack,
-						 simplify,
-						 visited,
-						 path);
+	      return thread_around_empty_blocks (path, taken_edge, visited);
 	    }
 	}
 
@@ -971,9 +950,7 @@ thread_around_empty_blocks (edge taken_edge,
     return false;
 
   /* Extract and simplify the condition.  */
-  cond = simplify_control_stmt_condition (taken_edge, stmt,
-					  avail_exprs_stack, dummy_cond,
-					  simplify);
+  cond = simplify_control_stmt_condition (taken_edge, stmt);
 
   /* If the condition can be statically computed and we have not already
      visited the destination edge, then add the taken edge to our thread
@@ -996,15 +973,11 @@ thread_around_empty_blocks (edge taken_edge,
       bitmap_set_bit (visited, taken_edge->dest->index);
 
       jump_thread_edge *x
-	= new jump_thread_edge (taken_edge, EDGE_NO_COPY_SRC_BLOCK);
+	= m_registry->allocate_thread_edge (taken_edge,
+					    EDGE_NO_COPY_SRC_BLOCK);
       path->safe_push (x);
 
-      thread_around_empty_blocks (taken_edge,
-				  dummy_cond,
-				  avail_exprs_stack,
-				  simplify,
-				  visited,
-				  path);
+      thread_around_empty_blocks (path, taken_edge, visited);
       return true;
     }
 
@@ -1024,14 +997,9 @@ thread_around_empty_blocks (edge taken_edge,
    limited in that case to avoid short-circuiting the loop
    incorrectly.
 
-   DUMMY_COND is a shared cond_expr used by condition simplification as scratch,
-   to avoid allocating memory.
-
    STACK is used to undo temporary equivalences created during the walk of
    E->dest.
 
-   SIMPLIFY is a pass-specific function used to simplify statements.
-
    Our caller is responsible for restoring the state of the expression
    and const_and_copies stacks.
 
@@ -1040,34 +1008,23 @@ thread_around_empty_blocks (edge taken_edge,
    negative indicates the block should not be duplicated and thus is not
    suitable for a joiner in a jump threading path.  */
 
-static int
-thread_through_normal_block (edge e,
-			     gcond *dummy_cond,
-			     const_and_copies *const_and_copies,
-			     avail_exprs_stack *avail_exprs_stack,
-			     evrp_range_analyzer *evrp_range_analyzer,
-			     pfn_simplify simplify,
-			     vec<jump_thread_edge *> *path,
-			     bitmap visited)
+int
+jump_threader::thread_through_normal_block (vec<jump_thread_edge *> *path,
+					    edge e, bitmap visited)
 {
   /* We want to record any equivalences created by traversing E.  */
-  record_temporary_equivalences (e, const_and_copies, avail_exprs_stack);
+  record_temporary_equivalences (e, m_const_and_copies, m_avail_exprs_stack);
 
   /* PHIs create temporary equivalences.
      Note that if we found a PHI that made the block non-threadable, then
      we need to bubble that up to our caller in the same manner we do
      when we prematurely stop processing statements below.  */
-  if (!record_temporary_equivalences_from_phis (e, const_and_copies,
-					        evrp_range_analyzer))
+  if (!record_temporary_equivalences_from_phis (e))
     return -1;
 
   /* Now walk each statement recording any context sensitive
      temporary equivalences we can detect.  */
-  gimple *stmt
-    = record_temporary_equivalences_from_stmts_at_dest (e, const_and_copies,
-							avail_exprs_stack,
-							evrp_range_analyzer,
-							simplify);
+  gimple *stmt = record_temporary_equivalences_from_stmts_at_dest (e);
 
   /* There's two reasons STMT might be null, and distinguishing
      between them is important.
@@ -1104,8 +1061,7 @@ thread_through_normal_block (edge e,
       tree cond;
 
       /* Extract and simplify the condition.  */
-      cond = simplify_control_stmt_condition (e, stmt, avail_exprs_stack,
-					      dummy_cond, simplify);
+      cond = simplify_control_stmt_condition (e, stmt);
 
       if (!cond)
 	return 0;
@@ -1135,12 +1091,13 @@ thread_through_normal_block (edge e,
 	  if (path->length () == 0)
 	    {
               jump_thread_edge *x
-	        = new jump_thread_edge (e, EDGE_START_JUMP_THREAD);
+		= m_registry->allocate_thread_edge (e, EDGE_START_JUMP_THREAD);
 	      path->safe_push (x);
 	    }
 
 	  jump_thread_edge *x
-	    = new jump_thread_edge (taken_edge, EDGE_COPY_SRC_BLOCK);
+	    = m_registry->allocate_thread_edge (taken_edge,
+						EDGE_COPY_SRC_BLOCK);
 	  path->safe_push (x);
 
 	  /* See if we can thread through DEST as well, this helps capture
@@ -1151,12 +1108,7 @@ thread_through_normal_block (edge e,
  	     visited.  This may be overly conservative.  */
 	  bitmap_set_bit (visited, dest->index);
 	  bitmap_set_bit (visited, e->dest->index);
-	  thread_around_empty_blocks (taken_edge,
-				      dummy_cond,
-				      avail_exprs_stack,
-				      simplify,
-				      visited,
-				      path);
+	  thread_around_empty_blocks (path, taken_edge, visited);
 	  return 1;
 	}
     }
@@ -1225,49 +1177,30 @@ edge_forwards_cmp_to_conditional_jump_through_empty_bb_p (edge e)
   return true;
 }
 
-/* We are exiting E->src, see if E->dest ends with a conditional
-   jump which has a known value when reached via E.
-
-   DUMMY_COND is a shared cond_expr used by condition simplification as scratch,
-   to avoid allocating memory.
-
-   CONST_AND_COPIES is used to undo temporary equivalences created during the
-   walk of E->dest.
+/* We are exiting E->src, see if E->dest ends with a conditional jump
+   which has a known value when reached via E.  If so, thread the
+   edge.  */
 
-   The available expression table is referenced vai AVAIL_EXPRS_STACK.
-
-   SIMPLIFY is a pass-specific function used to simplify statements.  */
-
-static void
-thread_across_edge (gcond *dummy_cond,
-		    edge e,
-		    class const_and_copies *const_and_copies,
-		    class avail_exprs_stack *avail_exprs_stack,
-		    class evrp_range_analyzer *evrp_range_analyzer,
-		    pfn_simplify simplify)
+void
+jump_threader::thread_across_edge (edge e)
 {
   bitmap visited = BITMAP_ALLOC (NULL);
 
-  const_and_copies->push_marker ();
-  avail_exprs_stack->push_marker ();
-  if (evrp_range_analyzer)
-    evrp_range_analyzer->push_marker ();
+  m_const_and_copies->push_marker ();
+  m_avail_exprs_stack->push_marker ();
+  if (m_evrp_range_analyzer)
+    m_evrp_range_analyzer->push_marker ();
 
   stmt_count = 0;
 
-  vec<jump_thread_edge *> *path = new vec<jump_thread_edge *> ();
+  vec<jump_thread_edge *> *path = m_registry->allocate_thread_path ();
   bitmap_clear (visited);
   bitmap_set_bit (visited, e->src->index);
   bitmap_set_bit (visited, e->dest->index);
 
   int threaded;
   if ((e->flags & EDGE_DFS_BACK) == 0)
-    threaded = thread_through_normal_block (e, dummy_cond,
-					    const_and_copies,
-					    avail_exprs_stack,
-					    evrp_range_analyzer,
-					    simplify, path,
-					    visited);
+    threaded = thread_through_normal_block (path, e, visited);
   else
     threaded = 0;
 
@@ -1275,12 +1208,12 @@ thread_across_edge (gcond *dummy_cond,
     {
       propagate_threaded_block_debug_into (path->last ()->e->dest,
 					   e->dest);
-      const_and_copies->pop_to_marker ();
-      avail_exprs_stack->pop_to_marker ();
-      if (evrp_range_analyzer)
-	evrp_range_analyzer->pop_to_marker ();
+      m_const_and_copies->pop_to_marker ();
+      m_avail_exprs_stack->pop_to_marker ();
+      if (m_evrp_range_analyzer)
+	m_evrp_range_analyzer->pop_to_marker ();
       BITMAP_FREE (visited);
-      register_jump_thread (path);
+      m_registry->register_jump_thread (path);
       return;
     }
   else
@@ -1290,7 +1223,6 @@ thread_across_edge (gcond *dummy_cond,
 	 through the vector entries.  */
       gcc_assert (path->length () == 0);
       path->release ();
-      delete path;
 
       /* A negative status indicates the target block was deemed too big to
 	 duplicate.  Just quit now rather than trying to use the block as
@@ -1302,10 +1234,10 @@ thread_across_edge (gcond *dummy_cond,
       if (threaded < 0)
 	{
 	  BITMAP_FREE (visited);
-	  const_and_copies->pop_to_marker ();
-          avail_exprs_stack->pop_to_marker ();
-	  if (evrp_range_analyzer)
-	    evrp_range_analyzer->pop_to_marker ();
+	  m_const_and_copies->pop_to_marker ();
+	  m_avail_exprs_stack->pop_to_marker ();
+	  if (m_evrp_range_analyzer)
+	    m_evrp_range_analyzer->pop_to_marker ();
 	  return;
 	}
     }
@@ -1331,10 +1263,10 @@ thread_across_edge (gcond *dummy_cond,
     FOR_EACH_EDGE (taken_edge, ei, e->dest->succs)
       if (taken_edge->flags & EDGE_COMPLEX)
 	{
-	  const_and_copies->pop_to_marker ();
-          avail_exprs_stack->pop_to_marker ();
-	  if (evrp_range_analyzer)
-	    evrp_range_analyzer->pop_to_marker ();
+	  m_const_and_copies->pop_to_marker ();
+	  m_avail_exprs_stack->pop_to_marker ();
+	  if (m_evrp_range_analyzer)
+	    m_evrp_range_analyzer->pop_to_marker ();
 	  BITMAP_FREE (visited);
 	  return;
 	}
@@ -1348,39 +1280,32 @@ thread_across_edge (gcond *dummy_cond,
 
 	/* Push a fresh marker so we can unwind the equivalences created
 	   for each of E->dest's successors.  */
-	const_and_copies->push_marker ();
-	avail_exprs_stack->push_marker ();
-	if (evrp_range_analyzer)
-	  evrp_range_analyzer->push_marker ();
+	m_const_and_copies->push_marker ();
+	m_avail_exprs_stack->push_marker ();
+	if (m_evrp_range_analyzer)
+	  m_evrp_range_analyzer->push_marker ();
 
 	/* Avoid threading to any block we have already visited.  */
 	bitmap_clear (visited);
 	bitmap_set_bit (visited, e->src->index);
 	bitmap_set_bit (visited, e->dest->index);
 	bitmap_set_bit (visited, taken_edge->dest->index);
-        vec<jump_thread_edge *> *path = new vec<jump_thread_edge *> ();
+	vec<jump_thread_edge *> *path = m_registry->allocate_thread_path ();
 
 	/* Record whether or not we were able to thread through a successor
 	   of E->dest.  */
-        jump_thread_edge *x = new jump_thread_edge (e, EDGE_START_JUMP_THREAD);
+	jump_thread_edge *x
+	  = m_registry->allocate_thread_edge (e, EDGE_START_JUMP_THREAD);
 	path->safe_push (x);
 
-        x = new jump_thread_edge (taken_edge, EDGE_COPY_SRC_JOINER_BLOCK);
+	x = m_registry->allocate_thread_edge (taken_edge,
+					      EDGE_COPY_SRC_JOINER_BLOCK);
 	path->safe_push (x);
-	found = thread_around_empty_blocks (taken_edge,
-					    dummy_cond,
-					    avail_exprs_stack,
-					    simplify,
-					    visited,
-					    path);
+	found = thread_around_empty_blocks (path, taken_edge, visited);
 
 	if (!found)
-	  found = thread_through_normal_block (path->last ()->e, dummy_cond,
-					       const_and_copies,
-					       avail_exprs_stack,
-					       evrp_range_analyzer,
-					       simplify, path,
-					       visited) > 0;
+	  found = thread_through_normal_block (path,
+					       path->last ()->e, visited) > 0;
 
 	/* If we were able to thread through a successor of E->dest, then
 	   record the jump threading opportunity.  */
@@ -1390,47 +1315,31 @@ thread_across_edge (gcond *dummy_cond,
 	    if (taken_edge->dest != path->last ()->e->dest)
 	      propagate_threaded_block_debug_into (path->last ()->e->dest,
 						   taken_edge->dest);
-	    register_jump_thread (path);
+	    m_registry->register_jump_thread (path);
 	  }
 	else
-	  delete_jump_thread_path (path);
+	  path->release ();
 
 	/* And unwind the equivalence table.  */
-	if (evrp_range_analyzer)
-	  evrp_range_analyzer->pop_to_marker ();
-	avail_exprs_stack->pop_to_marker ();
-	const_and_copies->pop_to_marker ();
+	if (m_evrp_range_analyzer)
+	  m_evrp_range_analyzer->pop_to_marker ();
+	m_avail_exprs_stack->pop_to_marker ();
+	m_const_and_copies->pop_to_marker ();
       }
     BITMAP_FREE (visited);
   }
 
-  if (evrp_range_analyzer)
-    evrp_range_analyzer->pop_to_marker ();
-  const_and_copies->pop_to_marker ();
-  avail_exprs_stack->pop_to_marker ();
+  if (m_evrp_range_analyzer)
+    m_evrp_range_analyzer->pop_to_marker ();
+  m_const_and_copies->pop_to_marker ();
+  m_avail_exprs_stack->pop_to_marker ();
 }
 
 /* Examine the outgoing edges from BB and conditionally
-   try to thread them.
-
-   DUMMY_COND is a shared cond_expr used by condition simplification as scratch,
-   to avoid allocating memory.
-
-   CONST_AND_COPIES is used to undo temporary equivalences created during the
-   walk of E->dest.
-
-   The available expression table is referenced vai AVAIL_EXPRS_STACK.
-
-   SIMPLIFY is a pass-specific function used to simplify statements.  */
+   try to thread them.  */
 
 void
-thread_outgoing_edges (basic_block bb, gcond *dummy_cond,
-		       class const_and_copies *const_and_copies,
-		       class avail_exprs_stack *avail_exprs_stack,
-		       class evrp_range_analyzer *evrp_range_analyzer,
-		       tree (*simplify) (gimple *, gimple *,
-					 class avail_exprs_stack *,
-					 basic_block))
+jump_threader::thread_outgoing_edges (basic_block bb)
 {
   int flags = (EDGE_IGNORE | EDGE_COMPLEX | EDGE_ABNORMAL);
   gimple *last;
@@ -1443,9 +1352,7 @@ thread_outgoing_edges (basic_block bb, gcond *dummy_cond,
       && (single_succ_edge (bb)->flags & flags) == 0
       && potentially_threadable_block (single_succ (bb)))
     {
-      thread_across_edge (dummy_cond, single_succ_edge (bb),
-			  const_and_copies, avail_exprs_stack,
-			  evrp_range_analyzer, simplify);
+      thread_across_edge (single_succ_edge (bb));
     }
   else if ((last = last_stmt (bb))
 	   && gimple_code (last) == GIMPLE_COND
@@ -1460,14 +1367,53 @@ thread_outgoing_edges (basic_block bb, gcond *dummy_cond,
       /* Only try to thread the edge if it reaches a target block with
 	 more than one predecessor and more than one successor.  */
       if (potentially_threadable_block (true_edge->dest))
-	thread_across_edge (dummy_cond, true_edge,
-			    const_and_copies, avail_exprs_stack,
-			    evrp_range_analyzer, simplify);
+	thread_across_edge (true_edge);
 
       /* Similarly for the ELSE arm.  */
       if (potentially_threadable_block (false_edge->dest))
-	thread_across_edge (dummy_cond, false_edge,
-			    const_and_copies, avail_exprs_stack,
-			    evrp_range_analyzer, simplify);
+	thread_across_edge (false_edge);
+    }
+}
+
+tree
+jump_threader_simplifier::simplify (gimple *stmt,
+				    gimple *within_stmt,
+				    basic_block)
+{
+  if (gcond *cond_stmt = dyn_cast <gcond *> (stmt))
+    {
+      simplify_using_ranges simplifier (m_vr_values);
+      return simplifier.vrp_evaluate_conditional (gimple_cond_code (cond_stmt),
+						  gimple_cond_lhs (cond_stmt),
+						  gimple_cond_rhs (cond_stmt),
+						  within_stmt);
+    }
+  if (gswitch *switch_stmt = dyn_cast <gswitch *> (stmt))
+    {
+      tree op = gimple_switch_index (switch_stmt);
+      if (TREE_CODE (op) != SSA_NAME)
+	return NULL_TREE;
+
+      const value_range_equiv *vr = m_vr_values->get_value_range (op);
+      return find_case_label_range (switch_stmt, vr);
+    }
+  if (gassign *assign_stmt = dyn_cast <gassign *> (stmt))
+    {
+      tree lhs = gimple_assign_lhs (assign_stmt);
+      if (TREE_CODE (lhs) == SSA_NAME
+	  && (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+	      || POINTER_TYPE_P (TREE_TYPE (lhs)))
+	  && stmt_interesting_for_vrp (stmt))
+	{
+	  edge dummy_e;
+	  tree dummy_tree;
+	  value_range_equiv new_vr;
+	  m_vr_values->extract_range_from_stmt (stmt, &dummy_e, &dummy_tree,
+						&new_vr);
+	  tree singleton;
+	  if (new_vr.singleton_p (&singleton))
+	    return singleton;
+	}
     }
+  return NULL_TREE;
 }
diff --git a/gcc/tree-ssa-threadedge.h b/gcc/tree-ssa-threadedge.h
index e19dc4b37cf..48735f2bc27 100644
--- a/gcc/tree-ssa-threadedge.h
+++ b/gcc/tree-ssa-threadedge.h
@@ -20,22 +20,80 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_TREE_SSA_THREADEDGE_H
 #define GCC_TREE_SSA_THREADEDGE_H
 
+// This is the high level threader.  The entry point is
+// thread_outgoing_edges(), which calculates and registers paths to be
+// threaded.  When all candidates have been registered,
+// thread_through_all_blocks() is called to actually change the CFG.
+
+class jump_threader
+{
+public:
+  jump_threader (class const_and_copies *,
+		 avail_exprs_stack *,
+		 class jump_threader_simplifier *,
+		 class evrp_range_analyzer * = NULL);
+  ~jump_threader ();
+  void thread_outgoing_edges (basic_block);
+  void remove_jump_threads_including (edge_def *);
+  bool thread_through_all_blocks (bool may_peel_loop_headers);
+
+private:
+  tree simplify_control_stmt_condition (edge, gimple *);
+  tree simplify_control_stmt_condition_1 (edge,
+					  gimple *,
+					  tree op0,
+					  tree_code cond_code,
+					  tree op1,
+					  unsigned limit);
+
+  bool thread_around_empty_blocks (vec<class jump_thread_edge *> *path,
+				   edge, bitmap visited);
+  int thread_through_normal_block (vec<jump_thread_edge *> *path,
+				   edge, bitmap visited);
+  void thread_across_edge (edge);
+  bool record_temporary_equivalences_from_phis (edge);
+  gimple *record_temporary_equivalences_from_stmts_at_dest (edge);
+
+  // Dummy condition to avoid creating lots of throwaway statements.
+  gcond *dummy_cond;
+
+  const_and_copies *m_const_and_copies;
+  avail_exprs_stack *m_avail_exprs_stack;
+  class jump_thread_path_registry *m_registry;
+  jump_threader_simplifier *m_simplifier;
+  evrp_range_analyzer *m_evrp_range_analyzer;
+};
+
+// Statement simplifier callback for the jump threader.
+
+class jump_threader_simplifier
+{
+public:
+  jump_threader_simplifier (class vr_values *v,
+			    avail_exprs_stack *avails)
+    : m_vr_values (v),
+      m_avail_exprs_stack (avails)
+  { }
+  virtual ~jump_threader_simplifier () { }
+  virtual tree simplify (gimple *, gimple *, basic_block);
+
+protected:
+  vr_values *m_vr_values;
+  avail_exprs_stack *m_avail_exprs_stack;
+};
+
+extern void propagate_threaded_block_debug_into (basic_block, basic_block);
+
+// ?? All this ssa_name_values stuff is the store of values for
+// avail_exprs_stack and const_and_copies, so it really belongs in the
+// jump_threader class.  However, it's probably not worth touching
+// this, since all this windable state is slated to go with the
+// ranger.
 extern vec<tree> ssa_name_values;
 #define SSA_NAME_VALUE(x) \
     (SSA_NAME_VERSION (x) < ssa_name_values.length () \
      ? ssa_name_values[SSA_NAME_VERSION (x)] \
      : NULL_TREE)
 extern void set_ssa_name_value (tree, tree);
-extern void threadedge_initialize_values (void);
-extern void threadedge_finalize_values (void);
-extern bool potentially_threadable_block (basic_block);
-extern void propagate_threaded_block_debug_into (basic_block, basic_block);
-class evrp_range_analyzer;
-extern void thread_outgoing_edges (basic_block, gcond *,
-				   const_and_copies *,
-				   avail_exprs_stack *,
-				   evrp_range_analyzer *,
-				   tree (*) (gimple *, gimple *,
-					     avail_exprs_stack *, basic_block));
 
 #endif /* GCC_TREE_SSA_THREADEDGE_H */
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 73776466146..a86302be18e 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -128,7 +128,6 @@ struct redirection_data : free_ptr_hash<redirection_data>
      which they appear in the jump thread path.  */
   basic_block dup_blocks[2];
 
-  /* The jump threading path.  */
   vec<jump_thread_edge *> *path;
 
   /* A list of incoming edges which we want to thread to the
@@ -140,11 +139,66 @@ struct redirection_data : free_ptr_hash<redirection_data>
   static inline int equal (const redirection_data *, const redirection_data *);
 };
 
+jump_thread_path_allocator::jump_thread_path_allocator ()
+{
+  obstack_init (&m_obstack);
+}
+
+jump_thread_path_allocator::~jump_thread_path_allocator ()
+{
+  obstack_free (&m_obstack, NULL);
+}
+
+jump_thread_edge *
+jump_thread_path_allocator::allocate_thread_edge (edge e,
+						  jump_thread_edge_type type)
+{
+  void *r = obstack_alloc (&m_obstack, sizeof (jump_thread_edge));
+  return new (r) jump_thread_edge (e, type);
+}
+
+vec<jump_thread_edge *> *
+jump_thread_path_allocator::allocate_thread_path ()
+{
+  // ?? Since the paths live in an obstack, we should be able to remove all
+  // references to path->release() throughout the code.
+  void *r = obstack_alloc (&m_obstack, sizeof (vec <jump_thread_edge *>));
+  return new (r) vec<jump_thread_edge *> ();
+}
+
+jump_thread_path_registry::jump_thread_path_registry ()
+{
+  m_paths.create (5);
+  m_removed_edges = new hash_table<struct removed_edges> (17);
+  m_num_threaded_edges = 0;
+  m_redirection_data = NULL;
+}
+
+jump_thread_path_registry::~jump_thread_path_registry ()
+{
+  m_paths.release ();
+  delete m_removed_edges;
+}
+
+jump_thread_edge *
+jump_thread_path_registry::allocate_thread_edge (edge e,
+						 jump_thread_edge_type t)
+{
+  return m_allocator.allocate_thread_edge (e, t);
+}
+
+vec<jump_thread_edge *> *
+jump_thread_path_registry::allocate_thread_path ()
+{
+  return m_allocator.allocate_thread_path ();
+}
+
 /* Dump a jump threading path, including annotations about each
    edge in the path.  */
 
-static void
-dump_jump_thread_path (FILE *dump_file, vec<jump_thread_edge *> path,
+void
+dump_jump_thread_path (FILE *dump_file,
+		       const vec<jump_thread_edge *> path,
 		       bool registering)
 {
   fprintf (dump_file,
@@ -178,6 +232,12 @@ dump_jump_thread_path (FILE *dump_file, vec<jump_thread_edge *> path,
   fputc ('\n', dump_file);
 }
 
+DEBUG_FUNCTION void
+debug (const vec<jump_thread_edge *> &path)
+{
+  dump_jump_thread_path (stderr, path, true);
+}
+
 /* Simple hashing function.  For any given incoming edge E, we're going
    to be most concerned with the final destination of its jump thread
    path.  So hash on the block index of the final edge in the path.  */
@@ -210,18 +270,6 @@ redirection_data::equal (const redirection_data *p1, const redirection_data *p2)
   return true;
 }
 
-/* Rather than search all the edges in jump thread paths each time
-   DOM is able to simply if control statement, we build a hash table
-   with the deleted edges.  We only care about the address of the edge,
-   not its contents.  */
-struct removed_edges : nofree_ptr_hash<edge_def>
-{
-  static hashval_t hash (edge e) { return htab_hash_pointer (e); }
-  static bool equal (edge e1, edge e2) { return e1 == e2; }
-};
-
-static hash_table<removed_edges> *removed_edges;
-
 /* Data structure of information to pass to hash table traversal routines.  */
 struct ssa_local_info_t
 {
@@ -251,34 +299,21 @@ struct ssa_local_info_t
      final destinations, then we may need to correct for potential
      profile insanities.  */
   bool need_profile_correction;
-};
 
-/* Passes which use the jump threading code register jump threading
-   opportunities as they are discovered.  We keep the registered
-   jump threading opportunities in this vector as edge pairs
-   (original_edge, target_edge).  */
-static vec<vec<jump_thread_edge *> *> paths;
+  // Jump threading statistics.
+  unsigned long num_threaded_edges;
+};
 
 /* When we start updating the CFG for threading, data necessary for jump
    threading is attached to the AUX field for the incoming edge.  Use these
    macros to access the underlying structure attached to the AUX field.  */
 #define THREAD_PATH(E) ((vec<jump_thread_edge *> *)(E)->aux)
 
-/* Jump threading statistics.  */
-
-struct thread_stats_d
-{
-  unsigned long num_threaded_edges;
-};
-
-struct thread_stats_d thread_stats;
-
-
 /* Remove the last statement in block BB if it is a control statement
    Also remove all outgoing edges except the edge which reaches DEST_BB.
    If DEST_BB is NULL, then remove all outgoing edges.  */
 
-void
+static void
 remove_ctrl_stmt_and_useless_edges (basic_block bb, basic_block dest_bb)
 {
   gimple_stmt_iterator gsi;
@@ -360,18 +395,15 @@ create_block_for_threading (basic_block bb,
     bitmap_set_bit (*duplicate_blocks, rd->dup_blocks[count]->index);
 }
 
-/* Main data structure to hold information for duplicates of BB.  */
-
-static hash_table<redirection_data> *redirection_data;
-
 /* Given an outgoing edge E lookup and return its entry in our hash table.
 
    If INSERT is true, then we insert the entry into the hash table if
    it is not already present.  INCOMING_EDGE is added to the list of incoming
    edges associated with E in the hash table.  */
 
-static struct redirection_data *
-lookup_redirection_data (edge e, enum insert_option insert)
+redirection_data *
+jump_thread_path_registry::lookup_redirection_data (edge e,
+						    enum insert_option insert)
 {
   struct redirection_data **slot;
   struct redirection_data *elt;
@@ -385,7 +417,7 @@ lookup_redirection_data (edge e, enum insert_option insert)
   elt->dup_blocks[1] = NULL;
   elt->incoming_edges = NULL;
 
-  slot = redirection_data->find_slot (elt, insert);
+  slot = m_redirection_data->find_slot (elt, insert);
 
   /* This will only happen if INSERT is false and the entry is not
      in the hash table.  */
@@ -1253,7 +1285,7 @@ ssa_fixup_template_block (struct redirection_data **slot,
 /* Hash table traversal callback to redirect each incoming edge
    associated with this hash table element to its new destination.  */
 
-int
+static int
 ssa_redirect_edges (struct redirection_data **slot,
 		    ssa_local_info_t *local_info)
 {
@@ -1273,7 +1305,7 @@ ssa_redirect_edges (struct redirection_data **slot,
       next = el->next;
       free (el);
 
-      thread_stats.num_threaded_edges++;
+      local_info->num_threaded_edges++;
 
       if (rd->dup_blocks[0])
 	{
@@ -1292,7 +1324,7 @@ ssa_redirect_edges (struct redirection_data **slot,
 
       /* Go ahead and clear E->aux.  It's not needed anymore and failure
 	 to clear it will cause all kinds of unpleasant problems later.  */
-      delete_jump_thread_path (path);
+      path->release ();
       e->aux = NULL;
 
     }
@@ -1356,8 +1388,10 @@ redirection_block_p (basic_block bb)
 
    If JOINERS is true, then thread through joiner blocks as well.  */
 
-static bool
-thread_block_1 (basic_block bb, bool noloop_only, bool joiners)
+bool
+jump_thread_path_registry::thread_block_1 (basic_block bb,
+					   bool noloop_only,
+					   bool joiners)
 {
   /* E is an incoming edge into BB that we may or may not want to
      redirect to a duplicate of BB.  */
@@ -1367,12 +1401,13 @@ thread_block_1 (basic_block bb, bool noloop_only, bool joiners)
 
   local_info.duplicate_blocks = BITMAP_ALLOC (NULL);
   local_info.need_profile_correction = false;
+  local_info.num_threaded_edges = 0;
 
   /* To avoid scanning a linear array for the element we need we instead
      use a hash table.  For normal code there should be no noticeable
      difference.  However, if we have a block with a large number of
      incoming and outgoing edges such linear searches can get expensive.  */
-  redirection_data
+  m_redirection_data
     = new hash_table<struct redirection_data> (EDGE_COUNT (bb->succs));
 
   /* Record each unique threaded destination into a hash table for
@@ -1407,7 +1442,7 @@ thread_block_1 (basic_block bb, bool noloop_only, bool joiners)
 	      /* Since this case is not handled by our special code
 		 to thread through a loop header, we must explicitly
 		 cancel the threading request here.  */
-	      delete_jump_thread_path (path);
+	      path->release ();
 	      e->aux = NULL;
 	      continue;
 	    }
@@ -1446,7 +1481,7 @@ thread_block_1 (basic_block bb, bool noloop_only, bool joiners)
 
 	      if (i != path->length ())
 		{
-		  delete_jump_thread_path (path);
+		  path->release ();
 		  e->aux = NULL;
 		  continue;
 		}
@@ -1491,7 +1526,7 @@ thread_block_1 (basic_block bb, bool noloop_only, bool joiners)
   local_info.template_block = NULL;
   local_info.bb = bb;
   local_info.jumps_threaded = false;
-  redirection_data->traverse <ssa_local_info_t *, ssa_create_duplicates>
+  m_redirection_data->traverse <ssa_local_info_t *, ssa_create_duplicates>
 			    (&local_info);
 
   /* The template does not have an outgoing edge.  Create that outgoing
@@ -1499,19 +1534,19 @@ thread_block_1 (basic_block bb, bool noloop_only, bool joiners)
 
      We do this after creating all the duplicates to avoid creating
      unnecessary edges.  */
-  redirection_data->traverse <ssa_local_info_t *, ssa_fixup_template_block>
+  m_redirection_data->traverse <ssa_local_info_t *, ssa_fixup_template_block>
 			    (&local_info);
 
   /* The hash table traversals above created the duplicate blocks (and the
      statements within the duplicate blocks).  This loop creates PHI nodes for
      the duplicated blocks and redirects the incoming edges into BB to reach
      the duplicates of BB.  */
-  redirection_data->traverse <ssa_local_info_t *, ssa_redirect_edges>
+  m_redirection_data->traverse <ssa_local_info_t *, ssa_redirect_edges>
 			    (&local_info);
 
   /* Done with this block.  Clear REDIRECTION_DATA.  */
-  delete redirection_data;
-  redirection_data = NULL;
+  delete m_redirection_data;
+  m_redirection_data = NULL;
 
   if (noloop_only
       && bb == bb->loop_father->header)
@@ -1520,6 +1555,8 @@ thread_block_1 (basic_block bb, bool noloop_only, bool joiners)
   BITMAP_FREE (local_info.duplicate_blocks);
   local_info.duplicate_blocks = NULL;
 
+  m_num_threaded_edges += local_info.num_threaded_edges;
+
   /* Indicate to our caller whether or not any jumps were threaded.  */
   return local_info.jumps_threaded;
 }
@@ -1532,8 +1569,8 @@ thread_block_1 (basic_block bb, bool noloop_only, bool joiners)
    not worry that copying a joiner block will create a jump threading
    opportunity.  */
 
-static bool
-thread_block (basic_block bb, bool noloop_only)
+bool
+jump_thread_path_registry::thread_block (basic_block bb, bool noloop_only)
 {
   bool retval;
   retval = thread_block_1 (bb, noloop_only, false);
@@ -1613,8 +1650,10 @@ determine_bb_domination_status (class loop *loop, basic_block bb)
    If MAY_PEEL_LOOP_HEADERS is false, we avoid threading from entry edges
    to the inside of the loop.  */
 
-static bool
-thread_through_loop_header (class loop *loop, bool may_peel_loop_headers)
+bool
+jump_thread_path_registry::thread_through_loop_header
+				(class loop *loop,
+				 bool may_peel_loop_headers)
 {
   basic_block header = loop->header;
   edge e, tgt_edge, latch = loop_latch_edge (loop);
@@ -1801,7 +1840,7 @@ fail:
 
       if (path)
 	{
-	  delete_jump_thread_path (path);
+	  path->release ();
 	  e->aux = NULL;
 	}
     }
@@ -1868,8 +1907,8 @@ count_stmts_and_phis_in_block (basic_block bb)
    discover blocks which need processing and avoids unnecessary
    hash table lookups to map from threaded edge to new target.  */
 
-static void
-mark_threaded_blocks (bitmap threaded_blocks)
+void
+jump_thread_path_registry::mark_threaded_blocks (bitmap threaded_blocks)
 {
   unsigned int i;
   bitmap_iterator bi;
@@ -1892,9 +1931,9 @@ mark_threaded_blocks (bitmap threaded_blocks)
 
      So first convert the jump thread requests which do not require a
      joiner block.  */
-  for (i = 0; i < paths.length (); i++)
+  for (i = 0; i < m_paths.length (); i++)
     {
-      vec<jump_thread_edge *> *path = paths[i];
+      vec<jump_thread_edge *> *path = m_paths[i];
 
       if (path->length () > 1
 	  && (*path)[1]->type != EDGE_COPY_SRC_JOINER_BLOCK)
@@ -1913,9 +1952,9 @@ mark_threaded_blocks (bitmap threaded_blocks)
      cases where the second path starts at a downstream edge on the same
      path).  First record all joiner paths, deleting any in the unexpected
      case where there is already a path for that incoming edge.  */
-  for (i = 0; i < paths.length ();)
+  for (i = 0; i < m_paths.length ();)
     {
-      vec<jump_thread_edge *> *path = paths[i];
+      vec<jump_thread_edge *> *path = m_paths[i];
 
       if (path->length () > 1
 	  && (*path)[1]->type == EDGE_COPY_SRC_JOINER_BLOCK)
@@ -1928,10 +1967,10 @@ mark_threaded_blocks (bitmap threaded_blocks)
 	    }
 	  else
 	    {
-	      paths.unordered_remove (i);
+	      m_paths.unordered_remove (i);
 	      if (dump_file && (dump_flags & TDF_DETAILS))
 		dump_jump_thread_path (dump_file, *path, false);
-	      delete_jump_thread_path (path);
+	      path->release ();
 	    }
 	}
       else
@@ -1942,9 +1981,9 @@ mark_threaded_blocks (bitmap threaded_blocks)
 
   /* Second, look for paths that have any other jump thread attached to
      them, and either finish converting them or cancel them.  */
-  for (i = 0; i < paths.length ();)
+  for (i = 0; i < m_paths.length ();)
     {
-      vec<jump_thread_edge *> *path = paths[i];
+      vec<jump_thread_edge *> *path = m_paths[i];
       edge e = (*path)[0]->e;
 
       if (path->length () > 1
@@ -1965,10 +2004,10 @@ mark_threaded_blocks (bitmap threaded_blocks)
 	  else
 	    {
 	      e->aux = NULL;
-	      paths.unordered_remove (i);
+	      m_paths.unordered_remove (i);
 	      if (dump_file && (dump_flags & TDF_DETAILS))
 		dump_jump_thread_path (dump_file, *path, false);
-	      delete_jump_thread_path (path);
+	      path->release ();
 	    }
 	}
       else
@@ -2015,8 +2054,8 @@ mark_threaded_blocks (bitmap threaded_blocks)
 		if (j != path->length ())
 		  {
 		    if (dump_file && (dump_flags & TDF_DETAILS))
-		      dump_jump_thread_path (dump_file, *path, 0);
-		    delete_jump_thread_path (path);
+		      dump_jump_thread_path (dump_file, *path, false);
+		    path->release ();
 		    e->aux = NULL;
 		  }
 		else
@@ -2063,7 +2102,7 @@ mark_threaded_blocks (bitmap threaded_blocks)
 
 		  if (e2 && !phi_args_equal_on_edges (e2, final_edge))
 		    {
-		      delete_jump_thread_path (path);
+		      path->release ();
 		      e->aux = NULL;
 		    }
 		}
@@ -2137,10 +2176,10 @@ bb_in_bbs (basic_block bb, basic_block *bbs, int n)
   return false;
 }
 
-DEBUG_FUNCTION void
-debug_path (FILE *dump_file, int pathno)
+void
+jump_thread_path_registry::debug_path (FILE *dump_file, int pathno)
 {
-  vec<jump_thread_edge *> *p = paths[pathno];
+  vec<jump_thread_edge *> *p = m_paths[pathno];
   fprintf (dump_file, "path: ");
   for (unsigned i = 0; i < p->length (); ++i)
     fprintf (dump_file, "%d -> %d, ",
@@ -2148,10 +2187,10 @@ debug_path (FILE *dump_file, int pathno)
   fprintf (dump_file, "\n");
 }
 
-DEBUG_FUNCTION void
-debug_all_paths ()
+void
+jump_thread_path_registry::dump ()
 {
-  for (unsigned i = 0; i < paths.length (); ++i)
+  for (unsigned i = 0; i < m_paths.length (); ++i)
     debug_path (stderr, i);
 }
 
@@ -2163,10 +2202,11 @@ debug_all_paths ()
 
    Returns TRUE if we were able to successfully rewire the edge.  */
 
-static bool
-rewire_first_differing_edge (unsigned path_num, unsigned edge_num)
+bool
+jump_thread_path_registry::rewire_first_differing_edge (unsigned path_num,
+							unsigned edge_num)
 {
-  vec<jump_thread_edge *> *path = paths[path_num];
+  vec<jump_thread_edge *> *path = m_paths[path_num];
   edge &e = (*path)[edge_num]->e;
   if (dump_file && (dump_flags & TDF_DETAILS))
     fprintf (dump_file, "rewiring edge candidate: %d -> %d\n",
@@ -2208,10 +2248,11 @@ rewire_first_differing_edge (unsigned path_num, unsigned edge_num)
    CURR_PATH_NUM is an index into the global paths table.  It
    specifies the path that was just threaded.  */
 
-static void
-adjust_paths_after_duplication (unsigned curr_path_num)
+void
+jump_thread_path_registry::adjust_paths_after_duplication
+	(unsigned curr_path_num)
 {
-  vec<jump_thread_edge *> *curr_path = paths[curr_path_num];
+  vec<jump_thread_edge *> *curr_path = m_paths[curr_path_num];
   gcc_assert ((*curr_path)[0]->type == EDGE_FSM_THREAD);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -2221,7 +2262,7 @@ adjust_paths_after_duplication (unsigned curr_path_num)
     }
 
   /* Iterate through all the other paths and adjust them.  */
-  for (unsigned cand_path_num = 0; cand_path_num < paths.length (); )
+  for (unsigned cand_path_num = 0; cand_path_num < m_paths.length (); )
     {
       if (cand_path_num == curr_path_num)
 	{
@@ -2230,7 +2271,7 @@ adjust_paths_after_duplication (unsigned curr_path_num)
 	}
       /* Make sure the candidate to adjust starts with the same path
 	 as the recently threaded path and is an FSM thread.  */
-      vec<jump_thread_edge *> *cand_path = paths[cand_path_num];
+      vec<jump_thread_edge *> *cand_path = m_paths[cand_path_num];
       if ((*cand_path)[0]->type != EDGE_FSM_THREAD
 	  || (*cand_path)[0]->e != (*curr_path)[0]->e)
 	{
@@ -2284,8 +2325,8 @@ adjust_paths_after_duplication (unsigned curr_path_num)
 	    remove_candidate_from_list:
 	      if (dump_file && (dump_flags & TDF_DETAILS))
 		fprintf (dump_file, "adjusted candidate: [EMPTY]\n");
-	      delete_jump_thread_path (cand_path);
-	      paths.unordered_remove (cand_path_num);
+	      cand_path->release ();
+	      m_paths.unordered_remove (cand_path_num);
 	      continue;
 	    }
 	  /* Otherwise, just remove the redundant sub-path.  */
@@ -2312,9 +2353,12 @@ adjust_paths_after_duplication (unsigned curr_path_num)
 
    Returns false if it is unable to copy the region, true otherwise.  */
 
-static bool
-duplicate_thread_path (edge entry, edge exit, basic_block *region,
-		       unsigned n_region, unsigned current_path_no)
+bool
+jump_thread_path_registry::duplicate_thread_path (edge entry,
+						  edge exit,
+						  basic_block *region,
+						  unsigned n_region,
+						  unsigned current_path_no)
 {
   unsigned i;
   class loop *loop = entry->dest->loop_father;
@@ -2489,15 +2533,12 @@ valid_jump_thread_path (vec<jump_thread_edge *> *path)
    DOM/VRP rather than for every case where DOM optimizes away a COND_EXPR.  */
 
 void
-remove_jump_threads_including (edge_def *e)
+jump_thread_path_registry::remove_jump_threads_including (edge_def *e)
 {
-  if (!paths.exists ())
+  if (!m_paths.exists ())
     return;
 
-  if (!removed_edges)
-    removed_edges = new hash_table<struct removed_edges> (17);
-
-  edge *slot = removed_edges->find_slot (e, INSERT);
+  edge *slot = m_removed_edges->find_slot (e, INSERT);
   *slot = e;
 }
 
@@ -2513,7 +2554,8 @@ remove_jump_threads_including (edge_def *e)
    Returns true if one or more edges were threaded, false otherwise.  */
 
 bool
-thread_through_all_blocks (bool may_peel_loop_headers)
+jump_thread_path_registry::thread_through_all_blocks
+	(bool may_peel_loop_headers)
 {
   bool retval = false;
   unsigned int i;
@@ -2521,41 +2563,41 @@ thread_through_all_blocks (bool may_peel_loop_headers)
   auto_bitmap threaded_blocks;
   hash_set<edge> visited_starting_edges;
 
-  if (!paths.exists ())
+  if (!m_paths.exists ())
     {
       retval = false;
       goto out;
     }
 
-  memset (&thread_stats, 0, sizeof (thread_stats));
+  m_num_threaded_edges = 0;
 
   /* Remove any paths that referenced removed edges.  */
-  if (removed_edges)
-    for (i = 0; i < paths.length (); )
+  if (m_removed_edges)
+    for (i = 0; i < m_paths.length (); )
       {
 	unsigned int j;
-	vec<jump_thread_edge *> *path = paths[i];
+	vec<jump_thread_edge *> *path = m_paths[i];
 
 	for (j = 0; j < path->length (); j++)
 	  {
 	    edge e = (*path)[j]->e;
-	    if (removed_edges->find_slot (e, NO_INSERT))
+	    if (m_removed_edges->find_slot (e, NO_INSERT))
 	      break;
 	  }
 
 	if (j != path->length ())
 	  {
-	    delete_jump_thread_path (path);
-	    paths.unordered_remove (i);
+	    path->release ();
+	    m_paths.unordered_remove (i);
 	    continue;
 	  }
 	i++;
       }
 
   /* Jump-thread all FSM threads before other jump-threads.  */
-  for (i = 0; i < paths.length ();)
+  for (i = 0; i < m_paths.length ();)
     {
-      vec<jump_thread_edge *> *path = paths[i];
+      vec<jump_thread_edge *> *path = m_paths[i];
       edge entry = (*path)[0]->e;
 
       /* Only code-generate FSM jump-threads in this loop.  */
@@ -2579,8 +2621,8 @@ thread_through_all_blocks (bool may_peel_loop_headers)
 	  || !valid_jump_thread_path (path))
 	{
 	  /* Remove invalid FSM jump-thread paths.  */
-	  delete_jump_thread_path (path);
-	  paths.unordered_remove (i);
+	  path->release ();
+	  m_paths.unordered_remove (i);
 	  continue;
 	}
 
@@ -2597,26 +2639,26 @@ thread_through_all_blocks (bool may_peel_loop_headers)
 	  free_dominance_info (CDI_DOMINATORS);
 	  visited_starting_edges.add (entry);
 	  retval = true;
-	  thread_stats.num_threaded_edges++;
+	  m_num_threaded_edges++;
 	}
 
-      delete_jump_thread_path (path);
-      paths.unordered_remove (i);
+      path->release ();
+      m_paths.unordered_remove (i);
       free (region);
     }
 
   /* Remove from PATHS all the jump-threads starting with an edge already
      jump-threaded.  */
-  for (i = 0; i < paths.length ();)
+  for (i = 0; i < m_paths.length ();)
     {
-      vec<jump_thread_edge *> *path = paths[i];
+      vec<jump_thread_edge *> *path = m_paths[i];
       edge entry = (*path)[0]->e;
 
       /* Do not jump-thread twice from the same block.  */
       if (visited_starting_edges.contains (entry))
 	{
-	  delete_jump_thread_path (path);
-	  paths.unordered_remove (i);
+	  path->release ();
+	  m_paths.unordered_remove (i);
 	}
       else
 	i++;
@@ -2678,34 +2720,19 @@ thread_through_all_blocks (bool may_peel_loop_headers)
 	gcc_assert (e->aux == NULL);
     }
 
-  statistics_counter_event (cfun, "Jumps threaded",
-			    thread_stats.num_threaded_edges);
+  statistics_counter_event (cfun, "Jumps threaded", m_num_threaded_edges);
 
   free_original_copy_tables ();
 
-  paths.release ();
+  m_paths.release ();
 
   if (retval)
     loops_state_set (LOOPS_NEED_FIXUP);
 
  out:
-  delete removed_edges;
-  removed_edges = NULL;
   return retval;
 }
 
-/* Delete the jump threading path PATH.  We have to explicitly delete
-   each entry in the vector, then the container.  */
-
-void
-delete_jump_thread_path (vec<jump_thread_edge *> *path)
-{
-  for (unsigned int i = 0; i < path->length (); i++)
-    delete (*path)[i];
-  path->release();
-  delete path;
-}
-
 /* Register a jump threading opportunity.  We queue up all the jump
    threading opportunities discovered by a pass and update the CFG
    and SSA form all at once.
@@ -2715,11 +2742,11 @@ delete_jump_thread_path (vec<jump_thread_edge *> *path)
    after fixing the SSA graph.  */
 
 void
-register_jump_thread (vec<jump_thread_edge *> *path)
+jump_thread_path_registry::register_jump_thread (vec<jump_thread_edge *> *path)
 {
   if (!dbg_cnt (registered_jump_thread))
     {
-      delete_jump_thread_path (path);
+      path->release ();
       return;
     }
 
@@ -2736,7 +2763,7 @@ register_jump_thread (vec<jump_thread_edge *> *path)
 	      dump_jump_thread_path (dump_file, *path, false);
 	    }
 
-	  delete_jump_thread_path (path);
+	  path->release ();
 	  return;
 	}
 
@@ -2750,10 +2777,7 @@ register_jump_thread (vec<jump_thread_edge *> *path)
   if (dump_file && (dump_flags & TDF_DETAILS))
     dump_jump_thread_path (dump_file, *path, true);
 
-  if (!paths.exists ())
-    paths.create (5);
-
-  paths.safe_push (path);
+  m_paths.safe_push (path);
 }
 
 /* Return how many uses of T there are within BB, as long as there
diff --git a/gcc/tree-ssa-threadupdate.h b/gcc/tree-ssa-threadupdate.h
index 5f49b1ae0ab..b806caee581 100644
--- a/gcc/tree-ssa-threadupdate.h
+++ b/gcc/tree-ssa-threadupdate.h
@@ -21,8 +21,6 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef _TREE_SSA_THREADUPDATE_H
 #define _TREE_SSA_THREADUPDATE_H 1
 
-/* In tree-ssa-threadupdate.c.  */
-extern bool thread_through_all_blocks (bool);
 enum jump_thread_edge_type
 {
   EDGE_START_JUMP_THREAD,
@@ -32,21 +30,85 @@ enum jump_thread_edge_type
   EDGE_NO_COPY_SRC_BLOCK
 };
 
+// We keep the registered jump threading opportunities in this
+// vector as edge pairs (original_edge, target_edge).
+
 class jump_thread_edge
 {
 public:
-  jump_thread_edge (edge e, enum jump_thread_edge_type type)
-    : e (e), type (type) {}
+  jump_thread_edge (edge e, jump_thread_edge_type t) : e (e), type (t) {}
 
   edge e;
-  enum jump_thread_edge_type type;
+  jump_thread_edge_type type;
+};
+
+class jump_thread_path_allocator
+{
+public:
+  jump_thread_path_allocator ();
+  ~jump_thread_path_allocator ();
+  jump_thread_edge *allocate_thread_edge (edge, jump_thread_edge_type);
+  vec<jump_thread_edge *> *allocate_thread_path ();
+private:
+  DISABLE_COPY_AND_ASSIGN (jump_thread_path_allocator);
+  obstack m_obstack;
+};
+
+// This is the underlying jump thread registry.  When all candidates
+// have been registered with register_jump_thread(),
+// thread_through_all_blocks() is called to actually change the CFG.
+
+class jump_thread_path_registry
+{
+public:
+  jump_thread_path_registry ();
+  ~jump_thread_path_registry ();
+  void register_jump_thread (vec<jump_thread_edge *> *);
+  void remove_jump_threads_including (edge);
+  bool thread_through_all_blocks (bool);
+  jump_thread_edge *allocate_thread_edge (edge e, jump_thread_edge_type t);
+  vec<jump_thread_edge *> *allocate_thread_path ();
+  void dump ();
+
+private:
+  void debug_path (FILE *, int pathno);
+  void mark_threaded_blocks (bitmap threaded_blocks);
+  bool rewire_first_differing_edge (unsigned path_num, unsigned edge_num);
+  void adjust_paths_after_duplication (unsigned curr_path_num);
+  bool duplicate_thread_path (edge entry,
+			      edge exit,
+			      basic_block *region,
+			      unsigned n_region,
+			      unsigned current_path_no);
+  bool thread_block_1 (basic_block, bool noloop_only, bool joiners);
+  bool thread_block (basic_block, bool noloop_only);
+  bool thread_through_loop_header (class loop *loop,
+				   bool may_peel_loop_headers);
+  class redirection_data *lookup_redirection_data (edge e, enum insert_option);
+
+  vec<vec<jump_thread_edge *> *> m_paths;
+
+  hash_table<struct removed_edges> *m_removed_edges;
+
+  // Main data structure to hold information for duplicates of BB.
+  hash_table<redirection_data> *m_redirection_data;
+
+  // Jump threading statistics.
+  unsigned long m_num_threaded_edges;
+
+  jump_thread_path_allocator m_allocator;
+};
+
+// Rather than search all the edges in jump thread paths each time DOM
+// is able to simplify a control statement, we build a hash table with
+// the deleted edges.  We only care about the address of the edge, not
+// its contents.
+struct removed_edges : nofree_ptr_hash<edge_def>
+{
+  static hashval_t hash (edge e) { return htab_hash_pointer (e); }
+  static bool equal (edge e1, edge e2) { return e1 == e2; }
 };
 
-extern void register_jump_thread (vec <class jump_thread_edge *> *);
-extern void remove_jump_threads_including (edge);
-extern void delete_jump_thread_path (vec <class jump_thread_edge *> *);
-extern void remove_ctrl_stmt_and_useless_edges (basic_block, basic_block);
-extern void free_dom_edge_info (edge);
 extern unsigned int estimate_threading_killed_stmts (basic_block);
 
 enum bb_dom_status
@@ -61,4 +123,7 @@ enum bb_dom_status
 
 enum bb_dom_status determine_bb_domination_status (class loop *, basic_block);
 
+// In tree-ssa-dom.c.
+extern void free_dom_edge_info (edge);
+
 #endif
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index d968ef288ff..12e6e6f3e22 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2373,9 +2373,6 @@ lhs_of_dominating_assert (tree op, basic_block bb, gimple *stmt)
   return op;
 }
 
-/* A hack.  */
-static class vr_values *x_vr_values;
-
 /* Searches the case label vector VEC for the index *IDX of the CASE_LABEL
    that includes the value VAL.  The search is restricted to the range
    [START_IDX, n - 1] where n is the size of VEC.
@@ -4163,6 +4160,54 @@ vrp_folder::fold_stmt (gimple_stmt_iterator *si)
   return simplifier.simplify (si);
 }
 
+class vrp_jump_threader_simplifier : public jump_threader_simplifier
+{
+public:
+  vrp_jump_threader_simplifier (vr_values *v, avail_exprs_stack *avails)
+    : jump_threader_simplifier (v, avails) {}
+
+private:
+  tree simplify (gimple *, gimple *, basic_block) OVERRIDE;
+};
+
+tree
+vrp_jump_threader_simplifier::simplify (gimple *stmt,
+					gimple *within_stmt,
+					basic_block bb)
+{
+  /* First see if the conditional is in the hash table.  */
+  tree cached_lhs = m_avail_exprs_stack->lookup_avail_expr (stmt, false, true);
+  if (cached_lhs && is_gimple_min_invariant (cached_lhs))
+    return cached_lhs;
+
+  if (gcond *cond_stmt = dyn_cast <gcond *> (stmt))
+    {
+      tree op0 = gimple_cond_lhs (cond_stmt);
+      op0 = lhs_of_dominating_assert (op0, bb, stmt);
+
+      tree op1 = gimple_cond_rhs (cond_stmt);
+      op1 = lhs_of_dominating_assert (op1, bb, stmt);
+
+      simplify_using_ranges simplifier (m_vr_values);
+      return simplifier.vrp_evaluate_conditional (gimple_cond_code (cond_stmt),
+						  op0, op1, within_stmt);
+    }
+
+  if (gswitch *switch_stmt = dyn_cast <gswitch *> (stmt))
+    {
+      tree op = gimple_switch_index (switch_stmt);
+      if (TREE_CODE (op) != SSA_NAME)
+	return NULL_TREE;
+
+      op = lhs_of_dominating_assert (op, bb, stmt);
+
+      const value_range_equiv *vr = m_vr_values->get_value_range (op);
+      return find_case_label_range (switch_stmt, vr);
+    }
+
+  return jump_threader_simplifier::simplify (stmt, within_stmt, bb);
+}
+
 /* Blocks which have more than one predecessor and more than
    one successor present jump threading opportunities, i.e.,
    when the block is reached from a specific predecessor, we
@@ -4186,7 +4231,7 @@ vrp_folder::fold_stmt (gimple_stmt_iterator *si)
 class vrp_jump_threader : public dom_walker
 {
 public:
-  vrp_jump_threader (struct function *, vr_values *);
+  vrp_jump_threader (function *, vr_values *);
   ~vrp_jump_threader ();
 
   void thread_jumps ()
@@ -4194,9 +4239,13 @@ public:
     walk (m_fun->cfg->x_entry_block_ptr);
   }
 
+  void thread_through_all_blocks ()
+  {
+    // FIXME: Put this in the destructor?
+    m_threader->thread_through_all_blocks (false);
+  }
+
 private:
-  static tree simplify_stmt (gimple *stmt, gimple *within_stmt,
-			     avail_exprs_stack *, basic_block);
   virtual edge before_dom_children (basic_block);
   virtual void after_dom_children (basic_block);
 
@@ -4205,7 +4254,8 @@ private:
   const_and_copies *m_const_and_copies;
   avail_exprs_stack *m_avail_exprs_stack;
   hash_table<expr_elt_hasher> *m_avail_exprs;
-  gcond *m_dummy_cond;
+  vrp_jump_threader_simplifier *m_simplifier;
+  jump_threader *m_threader;
 };
 
 vrp_jump_threader::vrp_jump_threader (struct function *fun, vr_values *v)
@@ -4227,11 +4277,15 @@ vrp_jump_threader::vrp_jump_threader (struct function *fun, vr_values *v)
      that might be recorded.  */
   m_const_and_copies = new const_and_copies ();
 
-  m_dummy_cond = NULL;
   m_fun = fun;
   m_vr_values = v;
   m_avail_exprs = new hash_table<expr_elt_hasher> (1024);
   m_avail_exprs_stack = new avail_exprs_stack (m_avail_exprs);
+
+  m_simplifier = new vrp_jump_threader_simplifier (m_vr_values,
+						   m_avail_exprs_stack);
+  m_threader = new jump_threader (m_const_and_copies, m_avail_exprs_stack,
+				  m_simplifier);
 }
 
 vrp_jump_threader::~vrp_jump_threader ()
@@ -4242,6 +4296,8 @@ vrp_jump_threader::~vrp_jump_threader ()
   delete m_const_and_copies;
   delete m_avail_exprs;
   delete m_avail_exprs_stack;
+  delete m_simplifier;
+  delete m_threader;
 }
 
 /* Called before processing dominator children of BB.  We want to look
@@ -4284,89 +4340,12 @@ vrp_jump_threader::before_dom_children (basic_block bb)
   return NULL;
 }
 
-/* A trivial wrapper so that we can present the generic jump threading
-   code with a simple API for simplifying statements.  STMT is the
-   statement we want to simplify, WITHIN_STMT provides the location
-   for any overflow warnings.
-
-   ?? This should be cleaned up.  There's a virtually identical copy
-   of this function in tree-ssa-dom.c.  */
-
-tree
-vrp_jump_threader::simplify_stmt (gimple *stmt,
-				  gimple *within_stmt,
-				  avail_exprs_stack *avail_exprs_stack,
-				  basic_block bb)
-{
-  /* First see if the conditional is in the hash table.  */
-  tree cached_lhs = avail_exprs_stack->lookup_avail_expr (stmt, false, true);
-  if (cached_lhs && is_gimple_min_invariant (cached_lhs))
-    return cached_lhs;
-
-  class vr_values *vr_values = x_vr_values;
-  if (gcond *cond_stmt = dyn_cast <gcond *> (stmt))
-    {
-      tree op0 = gimple_cond_lhs (cond_stmt);
-      op0 = lhs_of_dominating_assert (op0, bb, stmt);
-
-      tree op1 = gimple_cond_rhs (cond_stmt);
-      op1 = lhs_of_dominating_assert (op1, bb, stmt);
-
-      simplify_using_ranges simplifier (vr_values);
-      return simplifier.vrp_evaluate_conditional (gimple_cond_code (cond_stmt),
-						  op0, op1, within_stmt);
-    }
-
-  if (gswitch *switch_stmt = dyn_cast <gswitch *> (stmt))
-    {
-      tree op = gimple_switch_index (switch_stmt);
-      if (TREE_CODE (op) != SSA_NAME)
-	return NULL_TREE;
-
-      op = lhs_of_dominating_assert (op, bb, stmt);
-
-      const value_range_equiv *vr = vr_values->get_value_range (op);
-      return find_case_label_range (switch_stmt, vr);
-    }
-
-  if (gassign *assign_stmt = dyn_cast <gassign *> (stmt))
-    {
-      tree lhs = gimple_assign_lhs (assign_stmt);
-      if (TREE_CODE (lhs) == SSA_NAME
-	  && (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
-	      || POINTER_TYPE_P (TREE_TYPE (lhs)))
-	  && stmt_interesting_for_vrp (stmt))
-	{
-	  edge dummy_e;
-	  tree dummy_tree;
-	  value_range_equiv new_vr;
-	  vr_values->extract_range_from_stmt (stmt, &dummy_e,
-					      &dummy_tree, &new_vr);
-	  tree singleton;
-	  if (new_vr.singleton_p (&singleton))
-	    return singleton;
-	}
-    }
-
-  return NULL_TREE;
-}
-
 /* Called after processing dominator children of BB.  This is where we
    actually call into the threader.  */
 void
 vrp_jump_threader::after_dom_children (basic_block bb)
 {
-  if (!m_dummy_cond)
-    m_dummy_cond = gimple_build_cond (NE_EXPR,
-				      integer_zero_node, integer_zero_node,
-				      NULL, NULL);
-
-  x_vr_values = m_vr_values;
-  thread_outgoing_edges (bb, m_dummy_cond, m_const_and_copies,
-			 m_avail_exprs_stack, NULL,
-			 simplify_stmt);
-  x_vr_values = NULL;
-
+  m_threader->thread_outgoing_edges (bb);
   m_avail_exprs_stack->pop_to_marker ();
   m_const_and_copies->pop_to_marker ();
 }
@@ -4500,8 +4479,6 @@ execute_vrp (struct function *fun, bool warn_array_bounds_p)
   vrp_asserts assert_engine (fun);
   assert_engine.insert_range_assertions ();
 
-  threadedge_initialize_values ();
-
   /* For visiting PHI nodes we need EDGE_DFS_BACK computed.  */
   mark_dfs_back_edges ();
 
@@ -4577,9 +4554,7 @@ execute_vrp (struct function *fun, bool warn_array_bounds_p)
 
      Note the SSA graph update will occur during the normal TODO
      processing by the pass manager.  */
-  thread_through_all_blocks (false);
-
-  threadedge_finalize_values ();
+  threader.thread_through_all_blocks ();
 
   scev_finalize ();
   loop_optimizer_finalize ();
-- 
2.30.2


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 2/2] Refactor backward threader registry and profitability code into classes.
  2021-04-28 17:12 [PATCH 0/2] Jump threader refactor Aldy Hernandez
  2021-04-28 17:12 ` [PATCH 1/2] " Aldy Hernandez
@ 2021-04-28 17:12 ` Aldy Hernandez
  2021-04-30 16:10   ` Jeff Law
  1 sibling, 1 reply; 7+ messages in thread
From: Aldy Hernandez @ 2021-04-28 17:12 UTC (permalink / raw)
  To: Jeff Law, GCC patches

This refactors the registry and the profitability code from the
backwards threader into two separate classes.  It cleans up the code
and makes it easier for alternate implementations to share code.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* tree-ssa-threadbackward.c (class thread_jumps): Split out code
	from here...
	(class back_threader_registry): ...to here...
	(class back_threader_profitability): ...and here...
	(thread_jumps::thread_through_all_blocks): Remove argument.
	(back_threader_registry::back_threader_registry): New.
	(back_threader_registry::~back_threader_registry): New.
	(back_threader_registry::thread_through_all_blocks): New.
	(thread_jumps::profitable_jump_thread_path): Move from here...
	(back_threader_profitability::profitable_path_p): ...to here.
	(thread_jumps::find_taken_edge): New.
	(thread_jumps::convert_and_register_current_path): Move...
	(back_threader_registry::register_path): ...to here.
	(thread_jumps::register_jump_thread_path_if_profitable): Move...
	(thread_jumps::maybe_register_path): ...to here.
	(thread_jumps::handle_phi): Call find_taken_edge and
	maybe_register_path.
	(thread_jumps::handle_assignment): Same.
	(thread_jumps::fsm_find_control_statement_thread_paths): Remove
	tree argument to handle_phi and handle_assignment.
	(thread_jumps::find_jump_threads_backwards): Set m_name.  Remove
	set of m_speed_p and m_max_threaded_paths.
	(pass_thread_jumps::execute): Remove second argument from
	find_jump_threads_backwards.
	(pass_early_thread_jumps::execute): Same.
---
 gcc/tree-ssa-threadbackward.c | 367 ++++++++++++++++++++--------------
 1 file changed, 213 insertions(+), 154 deletions(-)

diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 428cf0767c6..7dd8594e3d4 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -37,44 +37,79 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-inline.h"
 #include "tree-vectorizer.h"
 
+// Path registry for the backwards threader.  After all paths have been
+// registered with register_path(), thread_through_all_blocks() is called
+// to modify the CFG.
+
+class back_threader_registry
+{
+public:
+  back_threader_registry (int max_allowable_paths);
+  ~back_threader_registry ();
+  bool register_path (const vec<basic_block> &, edge taken);
+  bool thread_through_all_blocks ();
+
+private:
+  vec<vec<basic_block>> m_all_paths;
+  jump_thread_path_registry m_lowlevel_registry;
+  const int m_max_allowable_paths;
+  int m_threaded_paths;
+};
+
+// Class to abstract the profitability code for the backwards threader.
+
+class back_threader_profitability
+{
+public:
+  back_threader_profitability (bool speed_p)
+    : m_speed_p (speed_p)
+  { }
+  bool profitable_path_p (const vec<basic_block> &, tree name, edge taken,
+			  bool *irreducible_loop = NULL);
+
+private:
+  const bool m_speed_p;
+};
+
 class thread_jumps
 {
 public:
-  void find_jump_threads_backwards (basic_block bb, bool speed_p);
+  thread_jumps (bool speed_p = true)
+    : m_profit (speed_p), m_registry (param_max_fsm_thread_paths)
+  { }
+  void find_jump_threads_backwards (basic_block bb);
   bool thread_through_all_blocks ();
 
 private:
-  edge profitable_jump_thread_path (basic_block bbi, tree name, tree arg,
-				    bool *creates_irreducible_loop);
-  void convert_and_register_current_path (edge taken_edge);
-  void register_jump_thread_path_if_profitable (tree name, tree arg,
-						basic_block def_bb);
-  void handle_assignment (gimple *stmt, tree name, basic_block def_bb);
-  void handle_phi (gphi *phi, tree name, basic_block def_bb);
+  void maybe_register_path (const vec<basic_block> &m_path,
+			    tree name,
+			    edge taken_edge);
+  edge find_taken_edge (const vec<basic_block> &path, tree arg);
+  void handle_assignment (gimple *stmt, basic_block def_bb);
+  void handle_phi (gphi *phi, basic_block def_bb);
   void fsm_find_control_statement_thread_paths (tree name);
   bool check_subpath_and_update_thread_path (basic_block last_bb,
 					     basic_block new_bb,
 					     int *next_path_length);
 
-  /* Maximum number of BBs we are allowed to thread.  */
-  int m_max_threaded_paths;
   /* Hash to keep track of seen bbs.  */
   hash_set<basic_block> m_visited_bbs;
   /* Current path we're analyzing.  */
   auto_vec<basic_block> m_path;
   /* Tracks if we have recursed through a loop PHI node.  */
   bool m_seen_loop_phi;
-  /* Indicate that we could increase code size to improve the
-     code path.  */
-  bool m_speed_p;
 
-  jump_thread_path_registry m_registry;
+  tree m_name;
+  back_threader_profitability m_profit;
+  back_threader_registry m_registry;
 };
 
+// Perform the actual jump threading for all the queued paths.
+
 bool
 thread_jumps::thread_through_all_blocks ()
 {
-  return m_registry.thread_through_all_blocks (true);
+  return m_registry.thread_through_all_blocks ();
 }
 
 /* Simple helper to get the last statement from BB, which is assumed
@@ -133,62 +168,65 @@ fsm_find_thread_path (basic_block start_bb, basic_block end_bb,
   return false;
 }
 
-/* Examine jump threading path PATH to which we want to add BBI.
+back_threader_registry::back_threader_registry (int max_allowable_paths)
+  : m_max_allowable_paths (max_allowable_paths)
+{
+  m_all_paths.create (5);
+  m_threaded_paths = 0;
+}
 
-   If the resulting path is profitable to thread, then return the
-   final taken edge from the path, NULL otherwise.
+back_threader_registry::~back_threader_registry ()
+{
+  m_all_paths.release ();
+}
+
+bool
+back_threader_registry::thread_through_all_blocks ()
+{
+  return m_lowlevel_registry.thread_through_all_blocks (true);
+}
+
+/* Examine jump threading path PATH and return TRUE if it is profitable to
+   thread it, otherwise return FALSE.
 
    NAME is the SSA_NAME of the variable we found to have a constant
-   value on PATH.  ARG is the constant value of NAME on that path.
+   value on PATH.  If unknown, NAME is NULL.
 
-   BBI will be appended to PATH when we have a profitable jump
-   threading path.  Callers are responsible for removing BBI from PATH
-   in that case.  */
+   If the taken edge out of the path is known ahead of time it is passed in
+   TAKEN_EDGE, otherwise it is NULL.
 
-edge
-thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name,
-					   tree arg,
-					   bool *creates_irreducible_loop)
+   CREATES_IRREDUCIBLE_LOOP, if non-null, is set to TRUE if threading this path
+   would create an irreducible loop.  */
+
+bool
+back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
+						tree name,
+						edge taken_edge,
+						bool *creates_irreducible_loop)
 {
-  /* Note BBI is not in the path yet, hence the +1 in the test below
-     to make sure BBI is accounted for in the path length test.  */
+  gcc_checking_assert (!m_path.is_empty ());
 
-  /* We can get a length of 0 here when the statement that
-     makes a conditional generate a compile-time constant
-     result is in the same block as the conditional.
+  /* We can get an empty path here (excluding the DEF block) when the
+     statement that makes a conditional generate a compile-time
+     constant result is in the same block as the conditional.
 
      That's not really a jump threading opportunity, but instead is
      simple cprop & simplification.  We could handle it here if we
      wanted by wiring up all the incoming edges.  If we run this
      early in IPA, that might be worth doing.   For now we just
      reject that case.  */
-  if (m_path.is_empty ())
-      return NULL;
+  if (m_path.length () <= 1)
+      return false;
 
-  if (m_path.length () + 1
-      > (unsigned) param_max_fsm_thread_length)
+  if (m_path.length () > (unsigned) param_max_fsm_thread_length)
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
-	fprintf (dump_file, "FSM jump-thread path not considered: "
+	fprintf (dump_file, "  FAIL: FSM jump-thread path not considered: "
 		 "the number of basic blocks on the path "
 		 "exceeds PARAM_MAX_FSM_THREAD_LENGTH.\n");
-      return NULL;
+      return false;
     }
 
-  if (m_max_threaded_paths <= 0)
-    {
-      if (dump_file && (dump_flags & TDF_DETAILS))
-	fprintf (dump_file, "FSM jump-thread path not considered: "
-		 "the number of previously recorded FSM paths to "
-		 "thread exceeds PARAM_MAX_FSM_THREAD_PATHS.\n");
-      return NULL;
-    }
-
-  /* Add BBI to the path.
-     From this point onward, if we decide we the path is not profitable
-     to thread, we must remove BBI from the path.  */
-  m_path.safe_push (bbi);
-
   int n_insns = 0;
   gimple_stmt_iterator gsi;
   loop_p loop = m_path[0]->loop_father;
@@ -256,6 +294,8 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name,
 		     SSA_NAMEs, then we do not have enough information
 		     to consider them associated.  */
 		  if (dst != name
+		      && name
+		      && TREE_CODE (name) == SSA_NAME
 		      && (SSA_NAME_VAR (dst) != SSA_NAME_VAR (name)
 			  || !SSA_NAME_VAR (dst))
 		      && !virtual_operand_p (dst))
@@ -276,10 +316,7 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name,
 	      gimple *stmt = gsi_stmt (gsi);
 	      if (gimple_call_internal_p (stmt, IFN_UNIQUE)
 		  || gimple_call_builtin_p (stmt, BUILT_IN_CONSTANT_P))
-		{
-		  m_path.pop ();
-		  return NULL;
-		}
+		return false;
 	      /* Do not count empty statements and labels.  */
 	      if (gimple_code (stmt) != GIMPLE_NOP
 		  && !(gimple_code (stmt) == GIMPLE_ASSIGN
@@ -330,75 +367,52 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name,
 	     "  Overall: %i insns\n",
 	     stmt_insns, n_insns);
 
-  /* We have found a constant value for ARG.  For GIMPLE_SWITCH
-     and GIMPLE_GOTO, we use it as-is.  However, for a GIMPLE_COND
-     we need to substitute, fold and simplify so we can determine
-     the edge taken out of the last block.  */
-  if (gimple_code (stmt) == GIMPLE_COND)
+  if (creates_irreducible_loop)
     {
-      enum tree_code cond_code = gimple_cond_code (stmt);
-
-      /* We know the underyling format of the condition.  */
-      arg = fold_binary (cond_code, boolean_type_node,
-			 arg, gimple_cond_rhs (stmt));
-    }
-
-  /* If this path threaded through the loop latch back into the
-     same loop and the destination does not dominate the loop
-     latch, then this thread would create an irreducible loop.
-
-     We have to know the outgoing edge to figure this out.  */
-  edge taken_edge = find_taken_edge (m_path[0], arg);
-
-  /* There are cases where we may not be able to extract the
-     taken edge.  For example, a computed goto to an absolute
-     address.  Handle those cases gracefully.  */
-  if (taken_edge == NULL)
-    {
-      m_path.pop ();
-      return NULL;
+      /* If this path threaded through the loop latch back into the
+	 same loop and the destination does not dominate the loop
+	 latch, then this thread would create an irreducible loop.  */
+      *creates_irreducible_loop = false;
+      if (taken_edge
+	  && threaded_through_latch
+	  && loop == taken_edge->dest->loop_father
+	  && (determine_bb_domination_status (loop, taken_edge->dest)
+	      == DOMST_NONDOMINATING))
+	*creates_irreducible_loop = true;
     }
 
-  *creates_irreducible_loop = false;
-  if (threaded_through_latch
-      && loop == taken_edge->dest->loop_father
-      && (determine_bb_domination_status (loop, taken_edge->dest)
-	  == DOMST_NONDOMINATING))
-    *creates_irreducible_loop = true;
-
   if (path_crosses_loops)
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
-	fprintf (dump_file, "FSM jump-thread path not considered: "
+	fprintf (dump_file, "  FAIL: FSM jump-thread path not considered: "
 		 "the path crosses loops.\n");
-      m_path.pop ();
-      return NULL;
+      return false;
     }
 
   /* Threading is profitable if the path duplicated is hot but also
      in a case we separate cold path from hot path and permit optimization
      of the hot path later.  Be on the agressive side here. In some testcases,
      as in PR 78407 this leads to noticeable improvements.  */
-  if (m_speed_p && (optimize_edge_for_speed_p (taken_edge) || contains_hot_bb))
+  if (m_speed_p
+      && ((taken_edge && optimize_edge_for_speed_p (taken_edge))
+	  || contains_hot_bb))
     {
       if (n_insns >= param_max_fsm_thread_path_insns)
 	{
 	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    fprintf (dump_file, "FSM jump-thread path not considered: "
+	    fprintf (dump_file, "  FAIL: FSM jump-thread path not considered: "
 		     "the number of instructions on the path "
 		     "exceeds PARAM_MAX_FSM_THREAD_PATH_INSNS.\n");
-	  m_path.pop ();
-	  return NULL;
+	  return false;
 	}
     }
-  else if (n_insns > 1)
+  else if (!m_speed_p && n_insns > 1)
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
-	fprintf (dump_file, "FSM jump-thread path not considered: "
+	fprintf (dump_file, "  FAIL: FSM jump-thread path not considered: "
 		 "duplication of %i insns is needed and optimizing for size.\n",
 		 n_insns);
-      m_path.pop ();
-      return NULL;
+      return false;
     }
 
   /* We avoid creating irreducible inner loops unless we thread through
@@ -410,7 +424,9 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name,
      the path -- in that case there's little the traditional loop
      optimizer would have done anyway, so an irreducible loop is not
      so bad.  */
-  if (!threaded_multiway_branch && *creates_irreducible_loop
+  if (!threaded_multiway_branch
+      && creates_irreducible_loop
+      && *creates_irreducible_loop
       && (n_insns * (unsigned) param_fsm_scale_path_stmts
 	  > (m_path.length () *
 	     (unsigned) param_fsm_scale_path_blocks)))
@@ -418,13 +434,11 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name,
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
 	fprintf (dump_file,
-		 "FSM would create irreducible loop without threading "
+		 "  FAIL: FSM would create irreducible loop without threading "
 		 "multiway branch.\n");
-      m_path.pop ();
-      return NULL;
+      return false;
     }
 
-
   /* If this path does not thread through the loop latch, then we are
      using the FSM threader to find old style jump threads.  This
      is good, except the FSM threader does not re-use an existing
@@ -438,10 +452,9 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name,
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
 	fprintf (dump_file,
-		 "FSM did not thread around loop and would copy too "
+		 "  FAIL: FSM did not thread around loop and would copy too "
 		 "many statements.\n");
-      m_path.pop ();
-      return NULL;
+      return false;
     }
 
   /* When there is a multi-way branch on the path, then threading can
@@ -452,24 +465,69 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, tree name,
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
 	fprintf (dump_file,
-		 "FSM Thread through multiway branch without threading "
+		 "  FAIL: FSM Thread through multiway branch without threading "
 		 "a multiway branch.\n");
-      m_path.pop ();
-      return NULL;
+      return false;
     }
-  return taken_edge;
+  return true;
+}
+
+/* Return the taken edge out of a path, assuming that the underlying assignment
+   or PHI SSA resolves to ARG.  */
+
+edge
+thread_jumps::find_taken_edge (const vec<basic_block> &path, tree arg)
+{
+  if (TREE_CODE_CLASS (TREE_CODE (arg)) != tcc_constant)
+    return NULL;
+
+  gcc_checking_assert (!path.is_empty ());
+  gimple *stmt = get_gimple_control_stmt (m_path[0]);
+
+  /* We have found a constant value for ARG.  For GIMPLE_SWITCH
+     and GIMPLE_GOTO, we use it as-is.  However, for a GIMPLE_COND
+     we need to substitute, fold and simplify so we can determine
+     the edge taken out of the last block.  */
+  if (gimple_code (stmt) == GIMPLE_COND)
+    {
+      enum tree_code cond_code = gimple_cond_code (stmt);
+
+      /* We know the underlying format of the condition.  */
+      arg = fold_binary (cond_code, boolean_type_node,
+			 arg, gimple_cond_rhs (stmt));
+    }
+
+  /* If this path threaded through the loop latch back into the
+     same loop and the destination does not dominate the loop
+     latch, then this thread would create an irreducible loop.
+
+     We have to know the outgoing edge to figure this out.  */
+  return ::find_taken_edge (m_path[0], arg);
 }
 
 /* The current path PATH is a vector of blocks forming a jump threading
    path in reverse order.  TAKEN_EDGE is the edge taken from path[0].
 
    Convert the current path into the form used by register_jump_thread and
-   register it.   */
+   register it.
 
-void
-thread_jumps::convert_and_register_current_path (edge taken_edge)
+   Return TRUE if successful or FALSE otherwise.  */
+
+bool
+back_threader_registry::register_path (const vec<basic_block> &m_path,
+				       edge taken_edge)
 {
-  vec<jump_thread_edge *> *path = m_registry.allocate_thread_path ();
+  if (m_threaded_paths > m_max_allowable_paths)
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	fprintf (dump_file, "  FAIL: FSM jump-thread path not considered: "
+		 "the number of previously recorded FSM paths to "
+		 "thread exceeds PARAM_MAX_FSM_THREAD_PATHS.\n");
+      return false;
+    }
+
+  vec<jump_thread_edge *> *jump_thread_path
+    = m_lowlevel_registry.allocate_thread_path ();
 
   /* Record the edges between the blocks in PATH.  */
   for (unsigned int j = 0; j + 1 < m_path.length (); j++)
@@ -480,17 +538,19 @@ thread_jumps::convert_and_register_current_path (edge taken_edge)
       edge e = find_edge (bb1, bb2);
       gcc_assert (e);
       jump_thread_edge *x
-	= m_registry.allocate_thread_edge (e, EDGE_FSM_THREAD);
-      path->safe_push (x);
+	= m_lowlevel_registry.allocate_thread_edge (e, EDGE_FSM_THREAD);
+      jump_thread_path->safe_push (x);
     }
 
   /* Add the edge taken when the control variable has value ARG.  */
   jump_thread_edge *x
-    = m_registry.allocate_thread_edge (taken_edge, EDGE_NO_COPY_SRC_BLOCK);
-  path->safe_push (x);
+    = m_lowlevel_registry.allocate_thread_edge (taken_edge,
+						EDGE_NO_COPY_SRC_BLOCK);
+  jump_thread_path->safe_push (x);
 
-  m_registry.register_jump_thread (path);
-  --m_max_threaded_paths;
+  m_lowlevel_registry.register_jump_thread (jump_thread_path);
+  ++m_threaded_paths;
+  return true;
 }
 
 /* While following a chain of SSA_NAME definitions, we jumped from a
@@ -558,19 +618,17 @@ thread_jumps::check_subpath_and_update_thread_path (basic_block last_bb,
    DEF_BB is the basic block that ultimately defines the constant.  */
 
 void
-thread_jumps::register_jump_thread_path_if_profitable (tree name, tree arg,
-						       basic_block def_bb)
+thread_jumps::maybe_register_path (const vec<basic_block> &m_path,
+				   tree name,
+				   edge taken_edge)
 {
-  if (TREE_CODE_CLASS (TREE_CODE (arg)) != tcc_constant)
-    return;
-
   bool irreducible = false;
-  edge taken_edge = profitable_jump_thread_path (def_bb, name, arg,
-						 &irreducible);
-  if (taken_edge)
+  bool profitable = m_profit.profitable_path_p (m_path, name, taken_edge,
+						&irreducible);
+  if (profitable)
     {
-      convert_and_register_current_path (taken_edge);
-      m_path.pop ();
+      if (!m_registry.register_path (m_path, taken_edge))
+	return;
 
       if (irreducible)
 	vect_free_loop_info_assumptions (m_path[0]->loop_father);
@@ -585,7 +643,7 @@ thread_jumps::register_jump_thread_path_if_profitable (tree name, tree arg,
    NAME having a constant value.  */
 
 void
-thread_jumps::handle_phi (gphi *phi, tree name, basic_block def_bb)
+thread_jumps::handle_phi (gphi *phi, basic_block def_bb)
 {
   /* Iterate over the arguments of PHI.  */
   for (unsigned int i = 0; i < gimple_phi_num_args (phi); i++)
@@ -608,7 +666,11 @@ thread_jumps::handle_phi (gphi *phi, tree name, basic_block def_bb)
 	  continue;
 	}
 
-      register_jump_thread_path_if_profitable (name, arg, bbi);
+      m_path.safe_push (bbi);
+      edge taken_edge = find_taken_edge (m_path, arg);
+      if (taken_edge)
+	maybe_register_path (m_path, m_name, taken_edge);
+      m_path.pop ();
     }
 }
 
@@ -650,25 +712,23 @@ handle_assignment_p (gimple *stmt)
    NAME having a constant value.  */
 
 void
-thread_jumps::handle_assignment (gimple *stmt, tree name, basic_block def_bb)
+thread_jumps::handle_assignment (gimple *stmt, basic_block def_bb)
 {
   tree arg = gimple_assign_rhs1 (stmt);
 
   if (TREE_CODE (arg) == SSA_NAME)
     fsm_find_control_statement_thread_paths (arg);
-
   else
     {
-      /* register_jump_thread_path_if_profitable will push the current
-	 block onto the path.  But the path will always have the current
-	 block at this point.  So we can just pop it.  */
-      m_path.pop ();
-
-      register_jump_thread_path_if_profitable (name, arg, def_bb);
-
-      /* And put the current block back onto the path so that the
-	 state of the stack is unchanged when we leave.  */
-      m_path.safe_push (def_bb);
+      if (CHECKING_P)
+	{
+	  gcc_assert (!m_path.is_empty ());
+	  basic_block top = m_path[m_path.length () - 1];
+	  gcc_assert (top == def_bb);
+	}
+      edge taken_edge = find_taken_edge (m_path, arg);
+      if (taken_edge)
+	maybe_register_path (m_path, m_name, taken_edge);
     }
 }
 
@@ -738,9 +798,9 @@ thread_jumps::fsm_find_control_statement_thread_paths (tree name)
   gcc_assert (m_path.last () == def_bb);
 
   if (gimple_code (def_stmt) == GIMPLE_PHI)
-    handle_phi (as_a <gphi *> (def_stmt), name, def_bb);
+    handle_phi (as_a <gphi *> (def_stmt), def_bb);
   else if (gimple_code (def_stmt) == GIMPLE_ASSIGN)
-    handle_assignment (def_stmt, name, def_bb);
+    handle_assignment (def_stmt, def_bb);
 
   /* Remove all the nodes that we added from NEXT_PATH.  */
   if (next_path_length)
@@ -756,8 +816,8 @@ thread_jumps::fsm_find_control_statement_thread_paths (tree name)
    code path.  */
 
 void
-thread_jumps::find_jump_threads_backwards (basic_block bb, bool speed_p)
-{     
+thread_jumps::find_jump_threads_backwards (basic_block bb)
+{
   gimple *stmt = get_gimple_control_stmt (bb);
   if (!stmt)
     return;
@@ -785,8 +845,7 @@ thread_jumps::find_jump_threads_backwards (basic_block bb, bool speed_p)
   m_path.safe_push (bb);
   m_visited_bbs.empty ();
   m_seen_loop_phi = false;
-  m_speed_p = speed_p;
-  m_max_threaded_paths = param_max_fsm_thread_paths;
+  m_name = name;
 
   fsm_find_control_statement_thread_paths (name);
 }
@@ -836,7 +895,7 @@ pass_thread_jumps::execute (function *fun)
   FOR_EACH_BB_FN (bb, fun)
     {
       if (EDGE_COUNT (bb->succs) > 1)
-	threader.find_jump_threads_backwards (bb, true);
+	threader.find_jump_threads_backwards (bb);
     }
   bool changed = threader.thread_through_all_blocks ();
 
@@ -892,12 +951,12 @@ pass_early_thread_jumps::execute (function *fun)
   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
 
   /* Try to thread each block with more than one successor.  */
-  thread_jumps threader;
+  thread_jumps threader (/*speed_p=*/false);
   basic_block bb;
   FOR_EACH_BB_FN (bb, fun)
     {
       if (EDGE_COUNT (bb->succs) > 1)
-	threader.find_jump_threads_backwards (bb, false);
+	threader.find_jump_threads_backwards (bb);
     }
   threader.thread_through_all_blocks ();
 
-- 
2.30.2
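As an aside on the profitability code being moved above: the irreducible-loop guard in profitable_path_p compares a scaled instruction count against a scaled path length.  A standalone sketch of that comparison follows; the constants stand in for GCC's --param fsm-scale-path-stmts and --param fsm-scale-path-blocks knobs, and the values are illustrative, not authoritative.

```cpp
// Sketch of the size test guarding irreducible-loop creation in
// profitable_path_p.  The scale factors mirror the shape of GCC's
// --param fsm-scale-path-stmts / --param fsm-scale-path-blocks knobs;
// the values below are stand-ins for illustration only.
const unsigned scale_path_stmts = 2;
const unsigned scale_path_blocks = 3;

// Creating an irreducible loop is rejected when the scaled number of
// copied statements outweighs the scaled number of blocks on the path.
bool
irreducible_loop_too_costly (unsigned n_insns, unsigned path_blocks)
{
  return n_insns * scale_path_stmts > path_blocks * scale_path_blocks;
}
```

With these stand-in values, a 3-block path tolerates up to 4 copied instructions before an irreducible loop is refused.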



* Re: [PATCH 1/2] Jump threader refactor.
  2021-04-28 17:12 ` [PATCH 1/2] " Aldy Hernandez
@ 2021-04-30 15:53   ` Jeff Law
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff Law @ 2021-04-30 15:53 UTC (permalink / raw)
  To: Aldy Hernandez, GCC patches


On 4/28/2021 11:12 AM, Aldy Hernandez wrote:
> This is an overall refactor of the jump threader, both for the low level
> bits in tree-ssa-threadupdate.* and the high level bits in
> tree-ssa-threadedge.*.
>
> There should be no functional changes.
>
> Some of the benefits of the refactor are:
>
> a) Eliminates some icky global state (for example the x_vr_values hack).

Thank goodness.  This was the biggest wart from the VRP refactoring a 
couple years back.


>
> b) Provides some semblance of an API for the threader.

Definitely good.  As you've noted, there's a few distinct phases 
(simplification for threading, registering threadinng opportunities, 
realization of threading opportunities).  There's a relatively narrow 
interface between each and putting a real API in between those phases is 
definitely an improvement.
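
A minimal sketch of that phase separation, using hypothetical names rather than the actual GCC classes (the real interfaces live in tree-ssa-threadedge.h and tree-ssa-threadupdate.h):

```cpp
// Illustrative-only sketch of the three threader phases: a simplifier
// callback (phase 1), a path registry (phase 2), and the realization
// step (phase 3).  All names here are hypothetical stand-ins.
#include <vector>

struct basic_block_stub { int index; };

// Phase 1: simplify a statement in the context of a candidate path.
class jump_simplifier_sketch
{
public:
  virtual ~jump_simplifier_sketch () {}
  // Return a folded value, or the input unchanged if nothing simplified.
  virtual int simplify (int stmt_value) { return stmt_value; }
};

// Phases 2 and 3: record candidate paths, then realize them all.
class path_registry_sketch
{
  std::vector<std::vector<basic_block_stub>> m_paths;
public:
  bool register_path (const std::vector<basic_block_stub> &path)
  {
    if (path.size () < 2)
      return false;	// A single block is cprop, not a jump thread.
    m_paths.push_back (path);
    return true;
  }
  bool thread_through_all_blocks ()
  {
    bool changed = !m_paths.empty ();
    // A real implementation would duplicate blocks and redirect
    // edges here before discarding the recorded paths.
    m_paths.clear ();
    return changed;
  }
};
```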


>
> c) Makes it clearer to see what parts are from the high level
> threader, and what parts belong in the low level path registry and
> BB threading mechanism.

Exactly.  I wouldn't lose any sleep if the registry bits moved into 
their own file.


>
> d) Avoids passing a ton of variables around.

Yea.  I had some similar patches here -- as various routines move into 
classes we don't need that nonsense anymore.


>
> e) Provides for easier sharing with the backward threader.
>
> f) Merges the simplify stmt code in VRP and DOM as they were nearly
> identical.

Yea.  I can't even remember why those bits weren't shared.  They should 
have been damn close to identical.


>
> This has been bootstrapped and regression tested on x86-64 Linux.
> Jeff had also been testing this patch as part of his Fedora throughout the
> off-season.

More correctly, it's been in the Upstream GCC tester for months, so it's 
been tested across various native and embedded targets.


The Fedora tester is different :-)  I'm hoping to get the Fedora 
snapshot tester fired up again next week so that we're building Fedora 
against GCC snapshots regularly for at least x86_64 and aarch64.




>
> gcc/ChangeLog:
>
> 	* tree-ssa-dom.c (class dom_jump_threader_simplifier): New.
> 	(class dom_opt_dom_walker): Initialize some class variables.
> 	(pass_dominator::execute): Pass evrp_range_analyzer and
> 	dom_jump_threader_simplifier to dom_opt_dom_walker.
> 	Adjust for some functions moving into classes.
> 	(simplify_stmt_for_jump_threading): Adjust and move to...
> 	(jump_threader_simplifier::simplify): ...here.
> 	(dom_opt_dom_walker::before_dom_children): Adjust for
> 	m_evrp_range_analyzer.
> 	(dom_opt_dom_walker::after_dom_children): Remove x_vr_values hack.
> 	(test_for_singularity): Place in dom_opt_dom_walker class.
> 	(dom_opt_dom_walker::optimize_stmt): The argument
> 	evrp_range_analyzer is now a class field.
> 	* tree-ssa-threadbackward.c (class thread_jumps): Add m_registry.
> 	(thread_jumps::thread_through_all_blocks): New.
> 	(thread_jumps::convert_and_register_current_path): Use m_registry.
> 	(pass_thread_jumps::execute): Adjust for thread_through_all_blocks
> 	being in the threader class.
> 	(pass_early_thread_jumps::execute): Same.
> 	* tree-ssa-threadedge.c (threadedge_initialize_values): Move...
> 	(jump_threader::jump_threader): ...here.
> 	(threadedge_finalize_values): Move...
> 	(jump_threader::~jump_threader): ...here.
> 	(jump_threader::remove_jump_threads_including): New.
> 	(jump_threader::thread_through_all_blocks): New.
> 	(record_temporary_equivalences_from_phis): Move...
> 	(jump_threader::record_temporary_equivalences_from_phis): ...here.
> 	(record_temporary_equivalences_from_stmts_at_dest): Move...
> 	(jump_threader::record_temporary_equivalences_from_stmts_at_dest):
> 	Here...
> 	(simplify_control_stmt_condition_1): Move to jump_threader class.
> 	(simplify_control_stmt_condition): Move...
> 	(jump_threader::simplify_control_stmt_condition): ...here.
> 	(thread_around_empty_blocks): Move...
> 	(jump_threader::thread_around_empty_blocks): ...here.
> 	(thread_through_normal_block): Move...
> 	(jump_threader::thread_through_normal_block): ...here.
> 	(thread_across_edge): Move...
> 	(jump_threader::thread_across_edge): ...here.
> 	(thread_outgoing_edges): Move...
> 	(jump_threader::thread_outgoing_edges): ...here.
> 	* tree-ssa-threadedge.h: Move externally facing functings...
> 	(class jump_threader): ...here...
> 	(class jump_threader_simplifier): ...and here.
> 	* tree-ssa-threadupdate.c (struct redirection_data): Remove comment.
> 	(jump_thread_path_allocator::jump_thread_path_allocator): New.
> 	(jump_thread_path_allocator::~jump_thread_path_allocator): New.
> 	(jump_thread_path_allocator::allocate_thread_edge): New.
> 	(jump_thread_path_allocator::allocate_thread_path): New.
> 	(jump_thread_path_registry::jump_thread_path_registry): New.
> 	(jump_thread_path_registry::~jump_thread_path_registry): New.
> 	(jump_thread_path_registry::allocate_thread_edge): New.
> 	(jump_thread_path_registry::allocate_thread_path): New.
> 	(dump_jump_thread_path): Make extern.
> 	(debug (const vec<jump_thread_edge *> &path)): New.
> 	(struct removed_edges): Move to tree-ssa-threadupdate.h.
> 	(struct thread_stats_d): Remove.
> 	(remove_ctrl_stmt_and_useless_edges): Make static.
> 	(lookup_redirection_data): Move...
> 	(jump_thread_path_registry::lookup_redirection_data): ...here.
> 	(ssa_redirect_edges): Make static.
> 	(thread_block_1): Move...
> 	(jump_thread_path_registry::thread_block_1): ...here.
> 	(thread_block): Move...
> 	(jump_thread_path_registry::thread_block): ...here.
> 	(thread_through_loop_header):  Move...
> 	(jump_thread_path_registry::thread_through_loop_header): ...here.
> 	(mark_threaded_blocks): Move...
> 	(jump_thread_path_registry::mark_threaded_blocks): ...here.
> 	(debug_path): Move...
> 	(jump_thread_path_registry::debug_path): ...here.
> 	(debug_all_paths): Move...
> 	(jump_thread_path_registry::dump): ..here.
> 	(rewire_first_differing_edge): Move...
> 	(jump_thread_path_registry::rewire_first_differing_edge): ...here.
> 	(adjust_paths_after_duplication): Move...
> 	(jump_thread_path_registry::adjust_paths_after_duplication): ...here.
> 	(duplicate_thread_path): Move...
> 	(jump_thread_path_registry::duplicate_thread_path): ..here.
> 	(remove_jump_threads_including): Move...
> 	(jump_thread_path_registry::remove_jump_threads_including): ...here.
> 	(thread_through_all_blocks): Move to...
> 	(jump_thread_path_registry::thread_through_all_blocks): ...here.
> 	(delete_jump_thread_path): Remove.
> 	(register_jump_thread): Move...
> 	(jump_thread_path_registry::register_jump_thread): ...here.
> 	* tree-ssa-threadupdate.h: Move externally facing functions...
> 	(class jump_thread_path_allocator): ...here...
> 	(class jump_thread_path_registry): ...and here.
> 	(thread_through_all_blocks): Remove.
> 	(struct removed_edges): New.
> 	(register_jump_thread): Remove.
> 	(remove_jump_threads_including): Remove.
> 	(delete_jump_thread_path): Remove.
> 	(remove_ctrl_stmt_and_useless_edges): Remove.
> 	(free_dom_edge_info): New prototype.
> 	* tree-vrp.c: Remove x_vr_values hack.
> 	(class vrp_jump_threader_simplifier): New.
> 	(vrp_jump_threader_simplifier::simplify): New.
> 	(vrp_jump_threader::vrp_jump_threader): Adjust method signature.
> 	Remove m_dummy_cond.
> 	Instantiate m_simplifier and m_threader.
> 	(vrp_jump_threader::thread_through_all_blocks): New.
> 	(vrp_jump_threader::simplify_stmt): Remove.
> 	(vrp_jump_threader::after_dom_children): Do not set m_dummy_cond.
> 	Remove x_vr_values hack.
> 	(execute_vrp): Adjust for thread_through_all_blocks being in a
> 	class.

OK.  Thanks for taking care of this.

jeff



* Re: [PATCH 2/2] Refactor backward threader registry and profitability code into classes.
  2021-04-28 17:12 ` [PATCH 2/2] Refactor backward threader registry and profitability code into classes Aldy Hernandez
@ 2021-04-30 16:10   ` Jeff Law
  2021-04-30 16:16     ` Aldy Hernandez
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Law @ 2021-04-30 16:10 UTC (permalink / raw)
  To: Aldy Hernandez, GCC patches


On 4/28/2021 11:12 AM, Aldy Hernandez wrote:
> This refactors the registry and the profitability code from the
> backwards threader into two separate classes.  It cleans up the code,
> and makes it easier for alternate implementations to share code.
>
> Tested on x86-64 Linux.
>
> gcc/ChangeLog:
>
> 	* tree-ssa-threadbackward.c (class thread_jumps): Split out code
> 	from here...
> 	(class back_threader_registry): ...to here...
> 	(class back_threader_profitability): ...and here...
> 	(thread_jumps::thread_through_all_blocks): Remove argument.
> 	(back_threader_registry::back_threader_registry): New.
> 	(back_threader_registry::~back_threader_registry): New.
> 	(back_threader_registry::thread_through_all_blocks): New.
> 	(thread_jumps::profitable_jump_thread_path): Move from here...
> 	(back_threader_profitability::profitable_path_p): ...to here.
> 	(thread_jumps::find_taken_edge): New.
> 	(thread_jumps::convert_and_register_current_path): Move...
> 	(back_threader_registry::register_path): ...to here.
> 	(thread_jumps::register_jump_thread_path_if_profitable): Move...
> 	(thread_jumps::maybe_register_path): ...to here.
> 	(thread_jumps::handle_phi): Call find_taken_edge and
> 	maybe_register_path.
> 	(thread_jumps::handle_assignment): Same.
> 	(thread_jumps::fsm_find_control_statement_thread_paths): Remove
> 	tree argument to handle_phi and handle_assignment.
> 	(thread_jumps::find_jump_threads_backwards): Set m_name.  Remove
> 	set of m_speed_p and m_max_threaded_paths.
> 	(pass_thread_jumps::execute): Remove second argument from
> 	find_jump_threads_backwards.
> 	(pass_early_thread_jumps::execute): Same.

OK.  And if you wanted to pull any of that code into its own file, 
consider that pre-approved.

jeff



* Re: [PATCH 2/2] Refactor backward threader registry and profitability code into classes.
  2021-04-30 16:10   ` Jeff Law
@ 2021-04-30 16:16     ` Aldy Hernandez
  0 siblings, 0 replies; 7+ messages in thread
From: Aldy Hernandez @ 2021-04-30 16:16 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC patches, Andrew MacLeod

I may just do that :).

Thanks.
Aldy

On Fri, Apr 30, 2021 at 6:10 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
> On 4/28/2021 11:12 AM, Aldy Hernandez wrote:
> > This refactors the registry and the profitability code from the
> > backwards threader into two separate classes.  It cleans up the code,
> > and makes it easier for alternate implementations to share code.
> >
> > Tested on x86-64 Linux.
> >
> > gcc/ChangeLog:
> >
> >       * tree-ssa-threadbackward.c (class thread_jumps): Split out code
> >       from here...
> >       (class back_threader_registry): ...to here...
> >       (class back_threader_profitability): ...and here...
> >       (thread_jumps::thread_through_all_blocks): Remove argument.
> >       (back_threader_registry::back_threader_registry): New.
> >       (back_threader_registry::~back_threader_registry): New.
> >       (back_threader_registry::thread_through_all_blocks): New.
> >       (thread_jumps::profitable_jump_thread_path): Move from here...
> >       (back_threader_profitability::profitable_path_p): ...to here.
> >       (thread_jumps::find_taken_edge): New.
> >       (thread_jumps::convert_and_register_current_path): Move...
> >       (back_threader_registry::register_path): ...to here.
> >       (thread_jumps::register_jump_thread_path_if_profitable): Move...
> >       (thread_jumps::maybe_register_path): ...to here.
> >       (thread_jumps::handle_phi): Call find_taken_edge and
> >       maybe_register_path.
> >       (thread_jumps::handle_assignment): Same.
> >       (thread_jumps::fsm_find_control_statement_thread_paths): Remove
> >       tree argument to handle_phi and handle_assignment.
> >       (thread_jumps::find_jump_threads_backwards): Set m_name.  Remove
> >       set of m_speed_p and m_max_threaded_paths.
> >       (pass_thread_jumps::execute): Remove second argument from
> >       find_jump_threads_backwards.
> >       (pass_early_thread_jumps::execute): Same.
>
> OK.  And if you wanted to pull any of that code into its own file,
> consider that pre-approved.
>
> jeff
>




end of thread, other threads:[~2021-04-30 16:17 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-28 17:12 [PATCH 0/2] Jump threader refactor Aldy Hernandez
2021-04-28 17:12 ` [PATCH 1/2] " Aldy Hernandez
2021-04-30 15:53   ` Jeff Law
2021-04-28 17:12 ` [PATCH 2/2] Refactor backward threader registry and profitability code into classes Aldy Hernandez
2021-04-30 16:10   ` Jeff Law
2021-04-30 16:16     ` Aldy Hernandez

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).