public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/2] Ranger-based backwards threader implementation.
@ 2021-06-28 16:21 Aldy Hernandez
  2021-06-28 16:21 ` [PATCH 1/2] Implement basic block path solver Aldy Hernandez
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Aldy Hernandez @ 2021-06-28 16:21 UTC
  To: GCC patches
  Cc: Andrew MacLeod, Jeff Law, Richard Biener, Martin Sebor, Aldy Hernandez

This is the ranger-based backwards threader.  It is divided into two
parts: the solver and the path discovery bits.

The solver is generic enough that it may be of use to other passes,
so it's been abstracted into its own separate class/file.  Andrew and
I have already gone over it, so I don't think a review is necessary.
Besides, it's technically an extension of the ranger infrastructure.

On the other hand, the path discovery bits could benefit from the
watchful eye of the jump threading experts.

Documenting the solver in a [ranger-tech] post is on my TODO list,
as I think it would be useful as an example of GORI as a general
tool, outside the VRP world.

As I have mentioned elsewhere, I have gone through each test and
documented the reasons why they were adjusted (when useful).  The
reviewer(s) may benefit from looking at the test notes.

I have added a --param=threader-mode={ranger,legacy} option, which I
hope to remove shortly after this goes in.  It has been useful for
diagnosing issues in the past, though perhaps not so much now.  I've
left it in case there's a remote interest in using it during stage1,
but removing it could be a huge cleanup to tree-ssa-threadbackward.c.

If/when accepted, I will open 2-3 PRs with the XFAILed tests as
requested.  I am still working on distilling a C counterpart for
the libphobos missing thread edge.  It'll hopefully be ready by the
time the review is done.

A version of this patchset with the verification code has
been tested on x86-64, ppc64, ppc64le, and aarch64 (all Linux).

I am currently re-testing on x86-64 Linux, but will not re-test on the
rest of the architectures because...OMG aarch64 is so slow!

Thanks.
Aldy

Aldy Hernandez (2):
  Implement basic block path solver.
  Backwards jump threader rewrite with ranger.

 gcc/Makefile.in                               |   6 +
 gcc/flag-types.h                              |   7 +
 gcc/params.opt                                |  17 +
 .../g++.dg/debug/dwarf2/deallocator.C         |   3 +-
 gcc/testsuite/gcc.c-torture/compile/pr83510.c |  33 ++
 gcc/testsuite/gcc.dg/Wrestrict-22.c           |   3 +
 gcc/testsuite/gcc.dg/loop-unswitch-2.c        |   2 +-
 gcc/testsuite/gcc.dg/old-style-asm-1.c        |   5 +-
 gcc/testsuite/gcc.dg/pr68317.c                |   4 +-
 gcc/testsuite/gcc.dg/pr97567-2.c              |   2 +-
 gcc/testsuite/gcc.dg/predict-9.c              |   4 +-
 gcc/testsuite/gcc.dg/shrink-wrap-loop.c       |  53 ++
 gcc/testsuite/gcc.dg/sibcall-1.c              |  10 +
 .../gcc.dg/tree-ssa/builtin-sprintf-3.c       |   5 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21001.c       |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/pr21294.c       |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/pr21417.c       |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c     |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21563.c       |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr49039.c       |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c     |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c     |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c     |   2 +-
 .../gcc.dg/tree-ssa/ranger-threader-1.c       |  20 +
 .../gcc.dg/tree-ssa/ranger-threader-2.c       |  39 ++
 .../gcc.dg/tree-ssa/ranger-threader-3.c       |  41 ++
 .../gcc.dg/tree-ssa/ranger-threader-4.c       |  83 +++
 gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c  |   4 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-11.c       |   2 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-12.c       |   2 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-14.c       |   1 +
 .../gcc.dg/tree-ssa/ssa-dom-thread-18.c       |   5 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-6.c        |   4 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-7.c        |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c    |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/vrp02.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp03.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp05.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp06.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp07.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp09.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp19.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp20.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp33.c         |   2 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-16.c         |   7 +
 .../gcc.target/i386/avx2-vect-aggressive.c    |   2 +-
 gcc/tree-ssa-path-solver.cc                   | 310 ++++++++++++
 gcc/tree-ssa-path-solver.h                    |  85 ++++
 gcc/tree-ssa-threadbackward.c                 | 475 +++++++++++++++++-
 gcc/tree-ssa-threadedge.c                     |  20 +-
 gcc/tree-ssa-threadedge.h                     |   3 +-
 gcc/tree-ssa-threadupdate.c                   |  12 +-
 gcc/tree-ssa-threadupdate.h                   |   2 +-
 .../libgomp.graphite/force-parallel-4.c       |   1 +
 .../libgomp.graphite/force-parallel-8.c       |   2 +
 58 files changed, 1261 insertions(+), 54 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c
 create mode 100644 gcc/tree-ssa-path-solver.cc
 create mode 100644 gcc/tree-ssa-path-solver.h

-- 
2.31.1



* [PATCH 1/2] Implement basic block path solver.
  2021-06-28 16:21 [PATCH 0/2] Ranger-based backwards threader implementation Aldy Hernandez
@ 2021-06-28 16:21 ` Aldy Hernandez
  2021-07-01 22:20   ` Jeff Law
  2021-06-28 16:21 ` [PATCH 2/2] Backwards jump threader rewrite with ranger Aldy Hernandez
  2021-06-28 23:19 ` [PATCH 0/2] Ranger-based backwards threader implementation Martin Sebor
  2 siblings, 1 reply; 20+ messages in thread
From: Aldy Hernandez @ 2021-06-28 16:21 UTC
  To: GCC patches
  Cc: Andrew MacLeod, Jeff Law, Richard Biener, Martin Sebor, Aldy Hernandez

This is the main basic block path solver for use in the ranger-based
backwards threader.  Given a path of BBs, the class can solve the final
conditional or any SSA name used in calculating the final conditional.

The main API is:

// This class is a basic block path solver.  Given a set of BBs
// indicating a path through the CFG, range_in_path() will return the
// range of an SSA as if the BBs in the path would have been executed
// in order.
//
// Only SSA names passed in IMPORTS are precomputed, and can be
// queried.
//
// Note that the blocks are in reverse order, thus the exit block is
// path[0].

class path_solver
{
public:
  path_solver (gimple_ranger &ranger);
  virtual ~path_solver ();
  void precompute_ranges (const vec<basic_block> *path,
			  const bitmap_head *imports);
  void range_in_path (irange &, tree name);
  void range_in_path (irange &, gimple *);
};
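
To make the calling convention concrete, here is a minimal standalone
sketch of the same idea.  It does not use any GCC internals: Range,
ToyBlock, ToyPathSolver and demo_range_on_path are hypothetical
stand-ins invented for illustration, and the toy only models constant
definitions and per-edge ranges, where the real solver relies on
fold_range and GORI.  What it does mirror is the shape of the API: the
path is stored in reverse (exit block at path[0]), ranges are
precomputed only for the imports, and a range exported on an edge is
intersected with whatever is already cached.

```cpp
#include <algorithm>
#include <climits>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Closed integer interval; a hypothetical stand-in for irange.
struct Range {
  long lo, hi;
  static Range varying() { return {LONG_MIN, LONG_MAX}; }
  void intersect(const Range &o) {
    lo = std::max(lo, o.lo);
    hi = std::min(hi, o.hi);
  }
};

// Toy basic block: names assigned a constant in the block, plus the
// ranges known to hold on the edge to a given successor block id.
struct ToyBlock {
  std::map<std::string, long> defs;
  std::map<int, std::map<std::string, Range>> edge_ranges;
};

class ToyPathSolver {
public:
  explicit ToyPathSolver(const std::map<int, ToyBlock> &cfg) : m_cfg(cfg) {}

  // PATH is in reverse order: path[0] is the exit block.
  void precompute_ranges(const std::vector<int> &path,
                         const std::set<std::string> &imports) {
    m_cache.clear();
    // Walk from the entry block (last element) to the exit block.
    for (std::size_t pos = path.size(); pos-- > 0;) {
      const ToyBlock &bb = m_cfg.at(path[pos]);

      // A definition in this block replaces any cached range.
      for (const std::string &name : imports) {
        auto d = bb.defs.find(name);
        if (d != bb.defs.end())
          m_cache[name] = {d->second, d->second};
      }
      if (pos == 0)
        break;  // Exit block: there is no outgoing path edge.

      // Refine with ranges exported on the edge to the next block.
      auto e = bb.edge_ranges.find(path[pos - 1]);
      if (e == bb.edge_ranges.end())
        continue;
      for (const std::string &name : imports) {
        auto r = e->second.find(name);
        if (r == e->second.end())
          continue;
        Range out = r->second;
        auto c = m_cache.find(name);
        if (c != m_cache.end())
          out.intersect(c->second);
        m_cache[name] = out;
      }
    }
  }

  // Names that were never resolved along the path come back as varying.
  Range range_in_path(const std::string &name) const {
    auto c = m_cache.find(name);
    return c == m_cache.end() ? Range::varying() : c->second;
  }

private:
  const std::map<int, ToyBlock> &m_cfg;
  std::map<std::string, Range> m_cache;
};

// Path BB2 -> BB3 -> BB4, passed to the solver as {4, 3, 2}.
Range demo_range_on_path(const std::string &name) {
  std::map<int, ToyBlock> cfg;
  cfg[2].edge_ranges[3]["x"] = {0, 10};   // on 2->3, x is in [0, 10]
  cfg[3].defs["z"] = 7;                   // BB3 assigns z = 7
  cfg[3].edge_ranges[4]["x"] = {5, 100};  // on 3->4, x is in [5, 100]
  cfg[4] = ToyBlock{};                    // exit block
  ToyPathSolver solver(cfg);
  solver.precompute_ranges({4, 3, 2}, {"x", "z"});
  return solver.range_in_path(name);
}
```

In this toy, demo_range_on_path ("x") yields [5, 10], since both edge
constraints apply along the path, and a name that the path never
constrains comes back as varying.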

gcc/ChangeLog:

	* Makefile.in (OBJS): Add tree-ssa-path-solver.o.
	* tree-ssa-path-solver.cc: New file.
	* tree-ssa-path-solver.h: New file.
---
 gcc/Makefile.in             |   1 +
 gcc/tree-ssa-path-solver.cc | 310 ++++++++++++++++++++++++++++++++++++
 gcc/tree-ssa-path-solver.h  |  85 ++++++++++
 3 files changed, 396 insertions(+)
 create mode 100644 gcc/tree-ssa-path-solver.cc
 create mode 100644 gcc/tree-ssa-path-solver.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index ebf26442992..66cc5f9529e 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1644,6 +1644,7 @@ OBJS = \
 	tree-ssa-loop.o \
 	tree-ssa-math-opts.o \
 	tree-ssa-operands.o \
+	tree-ssa-path-solver.o \
 	tree-ssa-phiopt.o \
 	tree-ssa-phiprop.o \
 	tree-ssa-pre.o \
diff --git a/gcc/tree-ssa-path-solver.cc b/gcc/tree-ssa-path-solver.cc
new file mode 100644
index 00000000000..1e2c37cff78
--- /dev/null
+++ b/gcc/tree-ssa-path-solver.cc
@@ -0,0 +1,310 @@
+/* Basic block path solver.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   Contributed by Aldy Hernandez <aldyh@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "tree.h"
+#include "gimple.h"
+#include "cfganal.h"
+#include "value-range.h"
+#include "gimple-range.h"
+#include "tree-pretty-print.h"
+#include "tree-ssa-path-solver.h"
+#include "ssa.h"
+
+// Internal construct to help facilitate debugging of solver.
+#define DEBUG_SOLVER getenv("DEBUG")
+
+path_solver::path_solver (gimple_ranger &ranger)
+  : m_ranger (ranger)
+{
+  m_cache = new ssa_global_cache;
+  m_has_cache_entry = BITMAP_ALLOC (NULL);
+  m_path = NULL;
+}
+
+path_solver::~path_solver ()
+{
+  BITMAP_FREE (m_has_cache_entry);
+  delete m_cache;
+}
+
+// Mark cache entry for NAME as unused.
+
+void
+path_solver::clear_cache (tree name)
+{
+  unsigned v = SSA_NAME_VERSION (name);
+  bitmap_clear_bit (m_has_cache_entry, v);
+}
+
+// If NAME has a cache entry, return it in R, and return TRUE.
+
+inline bool
+path_solver::get_cache (irange &r, tree name)
+{
+  if (!gimple_range_ssa_p (name))
+    return get_global_range_query ()->range_of_expr (r, name);
+
+  unsigned v = SSA_NAME_VERSION (name);
+  if (bitmap_bit_p (m_has_cache_entry, v))
+    return m_cache->get_global_range (r, name);
+
+  return false;
+}
+
+// Set the cache entry for NAME to R.
+
+void
+path_solver::set_cache (const irange &r, tree name)
+{
+  unsigned v = SSA_NAME_VERSION (name);
+  bitmap_set_bit (m_has_cache_entry, v);
+  m_cache->set_global_range (name, r);
+}
+
+bool
+path_solver::range_of_expr (irange &r, tree name, gimple *stmt)
+{
+  if (!irange::supports_type_p (TREE_TYPE (name)))
+    return false;
+
+  if (get_cache (r, name))
+    return true;
+
+  if (stmt && range_defined_in_block (r, name, gimple_bb (stmt)))
+    {
+      set_cache (r, name);
+      return true;
+    }
+
+  // Otherwise return varying.
+  r.set_varying (TREE_TYPE (name));
+  // ?? Is this set_cache necessary?
+  set_cache (r, name);
+  return true;
+}
+
+// Initialize the current path to PATH.  The current block is set to
+// the entry block to the path.
+//
+// Note that the blocks are in reverse order, so the exit block is
+// path[0].
+
+void
+path_solver::set_path (const vec<basic_block> *path)
+{
+  gcc_checking_assert (path->length () > 1);
+  m_path = path;
+  m_pos = m_path->length () - 1;
+  bitmap_clear (m_has_cache_entry);
+}
+
+// Return the range of the result of PHI in R.
+
+void
+path_solver::ssa_range_in_phi (irange &r, gphi *phi)
+{
+  tree name = gimple_phi_result (phi);
+  basic_block bb = gimple_bb (phi);
+
+  // We experimented with querying ranger's range_on_entry here, but
+  // the performance penalty was too high, for hardly any improvements.
+  if (at_entry ())
+    {
+      r.set_varying (TREE_TYPE (name));
+      return;
+    }
+
+  basic_block prev = prev_bb ();
+  edge e_in = find_edge (prev, bb);
+  for (size_t i = 0; i < gimple_phi_num_args (phi); ++i)
+    if (e_in == gimple_phi_arg_edge (phi, i))
+      {
+	tree arg = gimple_phi_arg_def (phi, i);
+
+	if (!get_cache (r, arg))
+	  r.set_varying (TREE_TYPE (name));
+
+	// ?? Is this set_cache necessary?
+	set_cache (r, name);
+	return;
+      }
+  gcc_unreachable ();
+}
+
+// If NAME is defined in BB, set R to the range of NAME, and return
+// TRUE.  Otherwise, return FALSE.
+
+bool
+path_solver::range_defined_in_block (irange &r, tree name, basic_block bb)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+  basic_block def_bb = gimple_bb (def_stmt);
+
+  if (def_bb != bb)
+    return false;
+
+  if (gimple_code (def_stmt) == GIMPLE_PHI)
+    ssa_range_in_phi (r, as_a<gphi *> (def_stmt));
+  else if (!fold_range (r, def_stmt, this))
+    r.set_varying (TREE_TYPE (name));
+
+  if (DEBUG_SOLVER)
+    {
+      fprintf (stderr, "range_defined_in_block (BB%d) for ", bb->index);
+      print_generic_expr (stderr, name, TDF_SLIM);
+      fprintf (stderr, " is ");
+      r.dump (stderr);
+      fprintf (stderr, "\n");
+    }
+  return true;
+}
+
+// Precompute ranges defined in the current block, or ranges
+// that are exported on an edge to the next block.  The cache is
+// updated accordingly.
+
+void
+path_solver::precompute_ranges_in_block ()
+{
+  basic_block bb = curr_bb ();
+  bitmap_iterator bi;
+  int_range_max r, cached_range;
+  unsigned i;
+
+  // Force recalculation of any names in the cache that are defined in
+  // this block.  This can happen on interdependent SSA/phis in loops.
+  EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+      gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+      basic_block def_bb = gimple_bb (def_stmt);
+
+      if (def_bb == bb)
+	clear_cache (name);
+    }
+
+  // Solve imports defined in this block.
+  EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+
+      if (range_defined_in_block (r, name, bb))
+	set_cache (r, name);
+    }
+
+  if (at_exit ())
+    return;
+
+  // Solve imports that are exported to the next block.
+  edge e = find_edge (bb, next_bb ());
+  EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+      gori_compute &g = m_ranger.gori ();
+      bitmap exports = g.exports (bb);
+
+      if (bitmap_bit_p (exports, i))
+	{
+	  if (g.outgoing_edge_range_p (r, e, name, *this))
+	    {
+	      if (get_cache (cached_range, name))
+		r.intersect (cached_range);
+
+	      set_cache (r, name);
+	      if (DEBUG_SOLVER)
+		{
+		  fprintf (stderr, "outgoing_edge_range_p for ");
+		  print_generic_expr (stderr, name, TDF_SLIM);
+		  fprintf (stderr, " on edge %d->%d ",
+			   e->src->index, e->dest->index);
+		  fprintf (stderr, "is ");
+		  r.dump (stderr);
+		  fprintf (stderr, "\n");
+		}
+	    }
+	}
+    }
+}
+
+// Precompute the ranges for IMPORTS along PATH.
+//
+// IMPORTS are the set of SSA names, any of which could potentially
+// change the value of the final conditional in PATH.
+
+void
+path_solver::precompute_ranges (const vec<basic_block> *path,
+				const bitmap_head *imports)
+{
+  set_path (path);
+  m_imports = imports;
+
+  if (DEBUG_SOLVER)
+    {
+      extern void debug (vec<basic_block> *);
+      extern void debug (const bitmap_head *);
+      fprintf (stderr, "\nPATH is:\n");
+      debug (const_cast <vec<basic_block> *> (path));
+      fprintf (stderr, "imports: ");
+      debug (imports);
+    }
+
+  while (1)
+    {
+      precompute_ranges_in_block ();
+      if (at_exit ())
+	break;
+      move_next ();
+    }
+}
+
+// Return the range of NAME as it would be seen at the end of the path
+// being analyzed.
+
+void
+path_solver::range_in_path (irange &r, tree name)
+{
+  basic_block bb = exit_bb ();
+
+  if (get_cache (r, name))
+    return;
+
+  if (range_defined_in_block (r, name, bb))
+    return;
+
+  // The path may not be deep enough to resolve NAME.
+  r.set_varying (TREE_TYPE (name));
+}
+
+// Return the range of STMT as it would be seen at the end of the path
+// being analyzed.  Anything but the final conditional in a BB will
+// return VARYING.
+
+void
+path_solver::range_in_path (irange &r, gimple *stmt)
+{
+  if (gimple_code (stmt) == GIMPLE_COND && fold_range (r, stmt, this))
+    return;
+
+  r.set_varying (gimple_expr_type (stmt));
+}
diff --git a/gcc/tree-ssa-path-solver.h b/gcc/tree-ssa-path-solver.h
new file mode 100644
index 00000000000..c07265d4510
--- /dev/null
+++ b/gcc/tree-ssa-path-solver.h
@@ -0,0 +1,85 @@
+/* Header file for jump threading path solver.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   Contributed by Aldy Hernandez <aldyh@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_TREE_SSA_THREADSOLVER_H
+#define GCC_TREE_SSA_THREADSOLVER_H
+
+// This class is a basic block path solver.  Given a set of BBs
+// indicating a path through the CFG, range_in_path() will return the
+// range of an SSA as if the BBs in the path would have been executed
+// in order.
+//
+// Only SSA names passed in IMPORTS are precomputed, and can be
+// queried.
+//
+// Note that the blocks are in reverse order, thus the exit block is
+// path[0].
+
+class path_solver : private range_query
+{
+public:
+  path_solver (gimple_ranger &ranger);
+  virtual ~path_solver ();
+  void precompute_ranges (const vec<basic_block> *path,
+			  const bitmap_head *imports);
+  void range_in_path (irange &, tree name);
+  void range_in_path (irange &, gimple *);
+
+private:
+  bool range_of_expr (irange &r, tree name, gimple * = NULL) override;
+
+  // Cache manipulation.
+  void set_cache (const irange &r, tree name);
+  bool get_cache (irange &r, tree name);
+  void clear_cache (tree name);
+
+  // Methods to precompute ranges for the given path.
+  bool range_defined_in_block (irange &, tree name, basic_block bb);
+  void precompute_ranges_in_block ();
+  void ssa_range_in_phi (irange &r, gphi *phi);
+
+  // Path navigation.
+  void set_path (const vec<basic_block> *);
+  basic_block entry_bb () { return (*m_path)[m_path->length () - 1]; }
+  basic_block exit_bb ()  { return (*m_path)[0]; }
+  basic_block curr_bb ()  { return (*m_path)[m_pos]; }
+  basic_block prev_bb ()  { return (*m_path)[m_pos + 1]; }
+  basic_block next_bb ()  { return (*m_path)[m_pos - 1]; }
+  bool at_entry ()	  { return m_pos == m_path->length () - 1; }
+  bool at_exit ()	  { return m_pos == 0; }
+  void move_next ()	  { --m_pos; }
+
+  // Range cache for SSA names.
+  ssa_global_cache *m_cache;
+
+  // Set for each SSA that has an active entry in the cache.
+  bitmap m_has_cache_entry;
+
+  // Path being analyzed.
+  const vec<basic_block> *m_path;
+
+  // Current path position.
+  unsigned m_pos;
+
+  const bitmap_head *m_imports;
+  gimple_ranger &m_ranger;
+};
+
+#endif // GCC_TREE_SSA_THREADSOLVER_H
-- 
2.31.1



* [PATCH 2/2] Backwards jump threader rewrite with ranger.
  2021-06-28 16:21 [PATCH 0/2] Ranger-based backwards threader implementation Aldy Hernandez
  2021-06-28 16:21 ` [PATCH 1/2] Implement basic block path solver Aldy Hernandez
@ 2021-06-28 16:21 ` Aldy Hernandez
  2021-07-05 15:39   ` Aldy Hernandez
  2021-06-28 23:19 ` [PATCH 0/2] Ranger-based backwards threader implementation Martin Sebor
  2 siblings, 1 reply; 20+ messages in thread
From: Aldy Hernandez @ 2021-06-28 16:21 UTC
  To: GCC patches
  Cc: Andrew MacLeod, Jeff Law, Richard Biener, Martin Sebor, Aldy Hernandez

This is a rewrite of the backwards threader with a ranger-based solver.

The code is divided into two parts: the path solver in
tree-ssa-path-solver.*, and the path discovery in
tree-ssa-threadbackward.c.

The legacy code is still available with --param=threader-mode=legacy,
but will be removed shortly after this goes in.

gcc/ChangeLog:

	* Makefile.in (tree-ssa-loop-im.o-warn): New.
	* flag-types.h (enum threader_mode): New.
	* params.opt: Add entry for --param=threader-mode.
	* tree-ssa-threadbackward.c (THREADER_ITERATIVE_MODE): New.
	(class back_threader): New.
	(back_threader::back_threader): New.
	(back_threader::~back_threader): New.
	(back_threader::maybe_register_path): New.
	(back_threader::find_taken_edge): New.
	(back_threader::find_taken_edge_switch): New.
	(back_threader::find_taken_edge_cond): New.
	(back_threader::resolve_def): New.
	(back_threader::resolve_phi): New.
	(back_threader::find_paths_to_names): New.
	(back_threader::find_paths): New.
	(dump_path): New.
	(debug): New.
	(thread_jumps::find_jump_threads_backwards): Call ranger threader.
	(thread_jumps::find_jump_threads_backwards_with_ranger): New.
	(pass_thread_jumps::execute): Abstract out code...
	(try_thread_blocks): ...here.
	* tree-ssa-threadedge.c (jump_threader::thread_outgoing_edges):
	Abstract out threading candidate code to...
	(single_succ_to_potentially_threadable_block): ...here.
	* tree-ssa-threadedge.h (single_succ_to_potentially_threadable_block):
	New.
	* tree-ssa-threadupdate.c (register_jump_thread): Return boolean.
	* tree-ssa-threadupdate.h (class jump_thread_path_registry):
	Return bool from register_jump_thread.

libgomp/ChangeLog:

	* testsuite/libgomp.graphite/force-parallel-4.c: Adjust for
	threader.
	* testsuite/libgomp.graphite/force-parallel-8.c: Same.

gcc/testsuite/ChangeLog:

	* g++.dg/debug/dwarf2/deallocator.C: Adjust for threader.
	* gcc.c-torture/compile/pr83510.c: Same.
	* gcc.dg/Wrestrict-22.c: Same.
	* gcc.dg/loop-unswitch-2.c: Same.
	* gcc.dg/old-style-asm-1.c: Same.
	* gcc.dg/pr68317.c: Same.
	* gcc.dg/pr97567-2.c: Same.
	* gcc.dg/predict-9.c: Same.
	* gcc.dg/shrink-wrap-loop.c: Same.
	* gcc.dg/sibcall-1.c: Same.
	* gcc.dg/tree-ssa/builtin-sprintf-3.c: Same.
	* gcc.dg/tree-ssa/pr21001.c: Same.
	* gcc.dg/tree-ssa/pr21294.c: Same.
	* gcc.dg/tree-ssa/pr21417.c: Same.
	* gcc.dg/tree-ssa/pr21458-2.c: Same.
	* gcc.dg/tree-ssa/pr21563.c: Same.
	* gcc.dg/tree-ssa/pr49039.c: Same.
	* gcc.dg/tree-ssa/pr61839_1.c: Same.
	* gcc.dg/tree-ssa/pr61839_3.c: Same.
	* gcc.dg/tree-ssa/pr77445-2.c: Same.
	* gcc.dg/tree-ssa/split-path-4.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-11.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-12.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-14.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same.
	* gcc.dg/tree-ssa/ssa-fre-48.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-11.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-12.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-14.c: Same.
	* gcc.dg/tree-ssa/vrp02.c: Same.
	* gcc.dg/tree-ssa/vrp03.c: Same.
	* gcc.dg/tree-ssa/vrp05.c: Same.
	* gcc.dg/tree-ssa/vrp06.c: Same.
	* gcc.dg/tree-ssa/vrp07.c: Same.
	* gcc.dg/tree-ssa/vrp09.c: Same.
	* gcc.dg/tree-ssa/vrp19.c: Same.
	* gcc.dg/tree-ssa/vrp20.c: Same.
	* gcc.dg/tree-ssa/vrp33.c: Same.
	* gcc.dg/vect/bb-slp-16.c: Same.
	* gcc.target/i386/avx2-vect-aggressive.c: Same.
	* gcc.dg/tree-ssa/ranger-threader-1.c: New test.
	* gcc.dg/tree-ssa/ranger-threader-2.c: New test.
	* gcc.dg/tree-ssa/ranger-threader-3.c: New test.
	* gcc.dg/tree-ssa/ranger-threader-4.c: New test.
---
 gcc/Makefile.in                               |   5 +
 gcc/flag-types.h                              |   7 +
 gcc/params.opt                                |  17 +
 .../g++.dg/debug/dwarf2/deallocator.C         |   3 +-
 gcc/testsuite/gcc.c-torture/compile/pr83510.c |  33 ++
 gcc/testsuite/gcc.dg/Wrestrict-22.c           |   3 +
 gcc/testsuite/gcc.dg/loop-unswitch-2.c        |   2 +-
 gcc/testsuite/gcc.dg/old-style-asm-1.c        |   5 +-
 gcc/testsuite/gcc.dg/pr68317.c                |   4 +-
 gcc/testsuite/gcc.dg/pr97567-2.c              |   2 +-
 gcc/testsuite/gcc.dg/predict-9.c              |   4 +-
 gcc/testsuite/gcc.dg/shrink-wrap-loop.c       |  53 ++
 gcc/testsuite/gcc.dg/sibcall-1.c              |  10 +
 .../gcc.dg/tree-ssa/builtin-sprintf-3.c       |   5 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21001.c       |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/pr21294.c       |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/pr21417.c       |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c     |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21563.c       |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr49039.c       |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c     |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c     |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c     |   2 +-
 .../gcc.dg/tree-ssa/ranger-threader-1.c       |  20 +
 .../gcc.dg/tree-ssa/ranger-threader-2.c       |  39 ++
 .../gcc.dg/tree-ssa/ranger-threader-3.c       |  41 ++
 .../gcc.dg/tree-ssa/ranger-threader-4.c       |  83 +++
 gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c  |   4 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-11.c       |   2 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-12.c       |   2 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-14.c       |   1 +
 .../gcc.dg/tree-ssa/ssa-dom-thread-18.c       |   5 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-6.c        |   4 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-7.c        |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c    |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/vrp02.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp03.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp05.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp06.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp07.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp09.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp19.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp20.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp33.c         |   2 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-16.c         |   7 +
 .../gcc.target/i386/avx2-vect-aggressive.c    |   2 +-
 gcc/tree-ssa-threadbackward.c                 | 475 +++++++++++++++++-
 gcc/tree-ssa-threadedge.c                     |  20 +-
 gcc/tree-ssa-threadedge.h                     |   3 +-
 gcc/tree-ssa-threadupdate.c                   |  12 +-
 gcc/tree-ssa-threadupdate.h                   |   2 +-
 .../libgomp.graphite/force-parallel-4.c       |   1 +
 .../libgomp.graphite/force-parallel-8.c       |   2 +
 56 files changed, 865 insertions(+), 54 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 66cc5f9529e..b166a6539f2 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -223,6 +223,11 @@ gimple-match.o-warn = -Wno-unused
 generic-match.o-warn = -Wno-unused
 dfp.o-warn = -Wno-strict-aliasing
 
+# maybe_emit_free_warning() is picking up the inlined location for the
+# warning, not the source of the original va_heap::release() function
+# which has a pragma disabling this warning.
+tree-ssa-loop-im.o-warn = -Wno-free-nonheap-object
+
 # All warnings have to be shut off in stage1 if the compiler used then
 # isn't gcc; configure determines that.  WARN_CFLAGS will be either
 # $(GCC_WARN_CFLAGS), or nothing.  Similarly, WARN_CXXFLAGS will be
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 375448ebf5f..c7d98964fc7 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -435,6 +435,13 @@ enum evrp_mode
   EVRP_MODE_RVRP_DEBUG = EVRP_MODE_RVRP_ONLY | EVRP_MODE_DEBUG
 };
 
+/* Backwards threader mode.  */
+enum threader_mode
+{
+  THREADER_MODE_LEGACY = 0,
+  THREADER_MODE_RANGER = 1
+};
+
 /* Modes of OpenACC 'kernels' constructs handling.  */
 enum openacc_kernels
 {
diff --git a/gcc/params.opt b/gcc/params.opt
index 18e6036c4f4..d7cae1b390b 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -1006,6 +1006,23 @@ Maximum depth of DFS walk used by modref escape analysis.
 Common Joined UInteger Var(param_modref_max_escape_points) Init(256) Param Optimization
 Maximum number of escape points tracked by modref per SSA-name.
 
+-param=threader-iterative=
+Common Joined UInteger Var(param_threader_iterative) Init(0) Param Optimization
+Run backwards threader in iterative mode.
+
+-param=threader-mode=
+Common Joined Var(param_threader_mode) Enum(threader_mode) Init(THREADER_MODE_RANGER) Param Optimization
+--param=threader-mode=[legacy|ranger] Specifies the mode the backwards threader should run in.
+
+Enum
+Name(threader_mode) Type(enum threader_mode) UnknownError(unknown threader mode %qs)
+
+EnumValue
+Enum(threader_mode) String(legacy) Value(THREADER_MODE_LEGACY)
+
+EnumValue
+Enum(threader_mode) String(ranger) Value(THREADER_MODE_RANGER)
+
 -param=tm-max-aggregate-size=
 Common Joined UInteger Var(param_tm_max_aggregate_size) Init(9) Param Optimization
 Size in bytes after which thread-local aggregates should be instrumented with the logging functions instead of save/restore pairs.
diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/deallocator.C b/gcc/testsuite/g++.dg/debug/dwarf2/deallocator.C
index d895e78e608..c1d387922ba 100644
--- a/gcc/testsuite/g++.dg/debug/dwarf2/deallocator.C
+++ b/gcc/testsuite/g++.dg/debug/dwarf2/deallocator.C
@@ -29,7 +29,7 @@ void foo(int i)
 	  return;
 	}
     }
-  if (i)
+  if (i) // Threader makes everything after here disappear.
     {
       t test;
       if (i == 10)
@@ -42,5 +42,4 @@ void foo(int i)
 }
 // { dg-final { scan-assembler "deallocator.C:29" } }
 // { dg-final { scan-assembler "deallocator.C:24" } }
-// { dg-final { scan-assembler "deallocator.C:34" } }
 // { dg-final { scan-assembler "deallocator.C:21" } }
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr83510.c b/gcc/testsuite/gcc.c-torture/compile/pr83510.c
index 907dd80ccd7..fc932e57f3a 100644
--- a/gcc/testsuite/gcc.c-torture/compile/pr83510.c
+++ b/gcc/testsuite/gcc.c-torture/compile/pr83510.c
@@ -3,6 +3,39 @@
    (PR tree-optimization/83510).  */
 
 /* { dg-options "-Warray-bounds" } */
+/* { dg-xfail-if "" { "*-*-*" } { "-Os" } } */
+
+
+/*  This test is XFAILed because thread1 threads a switch statement
+    such that the various cases have been split into different
+    independent blocks.  One of these blocks exposes an arr[i_27]
+    which is later propagated by VRP to be arr[10].  This is an
+    invalid access, but the array bounds code doesn't know it is an
+    unreachable path.
+
+    However, it is not until dom2 that we "know" that the value of the
+    switch index is such that the path to arr[10] is unreachable.  For
+    that matter, it is not until dom3 that we remove the unreachable
+    path.
+
+
+    See:
+    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83510
+    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83312
+
+    It's not until here that ranger "knows" that the path is
+    unreachable:
+
+    thread1
+    vrp1		<-- array bounds checking
+    dce2
+    stdarg
+    cdce
+    cselim
+    copyprop
+    ifcombine
+    mergephi3		<-- too late
+*/
 
 extern int get_flag (void);
 
diff --git a/gcc/testsuite/gcc.dg/Wrestrict-22.c b/gcc/testsuite/gcc.dg/Wrestrict-22.c
index 46f507b56b6..1997bccb4bb 100644
--- a/gcc/testsuite/gcc.dg/Wrestrict-22.c
+++ b/gcc/testsuite/gcc.dg/Wrestrict-22.c
@@ -1,6 +1,9 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -Wrestrict" } */
 
+/* This looks like the threader caused the entire loop to collapse, and the
+   warning pass can't determine the arguments to memcpy.  */
+
 void test_memcpy_warn (char *d, unsigned n)
 {
   for (unsigned i = n; i < 30; ++i)
diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-2.c b/gcc/testsuite/gcc.dg/loop-unswitch-2.c
index f8d314e34de..0931f6e71c3 100644
--- a/gcc/testsuite/gcc.dg/loop-unswitch-2.c
+++ b/gcc/testsuite/gcc.dg/loop-unswitch-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details" } */
+/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details -fdisable-tree-thread2 -fdisable-tree-thread3" } */
 
 void foo (float **a, float **b, float *c, int n, int m, int l)
 {
diff --git a/gcc/testsuite/gcc.dg/old-style-asm-1.c b/gcc/testsuite/gcc.dg/old-style-asm-1.c
index 8af007795a7..f9406ff0a26 100644
--- a/gcc/testsuite/gcc.dg/old-style-asm-1.c
+++ b/gcc/testsuite/gcc.dg/old-style-asm-1.c
@@ -1,6 +1,9 @@
 /* PR inline-asm/8832 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -dP" } */
+/* { dg-options "-O2 -dP -fdisable-tree-ethread -fdisable-tree-thread1 -fdisable-tree-thread2 -fdisable-tree-thread3 -fdisable-tree-thread4" } */
+
+/* Note: Threader will duplicate BBs and replace one conditional branch by an
+   unconditional one.  */
 
 /* Verify that GCC doesn't optimize
    old style asm instructions.  */
diff --git a/gcc/testsuite/gcc.dg/pr68317.c b/gcc/testsuite/gcc.dg/pr68317.c
index 891d12954ba..bd053a7522b 100644
--- a/gcc/testsuite/gcc.dg/pr68317.c
+++ b/gcc/testsuite/gcc.dg/pr68317.c
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -fdisable-tree-ethread" } */
+
+/* Note: Threader will collapse loop.  */
 
 typedef int int32_t __attribute__((mode (__SI__)));
 
diff --git a/gcc/testsuite/gcc.dg/pr97567-2.c b/gcc/testsuite/gcc.dg/pr97567-2.c
index dee31c6dc01..c3ead54eaa8 100644
--- a/gcc/testsuite/gcc.dg/pr97567-2.c
+++ b/gcc/testsuite/gcc.dg/pr97567-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile} */
-/* { dg-options "-O2 -fdump-tree-evrp" } */
+/* { dg-options "-O2 -fdump-tree-evrp -fdisable-tree-ethread" } */
 
 char a[2];
 
diff --git a/gcc/testsuite/gcc.dg/predict-9.c b/gcc/testsuite/gcc.dg/predict-9.c
index f491c511bd9..cb68a218a93 100644
--- a/gcc/testsuite/gcc.dg/predict-9.c
+++ b/gcc/testsuite/gcc.dg/predict-9.c
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate -fno-finite-loops" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate -fno-finite-loops -fdisable-tree-ethread" } */
+
+/* Note: Threader causes removal of for loop.  */
 
 extern int global;
 extern int global2;
diff --git a/gcc/testsuite/gcc.dg/shrink-wrap-loop.c b/gcc/testsuite/gcc.dg/shrink-wrap-loop.c
index 52dfc2790ed..ba872fa23f6 100644
--- a/gcc/testsuite/gcc.dg/shrink-wrap-loop.c
+++ b/gcc/testsuite/gcc.dg/shrink-wrap-loop.c
@@ -1,5 +1,58 @@
 /* { dg-do compile { target { { { i?86-*-* x86_64-*-* } && lp64 } || { arm_thumb2 } } } } */
 /* { dg-options "-O2 -fdump-rtl-pro_and_epilogue"  } */
+// { dg-additional-options "-fdisable-tree-ethread" }
+
+/*
+Our new threader is threading things a bit too early, and causing the
+testcase in gcc.dg/shrink-wrap-loop.c to fail.
+
+  The gist is this BB inside a loop:
+
+  <bb 6> :
+  # p_2 = PHI <p2_6(D)(2), p_12(5)>
+  if (p_2 != 0B)
+    goto <bb 3>; [INV]
+  else
+    goto <bb 7>; [INV]
+
+Our threader can move this check outside of the loop (good).  This is
+done before branch probabilities are calculated and causes the probs
+to be calculated as:
+
+<bb 2> [local count: 216361238]:
+  if (p2_6(D) != 0B)
+    goto <bb 7>; [54.59%]
+  else
+    goto <bb 6>; [45.41%]
+
+Logically this seems correct to me.  A simple check outside of a loop
+should slightly but not overwhelmingly favor a non-zero value.
+
+Interestingly however, the old threader couldn't get this, but the IL
+ended up identical, albeit with different probabilities.  What happens
+is that, because the old code could not thread this, the p2 != 0 check
+would remain inside the loop and probs would be calculated thusly:
+
+  <bb 6> [local count: 1073741824]:
+  # p_2 = PHI <p2_6(D)(2), p_12(5)>
+  if (p_2 != 0B)
+    goto <bb 3>; [94.50%]
+  else
+    goto <bb 7>; [5.50%]
+
+Then when the loop header copying pass ("ch") shuffled things around,
+the IL would end up identical to my early threader code, but the
+probabilities would remain as 94.5/5.5.
+
+The above discrepancy causes the RTL ifcvt pass to generate different
+code, and by the time we get to the shrink wrapping pass, things look
+sufficiently different such that the legacy code can actually shrink
+wrap, whereas our new code does not.
+
+IMO, if the loop-ch pass moves conditionals outside of a loop, the
+probabilities should be adjusted, but that does mean the shrink wrap
+won't happen for this contrived testcase.
+ */
 
 int foo (int *p1, int *p2);
 
diff --git a/gcc/testsuite/gcc.dg/sibcall-1.c b/gcc/testsuite/gcc.dg/sibcall-1.c
index e8a95513d9e..367ee4374e1 100644
--- a/gcc/testsuite/gcc.dg/sibcall-1.c
+++ b/gcc/testsuite/gcc.dg/sibcall-1.c
@@ -7,6 +7,9 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -foptimize-sibling-calls" } */
 
+/* See note in recurser_void() as to why we disable threading.  */
+/* { dg-additional-options "-fdisable-tree-thread1" } */
+
 /* The option -foptimize-sibling-calls is the default, but serves as
    marker.  Self-recursion tail calls are optimized for all targets,
    regardless of presence of sibcall patterns.  */
@@ -26,6 +29,13 @@ int main ()
 void
 recurser_void (int n)
 {
+  /* On some architectures like ppc64*, jump threading may thread
+     paths such that there are two calls into track(), one for
+     track(0) and one for track(7).  The track(7) call can be
+     transformed into a jump instead of a call, which means that
+     different calls into track() may end up with a different
+     &stackpos.  This is the reason we disable jump threading for this
+     test.  */
   if (n == 0 || n == 7)
     track (n);
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c
index fae2a1b73ea..351e89ad3aa 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c
@@ -3,7 +3,10 @@
    that the sprintf return value (or value range) optimization is not
    performed for an unknown string.  */
 /* { dg-do compile } */
-/* { dg-options "-O2 -Wall -Werror -fdump-tree-optimized -fprintf-return-value" } */
+/* { dg-options "-O2 -Wall -Werror -fdump-tree-optimized -fprintf-return-value -fdisable-tree-ethread -fdisable-tree-thread1 -fdisable-tree-thread2" } */
+
+/* Note: Threader will duplicate BBs such that there are multiple
+   string_*_fail calls on certain paths.  */
 
 #define INT_MAX   __INT_MAX__
 #define INT_MIN   (-INT_MAX - 1)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21001.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21001.c
index 719360a015f..4ea5f21addf 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21001.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21001.c
@@ -6,6 +6,7 @@
 
 /* { dg-do compile } */
 /* { dg-options "-O2 -fno-tree-dominator-opts -fno-tree-fre -fdisable-tree-evrp -fdump-tree-vrp1-details" } */
+/* { dg-additional-options "-fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 int
 foo (int a)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21294.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21294.c
index cc7d4cdf338..b9edabc6234 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21294.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21294.c
@@ -5,6 +5,7 @@
 
 /* { dg-do compile } */
 /* { dg-options "-O2 -fno-tree-dominator-opts -fdisable-tree-evrp -fdump-tree-vrp1-details" } */
+/* { dg-additional-options "-fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 struct f {
   int i;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c
index 484511978a1..fc14af4e662 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread4-details" } */
+/* { dg-options "-O2 -fdisable-tree-thread3 -fdump-tree-thread4-details" } */
 
 struct tree_common 
 { 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c
index 2aee42f4c05..f8d7353fc0e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-evrp-details" } */
+/* { dg-options "-O2 -fdump-tree-evrp-details -fdisable-tree-ethread" } */
 
 extern void g (void);
 extern void bar (int);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21563.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21563.c
index 9c67a3acb46..72dce83ce37 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21563.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21563.c
@@ -2,7 +2,7 @@
    Make sure VRP folds the second "if" statement.  */
 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-dominator-opts -fdisable-tree-evrp -fdump-tree-vrp1-details" } */
+/* { dg-options "-O2 -fno-tree-dominator-opts -fdisable-tree-evrp -fdump-tree-vrp1-details -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 int
 foo (int a)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr49039.c b/gcc/testsuite/gcc.dg/tree-ssa/pr49039.c
index 4bc0a8175a0..a2044d012cc 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr49039.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr49039.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/49039 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 extern void bar (void);
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c
index d44c7dc1882..ddc53fbfbcc 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/61839.  */
 /* { dg-do run } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fdisable-tree-evrp -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fdisable-tree-evrp -fdump-tree-optimized -fdisable-tree-ethread -fdisable-tree-thread1" } */
 /* { dg-require-effective-target int32plus } */
 
 __attribute__ ((noinline))
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c
index 5ceb0738bde..cc322d6e703 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/61839.  */
 /* { dg-do run } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fdump-tree-optimized -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 __attribute__ ((noinline))
 int foo (int a, unsigned b)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c
index cf74e156109..f9fc212f49e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c
@@ -124,7 +124,7 @@ enum STATES FMS( u8 **in , u32 *transitions) {
    to change decisions in switch expansion which in turn can expose new
    jump threading opportunities.  Skip the later tests on aarch64.  */
 /* { dg-final { scan-tree-dump "Jumps threaded: 1\[1-9\]" "thread1" } } */
-/* { dg-final { scan-tree-dump-times "Invalid sum" 3 "thread1" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum" 4 "thread1" } } */
 /* { dg-final { scan-tree-dump-not "optimizing for size" "thread1" } } */
 /* { dg-final { scan-tree-dump-not "optimizing for size" "thread2" } } */
 /* { dg-final { scan-tree-dump-not "optimizing for size" "thread3" { target { ! aarch64*-*-* } } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c
new file mode 100644
index 00000000000..c3ccb5d5b7e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-thread1-details --param logical-op-non-short-circuit=1" } */
+
+// Copied from ssa-dom-thread-11.c
+
+static int *bb_ticks;
+extern void frob (void);
+void
+mark_target_live_regs (int b, int block, int bb_tick)
+{
+  if (b == block && b != -1 && bb_tick == bb_ticks[b])
+      return;
+  if (b != -1)
+    frob ();
+}
+
+/* When the first two conditionals in the first IF are true, but
+   the third conditional is false, then there's a jump threading
+   opportunity to bypass the second IF statement.  */
+/* { dg-final { scan-tree-dump-times "Registering.*jump thread" 1 "thread1"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c
new file mode 100644
index 00000000000..d2689b6c788
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */ 
+/* { dg-options "-O2 -fdump-tree-thread2-details -w" } */
+
+// Copied from ssa-dom-thread-12.c.
+
+typedef long unsigned int size_t;
+union tree_node;
+typedef union tree_node *tree;
+typedef union gimple_statement_d *gimple;
+typedef const union gimple_statement_d *const_gimple;
+union gimple_statement_d
+{
+  unsigned num_ops;
+  tree exp;
+};
+
+unsigned int x;
+static inline tree
+gimple_op (const_gimple gs, unsigned i)
+{
+  if (!(i < gs->num_ops))
+    abort ();
+  return gs->exp;
+}
+
+unsigned char
+scan_function (gimple stmt)
+{
+  unsigned i;
+  for (i = 0; i < stmt->num_ops - 3 ; i++)
+    gimple_call_arg (stmt, i);
+  gimple_op (stmt, 1);
+}
+
+/* The test which bypasses the loop is simplified prior to DOM to check
+   that stmt->num_ops - 3 != 0.  When that test is false, we can derive
+   a value for stmt->num_ops.  That in turn allows us to thread the jump
+   for the conditional at the start of the call to gimple_op.  */
+/* { dg-final { scan-tree-dump-times "Registering.*jump thread" 1 "thread2"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c
new file mode 100644
index 00000000000..79ec067e68a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ethread-details -w --param logical-op-non-short-circuit=1" } */
+
+// Copied from ssa-dom-thread-14.c
+
+enum optab_methods
+{
+  OPTAB_DIRECT,
+  OPTAB_LIB,
+  OPTAB_WIDEN,
+  OPTAB_LIB_WIDEN,
+  OPTAB_MUST_WIDEN
+};
+struct optab_d { };
+typedef struct optab_d *optab;
+void
+expand_shift_1 (int code, int unsignedp, int rotate,
+		optab lshift_optab, optab rshift_arith_optab)
+{
+  int left = (code == 42 || code == 0xde);
+  int attempt;
+  enum optab_methods methods;
+  if (attempt == 0)
+    methods = OPTAB_DIRECT;
+  else if (attempt == 1)
+    methods = OPTAB_WIDEN;
+  if ((!unsignedp || (!left && methods == OPTAB_WIDEN)))
+    {
+      enum optab_methods methods1 = methods;
+      if (unsignedp)
+	methods1 = OPTAB_MUST_WIDEN;
+      expand_binop (left ? lshift_optab : rshift_arith_optab,
+			   unsignedp, methods1);
+    }
+}
+
+/* When UNSIGNEDP is true, LEFT is false and METHOD == OPTAB_WIDEN
+   we will enter the TRUE arm of the conditional and we can thread
+   the test to compute the first argument of the expand_binop
+   call if we look backwards through the boolean logicals.  */
+/* { dg-final { scan-tree-dump-times "Registering.*jump thread" 1 "ethread"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c
new file mode 100644
index 00000000000..e8d1cfc03ac
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c
@@ -0,0 +1,83 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -fdump-tree-vrp-details -fdump-tree-thread1-details --param logical-op-non-short-circuit=1" }  */
+/* { dg-final { scan-tree-dump-times "Registering FSM jump" 8 "thread1" } }  */
+
+/* Copied from ssa-thread-14.  */
+
+void foo (void);
+void bar (void);
+void blah (void);
+
+/* One jump threaded here.  */
+
+void
+baz_1 (int a, int b, int c)
+{
+  if (a && b)
+    foo ();
+  if (!b && c)
+    bar ();
+}
+
+/* One jump threaded here.  */
+
+void
+baz_2 (int a, int b, int c)
+{
+  if (a && b)
+    foo ();
+  if (b || c)
+    bar ();
+}
+
+/* One jump threaded here.  */
+
+void
+baz_3 (int a, int b, int c)
+{
+  if (a && b > 10)
+    foo ();
+  if (b < 5 && c)
+    bar ();
+}
+
+/* Two jumps threaded here.  */
+
+void
+baz_4 (int a, int b, int c)
+{
+  if (a && b)
+    {
+      foo ();
+      if (c)
+        bar ();
+    }
+  if (b && c)
+    blah ();
+}
+
+/* Two jumps threaded here.  */
+
+void
+baz_5 (int a, int b, int c)
+{
+  if (a && b)
+    {
+      foo ();
+      if (c)
+        bar ();
+    }
+  if (!b || !c)
+    blah ();
+}
+
+/* One jump threaded here.  */
+
+void
+baz_6 (int a, int b, int c)
+{
+  if (a == 39 && b == 41)
+    foo ();
+  if (c == 12 || b == 41)
+    bar ();
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c
index dac931c18d0..8ef7646282c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details -w" } */
+/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details -w -fdisable-tree-thread1 -fdisable-tree-thread2" } */
+
+/* Note: Threader causes the infinite loop in val & 1 sooner.  */
 
 powi_cost (long n)
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-11.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-11.c
index 5f90613263d..856ab389439 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-11.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-11.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-dom2-details --param logical-op-non-short-circuit=1" } */
+/* { dg-options "-O2 -fdump-tree-dom2-details --param logical-op-non-short-circuit=1 -fdisable-tree-thread1 -fdisable-tree-thread2" } */
 
 static int *bb_ticks;
 extern void frob (void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-12.c
index 63bd12a06a4..bad5e0a782d 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-12.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-12.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */ 
-/* { dg-options "-O2 -fdump-tree-dom2-details -w" } */
+/* { dg-options "-O2 -fdump-tree-dom2-details -w -fdisable-tree-thread2" } */
 typedef long unsigned int size_t;
 union tree_node;
 typedef union tree_node *tree;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-14.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-14.c
index 4e6a911506e..3bc4b3795cb 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-14.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-14.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-dom2-details -w --param logical-op-non-short-circuit=1" } */
+/* { dg-additional-options "-fdisable-tree-thread1 -fdisable-tree-ethread -fdisable-tree-thread2" } */
 
 enum optab_methods
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-18.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-18.c
index d4759b8903b..03872e7a02f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-18.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-18.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */ 
-/* { dg-options "-O2 -fdump-tree-vrp1-details -fdump-tree-dom2-details -std=gnu89 --param logical-op-non-short-circuit=0" } */
+/* { dg-options "-O2 -fdump-tree-vrp1-details -fdump-tree-thread1-details -std=gnu89 --param logical-op-non-short-circuit=0" } */
 
 #include "ssa-dom-thread-4.c"
 
@@ -21,4 +21,5 @@
 	 condition.
 
    All the cases are picked up by VRP1 as jump threads.  */
-/* { dg-final { scan-tree-dump-times "Threaded" 4 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Registering FSM jump" 6 "thread1" } } */
+/* { dg-final { scan-tree-dump-times "Threaded" 2 "vrp1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
index 16a9ef4e28a..c7bf867b084 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
@@ -34,8 +34,8 @@
      SWITCH_BB -> BBx -> BBy -> BBz -> PHI
 
    We now know the value of the switch index at PHI.  */
-/* { dg-final { scan-tree-dump-times "FSM" 6 "thread1" } } */
-/* { dg-final { scan-tree-dump-times "FSM" 1 "thread2" } } */
+/* { dg-final { scan-tree-dump-times "Registering FSM jump" 6 "thread1" } } */
+/* { dg-final { scan-tree-dump-times "Registering FSM jump" 1 "thread2" } } */
 
 int sum0, sum1, sum2, sum3;
 int foo (char *s, char **ret)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
index bad5bc1d003..1c2d12aa9ea 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-thread2-stats -fdump-tree-dom2-stats -fdump-tree-thread3-stats -fdump-tree-dom3-stats -fdump-tree-vrp2-stats -fno-guess-branch-probability" } */
+/* { dg-additional-options "--param=threader-mode=legacy" } */
 
 /* Here we have the same issue as was commented in ssa-dom-thread-6.c.
    The PHI coming into the threader has a lot more constants, so the
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c
index b3d610204da..5e74c78e7b0 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-fre1-details" } */
+/* { dg-options "-O -fdump-tree-fre1-details -fdisable-tree-ethread" } */
 
 int foo (int i)
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c
index 67e1e89ecd3..672a54e07db 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-vrp2-details --param logical-op-non-short-circuit=1" } */
+/* { dg-additional-options "-fdisable-tree-ethread -fdisable-tree-thread1 -fdisable-tree-thread2" } */
 /* { dg-final { scan-tree-dump-not "IRREDUCIBLE_LOOP" "vrp2" } } */
 
 void abort (void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
index fb9840e95d5..8f554641b28 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
@@ -2,7 +2,7 @@
 /* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details -fno-finite-loops --param early-inlining-insns=14 -fno-inline-functions" } */
 /* { dg-final { scan-tree-dump "FSM" "thread2" } } */
 /* { dg-final { scan-tree-dump "FSM" "thread3" } } */
-/* { dg-final { scan-tree-dump "FSM" "thread4" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "FSM" "thread4" } } */
 
 typedef struct bitmap_head_def *bitmap;
 typedef const struct bitmap_head_def *const_bitmap;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c
index 38661c88822..f9152b9358f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-additional-options "-O2 -fdump-tree-vrp-details --param logical-op-non-short-circuit=1" }  */
+/* { dg-additional-options "-fdisable-tree-thread1" } */
 /* { dg-final { scan-tree-dump-times "Threaded jump" 8 "vrp1" } }  */
 
 void foo (void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp02.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp02.c
index 4be538f5944..2285c55c7d2 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp02.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp02.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fdelete-null-pointer-checks -fdisable-tree-evrp" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fdelete-null-pointer-checks -fdisable-tree-evrp -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 struct A
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c
index bafb65a53d6..1d7ea4e8ffb 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 struct A
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp05.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp05.c
index 8c611e91fcd..c17cd1b5738 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp05.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp05.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fno-early-inlining" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fno-early-inlining -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 
 inline int ten()
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp06.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp06.c
index a872bc43731..acb03c29aa4 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp06.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp06.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 int baz (void);
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp07.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp07.c
index 0f3f280e4e1..31a541522f5 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp07.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp07.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-fre -fdisable-tree-evrp -fdump-tree-vrp1-details -fdelete-null-pointer-checks" } */
+/* { dg-options "-O2 -fno-tree-fre -fdisable-tree-evrp -fdump-tree-vrp1-details -fdelete-null-pointer-checks -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 int
 foo (int i, int *p)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp09.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp09.c
index 56cc50c5f1d..fad0051cb09 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp09.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp09.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-fre -fdisable-tree-evrp -fdump-tree-vrp1 -std=gnu89" } */
+/* { dg-options "-O2 -fno-tree-fre -fdisable-tree-evrp -fdump-tree-vrp1 -std=gnu89 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 foo (int *p)
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c
index 40373fde163..98a8da6d05e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fwrapv -O1 -ftree-vrp -fdisable-tree-evrp -fdump-tree-vrp1" } */
+/* { dg-options "-fwrapv -O1 -ftree-vrp -fdisable-tree-evrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 #include <limits.h>
 extern void abort ();
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp20.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp20.c
index 4a3b0d73648..f9df67ffb05 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp20.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp20.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fwrapv -O1 -fno-tree-fre -fdisable-tree-evrp -ftree-vrp -fdump-tree-vrp1" } */
+/* { dg-options "-fwrapv -O1 -fno-tree-fre -fdisable-tree-evrp -ftree-vrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 extern void abort ();
 extern void exit (int);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp33.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp33.c
index f1d3863943e..88833eb701e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp33.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp33.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fno-tree-fre -fdisable-tree-evrp" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fno-tree-fre -fdisable-tree-evrp -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 /* This is from PR14052.  */
 
diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-16.c b/gcc/testsuite/gcc.dg/vect/bb-slp-16.c
index e68a9b62535..664e93e9b60 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-16.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-16.c
@@ -1,5 +1,8 @@
 /* { dg-require-effective-target vect_int } */
 
+/* See note below as to why we disable threading.  */
+/* { dg-additional-options "-fdisable-tree-thread1" } */
+
 #include <stdarg.h>
 #include "tree-vect.h"
 
@@ -27,6 +30,10 @@ main1 (int dummy)
       *pout++ = *pin++ + a;
       *pout++ = *pin++ + a;
       *pout++ = *pin++ + a;
+      /* On some architectures like ppc64, jump threading may thread
+	 the iteration where i==0 such that we no longer optimize the
+	 BB.  An alternative to disabling jump threading would be
+	 to wrap the read from `i' into a function returning i.  */
       if (arr[i] = i)
         a = i;
       else
diff --git a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
index 1ea111795a4..57192791857 100644
--- a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
+++ b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
@@ -1,6 +1,6 @@
 /* { dg-do run } */
 /* { dg-require-effective-target avx2 } */
-/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details" } */
+/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details -fdisable-tree-thread1" } */
 
 #include "avx2-check.h"
 #define N 64
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 7dd8594e3d4..09cb07c2b2b 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -36,6 +36,12 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-phinodes.h"
 #include "tree-inline.h"
 #include "tree-vectorizer.h"
+#include "value-range.h"
+#include "gimple-range.h"
+#include "tree-ssa-threadedge.h"
+#include "tree-ssa-path-solver.h"
+#include "ssa.h"
+#include "tree-cfgcleanup.h"
 
 // Path registry for the backwards threader.  After all paths have been
 // registered with register_path(), thread_through_all_blocks() is called
@@ -71,13 +77,414 @@ private:
   const bool m_speed_p;
 };
 
+// Ranger based backwards threader.
+
+class back_threader
+{
+  // Temporary until we remove old code.
+  friend bool path_is_unreachable_p (const vec<jump_thread_edge *> &);
+
+public:
+  back_threader (back_threader_profitability &, back_threader_registry &);
+  ~back_threader ();
+  void find_paths (basic_block bb, tree name);
+
+private:
+  void maybe_register_path (edge taken_edge);
+  bool find_paths_to_names (basic_block bb, bitmap imports);
+  bool resolve_def (tree name, bitmap interesting, vec<tree> worklist);
+  bool resolve_phi (gphi *phi, bitmap imports);
+  edge find_taken_edge (const vec<basic_block> &path);
+  edge find_taken_edge_cond (const vec<basic_block> &path, gcond *);
+  edge find_taken_edge_switch (const vec<basic_block> &path, gswitch *);
+
+  back_threader_registry &m_registry;
+  back_threader_profitability &m_profit;
+  gimple_ranger m_ranger;
+  path_solver m_solver;
+
+  // Current path being analyzed.
+  auto_vec<basic_block> m_path;
+  // Hash to mark visited BBs while analyzing a path.
+  hash_set<basic_block> m_visited_bbs;
+  // The set of SSA names, any of which could potentially change the
+  // value of the final conditional in a path.
+  bitmap m_imports;
+  // The last statement in the path.
+  gimple *m_last_stmt;
+  // This is a bit of a wart.  It's used to pass the LHS SSA name to
+  // the profitability engine.
+  tree m_name;
+  // Marker to differentiate unreachable edges.
+  static const edge UNREACHABLE_EDGE;
+};
+
+// Used to differentiate unreachable edges, so we may stop the search
+// in the given direction.
+const edge back_threader::UNREACHABLE_EDGE = (edge) -1;
+
+back_threader::back_threader (back_threader_profitability &profit,
+			      back_threader_registry &registry)
+  : m_registry (registry),
+    m_profit (profit),
+    m_solver (m_ranger)
+{
+  m_last_stmt = NULL;
+  m_imports = BITMAP_ALLOC (NULL);
+}
+
+back_threader::~back_threader ()
+{
+  m_path.release ();
+  BITMAP_FREE (m_imports);
+}
+
+// Register the current path for jump threading if it's profitable to
+// do so.  TAKEN_EDGE is the known edge out of the path.
+
+void
+back_threader::maybe_register_path (edge taken_edge)
+{
+  bool irreducible = false;
+  bool profitable
+    = m_profit.profitable_path_p (m_path, m_name, taken_edge, &irreducible);
+
+  if (profitable)
+    {
+      m_registry.register_path (m_path, taken_edge);
+
+      if (irreducible)
+	vect_free_loop_info_assumptions (m_path[0]->loop_father);
+    }
+}
+
+// Return the known taken edge out of a path.  If the path can be
+// determined to be unreachable, return UNREACHABLE_EDGE.  If no
+// outgoing edge can be calculated, return NULL.
+
+edge
+back_threader::find_taken_edge (const vec<basic_block> &path)
+{
+  gcc_checking_assert (path.length () > 1);
+  switch (gimple_code (m_last_stmt))
+    {
+    case GIMPLE_COND:
+      return find_taken_edge_cond (path, as_a<gcond *> (m_last_stmt));
+
+    case GIMPLE_SWITCH:
+      return find_taken_edge_switch (path, as_a<gswitch *> (m_last_stmt));
+
+    default:
+      return NULL;
+    }
+}
+
+// Same as find_taken_edge, but for paths ending in a switch.
+
+edge
+back_threader::find_taken_edge_switch (const vec<basic_block> &path,
+				       gswitch *sw)
+{
+  tree name = gimple_switch_index (sw);
+  int_range_max r;
+
+  m_solver.precompute_ranges (&path, m_imports);
+  m_solver.range_in_path (r, name);
+
+  if (r.undefined_p ())
+    return UNREACHABLE_EDGE;
+
+  if (r.varying_p ())
+    return NULL;
+
+  tree val;
+  if (r.singleton_p (&val))
+    return ::find_taken_edge (gimple_bb (sw), val);
+
+  return NULL;
+}
+
+// Same as find_taken_edge, but for paths ending in a GIMPLE_COND.
+
+edge
+back_threader::find_taken_edge_cond (const vec<basic_block> &path,
+				     gcond *cond)
+{
+  m_solver.precompute_ranges (&path, m_imports);
+
+  // Check if either operand is unreachable since this knowledge could
+  // help the caller cut down the search space.
+  int_range_max r;
+  m_solver.range_in_path (r, gimple_cond_lhs (cond));
+  if (r.undefined_p ())
+    return UNREACHABLE_EDGE;
+  m_solver.range_in_path (r, gimple_cond_rhs (cond));
+  if (r.undefined_p ())
+    return UNREACHABLE_EDGE;
+
+  m_solver.range_in_path (r, cond);
+
+  int_range<2> true_range (boolean_true_node, boolean_true_node);
+  int_range<2> false_range (boolean_false_node, boolean_false_node);
+
+  if (r == true_range || r == false_range)
+    {
+      edge e_true, e_false;
+      basic_block bb = gimple_bb (cond);
+      extract_true_false_edges_from_block (bb, &e_true, &e_false);
+      return r == true_range ? e_true : e_false;
+    }
+  return NULL;
+}
+
+// Populate a vector of trees from a bitmap.
+
+static inline void
+populate_worklist (vec<tree> worklist, bitmap bits)
+{
+  bitmap_iterator bi;
+  unsigned i;
+
+  EXECUTE_IF_SET_IN_BITMAP (bits, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+      worklist.quick_push (name);
+    }
+}
+
+// If any of the incoming edges for a PHI resolves the current path,
+// register the path(s), and return TRUE.
+
+bool
+back_threader::resolve_phi (gphi *phi, bitmap interesting)
+{
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_phi_result (phi)))
+    return true;
+
+  bool done = false;
+  for (size_t i = 0; i < gimple_phi_num_args (phi); ++i)
+    {
+      edge e = gimple_phi_arg_edge (phi, i);
+
+      // This is like path_crosses_loops in profitable_path_p but more
+      // restrictive, since profitable_path_p allows threading the
+      // first block because it would be redirected anyhow.
+      //
+      // If we loosened the restriction and used profitable_path_p()
+      // here instead, we would peel off the first iterations of loops
+      // in places like tree-ssa/pr14341.c.
+      bool profitable_p = m_path[0]->loop_father == e->src->loop_father;
+      if (!profitable_p)
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file,
+		     "  FAIL: path through PHI in bb%d (incoming bb:%d) crosses loop\n",
+		     e->dest->index, e->src->index);
+	  continue;
+	}
+
+      tree arg = gimple_phi_arg_def (phi, i);
+      if (TREE_CODE (arg) == SSA_NAME)
+	{
+	  unsigned v = SSA_NAME_VERSION (arg);
+
+	  // Avoid loops as in: x_5 = PHI <x_5(2), ...>.
+	  if (bitmap_bit_p (interesting, v))
+	    continue;
+
+	  bitmap_set_bit (interesting, v);
+	  bitmap_set_bit (m_imports, v);
+	  done |= find_paths_to_names (e->src, interesting);
+	  bitmap_clear_bit (interesting, v);
+	}
+      else if (TREE_CODE (arg) == INTEGER_CST)
+	{
+	  m_path.safe_push (e->src);
+	  edge taken_edge = find_taken_edge (m_path);
+	  if (taken_edge && taken_edge != UNREACHABLE_EDGE)
+	    {
+	      maybe_register_path (taken_edge);
+	      done = true;
+	    }
+	  m_path.pop ();
+	}
+    }
+  return done;
+}
+
+// If the definition of NAME resolves the current path, register the
+// path, and return TRUE.
+
+bool
+back_threader::resolve_def (tree name, bitmap interesting, vec<tree> worklist)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+
+  // Handle PHIs.
+  if (is_a<gphi *> (def_stmt)
+      && resolve_phi (as_a<gphi *> (def_stmt), interesting))
+    return true;
+
+  // Defer copies of SSAs by adding the source to the worklist.
+  if (gimple_assign_single_p (def_stmt)
+      && TREE_CODE (gimple_assign_rhs1 (def_stmt)) == SSA_NAME)
+    {
+      tree rhs = gimple_assign_rhs1 (def_stmt);
+      bitmap_set_bit (m_imports, SSA_NAME_VERSION (rhs));
+      bitmap_set_bit (interesting, SSA_NAME_VERSION (rhs));
+      worklist.safe_push (rhs);
+    }
+  return false;
+}
+
+// Find jump threading paths to any of the SSA names in the
+// INTERESTING bitmap, and register any such paths.
+//
+// Return TRUE if no further processing past this block is necessary.
+// This is because we've either registered a path, or because there is
+// nothing of interest beyond this block.
+//
+// BB is the current block being processed.
+
+bool
+back_threader::find_paths_to_names (basic_block bb, bitmap interesting)
+{
+  if (m_visited_bbs.add (bb))
+    return true;
+
+  m_path.safe_push (bb);
+
+  if (m_path.length () > 1
+      && !m_profit.profitable_path_p (m_path, m_name, NULL))
+    {
+      m_path.pop ();
+      m_visited_bbs.remove (bb);
+      return false;
+    }
+
+  auto_bitmap processed;
+  unsigned i;
+  bool done = false;
+
+  // We use a worklist instead of iterating through the bitmap,
+  // because we may add new items in-flight.
+  auto_vec<tree> worklist (bitmap_count_bits (interesting));
+  populate_worklist (worklist, interesting);
+  while (!worklist.is_empty ())
+    {
+      tree name = worklist.pop ();
+      unsigned i = SSA_NAME_VERSION (name);
+      basic_block def_bb = gimple_bb (SSA_NAME_DEF_STMT (name));
+
+      // Process any names defined in this block.
+      if (def_bb == bb)
+	{
+	  bitmap_set_bit (processed, i);
+
+	  if (resolve_def (name, interesting, worklist))
+	    {
+	      done = true;
+	      goto leave_bb;
+	    }
+	}
+      // Examine blocks that define or export an interesting SSA,
+      // since they may compute a range which resolves this path.
+      if ((def_bb == bb
+	   || bitmap_bit_p (m_ranger.gori ().exports (bb), i))
+	  && m_path.length () > 1)
+	{
+	  edge taken_edge = find_taken_edge (m_path);
+	  if (taken_edge)
+	    {
+	      if (taken_edge != UNREACHABLE_EDGE)
+		maybe_register_path (taken_edge);
+
+	      done = true;
+	      goto leave_bb;
+	    }
+	}
+    }
+
+  // If there are interesting names not yet processed, keep looking.
+  bitmap_and_compl_into (interesting, processed);
+  if (!bitmap_empty_p (interesting))
+    {
+      edge_iterator iter;
+      edge e;
+      FOR_EACH_EDGE (e, iter, bb->preds)
+	if ((e->flags & EDGE_ABNORMAL) == 0)
+	  done |= find_paths_to_names (e->src, interesting);
+    }
+
+ leave_bb:
+  bitmap_iterator bi;
+  EXECUTE_IF_SET_IN_BITMAP (processed, 0, i, bi)
+    bitmap_set_bit (interesting, i);
+
+  m_path.pop ();
+  m_visited_bbs.remove (bb);
+  return done;
+}
+
+// Search backwards from BB looking for paths where the final
+// conditional out of BB can be determined.  NAME is the LHS of the
+// final conditional.  Register such paths for jump threading.
+
+void
+back_threader::find_paths (basic_block bb, tree name)
+{
+  gimple *stmt = last_stmt (bb);
+  if (!stmt
+      || (gimple_code (stmt) != GIMPLE_COND
+	  && gimple_code (stmt) != GIMPLE_SWITCH))
+    return;
+
+  if (EDGE_COUNT (bb->succs) > 1
+      || single_succ_to_potentially_threadable_block (bb))
+    {
+      m_last_stmt = stmt;
+      m_visited_bbs.empty ();
+      m_path.truncate (0);
+      m_name = name;
+      bitmap_clear (m_imports);
+
+      auto_bitmap interesting;
+      bitmap_copy (m_imports, m_ranger.gori ().imports (bb));
+      bitmap_copy (interesting, m_imports);
+      find_paths_to_names (bb, interesting);
+    }
+}
+
+// Dump a sequence of BBs through the CFG.
+
+DEBUG_FUNCTION void
+dump_path (FILE *dump_file, const vec<basic_block> &path)
+{
+  for (size_t i = 0; i < path.length (); ++i)
+    {
+      fprintf (dump_file, "BB%d", path[i]->index);
+      if (i + 1 < path.length ())
+	fprintf (dump_file, " <- ");
+    }
+  fprintf (dump_file, "\n");
+}
+
+DEBUG_FUNCTION void
+debug (const vec <basic_block> &path)
+{
+  dump_path (stderr, path);
+}
+
 class thread_jumps
 {
 public:
   thread_jumps (bool speed_p = true)
-    : m_profit (speed_p), m_registry (param_max_fsm_thread_paths)
+    : m_profit (speed_p),
+      m_registry (param_max_fsm_thread_paths),
+      m_back_threader (m_profit, m_registry)
   { }
   void find_jump_threads_backwards (basic_block bb);
+  void find_jump_threads_backwards_with_ranger (basic_block bb);
   bool thread_through_all_blocks ();
 
 private:
@@ -102,6 +509,7 @@ private:
   tree m_name;
   back_threader_profitability m_profit;
   back_threader_registry m_registry;
+  back_threader m_back_threader;
 };
 
 // Perform the actual jump threading for the all queued paths.
@@ -548,8 +956,8 @@ back_threader_registry::register_path (const vec<basic_block> &m_path,
 						EDGE_NO_COPY_SRC_BLOCK);
   jump_thread_path->safe_push (x);
 
-  m_lowlevel_registry.register_jump_thread (jump_thread_path);
-  ++m_threaded_paths;
+  if (m_lowlevel_registry.register_jump_thread (jump_thread_path))
+    ++m_threaded_paths;
   return true;
 }
 
@@ -818,6 +1226,12 @@ thread_jumps::fsm_find_control_statement_thread_paths (tree name)
 void
 thread_jumps::find_jump_threads_backwards (basic_block bb)
 {
+  if (param_threader_mode & THREADER_MODE_RANGER)
+    {
+      find_jump_threads_backwards_with_ranger (bb);
+      return;
+    }
+
   gimple *stmt = get_gimple_control_stmt (bb);
   if (!stmt)
     return;
@@ -850,6 +1264,28 @@ thread_jumps::find_jump_threads_backwards (basic_block bb)
   fsm_find_control_statement_thread_paths (name);
 }
 
+// Like find_jump_threads_backwards(), but using ranger.
+
+void
+thread_jumps::find_jump_threads_backwards_with_ranger (basic_block bb)
+{
+  gimple *stmt = get_gimple_control_stmt (bb);
+  if (!stmt)
+    return;
+
+  enum gimple_code code = gimple_code (stmt);
+  tree name = NULL;
+  if (code == GIMPLE_SWITCH)
+    name = gimple_switch_index (as_a <gswitch *> (stmt));
+  else if (code == GIMPLE_GOTO)
+    name = gimple_goto_dest (stmt);
+  else if (code == GIMPLE_COND)
+    name = gimple_cond_lhs (stmt);
+
+  m_name = name;
+  m_back_threader.find_paths (bb, name);
+}
+
 namespace {
 
 const pass_data pass_data_thread_jumps =
@@ -883,12 +1319,12 @@ pass_thread_jumps::gate (function *fun ATTRIBUTE_UNUSED)
   return flag_expensive_optimizations;
 }
 
+// Try to thread blocks in FUN.  Return TRUE if any jump thread paths were
+// registered.
 
-unsigned int
-pass_thread_jumps::execute (function *fun)
+static bool
+try_thread_blocks (function *fun)
 {
-  loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
-
   /* Try to thread each block with more than one successor.  */
   thread_jumps threader;
   basic_block bb;
@@ -897,7 +1333,30 @@ pass_thread_jumps::execute (function *fun)
       if (EDGE_COUNT (bb->succs) > 1)
 	threader.find_jump_threads_backwards (bb);
     }
-  bool changed = threader.thread_through_all_blocks ();
+  return threader.thread_through_all_blocks ();
+}
+
+unsigned int
+pass_thread_jumps::execute (function *fun)
+{
+  loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
+
+  // Iterative mode is a testing construct and is not meant for public
+  // consumption.  It is OFF by default.
+  bool iterative = param_threader_iterative;
+
+  bool changed = false;
+  while (try_thread_blocks (fun))
+    {
+      changed = true;
+
+      if (!iterative)
+	break;
+
+      if ((param_threader_mode & THREADER_MODE_RANGER) == 0)
+	break;
+      cleanup_tree_cfg (TODO_update_ssa);
+    }
 
   loop_optimizer_finalize ();
   return changed ? TODO_cleanup_cfg : 0;
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 6ce32644aa5..ea5c37a2c65 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -1335,6 +1335,18 @@ jump_threader::thread_across_edge (edge e)
   m_avail_exprs_stack->pop_to_marker ();
 }
 
+/* Return TRUE if BB has a single successor to a block with multiple
+   incoming and outgoing edges.  */
+
+bool
+single_succ_to_potentially_threadable_block (basic_block bb)
+{
+  int flags = (EDGE_IGNORE | EDGE_COMPLEX | EDGE_ABNORMAL);
+  return (single_succ_p (bb)
+	  && (single_succ_edge (bb)->flags & flags) == 0
+	  && potentially_threadable_block (single_succ (bb)));
+}
+
 /* Examine the outgoing edges from BB and conditionally
    try to thread them.  */
 
@@ -1348,12 +1360,8 @@ jump_threader::thread_outgoing_edges (basic_block bb)
      outgoing edges, then we may be able to thread the edge, i.e., we
      may be able to statically determine which of the outgoing edges
      will be traversed when the incoming edge from BB is traversed.  */
-  if (single_succ_p (bb)
-      && (single_succ_edge (bb)->flags & flags) == 0
-      && potentially_threadable_block (single_succ (bb)))
-    {
-      thread_across_edge (single_succ_edge (bb));
-    }
+  if (single_succ_to_potentially_threadable_block (bb))
+    thread_across_edge (single_succ_edge (bb));
   else if ((last = last_stmt (bb))
 	   && gimple_code (last) == GIMPLE_COND
 	   && EDGE_COUNT (bb->succs) == 2
diff --git a/gcc/tree-ssa-threadedge.h b/gcc/tree-ssa-threadedge.h
index 48735f2bc27..c78eef25219 100644
--- a/gcc/tree-ssa-threadedge.h
+++ b/gcc/tree-ssa-threadedge.h
@@ -29,7 +29,7 @@ class jump_threader
 {
 public:
   jump_threader (class const_and_copies *,
-		 avail_exprs_stack *,
+		 class avail_exprs_stack *,
 		 class jump_threader_simplifier *,
 		 class evrp_range_analyzer * = NULL);
   ~jump_threader ();
@@ -83,6 +83,7 @@ protected:
 };
 
 extern void propagate_threaded_block_debug_into (basic_block, basic_block);
+extern bool single_succ_to_potentially_threadable_block (basic_block);
 
 // ?? All this ssa_name_values stuff is the store of values for
 // avail_exprs_stack and const_and_copies, so it really belongs in the
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index f496dd3eb8c..29cf010e9ca 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "dbgcnt.h"
 #include "tree-cfg.h"
 #include "tree-vectorizer.h"
+#include "tree-pass.h"
 
 /* Given a block B, update the CFG and SSA graph to reflect redirecting
    one or more in-edges to B to instead reach the destination of an
@@ -2741,15 +2742,17 @@ jump_thread_path_registry::thread_through_all_blocks
 
    E is the edge we can thread, E2 is the new target edge, i.e., we
    are effectively recording that E->dest can be changed to E2->dest
-   after fixing the SSA graph.  */
+   after fixing the SSA graph.
 
-void
+   Return TRUE if PATH was successfully registered.  */
+
+bool
 jump_thread_path_registry::register_jump_thread (vec<jump_thread_edge *> *path)
 {
   if (!dbg_cnt (registered_jump_thread))
     {
       path->release ();
-      return;
+      return false;
     }
 
   /* First make sure there are no NULL outgoing edges on the jump threading
@@ -2766,7 +2769,7 @@ jump_thread_path_registry::register_jump_thread (vec<jump_thread_edge *> *path)
 	    }
 
 	  path->release ();
-	  return;
+	  return false;
 	}
 
       /* Only the FSM threader is allowed to thread across
@@ -2780,6 +2783,7 @@ jump_thread_path_registry::register_jump_thread (vec<jump_thread_edge *> *path)
     dump_jump_thread_path (dump_file, *path, true);
 
   m_paths.safe_push (path);
+  return true;
 }
 
 /* Return how many uses of T there are within BB, as long as there
diff --git a/gcc/tree-ssa-threadupdate.h b/gcc/tree-ssa-threadupdate.h
index b806caee581..2030bda15af 100644
--- a/gcc/tree-ssa-threadupdate.h
+++ b/gcc/tree-ssa-threadupdate.h
@@ -63,7 +63,7 @@ class jump_thread_path_registry
 public:
   jump_thread_path_registry ();
   ~jump_thread_path_registry ();
-  void register_jump_thread (vec<jump_thread_edge *> *);
+  bool register_jump_thread (vec<jump_thread_edge *> *);
   void remove_jump_threads_including (edge);
   bool thread_through_all_blocks (bool);
   jump_thread_edge *allocate_thread_edge (edge e, jump_thread_edge_type t);
diff --git a/libgomp/testsuite/libgomp.graphite/force-parallel-4.c b/libgomp/testsuite/libgomp.graphite/force-parallel-4.c
index d2af142e8fe..ef6f64d229b 100644
--- a/libgomp/testsuite/libgomp.graphite/force-parallel-4.c
+++ b/libgomp/testsuite/libgomp.graphite/force-parallel-4.c
@@ -1,4 +1,5 @@
 /* Autopar with IF conditions.  */
+/* { dg-additional-options "-fdisable-tree-thread1" } */
 
 void abort();
 
diff --git a/libgomp/testsuite/libgomp.graphite/force-parallel-8.c b/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
index 32ba5ab84ef..a97eb97acf6 100644
--- a/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
+++ b/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
@@ -1,3 +1,5 @@
+/* { dg-additional-options "-fdisable-tree-thread1" } */
+
 #define N 1500
 
 int x[N][N], y[N];
-- 
2.31.1
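
As a reading aid for find_taken_edge_switch() in the patch above: it
separates three results, namely an empty range (the path is unreachable),
a singleton range (the taken edge is known), and anything else (no
answer).  The sketch below models only that decision in standalone C++17.
It is a toy under stated assumptions, not GCC code: ToyRange, Outcome,
Decision, classify and known_value are hypothetical names, and a closed
integer interval stands in for int_range_max.

```cpp
// Toy model of the three-way result; every name here is hypothetical
// and none of this is part of GCC.
#include <cassert>
#include <cstdint>
#include <optional>

// A closed integer interval.  lo > hi encodes the empty range, which
// plays the role of an undefined range in the patch.
struct ToyRange {
  int64_t lo;
  int64_t hi;

  bool empty() const { return lo > hi; }

  // Return the single value the range holds, if it holds exactly one.
  std::optional<int64_t> singleton() const {
    if (!empty() && lo == hi)
      return lo;
    return std::nullopt;
  }
};

enum class Outcome { Unreachable, Unknown, Known };

struct Decision {
  Outcome outcome;
  int64_t value;  // Meaningful only when outcome == Outcome::Known.
};

// Check for an empty range first, then for a singleton; any other
// range gives no answer.
inline Decision classify(const ToyRange &r) {
  if (r.empty())
    return {Outcome::Unreachable, 0};
  if (auto v = r.singleton())
    return {Outcome::Known, *v};
  return {Outcome::Unknown, 0};
}

// Extract the value that selects the outgoing edge.  Only valid for a
// Known decision.
inline int64_t known_value(const Decision &d) {
  assert(d.outcome == Outcome::Known);
  return d.value;
}
```

In this toy, a caller would stop searching in a direction on Unreachable,
register a path on Known, and keep walking on Unknown, which is roughly
how the callers in the patch treat UNREACHABLE_EDGE, a real edge, and NULL.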



* Re: [PATCH 0/2] Ranger-based backwards threader implementation.
  2021-06-28 16:21 [PATCH 0/2] Ranger-based backwards threader implementation Aldy Hernandez
  2021-06-28 16:21 ` [PATCH 1/2] Implement basic block path solver Aldy Hernandez
  2021-06-28 16:21 ` [PATCH 2/2] Backwards jump threader rewrite with ranger Aldy Hernandez
@ 2021-06-28 23:19 ` Martin Sebor
  2021-06-29 10:27   ` Aldy Hernandez
  2 siblings, 1 reply; 20+ messages in thread
From: Martin Sebor @ 2021-06-28 23:19 UTC (permalink / raw)
  To: Aldy Hernandez, GCC patches; +Cc: Martin Sebor

[-- Attachment #1: Type: text/plain, Size: 6870 bytes --]

On 6/28/21 10:21 AM, Aldy Hernandez via Gcc-patches wrote:
> This is the ranger-based backwards threader.  It is divided into two
> parts: the solver and the path discovery bits.
> 
> The solver is generic enough, that it may be of use to other passes,
> so it's been abstracted into its own separate class/file.  Andrew and
> I have already gone over it, so I don't think a review is necessary.
> Besides, it's technically an extension of the ranger infrastructure.
> 
> On the other hand, the path discovery bits could benefit from the
> watchful eye of the jump threading experts.
> 
> Documenting the solver in a [ranger-tech] post is on my TODO list,
> as I think it would be useful as an example of GORI as a general
> tool, outside the VRP world.
> 
> As I have mentioned elsewhere, I have gone through each test and
> documented the reasons why they were adjusted (when useful).  The
> reviewer(s) may benefit from looking at the test notes.
> 
> I have added a --param=threader-mode={ranger,legacy} option, which I
> hope to remove shortly after.  It has been useful for diagnosing
> issues in the past, though perhaps not so much now.  I've left it
> in case there's a remote interest in using it during stage1, but
> removing it could be a huge cleanup to tree-ssa-threadbackward.c.
> 
> If/when accepted, I will open 2-3 PRs with the XFAILed tests as
> requested.  I am still working on distilling a C counterpart for
> the libphobos missing thread edge.  It'll hopefully be ready by the
> time the review is done.
> 
> A version of this patchset with the verification code has
> been tested on x86-64, ppc64, ppc64le, and aarch64 (all Linux).
> 
> I am currently re-testing on x86-64 Linux, but will not re-test on the
> rest of the architectures because...OMG aarch64 is so slow!

I applied the series and ran a subset of tests and didn't see any
failures, just the three XPASSes below.  The Wfree-nonheap-object
tests you mentioned in the other post all pass.  Looks like you
got past that problem?

XPASS: gcc.dg/uninit-pr61112.c pr61112 (test for bogus messages, line 32)
XPASS: gcc.dg/uninit-pr61112.c pr61112 (test for bogus messages, line 46)
XPASS: gcc.dg/uninit-pr61112.c pr61112 (test for bogus messages, line 60)

A couple of comments on the tests below (I haven't looked at the meat
of the patch):

> 
> Thanks.
> Aldy
> 
> Aldy Hernandez (2):
>    Implement basic block path solver.
>    Backwards jump threader rewrite with ranger.
> 
>   gcc/Makefile.in                               |   6 +
>   gcc/flag-types.h                              |   7 +
>   gcc/params.opt                                |  17 +
>   .../g++.dg/debug/dwarf2/deallocator.C         |   3 +-
>   gcc/testsuite/gcc.c-torture/compile/pr83510.c |  33 ++
>   gcc/testsuite/gcc.dg/Wrestrict-22.c           |   3 +

The change here just adds the comment:

+/* This looks like the threader caused the entire loop to collapse, and the
+   warning pass can't determine the arguments to memcpy.  */
+

Since the test passes I'm not sure I understand what the comment
is trying to say.  Is it still accurate and necessary?

>   gcc/testsuite/gcc.dg/loop-unswitch-2.c        |   2 +-
>   gcc/testsuite/gcc.dg/old-style-asm-1.c        |   5 +-
>   gcc/testsuite/gcc.dg/pr68317.c                |   4 +-
>   gcc/testsuite/gcc.dg/pr97567-2.c              |   2 +-
>   gcc/testsuite/gcc.dg/predict-9.c              |   4 +-
>   gcc/testsuite/gcc.dg/shrink-wrap-loop.c       |  53 ++
>   gcc/testsuite/gcc.dg/sibcall-1.c              |  10 +
>   .../gcc.dg/tree-ssa/builtin-sprintf-3.c       |   5 +-

I wonder if breaking up the test function into five, one for each
of the tests it does, would be a better way to avoid the IL changes
than disabling all the threading passes.  Like in the attached patch.

Martin

>   gcc/testsuite/gcc.dg/tree-ssa/pr21001.c       |   1 +
>   gcc/testsuite/gcc.dg/tree-ssa/pr21294.c       |   1 +
>   gcc/testsuite/gcc.dg/tree-ssa/pr21417.c       |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c     |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/pr21563.c       |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/pr49039.c       |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c     |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c     |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c     |   2 +-
>   .../gcc.dg/tree-ssa/ranger-threader-1.c       |  20 +
>   .../gcc.dg/tree-ssa/ranger-threader-2.c       |  39 ++
>   .../gcc.dg/tree-ssa/ranger-threader-3.c       |  41 ++
>   .../gcc.dg/tree-ssa/ranger-threader-4.c       |  83 +++
>   gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c  |   4 +-
>   .../gcc.dg/tree-ssa/ssa-dom-thread-11.c       |   2 +-
>   .../gcc.dg/tree-ssa/ssa-dom-thread-12.c       |   2 +-
>   .../gcc.dg/tree-ssa/ssa-dom-thread-14.c       |   1 +
>   .../gcc.dg/tree-ssa/ssa-dom-thread-18.c       |   5 +-
>   .../gcc.dg/tree-ssa/ssa-dom-thread-6.c        |   4 +-
>   .../gcc.dg/tree-ssa/ssa-dom-thread-7.c        |   1 +
>   gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c    |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c |   1 +
>   gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c |   1 +
>   gcc/testsuite/gcc.dg/tree-ssa/vrp02.c         |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/vrp03.c         |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/vrp05.c         |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/vrp06.c         |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/vrp07.c         |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/vrp09.c         |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/vrp19.c         |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/vrp20.c         |   2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/vrp33.c         |   2 +-
>   gcc/testsuite/gcc.dg/vect/bb-slp-16.c         |   7 +
>   .../gcc.target/i386/avx2-vect-aggressive.c    |   2 +-
>   gcc/tree-ssa-path-solver.cc                   | 310 ++++++++++++
>   gcc/tree-ssa-path-solver.h                    |  85 ++++
>   gcc/tree-ssa-threadbackward.c                 | 475 +++++++++++++++++-
>   gcc/tree-ssa-threadedge.c                     |  20 +-
>   gcc/tree-ssa-threadedge.h                     |   3 +-
>   gcc/tree-ssa-threadupdate.c                   |  12 +-
>   gcc/tree-ssa-threadupdate.h                   |   2 +-
>   .../libgomp.graphite/force-parallel-4.c       |   1 +
>   .../libgomp.graphite/force-parallel-8.c       |   2 +
>   58 files changed, 1261 insertions(+), 54 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c
>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c
>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c
>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c
>   create mode 100644 gcc/tree-ssa-path-solver.cc
>   create mode 100644 gcc/tree-ssa-path-solver.h
> 


[-- Attachment #2: builtin-sprintf-3.c.diff --]
[-- Type: text/x-patch, Size: 1991 bytes --]

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c
index fae2a1b73ea..b2e005bc716 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c
@@ -3,7 +3,10 @@
    that the sprintf return value (or value range) optimization is not
    performed for an unknown string.  */
 /* { dg-do compile } */
-/* { dg-options "-O2 -Wall -Werror -fdump-tree-optimized -fprintf-return-value" } */
+/* { dg-options "-O2 -Wall -Werror -fdump-tree-optimized" } */
+
+/* Note: Threader will duplicate BBs such that there are multiple
+   string_*_fail calls on certain paths.  */
 
 #define INT_MAX   __INT_MAX__
 #define INT_MIN   (-INT_MAX - 1)
@@ -15,7 +18,7 @@ extern void string_lt_0_fail ();
 extern void string_eq_0_fail ();
 extern void string_gt_0_fail ();
 
-void test_string (char *d, const char *s)
+void test_string_eq_min (char *d, const char *s)
 {
   int n = __builtin_sprintf (d, "%-s", s);
 
@@ -23,13 +26,36 @@ void test_string (char *d, const char *s)
      or INT_MAX.  (This is a white box test based on knowing that
      the optimization computes its own values of the two constants.)  */
   if (n == INT_MIN) string_eq_min_fail ();
+}
+
+void test_string_eq_max (char *d, const char *s)
+{
+  int n = __builtin_sprintf (d, "%-s", s);
+
   if (n == INT_MAX) string_eq_max_fail ();
+}
+
+void test_string_lt_0 (char *d, const char *s)
+{
+  int n = __builtin_sprintf (d, "%-s", s);
 
   /* The return value could be negative when strlen(s) is in excess
      of 4095 (the maximum number of bytes a single directive is required
      to handle).  */
   if (n < 0) string_lt_0_fail ();
+}
+
+void test_string_eq_0 (char *d, const char *s)
+{
+  int n = __builtin_sprintf (d, "%-s", s);
+
   if (n == 0) string_eq_0_fail ();
+}
+
+void test_string_gt_0 (char *d, const char *s)
+{
+  int n = __builtin_sprintf (d, "%-s", s);
+
   if (n > 0) string_gt_0_fail ();
 }
 


* Re: [PATCH 0/2] Ranger-based backwards threader implementation.
  2021-06-28 23:19 ` [PATCH 0/2] Ranger-based backwards threader implementation Martin Sebor
@ 2021-06-29 10:27   ` Aldy Hernandez
  2021-06-29 21:22     ` Martin Sebor
  0 siblings, 1 reply; 20+ messages in thread
From: Aldy Hernandez @ 2021-06-29 10:27 UTC (permalink / raw)
  To: Martin Sebor, GCC patches
  Cc: Martin Sebor, Jeff Law, Andrew MacLeod, Richard Biener

[-- Attachment #1: Type: text/plain, Size: 4657 bytes --]



On 6/29/21 1:19 AM, Martin Sebor wrote:
> On 6/28/21 10:21 AM, Aldy Hernandez via Gcc-patches wrote:
>> This is the ranger-based backwards threader.  It is divided into two
>> parts: the solver and the path discovery bits.
>>
>> The solver is generic enough, that it may be of use to other passes,
>> so it's been abstracted into its own separate class/file.  Andrew and
>> I have already gone over it, so I don't think a review is necessary.
>> Besides, it's technically an extension of the ranger infrastructure.
>>
>> On the other hand, the path discovery bits could benefit from the
>> watchful eye of the jump threading experts.
>>
>> Documenting the solver in a [ranger-tech] post is on my TODO list,
>> as I think it would be useful as an example of GORI as a general
>> tool, outside the VRP world.
>>
>> As I have mentioned elsewhere, I have gone through each test and
>> documented the reasons why they were adjusted (when useful).  The
>> reviewer(s) may benefit from looking at the test notes.
>>
>> I have added a --param=threader-mode={ranger,legacy} option, which I
>> hope to remove shortly after.  It has been useful for diagnosing
>> issues in the past, though perhaps not so much now.  I've left it
>> in case there's a remote interest in using it during stage1, but
>> removing it could be a huge cleanup to tree-ssa-threadbackward.c.
>>
>> If/when accepted, I will open 2-3 PRs with the XFAILed tests as
>> requested.  I am still working on distilling a C counterpart for
>> the libphobos missing thread edge.  It'll hopefully be ready by the
>> time the review is done.
>>
>> A version of this patchset with the verification code has
>> been tested on x86-64, ppc64, ppc64le, and aarch64 (all Linux).
>>
>> I am currently re-testing on x86-64 Linux, but will not re-test on the
>> rest of the architectures because...OMG aarch64 is so slow!
> 
> I applied the series and ran a subset of tests and didn't see any
> failures, just the three XPASSes below.  The Wfree-nonheap-object
> tests you mentioned in the other post all pass.  Looks like you
> got past that problem?
> 
> XPASS: gcc.dg/uninit-pr61112.c pr61112 (test for bogus messages, line 32)
> XPASS: gcc.dg/uninit-pr61112.c pr61112 (test for bogus messages, line 46)
> XPASS: gcc.dg/uninit-pr61112.c pr61112 (test for bogus messages, line 60)
> 
> A couple of comments on the tests below (I haven't looked at the meat
> of the patch):
> 
>>
>> Thanks.
>> Aldy
>>
>> Aldy Hernandez (2):
>>    Implement basic block path solver.
>>    Backwards jump threader rewrite with ranger.
>>
>>   gcc/Makefile.in                               |   6 +
>>   gcc/flag-types.h                              |   7 +
>>   gcc/params.opt                                |  17 +
>>   .../g++.dg/debug/dwarf2/deallocator.C         |   3 +-
>>   gcc/testsuite/gcc.c-torture/compile/pr83510.c |  33 ++
>>   gcc/testsuite/gcc.dg/Wrestrict-22.c           |   3 +
> 
> The change here just adds the comment:
> 
> +/* This looks like the threader caused the entire loop to collapse, and the
> +   warning pass can't determine the arguments to memcpy.  */
> +
> 
> Since the test passes I'm not sure I understand what the comment
> is trying to say.  Is it still accurate and necessary?

This seems to have come from the ranger branch, which had slightly
different code; in particular, it made use of a full ranger with
equivalences.  It looks like this could have failed in the branch, but
it no longer does.  I have removed the comment.

> 
>>   gcc/testsuite/gcc.dg/loop-unswitch-2.c        |   2 +-
>>   gcc/testsuite/gcc.dg/old-style-asm-1.c        |   5 +-
>>   gcc/testsuite/gcc.dg/pr68317.c                |   4 +-
>>   gcc/testsuite/gcc.dg/pr97567-2.c              |   2 +-
>>   gcc/testsuite/gcc.dg/predict-9.c              |   4 +-
>>   gcc/testsuite/gcc.dg/shrink-wrap-loop.c       |  53 ++
>>   gcc/testsuite/gcc.dg/sibcall-1.c              |  10 +
>>   .../gcc.dg/tree-ssa/builtin-sprintf-3.c       |   5 +-
> 
> I wonder if breaking up the test function into five, one for each
> of the tests it does, would be a better way to avoid the IL changes
> than disabling all the threading passes.  Like in the attached patch.

As the author of the original test, I completely defer to you :).

Attached is the latest version with your suggested changes, as well as a 
gimple FE test for the previously discussed failing libphobos test.
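
For anyone following along without the threading background, here is a
minimal sketch of the kind of transformation the new tests count.  It
takes the shape of baz_1 from ranger-threader-4.c and writes the
threaded result by hand.  The names baz_1_threaded and versions_agree,
and the call counters, are made up for illustration; this is not
compiler output and not code from the patch.

```c
#include <assert.h>   /* only needed for the usage check shown below */

/* Call counters standing in for the side effects of foo () and bar ().  */
static int foo_calls, bar_calls;
static void foo (void) { foo_calls++; }
static void bar (void) { bar_calls++; }

/* Original shape, as in baz_1 from ranger-threader-4.c.  */
static void
baz_1 (int a, int b, int c)
{
  if (a && b)
    foo ();
  if (!b && c)
    bar ();
}

/* Hand-threaded equivalent (hypothetical).  On the path where
   "a && b" held, b is known to be nonzero, so "!b && c" is known
   to be false and that path can jump straight past the second test.  */
static void
baz_1_threaded (int a, int b, int c)
{
  if (a && b)
    {
      foo ();
      return;
    }
  if (!b && c)
    bar ();
}

/* Return 1 if both versions make the same calls for every
   combination of 0/1 inputs, 0 otherwise.  */
static int
versions_agree (void)
{
  for (int a = 0; a < 2; a++)
    for (int b = 0; b < 2; b++)
      for (int c = 0; c < 2; c++)
        {
          foo_calls = bar_calls = 0;
          baz_1 (a, b, c);
          int f = foo_calls, g = bar_calls;

          foo_calls = bar_calls = 0;
          baz_1_threaded (a, b, c);
          if (f != foo_calls || g != bar_calls)
            return 0;
        }
  return 1;
}
```

Calling assert (versions_agree ()) from a small driver checks that the
two shapes behave the same; the real pass gets there by duplicating
blocks along the path rather than by adding a return.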

Thanks.
Aldy

[-- Attachment #2: 0002-Backwards-jump-threader-rewrite-with-ranger.patch --]
[-- Type: text/x-patch, Size: 66244 bytes --]

From a373ff1b936f39c8372ba88c4a462dd61a78c535 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez <aldyh@redhat.com>
Date: Tue, 15 Jun 2021 12:32:51 +0200
Subject: [PATCH 2/2] Backwards jump threader rewrite with ranger.

This is a rewrite of the backwards threader with a ranger based solver.

The code is divided into two parts: the path solver in
tree-ssa-path-solver.*, and the path discovery in
tree-ssa-threadbackward.c.

The legacy code is still available with --param=threader-mode=legacy,
but will be removed shortly after.

gcc/ChangeLog:

	* Makefile.in (tree-ssa-loop-im.o-warn): New.
	* flag-types.h (enum threader_mode): New.
	* params.opt: Add entry for --param=threader-mode.
	* tree-ssa-threadbackward.c (THREADER_ITERATIVE_MODE): New.
	(class back_threader): New.
	(back_threader::back_threader): New.
	(back_threader::~back_threader): New.
	(back_threader::maybe_register_path): New.
	(back_threader::find_taken_edge): New.
	(back_threader::find_taken_edge_switch): New.
	(back_threader::find_taken_edge_cond): New.
	(back_threader::resolve_def): New.
	(back_threader::resolve_phi): New.
	(back_threader::find_paths_to_names): New.
	(back_threader::find_paths): New.
	(dump_path): New.
	(debug): New.
	(thread_jumps::find_jump_threads_backwards): Call ranger threader.
	(thread_jumps::find_jump_threads_backwards_with_ranger): New.
	(pass_thread_jumps::execute): Abstract out code...
	(try_thread_blocks): ...here.
	* tree-ssa-threadedge.c (jump_threader::thread_outgoing_edges):
	Abstract out threading candidate code to...
	(single_succ_to_potentially_threadable_block): ...here.
	* tree-ssa-threadedge.h (single_succ_to_potentially_threadable_block):
	New.
	* tree-ssa-threadupdate.c (register_jump_thread): Return boolean.
	* tree-ssa-threadupdate.h (class jump_thread_path_registry):
	Return bool from register_jump_thread.

libgomp/ChangeLog:

	* testsuite/libgomp.graphite/force-parallel-4.c: Adjust for
	threader.
	* testsuite/libgomp.graphite/force-parallel-8.c: Same.

gcc/testsuite/ChangeLog:

	* g++.dg/debug/dwarf2/deallocator.C: Adjust for threader.
	* gcc.c-torture/compile/pr83510.c: Same.
	* gcc.dg/loop-unswitch-2.c: Same.
	* gcc.dg/old-style-asm-1.c: Same.
	* gcc.dg/pr68317.c: Same.
	* gcc.dg/pr97567-2.c: Same.
	* gcc.dg/predict-9.c: Same.
	* gcc.dg/shrink-wrap-loop.c: Same.
	* gcc.dg/sibcall-1.c: Same.
	* gcc.dg/tree-ssa/builtin-sprintf-3.c: Same.
	* gcc.dg/tree-ssa/pr21001.c: Same.
	* gcc.dg/tree-ssa/pr21294.c: Same.
	* gcc.dg/tree-ssa/pr21417.c: Same.
	* gcc.dg/tree-ssa/pr21458-2.c: Same.
	* gcc.dg/tree-ssa/pr21563.c: Same.
	* gcc.dg/tree-ssa/pr49039.c: Same.
	* gcc.dg/tree-ssa/pr61839_1.c: Same.
	* gcc.dg/tree-ssa/pr61839_3.c: Same.
	* gcc.dg/tree-ssa/pr77445-2.c: Same.
	* gcc.dg/tree-ssa/split-path-4.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-11.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-12.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-14.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same.
	* gcc.dg/tree-ssa/ssa-fre-48.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-11.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-12.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-14.c: Same.
	* gcc.dg/tree-ssa/vrp02.c: Same.
	* gcc.dg/tree-ssa/vrp03.c: Same.
	* gcc.dg/tree-ssa/vrp05.c: Same.
	* gcc.dg/tree-ssa/vrp06.c: Same.
	* gcc.dg/tree-ssa/vrp07.c: Same.
	* gcc.dg/tree-ssa/vrp09.c: Same.
	* gcc.dg/tree-ssa/vrp19.c: Same.
	* gcc.dg/tree-ssa/vrp20.c: Same.
	* gcc.dg/tree-ssa/vrp33.c: Same.
	* gcc.dg/uninit-pred-9_b.c: Same.
	* gcc.dg/vect/bb-slp-16.c: Same.
	* gcc.target/i386/avx2-vect-aggressive.c: Same.
	* gcc.dg/tree-ssa/ranger-threader-1.c: New test.
	* gcc.dg/tree-ssa/ranger-threader-2.c: New test.
	* gcc.dg/tree-ssa/ranger-threader-3.c: New test.
	* gcc.dg/tree-ssa/ranger-threader-4.c: New test.
	* gcc.dg/tree-ssa/ranger-threader-5.c: New test.
---
 gcc/Makefile.in                               |   5 +
 gcc/flag-types.h                              |   7 +
 gcc/params.opt                                |  17 +
 .../g++.dg/debug/dwarf2/deallocator.C         |   3 +-
 gcc/testsuite/gcc.c-torture/compile/pr83510.c |  33 ++
 gcc/testsuite/gcc.dg/loop-unswitch-2.c        |   2 +-
 gcc/testsuite/gcc.dg/old-style-asm-1.c        |   5 +-
 gcc/testsuite/gcc.dg/pr68317.c                |   4 +-
 gcc/testsuite/gcc.dg/pr97567-2.c              |   2 +-
 gcc/testsuite/gcc.dg/predict-9.c              |   4 +-
 gcc/testsuite/gcc.dg/shrink-wrap-loop.c       |  53 ++
 gcc/testsuite/gcc.dg/sibcall-1.c              |  10 +
 .../gcc.dg/tree-ssa/builtin-sprintf-3.c       |  25 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21001.c       |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/pr21294.c       |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/pr21417.c       |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c     |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21563.c       |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr49039.c       |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c     |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c     |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c     |   2 +-
 .../gcc.dg/tree-ssa/ranger-threader-1.c       |  20 +
 .../gcc.dg/tree-ssa/ranger-threader-2.c       |  39 ++
 .../gcc.dg/tree-ssa/ranger-threader-3.c       |  41 ++
 .../gcc.dg/tree-ssa/ranger-threader-4.c       |  83 +++
 .../gcc.dg/tree-ssa/ranger-threader-5.c       |  80 +++
 gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c  |   4 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-11.c       |   2 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-12.c       |   2 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-14.c       |   1 +
 .../gcc.dg/tree-ssa/ssa-dom-thread-18.c       |   5 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-6.c        |   4 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-7.c        |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c    |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/vrp02.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp03.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp05.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp06.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp07.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp09.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp19.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp20.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp33.c         |   2 +-
 gcc/testsuite/gcc.dg/uninit-pred-9_b.c        |   1 +
 gcc/testsuite/gcc.dg/vect/bb-slp-16.c         |   7 +
 .../gcc.target/i386/avx2-vect-aggressive.c    |   2 +-
 gcc/tree-ssa-threadbackward.c                 | 475 +++++++++++++++++-
 gcc/tree-ssa-threadedge.c                     |  20 +-
 gcc/tree-ssa-threadedge.h                     |   3 +-
 gcc/tree-ssa-threadupdate.c                   |  12 +-
 gcc/tree-ssa-threadupdate.h                   |   2 +-
 .../libgomp.graphite/force-parallel-4.c       |   1 +
 .../libgomp.graphite/force-parallel-8.c       |   2 +
 57 files changed, 963 insertions(+), 54 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-5.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 66cc5f9529e..b166a6539f2 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -223,6 +223,11 @@ gimple-match.o-warn = -Wno-unused
 generic-match.o-warn = -Wno-unused
 dfp.o-warn = -Wno-strict-aliasing
 
+# maybe_emit_free_warning() is picking up the inlined location for the
+# warning, not the source of the original va_heap::release() function
+# which has a pragma disabling this warning.
+tree-ssa-loop-im.o-warn = -Wno-free-nonheap-object
+
 # All warnings have to be shut off in stage1 if the compiler used then
 # isn't gcc; configure determines that.  WARN_CFLAGS will be either
 # $(GCC_WARN_CFLAGS), or nothing.  Similarly, WARN_CXXFLAGS will be
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 375448ebf5f..c7d98964fc7 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -435,6 +435,13 @@ enum evrp_mode
   EVRP_MODE_RVRP_DEBUG = EVRP_MODE_RVRP_ONLY | EVRP_MODE_DEBUG
 };
 
+/* Backwards threader mode.  */
+enum threader_mode
+{
+  THREADER_MODE_LEGACY = 0,
+  THREADER_MODE_RANGER = 1
+};
+
 /* Modes of OpenACC 'kernels' constructs handling.  */
 enum openacc_kernels
 {
diff --git a/gcc/params.opt b/gcc/params.opt
index 18e6036c4f4..d7cae1b390b 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -1006,6 +1006,23 @@ Maximum depth of DFS walk used by modref escape analysis.
 Common Joined UInteger Var(param_modref_max_escape_points) Init(256) Param Optimization
 Maximum number of escape points tracked by modref per SSA-name.
 
+-param=threader-iterative=
+Common Joined UInteger Var(param_threader_iterative) Init(0) Param Optimization
+Run backwards threader in iterative mode.
+
+-param=threader-mode=
+Common Joined Var(param_threader_mode) Enum(threader_mode) Init(THREADER_MODE_RANGER) Param Optimization
+--param=threader-mode=[legacy|ranger] Specifies the mode the backwards threader should run in.
+
+Enum
+Name(threader_mode) Type(enum threader_mode) UnknownError(unknown threader mode %qs)
+
+EnumValue
+Enum(threader_mode) String(legacy) Value(THREADER_MODE_LEGACY)
+
+EnumValue
+Enum(threader_mode) String(ranger) Value(THREADER_MODE_RANGER)
+
 -param=tm-max-aggregate-size=
 Common Joined UInteger Var(param_tm_max_aggregate_size) Init(9) Param Optimization
 Size in bytes after which thread-local aggregates should be instrumented with the logging functions instead of save/restore pairs.
diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/deallocator.C b/gcc/testsuite/g++.dg/debug/dwarf2/deallocator.C
index d895e78e608..c1d387922ba 100644
--- a/gcc/testsuite/g++.dg/debug/dwarf2/deallocator.C
+++ b/gcc/testsuite/g++.dg/debug/dwarf2/deallocator.C
@@ -29,7 +29,7 @@ void foo(int i)
 	  return;
 	}
     }
-  if (i)
+  if (i) // Threader makes everything after here disappear.
     {
       t test;
       if (i == 10)
@@ -42,5 +42,4 @@ void foo(int i)
 }
 // { dg-final { scan-assembler "deallocator.C:29" } }
 // { dg-final { scan-assembler "deallocator.C:24" } }
-// { dg-final { scan-assembler "deallocator.C:34" } }
 // { dg-final { scan-assembler "deallocator.C:21" } }
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr83510.c b/gcc/testsuite/gcc.c-torture/compile/pr83510.c
index 907dd80ccd7..fc932e57f3a 100644
--- a/gcc/testsuite/gcc.c-torture/compile/pr83510.c
+++ b/gcc/testsuite/gcc.c-torture/compile/pr83510.c
@@ -3,6 +3,39 @@
    (PR tree-optimization/83510).  */
 
 /* { dg-options "-Warray-bounds" } */
+/* { dg-xfail-if "" { "*-*-*" } { "-Os" } } */
+
+
+/*  This test is XFAILed because thread1 threads a switch statement
+    such that the various cases have been split into different
+    independent blocks.  One of these blocks exposes an arr[i_27]
+    which is later propagated by VRP to be arr[10].  This is an
+    invalid access, but the array bounds code doesn't know it is an
+    unreachable path.
+
+    However, it is not until dom2 that we "know" that the value of the
+    switch index is such that the path to arr[10] is unreachable.  For
+    that matter, it is not until dom3 that we remove the unreachable
+    path.
+
+
+    See:
+    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83510
+    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83312
+
+    It's not until here that ranger "knows" that the path is
+    unreachable:
+
+    thread1
+    vrp1		<-- array bounds checking
+    dce2
+    stdarg
+    cdce
+    cselim
+    copyprop
+    ifcombine
+    mergephi3		<-- too late
+*/
 
 extern int get_flag (void);
 
diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-2.c b/gcc/testsuite/gcc.dg/loop-unswitch-2.c
index f8d314e34de..0931f6e71c3 100644
--- a/gcc/testsuite/gcc.dg/loop-unswitch-2.c
+++ b/gcc/testsuite/gcc.dg/loop-unswitch-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details" } */
+/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details -fdisable-tree-thread2 -fdisable-tree-thread3" } */
 
 void foo (float **a, float **b, float *c, int n, int m, int l)
 {
diff --git a/gcc/testsuite/gcc.dg/old-style-asm-1.c b/gcc/testsuite/gcc.dg/old-style-asm-1.c
index 8af007795a7..f9406ff0a26 100644
--- a/gcc/testsuite/gcc.dg/old-style-asm-1.c
+++ b/gcc/testsuite/gcc.dg/old-style-asm-1.c
@@ -1,6 +1,9 @@
 /* PR inline-asm/8832 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -dP" } */
+/* { dg-options "-O2 -dP -fdisable-tree-ethread -fdisable-tree-thread1 -fdisable-tree-thread2 -fdisable-tree-thread3 -fdisable-tree-thread4" } */
+
+/* Note: Threader will duplicate BBs and replace one conditional branch by an
+   unconditional one.  */
 
 /* Verify that GCC doesn't optimize
    old style asm instructions.  */
diff --git a/gcc/testsuite/gcc.dg/pr68317.c b/gcc/testsuite/gcc.dg/pr68317.c
index 891d12954ba..bd053a7522b 100644
--- a/gcc/testsuite/gcc.dg/pr68317.c
+++ b/gcc/testsuite/gcc.dg/pr68317.c
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -fdisable-tree-ethread" } */
+
+/* Note: Threader will collapse loop.  */
 
 typedef int int32_t __attribute__((mode (__SI__)));
 
diff --git a/gcc/testsuite/gcc.dg/pr97567-2.c b/gcc/testsuite/gcc.dg/pr97567-2.c
index dee31c6dc01..c3ead54eaa8 100644
--- a/gcc/testsuite/gcc.dg/pr97567-2.c
+++ b/gcc/testsuite/gcc.dg/pr97567-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile} */
-/* { dg-options "-O2 -fdump-tree-evrp" } */
+/* { dg-options "-O2 -fdump-tree-evrp -fdisable-tree-ethread" } */
 
 char a[2];
 
diff --git a/gcc/testsuite/gcc.dg/predict-9.c b/gcc/testsuite/gcc.dg/predict-9.c
index f491c511bd9..cb68a218a93 100644
--- a/gcc/testsuite/gcc.dg/predict-9.c
+++ b/gcc/testsuite/gcc.dg/predict-9.c
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate -fno-finite-loops" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate -fno-finite-loops -fdisable-tree-ethread" } */
+
+/* Note: Threader causes removal of for loop.  */
 
 extern int global;
 extern int global2;
diff --git a/gcc/testsuite/gcc.dg/shrink-wrap-loop.c b/gcc/testsuite/gcc.dg/shrink-wrap-loop.c
index 52dfc2790ed..ba872fa23f6 100644
--- a/gcc/testsuite/gcc.dg/shrink-wrap-loop.c
+++ b/gcc/testsuite/gcc.dg/shrink-wrap-loop.c
@@ -1,5 +1,58 @@
 /* { dg-do compile { target { { { i?86-*-* x86_64-*-* } && lp64 } || { arm_thumb2 } } } } */
 /* { dg-options "-O2 -fdump-rtl-pro_and_epilogue"  } */
+// { dg-additional-options "-fdisable-tree-ethread" }
+
+/*
+Our new threader is threading things a bit too early, and causing the
+testcase in gcc.dg/shrink-wrap-loop.c to fail.
+
+  The gist is this BB inside a loop:
+
+  <bb 6> :
+  # p_2 = PHI <p2_6(D)(2), p_12(5)>
+  if (p_2 != 0B)
+    goto <bb 3>; [INV]
+  else
+    goto <bb 7>; [INV]
+
+Our threader can move this check outside of the loop (good).  This is
+done before branch probabilities are calculated and causes the probs
+to be calculated as:
+
+<bb 2> [local count: 216361238]:
+  if (p2_6(D) != 0B)
+    goto <bb 7>; [54.59%]
+  else
+    goto <bb 6>; [45.41%]
+
+Logically this seems correct to me.  A simple check outside of a loop
+should slightly but not overwhelmingly favor a non-zero value.
+
+Interestingly however, the old threader couldn't get this, but the IL
+ended up identical, albeit with different probabilities.  What happens
+is that, because the old code could not thread this, the p2 != 0 check
+would remain inside the loop and probs would be calculated thusly:
+
+  <bb 6> [local count: 1073741824]:
+  # p_2 = PHI <p2_6(D)(2), p_12(5)>
+  if (p_2 != 0B)
+    goto <bb 3>; [94.50%]
+  else
+    goto <bb 7>; [5.50%]
+
+Then when the loop header copying pass ("ch") shuffled things around,
+the IL would end up identical to my early threader code, but the
+probabilities would remain as 94.5/5.5.
+
+The above discrepancy causes the RTL ifcvt pass to generate different
+code, and by the time we get to the shrink wrapping pass, things look
+sufficiently different such that the legacy code can actually shrink
+wrap, whereas our new code does not.
+
+IMO, if the loop-ch pass moves conditionals outside of a loop, the
+probabilities should be adjusted, but that does mean the shrink wrap
+won't happen for this contrived testcase.
+ */
 
 int foo (int *p1, int *p2);
 
diff --git a/gcc/testsuite/gcc.dg/sibcall-1.c b/gcc/testsuite/gcc.dg/sibcall-1.c
index e8a95513d9e..367ee4374e1 100644
--- a/gcc/testsuite/gcc.dg/sibcall-1.c
+++ b/gcc/testsuite/gcc.dg/sibcall-1.c
@@ -7,6 +7,9 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -foptimize-sibling-calls" } */
 
+/* See note in recurser_void() as to why we disable threading.  */
+/* { dg-additional-options "-fdisable-tree-thread1" } */
+
 /* The option -foptimize-sibling-calls is the default, but serves as
    marker.  Self-recursion tail calls are optimized for all targets,
    regardless of presence of sibcall patterns.  */
@@ -26,6 +29,13 @@ int main ()
 void
 recurser_void (int n)
 {
+  /* In some architectures like ppc64*, jump threading may thread
+     paths such that there are two calls into track(), one for
+     track(0) and one for track(7).  The track(7) call can be
+     transformed into a jump instead of a call, which means that
+     different calls into track() may end up with a different
+     &stackpos.  This is the reason we disable jump threading for this
+     test.  */
   if (n == 0 || n == 7)
     track (n);
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c
index fae2a1b73ea..ec55f267318 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c
@@ -15,7 +15,7 @@ extern void string_lt_0_fail ();
 extern void string_eq_0_fail ();
 extern void string_gt_0_fail ();
 
-void test_string (char *d, const char *s)
+void test_string_eq_min (char *d, const char *s)
 {
   int n = __builtin_sprintf (d, "%-s", s);
 
@@ -23,13 +23,36 @@ void test_string (char *d, const char *s)
      or INT_MAX.  (This is a white box test based on knowing that
      the optimization computes its own values of the two constants.)  */
   if (n == INT_MIN) string_eq_min_fail ();
+}
+
+void test_string_eq_max (char *d, const char *s)
+{
+  int n = __builtin_sprintf (d, "%-s", s);
+
   if (n == INT_MAX) string_eq_max_fail ();
+}
+
+void test_string_lt_0 (char *d, const char *s)
+{
+  int n = __builtin_sprintf (d, "%-s", s);
 
   /* The return value could be negative when strlen(s) is in excess
      of 4095 (the maximum number of bytes a single directive is required
      to handle).  */
   if (n < 0) string_lt_0_fail ();
+}
+
+void test_string_eq_0 (char *d, const char *s)
+{
+  int n = __builtin_sprintf (d, "%-s", s);
+
   if (n == 0) string_eq_0_fail ();
+}
+
+void test_string_gt_0 (char *d, const char *s)
+{
+  int n = __builtin_sprintf (d, "%-s", s);
+
   if (n > 0) string_gt_0_fail ();
 }
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21001.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21001.c
index 719360a015f..4ea5f21addf 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21001.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21001.c
@@ -6,6 +6,7 @@
 
 /* { dg-do compile } */
 /* { dg-options "-O2 -fno-tree-dominator-opts -fno-tree-fre -fdisable-tree-evrp -fdump-tree-vrp1-details" } */
+/* { dg-additional-options "-fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 int
 foo (int a)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21294.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21294.c
index cc7d4cdf338..b9edabc6234 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21294.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21294.c
@@ -5,6 +5,7 @@
 
 /* { dg-do compile } */
 /* { dg-options "-O2 -fno-tree-dominator-opts -fdisable-tree-evrp -fdump-tree-vrp1-details" } */
+/* { dg-additional-options "-fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 struct f {
   int i;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c
index 484511978a1..fc14af4e662 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread4-details" } */
+/* { dg-options "-O2 -fdisable-tree-thread3 -fdump-tree-thread4-details" } */
 
 struct tree_common 
 { 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c
index 2aee42f4c05..f8d7353fc0e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-evrp-details" } */
+/* { dg-options "-O2 -fdump-tree-evrp-details -fdisable-tree-ethread" } */
 
 extern void g (void);
 extern void bar (int);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21563.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21563.c
index 9c67a3acb46..72dce83ce37 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21563.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21563.c
@@ -2,7 +2,7 @@
    Make sure VRP folds the second "if" statement.  */
 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-dominator-opts -fdisable-tree-evrp -fdump-tree-vrp1-details" } */
+/* { dg-options "-O2 -fno-tree-dominator-opts -fdisable-tree-evrp -fdump-tree-vrp1-details -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 int
 foo (int a)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr49039.c b/gcc/testsuite/gcc.dg/tree-ssa/pr49039.c
index 4bc0a8175a0..a2044d012cc 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr49039.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr49039.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/49039 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 extern void bar (void);
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c
index d44c7dc1882..ddc53fbfbcc 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/61839.  */
 /* { dg-do run } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fdisable-tree-evrp -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fdisable-tree-evrp -fdump-tree-optimized -fdisable-tree-ethread -fdisable-tree-thread1" } */
 /* { dg-require-effective-target int32plus } */
 
 __attribute__ ((noinline))
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c
index 5ceb0738bde..cc322d6e703 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/61839.  */
 /* { dg-do run } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fdump-tree-optimized -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 __attribute__ ((noinline))
 int foo (int a, unsigned b)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c
index cf74e156109..f9fc212f49e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c
@@ -124,7 +124,7 @@ enum STATES FMS( u8 **in , u32 *transitions) {
    to change decisions in switch expansion which in turn can expose new
    jump threading opportunities.  Skip the later tests on aarch64.  */
 /* { dg-final { scan-tree-dump "Jumps threaded: 1\[1-9\]" "thread1" } } */
-/* { dg-final { scan-tree-dump-times "Invalid sum" 3 "thread1" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum" 4 "thread1" } } */
 /* { dg-final { scan-tree-dump-not "optimizing for size" "thread1" } } */
 /* { dg-final { scan-tree-dump-not "optimizing for size" "thread2" } } */
 /* { dg-final { scan-tree-dump-not "optimizing for size" "thread3" { target { ! aarch64*-*-* } } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c
new file mode 100644
index 00000000000..c3ccb5d5b7e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-thread1-details --param logical-op-non-short-circuit=1" } */
+
+// Copied from ssa-dom-thread-11.c
+
+static int *bb_ticks;
+extern void frob (void);
+void
+mark_target_live_regs (int b, int block, int bb_tick)
+{
+  if (b == block && b != -1 && bb_tick == bb_ticks[b])
+      return;
+  if (b != -1)
+    frob ();
+}
+
+/* When the first two conditionals in the first IF are true, but
+   the third conditional is false, then there's a jump threading
+   opportunity to bypass the second IF statement.  */
+/* { dg-final { scan-tree-dump-times "Registering.*jump thread" 1 "thread1"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c
new file mode 100644
index 00000000000..d2689b6c788
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */ 
+/* { dg-options "-O2 -fdump-tree-thread2-details -w" } */
+
+// Copied from ssa-dom-thread-12.c.
+
+typedef long unsigned int size_t;
+union tree_node;
+typedef union tree_node *tree;
+typedef union gimple_statement_d *gimple;
+typedef const union gimple_statement_d *const_gimple;
+union gimple_statement_d
+{
+  unsigned num_ops;
+  tree exp;
+};
+
+unsigned int x;
+static inline tree
+gimple_op (const_gimple gs, unsigned i)
+{
+  if (!(i < gs->num_ops))
+    abort ();
+  return gs->exp;
+}
+
+unsigned char
+scan_function (gimple stmt)
+{
+  unsigned i;
+  for (i = 0; i < stmt->num_ops - 3 ; i++)
+    gimple_call_arg (stmt, i);
+  gimple_op (stmt, 1);
+}
+
+/* The test which bypasses the loop is simplified prior to DOM to check
+   that stmt->num_ops - 3 != 0.  When that test is false, we can derive
+   a value for stmt->num_ops.  That in turn allows us to thread the jump
+   for the conditional at the start of the call to gimple_op.  */
+/* { dg-final { scan-tree-dump-times "Registering.*jump thread" 1 "thread2"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c
new file mode 100644
index 00000000000..79ec067e68a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ethread-details -w --param logical-op-non-short-circuit=1" } */
+
+// Copied from ssa-dom-thread-14.c
+
+enum optab_methods
+{
+  OPTAB_DIRECT,
+  OPTAB_LIB,
+  OPTAB_WIDEN,
+  OPTAB_LIB_WIDEN,
+  OPTAB_MUST_WIDEN
+};
+struct optab_d { };
+typedef struct optab_d *optab;
+void
+expand_shift_1 (int code, int unsignedp, int rotate,
+		optab lshift_optab, optab rshift_arith_optab)
+{
+  int left = (code == 42 || code == 0xde);
+  int attempt;
+  enum optab_methods methods;
+  if (attempt == 0)
+    methods = OPTAB_DIRECT;
+  else if (attempt == 1)
+    methods = OPTAB_WIDEN;
+  if ((!unsignedp || (!left && methods == OPTAB_WIDEN)))
+    {
+      enum optab_methods methods1 = methods;
+      if (unsignedp)
+	methods1 = OPTAB_MUST_WIDEN;
+      expand_binop (left ? lshift_optab : rshift_arith_optab,
+			   unsignedp, methods1);
+    }
+}
+
+/* When UNSIGNEDP is true, LEFT is false and METHOD == OPTAB_WIDEN
+   we will enter the TRUE arm of the conditional and we can thread
+   the test to compute the first argument of the expand_binop
+   call if we look backwards through the boolean logicals.  */
+/* { dg-final { scan-tree-dump-times "Registering.*jump thread" 1 "ethread"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c
new file mode 100644
index 00000000000..e8d1cfc03ac
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c
@@ -0,0 +1,83 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -fdump-tree-vrp-details -fdump-tree-thread1-details --param logical-op-non-short-circuit=1" }  */
+/* { dg-final { scan-tree-dump-times "Registering FSM jump" 8 "thread1" } }  */
+
+/* Copied from ssa-thread-14.  */
+
+void foo (void);
+void bar (void);
+void blah (void);
+
+/* One jump threaded here.  */
+
+void
+baz_1 (int a, int b, int c)
+{
+  if (a && b)
+    foo ();
+  if (!b && c)
+    bar ();
+}
+
+/* One jump threaded here.  */
+
+void
+baz_2 (int a, int b, int c)
+{
+  if (a && b)
+    foo ();
+  if (b || c)
+    bar ();
+}
+
+/* One jump threaded here.  */
+
+void
+baz_3 (int a, int b, int c)
+{
+  if (a && b > 10)
+    foo ();
+  if (b < 5 && c)
+    bar ();
+}
+
+/* Two jumps threaded here.  */
+
+void
+baz_4 (int a, int b, int c)
+{
+  if (a && b)
+    {
+      foo ();
+      if (c)
+        bar ();
+    }
+  if (b && c)
+    blah ();
+}
+
+/* Two jumps threaded here.  */
+
+void
+baz_5 (int a, int b, int c)
+{
+  if (a && b)
+    {
+      foo ();
+      if (c)
+        bar ();
+    }
+  if (!b || !c)
+    blah ();
+}
+
+/* One jump threaded here.  */
+
+void
+baz_6 (int a, int b, int c)
+{
+  if (a == 39 && b == 41)
+    foo ();
+  if (c == 12 || b == 41)
+    bar ();
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-5.c b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-5.c
new file mode 100644
index 00000000000..b7ca99a0152
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-5.c
@@ -0,0 +1,80 @@
+// { dg-do compile }
+// { dg-options "-fgimple -O2 -fdump-tree-thread1-details" }
+
+/* This tests that we can thread BB4->BB999 coming in through the
+   following path:
+
+      latch   many insns
+        |         |
+        V         V
+   6 -> 7 -> 3 -> 4 -> 999
+
+   The ranger based threader cannot thread this because BB4 has too
+   many instructions so it gives up looking back.  However, if we were
+   able to look further, we would notice that a profitable path
+   passing through the loop latch (BB7) exists.
+
+   That is, 3->4->N in isolation is not profitable, but 6->7->3->4->N is.
+
+   It is not clear whether handling this case in the backwards
+   threader is profitable, as it would increase the search space
+   considerably.  The test is being added to note a regression from
+   the old backward threader code.
+
+   This test has been distilled from libphobos/src/std/net/isemail.d.
+
+   The ranger threader stops at the 3->4 subpath with: "did not thread
+   around loop and would copy too many statements".  */
+
+
+extern void bar();
+extern int random();
+
+int __GIMPLE (ssa,startwith("thread1"))
+foo (int key)
+{
+  int context;
+  int _1454;
+
+ __BB(2):
+  goto __BB3;
+
+  // Loop header.
+ __BB(3):
+  context_448 = __PHI (__BB2: 0, __BB7: context_450);
+  if (key_5(D) > 0)
+    goto __BB999;
+  else
+    goto __BB4;
+
+ __BB(4):
+  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();
+  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();
+  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();
+  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();
+  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();
+  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();
+  switch (context_448) {default: L5; case 0: L999; }
+
+ __BB(5):
+ L5:
+  goto __BB6;
+
+ __BB(6):
+  context_450 = __PHI (__BB5: 0);
+  _1454 = random ();
+  if (_1454 > 0)
+    goto __BB999;
+  else
+    goto __BB7;
+
+  // Loop latch.
+ __BB(7):
+  goto __BB3;
+
+ __BB(999):
+ L999:
+  return 5;
+}
+
+// { dg-final { scan-tree-dump-times "Registering.*jump thread.*incoming edge;  \\(6, 7\\)  \\(7, 3\\)  \\(3, 4\\)  \\(4, 999\\) nocopy" 1 "thread1" { xfail *-*-* } } }
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c
index dac931c18d0..8ef7646282c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details -w" } */
+/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details -w -fdisable-tree-thread1 -fdisable-tree-thread2" } */
+
+/* Note: Threader causes the infinite loop in val & 1 sooner.  */
 
 powi_cost (long n)
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-11.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-11.c
index 5f90613263d..856ab389439 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-11.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-11.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-dom2-details --param logical-op-non-short-circuit=1" } */
+/* { dg-options "-O2 -fdump-tree-dom2-details --param logical-op-non-short-circuit=1 -fdisable-tree-thread1 -fdisable-tree-thread2" } */
 
 static int *bb_ticks;
 extern void frob (void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-12.c
index 63bd12a06a4..bad5e0a782d 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-12.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-12.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */ 
-/* { dg-options "-O2 -fdump-tree-dom2-details -w" } */
+/* { dg-options "-O2 -fdump-tree-dom2-details -w -fdisable-tree-thread2" } */
 typedef long unsigned int size_t;
 union tree_node;
 typedef union tree_node *tree;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-14.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-14.c
index 4e6a911506e..3bc4b3795cb 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-14.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-14.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-dom2-details -w --param logical-op-non-short-circuit=1" } */
+/* { dg-additional-options "-fdisable-tree-thread1 -fdisable-tree-ethread -fdisable-tree-thread2" } */
 
 enum optab_methods
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-18.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-18.c
index d4759b8903b..03872e7a02f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-18.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-18.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */ 
-/* { dg-options "-O2 -fdump-tree-vrp1-details -fdump-tree-dom2-details -std=gnu89 --param logical-op-non-short-circuit=0" } */
+/* { dg-options "-O2 -fdump-tree-vrp1-details -fdump-tree-thread1-details -std=gnu89 --param logical-op-non-short-circuit=0" } */
 
 #include "ssa-dom-thread-4.c"
 
@@ -21,4 +21,5 @@
 	 condition.
 
    All the cases are picked up by VRP1 as jump threads.  */
-/* { dg-final { scan-tree-dump-times "Threaded" 4 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Registering FSM jump" 6 "thread1" } } */
+/* { dg-final { scan-tree-dump-times "Threaded" 2 "vrp1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
index 16a9ef4e28a..c7bf867b084 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
@@ -34,8 +34,8 @@
      SWITCH_BB -> BBx -> BBy -> BBz -> PHI
 
    We now know the value of the switch index at PHI.  */
-/* { dg-final { scan-tree-dump-times "FSM" 6 "thread1" } } */
-/* { dg-final { scan-tree-dump-times "FSM" 1 "thread2" } } */
+/* { dg-final { scan-tree-dump-times "Registering FSM jump" 6 "thread1" } } */
+/* { dg-final { scan-tree-dump-times "Registering FSM jump" 1 "thread2" } } */
 
 int sum0, sum1, sum2, sum3;
 int foo (char *s, char **ret)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
index bad5bc1d003..1c2d12aa9ea 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-thread2-stats -fdump-tree-dom2-stats -fdump-tree-thread3-stats -fdump-tree-dom3-stats -fdump-tree-vrp2-stats -fno-guess-branch-probability" } */
+/* { dg-additional-options "--param=threader-mode=legacy" } */
 
 /* Here we have the same issue as was commented in ssa-dom-thread-6.c.
    The PHI coming into the threader has a lot more constants, so the
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c
index b3d610204da..5e74c78e7b0 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-fre1-details" } */
+/* { dg-options "-O -fdump-tree-fre1-details -fdisable-tree-ethread" } */
 
 int foo (int i)
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c
index 67e1e89ecd3..672a54e07db 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-vrp2-details --param logical-op-non-short-circuit=1" } */
+/* { dg-additional-options "-fdisable-tree-ethread -fdisable-tree-thread1 -fdisable-tree-thread2" } */
 /* { dg-final { scan-tree-dump-not "IRREDUCIBLE_LOOP" "vrp2" } } */
 
 void abort (void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
index fb9840e95d5..8f554641b28 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
@@ -2,7 +2,7 @@
 /* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details -fno-finite-loops --param early-inlining-insns=14 -fno-inline-functions" } */
 /* { dg-final { scan-tree-dump "FSM" "thread2" } } */
 /* { dg-final { scan-tree-dump "FSM" "thread3" } } */
-/* { dg-final { scan-tree-dump "FSM" "thread4" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "FSM" "thread4" } } */
 
 typedef struct bitmap_head_def *bitmap;
 typedef const struct bitmap_head_def *const_bitmap;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c
index 38661c88822..f9152b9358f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-additional-options "-O2 -fdump-tree-vrp-details --param logical-op-non-short-circuit=1" }  */
+/* { dg-additional-options "-fdisable-tree-thread1" } */
 /* { dg-final { scan-tree-dump-times "Threaded jump" 8 "vrp1" } }  */
 
 void foo (void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp02.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp02.c
index 4be538f5944..2285c55c7d2 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp02.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp02.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fdelete-null-pointer-checks -fdisable-tree-evrp" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fdelete-null-pointer-checks -fdisable-tree-evrp -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 struct A
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c
index bafb65a53d6..1d7ea4e8ffb 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 struct A
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp05.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp05.c
index 8c611e91fcd..c17cd1b5738 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp05.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp05.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fno-early-inlining" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fno-early-inlining -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 
 inline int ten()
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp06.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp06.c
index a872bc43731..acb03c29aa4 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp06.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp06.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 int baz (void);
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp07.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp07.c
index 0f3f280e4e1..31a541522f5 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp07.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp07.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-fre -fdisable-tree-evrp -fdump-tree-vrp1-details -fdelete-null-pointer-checks" } */
+/* { dg-options "-O2 -fno-tree-fre -fdisable-tree-evrp -fdump-tree-vrp1-details -fdelete-null-pointer-checks -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 int
 foo (int i, int *p)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp09.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp09.c
index 56cc50c5f1d..fad0051cb09 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp09.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp09.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-fre -fdisable-tree-evrp -fdump-tree-vrp1 -std=gnu89" } */
+/* { dg-options "-O2 -fno-tree-fre -fdisable-tree-evrp -fdump-tree-vrp1 -std=gnu89 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 foo (int *p)
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c
index 40373fde163..98a8da6d05e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fwrapv -O1 -ftree-vrp -fdisable-tree-evrp -fdump-tree-vrp1" } */
+/* { dg-options "-fwrapv -O1 -ftree-vrp -fdisable-tree-evrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 #include <limits.h>
 extern void abort ();
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp20.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp20.c
index 4a3b0d73648..f9df67ffb05 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp20.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp20.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fwrapv -O1 -fno-tree-fre -fdisable-tree-evrp -ftree-vrp -fdump-tree-vrp1" } */
+/* { dg-options "-fwrapv -O1 -fno-tree-fre -fdisable-tree-evrp -ftree-vrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 extern void abort ();
 extern void exit (int);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp33.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp33.c
index f1d3863943e..88833eb701e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp33.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp33.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fno-tree-fre -fdisable-tree-evrp" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fno-tree-fre -fdisable-tree-evrp -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 /* This is from PR14052.  */
 
diff --git a/gcc/testsuite/gcc.dg/uninit-pred-9_b.c b/gcc/testsuite/gcc.dg/uninit-pred-9_b.c
index d9ae75e0765..d46d6659a61 100644
--- a/gcc/testsuite/gcc.dg/uninit-pred-9_b.c
+++ b/gcc/testsuite/gcc.dg/uninit-pred-9_b.c
@@ -1,6 +1,7 @@
 
 /* { dg-do compile } */
 /* { dg-options "-Wuninitialized -O2" } */
+/* { dg-xfail-if "threading shuffles things around" { ppc64*-*-* } } */
 
 int g;
 void bar();
diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-16.c b/gcc/testsuite/gcc.dg/vect/bb-slp-16.c
index e68a9b62535..664e93e9b60 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-16.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-16.c
@@ -1,5 +1,8 @@
 /* { dg-require-effective-target vect_int } */
 
+/* See note below as to why we disable threading.  */
+/* { dg-additional-options "-fdisable-tree-thread1" } */
+
 #include <stdarg.h>
 #include "tree-vect.h"
 
@@ -27,6 +30,10 @@ main1 (int dummy)
       *pout++ = *pin++ + a;
       *pout++ = *pin++ + a;
       *pout++ = *pin++ + a;
+      /* On some architectures like ppc64, jump threading may thread
+	 the iteration where i==0 such that we no longer optimize the
+	 BB.  An alternative to disabling jump threading would be
+	 to wrap the read from `i' into a function returning i.  */
       if (arr[i] = i)
         a = i;
       else
diff --git a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
index 1ea111795a4..57192791857 100644
--- a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
+++ b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
@@ -1,6 +1,6 @@
 /* { dg-do run } */
 /* { dg-require-effective-target avx2 } */
-/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details" } */
+/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details -fdisable-tree-thread1" } */
 
 #include "avx2-check.h"
 #define N 64
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 7dd8594e3d4..09cb07c2b2b 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -36,6 +36,12 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-phinodes.h"
 #include "tree-inline.h"
 #include "tree-vectorizer.h"
+#include "value-range.h"
+#include "gimple-range.h"
+#include "tree-ssa-threadedge.h"
+#include "tree-ssa-path-solver.h"
+#include "ssa.h"
+#include "tree-cfgcleanup.h"
 
 // Path registry for the backwards threader.  After all paths have been
 // registered with register_path(), thread_through_all_blocks() is called
@@ -71,13 +77,414 @@ private:
   const bool m_speed_p;
 };
 
+// Ranger based backwards threader.
+
+class back_threader
+{
+  // Temporary until we remove old code.
+  friend bool path_is_unreachable_p (const vec<jump_thread_edge *> &);
+
+public:
+  back_threader (back_threader_profitability &, back_threader_registry &);
+  ~back_threader ();
+  void find_paths (basic_block bb, tree name);
+
+private:
+  void maybe_register_path (edge taken_edge);
+  bool find_paths_to_names (basic_block bb, bitmap imports);
+  bool resolve_def (tree name, bitmap interesting, vec<tree> worklist);
+  bool resolve_phi (gphi *phi, bitmap imports);
+  edge find_taken_edge (const vec<basic_block> &path);
+  edge find_taken_edge_cond (const vec<basic_block> &path, gcond *);
+  edge find_taken_edge_switch (const vec<basic_block> &path, gswitch *);
+
+  back_threader_registry &m_registry;
+  back_threader_profitability &m_profit;
+  gimple_ranger m_ranger;
+  path_solver m_solver;
+
+  // Current path being analyzed.
+  auto_vec<basic_block> m_path;
+  // Hash to mark visited BBs while analyzing a path.
+  hash_set<basic_block> m_visited_bbs;
+  // The set of SSA names, any of which could potentially change the
+  // value of the final conditional in a path.
+  bitmap m_imports;
+  // The last statement in the path.
+  gimple *m_last_stmt;
+  // This is a bit of a wart.  It's used to pass the LHS SSA name to
+  // the profitability engine.
+  tree m_name;
+  // Marker to differentiate unreachable edges.
+  static const edge UNREACHABLE_EDGE;
+};
+
+// Used to differentiate unreachable edges, so we may stop the search
+// in the given direction.
+const edge back_threader::UNREACHABLE_EDGE = (edge) -1;
+
+back_threader::back_threader (back_threader_profitability &profit,
+			      back_threader_registry &registry)
+  : m_registry (registry),
+    m_profit (profit),
+    m_solver (m_ranger)
+{
+  m_last_stmt = NULL;
+  m_imports = BITMAP_ALLOC (NULL);
+}
+
+back_threader::~back_threader ()
+{
+  m_path.release ();
+  BITMAP_FREE (m_imports);
+}
+
+// Register the current path for jump threading if it's profitable to
+// do so.  TAKEN_EDGE is the known edge out of the path.
+
+void
+back_threader::maybe_register_path (edge taken_edge)
+{
+  bool irreducible = false;
+  bool profitable
+    = m_profit.profitable_path_p (m_path, m_name, taken_edge, &irreducible);
+
+  if (profitable)
+    {
+      m_registry.register_path (m_path, taken_edge);
+
+      if (irreducible)
+	vect_free_loop_info_assumptions (m_path[0]->loop_father);
+    }
+}
+
+// Return the known taken edge out of a path.  If the path can be
+// determined to be unreachable, return UNREACHABLE_EDGE.  If no
+// outgoing edge can be calculated, return NULL.
+
+edge
+back_threader::find_taken_edge (const vec<basic_block> &path)
+{
+  gcc_checking_assert (path.length () > 1);
+  switch (gimple_code (m_last_stmt))
+    {
+    case GIMPLE_COND:
+      return find_taken_edge_cond (path, as_a<gcond *> (m_last_stmt));
+
+    case GIMPLE_SWITCH:
+      return find_taken_edge_switch (path, as_a<gswitch *> (m_last_stmt));
+
+    default:
+      return NULL;
+    }
+}
+
+// Same as find_taken_edge, but for paths ending in a switch.
+
+edge
+back_threader::find_taken_edge_switch (const vec<basic_block> &path,
+				       gswitch *sw)
+{
+  tree name = gimple_switch_index (sw);
+  int_range_max r;
+
+  m_solver.precompute_ranges (&path, m_imports);
+  m_solver.range_in_path (r, name);
+
+  if (r.undefined_p ())
+    return UNREACHABLE_EDGE;
+
+  if (r.varying_p ())
+    return NULL;
+
+  tree val;
+  if (r.singleton_p (&val))
+    return ::find_taken_edge (gimple_bb (sw), val);
+
+  return NULL;
+}
+
+// Same as find_taken_edge, but for paths ending in a GIMPLE_COND.
+
+edge
+back_threader::find_taken_edge_cond (const vec<basic_block> &path,
+				     gcond *cond)
+{
+  m_solver.precompute_ranges (&path, m_imports);
+
+  // Check if either operand is unreachable since this knowledge could
+  // help the caller cut down the search space.
+  int_range_max r;
+  m_solver.range_in_path (r, gimple_cond_lhs (cond));
+  if (r.undefined_p ())
+    return UNREACHABLE_EDGE;
+  m_solver.range_in_path (r, gimple_cond_rhs (cond));
+  if (r.undefined_p ())
+    return UNREACHABLE_EDGE;
+
+  m_solver.range_in_path (r, cond);
+
+  int_range<2> true_range (boolean_true_node, boolean_true_node);
+  int_range<2> false_range (boolean_false_node, boolean_false_node);
+
+  if (r == true_range || r == false_range)
+    {
+      edge e_true, e_false;
+      basic_block bb = gimple_bb (cond);
+      extract_true_false_edges_from_block (bb, &e_true, &e_false);
+      return r == true_range ? e_true : e_false;
+    }
+  return NULL;
+}
+
+// Populate a vector of trees from a bitmap.
+
+static inline void
+populate_worklist (vec<tree> worklist, bitmap bits)
+{
+  bitmap_iterator bi;
+  unsigned i;
+
+  EXECUTE_IF_SET_IN_BITMAP (bits, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+      worklist.quick_push (name);
+    }
+}
+
+// If any of the incoming edges for a PHI resolves the current path,
+// register the path(s), and return TRUE.
+
+bool
+back_threader::resolve_phi (gphi *phi, bitmap interesting)
+{
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_phi_result (phi)))
+    return true;
+
+  bool done = false;
+  for (size_t i = 0; i < gimple_phi_num_args (phi); ++i)
+    {
+      edge e = gimple_phi_arg_edge (phi, i);
+
+      // This is like path_crosses_loops in profitable_path_p but more
+      // restrictive, since profitable_path_p allows threading the
+      // first block because it would be redirected anyhow.
+      //
+      // If we loosened the restriction and used profitable_path_p()
+      // here instead, we would peel off the first iterations of loops
+      // in places like tree-ssa/pr14341.c.
+      bool profitable_p = m_path[0]->loop_father == e->src->loop_father;
+      if (!profitable_p)
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file,
+		     "  FAIL: path through PHI in bb%d (incoming bb:%d) crosses loop\n",
+		     e->dest->index, e->src->index);
+	  continue;
+	}
+
+      tree arg = gimple_phi_arg_def (phi, i);
+      if (TREE_CODE (arg) == SSA_NAME)
+	{
+	  unsigned v = SSA_NAME_VERSION (arg);
+
+	  // Avoid loops as in: x_5 = PHI <x_5(2), ...>.
+	  if (bitmap_bit_p (interesting, v))
+	    continue;
+
+	  bitmap_set_bit (interesting, v);
+	  bitmap_set_bit (m_imports, v);
+	  done |= find_paths_to_names (e->src, interesting);
+	  bitmap_clear_bit (interesting, v);
+	}
+      else if (TREE_CODE (arg) == INTEGER_CST)
+	{
+	  m_path.safe_push (e->src);
+	  edge taken_edge = find_taken_edge (m_path);
+	  if (taken_edge && taken_edge != UNREACHABLE_EDGE)
+	    {
+	      maybe_register_path (taken_edge);
+	      done = true;
+	    }
+	  m_path.pop ();
+	}
+    }
+  return done;
+}
+
+// If the definition of NAME resolves the current path, register the
+// path, and return TRUE.
+
+bool
+back_threader::resolve_def (tree name, bitmap interesting, vec<tree> worklist)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+
+  // Handle PHIs.
+  if (is_a<gphi *> (def_stmt)
+      && resolve_phi (as_a<gphi *> (def_stmt), interesting))
+    return true;
+
+  // Defer copies of SSAs by adding the source to the worklist.
+  if (gimple_assign_single_p (def_stmt)
+      && TREE_CODE (gimple_assign_rhs1 (def_stmt)) == SSA_NAME)
+    {
+      tree rhs = gimple_assign_rhs1 (def_stmt);
+      bitmap_set_bit (m_imports, SSA_NAME_VERSION (rhs));
+      bitmap_set_bit (interesting, SSA_NAME_VERSION (rhs));
+      worklist.safe_push (rhs);
+    }
+  return false;
+}
+
+// Find jump threading paths to any of the SSA names in the
+// INTERESTING bitmap, and register any such paths.
+//
+// Return TRUE if no further processing past this block is necessary.
+// This is because we've either registered a path, or because there is
+// nothing of interest beyond this block.
+//
+// BB is the current block being processed.
+
+bool
+back_threader::find_paths_to_names (basic_block bb, bitmap interesting)
+{
+  if (m_visited_bbs.add (bb))
+    return true;
+
+  m_path.safe_push (bb);
+
+  if (m_path.length () > 1
+      && !m_profit.profitable_path_p (m_path, m_name, NULL))
+    {
+      m_path.pop ();
+      m_visited_bbs.remove (bb);
+      return false;
+    }
+
+  auto_bitmap processed;
+  unsigned i;
+  bool done = false;
+
+  // We use a worklist instead of iterating through the bitmap,
+  // because we may add new items in-flight.
+  auto_vec<tree> worklist (bitmap_count_bits (interesting));
+  populate_worklist (worklist, interesting);
+  while (!worklist.is_empty ())
+    {
+      tree name = worklist.pop ();
+      unsigned i = SSA_NAME_VERSION (name);
+      basic_block def_bb = gimple_bb (SSA_NAME_DEF_STMT (name));
+
+      // Process any names defined in this block.
+      if (def_bb == bb)
+	{
+	  bitmap_set_bit (processed, i);
+
+	  if (resolve_def (name, interesting, worklist))
+	    {
+	      done = true;
+	      goto leave_bb;
+	    }
+	}
+      // Examine blocks that define or export an interesting SSA,
+      // since they may compute a range which resolves this path.
+      if ((def_bb == bb
+	   || bitmap_bit_p (m_ranger.gori ().exports (bb), i))
+	  && m_path.length () > 1)
+	{
+	  edge taken_edge = find_taken_edge (m_path);
+	  if (taken_edge)
+	    {
+	      if (taken_edge != UNREACHABLE_EDGE)
+		maybe_register_path (taken_edge);
+
+	      done = true;
+	      goto leave_bb;
+	    }
+	}
+    }
+
+  // If there are interesting names not yet processed, keep looking.
+  bitmap_and_compl_into (interesting, processed);
+  if (!bitmap_empty_p (interesting))
+    {
+      edge_iterator iter;
+      edge e;
+      FOR_EACH_EDGE (e, iter, bb->preds)
+	if ((e->flags & EDGE_ABNORMAL) == 0)
+	  done |= find_paths_to_names (e->src, interesting);
+    }
+
+ leave_bb:
+  bitmap_iterator bi;
+  EXECUTE_IF_SET_IN_BITMAP (processed, 0, i, bi)
+    bitmap_set_bit (interesting, i);
+
+  m_path.pop ();
+  m_visited_bbs.remove (bb);
+  return done;
+}
+
+// Search backwards from BB looking for paths where the final
+// conditional out of BB can be determined.  NAME is the LHS of the
+// final conditional.  Register such paths for jump threading.
+
+void
+back_threader::find_paths (basic_block bb, tree name)
+{
+  gimple *stmt = last_stmt (bb);
+  if (!stmt
+      || (gimple_code (stmt) != GIMPLE_COND
+	  && gimple_code (stmt) != GIMPLE_SWITCH))
+    return;
+
+  if (EDGE_COUNT (bb->succs) > 1
+      || single_succ_to_potentially_threadable_block (bb))
+    {
+      m_last_stmt = stmt;
+      m_visited_bbs.empty ();
+      m_path.truncate (0);
+      m_name = name;
+      bitmap_clear (m_imports);
+
+      auto_bitmap interesting;
+      bitmap_copy (m_imports, m_ranger.gori ().imports (bb));
+      bitmap_copy (interesting, m_imports);
+      find_paths_to_names (bb, interesting);
+    }
+}
+
+// Dump a sequence of BBs through the CFG.
+
+DEBUG_FUNCTION void
+dump_path (FILE *dump_file, const vec<basic_block> &path)
+{
+  for (size_t i = 0; i < path.length (); ++i)
+    {
+      fprintf (dump_file, "BB%d", path[i]->index);
+      if (i + 1 < path.length ())
+	fprintf (dump_file, " <- ");
+    }
+  fprintf (dump_file, "\n");
+}
+
+DEBUG_FUNCTION void
+debug (const vec <basic_block> &path)
+{
+  dump_path (stderr, path);
+}
+
 class thread_jumps
 {
 public:
   thread_jumps (bool speed_p = true)
-    : m_profit (speed_p), m_registry (param_max_fsm_thread_paths)
+    : m_profit (speed_p),
+      m_registry (param_max_fsm_thread_paths),
+      m_back_threader (m_profit, m_registry)
   { }
   void find_jump_threads_backwards (basic_block bb);
+  void find_jump_threads_backwards_with_ranger (basic_block bb);
   bool thread_through_all_blocks ();
 
 private:
@@ -102,6 +509,7 @@ private:
   tree m_name;
   back_threader_profitability m_profit;
   back_threader_registry m_registry;
+  back_threader m_back_threader;
 };
 
 // Perform the actual jump threading for the all queued paths.
@@ -548,8 +956,8 @@ back_threader_registry::register_path (const vec<basic_block> &m_path,
 						EDGE_NO_COPY_SRC_BLOCK);
   jump_thread_path->safe_push (x);
 
-  m_lowlevel_registry.register_jump_thread (jump_thread_path);
-  ++m_threaded_paths;
+  if (m_lowlevel_registry.register_jump_thread (jump_thread_path))
+    ++m_threaded_paths;
   return true;
 }
 
@@ -818,6 +1226,12 @@ thread_jumps::fsm_find_control_statement_thread_paths (tree name)
 void
 thread_jumps::find_jump_threads_backwards (basic_block bb)
 {
+  if (param_threader_mode & THREADER_MODE_RANGER)
+    {
+      find_jump_threads_backwards_with_ranger (bb);
+      return;
+    }
+
   gimple *stmt = get_gimple_control_stmt (bb);
   if (!stmt)
     return;
@@ -850,6 +1264,28 @@ thread_jumps::find_jump_threads_backwards (basic_block bb)
   fsm_find_control_statement_thread_paths (name);
 }
 
+// Like find_jump_threads_backwards(), but using ranger.
+
+void
+thread_jumps::find_jump_threads_backwards_with_ranger (basic_block bb)
+{
+  gimple *stmt = get_gimple_control_stmt (bb);
+  if (!stmt)
+    return;
+
+  enum gimple_code code = gimple_code (stmt);
+  tree name = NULL;
+  if (code == GIMPLE_SWITCH)
+    name = gimple_switch_index (as_a <gswitch *> (stmt));
+  else if (code == GIMPLE_GOTO)
+    name = gimple_goto_dest (stmt);
+  else if (code == GIMPLE_COND)
+    name = gimple_cond_lhs (stmt);
+
+  m_name = name;
+  m_back_threader.find_paths (bb, name);
+}
+
 namespace {
 
 const pass_data pass_data_thread_jumps =
@@ -883,12 +1319,12 @@ pass_thread_jumps::gate (function *fun ATTRIBUTE_UNUSED)
   return flag_expensive_optimizations;
 }
 
+// Try to thread blocks in FUN.  Return TRUE if any jump thread paths were
+// registered.
 
-unsigned int
-pass_thread_jumps::execute (function *fun)
+static bool
+try_thread_blocks (function *fun)
 {
-  loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
-
   /* Try to thread each block with more than one successor.  */
   thread_jumps threader;
   basic_block bb;
@@ -897,7 +1333,30 @@ pass_thread_jumps::execute (function *fun)
       if (EDGE_COUNT (bb->succs) > 1)
 	threader.find_jump_threads_backwards (bb);
     }
-  bool changed = threader.thread_through_all_blocks ();
+  return threader.thread_through_all_blocks ();
+}
+
+unsigned int
+pass_thread_jumps::execute (function *fun)
+{
+  loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
+
+  // Iterative mode is a testing construct and is not meant for public
+  // consumption.  It is OFF by default.
+  bool iterative = param_threader_iterative;
+
+  bool changed = false;
+  while (try_thread_blocks (fun))
+    {
+      changed = true;
+
+      if (!iterative)
+	break;
+
+      if ((param_threader_mode & THREADER_MODE_RANGER) == 0)
+	break;
+      cleanup_tree_cfg (TODO_update_ssa);
+    }
 
   loop_optimizer_finalize ();
   return changed ? TODO_cleanup_cfg : 0;
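
As a reading aid for the back_threader changes above, the following is
a toy model of the backward walk, under heavy simplifying assumptions.
None of it is GCC code: toy_block, toy_thread and toy_back_threader are
hypothetical names, the only "import" is a single variable x, a block
either stores a constant to x or leaves it alone, and there is no
profitability check.  It only mirrors the overall shape of
find_paths_to_names: push the block on the current path, stop when the
value is resolved, otherwise recurse into predecessors, and unmark the
block on the way out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG block.  All names here are hypothetical; none are GCC types.
struct toy_block
{
  std::vector<int> preds;  // Indices of predecessor blocks.
  bool has_const;          // True if this block stores a constant to x.
  int value;               // The constant stored, when has_const is set.
};

// A resolved path: blocks listed last-to-first (the order dump_path
// prints them), plus whether the final "if (x)" takes its true edge.
struct toy_thread
{
  std::vector<int> path;
  bool takes_true_edge;
};

class toy_back_threader
{
public:
  explicit toy_back_threader (const std::vector<toy_block> &cfg)
    : m_cfg (cfg), m_visited (cfg.size (), false)
  { }

  // Search backwards from BB, which is assumed to end in "if (x)".
  std::vector<toy_thread> find_paths (int bb)
  {
    assert (bb >= 0 && static_cast<std::size_t> (bb) < m_cfg.size ());
    m_found.clear ();
    walk (bb);
    return m_found;
  }

private:
  void walk (int bb)
  {
    if (m_visited[bb])
      return;  // Already on the current path.
    m_visited[bb] = true;
    m_path.push_back (bb);

    if (m_cfg[bb].has_const)
      {
	// x is known from here on, so stop looking further back.  A
	// path is only recorded if it has at least two blocks.
	if (m_path.size () > 1)
	  m_found.push_back ({m_path, m_cfg[bb].value != 0});
      }
    else
      for (int pred : m_cfg[bb].preds)
	walk (pred);

    m_path.pop_back ();
    m_visited[bb] = false;
  }

  const std::vector<toy_block> &m_cfg;
  std::vector<bool> m_visited;
  std::vector<int> m_path;
  std::vector<toy_thread> m_found;
};
```

For a diamond in which block 1 stores 0 and block 2 stores 1 before both
reach block 3, find_paths (3) returns the paths 3 <- 1 (false edge) and
3 <- 2 (true edge).  A back edge into block 3 is cut off by the visited
check, much as m_visited_bbs does in the real code.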
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 6ce32644aa5..ea5c37a2c65 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -1335,6 +1335,18 @@ jump_threader::thread_across_edge (edge e)
   m_avail_exprs_stack->pop_to_marker ();
 }
 
+/* Return TRUE if BB has a single successor to a block with multiple
+   incoming and outgoing edges.  */
+
+bool
+single_succ_to_potentially_threadable_block (basic_block bb)
+{
+  int flags = (EDGE_IGNORE | EDGE_COMPLEX | EDGE_ABNORMAL);
+  return (single_succ_p (bb)
+	  && (single_succ_edge (bb)->flags & flags) == 0
+	  && potentially_threadable_block (single_succ (bb)));
+}
+
 /* Examine the outgoing edges from BB and conditionally
    try to thread them.  */
 
@@ -1348,12 +1360,8 @@ jump_threader::thread_outgoing_edges (basic_block bb)
      outgoing edges, then we may be able to thread the edge, i.e., we
      may be able to statically determine which of the outgoing edges
      will be traversed when the incoming edge from BB is traversed.  */
-  if (single_succ_p (bb)
-      && (single_succ_edge (bb)->flags & flags) == 0
-      && potentially_threadable_block (single_succ (bb)))
-    {
-      thread_across_edge (single_succ_edge (bb));
-    }
+  if (single_succ_to_potentially_threadable_block (bb))
+    thread_across_edge (single_succ_edge (bb));
   else if ((last = last_stmt (bb))
 	   && gimple_code (last) == GIMPLE_COND
 	   && EDGE_COUNT (bb->succs) == 2
diff --git a/gcc/tree-ssa-threadedge.h b/gcc/tree-ssa-threadedge.h
index 48735f2bc27..c78eef25219 100644
--- a/gcc/tree-ssa-threadedge.h
+++ b/gcc/tree-ssa-threadedge.h
@@ -29,7 +29,7 @@ class jump_threader
 {
 public:
   jump_threader (class const_and_copies *,
-		 avail_exprs_stack *,
+		 class avail_exprs_stack *,
 		 class jump_threader_simplifier *,
 		 class evrp_range_analyzer * = NULL);
   ~jump_threader ();
@@ -83,6 +83,7 @@ protected:
 };
 
 extern void propagate_threaded_block_debug_into (basic_block, basic_block);
+extern bool single_succ_to_potentially_threadable_block (basic_block);
 
 // ?? All this ssa_name_values stuff is the store of values for
 // avail_exprs_stack and const_and_copies, so it really belongs in the
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index f496dd3eb8c..29cf010e9ca 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "dbgcnt.h"
 #include "tree-cfg.h"
 #include "tree-vectorizer.h"
+#include "tree-pass.h"
 
 /* Given a block B, update the CFG and SSA graph to reflect redirecting
    one or more in-edges to B to instead reach the destination of an
@@ -2741,15 +2742,17 @@ jump_thread_path_registry::thread_through_all_blocks
 
    E is the edge we can thread, E2 is the new target edge, i.e., we
    are effectively recording that E->dest can be changed to E2->dest
-   after fixing the SSA graph.  */
+   after fixing the SSA graph.
 
-void
+   Return TRUE if PATH was successfully threaded.  */
+
+bool
 jump_thread_path_registry::register_jump_thread (vec<jump_thread_edge *> *path)
 {
   if (!dbg_cnt (registered_jump_thread))
     {
       path->release ();
-      return;
+      return false;
     }
 
   /* First make sure there are no NULL outgoing edges on the jump threading
@@ -2766,7 +2769,7 @@ jump_thread_path_registry::register_jump_thread (vec<jump_thread_edge *> *path)
 	    }
 
 	  path->release ();
-	  return;
+	  return false;
 	}
 
       /* Only the FSM threader is allowed to thread across
@@ -2780,6 +2783,7 @@ jump_thread_path_registry::register_jump_thread (vec<jump_thread_edge *> *path)
     dump_jump_thread_path (dump_file, *path, true);
 
   m_paths.safe_push (path);
+  return true;
 }
 
 /* Return how many uses of T there are within BB, as long as there
diff --git a/gcc/tree-ssa-threadupdate.h b/gcc/tree-ssa-threadupdate.h
index b806caee581..2030bda15af 100644
--- a/gcc/tree-ssa-threadupdate.h
+++ b/gcc/tree-ssa-threadupdate.h
@@ -63,7 +63,7 @@ class jump_thread_path_registry
 public:
   jump_thread_path_registry ();
   ~jump_thread_path_registry ();
-  void register_jump_thread (vec<jump_thread_edge *> *);
+  bool register_jump_thread (vec<jump_thread_edge *> *);
   void remove_jump_threads_including (edge);
   bool thread_through_all_blocks (bool);
   jump_thread_edge *allocate_thread_edge (edge e, jump_thread_edge_type t);
diff --git a/libgomp/testsuite/libgomp.graphite/force-parallel-4.c b/libgomp/testsuite/libgomp.graphite/force-parallel-4.c
index d2af142e8fe..ef6f64d229b 100644
--- a/libgomp/testsuite/libgomp.graphite/force-parallel-4.c
+++ b/libgomp/testsuite/libgomp.graphite/force-parallel-4.c
@@ -1,4 +1,5 @@
 /* Autopar with IF conditions.  */
+/* { dg-additional-options "-fdisable-tree-thread1" } */
 
 void abort();
 
diff --git a/libgomp/testsuite/libgomp.graphite/force-parallel-8.c b/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
index 32ba5ab84ef..a97eb97acf6 100644
--- a/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
+++ b/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
@@ -1,3 +1,5 @@
+/* { dg-additional-options "-fdisable-tree-thread1" } */
+
 #define N 1500
 
 int x[N][N], y[N];
-- 
2.31.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/2] Ranger-based backwards threader implementation.
  2021-06-29 10:27   ` Aldy Hernandez
@ 2021-06-29 21:22     ` Martin Sebor
  2021-06-30  6:08       ` Aldy Hernandez
  0 siblings, 1 reply; 20+ messages in thread
From: Martin Sebor @ 2021-06-29 21:22 UTC (permalink / raw)
  To: Aldy Hernandez, GCC patches
  Cc: Martin Sebor, Jeff Law, Andrew MacLeod, Richard Biener

On 6/29/21 4:27 AM, Aldy Hernandez wrote:
> 
> 
> On 6/29/21 1:19 AM, Martin Sebor wrote:
>> On 6/28/21 10:21 AM, Aldy Hernandez via Gcc-patches wrote:
>>> This is the ranger-based backwards threader.  It is divided into two
>>> parts: the solver and the path discovery bits.
>>>
>>> The solver is generic enough, that it may be of use to other passes,
>>> so it's been abstracted into its own separate class/file.  Andrew and
>>> I have already gone over it, so I don't think a review is necessary.
>>> Besides, it's technically an extension of the ranger infrastructure.
>>>
>>> On the other hand, the path discovery bits could benefit from the
>>> watchful eye of the jump threading experts.
>>>
>>> Documenting the solver in a [ranger-tech] post is on my TODO list,
>>> as I think it would be useful as an example of GORI as a general
>>> tool, outside the VRP world.
>>>
>>> As I have mentioned elsewhere, I have gone through each test and
>>> documented the reasons why they were adjusted (when useful).  The
>>> reviewer(s) may benefit from looking at the test notes.
>>>
>>> I have added a --param=threader-mode={ranger,legacy} option, which I
>>> hope to remove shortly after.  It has been useful for diagnosing
>>> issues in the past, though perhaps not so much now.  I've left it
>>> in case there's a remote interest in using it during stage1, but
>>> removing it could be a huge cleanup to tree-ssa-threadbackward.c.
>>>
>>> If/when accepted, I will open 2-3 PRs with the XFAILed tests as
>>> requested.  I am still working on distilling a C counterpart for
>>> the libphobos missing thread edge.  It'll hopefully be ready by the
>>> time the review is done.
>>>
>>> A version of this patchset with the verification code has
>>> been tested on x86-64, ppc64, ppc64le, and aarch64 (all Linux).
>>>
>>> I am currently re-testing on x86-64 Linux, but will not re-test on the
>>> rest of the architectures because...OMG aarch64 is so slow!
>>
>> I applied the series and ran a subset of tests and didn't see any
>> failures, just the three XPASSes below.  The Wfree-nonheap-object
>> tests you mentioned in the other post all pass.  Looks like you
>> got past that problem?
>>
>> XPASS: gcc.dg/uninit-pr61112.c pr61112 (test for bogus messages, line 32)
>> XPASS: gcc.dg/uninit-pr61112.c pr61112 (test for bogus messages, line 46)
>> XPASS: gcc.dg/uninit-pr61112.c pr61112 (test for bogus messages, line 60)
>>
>> A couple of comments on the tests below (I haven't looked at the meat
>> of the patch):
>>
>>>
>>> Thanks.
>>> Aldy
>>>
>>> Aldy Hernandez (2):
>>>    Implement basic block path solver.
>>>    Backwards jump threader rewrite with ranger.
>>>
>>>   gcc/Makefile.in                               |   6 +
>>>   gcc/flag-types.h                              |   7 +
>>>   gcc/params.opt                                |  17 +
>>>   .../g++.dg/debug/dwarf2/deallocator.C         |   3 +-
>>>   gcc/testsuite/gcc.c-torture/compile/pr83510.c |  33 ++
>>>   gcc/testsuite/gcc.dg/Wrestrict-22.c           |   3 +
>>
>> The change here just adds the comment:
>>
>> +/* This looks like the threader caused the entire loop to collapse, 
>> and the
>> +   warning pass can't determine the arguments to memcpy.  */
>> +
>>
>> Since the test passes I'm not sure I understand what the comment
>> is trying to say.  Is it still accurate and necessary?
> 
> This seems like it came from the ranger branch which had slightly 
> different code, particularly it made use of a full ranger with 
> equivalences.  It looks like this could have failed in the branch, but 
> no longer does.  I have removed the comment.

Okay, thanks.

> 
>>
>>>   gcc/testsuite/gcc.dg/loop-unswitch-2.c        |   2 +-
>>>   gcc/testsuite/gcc.dg/old-style-asm-1.c        |   5 +-
>>>   gcc/testsuite/gcc.dg/pr68317.c                |   4 +-
>>>   gcc/testsuite/gcc.dg/pr97567-2.c              |   2 +-
>>>   gcc/testsuite/gcc.dg/predict-9.c              |   4 +-
>>>   gcc/testsuite/gcc.dg/shrink-wrap-loop.c       |  53 ++
>>>   gcc/testsuite/gcc.dg/sibcall-1.c              |  10 +
>>>   .../gcc.dg/tree-ssa/builtin-sprintf-3.c       |   5 +-
>>
>> I wonder if breaking up the test function into five, one for each
>> of the tests it does, would be a better way to avoid the IL changes
>> than disabling all the threading passes.  Like in the attached patch.
> 
> As the author of the original test, I completely defer to you :).
> 
> Attached is the latest version with your suggested changes, as well as a 
> gimple FE test for the previously discussed failing libphobos test.

The tests look good.

In the new APIs, instead of taking vec by value can you please change
them to either by-const-reference if they don't change the vec or by-
reference if they do?  I'm in the midst of changing code to do that
with the goal of eventually removing all by-value vec arguments.
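
A minimal sketch of the convention requested above.  It uses
std::vector as a stand-in for GCC's vec, and the function names are
hypothetical; none of this is taken from the patch itself:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for vec<basic_block>; block indices replace pointers.
typedef std::vector<int> toy_path;

// Read-only access: take the container by const reference.
static std::size_t
count_blocks (const toy_path &path)
{
  return path.size ();
}

// Mutating access: take the container by non-const reference, so the
// caller's object is updated and nothing is copied.
static void
append_block (toy_path &path, int bb_index)
{
  path.push_back (bb_index);
}
```

The signature alone then tells the reader whether a callee may modify
the vector.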

Thanks
Martin

> 
> Thanks.
> Aldy


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/2] Ranger-based backwards threader implementation.
  2021-06-29 21:22     ` Martin Sebor
@ 2021-06-30  6:08       ` Aldy Hernandez
  0 siblings, 0 replies; 20+ messages in thread
From: Aldy Hernandez @ 2021-06-30  6:08 UTC (permalink / raw)
  To: Martin Sebor, GCC patches
  Cc: Martin Sebor, Jeff Law, Andrew MacLeod, Richard Biener



On 6/29/21 11:22 PM, Martin Sebor wrote:
> On 6/29/21 4:27 AM, Aldy Hernandez wrote:
>>
>>
>> On 6/29/21 1:19 AM, Martin Sebor wrote:
>>> On 6/28/21 10:21 AM, Aldy Hernandez via Gcc-patches wrote:
>>>> This is the ranger-based backwards threader.  It is divided into two
>>>> parts: the solver and the path discovery bits.
>>>>
>>>> The solver is generic enough, that it may be of use to other passes,
>>>> so it's been abstracted into its own separate class/file.  Andrew and
>>>> I have already gone over it, so I don't think a review is necessary.
>>>> Besides, it's technically an extension of the ranger infrastructure.
>>>>
>>>> On the other hand, the path discovery bits could benefit from the
>>>> watchful eye of the jump threading experts.
>>>>
>>>> Documenting the solver in a [ranger-tech] post is on my TODO list,
>>>> as I think it would be useful as an example of GORI as a general
>>>> tool, outside the VRP world.
>>>>
>>>> As I have mentioned elsewhere, I have gone through each test and
>>>> documented the reasons why they were adjusted (when useful).  The
>>>> reviewer(s) may benefit from looking at the test notes.
>>>>
>>>> I have added a --param=threader-mode={ranger,legacy} option, which I
>>>> hope to remove shortly after.  It has been useful for diagnosing
>>>> issues in the past, though perhaps not so much now.  I've left it
>>>> in case there's a remote interest in using it during stage1, but
>>>> removing it could be a huge cleanup to tree-ssa-threadbackward.c.
>>>>
>>>> If/when accepted, I will open 2-3 PRs with the XFAILed tests as
>>>> requested.  I am still working on distilling a C counterpart for
>>>> the libphobos missing thread edge.  It'll hopefully be ready by the
>>>> time the review is done.
>>>>
>>>> A version of this patchset with the verification code has
>>>> been tested on x86-64, ppc64, ppc64le, and aarch64 (all Linux).
>>>>
>>>> I am currently re-testing on x86-64 Linux, but will not re-test on the
>>>> rest of the architectures because...OMG aarch64 is so slow!
>>>
>>> I applied the series and ran a subset of tests and didn't see any
>>> failures, just the three XPASSes below.  The Wfree-nonheap-object
>>> tests you mentioned in the other post all pass.  Looks like you
>>> got past that problem?
>>>
>>> XPASS: gcc.dg/uninit-pr61112.c pr61112 (test for bogus messages, line 32)
>>> XPASS: gcc.dg/uninit-pr61112.c pr61112 (test for bogus messages, line 46)
>>> XPASS: gcc.dg/uninit-pr61112.c pr61112 (test for bogus messages, line 60)
>>>
>>> A couple of comments on the tests below (I haven't looked at the meat
>>> of the patch):
>>>
>>>>
>>>> Thanks.
>>>> Aldy
>>>>
>>>> Aldy Hernandez (2):
>>>>    Implement basic block path solver.
>>>>    Backwards jump threader rewrite with ranger.
>>>>
>>>>   gcc/Makefile.in                               |   6 +
>>>>   gcc/flag-types.h                              |   7 +
>>>>   gcc/params.opt                                |  17 +
>>>>   .../g++.dg/debug/dwarf2/deallocator.C         |   3 +-
>>>>   gcc/testsuite/gcc.c-torture/compile/pr83510.c |  33 ++
>>>>   gcc/testsuite/gcc.dg/Wrestrict-22.c           |   3 +
>>>
>>> The change here just adds the comment:
>>>
>>> +/* This looks like the threader caused the entire loop to collapse, 
>>> and the
>>> +   warning pass can't determine the arguments to memcpy.  */
>>> +
>>>
>>> Since the test passes I'm not sure I understand what the comment
>>> is trying to say.  Is it still accurate and necessary?
>>
>> This seems like it came from the ranger branch which had slightly 
>> different code, particularly it made use of a full ranger with 
>> equivalences.  It looks like this could have failed in the branch, but 
>> no longer does.  I have removed the comment.
> 
> Okay, thanks.
> 
>>
>>>
>>>>   gcc/testsuite/gcc.dg/loop-unswitch-2.c        |   2 +-
>>>>   gcc/testsuite/gcc.dg/old-style-asm-1.c        |   5 +-
>>>>   gcc/testsuite/gcc.dg/pr68317.c                |   4 +-
>>>>   gcc/testsuite/gcc.dg/pr97567-2.c              |   2 +-
>>>>   gcc/testsuite/gcc.dg/predict-9.c              |   4 +-
>>>>   gcc/testsuite/gcc.dg/shrink-wrap-loop.c       |  53 ++
>>>>   gcc/testsuite/gcc.dg/sibcall-1.c              |  10 +
>>>>   .../gcc.dg/tree-ssa/builtin-sprintf-3.c       |   5 +-
>>>
>>> I wonder if breaking up the test function into five, one for each
>>> of the tests it does, would be a better way to avoid the IL changes
>>> than disabling all the threading passes.  Like in the attached patch.
>>
>> As the author of the original test, I completely defer to you :).
>>
>> Attached is the latest version with your suggested changes, as well as 
>> a gimple FE test for the previously discussed failing libphobos test.
> 
> The tests look good.
> 
> In the new APIs, instead of taking vec by value can you please change
> them to either by-const-reference if they don't change the vec or by-
> reference if they do?  I'm in the midst of changing code to do that
> with the goal of eventually removing all by-value vec arguments.

Sure.

Aldy


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] Implement basic block path solver.
  2021-06-28 16:21 ` [PATCH 1/2] Implement basic block path solver Aldy Hernandez
@ 2021-07-01 22:20   ` Jeff Law
  2021-07-02  8:13     ` Aldy Hernandez
  0 siblings, 1 reply; 20+ messages in thread
From: Jeff Law @ 2021-07-01 22:20 UTC (permalink / raw)
  To: Aldy Hernandez, GCC patches; +Cc: Andrew MacLeod, Richard Biener, Martin Sebor



On 6/28/2021 10:21 AM, Aldy Hernandez wrote:
> This is the main basic block path solver for use in the ranger-based
> backwards threader.  Given a path of BBs, the class can solve the final
> conditional or any SSA name used in calculating the final conditional.
>
> The main API is:
>
> // This class is a basic block path solver.  Given a set of BBs
> // indicating a path through the CFG, range_in_path() will return the
> // range of an SSA as if the BBs in the path would have been executed
> // in order.
> //
> // Only SSA names passed in IMPORTS are precomputed, and can be
> // queried.
> //
> // Note that the blocks are in reverse order, thus the exit block is
> // path[0].
>
> class path_solver
> {
> public:
>    path_solver (gimple_ranger &ranger);
>    virtual ~path_solver ();
>    void precompute_ranges (const vec<basic_block> *path,
> 			  const bitmap_head *imports);
>    void range_in_path (irange &, tree name);
>    void range_in_path (irange &, gimple *);
> };
>
> gcc/ChangeLog:
>
>          * Makefile.in (OBJS): Add tree-ssa-path-solver.o.
> 	* tree-ssa-path-solver.cc: New file.
> 	* tree-ssa-path-solver.h: New file.
> ---
>   gcc/Makefile.in             |   1 +
>   gcc/tree-ssa-path-solver.cc | 310 ++++++++++++++++++++++++++++++++++++
>   gcc/tree-ssa-path-solver.h  |  85 ++++++++++
>   3 files changed, 396 insertions(+)
>   create mode 100644 gcc/tree-ssa-path-solver.cc
>   create mode 100644 gcc/tree-ssa-path-solver.h
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index ebf26442992..66cc5f9529e 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1644,6 +1644,7 @@ OBJS = \
>   	tree-ssa-loop.o \
>   	tree-ssa-math-opts.o \
>   	tree-ssa-operands.o \
> +	tree-ssa-path-solver.o \
>   	tree-ssa-phiopt.o \
>   	tree-ssa-phiprop.o \
>   	tree-ssa-pre.o \
> diff --git a/gcc/tree-ssa-path-solver.cc b/gcc/tree-ssa-path-solver.cc
> new file mode 100644
> index 00000000000..1e2c37cff78
> --- /dev/null
> +++ b/gcc/tree-ssa-path-solver.cc
> @@ -0,0 +1,310 @@
> +/* Basic block path solver.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +   Contributed by Aldy Hernandez <aldyh@redhat.com>.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> + for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "backend.h"
> +#include "tree.h"
> +#include "gimple.h"
> +#include "cfganal.h"
> +#include "value-range.h"
> +#include "gimple-range.h"
> +#include "tree-pretty-print.h"
> +#include "tree-ssa-path-solver.h"
> +#include "ssa.h"
> +
> +// Internal construct to help facilitate debugging of solver.
> +#define DEBUG_SOLVER getenv("DEBUG")
Shouldn't this really be a property of what pass is using the solver and 
whether or not the appropriate dump flag is on for that pass?


> +
> +path_solver::path_solver (gimple_ranger &ranger)
> +  : m_ranger (ranger)
> +{
> +  m_cache = new ssa_global_cache;
> +  m_has_cache_entry = BITMAP_ALLOC (NULL);
> +  m_path = NULL;
> +}
> +
> +path_solver::~path_solver ()
> +{
> +  BITMAP_FREE (m_has_cache_entry);
> +  delete m_cache;
> +}
Do we need to clean up any other members in here? m_has_cache_entry, 
m_path, m_imports, m_ranger?

path_solver::range_of_expr has a comment asking if the call to set_cache 
is necessary.  Can you resolve that one way or the other?

> +// Initialize the current path to PATH. The current block is set to
> +// the entry block to the path.
> +//
> +// Note that the blocks are in reverse order, so the exit block is
> +// path[0].
> +
> +void
> +path_solver::set_path (const vec<basic_block> *path)
> +{
> +  gcc_checking_assert (path->length () > 1);
> +  m_path = path;
> +  m_pos = m_path->length () - 1;
> +  bitmap_clear (m_has_cache_entry);
> +}
What's our position on ownership of PATH here?  Can our caller delete 
it?  Can we modify it?  Who releases it?  I realize you may be 
interfacing with some nonsense code I wrote eons ago ;-)


> +
> +// Return the range of the result of PHI in R.
> +
> +void
> +path_solver::ssa_range_in_phi (irange &r, gphi *phi)
> +{
> +  tree name = gimple_phi_result (phi);
> +  basic_block bb = gimple_bb (phi);
> +
> +  // We experimented with querying ranger's range_on_entry here, but
> +  // the performance penalty was too high, for hardly any improvements.
> +  if (at_entry ())
> +    {
> +      r.set_varying (TREE_TYPE (name));
> +      return;
> +    }
> +
> +  basic_block prev = prev_bb ();
> +  edge e_in = find_edge (prev, bb);
> +  for (size_t i = 0; i < gimple_phi_num_args (phi); ++i)
It's probably not important in practice, but you're going to end up 
calling gimple_phi_num_args every iteration of this loop.  Its value
isn't generally subject to LICM.
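
To illustrate the point about the loop bound, here is a rough sketch
with a hypothetical toy_phi type standing in for a PHI node (it is not
GCC's gphi, and num_args () is not gimple_phi_num_args).  Reading the
bound once before the loop means the query runs once per call rather
than once per iteration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a PHI node; it counts how often the
// argument count is queried.
struct toy_phi
{
  std::vector<int> args;
  mutable unsigned queries = 0;

  std::size_t num_args () const
  {
    ++queries;
    return args.size ();
  }
};

// Return the index of the first argument equal to WANTED, or -1.
// The bound is read once before the loop instead of on every test.
static int
find_arg (const toy_phi &phi, int wanted)
{
  const std::size_t n = phi.num_args ();
  for (std::size_t i = 0; i < n; ++i)
    if (phi.args[i] == wanted)
      return static_cast<int> (i);
  return -1;
}
```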


> +    if (e_in == gimple_phi_arg_edge (phi, i))
> +      {
> +	tree arg = gimple_phi_arg_def (phi, i);
> +
> +	if (!get_cache (r, arg))
> +	  r.set_varying (TREE_TYPE (name));
> +
> +	// ?? Is this set_cache necessary?
Similar to the earlier instance.  Can we get a resolution on whether 
this is necessary or not?
> +
> +// If NAME is defined in BB, set R to the range of NAME, and return
> +// TRUE.  Otherwise, return FALSE.
> +
> +bool
> +path_solver::range_defined_in_block (irange &r, tree name, basic_block bb)
> +{
> +  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
> +  basic_block def_bb = gimple_bb (def_stmt);
> +
> +  if (def_bb != bb)
> +    return false;
> +
> +  if (gimple_code (def_stmt) == GIMPLE_PHI)
> +    ssa_range_in_phi (r, as_a<gphi *> (def_stmt));
> +  else if (!fold_range (r, def_stmt, this))
> +    r.set_varying (TREE_TYPE (name));
> +
> +  if (DEBUG_SOLVER)
> +    {
> +      fprintf (stderr, "range_defined_in_block (BB%d) for ", bb->index);
> +      print_generic_expr (stderr, name, TDF_SLIM);
> +      fprintf (stderr, " is ");
> +      r.dump (stderr);
> +      fprintf (stderr, "\n");
> +    }
So see the earlier note.  This should be going into the appropriate pass
specific dump file rather than dumping to stderr based on the value of 
an environment variable.  Similarly for the other chunks of debugging bits.
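
Roughly what I have in mind, as a self-contained sketch.  Here
toy_dump_file is a hypothetical stand-in for the pass's dump stream and
dump_block_range is an invented helper, not anything in the patch:

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for the pass's dump stream: non-null only
// when dumping has been requested for the pass.
static FILE *toy_dump_file = nullptr;

// Emit a debug line for block BB_INDEX, but only if a dump was
// requested.  Return true if something was written.
static bool
dump_block_range (int bb_index, long lo, long hi)
{
  if (!toy_dump_file)
    return false;
  std::fprintf (toy_dump_file, "range in BB%d is [%ld, %ld]\n",
		bb_index, lo, hi);
  return true;
}
```

The debugging output then follows the pass's dump settings instead of
the process environment.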

> +
> +// Return the range of STMT as it would be seen at the end of the path
> +// being analyzed.  Anything but the final conditional in a BB will
> +// return VARYING.
> +
> +void
> +path_solver::range_in_path (irange &r, gimple *stmt)
> +{
> +  if (gimple_code (stmt) == GIMPLE_COND && fold_range (r, stmt, this))
> +    return;
> +
> +  r.set_varying (gimple_expr_type (stmt));
> +}
Not objecting to anything here other than to note that I think we have 
cases where there's a COND_EXPR on the RHS of statements within a 
block.  We're (in general) not handling those well in DOM or jump threading.

This looks pretty reasonable to me.  There's a few things to address 
noted above, but it's close.

jeff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] Implement basic block path solver.
  2021-07-01 22:20   ` Jeff Law
@ 2021-07-02  8:13     ` Aldy Hernandez
  2021-07-02 13:16       ` Andrew MacLeod
  2021-07-26 19:10       ` Jeff Law
  0 siblings, 2 replies; 20+ messages in thread
From: Aldy Hernandez @ 2021-07-02  8:13 UTC (permalink / raw)
  To: Jeff Law, GCC patches; +Cc: Andrew MacLeod, Richard Biener, Martin Sebor

[-- Attachment #1: Type: text/plain, Size: 5660 bytes --]



On 7/2/21 12:20 AM, Jeff Law wrote:
> 
> 
> On 6/28/2021 10:21 AM, Aldy Hernandez wrote:

>> +// Internal construct to help facilitate debugging of solver.
>> +#define DEBUG_SOLVER getenv("DEBUG")
> Shouldn't this really be a property of what pass is using the solver and 
> whether or not the appropriate dump flag is on for that pass?

Whoops.  This was a private construct used for debugging the solver. 
I've changed it to:

+#define DEBUG_SOLVER (0 && dump_file)

> 
> 
>> +
>> +path_solver::path_solver (gimple_ranger &ranger)
>> +  : m_ranger (ranger)
>> +{
>> +  m_cache = new ssa_global_cache;
>> +  m_has_cache_entry = BITMAP_ALLOC (NULL);
>> +  m_path = NULL;
>> +}
>> +
>> +path_solver::~path_solver ()
>> +{
>> +  BITMAP_FREE (m_has_cache_entry);
>> +  delete m_cache;
>> +}
> Do we need to clean up any other members in here? m_has_cache_entry, 
> m_path, m_imports, m_ranger?

Nope, that's it.

m_has_cache_entry is being freed here.

The m_path and m_imports parameters belong to the caller, and m_ranger 
has a destructor of its own.

> 
> path_solver::range_of_expr has a comment asking if the call to set_cache 
> is necessary.  Can you resolve that one way or the other?
> 
> +// Initialize the current path to PATH. The current block is set to
>> +// the entry block to the path.
>> +//
>> +// Note that the blocks are in reverse order, so the exit block is
>> +// path[0].
>> +
>> +void
>> +path_solver::set_path (const vec<basic_block> *path)
>> +{
>> +  gcc_checking_assert (path->length () > 1);
>> +  m_path = path;
>> +  m_pos = m_path->length () - 1;
>> +  bitmap_clear (m_has_cache_entry);
>> +}
> What's our position on ownership of PATH here?  Can our caller delete 
> it?  Can we modify it?  Who releases it?  I realize you may be 
> interfacing with some nonsense code I wrote eons ago ;-)

Belongs to the caller.  As per Martin's suggestion, it's now a const 
reference to make this clearer.  I forgot to repost after that change.
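
A toy model of that ownership contract, with std::vector and block
indices standing in for vec<basic_block>.  Only set_path and at_entry
mirror names used in the patch; at_exit, curr_bb and move_next are
hypothetical helpers added for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a solver that borrows a caller-owned path.
class toy_solver
{
public:
  // PATH is in reverse order: path[0] is the exit block.  The caller
  // keeps ownership and must keep PATH alive while the solver uses it.
  void set_path (const std::vector<int> &path)
  {
    assert (path.size () > 1);
    m_path = &path;
    m_pos = path.size () - 1;	// Start at the entry block.
  }

  bool at_entry () const { return m_pos == m_path->size () - 1; }
  bool at_exit () const { return m_pos == 0; }
  int curr_bb () const { return (*m_path)[m_pos]; }

  // Step one block towards the exit.
  void move_next () { assert (!at_exit ()); --m_pos; }

private:
  const std::vector<int> *m_path = nullptr;	// Borrowed, never freed.
  std::size_t m_pos = 0;
};
```

The solver never frees or modifies the path, so its destructor has
nothing to do for it.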

> 
> 
>> +
>> +// Return the range of the result of PHI in R.
>> +
>> +void
>> +path_solver::ssa_range_in_phi (irange &r, gphi *phi)
>> +{
>> +  tree name = gimple_phi_result (phi);
>> +  basic_block bb = gimple_bb (phi);
>> +
>> +  // We experimented with querying ranger's range_on_entry here, but
>> +  // the performance penalty was too high, for hardly any improvements.
>> +  if (at_entry ())
>> +    {
>> +      r.set_varying (TREE_TYPE (name));
>> +      return;
>> +    }
>> +
>> +  basic_block prev = prev_bb ();
>> +  edge e_in = find_edge (prev, bb);
>> +  for (size_t i = 0; i < gimple_phi_num_args (phi); ++i)
> It's probably not important in practice, but you're going to end up 
> calling gimple_phi_num_args every iteration of this loop.  Its value
> isn't generally subject to LICM.

I was just following standard practice:

$ grep for\ .*gimple_phi_num_args *.c|wc -l
73

But if it's something you feel strongly about, I can change it.

> 
> 
>> +    if (e_in == gimple_phi_arg_edge (phi, i))
>> +      {
>> +    tree arg = gimple_phi_arg_def (phi, i);
>> +
>> +    if (!get_cache (r, arg))
>> +      r.set_varying (TREE_TYPE (name));
>> +
>> +    // ?? Is this set_cache necessary?
> Similar to the earlier instance.  Can we get a resolution on whether 
> this is necessary or not?

It turns out this doesn't make a difference.  We can recalculate in this
case, and it affects neither performance nor the number of threads
throughout my test harness.  I've removed both instances.

> +
>> +// If NAME is defined in BB, set R to the range of NAME, and return
>> +// TRUE.  Otherwise, return FALSE.
>> +
>> +bool
>> +path_solver::range_defined_in_block (irange &r, tree name, 
>> basic_block bb)
>> +{
>> +  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
>> +  basic_block def_bb = gimple_bb (def_stmt);
>> +
>> +  if (def_bb != bb)
>> +    return false;
>> +
>> +  if (gimple_code (def_stmt) == GIMPLE_PHI)
>> +    ssa_range_in_phi (r, as_a<gphi *> (def_stmt));
>> +  else if (!fold_range (r, def_stmt, this))
>> +    r.set_varying (TREE_TYPE (name));
>> +
>> +  if (DEBUG_SOLVER)
>> +    {
>> +      fprintf (stderr, "range_defined_in_block (BB%d) for ", bb->index);
>> +      print_generic_expr (stderr, name, TDF_SLIM);
>> +      fprintf (stderr, " is ");
>> +      r.dump (stderr);
>> +      fprintf (stderr, "\n");
>> +    }
> So see the earlier note.  This should be going into the appropriate pass
> specific dump file rather than dumping to stderr based on the value of 
> an environment variable.  Similarly for the other chunks of debugging bits.
> 
>> +
>> +// Return the range of STMT as it would be seen at the end of the path
>> +// being analyzed.  Anything but the final conditional in a BB will
>> +// return VARYING.
>> +
>> +void
>> +path_solver::range_in_path (irange &r, gimple *stmt)
>> +{
>> +  if (gimple_code (stmt) == GIMPLE_COND && fold_range (r, stmt, this))
>> +    return;
>> +
>> +  r.set_varying (gimple_expr_type (stmt));
>> +}
> Not objecting to anything here other than to note that I think we have 
> cases where there's a COND_EXPR on the RHS of statements within a 
> block.  We're (in general) not handling those well in DOM or jump 
> threading.

I guess I can put that on my TODO list :).

> 
> This looks pretty reasonable to me.  There's a few things to address 
> noted above, but it's close.

Attached is an updated patch.

Tested on x86-64 Linux with bootstrap, regtests, as well as our 
callgrind benchmark suite.  I also verified that the number of threading 
opportunities was the same.

Aldy

[-- Attachment #2: 0001-Implement-basic-block-path-solver.patch --]
[-- Type: text/x-patch, Size: 13057 bytes --]

From e70ab3b9849db45176f4ab6ec5b1c25b9d30b58a Mon Sep 17 00:00:00 2001
From: Aldy Hernandez <aldyh@redhat.com>
Date: Tue, 15 Jun 2021 12:20:43 +0200
Subject: [PATCH 1/2] Implement basic block path solver.

This is the main basic block path solver for use in the ranger-based
backwards threader.  Given a path of BBs, the class can solve the final
conditional or any SSA name used in calculating the final conditional.

The main API is:

// This class is a basic block path solver.  Given a set of BBs
// indicating a path through the CFG, range_in_path() will return the
// range of an SSA as if the BBs in the path would have been executed
// in order.
//
// Only SSA names passed in IMPORTS are precomputed, and can be
// queried.
//
// Note that the blocks are in reverse order, thus the exit block is
// path[0].

class path_solver
{
public:
  path_solver (gimple_ranger &ranger);
  virtual ~path_solver ();
  void precompute_ranges (const vec<basic_block> *path,
			  const bitmap_head *imports);
  void range_in_path (irange &, tree name);
  void range_in_path (irange &, gimple *);
};

gcc/ChangeLog:

        * Makefile.in (OBJS): Add tree-ssa-path-solver.o.
	* tree-ssa-path-solver.cc: New file.
	* tree-ssa-path-solver.h: New file.
---
 gcc/Makefile.in             |   1 +
 gcc/tree-ssa-path-solver.cc | 306 ++++++++++++++++++++++++++++++++++++
 gcc/tree-ssa-path-solver.h  |  85 ++++++++++
 3 files changed, 392 insertions(+)
 create mode 100644 gcc/tree-ssa-path-solver.cc
 create mode 100644 gcc/tree-ssa-path-solver.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index ebf26442992..66cc5f9529e 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1644,6 +1644,7 @@ OBJS = \
 	tree-ssa-loop.o \
 	tree-ssa-math-opts.o \
 	tree-ssa-operands.o \
+	tree-ssa-path-solver.o \
 	tree-ssa-phiopt.o \
 	tree-ssa-phiprop.o \
 	tree-ssa-pre.o \
diff --git a/gcc/tree-ssa-path-solver.cc b/gcc/tree-ssa-path-solver.cc
new file mode 100644
index 00000000000..9dca2937ecc
--- /dev/null
+++ b/gcc/tree-ssa-path-solver.cc
@@ -0,0 +1,306 @@
+/* Basic block path solver.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   Contributed by Aldy Hernandez <aldyh@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "tree.h"
+#include "gimple.h"
+#include "cfganal.h"
+#include "value-range.h"
+#include "gimple-range.h"
+#include "tree-pretty-print.h"
+#include "tree-ssa-path-solver.h"
+#include "ssa.h"
+
+// Internal construct to help facilitate debugging of solver.
+#define DEBUG_SOLVER (0 && dump_file)
+
+path_solver::path_solver (gimple_ranger &ranger)
+  : m_ranger (ranger)
+{
+  m_cache = new ssa_global_cache;
+  m_has_cache_entry = BITMAP_ALLOC (NULL);
+  m_path = NULL;
+}
+
+path_solver::~path_solver ()
+{
+  BITMAP_FREE (m_has_cache_entry);
+  delete m_cache;
+}
+
+// Mark cache entry for NAME as unused.
+
+void
+path_solver::clear_cache (tree name)
+{
+  unsigned v = SSA_NAME_VERSION (name);
+  bitmap_clear_bit (m_has_cache_entry, v);
+}
+
+// If NAME has a cache entry, return it in R, and return TRUE.
+
+inline bool
+path_solver::get_cache (irange &r, tree name)
+{
+  if (!gimple_range_ssa_p (name))
+    return get_global_range_query ()->range_of_expr (r, name);
+
+  unsigned v = SSA_NAME_VERSION (name);
+  if (bitmap_bit_p (m_has_cache_entry, v))
+    return m_cache->get_global_range (r, name);
+
+  return false;
+}
+
+// Set the cache entry for NAME to R.
+
+void
+path_solver::set_cache (const irange &r, tree name)
+{
+  unsigned v = SSA_NAME_VERSION (name);
+  bitmap_set_bit (m_has_cache_entry, v);
+  m_cache->set_global_range (name, r);
+}
+
+bool
+path_solver::range_of_expr (irange &r, tree name, gimple *stmt)
+{
+  if (!irange::supports_type_p (TREE_TYPE (name)))
+    return false;
+
+  if (get_cache (r, name))
+    return true;
+
+  if (stmt && range_defined_in_block (r, name, gimple_bb (stmt)))
+    {
+      set_cache (r, name);
+      return true;
+    }
+
+  // Otherwise return varying.
+  r.set_varying (TREE_TYPE (name));
+  return true;
+}
+
+// Initialize the current path to PATH.  The current block is set to
+// the entry block to the path.
+//
+// Note that the blocks are in reverse order, so the exit block is
+// path[0].
+
+void
+path_solver::set_path (const vec<basic_block> &path)
+{
+  gcc_checking_assert (path.length () > 1);
+  m_path = &path;
+  m_pos = m_path->length () - 1;
+  bitmap_clear (m_has_cache_entry);
+}
+
+// Return the range of the result of PHI in R.
+
+void
+path_solver::ssa_range_in_phi (irange &r, gphi *phi)
+{
+  tree name = gimple_phi_result (phi);
+  basic_block bb = gimple_bb (phi);
+
+  // We experimented with querying ranger's range_on_entry here, but
+  // the performance penalty was too high, for hardly any improvements.
+  if (at_entry ())
+    {
+      r.set_varying (TREE_TYPE (name));
+      return;
+    }
+
+  basic_block prev = prev_bb ();
+  edge e_in = find_edge (prev, bb);
+  for (size_t i = 0; i < gimple_phi_num_args (phi); ++i)
+    if (e_in == gimple_phi_arg_edge (phi, i))
+      {
+	tree arg = gimple_phi_arg_def (phi, i);
+
+	if (!get_cache (r, arg))
+	  r.set_varying (TREE_TYPE (name));
+
+	return;
+      }
+  gcc_unreachable ();
+}
+
+// If NAME is defined in BB, set R to the range of NAME, and return
+// TRUE.  Otherwise, return FALSE.
+
+bool
+path_solver::range_defined_in_block (irange &r, tree name, basic_block bb)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+  basic_block def_bb = gimple_bb (def_stmt);
+
+  if (def_bb != bb)
+    return false;
+
+  if (gimple_code (def_stmt) == GIMPLE_PHI)
+    ssa_range_in_phi (r, as_a<gphi *> (def_stmt));
+  else if (!fold_range (r, def_stmt, this))
+    r.set_varying (TREE_TYPE (name));
+
+  if (DEBUG_SOLVER)
+    {
+      fprintf (dump_file, "range_defined_in_block (BB%d) for ", bb->index);
+      print_generic_expr (dump_file, name, TDF_SLIM);
+      fprintf (dump_file, " is ");
+      r.dump (dump_file);
+      fprintf (dump_file, "\n");
+    }
+  return true;
+}
+
+// Precompute ranges defined in the current block, or ranges
+// that are exported on an edge to the next block.  The cache is
+// updated accordingly.
+
+void
+path_solver::precompute_ranges_in_block ()
+{
+  basic_block bb = curr_bb ();
+  bitmap_iterator bi;
+  int_range_max r, cached_range;
+  unsigned i;
+
+  // Force recalculation of any names in the cache that are defined in
+  // this block.  This can happen on interdependent SSA/phis in loops.
+  EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+      gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+      basic_block def_bb = gimple_bb (def_stmt);
+
+      if (def_bb == bb)
+	clear_cache (name);
+    }
+
+  // Solve imports defined in this block.
+  EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+
+      if (range_defined_in_block (r, name, bb))
+	set_cache (r, name);
+    }
+
+  if (at_exit ())
+    return;
+
+  // Solve imports that are exported to the next block.
+  edge e = find_edge (bb, next_bb ());
+  EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+      gori_compute &g = m_ranger.gori ();
+      bitmap exports = g.exports (bb);
+
+      if (bitmap_bit_p (exports, i))
+	{
+	  if (g.outgoing_edge_range_p (r, e, name, *this))
+	    {
+	      if (get_cache (cached_range, name))
+		r.intersect (cached_range);
+
+	      set_cache (r, name);
+	      if (DEBUG_SOLVER)
+		{
+		  fprintf (dump_file, "outgoing_edge_range_p for ");
+		  print_generic_expr (dump_file, name, TDF_SLIM);
+		  fprintf (dump_file, " on edge %d->%d ",
+			   e->src->index, e->dest->index);
+		  fprintf (dump_file, "is ");
+		  r.dump (dump_file);
+		  fprintf (dump_file, "\n");
+		}
+	    }
+	}
+    }
+}
+
+// Precompute the ranges for IMPORTS along PATH.
+//
+// IMPORTS are the set of SSA names, any of which could potentially
+// change the value of the final conditional in PATH.
+
+void
+path_solver::precompute_ranges (const vec<basic_block> &path,
+				const bitmap_head *imports)
+{
+  set_path (path);
+  m_imports = imports;
+
+  if (DEBUG_SOLVER)
+    {
+      extern void debug (vec<basic_block> &);
+      extern void debug (const bitmap_head *);
+      fprintf (dump_file, "\nPATH is:\n");
+      debug (const_cast <vec<basic_block> &> (path));
+      fprintf (dump_file, "imports: ");
+      debug (imports);
+    }
+
+  while (1)
+    {
+      precompute_ranges_in_block ();
+      if (at_exit ())
+	break;
+      move_next ();
+    }
+}
+
+// Return the range of NAME as it would be seen at the end of the path
+// being analyzed.
+
+void
+path_solver::range_in_path (irange &r, tree name)
+{
+  basic_block bb = exit_bb ();
+
+  if (get_cache (r, name))
+    return;
+
+  if (range_defined_in_block (r, name, bb))
+    return;
+
+  // The path may not be deep enough to resolve NAME.
+  r.set_varying (TREE_TYPE (name));
+}
+
+// Return the range of STMT as it would be seen at the end of the path
+// being analyzed.  Anything but the final conditional in a BB will
+// return VARYING.
+
+void
+path_solver::range_in_path (irange &r, gimple *stmt)
+{
+  if (gimple_code (stmt) == GIMPLE_COND && fold_range (r, stmt, this))
+    return;
+
+  r.set_varying (gimple_expr_type (stmt));
+}
diff --git a/gcc/tree-ssa-path-solver.h b/gcc/tree-ssa-path-solver.h
new file mode 100644
index 00000000000..90bcf389f66
--- /dev/null
+++ b/gcc/tree-ssa-path-solver.h
@@ -0,0 +1,85 @@
+/* Header file for jump threading path solver.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   Contributed by Aldy Hernandez <aldyh@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_TREE_SSA_THREADSOLVER_H
+#define GCC_TREE_SSA_THREADSOLVER_H
+
+// This class is a basic block path solver.  Given a set of BBs
+// indicating a path through the CFG, range_in_path() will return the
+// range of an SSA as if the BBs in the path would have been executed
+// in order.
+//
+// Only SSA names passed in IMPORTS are precomputed, and can be
+// queried.
+//
+// Note that the blocks are in reverse order, thus the exit block is
+// path[0].
+
+class path_solver : private range_query
+{
+public:
+  path_solver (gimple_ranger &ranger);
+  virtual ~path_solver ();
+  void precompute_ranges (const vec<basic_block> &path,
+			  const bitmap_head *imports);
+  void range_in_path (irange &, tree name);
+  void range_in_path (irange &, gimple *);
+
+private:
+  bool range_of_expr (irange &r, tree name, gimple * = NULL) override;
+
+  // Cache manipulation.
+  void set_cache (const irange &r, tree name);
+  bool get_cache (irange &r, tree name);
+  void clear_cache (tree name);
+
+  // Methods to precompute ranges for the given path.
+  bool range_defined_in_block (irange &, tree name, basic_block bb);
+  void precompute_ranges_in_block ();
+  void ssa_range_in_phi (irange &r, gphi *phi);
+
+  // Path navigation.
+  void set_path (const vec<basic_block> &);
+  basic_block entry_bb () { return (*m_path)[m_path->length () - 1]; }
+  basic_block exit_bb ()  { return (*m_path)[0]; }
+  basic_block curr_bb ()  { return (*m_path)[m_pos]; }
+  basic_block prev_bb ()  { return (*m_path)[m_pos + 1]; }
+  basic_block next_bb ()  { return (*m_path)[m_pos - 1]; }
+  bool at_entry ()	  { return m_pos == m_path->length () - 1; }
+  bool at_exit ()	  { return m_pos == 0; }
+  void move_next ()	  { --m_pos; }
+
+  // Range cache for SSA names.
+  ssa_global_cache *m_cache;
+
+  // Set for each SSA that has an active entry in the cache.
+  bitmap m_has_cache_entry;
+
+  // Path being analyzed.
+  const vec<basic_block> *m_path;
+
+  // Current path position.
+  unsigned m_pos;
+
+  const bitmap_head *m_imports;
+  gimple_ranger &m_ranger;
+};
+
+#endif // GCC_TREE_SSA_THREADSOLVER_H
-- 
2.31.1


* Re: [PATCH 1/2] Implement basic block path solver.
  2021-07-02  8:13     ` Aldy Hernandez
@ 2021-07-02 13:16       ` Andrew MacLeod
  2021-07-15 14:55         ` Aldy Hernandez
  2021-07-26 19:10       ` Jeff Law
  1 sibling, 1 reply; 20+ messages in thread
From: Andrew MacLeod @ 2021-07-02 13:16 UTC (permalink / raw)
  To: Aldy Hernandez, Jeff Law, GCC patches; +Cc: Richard Biener, Martin Sebor

On 7/2/21 4:13 AM, Aldy Hernandez wrote:
>>> +
>>> +// Return the range of STMT as it would be seen at the end of the path
>>> +// being analyzed.  Anything but the final conditional in a BB will
>>> +// return VARYING.
>>> +
>>> +void
>>> +path_solver::range_in_path (irange &r, gimple *stmt)
>>> +{
>>> +  if (gimple_code (stmt) == GIMPLE_COND && fold_range (r, stmt, this))
>>> +    return;
>>> +
>>> +  r.set_varying (gimple_expr_type (stmt));
>>> +}
>> Not objecting to anything here other than to note that I think we 
>> have cases where there's a COND_EXPR on the RHS of statements within 
>> a block.  We're (in general) not handling those well in DOM or jump 
>> threading.
>
> I guess I can put that on my TODO list :). 

Note that we are no longer in the days of range-ops-only processing...
fold_range handles COND_EXPR (and every other kind of stmt) just fine.

Andrew


* Re: [PATCH 2/2] Backwards jump threader rewrite with ranger.
  2021-06-28 16:21 ` [PATCH 2/2] Backwards jump threader rewrite with ranger Aldy Hernandez
@ 2021-07-05 15:39   ` Aldy Hernandez
  2021-07-15 14:57     ` Aldy Hernandez
  0 siblings, 1 reply; 20+ messages in thread
From: Aldy Hernandez @ 2021-07-05 15:39 UTC (permalink / raw)
  To: GCC patches; +Cc: Andrew MacLeod, Jeff Law, Richard Biener, Martin Sebor

PING.

Aldy


* Re: [PATCH 1/2] Implement basic block path solver.
  2021-07-02 13:16       ` Andrew MacLeod
@ 2021-07-15 14:55         ` Aldy Hernandez
  0 siblings, 0 replies; 20+ messages in thread
From: Aldy Hernandez @ 2021-07-15 14:55 UTC (permalink / raw)
  To: Andrew MacLeod; +Cc: Jeff Law, GCC patches, Richard Biener, Martin Sebor

[-- Attachment #1: Type: text/plain, Size: 1983 bytes --]

Jeff has mentioned that it'll take a while longer to review the
threader rewrite, so I've decided to make some minor cleanups while he
gets to it.

There are a few minor changes here:

1. I've renamed the solver to gimple-range-path.* which expresses
better that it's part of the ranger tools. The prefix tree-ssa-* is
somewhat outdated ;-).

2. I've made the folder a full blown range_query, which can be passed
around anywhere a range_query is accepted.  It turns out, we were 99%
of the way there, so we might as well share the same API.  Now users will
be able to use range_of_expr, range_of_stmt, and friends.  This can come
in handy when passing a range_query to something like
simplify_using_ranges, something which I am considering for my
follow-up changes to the DOM threader.

3. Finally, I've renamed the class to path_range_query to make it
obvious that it's a range_query object.
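
To illustrate why point 2 is convenient, here is a minimal,
self-contained sketch.  It is not GCC code: toy_range,
toy_range_query, toy_path_query and known_nonnegative_p are
hypothetical stand-ins for irange, range_query, path_range_query and
an arbitrary consumer.  The only thing it is meant to show is that a
consumer written against the base query interface can be handed a
path-specific query without being changed.

```cpp
#include <cassert>
#include <climits>
#include <map>
#include <string>

// Toy stand-in for an integer range [lo, hi].  Hypothetical; not irange.
struct toy_range
{
  long lo, hi;
};

// Toy stand-in for the abstract query interface.  Hypothetical; it
// only mirrors the general shape of a range_of_expr-style API.
class toy_range_query
{
public:
  virtual ~toy_range_query () {}
  // Store the range of NAME in R.  Return false if NAME is unsupported.
  virtual bool range_of_expr (toy_range &r, const std::string &name) = 0;
};

// A path-specific query.  It answers from ranges that were precomputed
// for one path and falls back to "varying" for everything else.
class toy_path_query : public toy_range_query
{
public:
  // Record the range of NAME as seen along the path.
  void set_cache (const std::string &name, toy_range r)
  {
    assert (r.lo <= r.hi);
    m_cache[name] = r;
  }

  bool range_of_expr (toy_range &r, const std::string &name) override
  {
    auto it = m_cache.find (name);
    if (it != m_cache.end ())
      r = it->second;
    else
      // Nothing is known about NAME on this path.
      r = toy_range { LONG_MIN, LONG_MAX };
    return true;
  }

private:
  std::map<std::string, toy_range> m_cache;
};

// A consumer that only knows about the base interface.
bool
known_nonnegative_p (toy_range_query &q, const std::string &name)
{
  toy_range r;
  return q.range_of_expr (r, name) && r.lo >= 0;
}
```

In this sketch, caching [0, 10] for a name and passing the path query
to known_nonnegative_p yields true for that name, while a name that
was never cached is treated as varying and yields false.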

There are no functional changes.

Tested on x86-64 Linux.

I will wait on Jeff's review of the tree-ssa-threadbackward.* changes
before committing this.

Aldy

On Fri, Jul 2, 2021 at 3:17 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>
> On 7/2/21 4:13 AM, Aldy Hernandez wrote:
>
> +
> +// Return the range of STMT as it would be seen at the end of the path
> +// being analyzed.  Anything but the final conditional in a BB will
> +// return VARYING.
> +
> +void
> +path_solver::range_in_path (irange &r, gimple *stmt)
> +{
> +  if (gimple_code (stmt) == GIMPLE_COND && fold_range (r, stmt, this))
> +    return;
> +
> +  r.set_varying (gimple_expr_type (stmt));
> +}
>
> Not objecting to anything here other than to note that I think we have cases where there's a COND_EXPR on the RHS of statements within a block.  We're (in general) not handling those well in DOM or jump threading.
>
>
> I guess I can put that on my TODO list :).
>
> note that we are no longer in the days of range-ops only processing...   fold_range handles COND_EXPR (and every other kind of stmt)  just fine.
>
> Andrew

[-- Attachment #2: 0002-Implement-basic-block-path-solver.patch --]
[-- Type: text/x-patch, Size: 12822 bytes --]

From bb2d12abf7bab6306a38e143aed0f0a828f1c790 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez <aldyh@redhat.com>
Date: Tue, 15 Jun 2021 12:20:43 +0200
Subject: [PATCH 2/5] Implement basic block path solver.

This is the main basic block path solver for use in the ranger-based
backwards threader.  Given a path of BBs, the class can solve the final
conditional or any SSA name used in calculating the final conditional.

gcc/ChangeLog:

	* Makefile.in (OBJS): Add gimple-range-path.o.
	* gimple-range-path.cc: New file.
	* gimple-range-path.h: New file.
---
 gcc/Makefile.in          |   1 +
 gcc/gimple-range-path.cc | 327 +++++++++++++++++++++++++++++++++++++++
 gcc/gimple-range-path.h  |  85 ++++++++++
 3 files changed, 413 insertions(+)
 create mode 100644 gcc/gimple-range-path.cc
 create mode 100644 gcc/gimple-range-path.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 934b2a05327..863f1256811 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1649,6 +1649,7 @@ OBJS = \
 	tree-ssa-loop.o \
 	tree-ssa-math-opts.o \
 	tree-ssa-operands.o \
+	gimple-range-path.o \
 	tree-ssa-phiopt.o \
 	tree-ssa-phiprop.o \
 	tree-ssa-pre.o \
diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
new file mode 100644
index 00000000000..dd7c5342d8b
--- /dev/null
+++ b/gcc/gimple-range-path.cc
@@ -0,0 +1,327 @@
+/* Basic block path solver.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   Contributed by Aldy Hernandez <aldyh@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "tree.h"
+#include "gimple.h"
+#include "cfganal.h"
+#include "value-range.h"
+#include "gimple-range.h"
+#include "tree-pretty-print.h"
+#include "gimple-range-path.h"
+#include "ssa.h"
+
+// Internal construct to help facilitate debugging of solver.
+#define DEBUG_SOLVER (0 && dump_file)
+
+path_range_query::path_range_query (gimple_ranger &ranger)
+  : m_ranger (ranger)
+{
+  m_cache = new ssa_global_cache;
+  m_has_cache_entry = BITMAP_ALLOC (NULL);
+  m_path = NULL;
+}
+
+path_range_query::~path_range_query ()
+{
+  BITMAP_FREE (m_has_cache_entry);
+  delete m_cache;
+}
+
+// Mark cache entry for NAME as unused.
+
+void
+path_range_query::clear_cache (tree name)
+{
+  unsigned v = SSA_NAME_VERSION (name);
+  bitmap_clear_bit (m_has_cache_entry, v);
+}
+
+// If NAME has a cache entry, return it in R, and return TRUE.
+
+inline bool
+path_range_query::get_cache (irange &r, tree name)
+{
+  if (!gimple_range_ssa_p (name))
+    return get_global_range_query ()->range_of_expr (r, name);
+
+  unsigned v = SSA_NAME_VERSION (name);
+  if (bitmap_bit_p (m_has_cache_entry, v))
+    return m_cache->get_global_range (r, name);
+
+  return false;
+}
+
+// Set the cache entry for NAME to R.
+
+void
+path_range_query::set_cache (const irange &r, tree name)
+{
+  unsigned v = SSA_NAME_VERSION (name);
+  bitmap_set_bit (m_has_cache_entry, v);
+  m_cache->set_global_range (name, r);
+}
+
+void
+path_range_query::dump (FILE *dump_file)
+{
+  if (m_path->is_empty ())
+    return;
+
+  unsigned i;
+  bitmap_iterator bi;
+  extern void dump_ranger (FILE *, const vec<basic_block> &);
+
+  fprintf (dump_file, "Path is:\n");
+  dump_ranger (dump_file, *m_path);
+
+  fprintf (dump_file, "Imports:\n");
+  EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+      print_generic_expr (dump_file, name, TDF_SLIM);
+      fprintf (dump_file, "\n");
+    }
+
+  m_cache->dump (dump_file);
+}
+
+void
+path_range_query::debug ()
+{
+  dump (stderr);
+}
+
+// Return the range of NAME at the end of the path being analyzed.
+
+bool
+path_range_query::range_of_expr (irange &r, tree name, gimple *stmt)
+{
+  if (!irange::supports_type_p (TREE_TYPE (name)))
+    return false;
+
+  if (get_cache (r, name))
+    return true;
+
+
+  basic_block bb = stmt ? gimple_bb (stmt) : exit_bb ();
+  if (stmt && range_defined_in_block (r, name, bb))
+    {
+      set_cache (r, name);
+      return true;
+    }
+
+  r.set_varying (TREE_TYPE (name));
+  return true;
+}
+
+// Return the range of STMT at the end of the path being analyzed.
+// Anything but the final conditional in a BB will return VARYING.
+
+bool
+path_range_query::range_of_stmt (irange &r, gimple *stmt, tree)
+{
+  tree type = gimple_expr_type (stmt);
+
+  if (!irange::supports_type_p (type))
+    return false;
+
+  if (gimple_code (stmt) == GIMPLE_COND && fold_range (r, stmt, this))
+    return true;
+
+  r.set_varying (type);
+  return true;
+}
+
+// Initialize the current path to PATH.  The current block is set to
+// the entry block to the path.
+//
+// Note that the blocks are in reverse order, so the exit block is
+// path[0].
+
+void
+path_range_query::set_path (const vec<basic_block> &path)
+{
+  gcc_checking_assert (path.length () > 1);
+  m_path = &path;
+  m_pos = m_path->length () - 1;
+  bitmap_clear (m_has_cache_entry);
+}
+
+// Return the range of the result of PHI in R.
+
+void
+path_range_query::ssa_range_in_phi (irange &r, gphi *phi)
+{
+  tree name = gimple_phi_result (phi);
+  basic_block bb = gimple_bb (phi);
+
+  // We experimented with querying ranger's range_on_entry here, but
+  // the performance penalty was too high, for hardly any improvements.
+  if (at_entry ())
+    {
+      // Try fold just in case we can resolve simple things like PHI <5(99), 6(88)>.
+      if (!fold_range (r, phi, this))
+	r.set_varying (TREE_TYPE (name));
+
+      return;
+    }
+
+  basic_block prev = prev_bb ();
+  edge e_in = find_edge (prev, bb);
+  for (size_t i = 0; i < gimple_phi_num_args (phi); ++i)
+    if (e_in == gimple_phi_arg_edge (phi, i))
+      {
+	tree arg = gimple_phi_arg_def (phi, i);
+
+	if (!get_cache (r, arg))
+	  r.set_varying (TREE_TYPE (name));
+
+	return;
+      }
+  gcc_unreachable ();
+}
+
+// If NAME is defined in BB, set R to the range of NAME, and return
+// TRUE.  Otherwise, return FALSE.
+
+bool
+path_range_query::range_defined_in_block (irange &r, tree name, basic_block bb)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+  basic_block def_bb = gimple_bb (def_stmt);
+
+  if (def_bb != bb)
+    return false;
+
+  if (gimple_code (def_stmt) == GIMPLE_PHI)
+    ssa_range_in_phi (r, as_a<gphi *> (def_stmt));
+  else if (!fold_range (r, def_stmt, this))
+    r.set_varying (TREE_TYPE (name));
+
+  if (DEBUG_SOLVER)
+    {
+      fprintf (dump_file, "range_defined_in_block (BB%d) for ", bb->index);
+      print_generic_expr (dump_file, name, TDF_SLIM);
+      fprintf (dump_file, " is ");
+      r.dump (dump_file);
+      fprintf (dump_file, "\n");
+    }
+  return true;
+}
+
+// Precompute ranges defined in the current block, or ranges
+// that are exported on an edge to the next block.
+
+void
+path_range_query::precompute_ranges_in_block (basic_block bb)
+{
+  bitmap_iterator bi;
+  int_range_max r, cached_range;
+  unsigned i;
+
+  // Force recalculation of any names in the cache that are defined in
+  // this block.  This can happen on interdependent SSA/phis in loops.
+  EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+      gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+      basic_block def_bb = gimple_bb (def_stmt);
+
+      if (def_bb == bb)
+	clear_cache (name);
+    }
+
+  // Solve imports defined in this block.
+  EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+
+      if (range_defined_in_block (r, name, bb))
+	set_cache (r, name);
+    }
+
+  if (at_exit ())
+    return;
+
+  // Solve imports that are exported to the next block.
+  edge e = find_edge (bb, next_bb ());
+  EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+      gori_compute &g = m_ranger.gori ();
+      bitmap exports = g.exports (bb);
+
+      if (bitmap_bit_p (exports, i))
+	{
+	  if (g.outgoing_edge_range_p (r, e, name, *this))
+	    {
+	      if (get_cache (cached_range, name))
+		r.intersect (cached_range);
+
+	      set_cache (r, name);
+	      if (DEBUG_SOLVER)
+		{
+		  fprintf (dump_file, "outgoing_edge_range_p for ");
+		  print_generic_expr (dump_file, name, TDF_SLIM);
+		  fprintf (dump_file, " on edge %d->%d ",
+			   e->src->index, e->dest->index);
+		  fprintf (dump_file, "is ");
+		  r.dump (dump_file);
+		  fprintf (dump_file, "\n");
+		}
+	    }
+	}
+    }
+}
+
+// Precompute the ranges for IMPORTS along PATH.
+//
+// IMPORTS are the set of SSA names, any of which could potentially
+// change the value of the final conditional in PATH.
+
+void
+path_range_query::precompute_ranges (const vec<basic_block> &path,
+				     const bitmap_head *imports)
+{
+  set_path (path);
+  m_imports = imports;
+
+  if (DEBUG_SOLVER)
+    fprintf (dump_file, "path_range_query: precompute_ranges\n");
+
+  while (1)
+    {
+      basic_block bb = curr_bb ();
+
+      precompute_ranges_in_block (bb);
+
+      if (at_exit ())
+	break;
+
+      move_next ();
+    }
+
+  if (DEBUG_SOLVER)
+    dump (dump_file);
+}
diff --git a/gcc/gimple-range-path.h b/gcc/gimple-range-path.h
new file mode 100644
index 00000000000..43f0ec80286
--- /dev/null
+++ b/gcc/gimple-range-path.h
@@ -0,0 +1,85 @@
+/* Header file for jump threading path solver.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   Contributed by Aldy Hernandez <aldyh@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_TREE_SSA_THREADSOLVER_H
+#define GCC_TREE_SSA_THREADSOLVER_H
+
+// This class is a basic block path solver.  Given a set of BBs
+// indicating a path through the CFG, range_of_expr and range_of_stmt
+// will calculate the range of an SSA or STMT as if the BBs in the
+// path would have been executed in order.
+//
+// Only SSA names passed in IMPORTS are precomputed, and can be
+// queried.
+//
+// Note that the blocks are in reverse order, thus the exit block is
+// path[0].
+
+class path_range_query : public range_query
+{
+public:
+  path_range_query (class gimple_ranger &ranger);
+  virtual ~path_range_query ();
+  void precompute_ranges (const vec<basic_block> &path,
+			  const bitmap_head *imports);
+  bool range_of_expr (irange &r, tree name, gimple * = NULL) override;
+  bool range_of_stmt (irange &r, gimple *, tree name = NULL) override;
+  void dump (FILE *);
+  void debug ();
+
+private:
+  // Cache manipulation.
+  void set_cache (const irange &r, tree name);
+  bool get_cache (irange &r, tree name);
+  void clear_cache (tree name);
+
+  // Methods to precompute ranges for the given path.
+  bool range_defined_in_block (irange &, tree name, basic_block bb);
+  void precompute_ranges_in_block (basic_block bb);
+  void ssa_range_in_phi (irange &r, gphi *phi);
+
+  // Path navigation.
+  void set_path (const vec<basic_block> &);
+  basic_block entry_bb () { return (*m_path)[m_path->length () - 1]; }
+  basic_block exit_bb ()  { return (*m_path)[0]; }
+  basic_block curr_bb ()  { return (*m_path)[m_pos]; }
+  basic_block prev_bb ()  { return (*m_path)[m_pos + 1]; }
+  basic_block next_bb ()  { return (*m_path)[m_pos - 1]; }
+  bool at_entry ()	  { return m_pos == m_path->length () - 1; }
+  bool at_exit ()	  { return m_pos == 0; }
+  void move_next ()	  { --m_pos; }
+
+  // Range cache for SSA names.
+  ssa_global_cache *m_cache;
+
+  // Set for each SSA that has an active entry in the cache.
+  bitmap m_has_cache_entry;
+
+  // Path being analyzed.
+  const vec<basic_block> *m_path;
+
+  // Current path position.
+  unsigned m_pos;
+
+  const bitmap_head *m_imports;
+  gimple_ranger &m_ranger;
+};
+
+#endif // GCC_TREE_SSA_THREADSOLVER_H
-- 
2.31.1


* Re: [PATCH 2/2] Backwards jump threader rewrite with ranger.
  2021-07-05 15:39   ` Aldy Hernandez
@ 2021-07-15 14:57     ` Aldy Hernandez
  2021-07-26 12:43       ` Aldy Hernandez
  2021-07-28 14:32       ` Jeff Law
  0 siblings, 2 replies; 20+ messages in thread
From: Aldy Hernandez @ 2021-07-15 14:57 UTC (permalink / raw)
  To: GCC patches; +Cc: Andrew MacLeod, Jeff Law, Richard Biener, Martin Sebor

[-- Attachment #1: Type: text/plain, Size: 333 bytes --]

As mentioned in my previous email, these are some minor changes to the
previous revision.  All I'm changing here is the call into the solver
to use range_of_expr and range_of_stmt.  Everything else remains the
same.

Tested on x86-64 Linux.

On Mon, Jul 5, 2021 at 5:39 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>
> PING.
>
> Aldy

[-- Attachment #2: 0003-Backwards-jump-threader-rewrite-with-ranger.patch --]
[-- Type: text/x-patch, Size: 66250 bytes --]

From 1774338ddd1f4718884e766aae2fc48b97110c5d Mon Sep 17 00:00:00 2001
From: Aldy Hernandez <aldyh@redhat.com>
Date: Tue, 15 Jun 2021 12:32:51 +0200
Subject: [PATCH 3/5] Backwards jump threader rewrite with ranger.

This is a rewrite of the backwards threader with a ranger based solver.

The code is divided into two parts: the path solver in
gimple-range-path.*, and the path discovery bits in
tree-ssa-threadbackward.c.

The legacy code is still available with --param=threader-mode=legacy,
but will be removed shortly after.

gcc/ChangeLog:

	* Makefile.in (tree-ssa-loop-im.o-warn): New.
	* flag-types.h (enum threader_mode): New.
	* params.opt: Add entry for --param=threader-mode.
	* tree-ssa-threadbackward.c (THREADER_ITERATIVE_MODE): New.
	(class back_threader): New.
	(back_threader::back_threader): New.
	(back_threader::~back_threader): New.
	(back_threader::maybe_register_path): New.
	(back_threader::find_taken_edge): New.
	(back_threader::find_taken_edge_switch): New.
	(back_threader::find_taken_edge_cond): New.
	(back_threader::resolve_def): New.
	(back_threader::resolve_phi): New.
	(back_threader::find_paths_to_names): New.
	(back_threader::find_paths): New.
	(dump_path): New.
	(debug): New.
	(thread_jumps::find_jump_threads_backwards): Call ranger threader.
	(thread_jumps::find_jump_threads_backwards_with_ranger): New.
	(pass_thread_jumps::execute): Abstract out code...
	(try_thread_blocks): ...here.
	* tree-ssa-threadedge.c (jump_threader::thread_outgoing_edges):
	Abstract out threading candidate code to...
	(single_succ_to_potentially_threadable_block): ...here.
	* tree-ssa-threadedge.h (single_succ_to_potentially_threadable_block):
	New.
	* tree-ssa-threadupdate.c (register_jump_thread): Return boolean.
	* tree-ssa-threadupdate.h (class jump_thread_path_registry):
	Return bool from register_jump_thread.

libgomp/ChangeLog:

	* testsuite/libgomp.graphite/force-parallel-4.c: Adjust for
	threader.
	* testsuite/libgomp.graphite/force-parallel-8.c: Same.

gcc/testsuite/ChangeLog:

	* g++.dg/debug/dwarf2/deallocator.C: Adjust for threader.
	* gcc.c-torture/compile/pr83510.c: Same.
	* gcc.dg/loop-unswitch-2.c: Same.
	* gcc.dg/old-style-asm-1.c: Same.
	* gcc.dg/pr68317.c: Same.
	* gcc.dg/pr97567-2.c: Same.
	* gcc.dg/predict-9.c: Same.
	* gcc.dg/shrink-wrap-loop.c: Same.
	* gcc.dg/sibcall-1.c: Same.
	* gcc.dg/tree-ssa/builtin-sprintf-3.c: Same.
	* gcc.dg/tree-ssa/pr21001.c: Same.
	* gcc.dg/tree-ssa/pr21294.c: Same.
	* gcc.dg/tree-ssa/pr21417.c: Same.
	* gcc.dg/tree-ssa/pr21458-2.c: Same.
	* gcc.dg/tree-ssa/pr21563.c: Same.
	* gcc.dg/tree-ssa/pr49039.c: Same.
	* gcc.dg/tree-ssa/pr61839_1.c: Same.
	* gcc.dg/tree-ssa/pr61839_3.c: Same.
	* gcc.dg/tree-ssa/pr77445-2.c: Same.
	* gcc.dg/tree-ssa/split-path-4.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-11.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-12.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-14.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same.
	* gcc.dg/tree-ssa/ssa-fre-48.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-11.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-12.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-14.c: Same.
	* gcc.dg/tree-ssa/vrp02.c: Same.
	* gcc.dg/tree-ssa/vrp03.c: Same.
	* gcc.dg/tree-ssa/vrp05.c: Same.
	* gcc.dg/tree-ssa/vrp06.c: Same.
	* gcc.dg/tree-ssa/vrp07.c: Same.
	* gcc.dg/tree-ssa/vrp09.c: Same.
	* gcc.dg/tree-ssa/vrp19.c: Same.
	* gcc.dg/tree-ssa/vrp20.c: Same.
	* gcc.dg/tree-ssa/vrp33.c: Same.
	* gcc.dg/uninit-pred-9_b.c: Same.
	* gcc.dg/vect/bb-slp-16.c: Same.
	* gcc.target/i386/avx2-vect-aggressive.c: Same.
	* gcc.dg/tree-ssa/ranger-threader-1.c: New test.
	* gcc.dg/tree-ssa/ranger-threader-2.c: New test.
	* gcc.dg/tree-ssa/ranger-threader-3.c: New test.
	* gcc.dg/tree-ssa/ranger-threader-4.c: New test.
	* gcc.dg/tree-ssa/ranger-threader-5.c: New test.
---
 gcc/Makefile.in                               |   5 +
 gcc/flag-types.h                              |   7 +
 gcc/params.opt                                |  17 +
 .../g++.dg/debug/dwarf2/deallocator.C         |   3 +-
 gcc/testsuite/gcc.c-torture/compile/pr83510.c |  33 ++
 gcc/testsuite/gcc.dg/loop-unswitch-2.c        |   2 +-
 gcc/testsuite/gcc.dg/old-style-asm-1.c        |   5 +-
 gcc/testsuite/gcc.dg/pr68317.c                |   4 +-
 gcc/testsuite/gcc.dg/pr97567-2.c              |   2 +-
 gcc/testsuite/gcc.dg/predict-9.c              |   4 +-
 gcc/testsuite/gcc.dg/shrink-wrap-loop.c       |  53 ++
 gcc/testsuite/gcc.dg/sibcall-1.c              |  10 +
 .../gcc.dg/tree-ssa/builtin-sprintf-3.c       |  25 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21001.c       |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/pr21294.c       |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/pr21417.c       |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c     |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21563.c       |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr49039.c       |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c     |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c     |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c     |   2 +-
 .../gcc.dg/tree-ssa/ranger-threader-1.c       |  20 +
 .../gcc.dg/tree-ssa/ranger-threader-2.c       |  39 ++
 .../gcc.dg/tree-ssa/ranger-threader-3.c       |  41 ++
 .../gcc.dg/tree-ssa/ranger-threader-4.c       |  83 +++
 .../gcc.dg/tree-ssa/ranger-threader-5.c       |  80 +++
 gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c  |   4 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-11.c       |   2 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-12.c       |   2 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-14.c       |   1 +
 .../gcc.dg/tree-ssa/ssa-dom-thread-18.c       |   5 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-6.c        |   4 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-7.c        |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c    |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/vrp02.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp03.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp05.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp06.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp07.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp09.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp19.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp20.c         |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp33.c         |   2 +-
 gcc/testsuite/gcc.dg/uninit-pred-9_b.c        |   1 +
 gcc/testsuite/gcc.dg/vect/bb-slp-16.c         |   7 +
 .../gcc.target/i386/avx2-vect-aggressive.c    |   2 +-
 gcc/tree-ssa-threadbackward.c                 | 475 +++++++++++++++++-
 gcc/tree-ssa-threadedge.c                     |  20 +-
 gcc/tree-ssa-threadedge.h                     |   3 +-
 gcc/tree-ssa-threadupdate.c                   |  12 +-
 gcc/tree-ssa-threadupdate.h                   |   2 +-
 .../libgomp.graphite/force-parallel-4.c       |   1 +
 .../libgomp.graphite/force-parallel-8.c       |   2 +
 57 files changed, 963 insertions(+), 54 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-5.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 863f1256811..0e205a41ac3 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -223,6 +223,11 @@ gimple-match.o-warn = -Wno-unused
 generic-match.o-warn = -Wno-unused
 dfp.o-warn = -Wno-strict-aliasing
 
+# maybe_emit_free_warning() is picking up the inlined location for the
+# warning, not the source of the original va_heap::release() function
+# which has a pragma disabling this warning.
+tree-ssa-loop-im.o-warn = -Wno-free-nonheap-object
+
 # All warnings have to be shut off in stage1 if the compiler used then
 # isn't gcc; configure determines that.  WARN_CFLAGS will be either
 # $(GCC_WARN_CFLAGS), or nothing.  Similarly, WARN_CXXFLAGS will be
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index e43d1de490d..e39673f6716 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -454,6 +454,13 @@ enum evrp_mode
   EVRP_MODE_RVRP_DEBUG = EVRP_MODE_RVRP_ONLY | EVRP_MODE_DEBUG
 };
 
+/* Backwards threader mode.  */
+enum threader_mode
+{
+  THREADER_MODE_LEGACY = 0,
+  THREADER_MODE_RANGER = 1
+};
+
 /* Modes of OpenACC 'kernels' constructs handling.  */
 enum openacc_kernels
 {
diff --git a/gcc/params.opt b/gcc/params.opt
index 92b003e38cb..f1f47b44215 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -1010,6 +1010,23 @@ Maximum depth of DFS walk used by modref escape analysis.
 Common Joined UInteger Var(param_modref_max_escape_points) Init(256) Param Optimization
 Maximum number of escape points tracked by modref per SSA-name.
 
+-param=threader-iterative=
+Common Joined UInteger Var(param_threader_iterative) Init(0) Param Optimization
+Run backwards threader in iterative mode.
+
+-param=threader-mode=
+Common Joined Var(param_threader_mode) Enum(threader_mode) Init(THREADER_MODE_RANGER) Param Optimization
+--param=threader-mode=[legacy|ranger] Specifies the mode the backwards threader should run in.
+
+Enum
+Name(threader_mode) Type(enum threader_mode) UnknownError(unknown threader mode %qs)
+
+EnumValue
+Enum(threader_mode) String(legacy) Value(THREADER_MODE_LEGACY)
+
+EnumValue
+Enum(threader_mode) String(ranger) Value(THREADER_MODE_RANGER)
+
 -param=tm-max-aggregate-size=
 Common Joined UInteger Var(param_tm_max_aggregate_size) Init(9) Param Optimization
 Size in bytes after which thread-local aggregates should be instrumented with the logging functions instead of save/restore pairs.
diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/deallocator.C b/gcc/testsuite/g++.dg/debug/dwarf2/deallocator.C
index d895e78e608..c1d387922ba 100644
--- a/gcc/testsuite/g++.dg/debug/dwarf2/deallocator.C
+++ b/gcc/testsuite/g++.dg/debug/dwarf2/deallocator.C
@@ -29,7 +29,7 @@ void foo(int i)
 	  return;
 	}
     }
-  if (i)
+  if (i) // Threader makes everything after here disappear.
     {
       t test;
       if (i == 10)
@@ -42,5 +42,4 @@ void foo(int i)
 }
 // { dg-final { scan-assembler "deallocator.C:29" } }
 // { dg-final { scan-assembler "deallocator.C:24" } }
-// { dg-final { scan-assembler "deallocator.C:34" } }
 // { dg-final { scan-assembler "deallocator.C:21" } }
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr83510.c b/gcc/testsuite/gcc.c-torture/compile/pr83510.c
index 907dd80ccd7..fc932e57f3a 100644
--- a/gcc/testsuite/gcc.c-torture/compile/pr83510.c
+++ b/gcc/testsuite/gcc.c-torture/compile/pr83510.c
@@ -3,6 +3,39 @@
    (PR tree-optimization/83510).  */
 
 /* { dg-options "-Warray-bounds" } */
+/* { dg-xfail-if "" { "*-*-*" } { "-Os" } } */
+
+
+/*  This test is XFAILed because thread1 threads a switch statement
+    such that the various cases have been split into different
+    independent blocks.  One of these blocks exposes an arr[i_27]
+    which is later propagated by VRP to be arr[10].  This is an
+    invalid access, but the array bounds code doesn't know it is an
+    unreachable path.
+
+    However, it is not until dom2 that we "know" that the value of the
+    switch index is such that the path to arr[10] is unreachable.  For
+    that matter, it is not until dom3 that we remove the unreachable
+    path.
+
+
+    See:
+    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83510
+    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83312
+
+    It's not until here that ranger "knows" that the path is
+    unreachable:
+
+    thread1
+    vrp1		<-- array bounds checking
+    dce2
+    stdarg
+    cdce
+    cselim
+    copyprop
+    ifcombine
+    mergephi3		<-- too late
+*/
 
 extern int get_flag (void);
 
diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-2.c b/gcc/testsuite/gcc.dg/loop-unswitch-2.c
index f8d314e34de..0931f6e71c3 100644
--- a/gcc/testsuite/gcc.dg/loop-unswitch-2.c
+++ b/gcc/testsuite/gcc.dg/loop-unswitch-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details" } */
+/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details -fdisable-tree-thread2 -fdisable-tree-thread3" } */
 
 void foo (float **a, float **b, float *c, int n, int m, int l)
 {
diff --git a/gcc/testsuite/gcc.dg/old-style-asm-1.c b/gcc/testsuite/gcc.dg/old-style-asm-1.c
index 8af007795a7..f9406ff0a26 100644
--- a/gcc/testsuite/gcc.dg/old-style-asm-1.c
+++ b/gcc/testsuite/gcc.dg/old-style-asm-1.c
@@ -1,6 +1,9 @@
 /* PR inline-asm/8832 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -dP" } */
+/* { dg-options "-O2 -dP -fdisable-tree-ethread -fdisable-tree-thread1 -fdisable-tree-thread2 -fdisable-tree-thread3 -fdisable-tree-thread4" } */
+
+/* Note: Threader will duplicate BBs and replace one conditional branch by an
+   unconditional one.  */
 
 /* Verify that GCC doesn't optimize
    old style asm instructions.  */
diff --git a/gcc/testsuite/gcc.dg/pr68317.c b/gcc/testsuite/gcc.dg/pr68317.c
index 891d12954ba..bd053a7522b 100644
--- a/gcc/testsuite/gcc.dg/pr68317.c
+++ b/gcc/testsuite/gcc.dg/pr68317.c
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -fdisable-tree-ethread" } */
+
+/* Note: Threader will collapse loop.  */
 
 typedef int int32_t __attribute__((mode (__SI__)));
 
diff --git a/gcc/testsuite/gcc.dg/pr97567-2.c b/gcc/testsuite/gcc.dg/pr97567-2.c
index dee31c6dc01..c3ead54eaa8 100644
--- a/gcc/testsuite/gcc.dg/pr97567-2.c
+++ b/gcc/testsuite/gcc.dg/pr97567-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile} */
-/* { dg-options "-O2 -fdump-tree-evrp" } */
+/* { dg-options "-O2 -fdump-tree-evrp -fdisable-tree-ethread" } */
 
 char a[2];
 
diff --git a/gcc/testsuite/gcc.dg/predict-9.c b/gcc/testsuite/gcc.dg/predict-9.c
index f491c511bd9..cb68a218a93 100644
--- a/gcc/testsuite/gcc.dg/predict-9.c
+++ b/gcc/testsuite/gcc.dg/predict-9.c
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate -fno-finite-loops" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate -fno-finite-loops -fdisable-tree-ethread" } */
+
+/* Note: Threader causes removal of for loop.  */
 
 extern int global;
 extern int global2;
diff --git a/gcc/testsuite/gcc.dg/shrink-wrap-loop.c b/gcc/testsuite/gcc.dg/shrink-wrap-loop.c
index 52dfc2790ed..ba872fa23f6 100644
--- a/gcc/testsuite/gcc.dg/shrink-wrap-loop.c
+++ b/gcc/testsuite/gcc.dg/shrink-wrap-loop.c
@@ -1,5 +1,58 @@
 /* { dg-do compile { target { { { i?86-*-* x86_64-*-* } && lp64 } || { arm_thumb2 } } } } */
 /* { dg-options "-O2 -fdump-rtl-pro_and_epilogue"  } */
+// { dg-additional-options "-fdisable-tree-ethread" }
+
+/*
+Our new threader is threading things a bit too early, and causing the
+testcase in gcc.dg/shrink-wrap-loop.c to fail.
+
+  The gist is this BB inside a loop:
+
+  <bb 6> :
+  # p_2 = PHI <p2_6(D)(2), p_12(5)>
+  if (p_2 != 0B)
+    goto <bb 3>; [INV]
+  else
+    goto <bb 7>; [INV]
+
+Our threader can move this check outside of the loop (good).  This is
+done before branch probabilities are calculated and causes the probs
+to be calculated as:
+
+<bb 2> [local count: 216361238]:
+  if (p2_6(D) != 0B)
+    goto <bb 7>; [54.59%]
+  else
+    goto <bb 6>; [45.41%]
+
+Logically this seems correct to me.  A simple check outside of a loop
+should slightly but not overwhelmingly favor a non-zero value.
+
+Interestingly however, the old threader couldn't get this, but the IL
+ended up identical, albeit with different probabilities.  What happens
+is that, because the old code could not thread this, the p2 != 0 check
+would remain inside the loop and probs would be calculated thusly:
+
+  <bb 6> [local count: 1073741824]:
+  # p_2 = PHI <p2_6(D)(2), p_12(5)>
+  if (p_2 != 0B)
+    goto <bb 3>; [94.50%]
+  else
+    goto <bb 7>; [5.50%]
+
+Then when the loop header copying pass ("ch") shuffled things around,
+the IL would end up identical to my early threader code, but with the
+probabilities would remain as 94.5/5.5.
+
+The above discrepancy causes the RTL ifcvt pass to generate different
+code, and by the time we get to the shrink wrapping pass, things look
+sufficiently different such that the legacy code can actually shrink
+wrap, whereas our new code does not.
+
+IMO, if the loop-ch pass moves conditionals outside of a loop, the
+probabilities should be adjusted, but that does mean the shrink wrap
+won't happen for this contrived testcase.
+ */
 
 int foo (int *p1, int *p2);
 
diff --git a/gcc/testsuite/gcc.dg/sibcall-1.c b/gcc/testsuite/gcc.dg/sibcall-1.c
index e8a95513d9e..367ee4374e1 100644
--- a/gcc/testsuite/gcc.dg/sibcall-1.c
+++ b/gcc/testsuite/gcc.dg/sibcall-1.c
@@ -7,6 +7,9 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -foptimize-sibling-calls" } */
 
+/* See note in recurser_void() as to why we disable threading.  */
+/* { dg-additional-options "-fdisable-tree-thread1" } */
+
 /* The option -foptimize-sibling-calls is the default, but serves as
    marker.  Self-recursion tail calls are optimized for all targets,
    regardless of presence of sibcall patterns.  */
@@ -26,6 +29,13 @@ int main ()
 void
 recurser_void (int n)
 {
+  /* On some architectures like ppc64*, jump threading may thread
+     paths such that there are two calls into track(), one for
+     track(0) and one for track(7).  The track(7) call can be
+     transformed into a jump instead of a call, which means that
+     different calls into track() may end up with a different
+     &stackpos.  This is the reason we disable jump threading for this
+     test.  */
   if (n == 0 || n == 7)
     track (n);
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c
index fae2a1b73ea..ec55f267318 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-3.c
@@ -15,7 +15,7 @@ extern void string_lt_0_fail ();
 extern void string_eq_0_fail ();
 extern void string_gt_0_fail ();
 
-void test_string (char *d, const char *s)
+void test_string_eq_min (char *d, const char *s)
 {
   int n = __builtin_sprintf (d, "%-s", s);
 
@@ -23,13 +23,36 @@ void test_string (char *d, const char *s)
      or INT_MAX.  (This is a white box test based on knowing that
      the optimization computes its own values of the two constants.)  */
   if (n == INT_MIN) string_eq_min_fail ();
+}
+
+void test_string_eq_max (char *d, const char *s)
+{
+  int n = __builtin_sprintf (d, "%-s", s);
+
   if (n == INT_MAX) string_eq_max_fail ();
+}
+
+void test_string_lt_0 (char *d, const char *s)
+{
+  int n = __builtin_sprintf (d, "%-s", s);
 
   /* The return value could be negative when strlen(s) is in excess
      of 4095 (the maximum number of bytes a single directive is required
      to handle).  */
   if (n < 0) string_lt_0_fail ();
+}
+
+void test_string_eq_0 (char *d, const char *s)
+{
+  int n = __builtin_sprintf (d, "%-s", s);
+
   if (n == 0) string_eq_0_fail ();
+}
+
+void test_string_gt_0 (char *d, const char *s)
+{
+  int n = __builtin_sprintf (d, "%-s", s);
+
   if (n > 0) string_gt_0_fail ();
 }
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21001.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21001.c
index 719360a015f..4ea5f21addf 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21001.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21001.c
@@ -6,6 +6,7 @@
 
 /* { dg-do compile } */
 /* { dg-options "-O2 -fno-tree-dominator-opts -fno-tree-fre -fdisable-tree-evrp -fdump-tree-vrp1-details" } */
+/* { dg-additional-options "-fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 int
 foo (int a)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21294.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21294.c
index cc7d4cdf338..b9edabc6234 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21294.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21294.c
@@ -5,6 +5,7 @@
 
 /* { dg-do compile } */
 /* { dg-options "-O2 -fno-tree-dominator-opts -fdisable-tree-evrp -fdump-tree-vrp1-details" } */
+/* { dg-additional-options "-fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 struct f {
   int i;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c
index 484511978a1..fc14af4e662 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread4-details" } */
+/* { dg-options "-O2 -fdisable-tree-thread3 -fdump-tree-thread4-details" } */
 
 struct tree_common 
 { 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c
index 2aee42f4c05..f8d7353fc0e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-evrp-details" } */
+/* { dg-options "-O2 -fdump-tree-evrp-details -fdisable-tree-ethread" } */
 
 extern void g (void);
 extern void bar (int);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21563.c b/gcc/testsuite/gcc.dg/tree-ssa/pr21563.c
index 9c67a3acb46..72dce83ce37 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21563.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21563.c
@@ -2,7 +2,7 @@
    Make sure VRP folds the second "if" statement.  */
 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-dominator-opts -fdisable-tree-evrp -fdump-tree-vrp1-details" } */
+/* { dg-options "-O2 -fno-tree-dominator-opts -fdisable-tree-evrp -fdump-tree-vrp1-details -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 int
 foo (int a)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr49039.c b/gcc/testsuite/gcc.dg/tree-ssa/pr49039.c
index 4bc0a8175a0..a2044d012cc 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr49039.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr49039.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/49039 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 extern void bar (void);
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c
index d44c7dc1882..ddc53fbfbcc 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/61839.  */
 /* { dg-do run } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fdisable-tree-evrp -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fdisable-tree-evrp -fdump-tree-optimized -fdisable-tree-ethread -fdisable-tree-thread1" } */
 /* { dg-require-effective-target int32plus } */
 
 __attribute__ ((noinline))
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c
index 5ceb0738bde..cc322d6e703 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/61839.  */
 /* { dg-do run } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fdump-tree-optimized -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 __attribute__ ((noinline))
 int foo (int a, unsigned b)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c
index cf74e156109..f9fc212f49e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c
@@ -124,7 +124,7 @@ enum STATES FMS( u8 **in , u32 *transitions) {
    to change decisions in switch expansion which in turn can expose new
    jump threading opportunities.  Skip the later tests on aarch64.  */
 /* { dg-final { scan-tree-dump "Jumps threaded: 1\[1-9\]" "thread1" } } */
-/* { dg-final { scan-tree-dump-times "Invalid sum" 3 "thread1" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum" 4 "thread1" } } */
 /* { dg-final { scan-tree-dump-not "optimizing for size" "thread1" } } */
 /* { dg-final { scan-tree-dump-not "optimizing for size" "thread2" } } */
 /* { dg-final { scan-tree-dump-not "optimizing for size" "thread3" { target { ! aarch64*-*-* } } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c
new file mode 100644
index 00000000000..c3ccb5d5b7e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-thread1-details --param logical-op-non-short-circuit=1" } */
+
+// Copied from ssa-dom-thread-11.c
+
+static int *bb_ticks;
+extern void frob (void);
+void
+mark_target_live_regs (int b, int block, int bb_tick)
+{
+  if (b == block && b != -1 && bb_tick == bb_ticks[b])
+      return;
+  if (b != -1)
+    frob ();
+}
+
+/* When the first two conditionals in the first IF are true, but
+   the third conditional is false, then there's a jump threading
+   opportunity to bypass the second IF statement.  */
+/* { dg-final { scan-tree-dump-times "Registering.*jump thread" 1 "thread1"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c
new file mode 100644
index 00000000000..d2689b6c788
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */ 
+/* { dg-options "-O2 -fdump-tree-thread2-details -w" } */
+
+// Copied from ssa-dom-thread-12.c.
+
+typedef long unsigned int size_t;
+union tree_node;
+typedef union tree_node *tree;
+typedef union gimple_statement_d *gimple;
+typedef const union gimple_statement_d *const_gimple;
+union gimple_statement_d
+{
+  unsigned num_ops;
+  tree exp;
+};
+
+unsigned int x;
+static inline tree
+gimple_op (const_gimple gs, unsigned i)
+{
+  if (!(i < gs->num_ops))
+    abort ();
+  return gs->exp;
+}
+
+unsigned char
+scan_function (gimple stmt)
+{
+  unsigned i;
+  for (i = 0; i < stmt->num_ops - 3 ; i++)
+    gimple_call_arg (stmt, i);
+  gimple_op (stmt, 1);
+}
+
+/* The test which bypasses the loop is simplified prior to DOM to check
+   that stmt->num_ops - 3 != 0.  When that test is false, we can derive
+   a value for stmt->num_ops.  That in turn allows us to thread the jump
+   for the conditional at the start of the call to gimple_op.  */
+/* { dg-final { scan-tree-dump-times "Registering.*jump thread" 1 "thread2"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c
new file mode 100644
index 00000000000..79ec067e68a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ethread-details -w --param logical-op-non-short-circuit=1" } */
+
+// Copied from ssa-dom-thread-14.c
+
+enum optab_methods
+{
+  OPTAB_DIRECT,
+  OPTAB_LIB,
+  OPTAB_WIDEN,
+  OPTAB_LIB_WIDEN,
+  OPTAB_MUST_WIDEN
+};
+struct optab_d { };
+typedef struct optab_d *optab;
+void
+expand_shift_1 (int code, int unsignedp, int rotate,
+		optab lshift_optab, optab rshift_arith_optab)
+{
+  int left = (code == 42 || code == 0xde);
+  int attempt;
+  enum optab_methods methods;
+  if (attempt == 0)
+    methods = OPTAB_DIRECT;
+  else if (attempt == 1)
+    methods = OPTAB_WIDEN;
+  if ((!unsignedp || (!left && methods == OPTAB_WIDEN)))
+    {
+      enum optab_methods methods1 = methods;
+      if (unsignedp)
+	methods1 = OPTAB_MUST_WIDEN;
+      expand_binop (left ? lshift_optab : rshift_arith_optab,
+			   unsignedp, methods1);
+    }
+}
+
+/* When UNSIGNEDP is true, LEFT is false and METHOD == OPTAB_WIDEN
+   we will enter the TRUE arm of the conditional and we can thread
+   the test to compute the first argument of the expand_binop
+   call if we look backwards through the boolean logicals.  */
+/* { dg-final { scan-tree-dump-times "Registering.*jump thread" 1 "ethread"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c
new file mode 100644
index 00000000000..e8d1cfc03ac
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c
@@ -0,0 +1,83 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -fdump-tree-vrp-details -fdump-tree-thread1-details --param logical-op-non-short-circuit=1" }  */
+/* { dg-final { scan-tree-dump-times "Registering FSM jump" 8 "thread1" } }  */
+
+/* Copied from ssa-thread-14.  */
+
+void foo (void);
+void bar (void);
+void blah (void);
+
+/* One jump threaded here.  */
+
+void
+baz_1 (int a, int b, int c)
+{
+  if (a && b)
+    foo ();
+  if (!b && c)
+    bar ();
+}
+
+/* One jump threaded here.  */
+
+void
+baz_2 (int a, int b, int c)
+{
+  if (a && b)
+    foo ();
+  if (b || c)
+    bar ();
+}
+
+/* One jump threaded here.  */
+
+void
+baz_3 (int a, int b, int c)
+{
+  if (a && b > 10)
+    foo ();
+  if (b < 5 && c)
+    bar ();
+}
+
+/* Two jumps threaded here.  */
+
+void
+baz_4 (int a, int b, int c)
+{
+  if (a && b)
+    {
+      foo ();
+      if (c)
+        bar ();
+    }
+  if (b && c)
+    blah ();
+}
+
+/* Two jumps threaded here.  */
+
+void
+baz_5 (int a, int b, int c)
+{
+  if (a && b)
+    {
+      foo ();
+      if (c)
+        bar ();
+    }
+  if (!b || !c)
+    blah ();
+}
+
+/* One jump threaded here.  */
+
+void
+baz_6 (int a, int b, int c)
+{
+  if (a == 39 && b == 41)
+    foo ();
+  if (c == 12 || b == 41)
+    bar ();
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-5.c b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-5.c
new file mode 100644
index 00000000000..b7ca99a0152
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-5.c
@@ -0,0 +1,80 @@
+// { dg-do compile }
+// { dg-options "-fgimple -O2 -fdump-tree-thread1-details" }
+
+/* This tests that we can thread BB4->BB999 coming in through the
+   following path:
+
+      latch   many insns
+        |         |
+        V         V
+   6 -> 7 -> 3 -> 4 -> 999
+
+   The ranger based threader cannot thread this because BB4 has too
+   many instructions so it gives up looking back.  However, if we were
+   able to look further, we would notice that a profitable path
+   passing through the loop latch (BB7) exists.
+
+   That is, 3->4->N in isolation is not profitable, but 6->7->3->4->N is.
+
+   It is not clear whether handling this case in the backwards
+   threader is profitable, as it would increase the search space
+   considerably.  The test is being added to note a regression from
+   the old backward threader code.
+
+   This test has been distilled from libphobos/src/std/net/isemail.d.
+
+   The ranger threader stops at the 3->4 subpath with: "did not thread
+   around loop and would copy too many statements".  */
+
+
+extern void bar();
+extern int random();
+
+int __GIMPLE (ssa,startwith("thread1"))
+foo (int key)
+{
+  int context;
+  int _1454;
+
+ __BB(2):
+  goto __BB3;
+
+  // Loop header.
+ __BB(3):
+  context_448 = __PHI (__BB2: 0, __BB7: context_450);
+  if (key_5(D) > 0)
+    goto __BB999;
+  else
+    goto __BB4;
+
+ __BB(4):
+  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();
+  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();
+  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();
+  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();
+  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();
+  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();  bar();
+  switch (context_448) {default: L5; case 0: L999; }
+
+ __BB(5):
+ L5:
+  goto __BB6;
+
+ __BB(6):
+  context_450 = __PHI (__BB5: 0);
+  _1454 = random ();
+  if (_1454 > 0)
+    goto __BB999;
+  else
+    goto __BB7;
+
+  // Loop latch.
+ __BB(7):
+  goto __BB3;
+
+ __BB(999):
+ L999:
+  return 5;
+}
+
+// { dg-final { scan-tree-dump-times "Registering.*jump thread.*incoming edge;  \\(6, 7\\)  \\(7, 3\\)  \\(3, 4\\)  \\(4, 999\\) nocopy" 1 "thread1" { xfail *-*-* } } }
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c
index dac931c18d0..8ef7646282c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details -w" } */
+/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details -w -fdisable-tree-thread1 -fdisable-tree-thread2" } */
+
+/* Note: Threader causes the infinite loop in val & 1 sooner.  */
 
 powi_cost (long n)
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-11.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-11.c
index 5f90613263d..856ab389439 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-11.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-11.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-dom2-details --param logical-op-non-short-circuit=1" } */
+/* { dg-options "-O2 -fdump-tree-dom2-details --param logical-op-non-short-circuit=1 -fdisable-tree-thread1 -fdisable-tree-thread2" } */
 
 static int *bb_ticks;
 extern void frob (void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-12.c
index 63bd12a06a4..bad5e0a782d 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-12.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-12.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */ 
-/* { dg-options "-O2 -fdump-tree-dom2-details -w" } */
+/* { dg-options "-O2 -fdump-tree-dom2-details -w -fdisable-tree-thread2" } */
 typedef long unsigned int size_t;
 union tree_node;
 typedef union tree_node *tree;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-14.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-14.c
index 4e6a911506e..3bc4b3795cb 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-14.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-14.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-dom2-details -w --param logical-op-non-short-circuit=1" } */
+/* { dg-additional-options "-fdisable-tree-thread1 -fdisable-tree-ethread -fdisable-tree-thread2" } */
 
 enum optab_methods
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-18.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-18.c
index d4759b8903b..03872e7a02f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-18.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-18.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */ 
-/* { dg-options "-O2 -fdump-tree-vrp1-details -fdump-tree-dom2-details -std=gnu89 --param logical-op-non-short-circuit=0" } */
+/* { dg-options "-O2 -fdump-tree-vrp1-details -fdump-tree-thread1-details -std=gnu89 --param logical-op-non-short-circuit=0" } */
 
 #include "ssa-dom-thread-4.c"
 
@@ -21,4 +21,5 @@
 	 condition.
 
    All the cases are picked up by VRP1 as jump threads.  */
-/* { dg-final { scan-tree-dump-times "Threaded" 4 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Registering FSM jump" 6 "thread1" } } */
+/* { dg-final { scan-tree-dump-times "Threaded" 2 "vrp1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
index 16a9ef4e28a..c7bf867b084 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
@@ -34,8 +34,8 @@
      SWITCH_BB -> BBx -> BBy -> BBz -> PHI
 
    We now know the value of the switch index at PHI.  */
-/* { dg-final { scan-tree-dump-times "FSM" 6 "thread1" } } */
-/* { dg-final { scan-tree-dump-times "FSM" 1 "thread2" } } */
+/* { dg-final { scan-tree-dump-times "Registering FSM jump" 6 "thread1" } } */
+/* { dg-final { scan-tree-dump-times "Registering FSM jump" 1 "thread2" } } */
 
 int sum0, sum1, sum2, sum3;
 int foo (char *s, char **ret)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
index bad5bc1d003..1c2d12aa9ea 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-thread2-stats -fdump-tree-dom2-stats -fdump-tree-thread3-stats -fdump-tree-dom3-stats -fdump-tree-vrp2-stats -fno-guess-branch-probability" } */
+/* { dg-additional-options "--param=threader-mode=legacy" } */
 
 /* Here we have the same issue as was commented in ssa-dom-thread-6.c.
    The PHI coming into the threader has a lot more constants, so the
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c
index b3d610204da..5e74c78e7b0 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-fre1-details" } */
+/* { dg-options "-O -fdump-tree-fre1-details -fdisable-tree-ethread" } */
 
 int foo (int i)
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c
index 67e1e89ecd3..672a54e07db 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-vrp2-details --param logical-op-non-short-circuit=1" } */
+/* { dg-additional-options "-fdisable-tree-ethread -fdisable-tree-thread1 -fdisable-tree-thread2" } */
 /* { dg-final { scan-tree-dump-not "IRREDUCIBLE_LOOP" "vrp2" } } */
 
 void abort (void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
index fb9840e95d5..8f554641b28 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
@@ -2,7 +2,7 @@
 /* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details -fno-finite-loops --param early-inlining-insns=14 -fno-inline-functions" } */
 /* { dg-final { scan-tree-dump "FSM" "thread2" } } */
 /* { dg-final { scan-tree-dump "FSM" "thread3" } } */
-/* { dg-final { scan-tree-dump "FSM" "thread4" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "FSM" "thread4" } } */
 
 typedef struct bitmap_head_def *bitmap;
 typedef const struct bitmap_head_def *const_bitmap;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c
index 38661c88822..f9152b9358f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-additional-options "-O2 -fdump-tree-vrp-details --param logical-op-non-short-circuit=1" }  */
+/* { dg-additional-options "-fdisable-tree-thread1" } */
 /* { dg-final { scan-tree-dump-times "Threaded jump" 8 "vrp1" } }  */
 
 void foo (void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp02.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp02.c
index 4be538f5944..2285c55c7d2 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp02.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp02.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fdelete-null-pointer-checks -fdisable-tree-evrp" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fdelete-null-pointer-checks -fdisable-tree-evrp -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 struct A
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c
index bafb65a53d6..1d7ea4e8ffb 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 struct A
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp05.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp05.c
index 8c611e91fcd..c17cd1b5738 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp05.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp05.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fno-early-inlining" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fno-early-inlining -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 
 inline int ten()
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp06.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp06.c
index a872bc43731..acb03c29aa4 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp06.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp06.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 int baz (void);
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp07.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp07.c
index 0f3f280e4e1..31a541522f5 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp07.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp07.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-fre -fdisable-tree-evrp -fdump-tree-vrp1-details -fdelete-null-pointer-checks" } */
+/* { dg-options "-O2 -fno-tree-fre -fdisable-tree-evrp -fdump-tree-vrp1-details -fdelete-null-pointer-checks -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 int
 foo (int i, int *p)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp09.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp09.c
index 56cc50c5f1d..fad0051cb09 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp09.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp09.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-fre -fdisable-tree-evrp -fdump-tree-vrp1 -std=gnu89" } */
+/* { dg-options "-O2 -fno-tree-fre -fdisable-tree-evrp -fdump-tree-vrp1 -std=gnu89 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 foo (int *p)
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c
index 40373fde163..98a8da6d05e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fwrapv -O1 -ftree-vrp -fdisable-tree-evrp -fdump-tree-vrp1" } */
+/* { dg-options "-fwrapv -O1 -ftree-vrp -fdisable-tree-evrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 #include <limits.h>
 extern void abort ();
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp20.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp20.c
index 4a3b0d73648..f9df67ffb05 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp20.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp20.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fwrapv -O1 -fno-tree-fre -fdisable-tree-evrp -ftree-vrp -fdump-tree-vrp1" } */
+/* { dg-options "-fwrapv -O1 -fno-tree-fre -fdisable-tree-evrp -ftree-vrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 extern void abort ();
 extern void exit (int);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp33.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp33.c
index f1d3863943e..88833eb701e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp33.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp33.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fno-tree-fre -fdisable-tree-evrp" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fno-tree-fre -fdisable-tree-evrp -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 /* This is from PR14052.  */
 
diff --git a/gcc/testsuite/gcc.dg/uninit-pred-9_b.c b/gcc/testsuite/gcc.dg/uninit-pred-9_b.c
index d9ae75e0765..d46d6659a61 100644
--- a/gcc/testsuite/gcc.dg/uninit-pred-9_b.c
+++ b/gcc/testsuite/gcc.dg/uninit-pred-9_b.c
@@ -1,6 +1,7 @@
 
 /* { dg-do compile } */
 /* { dg-options "-Wuninitialized -O2" } */
+/* { dg-xfail-if "threading shuffles things around" { ppc64*-*-* } } */
 
 int g;
 void bar();
diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-16.c b/gcc/testsuite/gcc.dg/vect/bb-slp-16.c
index e68a9b62535..664e93e9b60 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-16.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-16.c
@@ -1,5 +1,8 @@
 /* { dg-require-effective-target vect_int } */
 
+/* See note below as to why we disable threading.  */
+/* { dg-additional-options "-fdisable-tree-thread1" } */
+
 #include <stdarg.h>
 #include "tree-vect.h"
 
@@ -27,6 +30,10 @@ main1 (int dummy)
       *pout++ = *pin++ + a;
       *pout++ = *pin++ + a;
       *pout++ = *pin++ + a;
+      /* In some architectures like ppc64, jump threading may thread
+	 the iteration where i==0 such that we no longer optimize the
+	 BB.  Another alternative to disable jump threading would be
+	 to wrap the read from `i' into a function returning i.  */
       if (arr[i] = i)
         a = i;
       else
diff --git a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
index 1ea111795a4..57192791857 100644
--- a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
+++ b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
@@ -1,6 +1,6 @@
 /* { dg-do run } */
 /* { dg-require-effective-target avx2 } */
-/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details" } */
+/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details -fdisable-tree-thread1" } */
 
 #include "avx2-check.h"
 #define N 64
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 7dd8594e3d4..f29892042e2 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -36,6 +36,12 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-phinodes.h"
 #include "tree-inline.h"
 #include "tree-vectorizer.h"
+#include "value-range.h"
+#include "gimple-range.h"
+#include "tree-ssa-threadedge.h"
+#include "gimple-range-path.h"
+#include "ssa.h"
+#include "tree-cfgcleanup.h"
 
 // Path registry for the backwards threader.  After all paths have been
 // registered with register_path(), thread_through_all_blocks() is called
@@ -71,13 +77,414 @@ private:
   const bool m_speed_p;
 };
 
+// Ranger based backwards threader.
+
+class back_threader
+{
+  // Temporary until we remove old code.
+  friend bool path_is_unreachable_p (const vec<jump_thread_edge *> &);
+
+public:
+  back_threader (back_threader_profitability &, back_threader_registry &);
+  ~back_threader ();
+  void find_paths (basic_block bb, tree name);
+
+private:
+  void maybe_register_path (edge taken_edge);
+  bool find_paths_to_names (basic_block bb, bitmap imports);
+  bool resolve_def (tree name, bitmap interesting, vec<tree> worklist);
+  bool resolve_phi (gphi *phi, bitmap imports);
+  edge find_taken_edge (const vec<basic_block> &path);
+  edge find_taken_edge_cond (const vec<basic_block> &path, gcond *);
+  edge find_taken_edge_switch (const vec<basic_block> &path, gswitch *);
+
+  back_threader_registry &m_registry;
+  back_threader_profitability &m_profit;
+  gimple_ranger m_ranger;
+  path_range_query m_solver;
+
+  // Current path being analyzed.
+  auto_vec<basic_block> m_path;
+  // Hash to mark visited BBs while analyzing a path.
+  hash_set<basic_block> m_visited_bbs;
+  // The set of SSA names, any of which could potentially change the
+  // value of the final conditional in a path.
+  bitmap m_imports;
+  // The last statement in the path.
+  gimple *m_last_stmt;
+  // This is a bit of a wart.  It's used to pass the LHS SSA name to
+  // the profitability engine.
+  tree m_name;
+  // Marker to differentiate unreachable edges.
+  static const edge UNREACHABLE_EDGE;
+};
+
+// Used to differentiate unreachable edges, so we may stop the search
+// in the given direction.
+const edge back_threader::UNREACHABLE_EDGE = (edge) -1;
+
+back_threader::back_threader (back_threader_profitability &profit,
+			      back_threader_registry &registry)
+  : m_registry (registry),
+    m_profit (profit),
+    m_solver (m_ranger)
+{
+  m_last_stmt = NULL;
+  m_imports = BITMAP_ALLOC (NULL);
+}
+
+back_threader::~back_threader ()
+{
+  m_path.release ();
+  BITMAP_FREE (m_imports);
+}
+
+// Register the current path for jump threading if it's profitable to
+// do so.  TAKEN_EDGE is the known edge out of the path.
+
+void
+back_threader::maybe_register_path (edge taken_edge)
+{
+  bool irreducible = false;
+  bool profitable
+    = m_profit.profitable_path_p (m_path, m_name, taken_edge, &irreducible);
+
+  if (profitable)
+    {
+      m_registry.register_path (m_path, taken_edge);
+
+      if (irreducible)
+	vect_free_loop_info_assumptions (m_path[0]->loop_father);
+    }
+}
+
+// Return the known taken edge out of a path.  If the path can be
+// determined to be unreachable, return UNREACHABLE_EDGE.  If no
+// outgoing edge can be calculated, return NULL.
+
+edge
+back_threader::find_taken_edge (const vec<basic_block> &path)
+{
+  gcc_checking_assert (path.length () > 1);
+  switch (gimple_code (m_last_stmt))
+    {
+    case GIMPLE_COND:
+      return find_taken_edge_cond (path, as_a<gcond *> (m_last_stmt));
+
+    case GIMPLE_SWITCH:
+      return find_taken_edge_switch (path, as_a<gswitch *> (m_last_stmt));
+
+    default:
+      return NULL;
+    }
+}
+
+// Same as find_taken_edge, but for paths ending in a switch.
+
+edge
+back_threader::find_taken_edge_switch (const vec<basic_block> &path,
+				       gswitch *sw)
+{
+  tree name = gimple_switch_index (sw);
+  int_range_max r;
+
+  m_solver.precompute_ranges (path, m_imports);
+  m_solver.range_of_expr (r, name, sw);
+
+  if (r.undefined_p ())
+    return UNREACHABLE_EDGE;
+
+  if (r.varying_p ())
+    return NULL;
+
+  tree val;
+  if (r.singleton_p (&val))
+    return ::find_taken_edge (gimple_bb (sw), val);
+
+  return NULL;
+}
+
+// Same as find_taken_edge, but for paths ending in a GIMPLE_COND.
+
+edge
+back_threader::find_taken_edge_cond (const vec<basic_block> &path,
+				     gcond *cond)
+{
+  m_solver.precompute_ranges (path, m_imports);
+
+  // Check if either operand is unreachable since this knowledge could
+  // help the caller cut down the search space.
+  int_range_max r;
+  m_solver.range_of_expr (r, gimple_cond_lhs (cond));
+  if (r.undefined_p ())
+    return UNREACHABLE_EDGE;
+  m_solver.range_of_expr (r, gimple_cond_rhs (cond));
+  if (r.undefined_p ())
+    return UNREACHABLE_EDGE;
+
+  m_solver.range_of_stmt (r, cond);
+
+  int_range<2> true_range (boolean_true_node, boolean_true_node);
+  int_range<2> false_range (boolean_false_node, boolean_false_node);
+
+  if (r == true_range || r == false_range)
+    {
+      edge e_true, e_false;
+      basic_block bb = gimple_bb (cond);
+      extract_true_false_edges_from_block (bb, &e_true, &e_false);
+      return r == true_range ? e_true : e_false;
+    }
+  return NULL;
+}
+
+// Populate a vector of trees from a bitmap.
+
+static inline void
+populate_worklist (vec<tree> worklist, bitmap bits)
+{
+  bitmap_iterator bi;
+  unsigned i;
+
+  EXECUTE_IF_SET_IN_BITMAP (bits, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+      worklist.quick_push (name);
+    }
+}
+
+// If any of the incoming edges for a PHI resolves the current path,
+// register the path(s), and return TRUE.
+
+bool
+back_threader::resolve_phi (gphi *phi, bitmap interesting)
+{
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_phi_result (phi)))
+    return true;
+
+  bool done = false;
+  for (size_t i = 0; i < gimple_phi_num_args (phi); ++i)
+    {
+      edge e = gimple_phi_arg_edge (phi, i);
+
+      // This is like path_crosses_loops in profitable_path_p but more
+      // restrictive, since profitable_path_p allows threading the
+      // first block because it would be redirected anyhow.
+      //
+      // If we loosened the restriction and used profitable_path_p()
+      // here instead, we would peel off the first iterations of loops
+      // in places like tree-ssa/pr14341.c.
+      bool profitable_p = m_path[0]->loop_father == e->src->loop_father;
+      if (!profitable_p)
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file,
+		     "  FAIL: path through PHI in bb%d (incoming bb:%d) crosses loop\n",
+		     e->dest->index, e->src->index);
+	  continue;
+	}
+
+      tree arg = gimple_phi_arg_def (phi, i);
+      if (TREE_CODE (arg) == SSA_NAME)
+	{
+	  unsigned v = SSA_NAME_VERSION (arg);
+
+	  // Avoid loops as in: x_5 = PHI <x_5(2), ...>.
+	  if (bitmap_bit_p (interesting, v))
+	    continue;
+
+	  bitmap_set_bit (interesting, v);
+	  bitmap_set_bit (m_imports, v);
+	  done |= find_paths_to_names (e->src, interesting);
+	  bitmap_clear_bit (interesting, v);
+	}
+      else if (TREE_CODE (arg) == INTEGER_CST)
+	{
+	  m_path.safe_push (e->src);
+	  edge taken_edge = find_taken_edge (m_path);
+	  if (taken_edge && taken_edge != UNREACHABLE_EDGE)
+	    {
+	      maybe_register_path (taken_edge);
+	      done = true;
+	    }
+	  m_path.pop ();
+	}
+    }
+  return done;
+}
+
+// If the definition of NAME resolves the current path, register the
+// path, and return TRUE.
+
+bool
+back_threader::resolve_def (tree name, bitmap interesting, vec<tree> worklist)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+
+  // Handle PHIs.
+  if (is_a<gphi *> (def_stmt)
+      && resolve_phi (as_a<gphi *> (def_stmt), interesting))
+    return true;
+
+  // Defer copies of SSAs by adding the source to the worklist.
+  if (gimple_assign_single_p (def_stmt)
+      && TREE_CODE (gimple_assign_rhs1 (def_stmt)) == SSA_NAME)
+    {
+      tree rhs = gimple_assign_rhs1 (def_stmt);
+      bitmap_set_bit (m_imports, SSA_NAME_VERSION (rhs));
+      bitmap_set_bit (interesting, SSA_NAME_VERSION (rhs));
+      worklist.safe_push (rhs);
+    }
+  return false;
+}
+
+// Find jump threading paths to any of the SSA names in the
+// INTERESTING bitmap, and register any such paths.
+//
+// Return TRUE if no further processing past this block is necessary.
+// This is because we've either registered a path, or because there is
+// nothing of interest beyond this block.
+//
+// BB is the current block being processed.
+
+bool
+back_threader::find_paths_to_names (basic_block bb, bitmap interesting)
+{
+  if (m_visited_bbs.add (bb))
+    return true;
+
+  m_path.safe_push (bb);
+
+  if (m_path.length () > 1
+      && !m_profit.profitable_path_p (m_path, m_name, NULL))
+    {
+      m_path.pop ();
+      m_visited_bbs.remove (bb);
+      return false;
+    }
+
+  auto_bitmap processed;
+  unsigned i;
+  bool done = false;
+
+  // We use a worklist instead of iterating through the bitmap,
+  // because we may add new items in-flight.
+  auto_vec<tree> worklist (bitmap_count_bits (interesting));
+  populate_worklist (worklist, interesting);
+  while (!worklist.is_empty ())
+    {
+      tree name = worklist.pop ();
+      unsigned i = SSA_NAME_VERSION (name);
+      basic_block def_bb = gimple_bb (SSA_NAME_DEF_STMT (name));
+
+      // Process any names defined in this block.
+      if (def_bb == bb)
+	{
+	  bitmap_set_bit (processed, i);
+
+	  if (resolve_def (name, interesting, worklist))
+	    {
+	      done = true;
+	      goto leave_bb;
+	    }
+	}
+      // Examine blocks that define or export an interesting SSA,
+      // since they may compute a range which resolves this path.
+      if ((def_bb == bb
+	   || bitmap_bit_p (m_ranger.gori ().exports (bb), i))
+	  && m_path.length () > 1)
+	{
+	  edge taken_edge = find_taken_edge (m_path);
+	  if (taken_edge)
+	    {
+	      if (taken_edge != UNREACHABLE_EDGE)
+		maybe_register_path (taken_edge);
+
+	      done = true;
+	      goto leave_bb;
+	    }
+	}
+    }
+
+  // If there are interesting names not yet processed, keep looking.
+  bitmap_and_compl_into (interesting, processed);
+  if (!bitmap_empty_p (interesting))
+    {
+      edge_iterator iter;
+      edge e;
+      FOR_EACH_EDGE (e, iter, bb->preds)
+	if ((e->flags & EDGE_ABNORMAL) == 0)
+	  done |= find_paths_to_names (e->src, interesting);
+    }
+
+ leave_bb:
+  bitmap_iterator bi;
+  EXECUTE_IF_SET_IN_BITMAP (processed, 0, i, bi)
+    bitmap_set_bit (interesting, i);
+
+  m_path.pop ();
+  m_visited_bbs.remove (bb);
+  return done;
+}
+
+// Search backwards from BB looking for paths where the final
+// conditional out of BB can be determined.  NAME is the LHS of the
+// final conditional.  Register such paths for jump threading.
+
+void
+back_threader::find_paths (basic_block bb, tree name)
+{
+  gimple *stmt = last_stmt (bb);
+  if (!stmt
+      || (gimple_code (stmt) != GIMPLE_COND
+	  && gimple_code (stmt) != GIMPLE_SWITCH))
+    return;
+
+  if (EDGE_COUNT (bb->succs) > 1
+      || single_succ_to_potentially_threadable_block (bb))
+    {
+      m_last_stmt = stmt;
+      m_visited_bbs.empty ();
+      m_path.truncate (0);
+      m_name = name;
+      bitmap_clear (m_imports);
+
+      auto_bitmap interesting;
+      bitmap_copy (m_imports, m_ranger.gori ().imports (bb));
+      bitmap_copy (interesting, m_imports);
+      find_paths_to_names (bb, interesting);
+    }
+}
+
+// Dump a sequence of BBs through the CFG.
+
+DEBUG_FUNCTION void
+dump_path (FILE *dump_file, const vec<basic_block> &path)
+{
+  for (size_t i = 0; i < path.length (); ++i)
+    {
+      fprintf (dump_file, "BB%d", path[i]->index);
+      if (i + 1 < path.length ())
+	fprintf (dump_file, " <- ");
+    }
+  fprintf (dump_file, "\n");
+}
+
+DEBUG_FUNCTION void
+debug (const vec <basic_block> &path)
+{
+  dump_path (stderr, path);
+}
+
 class thread_jumps
 {
 public:
   thread_jumps (bool speed_p = true)
-    : m_profit (speed_p), m_registry (param_max_fsm_thread_paths)
+    : m_profit (speed_p),
+      m_registry (param_max_fsm_thread_paths),
+      m_back_threader (m_profit, m_registry)
   { }
   void find_jump_threads_backwards (basic_block bb);
+  void find_jump_threads_backwards_with_ranger (basic_block bb);
   bool thread_through_all_blocks ();
 
 private:
@@ -102,6 +509,7 @@ private:
   tree m_name;
   back_threader_profitability m_profit;
   back_threader_registry m_registry;
+  back_threader m_back_threader;
 };
 
 // Perform the actual jump threading for the all queued paths.
@@ -548,8 +956,8 @@ back_threader_registry::register_path (const vec<basic_block> &m_path,
 						EDGE_NO_COPY_SRC_BLOCK);
   jump_thread_path->safe_push (x);
 
-  m_lowlevel_registry.register_jump_thread (jump_thread_path);
-  ++m_threaded_paths;
+  if (m_lowlevel_registry.register_jump_thread (jump_thread_path))
+    ++m_threaded_paths;
   return true;
 }
 
@@ -818,6 +1226,12 @@ thread_jumps::fsm_find_control_statement_thread_paths (tree name)
 void
 thread_jumps::find_jump_threads_backwards (basic_block bb)
 {
+  if (param_threader_mode & THREADER_MODE_RANGER)
+    {
+      find_jump_threads_backwards_with_ranger (bb);
+      return;
+    }
+
   gimple *stmt = get_gimple_control_stmt (bb);
   if (!stmt)
     return;
@@ -850,6 +1264,28 @@ thread_jumps::find_jump_threads_backwards (basic_block bb)
   fsm_find_control_statement_thread_paths (name);
 }
 
+// Like find_jump_threads_backwards(), but using ranger.
+
+void
+thread_jumps::find_jump_threads_backwards_with_ranger (basic_block bb)
+{
+  gimple *stmt = get_gimple_control_stmt (bb);
+  if (!stmt)
+    return;
+
+  enum gimple_code code = gimple_code (stmt);
+  tree name = NULL;
+  if (code == GIMPLE_SWITCH)
+    name = gimple_switch_index (as_a <gswitch *> (stmt));
+  else if (code == GIMPLE_GOTO)
+    name = gimple_goto_dest (stmt);
+  else if (code == GIMPLE_COND)
+    name = gimple_cond_lhs (stmt);
+
+  m_name = name;
+  m_back_threader.find_paths (bb, name);
+}
+
 namespace {
 
 const pass_data pass_data_thread_jumps =
@@ -883,12 +1319,12 @@ pass_thread_jumps::gate (function *fun ATTRIBUTE_UNUSED)
   return flag_expensive_optimizations;
 }
 
+// Try to thread blocks in FUN.  Return TRUE if any jump thread paths were
+// registered.
 
-unsigned int
-pass_thread_jumps::execute (function *fun)
+static bool
+try_thread_blocks (function *fun)
 {
-  loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
-
   /* Try to thread each block with more than one successor.  */
   thread_jumps threader;
   basic_block bb;
@@ -897,7 +1333,30 @@ pass_thread_jumps::execute (function *fun)
       if (EDGE_COUNT (bb->succs) > 1)
 	threader.find_jump_threads_backwards (bb);
     }
-  bool changed = threader.thread_through_all_blocks ();
+  return threader.thread_through_all_blocks ();
+}
+
+unsigned int
+pass_thread_jumps::execute (function *fun)
+{
+  loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
+
+  // Iterative mode is a testing construct and is not meant for public
+  // consumption.  It is OFF by default.
+  bool iterative = param_threader_iterative;
+
+  bool changed = false;
+  while (try_thread_blocks (fun))
+    {
+      changed = true;
+
+      if (!iterative)
+	break;
+
+      if ((param_threader_mode & THREADER_MODE_RANGER) == 0)
+	break;
+      cleanup_tree_cfg (TODO_update_ssa);
+    }
 
   loop_optimizer_finalize ();
   return changed ? TODO_cleanup_cfg : 0;
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 6ce32644aa5..ea5c37a2c65 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -1335,6 +1335,18 @@ jump_threader::thread_across_edge (edge e)
   m_avail_exprs_stack->pop_to_marker ();
 }
 
+/* Return TRUE if BB has a single successor to a block with multiple
+   incoming and outgoing edges.  */
+
+bool
+single_succ_to_potentially_threadable_block (basic_block bb)
+{
+  int flags = (EDGE_IGNORE | EDGE_COMPLEX | EDGE_ABNORMAL);
+  return (single_succ_p (bb)
+	  && (single_succ_edge (bb)->flags & flags) == 0
+	  && potentially_threadable_block (single_succ (bb)));
+}
+
 /* Examine the outgoing edges from BB and conditionally
    try to thread them.  */
 
@@ -1348,12 +1360,8 @@ jump_threader::thread_outgoing_edges (basic_block bb)
      outgoing edges, then we may be able to thread the edge, i.e., we
      may be able to statically determine which of the outgoing edges
      will be traversed when the incoming edge from BB is traversed.  */
-  if (single_succ_p (bb)
-      && (single_succ_edge (bb)->flags & flags) == 0
-      && potentially_threadable_block (single_succ (bb)))
-    {
-      thread_across_edge (single_succ_edge (bb));
-    }
+  if (single_succ_to_potentially_threadable_block (bb))
+    thread_across_edge (single_succ_edge (bb));
   else if ((last = last_stmt (bb))
 	   && gimple_code (last) == GIMPLE_COND
 	   && EDGE_COUNT (bb->succs) == 2
diff --git a/gcc/tree-ssa-threadedge.h b/gcc/tree-ssa-threadedge.h
index 48735f2bc27..c78eef25219 100644
--- a/gcc/tree-ssa-threadedge.h
+++ b/gcc/tree-ssa-threadedge.h
@@ -29,7 +29,7 @@ class jump_threader
 {
 public:
   jump_threader (class const_and_copies *,
-		 avail_exprs_stack *,
+		 class avail_exprs_stack *,
 		 class jump_threader_simplifier *,
 		 class evrp_range_analyzer * = NULL);
   ~jump_threader ();
@@ -83,6 +83,7 @@ protected:
 };
 
 extern void propagate_threaded_block_debug_into (basic_block, basic_block);
+extern bool single_succ_to_potentially_threadable_block (basic_block);
 
 // ?? All this ssa_name_values stuff is the store of values for
 // avail_exprs_stack and const_and_copies, so it really belongs in the
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index f496dd3eb8c..29cf010e9ca 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "dbgcnt.h"
 #include "tree-cfg.h"
 #include "tree-vectorizer.h"
+#include "tree-pass.h"
 
 /* Given a block B, update the CFG and SSA graph to reflect redirecting
    one or more in-edges to B to instead reach the destination of an
@@ -2741,15 +2742,17 @@ jump_thread_path_registry::thread_through_all_blocks
 
    E is the edge we can thread, E2 is the new target edge, i.e., we
    are effectively recording that E->dest can be changed to E2->dest
-   after fixing the SSA graph.  */
+   after fixing the SSA graph.
 
-void
+   Return TRUE if PATH was successfully threaded.  */
+
+bool
 jump_thread_path_registry::register_jump_thread (vec<jump_thread_edge *> *path)
 {
   if (!dbg_cnt (registered_jump_thread))
     {
       path->release ();
-      return;
+      return false;
     }
 
   /* First make sure there are no NULL outgoing edges on the jump threading
@@ -2766,7 +2769,7 @@ jump_thread_path_registry::register_jump_thread (vec<jump_thread_edge *> *path)
 	    }
 
 	  path->release ();
-	  return;
+	  return false;
 	}
 
       /* Only the FSM threader is allowed to thread across
@@ -2780,6 +2783,7 @@ jump_thread_path_registry::register_jump_thread (vec<jump_thread_edge *> *path)
     dump_jump_thread_path (dump_file, *path, true);
 
   m_paths.safe_push (path);
+  return true;
 }
 
 /* Return how many uses of T there are within BB, as long as there
diff --git a/gcc/tree-ssa-threadupdate.h b/gcc/tree-ssa-threadupdate.h
index b806caee581..2030bda15af 100644
--- a/gcc/tree-ssa-threadupdate.h
+++ b/gcc/tree-ssa-threadupdate.h
@@ -63,7 +63,7 @@ class jump_thread_path_registry
 public:
   jump_thread_path_registry ();
   ~jump_thread_path_registry ();
-  void register_jump_thread (vec<jump_thread_edge *> *);
+  bool register_jump_thread (vec<jump_thread_edge *> *);
   void remove_jump_threads_including (edge);
   bool thread_through_all_blocks (bool);
   jump_thread_edge *allocate_thread_edge (edge e, jump_thread_edge_type t);
diff --git a/libgomp/testsuite/libgomp.graphite/force-parallel-4.c b/libgomp/testsuite/libgomp.graphite/force-parallel-4.c
index d2af142e8fe..ef6f64d229b 100644
--- a/libgomp/testsuite/libgomp.graphite/force-parallel-4.c
+++ b/libgomp/testsuite/libgomp.graphite/force-parallel-4.c
@@ -1,4 +1,5 @@
 /* Autopar with IF conditions.  */
+/* { dg-additional-options "-fdisable-tree-thread1" } */
 
 void abort();
 
diff --git a/libgomp/testsuite/libgomp.graphite/force-parallel-8.c b/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
index 32ba5ab84ef..a97eb97acf6 100644
--- a/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
+++ b/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
@@ -1,3 +1,5 @@
+/* { dg-additional-options "-fdisable-tree-thread1" } */
+
 #define N 1500
 
 int x[N][N], y[N];
-- 
2.31.1
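
[Editorial sketch, not part of the patch.]  find_taken_edge_switch in
the patch above maps the solver's answer to one of three outcomes:
an undefined range means the path is unreachable, a varying range
means nothing is known, and a singleton means the outgoing edge is
known.  The standalone C++ below mirrors only that order of checks.
toy_range, outcome, decision and classify are hypothetical names made
up for illustration; they are not GCC's irange API, and the real code
returns an edge rather than a value.

```cpp
// Toy stand-in for a value range; NOT GCC's irange or int_range_max.
struct toy_range
{
  bool undefined;   // empty range: the path cannot be executed
  bool varying;     // nothing is known about the value
  long lo, hi;      // inclusive bounds, used only if neither flag is set

  // Modeled loosely on the singleton_p (&val) call in the patch.
  bool singleton_p (long *val) const
  {
    if (undefined || varying || lo != hi)
      return false;
    *val = lo;
    return true;
  }
};

enum class outcome { unreachable, unknown, known };

struct decision
{
  outcome kind;
  long value;       // meaningful only when kind == outcome::known
};

// Same order of checks as find_taken_edge_switch: undefined first,
// then varying, then singleton; anything else stays unknown.
static decision
classify (const toy_range &r)
{
  if (r.undefined)
    return { outcome::unreachable, 0 };
  if (r.varying)
    return { outcome::unknown, 0 };
  long val;
  if (r.singleton_p (&val))
    return { outcome::known, val };
  return { outcome::unknown, 0 };
}
```

In the patch, the "unreachable" outcome corresponds to UNREACHABLE_EDGE,
which lets the caller stop searching in that direction without
registering a path.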



* Re: [PATCH 2/2] Backwards jump threader rewrite with ranger.
  2021-07-15 14:57     ` Aldy Hernandez
@ 2021-07-26 12:43       ` Aldy Hernandez
  2021-07-28 14:32       ` Jeff Law
  1 sibling, 0 replies; 20+ messages in thread
From: Aldy Hernandez @ 2021-07-26 12:43 UTC (permalink / raw)
  To: GCC patches

PING * 2

BTW, this is also needed for:

a) evrp_range_analyzer removal from DOM threader (I have follow-up patches).

b) VRP replacement with evrp along with VRP threader removal (I also
have patches).

Thanks.
Aldy


On Thu, Jul 15, 2021 at 4:57 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>
> As mentioned in my previous email, these are some minor changes to the
> previous revision.  All I'm changing here is the call into the solver
> to use range_of_expr and range_of_stmt.  Everything else remains the
> same.
>
> Tested on x86-64 Linux.
>
> On Mon, Jul 5, 2021 at 5:39 PM Aldy Hernandez <aldyh@redhat.com> wrote:
> >
> > PING.
> >
> > Aldy



* Re: [PATCH 1/2] Implement basic block path solver.
  2021-07-02  8:13     ` Aldy Hernandez
  2021-07-02 13:16       ` Andrew MacLeod
@ 2021-07-26 19:10       ` Jeff Law
  2021-07-27  9:58         ` Aldy Hernandez
  1 sibling, 1 reply; 20+ messages in thread
From: Jeff Law @ 2021-07-26 19:10 UTC (permalink / raw)
  To: Aldy Hernandez, GCC patches; +Cc: Andrew MacLeod, Richard Biener, Martin Sebor



On 7/2/2021 2:13 AM, Aldy Hernandez wrote:
>
>
> On 7/2/21 12:20 AM, Jeff Law wrote:
>>
>>
>> On 6/28/2021 10:21 AM, Aldy Hernandez wrote:
>
>>> +// Internal construct to help facilitate debugging of solver.
>>> +#define DEBUG_SOLVER getenv("DEBUG")
>> Shouldn't this really be a property of what pass is using the solver 
>> and whether or not the appropriate dump flag is on for that pass?
>
> Whoops.  This was a private construct used for debugging the solver. 
> I've changed it to:
>
> +#define DEBUG_SOLVER (0 && dump_file)
I would probably argue that the #define should disappear and the code 
should be checking the current dump state for the current pass.   If you 
don't want to keep the debugging output, then remove it  :-)  I think 
that can be handled in a follow-up patch.


>>
>>
>>> +
>>> +// Return the range of the result of PHI in R.
>>> +
>>> +void
>>> +path_solver::ssa_range_in_phi (irange &r, gphi *phi)
>>> +{
>>> +  tree name = gimple_phi_result (phi);
>>> +  basic_block bb = gimple_bb (phi);
>>> +
>>> +  // We experimented with querying ranger's range_on_entry here, but
>>> +  // the performance penalty was too high, for hardly any improvements.
>>> +  if (at_entry ())
>>> +    {
>>> +      r.set_varying (TREE_TYPE (name));
>>> +      return;
>>> +    }
>>> +
>>> +  basic_block prev = prev_bb ();
>>> +  edge e_in = find_edge (prev, bb);
>>> +  for (size_t i = 0; i < gimple_phi_num_args (phi); ++i)
>> It's probably not important in practice, but you're going to end up 
>> calling gimple_phi_num_args every iteration of this loop. Its value 
>> isn't generally subject to LICM.
>
> I was just following standard practice:
Yea.  I doubt we're at all consistent with that.  In fact, I'd bet I'm a 
serial offender.  We should probably try to do better, though, since 
we're going to get a function call every loop iteration when the value 
is invariant.  Fixing other instances should be considered 
pre-approved.  ISTM a separate patch to fix that up would be fine.

Oh, and showing the # of instances in FOR statements is useful, but you 
didn't show the other cases (ie, where we shove it into a variable and 
use that as a loop bound).  I'd estimate there's probably 40-50 of 
those.  So it's really a mixed bag.
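
As a minimal standalone sketch of the loop-bound point above (this is
not GCC code: mock_phi, mock_phi_num_args and the call counter are
hypothetical stand-ins for gphi and gimple_phi_num_args), the two loop
shapes can be compared by counting accessor calls:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a PHI node; not GCC's gphi.
struct mock_phi
{
  std::vector<int> args;
};

// Number of times the accessor below has been called.
static unsigned num_args_calls = 0;

// Stand-in for an out-of-line accessor such as gimple_phi_num_args.
// Each call bumps the counter so the number of calls can be observed.
static std::size_t
mock_phi_num_args (const mock_phi &phi)
{
  ++num_args_calls;
  return phi.args.size ();
}

// Accessor in the loop condition: it is called once per condition test.
static int
sum_unhoisted (const mock_phi &phi)
{
  int sum = 0;
  for (std::size_t i = 0; i < mock_phi_num_args (phi); ++i)
    sum += phi.args[i];
  return sum;
}

// Bound read once into a local before the loop.
static int
sum_hoisted (const mock_phi &phi)
{
  int sum = 0;
  const std::size_t nargs = mock_phi_num_args (phi);
  for (std::size_t i = 0; i < nargs; ++i)
    sum += phi.args[i];
  return sum;
}
```

With a three-argument mock PHI, the first form calls the accessor four
times (three passing tests plus the failing one) and the second calls
it once; both return the same sum.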

OK.

Jeff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] Implement basic block path solver.
  2021-07-26 19:10       ` Jeff Law
@ 2021-07-27  9:58         ` Aldy Hernandez
  2021-07-27 15:18           ` Jeff Law
  0 siblings, 1 reply; 20+ messages in thread
From: Aldy Hernandez @ 2021-07-27  9:58 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC patches, Andrew MacLeod, Richard Biener, Martin Sebor

[-- Attachment #1: Type: text/plain, Size: 2146 bytes --]

On Mon, Jul 26, 2021 at 9:10 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 7/2/2021 2:13 AM, Aldy Hernandez wrote:
> >
> >
> > On 7/2/21 12:20 AM, Jeff Law wrote:
> >>
> >>
> >> On 6/28/2021 10:21 AM, Aldy Hernandez wrote:
> >
> >>> +// Internal construct to help facilitate debugging of solver.
> >>> +#define DEBUG_SOLVER getenv("DEBUG")
> >> Shouldn't this really be a property of what pass is using the solver
> >> and whether or not the appropriate dump flag is on for that pass?
> >
> > Whoops.  This was a private construct used for debugging the solver.
> > I've changed it to:
> >
> > +#define DEBUG_SOLVER (0 && dump_file)
> I would probably argue that the #define should disappear and the code
> should be checking the current dump state for the current pass.   If you
> don't want to keep the debugging output, then remove it  :-)  I think
> that can be handled in a follow-up patch.

The debugging output is really verbose, especially because the
threader will try a boatload of different paths all of which get their
propagation dumped.  Is there a recommended way of leaving the
debugging in the code, but only enabled sporadically?  Perhaps a
--param ??.  Or perhaps I could remove it near the end of stage 1?

> >>> +  edge e_in = find_edge (prev, bb);
> >>> +  for (size_t i = 0; i < gimple_phi_num_args (phi); ++i)
> >> It's probably not important in practice, but you're going to end up
> >> calling gimple_phi_num_args every iteration of this loop. Its value
> >> isn't generally subject to LICM.
> >
> > I was just following standard practice:
> Yea.  I doubt we're at all consistent with that.  In fact, I'd bet I'm a
> serial offender.  We should probably try to do better, though, since
> we're going to get a function call every loop iteration when the value
> is invariant.  Fixing other instances should be considered
> pre-approved.  ISTM a separate patch to fix that up would be fine.

No problem.  Fixed in this patch (attached).

> OK.

I've retested on x86-64 Linux and will push because I have some
follow-up patches (for the DOM threader this time :)), that depend on
this.

Thanks for reviewing.
Aldy

[-- Attachment #2: 0001-Implement-basic-block-path-solver.patch --]
[-- Type: text/x-patch, Size: 12848 bytes --]

From b9d83250d6d2dac77da4f47221a830a012617dfc Mon Sep 17 00:00:00 2001
From: Aldy Hernandez <aldyh@redhat.com>
Date: Tue, 15 Jun 2021 12:20:43 +0200
Subject: [PATCH] Implement basic block path solver.

This is the main basic block path solver for use in the ranger-based
backwards threader.  Given a path of BBs, the class can solve the final
conditional or any SSA name used in calculating the final conditional.

gcc/ChangeLog:

	* Makefile.in (OBJS): Add gimple-range-path.o.
	* gimple-range-path.cc: New file.
	* gimple-range-path.h: New file.
---
 gcc/Makefile.in          |   1 +
 gcc/gimple-range-path.cc | 329 +++++++++++++++++++++++++++++++++++++++
 gcc/gimple-range-path.h  |  85 ++++++++++
 3 files changed, 415 insertions(+)
 create mode 100644 gcc/gimple-range-path.cc
 create mode 100644 gcc/gimple-range-path.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 1666ef84d6a..29bd4edb4b7 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1650,6 +1650,7 @@ OBJS = \
 	tree-ssa-loop.o \
 	tree-ssa-math-opts.o \
 	tree-ssa-operands.o \
+	gimple-range-path.o \
 	tree-ssa-phiopt.o \
 	tree-ssa-phiprop.o \
 	tree-ssa-pre.o \
diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
new file mode 100644
index 00000000000..a8226a6810f
--- /dev/null
+++ b/gcc/gimple-range-path.cc
@@ -0,0 +1,329 @@
+/* Basic block path solver.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   Contributed by Aldy Hernandez <aldyh@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "tree.h"
+#include "gimple.h"
+#include "cfganal.h"
+#include "value-range.h"
+#include "gimple-range.h"
+#include "tree-pretty-print.h"
+#include "gimple-range-path.h"
+#include "ssa.h"
+
+// Internal construct to help facilitate debugging of solver.
+#define DEBUG_SOLVER (0 && dump_file)
+
+path_range_query::path_range_query (gimple_ranger &ranger)
+  : m_ranger (ranger)
+{
+  m_cache = new ssa_global_cache;
+  m_has_cache_entry = BITMAP_ALLOC (NULL);
+  m_path = NULL;
+}
+
+path_range_query::~path_range_query ()
+{
+  BITMAP_FREE (m_has_cache_entry);
+  delete m_cache;
+}
+
+// Mark cache entry for NAME as unused.
+
+void
+path_range_query::clear_cache (tree name)
+{
+  unsigned v = SSA_NAME_VERSION (name);
+  bitmap_clear_bit (m_has_cache_entry, v);
+}
+
+// If NAME has a cache entry, return it in R, and return TRUE.
+
+inline bool
+path_range_query::get_cache (irange &r, tree name)
+{
+  if (!gimple_range_ssa_p (name))
+    return get_global_range_query ()->range_of_expr (r, name);
+
+  unsigned v = SSA_NAME_VERSION (name);
+  if (bitmap_bit_p (m_has_cache_entry, v))
+    return m_cache->get_global_range (r, name);
+
+  return false;
+}
+
+// Set the cache entry for NAME to R.
+
+void
+path_range_query::set_cache (const irange &r, tree name)
+{
+  unsigned v = SSA_NAME_VERSION (name);
+  bitmap_set_bit (m_has_cache_entry, v);
+  m_cache->set_global_range (name, r);
+}
+
+void
+path_range_query::dump (FILE *dump_file)
+{
+  if (m_path->is_empty ())
+    return;
+
+  unsigned i;
+  bitmap_iterator bi;
+  extern void dump_ranger (FILE *, const vec<basic_block> &);
+
+  fprintf (dump_file, "Path is:\n");
+  dump_ranger (dump_file, *m_path);
+
+  fprintf (dump_file, "Imports:\n");
+  EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+      print_generic_expr (dump_file, name, TDF_SLIM);
+      fprintf (dump_file, "\n");
+    }
+
+  m_cache->dump (dump_file);
+}
+
+void
+path_range_query::debug ()
+{
+  dump (stderr);
+}
+
+// Return the range of NAME at the end of the path being analyzed.
+
+bool
+path_range_query::range_of_expr (irange &r, tree name, gimple *stmt)
+{
+  if (!irange::supports_type_p (TREE_TYPE (name)))
+    return false;
+
+  if (get_cache (r, name))
+    return true;
+
+
+  basic_block bb = stmt ? gimple_bb (stmt) : exit_bb ();
+  if (stmt && range_defined_in_block (r, name, bb))
+    {
+      set_cache (r, name);
+      return true;
+    }
+
+  r.set_varying (TREE_TYPE (name));
+  return true;
+}
+
+// Return the range of STMT at the end of the path being analyzed.
+// Anything but the final conditional in a BB will return VARYING.
+
+bool
+path_range_query::range_of_stmt (irange &r, gimple *stmt, tree)
+{
+  tree type = gimple_range_type (stmt);
+
+  if (!irange::supports_type_p (type))
+    return false;
+
+  if (gimple_code (stmt) == GIMPLE_COND && fold_range (r, stmt, this))
+    return true;
+
+  r.set_varying (type);
+  return true;
+}
+
+// Initialize the current path to PATH.  The current block is set to
+// the entry block to the path.
+//
+// Note that the blocks are in reverse order, so the exit block is
+// path[0].
+
+void
+path_range_query::set_path (const vec<basic_block> &path)
+{
+  gcc_checking_assert (path.length () > 1);
+  m_path = &path;
+  m_pos = m_path->length () - 1;
+  bitmap_clear (m_has_cache_entry);
+}
+
+// Return the range of the result of PHI in R.
+
+void
+path_range_query::ssa_range_in_phi (irange &r, gphi *phi)
+{
+  tree name = gimple_phi_result (phi);
+  basic_block bb = gimple_bb (phi);
+
+  // We experimented with querying ranger's range_on_entry here, but
+  // the performance penalty was too high, for hardly any improvements.
+  if (at_entry ())
+    {
+      // Try fold just in case we can resolve simple things like PHI <5(99), 6(88)>.
+      if (!fold_range (r, phi, this))
+	r.set_varying (TREE_TYPE (name));
+
+      return;
+    }
+
+  basic_block prev = prev_bb ();
+  edge e_in = find_edge (prev, bb);
+  unsigned nargs = gimple_phi_num_args (phi);
+
+  for (size_t i = 0; i < nargs; ++i)
+    if (e_in == gimple_phi_arg_edge (phi, i))
+      {
+	tree arg = gimple_phi_arg_def (phi, i);
+
+	if (!get_cache (r, arg))
+	  r.set_varying (TREE_TYPE (name));
+
+	return;
+      }
+  gcc_unreachable ();
+}
+
+// If NAME is defined in BB, set R to the range of NAME, and return
+// TRUE.  Otherwise, return FALSE.
+
+bool
+path_range_query::range_defined_in_block (irange &r, tree name, basic_block bb)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+  basic_block def_bb = gimple_bb (def_stmt);
+
+  if (def_bb != bb)
+    return false;
+
+  if (gimple_code (def_stmt) == GIMPLE_PHI)
+    ssa_range_in_phi (r, as_a<gphi *> (def_stmt));
+  else if (!fold_range (r, def_stmt, this))
+    r.set_varying (TREE_TYPE (name));
+
+  if (DEBUG_SOLVER)
+    {
+      fprintf (dump_file, "range_defined_in_block (BB%d) for ", bb->index);
+      print_generic_expr (dump_file, name, TDF_SLIM);
+      fprintf (dump_file, " is ");
+      r.dump (dump_file);
+      fprintf (dump_file, "\n");
+    }
+  return true;
+}
+
+// Precompute ranges defined in the current block, or ranges
+// that are exported on an edge to the next block.
+
+void
+path_range_query::precompute_ranges_in_block (basic_block bb)
+{
+  bitmap_iterator bi;
+  int_range_max r, cached_range;
+  unsigned i;
+
+  // Force recalculation of any names in the cache that are defined in
+  // this block.  This can happen on interdependent SSA/phis in loops.
+  EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+      gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+      basic_block def_bb = gimple_bb (def_stmt);
+
+      if (def_bb == bb)
+	clear_cache (name);
+    }
+
+  // Solve imports defined in this block.
+  EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+
+      if (range_defined_in_block (r, name, bb))
+	set_cache (r, name);
+    }
+
+  if (at_exit ())
+    return;
+
+  // Solve imports that are exported to the next block.
+  edge e = find_edge (bb, next_bb ());
+  EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
+    {
+      tree name = ssa_name (i);
+      gori_compute &g = m_ranger.gori ();
+      bitmap exports = g.exports (bb);
+
+      if (bitmap_bit_p (exports, i))
+	{
+	  if (g.outgoing_edge_range_p (r, e, name, *this))
+	    {
+	      if (get_cache (cached_range, name))
+		r.intersect (cached_range);
+
+	      set_cache (r, name);
+	      if (DEBUG_SOLVER)
+		{
+		  fprintf (dump_file, "outgoing_edge_range_p for ");
+		  print_generic_expr (dump_file, name, TDF_SLIM);
+		  fprintf (dump_file, " on edge %d->%d ",
+			   e->src->index, e->dest->index);
+		  fprintf (dump_file, "is ");
+		  r.dump (dump_file);
+		  fprintf (dump_file, "\n");
+		}
+	    }
+	}
+    }
+}
+
+// Precompute the ranges for IMPORTS along PATH.
+//
+// IMPORTS are the set of SSA names, any of which could potentially
+// change the value of the final conditional in PATH.
+
+void
+path_range_query::precompute_ranges (const vec<basic_block> &path,
+				     const bitmap_head *imports)
+{
+  set_path (path);
+  m_imports = imports;
+
+  if (DEBUG_SOLVER)
+    fprintf (dump_file, "path_range_query: precompute_ranges\n");
+
+  while (1)
+    {
+      basic_block bb = curr_bb ();
+
+      precompute_ranges_in_block (bb);
+
+      if (at_exit ())
+	break;
+
+      move_next ();
+    }
+
+  if (DEBUG_SOLVER)
+    dump (dump_file);
+}
diff --git a/gcc/gimple-range-path.h b/gcc/gimple-range-path.h
new file mode 100644
index 00000000000..43f0ec80286
--- /dev/null
+++ b/gcc/gimple-range-path.h
@@ -0,0 +1,85 @@
+/* Header file for jump threading path solver.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   Contributed by Aldy Hernandez <aldyh@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_TREE_SSA_THREADSOLVER_H
+#define GCC_TREE_SSA_THREADSOLVER_H
+
+// This class is a basic block path solver.  Given a set of BBs
+// indicating a path through the CFG, range_of_expr and range_of_stmt
+// will calculate the range of an SSA or STMT as if the BBs in the
+// path would have been executed in order.
+//
+// Only SSA names passed in IMPORTS are precomputed, and can be
+// queried.
+//
+// Note that the blocks are in reverse order, thus the exit block is
+// path[0].
+
+class path_range_query : public range_query
+{
+public:
+  path_range_query (class gimple_ranger &ranger);
+  virtual ~path_range_query ();
+  void precompute_ranges (const vec<basic_block> &path,
+			  const bitmap_head *imports);
+  bool range_of_expr (irange &r, tree name, gimple * = NULL) override;
+  bool range_of_stmt (irange &r, gimple *, tree name = NULL) override;
+  void dump (FILE *);
+  void debug ();
+
+private:
+  // Cache manipulation.
+  void set_cache (const irange &r, tree name);
+  bool get_cache (irange &r, tree name);
+  void clear_cache (tree name);
+
+  // Methods to precompute ranges for the given path.
+  bool range_defined_in_block (irange &, tree name, basic_block bb);
+  void precompute_ranges_in_block (basic_block bb);
+  void ssa_range_in_phi (irange &r, gphi *phi);
+
+  // Path navigation.
+  void set_path (const vec<basic_block> &);
+  basic_block entry_bb () { return (*m_path)[m_path->length () - 1]; }
+  basic_block exit_bb ()  { return (*m_path)[0]; }
+  basic_block curr_bb ()  { return (*m_path)[m_pos]; }
+  basic_block prev_bb ()  { return (*m_path)[m_pos + 1]; }
+  basic_block next_bb ()  { return (*m_path)[m_pos - 1]; }
+  bool at_entry ()	  { return m_pos == m_path->length () - 1; }
+  bool at_exit ()	  { return m_pos == 0; }
+  void move_next ()	  { --m_pos; }
+
+  // Range cache for SSA names.
+  ssa_global_cache *m_cache;
+
+  // Set for each SSA that has an active entry in the cache.
+  bitmap m_has_cache_entry;
+
+  // Path being analyzed.
+  const vec<basic_block> *m_path;
+
+  // Current path position.
+  unsigned m_pos;
+
+  const bitmap_head *m_imports;
+  gimple_ranger &m_ranger;
+};
+
+#endif // GCC_TREE_SSA_THREADSOLVER_H
-- 
2.31.1
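
The reverse-order path convention described in the header comment
(exit block at path[0], entry block at the last index) can be sketched
in isolation.  This is only an illustration under stated assumptions:
path_cursor, block_id and execution_order are hypothetical names, and
blocks are plain integers rather than basic_block pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in: blocks are identified by plain integers here.
typedef int block_id;

// Cursor over a path stored in reverse order: path[0] is the exit block
// and path[size - 1] is the entry block.
class path_cursor
{
public:
  explicit path_cursor (const std::vector<block_id> &path)
    : m_path (path), m_pos (path.size () - 1)
  {
    // Mirrors the "length > 1" check in set_path.
    assert (path.size () > 1);
  }

  block_id entry_bb () const { return m_path[m_path.size () - 1]; }
  block_id exit_bb () const { return m_path[0]; }
  block_id curr_bb () const { return m_path[m_pos]; }
  bool at_entry () const { return m_pos == m_path.size () - 1; }
  bool at_exit () const { return m_pos == 0; }
  // Moving forward in execution order means moving down in the vector.
  void move_next () { --m_pos; }

private:
  const std::vector<block_id> &m_path;
  std::size_t m_pos;
};

// Collect the blocks in execution order, entry first, using the same
// loop shape as precompute_ranges: process, stop at exit, else advance.
static std::vector<block_id>
execution_order (const std::vector<block_id> &path)
{
  std::vector<block_id> order;
  path_cursor c (path);
  while (true)
    {
      order.push_back (c.curr_bb ());
      if (c.at_exit ())
	break;
      c.move_next ();
    }
  return order;
}
```

For a path stored as {7, 4, 2}, the cursor starts at block 2 (the
entry) and finishes at block 7 (the exit), so the blocks are visited
as 2, 4, 7.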


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] Implement basic block path solver.
  2021-07-27  9:58         ` Aldy Hernandez
@ 2021-07-27 15:18           ` Jeff Law
  0 siblings, 0 replies; 20+ messages in thread
From: Jeff Law @ 2021-07-27 15:18 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: GCC patches, Andrew MacLeod, Richard Biener, Martin Sebor



On 7/27/2021 3:58 AM, Aldy Hernandez wrote:
> On Mon, Jul 26, 2021 at 9:10 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
>>
>>
>> On 7/2/2021 2:13 AM, Aldy Hernandez wrote:
>>>
>>> On 7/2/21 12:20 AM, Jeff Law wrote:
>>>>
>>>> On 6/28/2021 10:21 AM, Aldy Hernandez wrote:
>>>>> +// Internal construct to help facilitate debugging of solver.
>>>>> +#define DEBUG_SOLVER getenv("DEBUG")
>>>> Shouldn't this really be a property of what pass is using the solver
>>>> and whether or not the appropriate dump flag is on for that pass?
>>> Whoops.  This was a private construct used for debugging the solver.
>>> I've changed it to:
>>>
>>> +#define DEBUG_SOLVER (0 && dump_file)
>> I would probably argue that the #define should disappear and the code
>> should be checking the current dump state for the current pass.   If you
>> don't want to keep the debugging output, then remove it  :-)  I think
>> that can be handled in a follow-up patch.
> The debugging output is really verbose, especially because the
> threader will try a boatload of different paths all of which get their
> propagation dumped.  Is there a recommended way of leaving the
> debugging in the code, but only enabled sporadically?  Perhaps a
> --param ??.  Or perhaps I could remove it near the end of stage 1?
You could make it conditional on TDF_DETAILS, which is what I think most 
passes do when they want the higher levels of verbosity.

ie

if (dump_file && (dump_flags & TDF_DETAILS))
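
A minimal standalone sketch of that gating pattern follows.  It
assumes stand-in definitions: in GCC, dump_file, dump_flags and
TDF_DETAILS come from dumpfile.h, and the flag value used here is
illustrative only.  detailed_dump_p and trace_range are hypothetical
helpers, not part of the patch.

```cpp
#include <cassert>
#include <cstdio>

// Stand-ins for GCC's per-pass dump state.  The flag value is
// illustrative and does not match GCC's definition.
static FILE *dump_file = nullptr;
static unsigned dump_flags = 0;
static const unsigned TDF_DETAILS = 1u << 0;

// True only when the pass has a dump file and detailed dumps were
// requested for it.
static bool
detailed_dump_p ()
{
  return dump_file && (dump_flags & TDF_DETAILS);
}

// Hypothetical helper: emit one line of solver tracing.  Returns
// whether anything was written.
static bool
trace_range (const char *name, int lo, int hi)
{
  assert (name != nullptr);
  if (!detailed_dump_p ())
    return false;
  fprintf (dump_file, "range of %s is [%d, %d]\n", name, lo, hi);
  return true;
}
```

With this shape, the verbose output stays in the code but is only
produced when both conditions hold, so ordinary dumps are not flooded
by per-path tracing.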

Jeff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] Backwards jump threader rewrite with ranger.
  2021-07-15 14:57     ` Aldy Hernandez
  2021-07-26 12:43       ` Aldy Hernandez
@ 2021-07-28 14:32       ` Jeff Law
  2021-07-28 14:51         ` Aldy Hernandez
  1 sibling, 1 reply; 20+ messages in thread
From: Jeff Law @ 2021-07-28 14:32 UTC (permalink / raw)
  To: Aldy Hernandez, GCC patches; +Cc: Andrew MacLeod, Richard Biener, Martin Sebor



On 7/15/2021 8:57 AM, Aldy Hernandez wrote:
> As mentioned in my previous email, these are some minor changes to the
> previous revision.  All I'm changing here is the call into the solver
> to use range_of_expr and range_of_stmt.  Everything else remains the
> same.
>
> Tested on x86-64 Linux.
>
> On Mon, Jul 5, 2021 at 5:39 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>> PING.
>>
>> Aldy
>>
>> 0003-Backwards-jump-threader-rewrite-with-ranger.patch
>>
>>  From 1774338ddd1f4718884e766aae2fc48b97110c5d Mon Sep 17 00:00:00 2001
>> From: Aldy Hernandez <aldyh@redhat.com>
>> Date: Tue, 15 Jun 2021 12:32:51 +0200
>> Subject: [PATCH 3/5] Backwards jump threader rewrite with ranger.
>>
>> This is a rewrite of the backwards threader with a ranger based solver.
>>
>> The code is divided into two parts: the path solver in
>> gimple-range-path.*, and the path discovery bits in
>> tree-ssa-threadbackward.c.
>>
>> The legacy code is still available with --param=threader-mode=legacy,
>> but will be removed shortly after.
>>
>> gcc/ChangeLog:
>>
>> 	* Makefile.in (tree-ssa-loop-im.o-warn): New.
>> 	* flag-types.h (enum threader_mode): New.
>> 	* params.opt: Add entry for --param=threader-mode.
>> 	* tree-ssa-threadbackward.c (THREADER_ITERATIVE_MODE): New.
>> 	(class back_threader): New.
>> 	(back_threader::back_threader): New.
>> 	(back_threader::~back_threader): New.
>> 	(back_threader::maybe_register_path): New.
>> 	(back_threader::find_taken_edge): New.
>> 	(back_threader::find_taken_edge_switch): New.
>> 	(back_threader::find_taken_edge_cond): New.
>> 	(back_threader::resolve_def): New.
>> 	(back_threader::resolve_phi): New.
>> 	(back_threader::find_paths_to_names): New.
>> 	(back_threader::find_paths): New.
>> 	(dump_path): New.
>> 	(debug): New.
>> 	(thread_jumps::find_jump_threads_backwards): Call ranger threader.
>> 	(thread_jumps::find_jump_threads_backwards_with_ranger): New.
>> 	(pass_thread_jumps::execute): Abstract out code...
>> 	(try_thread_blocks): ...here.
>> 	* tree-ssa-threadedge.c (jump_threader::thread_outgoing_edges):
>> 	Abstract out threading candidate code to...
>> 	(single_succ_to_potentially_threadable_block): ...here.
>> 	* tree-ssa-threadedge.h (single_succ_to_potentially_threadable_block):
>> 	New.
>> 	* tree-ssa-threadupdate.c (register_jump_thread): Return boolean.
>> 	* tree-ssa-threadupdate.h (class jump_thread_path_registry):
>> 	Return bool from register_jump_thread.
>>
>> libgomp/ChangeLog:
>>
>> 	* testsuite/libgomp.graphite/force-parallel-4.c: Adjust for
>> 	threader.
>> 	* testsuite/libgomp.graphite/force-parallel-8.c: Same.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 	* g++.dg/debug/dwarf2/deallocator.C: Adjust for threader.
>> 	* gcc.c-torture/compile/pr83510.c: Same.
>> 	* gcc.dg/loop-unswitch-2.c: Same.
>> 	* gcc.dg/old-style-asm-1.c: Same.
>> 	* gcc.dg/pr68317.c: Same.
>> 	* gcc.dg/pr97567-2.c: Same.
>> 	* gcc.dg/predict-9.c: Same.
>> 	* gcc.dg/shrink-wrap-loop.c: Same.
>> 	* gcc.dg/sibcall-1.c: Same.
>> 	* gcc.dg/tree-ssa/builtin-sprintf-3.c: Same.
>> 	* gcc.dg/tree-ssa/pr21001.c: Same.
>> 	* gcc.dg/tree-ssa/pr21294.c: Same.
>> 	* gcc.dg/tree-ssa/pr21417.c: Same.
>> 	* gcc.dg/tree-ssa/pr21458-2.c: Same.
>> 	* gcc.dg/tree-ssa/pr21563.c: Same.
>> 	* gcc.dg/tree-ssa/pr49039.c: Same.
>> 	* gcc.dg/tree-ssa/pr61839_1.c: Same.
>> 	* gcc.dg/tree-ssa/pr61839_3.c: Same.
>> 	* gcc.dg/tree-ssa/pr77445-2.c: Same.
>> 	* gcc.dg/tree-ssa/split-path-4.c: Same.
>> 	* gcc.dg/tree-ssa/ssa-dom-thread-11.c: Same.
>> 	* gcc.dg/tree-ssa/ssa-dom-thread-12.c: Same.
>> 	* gcc.dg/tree-ssa/ssa-dom-thread-14.c: Same.
>> 	* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same.
>> 	* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
>> 	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same.
>> 	* gcc.dg/tree-ssa/ssa-fre-48.c: Same.
>> 	* gcc.dg/tree-ssa/ssa-thread-11.c: Same.
>> 	* gcc.dg/tree-ssa/ssa-thread-12.c: Same.
>> 	* gcc.dg/tree-ssa/ssa-thread-14.c: Same.
>> 	* gcc.dg/tree-ssa/vrp02.c: Same.
>> 	* gcc.dg/tree-ssa/vrp03.c: Same.
>> 	* gcc.dg/tree-ssa/vrp05.c: Same.
>> 	* gcc.dg/tree-ssa/vrp06.c: Same.
>> 	* gcc.dg/tree-ssa/vrp07.c: Same.
>> 	* gcc.dg/tree-ssa/vrp09.c: Same.
>> 	* gcc.dg/tree-ssa/vrp19.c: Same.
>> 	* gcc.dg/tree-ssa/vrp20.c: Same.
>> 	* gcc.dg/tree-ssa/vrp33.c: Same.
>> 	* gcc.dg/uninit-pred-9_b.c: Same.
>> 	* gcc.dg/vect/bb-slp-16.c: Same.
>> 	* gcc.target/i386/avx2-vect-aggressive.c: Same.
>> 	* gcc.dg/tree-ssa/ranger-threader-1.c: New test.
>> 	* gcc.dg/tree-ssa/ranger-threader-2.c: New test.
>> 	* gcc.dg/tree-ssa/ranger-threader-3.c: New test.
>> 	* gcc.dg/tree-ssa/ranger-threader-4.c: New test.
>> 	* gcc.dg/tree-ssa/ranger-threader-5.c: New test.
>> ---
>>   gcc/Makefile.in                               |   5 +
>>   gcc/flag-types.h                              |   7 +
>>   gcc/params.opt                                |  17 +
>>   .../g++.dg/debug/dwarf2/deallocator.C         |   3 +-
>>   gcc/testsuite/gcc.c-torture/compile/pr83510.c |  33 ++
>>   gcc/testsuite/gcc.dg/loop-unswitch-2.c        |   2 +-
>>   gcc/testsuite/gcc.dg/old-style-asm-1.c        |   5 +-
>>   gcc/testsuite/gcc.dg/pr68317.c                |   4 +-
>>   gcc/testsuite/gcc.dg/pr97567-2.c              |   2 +-
>>   gcc/testsuite/gcc.dg/predict-9.c              |   4 +-
>>   gcc/testsuite/gcc.dg/shrink-wrap-loop.c       |  53 ++
>>   gcc/testsuite/gcc.dg/sibcall-1.c              |  10 +
>>   .../gcc.dg/tree-ssa/builtin-sprintf-3.c       |  25 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/pr21001.c       |   1 +
>>   gcc/testsuite/gcc.dg/tree-ssa/pr21294.c       |   1 +
>>   gcc/testsuite/gcc.dg/tree-ssa/pr21417.c       |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c     |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/pr21563.c       |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/pr49039.c       |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c     |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c     |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c     |   2 +-
>>   .../gcc.dg/tree-ssa/ranger-threader-1.c       |  20 +
>>   .../gcc.dg/tree-ssa/ranger-threader-2.c       |  39 ++
>>   .../gcc.dg/tree-ssa/ranger-threader-3.c       |  41 ++
>>   .../gcc.dg/tree-ssa/ranger-threader-4.c       |  83 +++
>>   .../gcc.dg/tree-ssa/ranger-threader-5.c       |  80 +++
>>   gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c  |   4 +-
>>   .../gcc.dg/tree-ssa/ssa-dom-thread-11.c       |   2 +-
>>   .../gcc.dg/tree-ssa/ssa-dom-thread-12.c       |   2 +-
>>   .../gcc.dg/tree-ssa/ssa-dom-thread-14.c       |   1 +
>>   .../gcc.dg/tree-ssa/ssa-dom-thread-18.c       |   5 +-
>>   .../gcc.dg/tree-ssa/ssa-dom-thread-6.c        |   4 +-
>>   .../gcc.dg/tree-ssa/ssa-dom-thread-7.c        |   1 +
>>   gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c    |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c |   1 +
>>   gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c |   1 +
>>   gcc/testsuite/gcc.dg/tree-ssa/vrp02.c         |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/vrp03.c         |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/vrp05.c         |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/vrp06.c         |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/vrp07.c         |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/vrp09.c         |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/vrp19.c         |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/vrp20.c         |   2 +-
>>   gcc/testsuite/gcc.dg/tree-ssa/vrp33.c         |   2 +-
>>   gcc/testsuite/gcc.dg/uninit-pred-9_b.c        |   1 +
>>   gcc/testsuite/gcc.dg/vect/bb-slp-16.c         |   7 +
>>   .../gcc.target/i386/avx2-vect-aggressive.c    |   2 +-
>>   gcc/tree-ssa-threadbackward.c                 | 475 +++++++++++++++++-
>>   gcc/tree-ssa-threadedge.c                     |  20 +-
>>   gcc/tree-ssa-threadedge.h                     |   3 +-
>>   gcc/tree-ssa-threadupdate.c                   |  12 +-
>>   gcc/tree-ssa-threadupdate.h                   |   2 +-
>>   .../libgomp.graphite/force-parallel-4.c       |   1 +
>>   .../libgomp.graphite/force-parallel-8.c       |   2 +
>>   57 files changed, 963 insertions(+), 54 deletions(-)
>>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c
>>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c
>>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c
>>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c
>>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-5.c
>>
>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
>> index 863f1256811..0e205a41ac3 100644
>> --- a/gcc/Makefile.in
>> +++ b/gcc/Makefile.in
>> @@ -223,6 +223,11 @@ gimple-match.o-warn = -Wno-unused
>>   generic-match.o-warn = -Wno-unused
>>   dfp.o-warn = -Wno-strict-aliasing
>>   
>> +# maybe_emit_free_warning() is picking up the inlined location for the
>> +# warning, not the source of the original va_heap::release() function
>> +# which has a pragma disabling this warning.
>> +tree-ssa-loop-im.o-warn = -Wno-free-nonheap-object
I think some of Martin's work may help here, but I'm not sure if it's 
all gone in yet.  It might be worth syncing with him on the state of the 
improvements to how inlining and warnings interact.  If his work does 
fix the problem here, this hunk can be removed as a distinct follow-up.

>> diff --git a/gcc/params.opt b/gcc/params.opt
>> index 92b003e38cb..f1f47b44215 100644
>> --- a/gcc/params.opt
>> +++ b/gcc/params.opt
>> @@ -1010,6 +1010,23 @@ Maximum depth of DFS walk used by modref escape analysis.
>>   Common Joined UInteger Var(param_modref_max_escape_points) Init(256) Param Optimization
>>   Maximum number of escape points tracked by modref per SSA-name.
>>   
>> +-param=threader-iterative=
>> +Common Joined UInteger Var(param_threader_iterative) Init(0) Param Optimization
>> +Run backwards threader in iterative mode.
Presumably this is going away?  I thought the iterative mode was just 
for debugging/evaluation purposes.

I only glossed over the testsuite changes.

God I love how much you've refactored here.  Lots of small, easy to 
understand functions.

>> +
>> +// If any of the incoming edges for a PHI resolves the current path,
>> +// register the path(s), and return TRUE.
>> +
>> +bool
>> +back_threader::resolve_phi (gphi *phi, bitmap interesting)
[ ... ]
It might be useful to indicate somewhere what "resolving the current 
path" means.  I kind of figured it out as I was walking through the 
patch, but my first thought had it reversed.


>> diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
>> index 6ce32644aa5..ea5c37a2c65 100644
>> --- a/gcc/tree-ssa-threadedge.c
>> +++ b/gcc/tree-ssa-threadedge.c
>> @@ -1335,6 +1335,18 @@ jump_threader::thread_across_edge (edge e)
>>     m_avail_exprs_stack->pop_to_marker ();
>>   }
>>   
>> +/* Return TRUE if BB has a single successor to a block with multiple
>> +   incoming and outgoing edges.  */
>> +
>> +bool
>> +single_succ_to_potentially_threadable_block (basic_block bb)
>> +{
>> +  int flags = (EDGE_IGNORE | EDGE_COMPLEX | EDGE_ABNORMAL);
>> +  return (single_succ_p (bb)
>> +	  && (single_succ_edge (bb)->flags & flags) == 0
>> +	  && potentially_threadable_block (single_succ (bb)));
>> +}
Note on many occasions I've pondered killing these checks.  The forward 
jump threader in particular uses them to narrow its search space.  But 
I've regularly found that they're suppressing useful paths.  The worry, 
of course, is the compile-time cost.


I don't see anything in here that is worrisome.  I'd like to see the 
iterating bits disappear, but I'd also understand if you want to keep 
them for debugging/evaluation purposes in the immediate term.

OK for the trunk.

jeff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] Backwards jump threader rewrite with ranger.
  2021-07-28 14:32       ` Jeff Law
@ 2021-07-28 14:51         ` Aldy Hernandez
  2021-07-28 15:29           ` Martin Sebor
  0 siblings, 1 reply; 20+ messages in thread
From: Aldy Hernandez @ 2021-07-28 14:51 UTC (permalink / raw)
  To: Jeff Law, GCC patches; +Cc: Andrew MacLeod, Richard Biener, Martin Sebor



On 7/28/21 4:32 PM, Jeff Law wrote:
> 
> 
> On 7/15/2021 8:57 AM, Aldy Hernandez wrote:
>> As mentioned in my previous email, these are some minor changes to the
>> previous revision.  All I'm changing here is the call into the solver
>> to use range_of_expr and range_of_stmt.  Everything else remains the
>> same.
>>
>> Tested on x86-64 Linux.
>>
>> On Mon, Jul 5, 2021 at 5:39 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>> PING.
>>>
>>> Aldy
>>>
>>> 0003-Backwards-jump-threader-rewrite-with-ranger.patch
>>>
>>> From 1774338ddd1f4718884e766aae2fc48b97110c5d Mon Sep 17 00:00:00 2001
>>> From: Aldy Hernandez <aldyh@redhat.com>
>>> Date: Tue, 15 Jun 2021 12:32:51 +0200
>>> Subject: [PATCH 3/5] Backwards jump threader rewrite with ranger.
>>>
>>> This is a rewrite of the backwards threader with a ranger based solver.
>>>
>>> The code is divided into two parts: the path solver in
>>> gimple-range-path.*, and the path discovery bits in
>>> tree-ssa-threadbackward.c.
>>>
>>> The legacy code is still available with --param=threader-mode=legacy,
>>> but will be removed shortly after.
>>>
>>> gcc/ChangeLog:
>>>
>>> 	* Makefile.in (tree-ssa-loop-im.o-warn): New.
>>> 	* flag-types.h (enum threader_mode): New.
>>> 	* params.opt: Add entry for --param=threader-mode.
>>> 	* tree-ssa-threadbackward.c (THREADER_ITERATIVE_MODE): New.
>>> 	(class back_threader): New.
>>> 	(back_threader::back_threader): New.
>>> 	(back_threader::~back_threader): New.
>>> 	(back_threader::maybe_register_path): New.
>>> 	(back_threader::find_taken_edge): New.
>>> 	(back_threader::find_taken_edge_switch): New.
>>> 	(back_threader::find_taken_edge_cond): New.
>>> 	(back_threader::resolve_def): New.
>>> 	(back_threader::resolve_phi): New.
>>> 	(back_threader::find_paths_to_names): New.
>>> 	(back_threader::find_paths): New.
>>> 	(dump_path): New.
>>> 	(debug): New.
>>> 	(thread_jumps::find_jump_threads_backwards): Call ranger threader.
>>> 	(thread_jumps::find_jump_threads_backwards_with_ranger): New.
>>> 	(pass_thread_jumps::execute): Abstract out code...
>>> 	(try_thread_blocks): ...here.
>>> 	* tree-ssa-threadedge.c (jump_threader::thread_outgoing_edges):
>>> 	Abstract out threading candidate code to...
>>> 	(single_succ_to_potentially_threadable_block): ...here.
>>> 	* tree-ssa-threadedge.h (single_succ_to_potentially_threadable_block):
>>> 	New.
>>> 	* tree-ssa-threadupdate.c (register_jump_thread): Return boolean.
>>> 	* tree-ssa-threadupdate.h (class jump_thread_path_registry):
>>> 	Return bool from register_jump_thread.
>>>
>>> libgomp/ChangeLog:
>>>
>>> 	* testsuite/libgomp.graphite/force-parallel-4.c: Adjust for
>>> 	threader.
>>> 	* testsuite/libgomp.graphite/force-parallel-8.c: Same.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 	* g++.dg/debug/dwarf2/deallocator.C: Adjust for threader.
>>> 	* gcc.c-torture/compile/pr83510.c: Same.
>>> 	* gcc.dg/loop-unswitch-2.c: Same.
>>> 	* gcc.dg/old-style-asm-1.c: Same.
>>> 	* gcc.dg/pr68317.c: Same.
>>> 	* gcc.dg/pr97567-2.c: Same.
>>> 	* gcc.dg/predict-9.c: Same.
>>> 	* gcc.dg/shrink-wrap-loop.c: Same.
>>> 	* gcc.dg/sibcall-1.c: Same.
>>> 	* gcc.dg/tree-ssa/builtin-sprintf-3.c: Same.
>>> 	* gcc.dg/tree-ssa/pr21001.c: Same.
>>> 	* gcc.dg/tree-ssa/pr21294.c: Same.
>>> 	* gcc.dg/tree-ssa/pr21417.c: Same.
>>> 	* gcc.dg/tree-ssa/pr21458-2.c: Same.
>>> 	* gcc.dg/tree-ssa/pr21563.c: Same.
>>> 	* gcc.dg/tree-ssa/pr49039.c: Same.
>>> 	* gcc.dg/tree-ssa/pr61839_1.c: Same.
>>> 	* gcc.dg/tree-ssa/pr61839_3.c: Same.
>>> 	* gcc.dg/tree-ssa/pr77445-2.c: Same.
>>> 	* gcc.dg/tree-ssa/split-path-4.c: Same.
>>> 	* gcc.dg/tree-ssa/ssa-dom-thread-11.c: Same.
>>> 	* gcc.dg/tree-ssa/ssa-dom-thread-12.c: Same.
>>> 	* gcc.dg/tree-ssa/ssa-dom-thread-14.c: Same.
>>> 	* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same.
>>> 	* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
>>> 	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same.
>>> 	* gcc.dg/tree-ssa/ssa-fre-48.c: Same.
>>> 	* gcc.dg/tree-ssa/ssa-thread-11.c: Same.
>>> 	* gcc.dg/tree-ssa/ssa-thread-12.c: Same.
>>> 	* gcc.dg/tree-ssa/ssa-thread-14.c: Same.
>>> 	* gcc.dg/tree-ssa/vrp02.c: Same.
>>> 	* gcc.dg/tree-ssa/vrp03.c: Same.
>>> 	* gcc.dg/tree-ssa/vrp05.c: Same.
>>> 	* gcc.dg/tree-ssa/vrp06.c: Same.
>>> 	* gcc.dg/tree-ssa/vrp07.c: Same.
>>> 	* gcc.dg/tree-ssa/vrp09.c: Same.
>>> 	* gcc.dg/tree-ssa/vrp19.c: Same.
>>> 	* gcc.dg/tree-ssa/vrp20.c: Same.
>>> 	* gcc.dg/tree-ssa/vrp33.c: Same.
>>> 	* gcc.dg/uninit-pred-9_b.c: Same.
>>> 	* gcc.dg/vect/bb-slp-16.c: Same.
>>> 	* gcc.target/i386/avx2-vect-aggressive.c: Same.
>>> 	* gcc.dg/tree-ssa/ranger-threader-1.c: New test.
>>> 	* gcc.dg/tree-ssa/ranger-threader-2.c: New test.
>>> 	* gcc.dg/tree-ssa/ranger-threader-3.c: New test.
>>> 	* gcc.dg/tree-ssa/ranger-threader-4.c: New test.
>>> 	* gcc.dg/tree-ssa/ranger-threader-5.c: New test.
>>> ---
>>>   gcc/Makefile.in                               |   5 +
>>>   gcc/flag-types.h                              |   7 +
>>>   gcc/params.opt                                |  17 +
>>>   .../g++.dg/debug/dwarf2/deallocator.C         |   3 +-
>>>   gcc/testsuite/gcc.c-torture/compile/pr83510.c |  33 ++
>>>   gcc/testsuite/gcc.dg/loop-unswitch-2.c        |   2 +-
>>>   gcc/testsuite/gcc.dg/old-style-asm-1.c        |   5 +-
>>>   gcc/testsuite/gcc.dg/pr68317.c                |   4 +-
>>>   gcc/testsuite/gcc.dg/pr97567-2.c              |   2 +-
>>>   gcc/testsuite/gcc.dg/predict-9.c              |   4 +-
>>>   gcc/testsuite/gcc.dg/shrink-wrap-loop.c       |  53 ++
>>>   gcc/testsuite/gcc.dg/sibcall-1.c              |  10 +
>>>   .../gcc.dg/tree-ssa/builtin-sprintf-3.c       |  25 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/pr21001.c       |   1 +
>>>   gcc/testsuite/gcc.dg/tree-ssa/pr21294.c       |   1 +
>>>   gcc/testsuite/gcc.dg/tree-ssa/pr21417.c       |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c     |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/pr21563.c       |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/pr49039.c       |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c     |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c     |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c     |   2 +-
>>>   .../gcc.dg/tree-ssa/ranger-threader-1.c       |  20 +
>>>   .../gcc.dg/tree-ssa/ranger-threader-2.c       |  39 ++
>>>   .../gcc.dg/tree-ssa/ranger-threader-3.c       |  41 ++
>>>   .../gcc.dg/tree-ssa/ranger-threader-4.c       |  83 +++
>>>   .../gcc.dg/tree-ssa/ranger-threader-5.c       |  80 +++
>>>   gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c  |   4 +-
>>>   .../gcc.dg/tree-ssa/ssa-dom-thread-11.c       |   2 +-
>>>   .../gcc.dg/tree-ssa/ssa-dom-thread-12.c       |   2 +-
>>>   .../gcc.dg/tree-ssa/ssa-dom-thread-14.c       |   1 +
>>>   .../gcc.dg/tree-ssa/ssa-dom-thread-18.c       |   5 +-
>>>   .../gcc.dg/tree-ssa/ssa-dom-thread-6.c        |   4 +-
>>>   .../gcc.dg/tree-ssa/ssa-dom-thread-7.c        |   1 +
>>>   gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c    |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c |   1 +
>>>   gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c |   1 +
>>>   gcc/testsuite/gcc.dg/tree-ssa/vrp02.c         |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/vrp03.c         |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/vrp05.c         |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/vrp06.c         |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/vrp07.c         |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/vrp09.c         |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/vrp19.c         |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/vrp20.c         |   2 +-
>>>   gcc/testsuite/gcc.dg/tree-ssa/vrp33.c         |   2 +-
>>>   gcc/testsuite/gcc.dg/uninit-pred-9_b.c        |   1 +
>>>   gcc/testsuite/gcc.dg/vect/bb-slp-16.c         |   7 +
>>>   .../gcc.target/i386/avx2-vect-aggressive.c    |   2 +-
>>>   gcc/tree-ssa-threadbackward.c                 | 475 +++++++++++++++++-
>>>   gcc/tree-ssa-threadedge.c                     |  20 +-
>>>   gcc/tree-ssa-threadedge.h                     |   3 +-
>>>   gcc/tree-ssa-threadupdate.c                   |  12 +-
>>>   gcc/tree-ssa-threadupdate.h                   |   2 +-
>>>   .../libgomp.graphite/force-parallel-4.c       |   1 +
>>>   .../libgomp.graphite/force-parallel-8.c       |   2 +
>>>   57 files changed, 963 insertions(+), 54 deletions(-)
>>>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c
>>>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c
>>>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c
>>>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c
>>>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-5.c
>>>
>>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
>>> index 863f1256811..0e205a41ac3 100644
>>> --- a/gcc/Makefile.in
>>> +++ b/gcc/Makefile.in
>>> @@ -223,6 +223,11 @@ gimple-match.o-warn = -Wno-unused
>>>   generic-match.o-warn = -Wno-unused
>>>   dfp.o-warn = -Wno-strict-aliasing
>>>   
>>> +# maybe_emit_free_warning() is picking up the inlined location for the
>>> +# warning, not the source of the original va_heap::release() function
>>> +# which has a pragma disabling this warning.
>>> +tree-ssa-loop-im.o-warn = -Wno-free-nonheap-object
> I think some of Martin's work may help here, but I'm not sure if it's 
> all gone in yet.  It might be worth syncing with him on the state of the 
> improvements to how inlining and warnings interact.  If his work does 
> fix the problem here, this hunk can be removed as a distinct follow-up.

Yes.  He definitely has some patches in this space that were likely to 
fix this.  I will re-test without this hunk, and remove it if it's fixed.

I will also add a few PRs as suggested by Martin to keep track of the 
XFAILs I introduced.

> 
>>> diff --git a/gcc/params.opt b/gcc/params.opt
>>> index 92b003e38cb..f1f47b44215 100644
>>> --- a/gcc/params.opt
>>> +++ b/gcc/params.opt
>>> @@ -1010,6 +1010,23 @@ Maximum depth of DFS walk used by modref escape analysis.
>>>   Common Joined UInteger Var(param_modref_max_escape_points) Init(256) Param Optimization
>>>   Maximum number of escape points tracked by modref per SSA-name.
>>>   
>>> +-param=threader-iterative=
>>> +Common Joined UInteger Var(param_threader_iterative) Init(0) Param Optimization
>>> +Run backwards threader in iterative mode.
> Presumably this is going away?  I thought the iterative mode was just 
> for debugging/evaluation purposes.

I can remove it in the next few months, as I'll probably need it to 
analyze what to do with the DOM and VRP threaders.  But if it really 
bothers you, I can rip it out now.  The changes are simple enough to 
keep local.  All in all, the undocumented iterative mode and the legacy 
code will be long gone before stage 1 ends :).

> 
> I only glossed over the testsuite changes.
> 
> God I love how much you've refactored here.  Lots of small, easy to 
> understand functions.

It was the only way to understand what I was doing :).

> 
>>> +
>>> +// If any of the incoming edges for a PHI resolves the current path,
>>> +// register the path(s), and return TRUE.
>>> +
>>> +bool
>>> +back_threader::resolve_phi (gphi *phi, bitmap interesting)
> [ ... ]
> It might be useful to indicate somewhere what "resolving the current 
> path" means.  I kind of figured it out as I was walking through the 
> patch, but my first thought had it reversed.

Will do.

> 
> 
>>> diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
>>> index 6ce32644aa5..ea5c37a2c65 100644
>>> --- a/gcc/tree-ssa-threadedge.c
>>> +++ b/gcc/tree-ssa-threadedge.c
>>> @@ -1335,6 +1335,18 @@ jump_threader::thread_across_edge (edge e)
>>>     m_avail_exprs_stack->pop_to_marker ();
>>>   }
>>>   
>>> +/* Return TRUE if BB has a single successor to a block with multiple
>>> +   incoming and outgoing edges.  */
>>> +
>>> +bool
>>> +single_succ_to_potentially_threadable_block (basic_block bb)
>>> +{
>>> +  int flags = (EDGE_IGNORE | EDGE_COMPLEX | EDGE_ABNORMAL);
>>> +  return (single_succ_p (bb)
>>> +	  && (single_succ_edge (bb)->flags & flags) == 0
>>> +	  && potentially_threadable_block (single_succ (bb)));
>>> +}
> Note on many occasions I've pondered killing these checks.  The forward 
> jump threader in particular uses them to narrow its search space.  But 
> I've regularly found that they're suppressing useful paths.  The worry, 
> of course, is the compile-time cost.

I'll have to put this on my TODO list because, as you mention, even 
small changes tend to blow up the search space and show a measurable 
performance degradation.
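
For readers following along without the GCC sources at hand, the shape of 
the check quoted above can be sketched on a toy CFG.  This is only an 
illustration: the toy_* types, the flag values, and the simplified notion 
of "potentially threadable" (taken from the patch comment: multiple 
incoming and outgoing edges) are assumptions, not GCC's real definitions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for GCC's edge flags; the values are made up.
enum toy_edge_flags : unsigned
{
  TOY_EDGE_IGNORE   = 1u << 0,
  TOY_EDGE_COMPLEX  = 1u << 1,
  TOY_EDGE_ABNORMAL = 1u << 2
};

struct toy_block;

// A toy outgoing edge: destination block plus flag bits.
struct toy_edge
{
  toy_block *dest;
  unsigned flags;
};

// A toy basic block: outgoing edges and a count of incoming edges.
struct toy_block
{
  std::vector<toy_edge> succs;
  unsigned n_preds = 0;
};

// Assumption: a block is "potentially threadable" when it has multiple
// incoming and multiple outgoing edges.  The real GCC predicate may
// check more than this.
inline bool
toy_potentially_threadable_block (const toy_block &bb)
{
  return bb.n_preds > 1 && bb.succs.size () > 1;
}

// Mirrors the structure of the quoted helper: exactly one successor,
// reached through an ordinary edge, leading to a threadable block.
inline bool
toy_single_succ_to_potentially_threadable_block (const toy_block &bb)
{
  const unsigned flags
    = TOY_EDGE_IGNORE | TOY_EDGE_COMPLEX | TOY_EDGE_ABNORMAL;
  if (bb.succs.size () != 1)
    return false;
  const toy_edge &e = bb.succs[0];
  assert (e.dest != nullptr);
  return (e.flags & flags) == 0
	 && toy_potentially_threadable_block (*e.dest);
}
```

Each conjunct prunes candidates before any path search starts, which is 
the compile-time saving being weighed against the useful paths the 
checks may hide.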

> 
> 
> I don't see anything in here that is worrisome.  I'd like to see the 
> iterating bits disappear, but I'd also understand if you want to keep 
> them for debugging/evaluation purposes in the immediate term.
> 
> OK for the trunk.

Sweeeet!  I'll retest, remove the warning suppression if applicable, and 
follow up with PRs.

Let me know if you're OK with removing the legacy mode in the next week 
or so.  I see no need for it.

Aldy



* Re: [PATCH 2/2] Backwards jump threader rewrite with ranger.
  2021-07-28 14:51         ` Aldy Hernandez
@ 2021-07-28 15:29           ` Martin Sebor
  0 siblings, 0 replies; 20+ messages in thread
From: Martin Sebor @ 2021-07-28 15:29 UTC (permalink / raw)
  To: Aldy Hernandez, Jeff Law, GCC patches; +Cc: Martin Sebor

On 7/28/21 8:51 AM, Aldy Hernandez via Gcc-patches wrote:
> 
> 
> On 7/28/21 4:32 PM, Jeff Law wrote:
...
>>>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
>>>> index 863f1256811..0e205a41ac3 100644
>>>> --- a/gcc/Makefile.in
>>>> +++ b/gcc/Makefile.in
>>>> @@ -223,6 +223,11 @@ gimple-match.o-warn = -Wno-unused
>>>>   generic-match.o-warn = -Wno-unused
>>>>   dfp.o-warn = -Wno-strict-aliasing
>>>> +# maybe_emit_free_warning() is picking up the inlined location for the
>>>> +# warning, not the source of the original va_heap::release() function
>>>> +# which has a pragma disabling this warning.
>>>> +tree-ssa-loop-im.o-warn = -Wno-free-nonheap-object
>> I think some of Martin's work may help here, but I'm not sure if it's 
>> all gone in yet.  It might be worth syncing with him on the state of 
>> the improvements to how inlining and warnings interact.  If his work 
>> does fix the problem here, this hunk can be removed as a distinct 
>> follow-up.
> 
> Yes.  He definitely has some patches in this space that were likely to 
> fix this.  I will re-test without this hunk, and remove it if it's fixed.

I think I tested your patch on top of mine while it was still in
review and confirmed that the workaround isn't necessary anymore.
But if something's changed/regressed since then please let me know.

Martin


end of thread, other threads:[~2021-07-28 15:29 UTC | newest]

Thread overview: 20+ messages
2021-06-28 16:21 [PATCH 0/2] Ranger-based backwards threader implementation Aldy Hernandez
2021-06-28 16:21 ` [PATCH 1/2] Implement basic block path solver Aldy Hernandez
2021-07-01 22:20   ` Jeff Law
2021-07-02  8:13     ` Aldy Hernandez
2021-07-02 13:16       ` Andrew MacLeod
2021-07-15 14:55         ` Aldy Hernandez
2021-07-26 19:10       ` Jeff Law
2021-07-27  9:58         ` Aldy Hernandez
2021-07-27 15:18           ` Jeff Law
2021-06-28 16:21 ` [PATCH 2/2] Backwards jump threader rewrite with ranger Aldy Hernandez
2021-07-05 15:39   ` Aldy Hernandez
2021-07-15 14:57     ` Aldy Hernandez
2021-07-26 12:43       ` Aldy Hernandez
2021-07-28 14:32       ` Jeff Law
2021-07-28 14:51         ` Aldy Hernandez
2021-07-28 15:29           ` Martin Sebor
2021-06-28 23:19 ` [PATCH 0/2] Ranger-based backwards threader implementation Martin Sebor
2021-06-29 10:27   ` Aldy Hernandez
2021-06-29 21:22     ` Martin Sebor
2021-06-30  6:08       ` Aldy Hernandez
