public inbox for gcc-patches@gcc.gnu.org
* [00/23] Make fwprop use an on-the-side RTL SSA representation
From: Richard Sandiford @ 2020-11-13  8:10 UTC
  To: gcc-patches

Just after GCC 10 stage 1 closed (oops), I posted a patch to add a new
combine pass.  One of its main aims was to allow instructions to move
around where necessary in order to make a combination possible.
It also tried to parallelise instructions that use the same resource.

That pass contained its own code for maintaining limited def-use chains.
When I posted the patch, Segher asked why we wanted yet another piece
of pass-specific code to do that.  Although I had specific reasons
(which I explained at the time), I've gradually come round to agreeing
that that was a flaw.

This series of patches is the result of a Covid-time project to add
a more general, pass-agnostic framework.  There are two parts:
adding the framework itself, and using it to make fwprop.c faster.

The framework part
------------------

The framework provides an optional, on-the-side SSA view of existing
RTL instructions.  Each instruction gets a list of definitions and a
list of uses, with each use having a single definition.  Phi nodes
handle cases in which there are multiple possible definitions of a
register on entry to a basic block.  There are also routines for
updating instructions while keeping the SSA representation intact.
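
To make the shape of the information concrete, here is a deliberately
simplified model of the view.  It reuses the def_info/use_info/insn_info
names that appear later in the series, but the fields and their types
are illustrative only:

  // Illustrative sketch only: a stripped-down model of the on-the-side
  // SSA view.  Each instruction records what it defines and what it
  // uses, and each use points back at the single definition (possibly
  // a phi node) that reaches it.
  struct def_info;

  struct use_info
  {
    def_info *def;    // the unique definition that reaches this use
  };

  struct insn_info
  {
    rtx_insn *rtl;    // the existing RTL instruction being described
    def_info **defs;  // resources that the instruction sets
    use_info **uses;  // resources that the instruction reads
  };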

The aim is only to provide a different view of existing RTL instructions.
Unlike gimple, and unlike (IIRC) the old RTL SSA project from way back,
the new framework isn't a “native” SSA representation.  This means that
all inputs to a phi node for a register R are also definitions of
register R; no move operation is “hidden” in the phi node.

Like gimple, the framework treats memory as a single unified resource.

A more in-depth summary is contained in the doc patch, but here are
some other random notes:

* At the moment, the SSA information is local to one pass, but it might
  be good to maintain it between passes in future.

* The SSA code groups blocks into extended basic blocks, with the
  EBBs rather than individual blocks having phi nodes.  

* The framework also provides live range information for registers
  within an extended basic block and allows instructions to move within
  their EBB.  It might be useful to allow further movement in future;
  I just don't have a use case for it yet.

* One advantage of the new infrastructure is that it gives
  recog_for_combine-like behaviour: if recog wants to add clobbers
  of things like the flags register, the SSA code will make sure
  that the flags register is free.

* All current queries and updates have amortised sublinear complexity.
  Some updates are done lazily in order to avoid an upfront linear cost.

* I've tried to optimise the code for both memory footprint and
  compile time.  The first part involves quite a bit of overloading
  of pointers and various other kinds of reuse, so most of the new data
  structures use private member variables and public accessor functions.
  I know that style isn't universally popular, but I think it's
  justified here.  Things could easily go wrong if passes tried
  to operate directly on the underlying data structures.

* Debug instructions get SSA information too, on a best-effort basis.
  Providing complete information would be significantly more expensive.

* I wasn't sure for new C++ code whether to stick to the old C /* … */
  comments, or whether to switch to //.  In the end I went for //,
  on the basis that:

  - The ranger code already does this.

  - // is certainly more idiomatic in C++.

  - // is in the lisp tradition of per-line comments and it matches the
    ;; used in .md files.  I feel sure that GCC would have been written
    using // from the outset if that had been possible.

  The patches only do this for new files.  The aim is to ensure that
  each file is at least self-consistent.

Using RTL SSA to make fwprop faster
-----------------------------------

In order to show the thing in action, I tried to port fwprop.c
to use RTL SSA while preserving the pass's current heuristics as
much as possible.

To get an extreme measurement of speed, I made each fwprop pass
run 5000 times, calling:

  df_finish_pass (false);

after each iteration.  Usually only the first iteration would actually
do any optimisation; the other iterations simply measured the cost of
the instruction processing (see the sketch after the lists below).
In the case of the “old” pass, this included:

  - df_analyze (including solving the notes and md problems)
  - the dominator walk to build a list of single definitions

In the case of the “new” pass, this included:

  - df_analyze (with no additional problems)
  - building the SSA representation
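
A harness along the lines of the sketch below would reproduce the
measurement; the fwprop_one_pass entry point is an illustrative name
rather than the real pass interface:

  // Rerun the instruction processing many times so that its cost
  // dominates the profile, freeing the df information each time.
  static void
  time_fwprop_iterations ()
  {
    for (int i = 0; i < 5000; ++i)
      {
        fwprop_one_pass ();   // hypothetical per-iteration entry point
        df_finish_pass (false);
      }
  }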

On an --enable-checking=release compiler, the post-patch version was 23%
faster than the pre-patch version when compiling simplify-rtx.ii at -O.

When compiling simplify-rtx.ii at -O normally (without the hack above),
the compile-time improvement was ~0.5% (which was outside the noise).
The assembly output was unchanged.

Testing
-------

Tested so far on aarch64-linux-gnu, arm-linux-gnueabihf and
x86_64-linux-gnu.  I'll test on powerpc64le-linux-gnu too.

I also tried comparing code with the old and new fwprop.c implementations.
When testing the “old” fwprop.c implementation I applied the attached
patch to avoid a couple of quirks that would otherwise skew the results:

(1) The code that handled LO_SUM propagations had:

	  /* OP1 is likely not a legitimate address, otherwise there would have
	     been no LO_SUM.  We want it to disappear if it is invalid, return
	     false in that case.  */
	  return memory_address_p (mode, tem);

    But this early exit occurred before any replacement had been made,
    so the pass never substituted into LO_SUMs, even on a true return.

(2) use_killed_between didn't take advantage of the fact that
    frame_pointer_rtx and arg_pointer_rtx are function invariants.
    Sometimes it could prove this indirectly, but not always.

With those tweaks, the “old” and “new” passes produced the same assembly
output for some spot-checked GCC files, such as simplify-rtx.ii, optabs.ii,
etc., compiled with -O2 -g.

I also tried building glibc for aarch64-linux-gnu with both versions.
There were some cases in which INDEX+BASE addresses were canonicalised
as BASE+INDEX, but there were no other differences.

I also tried compiling gcc.dg, g++.dg and gcc.c-torture at -O2
-ftree-vectorize on at least one target per CPU directory.  The same
kind of address canonicalisation differences showed up here too,
but for most targets there were only a handful (<=10) cases in
which the new file had more or fewer lines.  Most of the differences
were improvements.

As a side-effect of the above, I tried building with gcc 10.2.1, gcc 7.4.0
and gcc 5.4.0.  I also tried gcc 4.8.5 and clang 10 for compatibility
purposes.

Thanks,
Richard


* [01/23] vec: Silence clang warning
From: Richard Sandiford @ 2020-11-13  8:11 UTC
  To: gcc-patches

I noticed during compatibility testing that clang warns that this
operator won't be implicitly const from C++14 onwards.
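
For reference, here is a minimal illustration of the language rule
involved (the struct is made up): in C++11 a constexpr non-static
member function is implicitly const, but C++14 removed that rule,
so clang warns that the function's type changes:

  struct S
  {
    // Implicitly "constexpr operator int () const" in C++11 only;
    // from C++14 onwards it is non-const unless marked const.
    constexpr operator int () { return 42; }
  };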

gcc/
	* vec.h (vnull::operator vec<T, A, L>): Make const.
---
 gcc/vec.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/vec.h b/gcc/vec.h
index 14d77e87342..f02beddc975 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -540,7 +540,7 @@ vec_copy_construct (T *dst, const T *src, unsigned n)
 struct vnull
 {
   template <typename T, typename A, typename L>
-  CONSTEXPR operator vec<T, A, L> () { return vec<T, A, L>(); }
+  CONSTEXPR operator vec<T, A, L> () const { return vec<T, A, L>(); }
 };
 extern vnull vNULL;
 
-- 
2.17.1



* [02/23] rtlanal: Remove noop_move_p REG_EQUAL condition
From: Richard Sandiford @ 2020-11-13  8:12 UTC
  To: gcc-patches

noop_move_p currently treats any instruction that has a REG_EQUAL note
as not being a no-op move, on the basis that the equality might be
useful in future.
But this creates a perverse incentive not to add potentially-useful
REG_EQUAL notes, in case they prevent an instruction from later being
removed as dead.

The condition originates from flow.c:life_analysis_1 and predates
the changes tracked by the current repository (1992).  It probably
made sense when most optimisations were done on RTL rather than FE
trees, but it seems counterproductive now.

gcc/
	* rtlanal.c (noop_move_p): Don't check for REG_EQUAL notes.
---
 gcc/rtlanal.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 01130a10783..6f521503c39 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -1668,10 +1668,6 @@ noop_move_p (const rtx_insn *insn)
   if (INSN_CODE (insn) == NOOP_MOVE_INSN_CODE)
     return 1;
 
-  /* Insns carrying these notes are useful later on.  */
-  if (find_reg_note (insn, REG_EQUAL, NULL_RTX))
-    return 0;
-
   /* Check the code to be executed for COND_EXEC.  */
   if (GET_CODE (pat) == COND_EXEC)
     pat = COND_EXEC_CODE (pat);
-- 
2.17.1



* [03/23] reginfo: Add a global_reg_set
From: Richard Sandiford @ 2020-11-13  8:12 UTC
  To: gcc-patches

A later patch wants to use the set of global registers as a HARD_REG_SET
rather than a bool/char array.  Most other arrays already have a
HARD_REG_SET counterpart, but this one didn't.
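
A sketch of the kind of constant-time query this enables; the function
is made up, but TEST_HARD_REG_BIT is the existing hard-reg-set.h
accessor:

  // Test whether REGNO has been marked as a global register.
  static bool
  is_global_reg_p (unsigned int regno)
  {
    return TEST_HARD_REG_BIT (global_reg_set, regno);
  }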

gcc/
	* hard-reg-set.h (global_reg_set): Declare.
	* reginfo.c (global_reg_set): New variable.
	(init_reg_sets_1, globalize_reg): Update it when globalizing
	registers.
---
 gcc/hard-reg-set.h | 2 ++
 gcc/reginfo.c      | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/gcc/hard-reg-set.h b/gcc/hard-reg-set.h
index 1ec1b4e4aa0..787da3a4f02 100644
--- a/gcc/hard-reg-set.h
+++ b/gcc/hard-reg-set.h
@@ -359,6 +359,8 @@ hard_reg_set_iter_next (hard_reg_set_iterator *iter, unsigned *regno)
 
 extern char global_regs[FIRST_PSEUDO_REGISTER];
 
+extern HARD_REG_SET global_reg_set;
+
 class simplifiable_subreg;
 class subreg_shape;
 
diff --git a/gcc/reginfo.c b/gcc/reginfo.c
index e34b74af9f1..cc7d17460eb 100644
--- a/gcc/reginfo.c
+++ b/gcc/reginfo.c
@@ -91,6 +91,9 @@ static const char initial_call_used_regs[] = CALL_USED_REGISTERS;
    and are also considered fixed.  */
 char global_regs[FIRST_PSEUDO_REGISTER];
 
+/* The set of global registers.  */
+HARD_REG_SET global_reg_set;
+
 /* Declaration for the global register. */
 tree global_regs_decl[FIRST_PSEUDO_REGISTER];
 
@@ -390,6 +393,7 @@ init_reg_sets_1 (void)
 	{
 	  fixed_regs[i] = call_used_regs[i] = 1;
 	  SET_HARD_REG_BIT (fixed_reg_set, i);
+	  SET_HARD_REG_BIT (global_reg_set, i);
 	}
     }
 
@@ -724,6 +728,7 @@ globalize_reg (tree decl, int i)
 
   global_regs[i] = 1;
   global_regs_decl[i] = decl;
+  SET_HARD_REG_BIT (global_reg_set, i);
 
   /* If we're globalizing the frame pointer, we need to set the
      appropriate regs_invalidated_by_call bit, even if it's already
-- 
2.17.1



* [04/23] Move iterator_range to a new iterator-utils.h file
From: Richard Sandiford @ 2020-11-13  8:13 UTC
  To: gcc-patches

A later patch will add more iterator-related utilities.  Rather than
putting them all directly in coretypes.h, it seemed better to add a
new header file, here called "iterator-utils.h".  This preliminary
patch moves the existing iterator_range class there too.

I used the same copyright date range as coretypes.h “just to be sure”.
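
As a reminder of what the class provides, here is a small usage sketch
(the function itself is made up):

  // Sum a half-open [begin, end) range by wrapping the iterator pair
  // so that it can be used in a range-based for loop.
  static int
  sum (const int *begin, const int *end)
  {
    int total = 0;
    for (int value : iterator_range<const int *> (begin, end))
      total += value;
    return total;
  }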

gcc/
	* coretypes.h (iterator_range): Move to...
	* iterator-utils.h: ...this new file.
---
 gcc/coretypes.h      | 18 +-----------------
 gcc/iterator-utils.h | 44 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+), 17 deletions(-)
 create mode 100644 gcc/iterator-utils.h

diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index da178b6a9f6..043df12f588 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -367,23 +367,6 @@ struct kv_pair
   const ValueType value;	/* the value of the name */
 };
 
-/* Iterator pair used for a collection iteration with range-based loops.  */
-
-template<typename T>
-struct iterator_range
-{
-public:
-  iterator_range (const T &begin, const T &end)
-    : m_begin (begin), m_end (end) {}
-
-  T begin () const { return m_begin; }
-  T end () const { return m_end; }
-
-private:
-  T m_begin;
-  T m_end;
-};
-
 #else
 
 struct _dont_use_rtx_here_;
@@ -491,6 +474,7 @@ typedef unsigned char uchar;
 #include "align.h"
 /* Most host source files will require the following headers.  */
 #if !defined (GENERATOR_FILE)
+#include "iterator-utils.h"
 #include "real.h"
 #include "fixed-value.h"
 #include "hash-table.h"
diff --git a/gcc/iterator-utils.h b/gcc/iterator-utils.h
new file mode 100644
index 00000000000..0c95862c7ca
--- /dev/null
+++ b/gcc/iterator-utils.h
@@ -0,0 +1,44 @@
+// Iterator-related utilities.
+// Copyright (C) 2002-2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef GCC_ITERATOR_UTILS_H
+#define GCC_ITERATOR_UTILS_H 1
+
+// A half-open [begin, end) range of iterators.
+template<typename T>
+struct iterator_range
+{
+public:
+  using const_iterator = T;
+
+  iterator_range () = default;
+  iterator_range (const T &begin, const T &end)
+    : m_begin (begin), m_end (end) {}
+
+  T begin () const { return m_begin; }
+  T end () const { return m_end; }
+
+  explicit operator bool () const { return m_begin != m_end; }
+
+private:
+  T m_begin;
+  T m_end;
+};
+
+#endif
-- 
2.17.1



* [05/23] Add more iterator utilities
From: Richard Sandiford @ 2020-11-13  8:13 UTC
  To: gcc-patches

This patch adds some more iterator helper classes.  They really fall
into two groups, but there didn't seem much value in separating them:

- A later patch has a class hierarchy of the form:

     Base
      +- Derived1
      +- Derived2

  A class wants to store an array A1 of Derived1 pointers and an
  array A2 of Derived2 pointers.  However, for compactness reasons,
  it was convenient to have a single array of Base pointers,
  with A1 and A2 being slices of this array.  This reduces the
  overhead from two pointers and two ints (3 LP64 words) to one
  pointer and two ints (2 LP64 words).

  But consumers of the class shouldn't be aware of this: they should
  see A1 as containing Derived1 pointers rather than Base pointers
  and A2 as containing Derived2 pointers rather than Base pointers.
  This patch adds derived_iterator and const_derived_container
  classes to support this use case.

- A later patch also adds various linked lists.  This patch adds
  wrapper_iterator and list_iterator classes to make it easier
  to create iterators for these linked lists.  For example:

    // Iterators for lists of definitions.
    using def_iterator = list_iterator<def_info, &def_info::next_def>;
    using reverse_def_iterator
      = list_iterator<def_info, &def_info::prev_def>;

  This in turn makes it possible to use range-based for loops
  on the lists.

The patch just adds the things that the later patches need; it doesn't
try to make the classes as functionally complete as possible.  I think
we should add extra functionality when needed rather than ahead of time.
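
To make the list_iterator usage above concrete, here is a self-contained
sketch; toy_node is made up rather than one of the classes added later
in the series:

    // A toy singly-linked node with the accessor shape that
    // list_iterator expects: a const member function that returns
    // the next node.
    struct toy_node
    {
      toy_node *next () const { return m_next; }

      int value;
      toy_node *m_next;
    };

    using toy_iterator = list_iterator<toy_node, &toy_node::next>;

    // Sum a null-terminated chain of nodes using a range-based for
    // loop; the end iterator wraps the terminating null pointer.
    static int
    sum_list (toy_node *head)
    {
      int total = 0;
      for (toy_node *node : iterator_range<toy_iterator>
	     (toy_iterator (head), toy_iterator (nullptr)))
	total += node->value;
      return total;
    }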

gcc/
	* iterator-utils.h (derived_iterator): New class.
	(const_derived_container, wrapper_iterator): Likewise.
	(list_iterator): Likewise.
---
 gcc/iterator-utils.h | 159 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 159 insertions(+)

diff --git a/gcc/iterator-utils.h b/gcc/iterator-utils.h
index 0c95862c7ca..22cc1a545ef 100644
--- a/gcc/iterator-utils.h
+++ b/gcc/iterator-utils.h
@@ -41,4 +41,163 @@ private:
   T m_end;
 };
 
+// Provide an iterator like BaseIT, except that it yields values of type T,
+// which is derived from the type that BaseIT normally yields.
+//
+// The class doesn't inherit from BaseIT for two reasons:
+// - using inheritance would stop the class working with plain pointers
+// - not using inheritance increases type-safety for writable iterators
+//
+// Constructing this class from a BaseIT involves an assertion that all
+// contents really do have type T.  The constructor is therefore explicit.
+template<typename T, typename BaseIT>
+class derived_iterator
+{
+public:
+  using value_type = T;
+
+  derived_iterator () = default;
+
+  template<typename... Ts>
+  explicit derived_iterator (Ts... args)
+    : m_base (std::forward<Ts> (args)...) {}
+
+  derived_iterator &operator++ () { ++m_base; return *this; }
+  derived_iterator operator++ (int);
+
+  T operator* () const { return static_cast<T> (*m_base); }
+  T *operator-> () const { return static_cast<T *> (m_base.operator-> ()); }
+
+  bool operator== (const derived_iterator &other) const;
+  bool operator!= (const derived_iterator &other) const;
+
+protected:
+  BaseIT m_base;
+};
+
+template<typename T, typename BaseIT>
+inline derived_iterator<T, BaseIT>
+derived_iterator<T, BaseIT>::operator++ (int)
+{
+  derived_iterator ret = *this;
+  ++m_base;
+  return ret;
+}
+
+template<typename T, typename BaseIT>
+inline bool
+derived_iterator<T, BaseIT>::operator== (const derived_iterator &other) const
+{
+  return m_base == other.m_base;
+}
+
+template<typename T, typename BaseIT>
+inline bool
+derived_iterator<T, BaseIT>::operator!= (const derived_iterator &other) const
+{
+  return m_base != other.m_base;
+}
+
+// Provide a constant view of a BaseCT in which every value is known to
+// have type T, which is derived from the type that BaseCT normally presents.
+//
+// Constructing this class from a BaseCT involves an assertion that all
+// contents really do have type T.  The constructor is therefore explicit.
+template<typename T, typename BaseCT>
+class const_derived_container : public BaseCT
+{
+  using base_const_iterator = typename BaseCT::const_iterator;
+
+public:
+  using value_type = T;
+  using const_iterator = derived_iterator<T, base_const_iterator>;
+
+  const_derived_container () = default;
+
+  template<typename... Ts>
+  explicit const_derived_container (Ts... args)
+    : BaseCT (std::forward<Ts> (args)...) {}
+
+  const_iterator begin () const { return const_iterator (BaseCT::begin ()); }
+  const_iterator end () const { return const_iterator (BaseCT::end ()); }
+
+  T front () const { return static_cast<T> (BaseCT::front ()); }
+  T back () const { return static_cast<T> (BaseCT::back ()); }
+  T operator[] (unsigned int i) const;
+};
+
+template<typename T, typename BaseCT>
+inline T
+const_derived_container<T, BaseCT>::operator[] (unsigned int i) const
+{
+  return static_cast<T> (BaseCT::operator[] (i));
+}
+
+// A base class for iterators whose contents consist of a StoredT and that
+// when dereferenced yield those StoredT contents as a T.  Derived classes
+// should implement at least operator++ or operator--.
+template<typename T, typename StoredT = T>
+class wrapper_iterator
+{
+public:
+  using value_type = T;
+
+  wrapper_iterator () = default;
+
+  template<typename... Ts>
+  wrapper_iterator (Ts... args) : m_contents (std::forward<Ts> (args)...) {}
+
+  T operator* () const { return static_cast<T> (m_contents); }
+  bool operator== (const wrapper_iterator &) const;
+  bool operator!= (const wrapper_iterator &) const;
+
+protected:
+  StoredT m_contents;
+};
+
+template<typename T, typename StoredT>
+inline bool
+wrapper_iterator<T, StoredT>::operator== (const wrapper_iterator &other) const
+{
+  return m_contents == other.m_contents;
+}
+
+template<typename T, typename StoredT>
+inline bool
+wrapper_iterator<T, StoredT>::operator!= (const wrapper_iterator &other) const
+{
+  return m_contents != other.m_contents;
+}
+
+// A forward iterator for a linked list whose nodes are referenced using
+// type T.  Given a node "T N", the next element is given by (N->*Next) ().
+template<typename T, T *(T::*Next) () const>
+class list_iterator : public wrapper_iterator<T *>
+{
+private:
+  using parent = wrapper_iterator<T *>;
+
+public:
+  using parent::parent;
+  list_iterator &operator++ ();
+  list_iterator operator++ (int);
+};
+
+template<typename T, T *(T::*Next) () const>
+inline list_iterator<T, Next> &
+list_iterator<T, Next>::operator++ ()
+{
+  this->m_contents = (this->m_contents->*Next) ();
+  return *this;
+}
+
+template<typename T, T *(T::*Next) () const>
+inline list_iterator<T, Next>
+list_iterator<T, Next>::operator++ (int)
+{
+  list_iterator ret = *this;
+  this->m_contents = (this->m_contents->*Next) ();
+  return ret;
+}
+
 #endif
-- 
2.17.1



* [06/23] Add an RAII class for managing obstacks
From: Richard Sandiford @ 2020-11-13  8:14 UTC
  To: gcc-patches

This patch adds an RAII class for managing the lifetimes of objects
on an obstack.  See the comments in the patch for more details and
example usage.

gcc/
	* obstack-utils.h: New file.
---
 gcc/obstack-utils.h | 86 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 86 insertions(+)
 create mode 100644 gcc/obstack-utils.h

diff --git a/gcc/obstack-utils.h b/gcc/obstack-utils.h
new file mode 100644
index 00000000000..ee389f89923
--- /dev/null
+++ b/gcc/obstack-utils.h
@@ -0,0 +1,86 @@
+// Obstack-related utilities.
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef GCC_OBSTACK_UTILS_H
+#define GCC_OBSTACK_UTILS_H
+
+// This RAII class automatically frees memory allocated on an obstack,
+// unless told not to via keep ().  It automatically converts to an
+// obstack, so it can (optionally) be used in place of the obstack
+// to make the scoping clearer.  For example:
+//
+//     obstack_watermark watermark (ob);
+//     auto *ptr1 = XOBNEW (watermark, struct1);
+//     if (...)
+//       // Frees ptr1.
+//       return false;
+//
+//     auto *ptr2 = XOBNEW (watermark, struct2);
+//     if (...)
+//       // Frees ptr1 and ptr2.
+//       return false;
+//
+//     // Retains ptr1 and ptr2.
+//     watermark.keep ();
+//
+//     auto *ptr3 = XOBNEW (watermark, struct3);
+//     if (...)
+//       // Frees ptr3.
+//       return false;
+//
+//     // Retains ptr3 (in addition to ptr1 and ptr2 above).
+//     watermark.keep ();
+//     return true;
+//
+// The move constructor makes it possible to transfer ownership to a caller:
+//
+//     obstack_watermark
+//     foo ()
+//     {
+//       obstack_watermark watermark (ob);
+//       ...
+//       return watermark;
+//     }
+//
+//     void
+//     bar ()
+//     {
+//       // Inherit ownership of everything that foo allocated.
+//       obstack_watermark watermark = foo ();
+//       ...
+//     }
+class obstack_watermark
+{
+public:
+  obstack_watermark (obstack *ob) : m_obstack (ob) { keep (); }
+  constexpr obstack_watermark (obstack_watermark &&) = default;
+  ~obstack_watermark () { obstack_free (m_obstack, m_start); }
+
+  operator obstack *() const { return m_obstack; }
+  void keep () { m_start = XOBNEWVAR (m_obstack, char, 0); }
+
+private:
+  DISABLE_COPY_AND_ASSIGN (obstack_watermark);
+
+protected:
+  obstack *m_obstack;
+  char *m_start;
+};
+
+#endif
-- 
2.17.1



* [07/23] Add a class that multiplexes two pointer types
From: Richard Sandiford @ 2020-11-13  8:14 UTC
  To: gcc-patches

This patch adds a pointer_mux<T1, T2> class that provides similar
functionality to:

    union { T1 *a; T2 *b; };
    ...
    bool is_b_rather_than_a;

except that the is_b_rather_than_a tag is stored in the low bit
of the pointer.  See the comments in the patch for a comparison
between the two approaches and why this one can be more efficient.

I've tried to micro-optimise the class a fair bit, since a later
patch uses it extensively in order to keep the sizes of data
structures down.
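
A short usage sketch, with made-up types:

  // Store either a note or a block of data in a single pointer-sized
  // field; the low bit of the pointer distinguishes the two cases.
  struct note_info { int kind; };
  struct data_info { int payload; };

  static int
  describe (pointer_mux<note_info, data_info> mux)
  {
    // A null result from dyn_cast can only mean "not a data_info",
    // since B pointers are nonnull by construction.
    if (data_info *data = mux.dyn_cast<data_info *> ())
      return data->payload;
    return mux.known_first () ? mux.known_first ()->kind : -1;
  }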

gcc/
	* mux-utils.h: New file.
---
 gcc/mux-utils.h | 248 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 248 insertions(+)
 create mode 100644 gcc/mux-utils.h

diff --git a/gcc/mux-utils.h b/gcc/mux-utils.h
new file mode 100644
index 00000000000..17ced49cd22
--- /dev/null
+++ b/gcc/mux-utils.h
@@ -0,0 +1,248 @@
+// Multiplexer utilities
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef GCC_MUX_UTILS_H
+#define GCC_MUX_UTILS_H 1
+
+// A class that stores a choice "A or B", where A has type T1 * and B has
+// type T2 *.  Both T1 and T2 must have an alignment greater than 1, since
+// the low bit is used to identify B over A.  T1 and T2 can be the same.
+//
+// A can be a null pointer but B cannot.
+//
+// Barring the requirement that B must be nonnull, using the class is
+// equivalent to using:
+//
+//     union { T1 *A; T2 *B; };
+//
+// and having a separate tag bit to indicate which alternative is active.
+// However, using this class can have two advantages over a union:
+//
+// - It avoids the need to find somewhere to store the tag bit.
+//
+// - The compiler is aware that B cannot be null, which can make checks
+//   of the form:
+//
+//       if (auto *B = mux.dyn_cast<T2 *> ())
+//
+//   more efficient.  With a union-based representation, the dyn_cast
+//   check could fail either because MUX is an A or because MUX is a
+//   null B, both of which require a run-time test.  With a pointer_mux,
+//   only a check for MUX being A is needed.
+template<typename T1, typename T2 = T1>
+class pointer_mux
+{
+public:
+  // Return an A pointer with the given value.
+  static pointer_mux first (T1 *);
+
+  // Return a B pointer with the given (nonnull) value.
+  static pointer_mux second (T2 *);
+
+  pointer_mux () = default;
+
+  // Create a null A pointer.
+  pointer_mux (std::nullptr_t) : m_ptr (nullptr) {}
+
+  // Create an A or B pointer with the given value.  This is only valid
+  // if T1 and T2 are distinct and if T can be resolved to exactly one
+  // of them.
+  template<typename T,
+	   typename Enable = typename
+	     std::enable_if<std::is_convertible<T *, T1 *>::value
+			    != std::is_convertible<T *, T2 *>::value>::type>
+  pointer_mux (T *ptr);
+
+  // Return true unless the pointer is a null A pointer.
+  explicit operator bool () const { return m_ptr; }
+
+  // Assign A and B pointers respectively.
+  void set_first (T1 *ptr) { *this = first (ptr); }
+  void set_second (T2 *ptr) { *this = second (ptr); }
+
+  // Return true if the pointer is an A pointer.
+  bool is_first () const { return !(uintptr_t (m_ptr) & 1); }
+
+  // Return true if the pointer is a B pointer.
+  bool is_second () const { return uintptr_t (m_ptr) & 1; }
+
+  // Return the contents of the pointer, given that it is known to be
+  // an A pointer.
+  T1 *known_first () const { return reinterpret_cast<T1 *> (m_ptr); }
+
+  // Return the contents of the pointer, given that it is known to be
+  // a B pointer.
+  T2 *known_second () const { return reinterpret_cast<T2 *> (m_ptr - 1); }
+
+  // If the pointer is an A pointer, return its contents, otherwise
+  // return null.  Thus a null return can mean that the pointer is
+  // either a null A pointer or a B pointer.
+  //
+  // If all A pointers are nonnull, it is more efficient to use:
+  //
+  //    if (ptr.is_first ())
+  //      ...use ptr.known_first ()...
+  //
+  // over:
+  //
+  //    if (T1 *a = ptr.first_or_null ())
+  //      ...use a...
+  T1 *first_or_null () const;
+
+  // If the pointer is a B pointer, return its contents, otherwise
+  // return null.  Using:
+  //
+  //    if (T1 *b = ptr.second_or_null ())
+  //      ...use b...
+  //
+  // should be at least as efficient as:
+  //
+  //    if (ptr.is_second ())
+  //      ...use ptr.known_second ()...
+  T2 *second_or_null () const;
+
+  // Return true if the pointer is a T.
+  //
+  // This is only valid if T1 and T2 are distinct and if T can be
+  // resolved to exactly one of them.  The condition is checked using
+  // a static assertion rather than SFINAE because it gives a clearer
+  // error message.
+  template<typename T>
+  bool is_a () const;
+
+  // Assert that the pointer is a T and return it as such.  See is_a
+  // for the restrictions on T.
+  template<typename T>
+  T as_a () const;
+
+  // If the pointer is a T, return it as such, otherwise return null.
+  // See is_a for the restrictions on T.
+  template<typename T>
+  T dyn_cast () const;
+
+private:
+  pointer_mux (char *ptr) : m_ptr (ptr) {}
+
+  // The pointer value for A pointers, or the pointer value + 1 for B pointers.
+  // Using a pointer rather than a uintptr_t tells the compiler that second ()
+  // can never return null, and that second_or_null () is only null if
+  // is_first ().
+  char *m_ptr;
+};
+
+template<typename T1, typename T2>
+inline pointer_mux<T1, T2>
+pointer_mux<T1, T2>::first (T1 *ptr)
+{
+  gcc_checking_assert (!(uintptr_t (ptr) & 1));
+  return reinterpret_cast<char *> (ptr);
+}
+
+template<typename T1, typename T2>
+inline pointer_mux<T1, T2>
+pointer_mux<T1, T2>::second (T2 *ptr)
+{
+  gcc_checking_assert (!(uintptr_t (ptr) & 1));
+  return reinterpret_cast<char *> (ptr) + 1;
+}
+
+template<typename T1, typename T2>
+template<typename T, typename Enable>
+inline pointer_mux<T1, T2>::pointer_mux (T *ptr)
+  : m_ptr (reinterpret_cast<char *> (ptr))
+{
+  if (std::is_convertible<T *, T2 *>::value)
+    m_ptr += 1;
+}
+
+template<typename T1, typename T2>
+inline T1 *
+pointer_mux<T1, T2>::first_or_null () const
+{
+  return is_first () ? known_first () : nullptr;
+}
+
+template<typename T1, typename T2>
+inline T2 *
+pointer_mux<T1, T2>::second_or_null () const
+{
+  // Micro optimization that's effective as of GCC 11: compute the value
+  // of the second pointer as an integer and test that, so that the integer
+  // result can be reused as the pointer and so that all computation can
+  // happen before a branch on null.  This reduces the number of branches
+  // needed for loops.
+  return uintptr_t (m_ptr - 1) & 1 ? nullptr : known_second ();
+}
+
+template<typename T1, typename T2>
+template<typename T>
+inline bool
+pointer_mux<T1, T2>::is_a () const
+{
+  static_assert (std::is_convertible<T1 *, T>::value
+		 != std::is_convertible<T2 *, T>::value,
+		 "Ambiguous pointer type");
+  if (std::is_convertible<T2 *, T>::value)
+    return is_second ();
+  else
+    return is_first ();
+}
+
+template<typename T1, typename T2>
+template<typename T>
+inline T
+pointer_mux<T1, T2>::as_a () const
+{
+  static_assert (std::is_convertible<T1 *, T>::value
+		 != std::is_convertible<T2 *, T>::value,
+		 "Ambiguous pointer type");
+  if (std::is_convertible<T2 *, T>::value)
+    {
+      gcc_checking_assert (is_second ());
+      return reinterpret_cast<T> (m_ptr - 1);
+    }
+  else
+    {
+      gcc_checking_assert (is_first ());
+      return reinterpret_cast<T> (m_ptr);
+    }
+}
+
+template<typename T1, typename T2>
+template<typename T>
+inline T
+pointer_mux<T1, T2>::dyn_cast () const
+{
+  static_assert (std::is_convertible<T1 *, T>::value
+		 != std::is_convertible<T2 *, T>::value,
+		 "Ambiguous pointer type");
+  if (std::is_convertible<T2 *, T>::value)
+    {
+      if (is_second ())
+	return reinterpret_cast<T> (m_ptr - 1);
+    }
+  else
+    {
+      if (is_first ())
+	return reinterpret_cast<T> (m_ptr);
+    }
+  return nullptr;
+}
+
+#endif
-- 
2.17.1



* [08/23] Add an alternative splay tree implementation
From: Richard Sandiford @ 2020-11-13  8:15 UTC
  To: gcc-patches

We already have two splay tree implementations: the old C one in
libiberty and a templated reimplementation of it in typed-splay-tree.h.
However, they have some drawbacks:

- They hard-code the assumption that nodes should have both a key and
  a value, which isn't always true.

- They use the two-phase method of lookup, and so nodes need to store
  a temporary back pointer.  We can avoid that overhead by using the
  top-down method (as e.g. the bitmap tree code already does).

- The tree node has to own the key and the value.  For some use cases
  it's more convenient to embed the tree links in the value instead.

Also, a later patch wants to use splay trees to represent an
adaptive total order: the splay tree itself records whether node N1
is less than node N2, and (in the worst case) comparing nodes is
a splay operation.

This patch therefore adds an alternative implementation.  The main
features are:

- Nodes can optionally point back to their parents.

- An Accessors class abstracts accessing child nodes and (where
  applicable) parent nodes, so that the information can be embedded
  in larger data structures.

- There is no fixed comparison function at the class level.  Instead,
  individual functions that do comparisons take a comparison function
  argument.

- There are two styles of comparison function, optimised for different
  use cases.  (See the comments in the patch for details.)

- It's possible to do some operations directly on a given node,
  without knowing whether it's the root.  This includes the comparison
  use case described above.

This of course has its own set of drawbacks.  It's really providing
splay utility functions rather than a true ADT, and so is more low-level
than the existing routines.  It's mostly geared for cases in which the
client code wants to participate in the splay operations to some extent.
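
As a taste of the embedding style, here is a sketch of a node type that
stores its own links (range_node and its fields are made up; the
accessor and tree classes are the ones added by the patch):

  // A node that embeds its two child links directly, so that no
  // separate tree-node allocation is needed.
  struct range_node
  {
    int start, end;
    range_node *m_children[2];
  };

  // A rooted splay tree over such nodes, without parent links.
  using range_tree
    = rooted_splay_tree<splay_tree_accessors_without_parent
			<default_splay_tree_accessors<range_node *>>>;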

gcc/
	* Makefile.in (OBJS): Add splay-tree-utils.o.
	* system.h: Include <array> when INCLUDE_ARRAY is defined.
	* selftest.h (splay_tree_cc_tests): Declare.
	* selftest-run-tests.c (selftest::run_tests): Run splay_tree_cc_tests.
	* splay-tree-utils.h: New file.
	* splay-tree-utils.tcc: Likewise.
	* splay-tree-utils.cc: Likewise.
---
 gcc/Makefile.in          |   1 +
 gcc/selftest-run-tests.c |   1 +
 gcc/selftest.h           |   1 +
 gcc/splay-tree-utils.cc  | 264 +++++++++++
 gcc/splay-tree-utils.h   | 491 ++++++++++++++++++++
 gcc/splay-tree-utils.tcc | 960 +++++++++++++++++++++++++++++++++++++++
 gcc/system.h             |   3 +
 7 files changed, 1721 insertions(+)
 create mode 100644 gcc/splay-tree-utils.cc
 create mode 100644 gcc/splay-tree-utils.h
 create mode 100644 gcc/splay-tree-utils.tcc

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 978a08f7b04..900bf11b0ba 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1540,6 +1540,7 @@ OBJS = \
 	sparseset.o \
 	spellcheck.o \
 	spellcheck-tree.o \
+	splay-tree-utils.o \
 	sreal.o \
 	stack-ptr-mod.o \
 	statistics.o \
diff --git a/gcc/selftest-run-tests.c b/gcc/selftest-run-tests.c
index 7a89b2df5bd..c0c18ad17ca 100644
--- a/gcc/selftest-run-tests.c
+++ b/gcc/selftest-run-tests.c
@@ -79,6 +79,7 @@ selftest::run_tests ()
   optinfo_emit_json_cc_tests ();
   opt_problem_cc_tests ();
   ordered_hash_map_tests_cc_tests ();
+  splay_tree_cc_tests ();
 
   /* Mid-level data structures.  */
   input_c_tests ();
diff --git a/gcc/selftest.h b/gcc/selftest.h
index 963e074b4d2..b6e4345b19f 100644
--- a/gcc/selftest.h
+++ b/gcc/selftest.h
@@ -256,6 +256,7 @@ extern void selftest_c_tests ();
 extern void simplify_rtx_c_tests ();
 extern void spellcheck_c_tests ();
 extern void spellcheck_tree_c_tests ();
+extern void splay_tree_cc_tests ();
 extern void sreal_c_tests ();
 extern void store_merging_c_tests ();
 extern void tree_c_tests ();
diff --git a/gcc/splay-tree-utils.cc b/gcc/splay-tree-utils.cc
new file mode 100644
index 00000000000..4b2007b8414
--- /dev/null
+++ b/gcc/splay-tree-utils.cc
@@ -0,0 +1,264 @@
+// Splay tree utilities                                             -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#define INCLUDE_ALGORITHM
+#define INCLUDE_ARRAY
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "pretty-print.h"
+#include "splay-tree-utils.h"
+#include "selftest.h"
+
+#if CHECKING_P
+namespace {
+// A simple test node for rootless_splay_tree.
+struct rootless_test_node
+{
+  int data;
+  rootless_test_node *m_parent;
+  rootless_test_node *m_children[2];
+};
+}
+
+namespace selftest {
+
+// Random input data.
+static const size_t MAX_DATA = 32768;
+static const int data[] = {
+  1379, 14643, 30579, 28160, 31750, 22280, 5502, 4720, 30075, 27595,
+  8395, 19410, 518, 19709, 29694, 19865, 25372, 11752, 15485, 21547,
+  25153, 25072, 10146, 3341, 15625, 3038, 10189, 19943, 1322, 11762,
+  807, 430, 11284, 11841, 23965, 32008, 4547, 8087, 13225, 23054,
+  22284, 13756, 2182, 26450, 30482, 32502, 23348, 20265, 29509, 3290,
+  10807, 1242, 3212, 32178, 25354, 22032, 30509, 16157, 22432, 1295,
+  8348, 23342, 24678, 193, 31016, 10316, 3872, 13521, 19211, 30594,
+  12229, 4794, 25083, 16098, 28144, 27896, 4801, 20689, 31450, 15614,
+  19597, 13731, 30309, 24846, 11042, 31929, 18306, 28520, 16907, 12488,
+  15001, 18487, 3438, 1706, 4829, 20892, 6226, 18204, 15776, 30717,
+  19398, 2480, 19434, 2838, 2605, 3994, 22538, 12269, 6486, 1314,
+  30301, 9919, 31405, 30847, 25000, 24013, 22196, 30220, 31415, 14630,
+  26319, 4880, 21292, 20217, 20078, 14679, 25686, 28675, 13883, 14853,
+  2872, 2428, 3636, 14131, 2952, 2133, 4470, 25808, 12576, 31395,
+  5938, 28393, 14553, 4494, 14928, 24310, 17394, 17436, 23385, 22792,
+  9785, 13118, 22338, 23320, 27059, 17663, 16434, 14954, 16962, 31088,
+  22247, 22600, 7980, 1344, 15635, 13611, 32739, 3283, 12924, 17904,
+  28216, 7542, 9212, 28308, 18873, 3912, 5473, 4666, 11900, 21420,
+  20072, 27662, 16445, 29848, 24444, 31668, 30664, 14287, 13754, 29276,
+  21462, 25517, 17632, 8105, 32510, 16677, 11162, 20734, 26873, 5097
+};
+
+// Look up VALUE in TREE using the single-comparator lookup function.
+static int
+lookup1 (splay_tree<int> &tree, int value)
+{
+  auto compare = [&](splay_tree_node<int> *node)
+    {
+      return value - node->value ();
+    };
+  return tree.lookup (compare);
+}
+
+// Look up VALUE in TREE using the double-comparator lookup function.
+static int
+lookup2 (splay_tree<int> &tree, int value)
+{
+  auto want_something_smaller = [&](splay_tree_node<int> *node)
+    {
+      return value < node->value ();
+    };
+  auto want_something_bigger = [&](splay_tree_node<int> *node)
+    {
+      return value > node->value ();
+    };
+  return tree.lookup (want_something_smaller, want_something_bigger);
+}
+
+// Test printing TREE to a pretty printer.  Don't check the output against
+// anything; just make sure that it doesn't crash.
+static void
+test_print (splay_tree<int> &tree)
+{
+  auto print_node = [](pretty_printer *pp, splay_tree_node<int> *node)
+    {
+      pp_decimal_int (pp, node->value ());
+    };
+  pretty_printer pp;
+  tree.print (&pp, print_node);
+}
+
+// Test various lookups on TREE using LOOKUP, where lookup returns the
+// same kind of value as the rooted_splay_tree lookup functions.
+static void
+test_lookup (splay_tree<int> &tree, int (*lookup) (splay_tree<int> &, int))
+{
+  // Look up values that are known to exist.
+  for (int value : data)
+    ASSERT_EQ (lookup (tree, value), 0);
+
+  // Look up values that are 1 less than values that are known to exist.
+  for (int value : data)
+    {
+      int result = lookup (tree, value - 1);
+      if (result == 0)
+	ASSERT_EQ (tree->value (), value - 1);
+      else if (result < 0)
+	// VALUE - 1 is less than the root.
+	ASSERT_EQ (tree->value (), value);
+      else if (result > 0)
+	{
+	  // VALUE - 1 is greater than the root.
+	  ASSERT_TRUE (tree->value () < value - 1);
+	  if (tree.splay_next_node ())
+	    ASSERT_EQ (tree->value (), value);
+	}
+    }
+
+  // Look up values that are 1 greater than values that are known to exist.
+  for (int value : data)
+    {
+      int result = lookup (tree, value + 1);
+      if (result == 0)
+	ASSERT_EQ (tree->value (), value + 1);
+      else if (result < 0)
+	{
+	  // VALUE + 1 is less than the root.
+	  ASSERT_TRUE (tree->value () > value + 1);
+	  if (tree.splay_prev_node ())
+	    ASSERT_EQ (tree->value (), value);
+	}
+      else if (result > 0)
+	// VALUE + 1 is greater than the root.
+	ASSERT_EQ (tree->value (), value);
+    }
+}
+
+// Run all tests for this module.
+void
+splay_tree_cc_tests ()
+{
+  obstack ob;
+  gcc_obstack_init (&ob);
+
+  // Build up the splay tree.
+  splay_tree<int> tree;
+  for (int value : data)
+    {
+      auto *node = XOBNEW (&ob, splay_tree_node<int>);
+      new (node) splay_tree_node<int> (value);
+      auto compare = [&](splay_tree_node<int> *other_node)
+	{
+	  return value - other_node->value ();
+	};
+      bool inserted = tree.insert (node, compare);
+      ASSERT_TRUE (inserted);
+    }
+
+  // Test the single-comparator lookup function.
+  test_lookup (tree, lookup1);
+
+  // Sort the input data.
+  std::array<int, ARRAY_SIZE (data)> sorted;
+  std::copy (data, data + ARRAY_SIZE (data), sorted.begin ());
+  std::sort (sorted.begin (), sorted.end ());
+
+  // Iterate over the tree in ascending order.
+  tree.splay_min_node ();
+  bool result = true;
+  for (int value : sorted)
+    {
+      ASSERT_TRUE (result);
+      ASSERT_EQ (tree->value (), value);
+      result = tree.splay_next_node ();
+    }
+  ASSERT_FALSE (result);
+  ASSERT_EQ (tree.min_node ()->value (), sorted.front ());
+
+  // Test the double-comparator lookup function.
+  test_lookup (tree, lookup2);
+
+  // Test printing the tree now, while it's still bushy.
+  test_print (tree);
+
+  // Iterate over the tree in descending order.
+  tree.splay_max_node ();
+  result = true;
+  for (auto it = sorted.rbegin (); it != sorted.rend (); ++it)
+    {
+      ASSERT_TRUE (result);
+      ASSERT_EQ (tree->value (), *it);
+      result = tree.splay_prev_node ();
+    }
+  ASSERT_FALSE (result);
+  ASSERT_EQ (tree.max_node ()->value (), sorted.back ());
+
+  // Try splitting the tree into three.
+  int mid_min = sorted[sorted.size () / 3];
+  int mid_max = sorted[sorted.size () * 2 / 3];
+  ASSERT_EQ (lookup1 (tree, mid_min), 0);
+  splay_tree<int> left = tree.split_before_root ();
+  ASSERT_EQ (lookup1 (tree, mid_max), 0);
+  splay_tree<int> right = tree.split_after_root ();
+
+  // Test removing all the nodes from their respective trees.
+  for (int value : data)
+    {
+      splay_tree<int> &t = (value < mid_min ? left
+			    : value > mid_max ? right : tree);
+      ASSERT_EQ (lookup1 (t, value), 0);
+      t.remove_root ();
+    }
+  ASSERT_EQ (left.root (), nullptr);
+  ASSERT_EQ (tree.root (), nullptr);
+  ASSERT_EQ (right.root (), nullptr);
+
+  using rootless = default_rootless_splay_tree<rootless_test_node *>;
+
+  // Build a tree in ascending order with the lowest element as the root.
+  auto *nodes = XOBNEWVEC (&ob, rootless_test_node *, MAX_DATA);
+  rootless_test_node *parent = nullptr;
+  for (int data : sorted)
+    {
+      auto *node = XOBNEW (&ob, rootless_test_node);
+      new (node) rootless_test_node ();
+      node->data = data;
+      nodes[data] = node;
+      if (parent)
+	rootless::insert_child (parent, 1, node);
+      parent = node;
+    }
+
+  // Try comparing nodes to make sure that their order matches the data.
+  for (size_t i = 1; i < ARRAY_SIZE (data); ++i)
+    {
+      int data1 = data[i - 1];
+      int data2 = data[i];
+      int comparison = rootless::compare_nodes (nodes[data1], nodes[data2]);
+      if (data1 < data2)
+	ASSERT_TRUE (comparison < 0);
+      else if (data1 > data2)
+	ASSERT_TRUE (comparison > 0);
+      else
+	ASSERT_EQ (comparison, 0);
+    }
+
+  obstack_free (&ob, nullptr);
+}
+}
+#endif // CHECKING_P
diff --git a/gcc/splay-tree-utils.h b/gcc/splay-tree-utils.h
new file mode 100644
index 00000000000..dfb2a4a0478
--- /dev/null
+++ b/gcc/splay-tree-utils.h
@@ -0,0 +1,491 @@
+// Splay tree utilities                                             -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// Implement splay tree node accessors for a class that stores its
+// two child nodes in a member variable of the form:
+//
+//    Node m_children[2];
+template<typename Node>
+class default_splay_tree_accessors
+{
+public:
+  using node_type = Node;
+
+  static auto
+  child (node_type node, unsigned int index)
+    -> decltype (node->m_children[index]) &
+  {
+    return node->m_children[index];
+  }
+};
+
+// Implement splay tree node accessors for a class that stores its
+// two child nodes in a member variable of the form:
+//
+//    Node m_children[2];
+//
+// and also stores its parent node in a member variable of the form:
+//
+//    Node m_parent;
+template<typename Node>
+class default_splay_tree_accessors_with_parent
+  : public default_splay_tree_accessors<Node>
+{
+public:
+  using node_type = Node;
+
+  static auto
+  parent (node_type node) -> decltype (node->m_parent) &
+  {
+    return node->m_parent;
+  }
+};
+
+// Base is a splay tree accessor class for nodes that have no parent field.
+// Base therefore provides a Base::child method but does not provide a
+// Base::parent method.  Extend Base with a dummy routine for setting the
+// parent, which is a no-op when the parent is not stored.
+template<typename Base>
+class splay_tree_accessors_without_parent : public Base
+{
+public:
+  using typename Base::node_type;
+
+  static void set_parent (node_type, node_type) {}
+};
+
+// Base is a splay tree accessor class for nodes that have a parent field.
+// Base therefore provides both Base::child and Base::parent methods.
+// Extend Base with routines for setting the parent.
+template<typename Base>
+class splay_tree_accessors_with_parent : public Base
+{
+public:
+  using typename Base::node_type;
+
+  // Record that NODE's parent is now NEW_PARENT.
+  static void
+  set_parent (node_type node, node_type new_parent)
+  {
+    Base::parent (node) = new_parent;
+  }
+};
+
+// A base class that provides some splay tree operations that are common
+// to both rooted_splay_tree and rootless_splay_tree.
+//
+// Nodes in the splay tree have type Accessors::node_type; this is
+// usually a pointer type.  The Accessors class provides the following
+// static member functions for accessing nodes:
+//
+// - Accessors::child (NODE, INDEX)
+//     INDEX is guaranteed to be 0 or 1.  If INDEX is 0, return a reference
+//     to where NODE's left child is stored, otherwise return a reference
+//     to where NODE's right child is stored.
+//
+// - Accessors::set_parent (NODE, PARENT)
+//     Record that NODE's parent node is now PARENT.
+template<typename Accessors>
+class base_splay_tree : protected Accessors
+{
+public:
+  using typename Accessors::node_type;
+
+  // INDEX is either 0 or 1.  If INDEX is 0, insert CHILD immediately
+  // before NODE, otherwise insert CHILD immediately after NODE.
+  //
+  // Complexity: O(1).
+  static void insert_child (node_type node, unsigned int index,
+			    node_type child);
+
+  // Print NODE and its child nodes to PP for debugging purposes,
+  // using PRINTER (PP, N) to print the data for node N.
+  template<typename Printer>
+  static void print (pretty_printer *pp, node_type node, Printer printer);
+
+protected:
+  using Accessors::set_parent;
+
+  static node_type get_child (node_type, unsigned int);
+  static void set_child (node_type, unsigned int, node_type);
+  static node_type promote_child (node_type, unsigned int);
+  static void promote_child (node_type, unsigned int, node_type);
+
+  template<unsigned int N>
+  static node_type splay_limit (node_type);
+
+  static node_type remove_node_internal (node_type);
+
+  template<typename Printer>
+  static void print (pretty_printer *pp, node_type node, Printer printer,
+		     char, vec<char> &);
+};
+
+// This class provides splay tree routines for cases in which the root
+// of the splay tree is known.  It works with both nodes that store
+// their parent node and nodes that don't.
+//
+// The class is lightweight: it only contains a single root node.
+template<typename Accessors>
+class rooted_splay_tree : public base_splay_tree<Accessors>
+{
+  using parent = base_splay_tree<Accessors>;
+
+public:
+  using typename Accessors::node_type;
+
+protected:
+  // The root of the splay tree, or node_type () if the tree is empty.
+  node_type m_root;
+
+public:
+  rooted_splay_tree () : m_root () {}
+
+  // Construct a tree with the specified root node.
+  rooted_splay_tree (node_type root) : m_root (root) {}
+
+  // Return the root of the tree.
+  node_type root () const { return m_root; }
+
+  // Return true if the tree contains any nodes.
+  explicit operator bool () const { return m_root; }
+
+  // Dereference the root node.
+  node_type operator-> () { return m_root; }
+
+  // Insert NEW_NODE into the splay tree, if no equivalent node already
+  // exists.  For a given node N, COMPARE (N) should return:
+  //
+  // - a negative value if NEW_NODE should come before N
+  // - zero if NEW_NODE and N are the same
+  // - a positive value if NEW_NODE should come after N
+  //
+  // Return true if NEW_NODE was inserted.
+  //
+  // On return, NEW_NODE or its equivalent is the root of the tree.
+  //
+  // Complexity: amortized O(C log N), worst-case O(C N), where C is
+  // the complexity of the comparison.
+  template<typename Comparator>
+  bool insert (node_type new_node, Comparator compare);
+
+  // Insert NEW_NODE into the splay tree, given that NEW_NODE is the
+  // maximum node of the new tree.  On return, NEW_NODE is also the
+  // root of the tree.
+  //
+  // Complexity: O(1).
+  void insert_max_node (node_type new_node);
+
+  // Splice NEXT_TREE onto this one, given that all nodes in NEXT_TREE
+  // are greater than the maximum node in this tree.  NEXT_TREE should
+  // not be used afterwards.
+  //
+  // Complexity: O(1) if the root of the splay tree is already the maximum
+  // node.  Otherwise amortized O(log N), worst-case O(N).
+  void splice_next_tree (rooted_splay_tree next_tree);
+
+  // The root of the tree is currently the maximum node.  Replace it
+  // with NEW_NODE.
+  //
+  // Complexity: O(1).
+  void replace_max_node_at_root (node_type new_node);
+
+  // Remove the root node of the splay tree.
+  //
+  // Complexity: O(1) if removing the maximum or minimum node.
+  // Otherwise amortized O(log N), worst-case O(N).
+  void remove_root ();
+
+  // Split the left child of the current root out into a separate tree
+  // and return the new tree.
+  rooted_splay_tree split_before_root ();
+
+  // Split the right child of the current root out into a separate tree
+  // and return the new tree.
+  rooted_splay_tree split_after_root ();
+
+  // If the root is not the minimum node of the splay tree, bring the previous
+  // node to the root and return true, otherwise return false.
+  //
+  // Complexity: amortized O(log N), worst-case O(N).
+  bool splay_prev_node ();
+
+  // If the root is not the maximum node of the splay tree, bring the next
+  // node to the root and return true, otherwise return false.
+  //
+  // Complexity: amortized O(log N), worst-case O(N).
+  bool splay_next_node ();
+
+  // Bring the minimum node of the splay tree to the root.
+  //
+  // Complexity: amortized O(log N), worst-case O(N).
+  void splay_min_node ();
+
+  // Bring the maximum node of the splay tree to the root.
+  //
+  // Complexity: amortized O(log N), worst-case O(N).
+  void splay_max_node ();
+
+  // Return the minimum node of the splay tree, or node_type () if the
+  // tree is empty.  On return, the minimum node (if any) is also the
+  // root of the tree.
+  //
+  // Complexity: amortized O(log N), worst-case O(N).
+  node_type min_node ();
+
+  // Return the maximum node of the splay tree, or node_type () if the
+  // tree is empty.  On return, the maximum node (if any) is also the
+  // root of the tree.
+  //
+  // Complexity: amortized O(log N), worst-case O(N).
+  node_type max_node ();
+
+  // Search the splay tree.  For a given node N, COMPARE (N) should return:
+  //
+  // - a negative value if N is bigger than the node being searched for
+  // - zero if N is the node being searched for
+  // - a positive value if N is smaller than the node being searched for
+  //
+  // If the node that COMPARE is looking for exists, install it as the root
+  // node of the splay tree.  Otherwise, arbitrarily pick either:
+  //
+  // - the maximum node that is smaller than the node being searched for or
+  // - the minimum node that is bigger than the node being searched for
+  //
+  // and install that node as the root instead.
+  //
+  // Return the result of COMPARE for the new root.
+  //
+  // This form of lookup is intended for cases in which both of the
+  // following are true:
+  //
+  // (a) The work that COMPARE needs to do to detect if a node is too big
+  //     is the same as the work that COMPARE needs to do to detect if a
+  //     node is too small.  (This is not true of range comparisons,
+  //     for example.)
+  //
+  // (b) COMPARE is (or might be) relatively complex.
+  //
+  // This form of lookup is also useful if the items being compared naturally
+  // provide a <=>-style comparison result, without the result having to be
+  // forced by the equivalent of a ?: expression.
+  //
+  // The implementation only invokes COMPARE once per node.
+  //
+  // Complexity: amortized O(C log N), worst-case O(C N), where C is
+  // the complexity of the comparison.
+  template<typename Comparator>
+  auto lookup (Comparator compare) -> decltype (compare (m_root));
+
+  // Search the splay tree.  For a given node N, WANT_SOMETHING_SMALLER (N)
+  // is true if N is too big and WANT_SOMETHING_BIGGER (N) is true if N
+  // is too small.  Both functions return false if N is the node being
+  // searched for.
+  //
+  // If the node that is being searched for exists, install it as the root
+  // node of the splay tree and return 0.  Otherwise, arbitrarily choose
+  // between these two options:
+  //
+  // - Install the maximum node that is smaller than the node being
+  //   searched for as the root of the splay tree and return 1.
+  //
+  // - Install the minimum node that is bigger than the node being
+  //   searched for as the root of the splay tree and return -1.
+  //
+  // This form of lookup is intended for cases in which either of the
+  // following is true:
+  //
+  // (a) WANT_SOMETHING_SMALLER and WANT_SOMETHING_BIGGER test different
+  //     parts of the node's data.  For example, when comparing ranges,
+  //     WANT_SOMETHING_SMALLER would test the lower limit of the given
+  //     node's range while WANT_SOMETHING_BIGGER would test the upper
+  //     limit of the given node's range.
+  //
+  // (b) There is no significant overhead to calling both
+  //     WANT_SOMETHING_SMALLER and WANT_SOMETHING_BIGGER for the same node.
+  //
+  // Complexity: amortized O(C log N), worst-case O(C N), where C is
+  // the complexity of the comparisons.
+  template<typename LeftPredicate, typename RightPredicate>
+  int lookup (LeftPredicate want_something_smaller,
+	      RightPredicate want_something_bigger);
+
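+  // As a usage sketch (assuming a node type with hypothetical "start"
+  // and "end" fields, neither of which is part of this interface),
+  // a range lookup for a position POS might be:
+  //
+  //   int res = tree.lookup ([&](node_type n) { return pos < n->start; },
+  //                          [&](node_type n) { return pos >= n->end; });
+  //
+  // with res == 0 meaning that the root node now contains POS.
+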
+  // Keep the ability to print subtrees.
+  using parent::print;
+
+  // Print the tree to PP for debugging purposes, using PRINTER (PP, N)
+  // to print the data for node N.
+  template<typename Printer>
+  void print (pretty_printer *pp, Printer printer) const;
+
+protected:
+  using parent::get_child;
+  using parent::set_child;
+  using parent::promote_child;
+
+  using parent::set_parent;
+
+  template<unsigned int N>
+  bool splay_neighbor ();
+};
+
+// Provide splay tree routines for nodes of type Accessors::node_type,
+// which doesn't have a parent field.  Use Accessors::child to access
+// the children of a node.
+template<typename Accessors>
+using splay_tree_without_parent
+  = rooted_splay_tree<splay_tree_accessors_without_parent<Accessors>>;
+
+// A splay tree for nodes of type Node, which is usually a pointer type.
+// The child nodes are stored in a member variable:
+//
+//    Node m_children[2];
+//
+// Node does not have a parent field.
+template<typename Node>
+using default_splay_tree
+  = splay_tree_without_parent<default_splay_tree_accessors<Node>>;
+
+// A simple splay tree node that stores a value of type T.
+template<typename T>
+class splay_tree_node
+{
+  friend class default_splay_tree_accessors<splay_tree_node *>;
+
+public:
+  splay_tree_node () = default;
+  splay_tree_node (T value) : m_value (value), m_children () {}
+
+  T &value () { return m_value; }
+  const T &value () const { return m_value; }
+
+private:
+  T m_value;
+  splay_tree_node *m_children[2];
+};
+
+// A splay tree whose nodes hold values of type T.
+template<typename T>
+using splay_tree = default_splay_tree<splay_tree_node<T> *>;
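+
+// A usage sketch for an int tree (purely illustrative; the comparator
+// and the hypothetical "use" function are not part of this interface):
+//
+//   splay_tree<int> tree;
+//   auto *node = new splay_tree_node<int> (42);
+//   // COMPARE (N) is negative if the new node sorts before N.
+//   bool added = tree.insert (node, [](splay_tree_node<int> *n)
+//                                   { return 42 - n->value (); });
+//   // On a successful lookup, the matching node becomes the root.
+//   if (tree.lookup ([](splay_tree_node<int> *n)
+//                    { return 42 - n->value (); }) == 0)
+//     use (tree.root ()->value ());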
+
+// Provide splay tree routines for cases in which the root of the tree
+// is not explicitly stored.
+//
+// The nodes of the tree have type Accessors::node_type, which is usually
+// a pointer type.  The nodes have a link back to their parent.
+//
+// The Accessors class provides the following static member functions:
+//
+// - Accessors::child (NODE, INDEX)
+//     INDEX is guaranteed to be 0 or 1.  If INDEX is 0, return a reference
+//     to where NODE's left child is stored, otherwise return a reference
+//     to where NODE's right child is stored.
+//
+// - Accessors::parent (NODE)
+//     Return a reference to where NODE's parent is stored.
+template<typename Accessors>
+class rootless_splay_tree
+  : public base_splay_tree<splay_tree_accessors_with_parent<Accessors>>
+{
+  using full_accessors = splay_tree_accessors_with_parent<Accessors>;
+  using parent = base_splay_tree<full_accessors>;
+
+public:
+  using rooted = rooted_splay_tree<full_accessors>;
+
+  using typename Accessors::node_type;
+
+  // Remove NODE from the splay tree.  Return the node that replaces it,
+  // or null if NODE had no children.
+  //
+  // Complexity: O(1) if removing the maximum or minimum node.
+  // Otherwise amortized O(log N), worst-case O(N).
+  static node_type remove_node (node_type node);
+
+  // Splay NODE so that it becomes the root of the splay tree.
+  //
+  // Complexity: amortized O(log N), worst-case O(N).
+  static void splay (node_type node);
+
+  // Like splay, but take advantage of the fact that NODE is known to be
+  // the minimum node in the tree.
+  //
+  // Complexity: amortized O(log N), worst-case O(N).
+  static void splay_known_min_node (node_type node);
+
+  // Like splay, but take advantage of the fact that NODE is known to be
+  // the maximum node in the tree.
+  //
+  // Complexity: amortized O(log N), worst-case O(N).
+  static void splay_known_max_node (node_type node);
+
+  // Splay NODE while looking for an ancestor node N for which
+  // PREDICATE (N, INDEX) is true, where INDEX is 1 if NODE comes after N
+  // and 0 if NODE comes before N.  If such an ancestor node exists,
+  // stop the splay operation early and return that predicate result.
+  // Otherwise, complete the splay operation and return DEFAULT_RESULT.
+  // In the latter case, NODE is now the root of the splay tree.
+  //
+  // Note that this routine only examines nodes that happen to be ancestors
+  // of NODE.  It does not search the full tree.
+  //
+  // Complexity: amortized O(P log N), worst-case O(P N), where P is the
+  // complexity of the predicate.
+  template<typename DefaultResult, typename Predicate>
+  static auto splay_and_search (node_type node, DefaultResult default_result,
+				Predicate predicate)
+    -> decltype (predicate (node, 0));
+
+  // NODE1 and NODE2 are known to belong to the same splay tree.  Return:
+  //
+  // -1 if NODE1 < NODE2
+  // 0 if NODE1 == NODE2
+  // 1 if NODE1 > NODE2
+  //
+  // Complexity: amortized O(log N), worst-case O(N).
+  static int compare_nodes (node_type node1, node_type node2);
+
+protected:
+  using parent::get_child;
+  using parent::set_child;
+  using parent::promote_child;
+
+  static node_type get_parent (node_type);
+  using parent::set_parent;
+
+  static unsigned int child_index (node_type, node_type);
+
+  static int compare_nodes_one_way (node_type, node_type);
+
+  template<unsigned int N>
+  static void splay_known_limit (node_type);
+};
+
+// Provide rootless splay tree routines for nodes of type Node.
+// The child nodes are stored in a member variable:
+//
+//    Node m_children[2];
+//
+// and the parent node is stored in a member variable:
+//
+//    Node m_parent;
+template<typename Node>
+using default_rootless_splay_tree
+  = rootless_splay_tree<default_splay_tree_accessors_with_parent<Node>>;
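+
+// For example, a minimal node type for the alias above might look like
+// this (a sketch; no such type is provided here):
+//
+//   struct my_node
+//   {
+//     my_node *m_children[2];
+//     my_node *m_parent;
+//     int value;
+//   };
+//
+//   using my_tree = default_rootless_splay_tree<my_node *>;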
+
+#include "splay-tree-utils.tcc"
diff --git a/gcc/splay-tree-utils.tcc b/gcc/splay-tree-utils.tcc
new file mode 100644
index 00000000000..5c9c859d0bd
--- /dev/null
+++ b/gcc/splay-tree-utils.tcc
@@ -0,0 +1,960 @@
+// Splay tree utilities                                             -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// INDEX is either 0 or 1.  If it is 0, return NODE's left child,
+// otherwise return NODE's right child.
+template<typename Accessors>
+inline typename base_splay_tree<Accessors>::node_type
+base_splay_tree<Accessors>::get_child (node_type node, unsigned int index)
+{
+  return Accessors::child (node, index);
+}
+
+// INDEX is either 0 or 1.  If it is 0, change NODE's left child to CHILD,
+// otherwise change NODE's right child to CHILD.  If CHILD has a parent
+// field, record that its parent is now NODE.
+template<typename Accessors>
+inline void
+base_splay_tree<Accessors>::set_child (node_type node, unsigned int index,
+				       node_type child)
+{
+  Accessors::child (node, index) = child;
+  if (child)
+    set_parent (child, node);
+}
+
+// Rotate the tree to promote child number INDEX of NODE, so that that
+// child becomes the parent of NODE.  Return the promoted node.
+//
+// The caller has the responsibility of assigning a correct parent
+// to the returned node.
+template<typename Accessors>
+inline typename base_splay_tree<Accessors>::node_type
+base_splay_tree<Accessors>::promote_child (node_type node, unsigned int index)
+{
+  node_type promoted = get_child (node, index);
+  set_child (node, index, get_child (promoted, 1 - index));
+  set_child (promoted, 1 - index, node);
+  return promoted;
+}
+
+// Treat child number INDEX of NODE as being CHILD and rotate the tree
+// so that CHILD becomes the parent of NODE.
+//
+// The caller has the responsibility of assigning a correct parent to CHILD.
+template<typename Accessors>
+inline void
+base_splay_tree<Accessors>::promote_child (node_type node, unsigned int index,
+					   node_type child)
+{
+  set_child (node, index, get_child (child, 1 - index));
+  set_child (child, 1 - index, node);
+}
+
+// Print NODE to PP, using PRINTER (PP, N) to print the contents of node N.
+// Prefix each new line with INDENT_STRING.  CODE is 'T' if NODE is the root
+// node, 'L' if NODE is the left child of its parent, or 'R' if NODE is the
+// right child of its parent.
+template<typename Accessors>
+template<typename Printer>
+void
+base_splay_tree<Accessors>::print (pretty_printer *pp, node_type node,
+				   Printer printer, char code,
+				   vec<char> &indent_string)
+{
+  // In the comments below, PREFIX refers to the incoming contents
+  // of INDENT_STRING.
+  node_type left = get_child (node, 0);
+  node_type right = get_child (node, 1);
+
+  auto orig_indent_len = indent_string.length ();
+  indent_string.safe_grow (orig_indent_len + 3);
+  char *extra_indent = indent_string.address () + orig_indent_len;
+
+  // Print [T], [L], or [R].
+  extra_indent[0] = '[';
+  extra_indent[1] = code;
+  extra_indent[2] = ']';
+  pp_append_text (pp, extra_indent, indent_string.end ());
+  pp_space (pp);
+
+  // Print the node itself, using PREFIX + " | " or PREFIX + "   " to indent
+  // new lines under the "[_]" that we just printed.
+  extra_indent[0] = ' ';
+  extra_indent[1] = (left || right ? '|' : ' ');
+  extra_indent[2] = ' ';
+  {
+    pretty_printer sub_pp;
+    printer (&sub_pp, node);
+    const char *text = pp_formatted_text (&sub_pp);
+    while (const char *end = strchr (text, '\n'))
+      {
+	pp_append_text (pp, text, end);
+	pp_newline_and_indent (pp, 0);
+	pp_append_text (pp, indent_string.begin (), indent_string.end ());
+	text = end + 1;
+      }
+    pp_string (pp, text);
+  }
+
+  if (left)
+    {
+      // Print PREFIX + " +-" for the first line of the left subtree,
+      // to be followed by "[L]".
+      extra_indent[1] = '+';
+      extra_indent[2] = '-';
+      pp_newline_and_indent (pp, 0);
+      pp_append_text (pp, indent_string.begin (), indent_string.end ());
+
+      // Print the left subtree, using PREFIX + " | " or PREFIX + "   "
+      // to indent under the PREFIX + " +-" that we just printed.
+      extra_indent[1] = right ? '|' : ' ';
+      extra_indent[2] = ' ';
+      print (pp, left, printer, 'L', indent_string);
+      extra_indent = indent_string.address () + orig_indent_len;
+
+      // If LEFT is not a leaf and we also have a right subtree, use a
+      // PREFIX + " |" line to separate them.
+      if (right && (get_child (left, 0) || get_child (left, 1)))
+	{
+	  pp_newline_and_indent (pp, 0);
+	  pp_append_text (pp, indent_string.begin (), &extra_indent[2]);
+	}
+    }
+  if (right)
+    {
+      // Print PREFIX + " +-" for the first line of the right subtree,
+      // to be followed by "[R]".
+      extra_indent[1] = '+';
+      extra_indent[2] = '-';
+      pp_newline_and_indent (pp, 0);
+      pp_append_text (pp, indent_string.begin (), indent_string.end ());
+
+      // Print the right subtree, using PREFIX + "   " to indent under the
+      // PREFIX + " +-" that we just printed.
+      extra_indent[1] = ' ';
+      extra_indent[2] = ' ';
+      print (pp, right, printer, 'R', indent_string);
+    }
+  indent_string.truncate (orig_indent_len);
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+template<typename Printer>
+void
+base_splay_tree<Accessors>::print (pretty_printer *pp, node_type node,
+				   Printer printer)
+{
+  if (!node)
+    {
+      pp_string (pp, "null");
+      return;
+    }
+  auto_vec<char, 64> indent_string;
+  print (pp, node, printer, 'T', indent_string);
+}
+
+// If N is 1, splay the last (rightmost) node reachable from START
+// to the position that START currently holds and return the splayed node.
+// START is not itself the last node.
+//
+// If N is 0, splay the first (leftmost) node reachable from START
+// to the position that START currently holds and return the splayed node.
+// START is not itself the first node.
+//
+// The caller has the responsibility of updating the parent of the
+// returned node.
+template<typename Accessors>
+template<unsigned int N>
+typename base_splay_tree<Accessors>::node_type
+base_splay_tree<Accessors>::splay_limit (node_type start)
+{
+  // This essentially follows the simplified top-down method described
+  // in Sleator and Tarjan's "Self-adjusting Binary Search Trees", but
+  // specialized for the case in which the comparison result is fixed.
+  // The first iteration is peeled to avoid the need for stack temporaries.
+  //
+  // The comments and names reflect the behavior for N == 1, but the
+  // N == 0 case behaves analogously.
+
+  // Rotate the tree to promote the right child of START to the root.
+  node_type node = promote_child (start, N);
+  if (node_type right = get_child (node, N))
+    {
+      // Perform the link left step, which for this first iteration
+      // means making NODE the root of the left tree.
+      //
+      // NODE will become left child of the final node.  For a right
+      // spine starting at NODE of the form:
+      //
+      //  1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> ... -> N
+      //  |    |    |    |    |    |    |           |
+      //  V    V    V    V    V    V    V           V
+      //  A    B    C    D    E    F    G           NL
+      //
+      // the next step is to create a subtree of N whose right spine contains
+      // the odd-numbered nodes, as follows:
+      //
+      //  N
+      //  |
+      //  V
+      //  1 ------> 3 ------> 5 ------> 7 -> .... -> NL
+      //  |         |         |         |
+      //  V         V         V         V
+      //  A         2 -> C    4 -> E    6 -> G
+      //            |         |         |
+      //            V         V         V
+      //            B         D         F
+      //
+      // First record 1 as the left child of the final root (N) and move
+      // on to node 2.
+      node_type final_child = node;
+      node_type new_spine_end = node;
+      node = right;
+      while (node_type right = get_child (node, N))
+	{
+	  // Perform another rotate left step.
+	  //
+	  // We've built the tree rooted at 1 in the diagram above up to,
+	  // but not including, an even-numbered node NODE on the original
+	  // right spine.  Rotate the tree at NODE to promote the following
+	  // odd-numbered node.
+	  promote_child (node, N, right);
+	  node = right;
+	  if (node_type right = get_child (node, N))
+	    {
+	      // Perform another link left step.
+	      //
+	      // Add the promoted odd-numbered node to the right spine of the
+	      // tree rooted at 1 and move on to the next even-numbered node.
+	      set_child (new_spine_end, N, node);
+	      new_spine_end = node;
+	      node = right;
+	    }
+	}
+      // Perform the assembly step.
+      //
+      // Add NL to the new spine and make N the new root.
+      set_child (new_spine_end, N, get_child (node, 1 - N));
+      set_child (node, 1 - N, final_child);
+    }
+  return node;
+}
+
+// Remove NODE from its position in the splay tree.  If NODE has at least
+// one child node, return the node that should now hold NODE's position in
+// the splay tree.  If NODE has no children, return null.
+//
+// The caller has the responsibility of updating the parent of the
+// returned node.
+template<typename Accessors>
+inline typename base_splay_tree<Accessors>::node_type
+base_splay_tree<Accessors>::remove_node_internal (node_type node)
+{
+  node_type left = get_child (node, 0);
+  node_type right = get_child (node, 1);
+  if (!left)
+    return right;
+
+  if (!right)
+    return left;
+
+  if (get_child (left, 1))
+    {
+      left = splay_limit<1> (left);
+      gcc_checking_assert (!get_child (left, 1));
+    }
+  set_child (left, 1, right);
+  return left;
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+inline void
+base_splay_tree<Accessors>::insert_child (node_type node, unsigned int index,
+					  node_type child)
+{
+  gcc_checking_assert (!get_child (child, 0) && !get_child (child, 1));
+  set_child (child, index, get_child (node, index));
+  set_child (node, index, child);
+}
+
+// Implement splay_next_node if N == 1 and splay_prev_node if N == 0.
+template<typename Accessors>
+template<unsigned int N>
+bool
+rooted_splay_tree<Accessors>::splay_neighbor ()
+{
+  node_type node = m_root;
+  node_type new_root = get_child (node, N);
+  if (!new_root)
+    return false;
+
+  if (get_child (new_root, 1 - N))
+    {
+      // NEW_ROOT is not itself the required node, so splay the required
+      // node into its place.
+      new_root = parent::template splay_limit<1 - N> (new_root);
+      gcc_checking_assert (!get_child (new_root, 1 - N));
+      set_child (node, N, node_type ());
+      set_child (new_root, 1 - N, node);
+    }
+  else
+    promote_child (node, N, new_root);
+  set_parent (new_root, node_type ());
+  m_root = new_root;
+  return true;
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+template<typename Comparator>
+bool
+rooted_splay_tree<Accessors>::insert (node_type new_node, Comparator compare)
+{
+  gcc_checking_assert (!get_child (new_node, 0) && !get_child (new_node, 1));
+  if (!m_root)
+    {
+      m_root = new_node;
+      return true;
+    }
+
+  int comparison = lookup (compare);
+  if (comparison == 0)
+    return false;
+
+  // Insert NEW_NODE before M_ROOT if COMPARISON < 0 and after M_ROOT
+  // otherwise.
+  set_child (new_node, comparison < 0, m_root);
+  set_child (new_node, comparison > 0, get_child (m_root, comparison > 0));
+  set_child (m_root, comparison > 0, nullptr);
+  m_root = new_node;
+  return true;
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+inline void
+rooted_splay_tree<Accessors>::insert_max_node (node_type new_node)
+{
+  gcc_checking_assert (!get_child (new_node, 0) && !get_child (new_node, 1));
+  set_child (new_node, 0, m_root);
+  m_root = new_node;
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+inline void
+rooted_splay_tree<Accessors>::splice_next_tree (rooted_splay_tree next_tree)
+{
+  splay_max_node ();
+  set_child (m_root, 1, next_tree.m_root);
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+inline void
+rooted_splay_tree<Accessors>::replace_max_node_at_root (node_type new_node)
+{
+  node_type old_node = m_root;
+  gcc_checking_assert (!get_child (new_node, 0)
+		       && !get_child (new_node, 1)
+		       && !get_child (old_node, 1));
+  set_child (new_node, 0, get_child (old_node, 0));
+  // Clear the links from OLD_NODE.  Its parent and right child are
+  // already node_type ().
+  set_child (old_node, 0, node_type ());
+  m_root = new_node;
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+inline void
+rooted_splay_tree<Accessors>::remove_root ()
+{
+  node_type node = m_root;
+  m_root = parent::remove_node_internal (node);
+  if (m_root)
+    set_parent (m_root, node_type ());
+  // Clear the links from NODE.  Its parent is already node_type ().
+  set_child (node, 0, node_type ());
+  set_child (node, 1, node_type ());
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+inline rooted_splay_tree<Accessors>
+rooted_splay_tree<Accessors>::split_before_root ()
+{
+  node_type new_root = get_child (m_root, 0);
+  set_child (m_root, 0, node_type ());
+  set_parent (new_root, node_type ());
+  return new_root;
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+inline rooted_splay_tree<Accessors>
+rooted_splay_tree<Accessors>::split_after_root ()
+{
+  node_type new_root = get_child (m_root, 1);
+  set_child (m_root, 1, node_type ());
+  set_parent (new_root, node_type ());
+  return new_root;
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+inline bool
+rooted_splay_tree<Accessors>::splay_prev_node ()
+{
+  return splay_neighbor<0> ();
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+inline bool
+rooted_splay_tree<Accessors>::splay_next_node ()
+{
+  return splay_neighbor<1> ();
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+inline void
+rooted_splay_tree<Accessors>::splay_min_node ()
+{
+  if (m_root && get_child (m_root, 0))
+    {
+      m_root = parent::template splay_limit<0> (m_root);
+      set_parent (m_root, node_type ());
+    }
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+inline void
+rooted_splay_tree<Accessors>::splay_max_node ()
+{
+  if (m_root && get_child (m_root, 1))
+    {
+      m_root = parent::template splay_limit<1> (m_root);
+      set_parent (m_root, node_type ());
+    }
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+inline typename rooted_splay_tree<Accessors>::node_type
+rooted_splay_tree<Accessors>::min_node ()
+{
+  splay_min_node ();
+  return m_root;
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+inline typename rooted_splay_tree<Accessors>::node_type
+rooted_splay_tree<Accessors>::max_node ()
+{
+  splay_max_node ();
+  return m_root;
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+template<typename Comparator>
+auto
+rooted_splay_tree<Accessors>::lookup (Comparator compare)
+  -> decltype (compare (m_root))
+{
+  // This essentially follows the simplified top-down method described
+  // in Sleator and Tarjan's "Self-adjusting Binary Search Trees", but
+  // with the complication that COMPARE is invoked only once per node.
+  using result_type = decltype (compare (m_root));
+
+  // The roots of the left and right trees.
+  node_type link_left_root = node_type ();
+  node_type link_right_root = node_type ();
+
+  // Where to add new nodes to the left and right trees.
+  node_type *link_left_ptr = &link_left_root;
+  node_type *link_right_ptr = &link_right_root;
+
+  // The nodes that contain *LINK_LEFT_PTR and *LINK_RIGHT_PTR,
+  // once they no longer point to the roots above.
+  node_type link_left_parent = node_type ();
+  node_type link_right_parent = node_type ();
+
+  auto link_left = [&](node_type node)
+    {
+      *link_left_ptr = node;
+      link_left_ptr = &Accessors::child (node, 1);
+      set_parent (node, link_left_parent);
+      link_left_parent = node;
+    };
+
+  auto link_right = [&](node_type node)
+    {
+      *link_right_ptr = node;
+      link_right_ptr = &Accessors::child (node, 0);
+      set_parent (node, link_right_parent);
+      link_right_parent = node;
+    };
+
+  node_type node = m_root;
+  node_type parent = node_type ();
+  result_type result;
+  result_type old_result = 0;
+  while (1)
+    {
+      // OLD_RESULT is 0 if NODE is the root of the middle tree.
+      // Otherwise, PARENT is the root of the middle tree and OLD_RESULT
+      // is how it compared.
+      //
+      // Results are:
+      // < 0 if we want something smaller.
+      // = 0 if we found the right node.
+      // > 0 if we want something bigger.
+      result = compare (node);
+      if (old_result < 0)
+	{
+	  if (result < 0)
+	    {
+	      // SEARCH < NODE < PARENT
+	      //
+	      // Promote NODE (rotate right).
+	      promote_child (parent, 0, node);
+	      node_type next = get_child (node, 0);
+	      if (!next)
+		break;
+
+	      link_right (node);
+
+	      // NEXT is now the root of the middle tree.
+	      node = next;
+	      old_result = 0;
+	      continue;
+	    }
+
+	  // SEARCH >= NODE, NODE < PARENT
+	  link_right (parent);
+	}
+      else if (old_result > 0)
+	{
+	  if (result > 0)
+	    {
+	      // SEARCH > NODE > PARENT
+	      //
+	      // Promote NODE (rotate left).
+	      promote_child (parent, 1, node);
+	      node_type next = get_child (node, 1);
+	      if (!next)
+		break;
+
+	      link_left (node);
+
+	      // NEXT is now the root of the middle tree.
+	      node = next;
+	      old_result = 0;
+	      continue;
+	    }
+
+	  // SEARCH <= NODE, NODE > PARENT
+	  link_left (parent);
+	}
+
+      // Micro-optimization to allow NODE's child to be read even if
+      // RESULT == 0.
+      node_type next = get_child (node, result >= 0);
+      if (result == 0 || !next)
+	break;
+
+      // NODE is now the root of the tree.
+      parent = node;
+      node = next;
+      old_result = result;
+    }
+
+  node_type new_left = link_left_root;
+  node_type new_right = link_right_root;
+
+  if (new_left)
+    {
+      node_type old_left = get_child (node, 0);
+      *link_left_ptr = old_left;
+      if (old_left)
+	set_parent (old_left, link_left_parent);
+      set_child (node, 0, new_left);
+    }
+
+  if (new_right)
+    {
+      node_type old_right = get_child (node, 1);
+      *link_right_ptr = old_right;
+      if (old_right)
+	set_parent (old_right, link_right_parent);
+      set_child (node, 1, new_right);
+    }
+
+  set_parent (node, node_type ());
+  m_root = node;
+  return result;
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+template<typename LeftPredicate, typename RightPredicate>
+int
+rooted_splay_tree<Accessors>::lookup (LeftPredicate want_something_smaller,
+				      RightPredicate want_something_bigger)
+{
+  // This essentially follows the simplified top-down method described
+  // in Sleator and Tarjan's "Self-adjusting Binary Search Trees"
+  // (and follows it more closely than the single-comparator version above).
+
+  // The roots of the left and right trees.
+  node_type link_left_root = node_type ();
+  node_type link_right_root = node_type ();
+
+  // Where to add new nodes to the left and right trees.
+  node_type *link_left_ptr = &link_left_root;
+  node_type *link_right_ptr = &link_right_root;
+
+  // The nodes that contain *LINK_LEFT_PTR and *LINK_RIGHT_PTR,
+  // once they no longer point to the roots above.
+  node_type link_left_parent = node_type ();
+  node_type link_right_parent = node_type ();
+
+  node_type node = m_root;
+  int result;
+  for (;;)
+    {
+      // NODE is the root of the middle tree.
+      if (want_something_smaller (node))
+	{
+	  result = -1;
+	  node_type next = get_child (node, 0);
+	  if (!next)
+	    break;
+
+	  if (want_something_smaller (next))
+	    {
+	      // Promote NODE (rotate right).
+	      promote_child (node, 0, next);
+	      node = next;
+	      next = get_child (node, 0);
+	      if (!next)
+		break;
+	    }
+
+	  // Add NODE to the right tree (link right).
+	  *link_right_ptr = node;
+	  link_right_ptr = &Accessors::child (node, 0);
+	  set_parent (node, link_right_parent);
+	  link_right_parent = node;
+
+	  node = next;
+	}
+      else if (want_something_bigger (node))
+	{
+	  result = 1;
+	  node_type next = get_child (node, 1);
+	  if (!next)
+	    break;
+
+	  if (want_something_bigger (next))
+	    {
+	      // Promote NODE (rotate left).
+	      promote_child (node, 1, next);
+	      node = next;
+	      next = get_child (node, 1);
+	      if (!next)
+		break;
+	    }
+
+	  // Add NODE to the left tree (link left).
+	  *link_left_ptr = node;
+	  link_left_ptr = &Accessors::child (node, 1);
+	  set_parent (node, link_left_parent);
+	  link_left_parent = node;
+
+	  node = next;
+	}
+      else
+	{
+	  result = 0;
+	  break;
+	}
+    }
+
+  node_type new_left = link_left_root;
+  node_type new_right = link_right_root;
+
+  if (new_left)
+    {
+      node_type old_left = get_child (node, 0);
+      *link_left_ptr = old_left;
+      if (old_left)
+	set_parent (old_left, link_left_parent);
+      set_child (node, 0, new_left);
+    }
+
+  if (new_right)
+    {
+      node_type old_right = get_child (node, 1);
+      *link_right_ptr = old_right;
+      if (old_right)
+	set_parent (old_right, link_right_parent);
+      set_child (node, 1, new_right);
+    }
+
+  set_parent (node, node_type ());
+  m_root = node;
+  return result;
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+template<typename Printer>
+inline void
+rooted_splay_tree<Accessors>::print (pretty_printer *pp, Printer printer) const
+{
+  print (pp, m_root, printer);
+}
+
+// Return NODE's current parent.
+template<typename Accessors>
+inline typename rootless_splay_tree<Accessors>::node_type
+rootless_splay_tree<Accessors>::get_parent (node_type node)
+{
+  return Accessors::parent (node);
+}
+
+// CHILD is known to be a child of PARENT.  Return which index it has.
+template<typename Accessors>
+inline unsigned int
+rootless_splay_tree<Accessors>::child_index (node_type parent, node_type child)
+{
+  return get_child (parent, 1) == child;
+}
+
+// If N == 1, implement splay_known_max_node, otherwise implement
+// splay_known_min_node.
+template<typename Accessors>
+template<unsigned int N>
+inline void
+rootless_splay_tree<Accessors>::splay_known_limit (node_type node)
+{
+  node_type child = node;
+  node_type parent = get_parent (child);
+  if (!parent)
+    return;
+
+  do
+    // At this point, NODE conceptually replaces CHILD as a child of
+    // PARENT, but we haven't yet updated PARENT accordingly.
+    if (node_type grandparent = get_parent (parent))
+      {
+	node_type greatgrandparent = get_parent (grandparent);
+	promote_child (grandparent, N, parent);
+	promote_child (parent, N, node);
+	child = grandparent;
+	parent = greatgrandparent;
+      }
+    else
+      {
+	promote_child (parent, N, node);
+	break;
+      }
+  while (parent);
+  set_parent (node, node_type ());
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+typename rootless_splay_tree<Accessors>::node_type
+rootless_splay_tree<Accessors>::remove_node (node_type node)
+{
+  node_type replacement = parent::remove_node_internal (node);
+  if (node_type parent = get_parent (node))
+    set_child (parent, child_index (parent, node), replacement);
+  else if (replacement)
+    set_parent (replacement, node_type ());
+  // Clear the links from NODE.
+  set_parent (node, node_type ());
+  set_child (node, 0, node_type ());
+  set_child (node, 1, node_type ());
+  return replacement;
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+void
+rootless_splay_tree<Accessors>::splay (node_type node)
+{
+  node_type child = node;
+  node_type parent = get_parent (child);
+  if (!parent)
+    return;
+
+  do
+    {
+      // At this point, NODE conceptually replaces CHILD as a child of
+      // PARENT, but we haven't yet updated PARENT accordingly.
+      unsigned int index = child_index (parent, child);
+      if (node_type grandparent = get_parent (parent))
+	{
+	  node_type greatgrandparent = get_parent (grandparent);
+	  unsigned int parent_index = child_index (grandparent, parent);
+	  if (index == parent_index)
+	    {
+	      promote_child (grandparent, parent_index, parent);
+	      promote_child (parent, index, node);
+	    }
+	  else
+	    {
+	      promote_child (parent, index, node);
+	      promote_child (grandparent, parent_index, node);
+	    }
+	  child = grandparent;
+	  parent = greatgrandparent;
+	}
+      else
+	{
+	  promote_child (parent, index, node);
+	  break;
+	}
+    }
+  while (parent);
+  set_parent (node, node_type ());
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+inline void
+rootless_splay_tree<Accessors>::splay_known_min_node (node_type node)
+{
+  splay_known_limit<0> (node);
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+inline void
+rootless_splay_tree<Accessors>::splay_known_max_node (node_type node)
+{
+  splay_known_limit<1> (node);
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+template<typename DefaultResult, typename Predicate>
+auto
+rootless_splay_tree<Accessors>::
+splay_and_search (node_type node, DefaultResult default_result,
+		  Predicate predicate)
+  -> decltype (predicate (node, 0))
+{
+  using Result = decltype (predicate (node, 0));
+
+  node_type child = node;
+  node_type parent = get_parent (child);
+  if (!parent)
+    return default_result;
+
+  do
+    {
+      // At this point, NODE conceptually replaces CHILD as a child of
+      // PARENT, but we haven't yet updated PARENT accordingly.
+      unsigned int index = child_index (parent, child);
+      if (Result result = predicate (parent, index))
+	{
+	  set_child (parent, index, node);
+	  return result;
+	}
+      if (node_type grandparent = get_parent (parent))
+	{
+	  node_type greatgrandparent = get_parent (grandparent);
+	  unsigned int parent_index = child_index (grandparent, parent);
+	  if (Result result = predicate (grandparent, parent_index))
+	    {
+	      set_child (parent, index, node);
+	      return result;
+	    }
+	  if (index == parent_index)
+	    {
+	      promote_child (grandparent, parent_index, parent);
+	      promote_child (parent, index, node);
+	    }
+	  else
+	    {
+	      promote_child (parent, index, node);
+	      promote_child (grandparent, parent_index, node);
+	    }
+	  child = grandparent;
+	  parent = greatgrandparent;
+	}
+      else
+	{
+	  promote_child (parent, index, node);
+	  break;
+	}
+    }
+  while (parent);
+  set_parent (node, node_type ());
+  return default_result;
+}
+
+// Splay NODE1 looking to see if one of its ancestors is NODE2.  If it is,
+// return -1 if NODE1 comes before NODE2 or 1 if NODE1 comes after NODE2.
+// Return 0 if NODE2 is not an ancestor of NODE1.
+template<typename Accessors>
+int
+rootless_splay_tree<Accessors>::compare_nodes_one_way (node_type node1,
+						       node_type node2)
+{
+  auto compare = [&](node_type parent, unsigned int index) -> int
+    {
+      if (parent == node2)
+	return index ? 1 : -1;
+      return 0;
+    };
+  return splay_and_search (node1, 0, compare);
+}
+
+// See the comment above the declaration.
+template<typename Accessors>
+int
+rootless_splay_tree<Accessors>::compare_nodes (node_type node1,
+					       node_type node2)
+{
+  if (node1 == node2)
+    return 0;
+
+  // Splay NODE1 looking for NODE2.
+  int cmp = compare_nodes_one_way (node1, node2);
+  if (cmp)
+    return cmp;
+
+  // That failed, but NODE1 is now the root of the tree.  Splay NODE2
+  // to see on which side of NODE1 it falls.
+  cmp = compare_nodes_one_way (node2, node1);
+  gcc_checking_assert (cmp);
+  return -cmp;
+}
diff --git a/gcc/system.h b/gcc/system.h
index b0f3f1dd019..d4e29e53ee3 100644
--- a/gcc/system.h
+++ b/gcc/system.h
@@ -232,6 +232,9 @@ extern int errno;
 #ifdef INCLUDE_VECTOR
 # include <vector>
 #endif
+#ifdef INCLUDE_ARRAY
+# include <array>
+#endif
 # include <cstring>
 # include <new>
 # include <utility>
-- 
2.17.1


^ permalink raw reply	[flat|nested] 88+ messages in thread

* [09/23] Add a cut-down version of std::span (array_slice)
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (7 preceding siblings ...)
  2020-11-13  8:15 ` [08/23] Add an alternative splay tree implementation Richard Sandiford
@ 2020-11-13  8:15 ` Richard Sandiford
  2020-11-30 19:56   ` Jeff Law
                     ` (2 more replies)
  2020-11-13  8:16 ` [10/23] Tweak the way that is_a is implemented Richard Sandiford
                   ` (14 subsequent siblings)
  23 siblings, 3 replies; 88+ messages in thread
From: Richard Sandiford @ 2020-11-13  8:15 UTC (permalink / raw)
  To: gcc-patches

A later patch wants to be able to pass around subarray views of an
existing array.  The standard class to do that is std::span, but it's
a C++20 thing.  This patch just adds a cut-down version of it.

The intention is just to provide what's currently needed.
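
As a quick illustration of the intended use (the function and variable
names here are made up for the example):

  static int
  sum (array_slice<const int> values)
  {
    int total = 0;
    for (int value : values)
      total += value;
    return total;
  }

  int buffer[] = { 1, 2, 3, 4 };
  // View elements [1, 3) of BUFFER without copying them.
  int partial = sum (make_array_slice (&buffer[1], 2));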

gcc/
	* vec.h (array_slice): New class.
---
 gcc/vec.h | 120 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 120 insertions(+)

diff --git a/gcc/vec.h b/gcc/vec.h
index f02beddc975..7768de9f518 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -2128,6 +2128,126 @@ release_vec_vec (vec<vec<T> > &vec)
   vec.release ();
 }
 
+// Provide a subset of the std::span functionality.  (We can't use std::span
+// itself because it's a C++20 feature.)
+//
+// In addition, provide an invalid value that is distinct from all valid
+// sequences (including the empty sequence).  This can be used to return
+// failure without having to use std::optional.
+//
+// There is no operator bool because it would be ambiguous whether it is
+// testing for a valid value or an empty sequence.
+template<typename T>
+class array_slice
+{
+  template<typename OtherT> friend class array_slice;
+
+public:
+  using value_type = T;
+  using iterator = T *;
+  using const_iterator = const T *;
+
+  array_slice () : m_base (nullptr), m_size (0) {}
+
+  template<typename OtherT>
+  array_slice (array_slice<OtherT> other)
+    : m_base (other.m_base), m_size (other.m_size) {}
+
+  array_slice (iterator base, unsigned int size)
+    : m_base (base), m_size (size) {}
+
+  template<size_t N>
+  array_slice (T (&array)[N]) : m_base (array), m_size (N) {}
+
+  template<typename OtherT>
+  array_slice (const vec<OtherT> &v)
+    : m_base (v.address ()), m_size (v.length ()) {}
+
+  iterator begin () { return m_base; }
+  iterator end () { return m_base + m_size; }
+
+  const_iterator begin () const { return m_base; }
+  const_iterator end () const { return m_base + m_size; }
+
+  value_type &front ();
+  value_type &back ();
+  value_type &operator[] (unsigned int i);
+
+  const value_type &front () const;
+  const value_type &back () const;
+  const value_type &operator[] (unsigned int i) const;
+
+  size_t size () const { return m_size; }
+  size_t size_bytes () const { return m_size * sizeof (T); }
+  bool empty () const { return m_size == 0; }
+
+  // An invalid array_slice that represents a failed operation.  This is
+  // distinct from an empty slice, which is a valid result in some contexts.
+  static array_slice invalid () { return { nullptr, ~0U }; }
+
+  // True if the array is valid, false if it is an array like INVALID.
+  bool is_valid () const { return m_base || m_size == 0; }
+
+private:
+  iterator m_base;
+  unsigned int m_size;
+};
+
+template<typename T>
+inline typename array_slice<T>::value_type &
+array_slice<T>::front ()
+{
+  gcc_checking_assert (m_size);
+  return m_base[0];
+}
+
+template<typename T>
+inline const typename array_slice<T>::value_type &
+array_slice<T>::front () const
+{
+  gcc_checking_assert (m_size);
+  return m_base[0];
+}
+
+template<typename T>
+inline typename array_slice<T>::value_type &
+array_slice<T>::back ()
+{
+  gcc_checking_assert (m_size);
+  return m_base[m_size - 1];
+}
+
+template<typename T>
+inline const typename array_slice<T>::value_type &
+array_slice<T>::back () const
+{
+  gcc_checking_assert (m_size);
+  return m_base[m_size - 1];
+}
+
+template<typename T>
+inline typename array_slice<T>::value_type &
+array_slice<T>::operator[] (unsigned int i)
+{
+  gcc_checking_assert (i < m_size);
+  return m_base[i];
+}
+
+template<typename T>
+inline const typename array_slice<T>::value_type &
+array_slice<T>::operator[] (unsigned int i) const
+{
+  gcc_checking_assert (i < m_size);
+  return m_base[i];
+}
+
+template<typename T>
+array_slice<T>
+make_array_slice (T *base, unsigned int size)
+{
+  return array_slice<T> (base, size);
+}
+
 #if (GCC_VERSION >= 3000)
 # pragma GCC poison m_vec m_vecpfx m_vecdata
 #endif
-- 
2.17.1


^ permalink raw reply	[flat|nested] 88+ messages in thread

* [10/23] Tweak the way that is_a is implemented
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (8 preceding siblings ...)
  2020-11-13  8:15 ` [09/23] Add a cut-down version of std::span (array_slice) Richard Sandiford
@ 2020-11-13  8:16 ` Richard Sandiford
  2020-12-02  5:15   ` Jeff Law
  2020-11-13  8:16 ` [11/23] Split update_cfg_for_uncondjump out of combine Richard Sandiford
                   ` (13 subsequent siblings)
  23 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-13  8:16 UTC (permalink / raw)
  To: gcc-patches

At the moment, class hierarchies that use is_a are expected
to define specialisations like:

  template <>
  template <>
  inline bool
  is_a_helper <cgraph_node *>::test (symtab_node *p)
  {
    return p->type == SYMTAB_FUNCTION;
  }

But this doesn't scale well to larger hierarchies, because it only
defines ::test for an argument that is exactly “symtab_node *”
(and not for example “const symtab_node *” or something that
comes between cgraph_node and symtab_node in the hierarchy).

For example:

  struct A { int x; };
  struct B : A {};
  struct C : B {};

  template <>
  template <>
  inline bool
  is_a_helper <C *>::test (A *a)
  {
    return a->x == 1;
  }

  bool f(B *b) { return is_a<C *> (b); }

gives:

  warning: inline function ‘static bool is_a_helper<T>::test(U*) [with U = B; T = C*]’ used but never defined

and:

  bool f(const A *a) { return is_a<const C *> (a); }

gives:

  warning: inline function ‘static bool is_a_helper<T>::test(U*) [with U = const A; T = const C*]’ used but never defined

This patch instead allows is_a to be implemented by specialising
is_a_helper as a whole, for example:

  template<>
  struct is_a_helper<C *> : static_is_a_helper<C *>
  {
    static inline bool test (const A *a) { return a->x == 1; }
  };

It also adds a general specialisation of is_a_helper for const
pointers.  Together, this makes both of the above examples work.

gcc/
	* is-a.h (reinterpret_is_a_helper): New class.
	(static_is_a_helper): Likewise.
	(is_a_helper): Inherit from reinterpret_is_a_helper.
	(is_a_helper<const T *>): New specialization.
---
 gcc/is-a.h | 81 ++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 63 insertions(+), 18 deletions(-)

diff --git a/gcc/is-a.h b/gcc/is-a.h
index e84c3e4880c..26f53a5ba4a 100644
--- a/gcc/is-a.h
+++ b/gcc/is-a.h
@@ -116,9 +116,30 @@ the connection between the types has not been made.  See below.
 
 EXTENDING THE GENERIC TYPE FACILITY
 
-Each connection between types must be made by defining a specialization of the
-template member function 'test' of the template class 'is_a_helper'.  For
-example,
+Method 1
+--------
+
+If DERIVED is derived from BASE, and if BASE contains enough information
+to determine whether an object is actually an instance of DERIVED,
+then you can make the above routines work for DERIVED by defining
+a specialization of is_a_helper such as:
+
+  template<>
+  struct is_a_helper<DERIVED *> : static_is_a_helper<DERIVED *>
+  {
+    static inline bool test (const BASE *p) { return ...; }
+  };
+
+This test function should return true if P is an instance of DERIVED.
+This on its own is enough; the comments below for Method 2 do not apply.
+
+Method 2
+--------
+
+Alternatively, if two types are connected in ways other than C++
+inheritance, each connection between them must be made by defining a
+specialization of the template member function 'test' of the template
+class 'is_a_helper'.  For example,
 
   template <>
   template <>
@@ -145,15 +166,52 @@ when needed may result in a crash.  For example,
 #ifndef GCC_IS_A_H
 #define GCC_IS_A_H
 
+/* A base class that specializations of is_a_helper can use if casting
+   U * to T is simply a reinterpret_cast.  */
+
+template <typename T>
+struct reinterpret_is_a_helper
+{
+  template <typename U>
+  static inline T cast (U *p) { return reinterpret_cast <T> (p); }
+};
+
+/* A base class that specializations of is_a_helper can use if casting
+   U * to T is simply a static_cast.  This is more type-safe than
+   reinterpret_is_a_helper.  */
+
+template <typename T>
+struct static_is_a_helper
+{
+  template <typename U>
+  static inline T cast (U *p) { return static_cast <T> (p); }
+};
+
 /* A generic type conversion internal helper class.  */
 
 template <typename T>
-struct is_a_helper
+struct is_a_helper : reinterpret_is_a_helper<T>
 {
   template <typename U>
   static inline bool test (U *p);
+};
+
+/* Reuse the definition of is_a_helper<T *> to implement
+   is_a_helper<const T *>.  */
+
+template <typename T>
+struct is_a_helper<const T *>
+{
   template <typename U>
-  static inline T cast (U *p);
+  static inline const T *cast (const U *p)
+  {
+    return is_a_helper<T *>::cast (const_cast <U *> (p));
+  }
+  template <typename U>
+  static inline bool test (const U *p)
+  {
+    return is_a_helper<T *>::test (p);
+  }
 };
 
 /* Note that we deliberately do not define the 'test' member template.  Not
@@ -161,19 +219,6 @@ struct is_a_helper
    not been defined, rather than a run-time error.  See the discussion above
    for when to define this member.  */
 
-/* This is the generic implementation for casting from one type to another.
-   Do not use this routine directly; it is an internal function.  See the
-   discussion above for when to define this member.  */
-
-template <typename T>
-template <typename U>
-inline T
-is_a_helper <T>::cast (U *p)
-{
-  return reinterpret_cast <T> (p);
-}
-
-
 /* The public interface.  */
 
 /* A generic test for a type relationship.  See the discussion above for when
-- 
2.17.1


^ permalink raw reply	[flat|nested] 88+ messages in thread

* [11/23] Split update_cfg_for_uncondjump out of combine
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (9 preceding siblings ...)
  2020-11-13  8:16 ` [10/23] Tweak the way that is_a is implemented Richard Sandiford
@ 2020-11-13  8:16 ` Richard Sandiford
  2020-11-30  6:14   ` Jeff Law
  2020-11-13  8:17 ` [12/23] Export print-rtl.c:print_insn_with_notes Richard Sandiford
                   ` (12 subsequent siblings)
  23 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-13  8:16 UTC (permalink / raw)
  To: gcc-patches

Later patches want to reuse combine's update_cfg_for_uncondjump,
so this patch makes it a public cfgrtl.c function.

gcc/
	* cfgrtl.h (update_cfg_for_uncondjump): Declare.
	* combine.c (update_cfg_for_uncondjump): Move to...
	* cfgrtl.c: ...here.
---
 gcc/cfgrtl.c  | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 gcc/cfgrtl.h  |  1 +
 gcc/combine.c | 36 ------------------------------------
 3 files changed, 48 insertions(+), 36 deletions(-)

diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index 45d84d39b22..332e93607e6 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -3417,6 +3417,53 @@ fixup_abnormal_edges (void)
   return inserted;
 }
 \f
+/* Delete the unconditional jump INSN and adjust the CFG correspondingly.
+   Note that the INSN should be deleted *after* removing dead edges, so
+   that the kept edge is the fallthrough edge for a (set (pc) (pc))
+   but not for a (set (pc) (label_ref FOO)).  */
+
+void
+update_cfg_for_uncondjump (rtx_insn *insn)
+{
+  basic_block bb = BLOCK_FOR_INSN (insn);
+  gcc_assert (BB_END (bb) == insn);
+
+  purge_dead_edges (bb);
+
+  if (current_ir_type () != IR_RTL_CFGLAYOUT)
+    {
+      if (!find_fallthru_edge (bb->succs))
+	{
+	  auto barrier = next_nonnote_nondebug_insn (insn);
+	  if (!barrier || !BARRIER_P (barrier))
+	    emit_barrier_after (insn);
+	}
+      return;
+    }
+
+  delete_insn (insn);
+  if (EDGE_COUNT (bb->succs) == 1)
+    {
+      rtx_insn *insn;
+
+      single_succ_edge (bb)->flags |= EDGE_FALLTHRU;
+
+      /* Remove barriers from the footer if there are any.  */
+      for (insn = BB_FOOTER (bb); insn; insn = NEXT_INSN (insn))
+	if (BARRIER_P (insn))
+	  {
+	    if (PREV_INSN (insn))
+	      SET_NEXT_INSN (PREV_INSN (insn)) = NEXT_INSN (insn);
+	    else
+	      BB_FOOTER (bb) = NEXT_INSN (insn);
+	    if (NEXT_INSN (insn))
+	      SET_PREV_INSN (NEXT_INSN (insn)) = PREV_INSN (insn);
+	  }
+	else if (LABEL_P (insn))
+	  break;
+    }
+}
+\f
 /* Cut the insns from FIRST to LAST out of the insns stream.  */
 
 rtx_insn *
diff --git a/gcc/cfgrtl.h b/gcc/cfgrtl.h
index ae62d6cf05c..1c177d3a7e3 100644
--- a/gcc/cfgrtl.h
+++ b/gcc/cfgrtl.h
@@ -47,6 +47,7 @@ extern void fixup_partitions (void);
 extern bool purge_dead_edges (basic_block);
 extern bool purge_all_dead_edges (void);
 extern bool fixup_abnormal_edges (void);
+extern void update_cfg_for_uncondjump (rtx_insn *);
 extern rtx_insn *unlink_insn_chain (rtx_insn *, rtx_insn *);
 extern void relink_block_chain (bool);
 extern rtx_insn *duplicate_insn_chain (rtx_insn *, rtx_insn *,
diff --git a/gcc/combine.c b/gcc/combine.c
index ed1ad45de83..5864474e720 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2531,42 +2531,6 @@ reg_subword_p (rtx x, rtx reg)
 	 && GET_MODE_CLASS (GET_MODE (x)) == MODE_INT;
 }
 
-/* Delete the unconditional jump INSN and adjust the CFG correspondingly.
-   Note that the INSN should be deleted *after* removing dead edges, so
-   that the kept edge is the fallthrough edge for a (set (pc) (pc))
-   but not for a (set (pc) (label_ref FOO)).  */
-
-static void
-update_cfg_for_uncondjump (rtx_insn *insn)
-{
-  basic_block bb = BLOCK_FOR_INSN (insn);
-  gcc_assert (BB_END (bb) == insn);
-
-  purge_dead_edges (bb);
-
-  delete_insn (insn);
-  if (EDGE_COUNT (bb->succs) == 1)
-    {
-      rtx_insn *insn;
-
-      single_succ_edge (bb)->flags |= EDGE_FALLTHRU;
-
-      /* Remove barriers from the footer if there are any.  */
-      for (insn = BB_FOOTER (bb); insn; insn = NEXT_INSN (insn))
-	if (BARRIER_P (insn))
-	  {
-	    if (PREV_INSN (insn))
-	      SET_NEXT_INSN (PREV_INSN (insn)) = NEXT_INSN (insn);
-	    else
-	      BB_FOOTER (bb) = NEXT_INSN (insn);
-	    if (NEXT_INSN (insn))
-	      SET_PREV_INSN (NEXT_INSN (insn)) = PREV_INSN (insn);
-	  }
-	else if (LABEL_P (insn))
-	  break;
-    }
-}
-
 /* Return whether PAT is a PARALLEL of exactly N register SETs followed
    by an arbitrary number of CLOBBERs.  */
 static bool
-- 
2.17.1


^ permalink raw reply	[flat|nested] 88+ messages in thread

* [12/23] Export print-rtl.c:print_insn_with_notes
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (10 preceding siblings ...)
  2020-11-13  8:16 ` [11/23] Split update_cfg_for_uncondjump out of combine Richard Sandiford
@ 2020-11-13  8:17 ` Richard Sandiford
  2020-11-25 20:24   ` Jeff Law
  2020-11-13  8:18 ` [13/23] recog: Split out a register_asm_p function Richard Sandiford
                   ` (11 subsequent siblings)
  23 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-13  8:17 UTC (permalink / raw)
  To: gcc-patches

Later patches want to use print_insn_with_notes (printing to
a pretty_printer).  This patch exports it from print-rtl.c.

The non-notes version is already public.

gcc/
	* print-rtl.h (print_insn_with_notes): Declare.
	* print-rtl.c (print_insn_with_notes): Make non-static.
---
 gcc/print-rtl.c | 5 +----
 gcc/print-rtl.h | 1 +
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c
index d514b1c5373..c1d3c179b75 100644
--- a/gcc/print-rtl.c
+++ b/gcc/print-rtl.c
@@ -1260,9 +1260,6 @@ print_rtx_insn_vec (FILE *file, const vec<rtx_insn *> &vec)
    It is also possible to obtain a string for a single pattern as a string
    pointer, via str_pattern_slim, but this usage is discouraged.  */
 
-/* For insns we print patterns, and for some patterns we print insns...  */
-static void print_insn_with_notes (pretty_printer *, const rtx_insn *);
-
 /* This recognizes rtx'en classified as expressions.  These always
    represent some action on values or results of other expressions,
    which may be stored in objects representing values.  */
@@ -2011,7 +2008,7 @@ print_insn (pretty_printer *pp, const rtx_insn *x, int verbose)
 /* Pretty-print a slim dump of X (an insn) to PP, including any register
    note attached to the instruction.  */
 
-static void
+void
 print_insn_with_notes (pretty_printer *pp, const rtx_insn *x)
 {
   pp_string (pp, print_rtx_head);
diff --git a/gcc/print-rtl.h b/gcc/print-rtl.h
index 09e5a519be9..cf801e81332 100644
--- a/gcc/print-rtl.h
+++ b/gcc/print-rtl.h
@@ -84,6 +84,7 @@ extern void dump_rtl_slim (FILE *, const rtx_insn *, const rtx_insn *,
 extern void print_value (pretty_printer *, const_rtx, int);
 extern void print_pattern (pretty_printer *, const_rtx, int);
 extern void print_insn (pretty_printer *pp, const rtx_insn *x, int verbose);
+extern void print_insn_with_notes (pretty_printer *, const rtx_insn *);
 
 extern void rtl_dump_bb_for_graph (pretty_printer *, basic_block);
 extern const char *str_pattern_slim (const_rtx);
-- 
2.17.1


* [13/23] recog: Split out a register_asm_p function
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (11 preceding siblings ...)
  2020-11-13  8:17 ` [12/23] Export print-rtl.c:print_insn_with_notes Richard Sandiford
@ 2020-11-13  8:18 ` Richard Sandiford
  2020-11-25 20:24   ` Jeff Law
  2020-11-13  8:18 ` [14/23] simplify-rtx: Put simplify routines into a class Richard Sandiford
                   ` (10 subsequent siblings)
  23 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-13  8:18 UTC (permalink / raw)
  To: gcc-patches

verify_changes has a test for whether a particular hard register
is a user-defined register asm.  A later patch needs to test the
same thing, so this patch splits it out into a helper.
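
For reference, the condition matches hard registers that come from
user-level declarations such as (hypothetical example):

  register int counter asm ("r13");

i.e. registers whose REG_EXPR is a DECL_REGISTER decl with an
assembler name set.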

gcc/
	* rtl.h (register_asm_p): Declare.
	* recog.c (verify_changes): Split out the test for whether
	a hard register is a register asm to...
	* rtlanal.c (register_asm_p): ...this new function.
---
 gcc/recog.c   |  5 +----
 gcc/rtl.h     |  1 +
 gcc/rtlanal.c | 12 ++++++++++++
 3 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/gcc/recog.c b/gcc/recog.c
index 6d8b7d560ee..2d934169a81 100644
--- a/gcc/recog.c
+++ b/gcc/recog.c
@@ -408,10 +408,7 @@ verify_changes (int num)
 	       changes[i].old
 	       && REG_P (changes[i].old)
 	       && asm_noperands (PATTERN (object)) > 0
-	       && REG_EXPR (changes[i].old) != NULL_TREE
-	       && HAS_DECL_ASSEMBLER_NAME_P (REG_EXPR (changes[i].old))
-	       && DECL_ASSEMBLER_NAME_SET_P (REG_EXPR (changes[i].old))
-	       && DECL_REGISTER (REG_EXPR (changes[i].old)))
+	       && register_asm_p (changes[i].old))
 	{
 	  /* Don't allow changes of hard register operands to inline
 	     assemblies if they have been defined as register asm ("x").  */
diff --git a/gcc/rtl.h b/gcc/rtl.h
index fcec9dc6387..5a1670f295c 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -3525,6 +3525,7 @@ extern rtx tablejump_casesi_pattern (const rtx_insn *insn);
 extern int computed_jump_p (const rtx_insn *);
 extern bool tls_referenced_p (const_rtx);
 extern bool contains_mem_rtx_p (rtx x);
+extern bool register_asm_p (const_rtx);
 
 /* Overload for refers_to_regno_p for checking a single register.  */
 inline bool
diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 6f521503c39..30d5b0c6b76 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -6617,3 +6617,15 @@ add_auto_inc_notes (rtx_insn *insn, rtx x)
 	  add_auto_inc_notes (insn, XVECEXP (x, i, j));
     }
 }
+
+/* Return true if X is a register asm.  */
+
+bool
+register_asm_p (const_rtx x)
+{
+  return (REG_P (x)
+	  && REG_EXPR (x) != NULL_TREE
+	  && HAS_DECL_ASSEMBLER_NAME_P (REG_EXPR (x))
+	  && DECL_ASSEMBLER_NAME_SET_P (REG_EXPR (x))
+	  && DECL_REGISTER (REG_EXPR (x)));
+}
-- 
2.17.1


* [14/23] simplify-rtx: Put simplify routines into a class
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (12 preceding siblings ...)
  2020-11-13  8:18 ` [13/23] recog: Split out a register_asm_p function Richard Sandiford
@ 2020-11-13  8:18 ` Richard Sandiford
  2020-11-30 19:54   ` Jeff Law
  2020-11-13  8:19 ` [15/23] recog: Add a validate_change_xveclen function Richard Sandiford
                   ` (9 subsequent siblings)
  23 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-13  8:18 UTC (permalink / raw)
  To: gcc-patches

One of the recurring warts of RTL is that multiplication by a power
of 2 is represented as a MULT inside a MEM but as an ASHIFT outside
a MEM.  It would obviously be better if we didn't have this kind of
context sensitivity, but it would be difficult to remove.
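
For example, inside a MEM, an address that scales a register by 4
contains:

  (mult:SI (reg:SI 100) (const_int 4))

whereas the same value outside a MEM is canonically:

  (ashift:SI (reg:SI 100) (const_int 2))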

Currently the simplify-rtx.c routines are hard-coded for the
ASHIFT form.  This means that some callers have to convert the
ASHIFTs “back” into MULTs after calling the simplify-rtx.c
routines; see fwprop.c:canonicalize_address for an example.

I think we can relieve some of the pain by wrapping the simplify-rtx.c
routines in a simple class that tracks whether the expression occurs
in a MEM or not, so that no post-processing is needed.
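
As a purely illustrative sketch, a caller that knows it is simplifying
an address can now do:

  /* BASE and INDEX are rtxes that form part of an address.  */
  simplify_context ctx;
  ctx.mem_depth = 1;	/* we are inside a MEM */
  rtx addr = ctx.simplify_gen_binary (PLUS, Pmode, base, index);

while the existing free functions keep working unchanged: they become
inline wrappers that construct a temporary simplify_context whose
mem_depth is zero.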

An obvious concern is whether passing the “this” pointer around
will slow things down or bloat the code.  I can't measure any
increase in compile time after applying the patch.  Sizewise,
simplify-rtx.o text increases by 2.3% in default-checking builds
and 4.1% in release-checking builds.

I realise the MULT/ASHIFT thing isn't the most palatable
reason for doing this, but I think it might be useful for
other things in future, such as using local nonzero_bits
hooks/virtual functions instead of the global hooks.

The obvious alternative would be to add a static variable
and hope that it is always updated correctly.

Later patches make use of this.

gcc/
	* rtl.h (simplify_context): New class.
	(simplify_unary_operation, simplify_binary_operation): Use it.
	(simplify_ternary_operation, simplify_relational_operation): Likewise.
	(simplify_subreg, simplify_gen_unary, simplify_gen_binary): Likewise.
	(simplify_gen_ternary, simplify_gen_relational): Likewise.
	(simplify_gen_subreg, lowpart_subreg): Likewise.
	* simplify-rtx.c (simplify_gen_binary): Turn into a member function
	of simplify_context.
	(simplify_gen_unary, simplify_gen_ternary, simplify_gen_relational)
	(simplify_truncation, simplify_unary_operation): Likewise.
	(simplify_unary_operation_1, simplify_byte_swapping_operation)
	(simplify_associative_operation, simplify_logical_relational_operation)
	(simplify_binary_operation, simplify_binary_operation_series)
	(simplify_distributive_operation, simplify_plus_minus): Likewise.
	(simplify_relational_operation, simplify_relational_operation_1)
	(simplify_cond_clz_ctz, simplify_merge_mask): Likewise.
	(simplify_ternary_operation, simplify_subreg, simplify_gen_subreg)
	(lowpart_subreg): Likewise.
	(simplify_binary_operation_1): Likewise.  Test mem_depth when
	deciding whether the ASHIFT or MULT form is canonical.
	(simplify_merge_mask): Use simplify_context.
---
 gcc/rtl.h          | 149 ++++++++++++++++++++++++++++++++++++++------
 gcc/simplify-rtx.c | 152 ++++++++++++++++++++++++++-------------------
 2 files changed, 220 insertions(+), 81 deletions(-)

diff --git a/gcc/rtl.h b/gcc/rtl.h
index 5a1670f295c..e9df95b02c4 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -3375,30 +3375,143 @@ extern rtx_insn *try_split (rtx, rtx_insn *, int);
 extern rtx_insn *split_insns (rtx, rtx_insn *);
 
 /* In simplify-rtx.c  */
+
+/* A class that records the context in which a simplification
+   is being made.  */
+class simplify_context
+{
+public:
+  rtx simplify_unary_operation (rtx_code, machine_mode, rtx, machine_mode);
+  rtx simplify_binary_operation (rtx_code, machine_mode, rtx, rtx);
+  rtx simplify_ternary_operation (rtx_code, machine_mode, machine_mode,
+				  rtx, rtx, rtx);
+  rtx simplify_relational_operation (rtx_code, machine_mode, machine_mode,
+				     rtx, rtx);
+  rtx simplify_subreg (machine_mode, rtx, machine_mode, poly_uint64);
+
+  rtx lowpart_subreg (machine_mode, rtx, machine_mode);
+
+  rtx simplify_merge_mask (rtx, rtx, int);
+
+  rtx simplify_gen_unary (rtx_code, machine_mode, rtx, machine_mode);
+  rtx simplify_gen_binary (rtx_code, machine_mode, rtx, rtx);
+  rtx simplify_gen_ternary (rtx_code, machine_mode, machine_mode,
+			    rtx, rtx, rtx);
+  rtx simplify_gen_relational (rtx_code, machine_mode, machine_mode, rtx, rtx);
+  rtx simplify_gen_subreg (machine_mode, rtx, machine_mode, poly_uint64);
+
+  /* Tracks the level of MEM nesting for the value being simplified:
+     0 means the value is not in a MEM, >0 means it is.  This is needed
+     because the canonical representation of multiplication is different
+     inside a MEM than outside.  */
+  unsigned int mem_depth = 0;
+
+private:
+  rtx simplify_truncation (machine_mode, rtx, machine_mode);
+  rtx simplify_byte_swapping_operation (rtx_code, machine_mode, rtx, rtx);
+  rtx simplify_associative_operation (rtx_code, machine_mode, rtx, rtx);
+  rtx simplify_distributive_operation (rtx_code, machine_mode, rtx, rtx);
+  rtx simplify_logical_relational_operation (rtx_code, machine_mode, rtx, rtx);
+  rtx simplify_binary_operation_series (rtx_code, machine_mode, rtx, rtx);
+  rtx simplify_distribute_over_subregs (rtx_code, machine_mode, rtx, rtx);
+  rtx simplify_shift_const_int (rtx_code, machine_mode, rtx, unsigned int);
+  rtx simplify_plus_minus (rtx_code, machine_mode, rtx, rtx);
+  rtx simplify_cond_clz_ctz (rtx, rtx_code, rtx, rtx);
+
+  rtx simplify_unary_operation_1 (rtx_code, machine_mode, rtx);
+  rtx simplify_binary_operation_1 (rtx_code, machine_mode, rtx, rtx, rtx, rtx);
+  rtx simplify_ternary_operation_1 (rtx_code, machine_mode, machine_mode,
+				    rtx, rtx, rtx);
+  rtx simplify_relational_operation_1 (rtx_code, machine_mode, machine_mode,
+				       rtx, rtx);
+};
+
+inline rtx
+simplify_unary_operation (rtx_code code, machine_mode mode, rtx op,
+			  machine_mode op_mode)
+{
+  return simplify_context ().simplify_unary_operation (code, mode, op,
+						       op_mode);
+}
+
+inline rtx
+simplify_binary_operation (rtx_code code, machine_mode mode, rtx op0, rtx op1)
+{
+  return simplify_context ().simplify_binary_operation (code, mode, op0, op1);
+}
+
+inline rtx
+simplify_ternary_operation (rtx_code code, machine_mode mode,
+			    machine_mode op0_mode, rtx op0, rtx op1, rtx op2)
+{
+  return simplify_context ().simplify_ternary_operation (code, mode, op0_mode,
+							 op0, op1, op2);
+}
+
+inline rtx
+simplify_relational_operation (rtx_code code, machine_mode mode,
+			       machine_mode op_mode, rtx op0, rtx op1)
+{
+  return simplify_context ().simplify_relational_operation (code, mode,
+							    op_mode, op0, op1);
+}
+
+inline rtx
+simplify_subreg (machine_mode outermode, rtx op, machine_mode innermode,
+		 poly_uint64 byte)
+{
+  return simplify_context ().simplify_subreg (outermode, op, innermode, byte);
+}
+
+inline rtx
+simplify_gen_unary (rtx_code code, machine_mode mode, rtx op,
+		    machine_mode op_mode)
+{
+  return simplify_context ().simplify_gen_unary (code, mode, op, op_mode);
+}
+
+inline rtx
+simplify_gen_binary (rtx_code code, machine_mode mode, rtx op0, rtx op1)
+{
+  return simplify_context ().simplify_gen_binary (code, mode, op0, op1);
+}
+
+inline rtx
+simplify_gen_ternary (rtx_code code, machine_mode mode, machine_mode op0_mode,
+		      rtx op0, rtx op1, rtx op2)
+{
+  return simplify_context ().simplify_gen_ternary (code, mode, op0_mode,
+						   op0, op1, op2);
+}
+
+inline rtx
+simplify_gen_relational (rtx_code code, machine_mode mode,
+			 machine_mode op_mode, rtx op0, rtx op1)
+{
+  return simplify_context ().simplify_gen_relational (code, mode, op_mode,
+						      op0, op1);
+}
+
+inline rtx
+simplify_gen_subreg (machine_mode outermode, rtx op, machine_mode innermode,
+		     poly_uint64 byte)
+{
+  return simplify_context ().simplify_gen_subreg (outermode, op,
+						  innermode, byte);
+}
+
+inline rtx
+lowpart_subreg (machine_mode outermode, rtx op, machine_mode innermode)
+{
+  return simplify_context ().lowpart_subreg (outermode, op, innermode);
+}
+
 extern rtx simplify_const_unary_operation (enum rtx_code, machine_mode,
 					   rtx, machine_mode);
-extern rtx simplify_unary_operation (enum rtx_code, machine_mode, rtx,
-				     machine_mode);
 extern rtx simplify_const_binary_operation (enum rtx_code, machine_mode,
 					    rtx, rtx);
-extern rtx simplify_binary_operation (enum rtx_code, machine_mode, rtx,
-				      rtx);
-extern rtx simplify_ternary_operation (enum rtx_code, machine_mode,
-				       machine_mode, rtx, rtx, rtx);
 extern rtx simplify_const_relational_operation (enum rtx_code,
 						machine_mode, rtx, rtx);
-extern rtx simplify_relational_operation (enum rtx_code, machine_mode,
-					  machine_mode, rtx, rtx);
-extern rtx simplify_gen_binary (enum rtx_code, machine_mode, rtx, rtx);
-extern rtx simplify_gen_unary (enum rtx_code, machine_mode, rtx,
-			       machine_mode);
-extern rtx simplify_gen_ternary (enum rtx_code, machine_mode,
-				 machine_mode, rtx, rtx, rtx);
-extern rtx simplify_gen_relational (enum rtx_code, machine_mode,
-				    machine_mode, rtx, rtx);
-extern rtx simplify_subreg (machine_mode, rtx, machine_mode, poly_uint64);
-extern rtx simplify_gen_subreg (machine_mode, rtx, machine_mode, poly_uint64);
-extern rtx lowpart_subreg (machine_mode, rtx, machine_mode);
 extern rtx simplify_replace_fn_rtx (rtx, const_rtx,
 				    rtx (*fn) (rtx, const_rtx, void *), void *);
 extern rtx simplify_replace_rtx (rtx, const_rtx, rtx);
diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 47e7aebda8a..ff331e1a17f 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -47,14 +47,6 @@ along with GCC; see the file COPYING3.  If not see
   ((((HOST_WIDE_INT) low) < 0) ? HOST_WIDE_INT_M1 : HOST_WIDE_INT_0)
 
 static bool plus_minus_operand_p (const_rtx);
-static rtx simplify_plus_minus (enum rtx_code, machine_mode, rtx, rtx);
-static rtx simplify_associative_operation (enum rtx_code, machine_mode,
-					   rtx, rtx);
-static rtx simplify_relational_operation_1 (enum rtx_code, machine_mode,
-					    machine_mode, rtx, rtx);
-static rtx simplify_unary_operation_1 (enum rtx_code, machine_mode, rtx);
-static rtx simplify_binary_operation_1 (enum rtx_code, machine_mode,
-					rtx, rtx, rtx, rtx);
 \f
 /* Negate I, which satisfies poly_int_rtx_p.  MODE is the mode of I.  */
 
@@ -180,8 +172,8 @@ val_signbit_known_clear_p (machine_mode mode, unsigned HOST_WIDE_INT val)
    seeing if the expression folds.  */
 
 rtx
-simplify_gen_binary (enum rtx_code code, machine_mode mode, rtx op0,
-		     rtx op1)
+simplify_context::simplify_gen_binary (rtx_code code, machine_mode mode,
+				       rtx op0, rtx op1)
 {
   rtx tem;
 
@@ -358,8 +350,8 @@ delegitimize_mem_from_attrs (rtx x)
    the specified operation.  */
 
 rtx
-simplify_gen_unary (enum rtx_code code, machine_mode mode, rtx op,
-		    machine_mode op_mode)
+simplify_context::simplify_gen_unary (rtx_code code, machine_mode mode, rtx op,
+				      machine_mode op_mode)
 {
   rtx tem;
 
@@ -373,8 +365,9 @@ simplify_gen_unary (enum rtx_code code, machine_mode mode, rtx op,
 /* Likewise for ternary operations.  */
 
 rtx
-simplify_gen_ternary (enum rtx_code code, machine_mode mode,
-		      machine_mode op0_mode, rtx op0, rtx op1, rtx op2)
+simplify_context::simplify_gen_ternary (rtx_code code, machine_mode mode,
+					machine_mode op0_mode,
+					rtx op0, rtx op1, rtx op2)
 {
   rtx tem;
 
@@ -390,8 +383,9 @@ simplify_gen_ternary (enum rtx_code code, machine_mode mode,
    CMP_MODE specifies mode comparison is done in.  */
 
 rtx
-simplify_gen_relational (enum rtx_code code, machine_mode mode,
-			 machine_mode cmp_mode, rtx op0, rtx op1)
+simplify_context::simplify_gen_relational (rtx_code code, machine_mode mode,
+					   machine_mode cmp_mode,
+					   rtx op0, rtx op1)
 {
   rtx tem;
 
@@ -614,9 +608,9 @@ simplify_replace_rtx (rtx x, const_rtx old_rtx, rtx new_rtx)
    However, X is still an arbitrary 64-bit number and so we cannot
    assume that truncating it too is a no-op.  */
 
-static rtx
-simplify_truncation (machine_mode mode, rtx op,
-		     machine_mode op_mode)
+rtx
+simplify_context::simplify_truncation (machine_mode mode, rtx op,
+				       machine_mode op_mode)
 {
   unsigned int precision = GET_MODE_UNIT_PRECISION (mode);
   unsigned int op_precision = GET_MODE_UNIT_PRECISION (op_mode);
@@ -858,8 +852,8 @@ simplify_truncation (machine_mode mode, rtx op,
    MODE with input operand OP whose mode was originally OP_MODE.
    Return zero if no simplification can be made.  */
 rtx
-simplify_unary_operation (enum rtx_code code, machine_mode mode,
-			  rtx op, machine_mode op_mode)
+simplify_context::simplify_unary_operation (rtx_code code, machine_mode mode,
+					    rtx op, machine_mode op_mode)
 {
   rtx trueop, tem;
 
@@ -900,8 +894,9 @@ exact_int_to_float_conversion_p (const_rtx op)
 
 /* Perform some simplifications we can do even if the operands
    aren't constant.  */
-static rtx
-simplify_unary_operation_1 (enum rtx_code code, machine_mode mode, rtx op)
+rtx
+simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
+					      rtx op)
 {
   enum rtx_code reversed;
   rtx temp, elt, base, step;
@@ -2091,9 +2086,10 @@ simplify_const_unary_operation (enum rtx_code code, machine_mode mode,
    operating on OP0 and OP1.  CODE is currently one of AND, IOR or XOR.
    Return zero if no simplification or canonicalization is possible.  */
 
-static rtx
-simplify_byte_swapping_operation (enum rtx_code code, machine_mode mode,
-				  rtx op0, rtx op1)
+rtx
+simplify_context::simplify_byte_swapping_operation (rtx_code code,
+						    machine_mode mode,
+						    rtx op0, rtx op1)
 {
   rtx tem;
 
@@ -2121,9 +2117,10 @@ simplify_byte_swapping_operation (enum rtx_code code, machine_mode mode,
    SMIN, SMAX, UMIN or UMAX.  Return zero if no simplification or
    canonicalization is possible.  */
 
-static rtx
-simplify_associative_operation (enum rtx_code code, machine_mode mode,
-				rtx op0, rtx op1)
+rtx
+simplify_context::simplify_associative_operation (rtx_code code,
+						  machine_mode mode,
+						  rtx op0, rtx op1)
 {
   rtx tem;
 
@@ -2302,8 +2299,9 @@ comparison_code_valid_for_mode (enum rtx_code code, enum machine_mode mode)
    and OP1, which should be both relational operations.  Return 0 if no such
    simplification is possible.  */
 rtx
-simplify_logical_relational_operation (enum rtx_code code, machine_mode mode,
-				       rtx op0, rtx op1)
+simplify_context::simplify_logical_relational_operation (rtx_code code,
+							 machine_mode mode,
+							 rtx op0, rtx op1)
 {
   /* We only handle IOR of two relational operations.  */
   if (code != IOR)
@@ -2351,8 +2349,8 @@ simplify_logical_relational_operation (enum rtx_code code, machine_mode mode,
    Don't use this for relational operations such as EQ or LT.
    Use simplify_relational_operation instead.  */
 rtx
-simplify_binary_operation (enum rtx_code code, machine_mode mode,
-			   rtx op0, rtx op1)
+simplify_context::simplify_binary_operation (rtx_code code, machine_mode mode,
+					     rtx op0, rtx op1)
 {
   rtx trueop0, trueop1;
   rtx tem;
@@ -2397,9 +2395,10 @@ simplify_binary_operation (enum rtx_code code, machine_mode mode,
    MODE is the mode of the operation and is known to be a vector
    integer mode.  */
 
-static rtx
-simplify_binary_operation_series (rtx_code code, machine_mode mode,
-				  rtx op0, rtx op1)
+rtx
+simplify_context::simplify_binary_operation_series (rtx_code code,
+						    machine_mode mode,
+						    rtx op0, rtx op1)
 {
   rtx base0, step0;
   if (vec_duplicate_p (op0, &base0))
@@ -2433,9 +2432,10 @@ simplify_binary_operation_series (rtx_code code, machine_mode mode,
    e.g. simplify (xor (and A C) (and (B C)) to (and (xor (A B) C).
    Returns NULL_RTX if no simplification is possible.  */
 
-static rtx
-simplify_distributive_operation (enum rtx_code code, machine_mode mode,
-				 rtx op0, rtx op1)
+rtx
+simplify_context::simplify_distributive_operation (rtx_code code,
+						   machine_mode mode,
+						   rtx op0, rtx op1)
 {
   enum rtx_code op = GET_CODE (op0);
   gcc_assert (GET_CODE (op1) == op);
@@ -2481,9 +2481,11 @@ simplify_distributive_operation (enum rtx_code code, machine_mode mode,
    OP1 are constant pool references, TRUEOP0 and TRUEOP1 represent the
    actual constants.  */
 
-static rtx
-simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
-			     rtx op0, rtx op1, rtx trueop0, rtx trueop1)
+rtx
+simplify_context::simplify_binary_operation_1 (rtx_code code,
+					       machine_mode mode,
+					       rtx op0, rtx op1,
+					       rtx trueop0, rtx trueop1)
 {
   rtx tem, reversed, opleft, opright, elt0, elt1;
   HOST_WIDE_INT val;
@@ -2948,7 +2950,7 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
 	return op0;
 
       /* Convert multiply by constant power of two into shift.  */
-      if (CONST_SCALAR_INT_P (trueop1))
+      if (mem_depth == 0 && CONST_SCALAR_INT_P (trueop1))
 	{
 	  val = wi::exact_log2 (rtx_mode_t (trueop1, mode));
 	  if (val >= 0)
@@ -3863,6 +3865,18 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
 	return op0;
       if (trueop0 == CONST0_RTX (mode) && ! side_effects_p (op1))
 	return op0;
+      if (mem_depth
+	  && code == ASHIFT
+	  && CONST_INT_P (trueop1)
+	  && is_a <scalar_int_mode> (mode, &int_mode)
+	  && IN_RANGE (UINTVAL (trueop1),
+		       1, GET_MODE_PRECISION (int_mode) - 1))
+	{
+	  auto c = (wi::one (GET_MODE_PRECISION (int_mode))
+		    << UINTVAL (trueop1));
+	  rtx new_op1 = immed_wide_int_const (c, int_mode);
+	  return simplify_gen_binary (MULT, int_mode, op0, new_op1);
+	}
       goto canonicalize_shift;
 
     case LSHIFTRT:
@@ -4887,9 +4901,9 @@ simplify_plus_minus_op_data_cmp (rtx x, rtx y)
 
    May return NULL_RTX when no changes were made.  */
 
-static rtx
-simplify_plus_minus (enum rtx_code code, machine_mode mode, rtx op0,
-		     rtx op1)
+rtx
+simplify_context::simplify_plus_minus (rtx_code code, machine_mode mode,
+				       rtx op0, rtx op1)
 {
   struct simplify_plus_minus_op_data
   {
@@ -5234,8 +5248,10 @@ plus_minus_operand_p (const_rtx x)
    the operands or, if both are VOIDmode, the operands are compared in
    "infinite precision".  */
 rtx
-simplify_relational_operation (enum rtx_code code, machine_mode mode,
-			       machine_mode cmp_mode, rtx op0, rtx op1)
+simplify_context::simplify_relational_operation (rtx_code code,
+						 machine_mode mode,
+						 machine_mode cmp_mode,
+						 rtx op0, rtx op1)
 {
   rtx tem, trueop0, trueop1;
 
@@ -5318,9 +5334,11 @@ simplify_relational_operation (enum rtx_code code, machine_mode mode,
    MODE is the mode of the result, while CMP_MODE specifies in which
    mode the comparison is done in, so it is the mode of the operands.  */
 
-static rtx
-simplify_relational_operation_1 (enum rtx_code code, machine_mode mode,
-				 machine_mode cmp_mode, rtx op0, rtx op1)
+rtx
+simplify_context::simplify_relational_operation_1 (rtx_code code,
+						   machine_mode mode,
+						   machine_mode cmp_mode,
+						   rtx op0, rtx op1)
 {
   enum rtx_code op0code = GET_CODE (op0);
 
@@ -5988,8 +6006,9 @@ simplify_const_relational_operation (enum rtx_code code,
    Assume X is compared against zero with CMP_CODE and the true
    arm is TRUE_VAL and the false arm is FALSE_VAL.  */
 
-static rtx
-simplify_cond_clz_ctz (rtx x, rtx_code cmp_code, rtx true_val, rtx false_val)
+rtx
+simplify_context::simplify_cond_clz_ctz (rtx x, rtx_code cmp_code,
+					 rtx true_val, rtx false_val)
 {
   if (cmp_code != EQ && cmp_code != NE)
     return NULL_RTX;
@@ -6032,7 +6051,7 @@ simplify_cond_clz_ctz (rtx x, rtx_code cmp_code, rtx true_val, rtx false_val)
    Return the simplified X on success, otherwise return NULL_RTX.  */
 
 rtx
-simplify_merge_mask (rtx x, rtx mask, int op)
+simplify_context::simplify_merge_mask (rtx x, rtx mask, int op)
 {
   gcc_assert (VECTOR_MODE_P (GET_MODE (x)));
   poly_uint64 nunits = GET_MODE_NUNITS (GET_MODE (x));
@@ -6102,9 +6121,9 @@ simplify_merge_mask (rtx x, rtx mask, int op)
    a constant.  Return 0 if no simplifications is possible.  */
 
 rtx
-simplify_ternary_operation (enum rtx_code code, machine_mode mode,
-			    machine_mode op0_mode, rtx op0, rtx op1,
-			    rtx op2)
+simplify_context::simplify_ternary_operation (rtx_code code, machine_mode mode,
+					      machine_mode op0_mode,
+					      rtx op0, rtx op1, rtx op2)
 {
   bool any_change = false;
   rtx tem, trueop2;
@@ -6971,8 +6990,8 @@ simplify_immed_subreg (fixed_size_mode outermode, rtx x,
 /* Simplify SUBREG:OUTERMODE(OP:INNERMODE, BYTE)
    Return 0 if no simplifications are possible.  */
 rtx
-simplify_subreg (machine_mode outermode, rtx op,
-		 machine_mode innermode, poly_uint64 byte)
+simplify_context::simplify_subreg (machine_mode outermode, rtx op,
+				   machine_mode innermode, poly_uint64 byte)
 {
   /* Little bit of sanity checking.  */
   gcc_assert (innermode != VOIDmode);
@@ -7246,8 +7265,9 @@ simplify_subreg (machine_mode outermode, rtx op,
 /* Make a SUBREG operation or equivalent if it folds.  */
 
 rtx
-simplify_gen_subreg (machine_mode outermode, rtx op,
-		     machine_mode innermode, poly_uint64 byte)
+simplify_context::simplify_gen_subreg (machine_mode outermode, rtx op,
+				       machine_mode innermode,
+				       poly_uint64 byte)
 {
   rtx newx;
 
@@ -7270,8 +7290,8 @@ simplify_gen_subreg (machine_mode outermode, rtx op,
    INNER_MODE) to OUTER_MODE.  */
 
 rtx
-lowpart_subreg (machine_mode outer_mode, rtx expr,
-			     machine_mode inner_mode)
+simplify_context::lowpart_subreg (machine_mode outer_mode, rtx expr,
+				  machine_mode inner_mode)
 {
   return simplify_gen_subreg (outer_mode, expr, inner_mode,
 			      subreg_lowpart_offset (outer_mode, inner_mode));
@@ -7685,6 +7705,12 @@ test_vector_ops_series (machine_mode mode, rtx scalar_reg)
 					     series_0_1));
 }
 
+static rtx
+simplify_merge_mask (rtx x, rtx mask, int op)
+{
+  return simplify_context ().simplify_merge_mask (x, mask, op);
+}
+
 /* Verify simplify_merge_mask works correctly.  */
 
 static void
-- 
2.17.1


* [15/23] recog: Add a validate_change_xveclen function
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (13 preceding siblings ...)
  2020-11-13  8:18 ` [14/23] simplify-rtx: Put simplify routines into a class Richard Sandiford
@ 2020-11-13  8:19 ` Richard Sandiford
  2020-11-30 20:03   ` Jeff Law
  2020-11-13  8:19 ` [16/23] recog: Add a way of temporarily undoing changes Richard Sandiford
                   ` (8 subsequent siblings)
  23 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-13  8:19 UTC (permalink / raw)
  To: gcc-patches

A later patch wants to be able to use the validate_change machinery
to reduce the XVECLEN of a PARALLEL.  This should be more efficient
than allocating a separate PARALLEL at a possibly distant memory
location, especially since the new PARALLEL would be garbage rtl if
the new pattern turns out not to match.  Combine already pulls this
trick with SUBST_INT.

This patch adds a general helper for doing that.

gcc/
	* recog.h (validate_change_xveclen): Declare.
	* recog.c (change_t::old_len): New field.
	(validate_change_1): Add a new_len parameter.  Conditionally
	replace the XVECLEN of an rtx, avoiding single-element PARALLELs.
	(validate_change_xveclen): New function.
	(cancel_changes): Undo changes made by validate_change_xveclen.
---
 gcc/recog.c | 41 +++++++++++++++++++++++++++++++++++------
 gcc/recog.h |  1 +
 2 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/gcc/recog.c b/gcc/recog.c
index 2d934169a81..65125b8f0d1 100644
--- a/gcc/recog.c
+++ b/gcc/recog.c
@@ -183,6 +183,7 @@ struct change_t
 {
   rtx object;
   int old_code;
+  int old_len;
   bool unshare;
   rtx *loc;
   rtx old;
@@ -194,8 +195,10 @@ static int changes_allocated;
 static int num_changes = 0;
 
 /* Validate a proposed change to OBJECT.  LOC is the location in the rtl
-   at which NEW_RTX will be placed.  If OBJECT is zero, no validation is done,
-   the change is simply made.
+   at which NEW_RTX will be placed.  If NEW_LEN is >= 0, XVECLEN (NEW_RTX, 0)
+   will also be changed to NEW_LEN, which is no greater than the current
+   XVECLEN.  If OBJECT is zero, no validation is done, the change is
+   simply made.
 
    Two types of objects are supported:  If OBJECT is a MEM, memory_address_p
    will be called with the address and mode as parameters.  If OBJECT is
@@ -212,14 +215,25 @@ static int num_changes = 0;
    Otherwise, perform the change and return 1.  */
 
 static bool
-validate_change_1 (rtx object, rtx *loc, rtx new_rtx, bool in_group, bool unshare)
+validate_change_1 (rtx object, rtx *loc, rtx new_rtx, bool in_group,
+		   bool unshare, int new_len = -1)
 {
   rtx old = *loc;
 
-  if (old == new_rtx || rtx_equal_p (old, new_rtx))
+  /* Single-element parallels aren't valid and won't match anything.
+     Replace them with the single element.  */
+  if (new_len == 1 && GET_CODE (new_rtx) == PARALLEL)
+    {
+      new_rtx = XVECEXP (new_rtx, 0, 0);
+      new_len = -1;
+    }
+
+  if ((old == new_rtx || rtx_equal_p (old, new_rtx))
+      && (new_len < 0 || XVECLEN (new_rtx, 0) == new_len))
     return 1;
 
-  gcc_assert (in_group != 0 || num_changes == 0);
+  gcc_assert ((in_group != 0 || num_changes == 0)
+	      && (new_len < 0 || new_rtx == *loc));
 
   *loc = new_rtx;
 
@@ -239,8 +253,12 @@ validate_change_1 (rtx object, rtx *loc, rtx new_rtx, bool in_group, bool unshar
   changes[num_changes].object = object;
   changes[num_changes].loc = loc;
   changes[num_changes].old = old;
+  changes[num_changes].old_len = (new_len >= 0 ? XVECLEN (new_rtx, 0) : -1);
   changes[num_changes].unshare = unshare;
 
+  if (new_len >= 0)
+    XVECLEN (new_rtx, 0) = new_len;
+
   if (object && !MEM_P (object))
     {
       /* Set INSN_CODE to force rerecognition of insn.  Save old code in
@@ -278,6 +296,14 @@ validate_unshare_change (rtx object, rtx *loc, rtx new_rtx, bool in_group)
   return validate_change_1 (object, loc, new_rtx, in_group, true);
 }
 
+/* Change XVECLEN (*LOC, 0) to NEW_LEN.  OBJECT, IN_GROUP and the return
+   value are as for validate_change_1.  */
+
+bool
+validate_change_xveclen (rtx object, rtx *loc, int new_len, bool in_group)
+{
+  return validate_change_1 (object, loc, *loc, in_group, false, new_len);
+}
 
 /* Keep X canonicalized if some changes have made it non-canonical; only
    modifies the operands of X, not (for example) its code.  Simplifications
@@ -541,7 +567,10 @@ cancel_changes (int num)
      they were made.  */
   for (i = num_changes - 1; i >= num; i--)
     {
-      *changes[i].loc = changes[i].old;
+      if (changes[i].old_len >= 0)
+	XVECLEN (*changes[i].loc, 0) = changes[i].old_len;
+      else
+	*changes[i].loc = changes[i].old;
       if (changes[i].object && !MEM_P (changes[i].object))
 	INSN_CODE (changes[i].object) = changes[i].old_code;
     }
diff --git a/gcc/recog.h b/gcc/recog.h
index d87456c257f..e152e2bb591 100644
--- a/gcc/recog.h
+++ b/gcc/recog.h
@@ -88,6 +88,7 @@ extern int check_asm_operands (rtx);
 extern int asm_operand_ok (rtx, const char *, const char **);
 extern bool validate_change (rtx, rtx *, rtx, bool);
 extern bool validate_unshare_change (rtx, rtx *, rtx, bool);
+extern bool validate_change_xveclen (rtx, rtx *, int, bool);
 extern bool canonicalize_change_group (rtx_insn *insn, rtx x);
 extern int insn_invalid_p (rtx_insn *, bool);
 extern int verify_changes (int);
-- 
2.17.1


* [16/23] recog: Add a way of temporarily undoing changes
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (14 preceding siblings ...)
  2020-11-13  8:19 ` [15/23] recog: Add a validate_change_xveclen function Richard Sandiford
@ 2020-11-13  8:19 ` Richard Sandiford
  2020-11-25 20:27   ` Jeff Law
  2020-11-13  8:20 ` [17/23] recog: Add a class for propagating into insns Richard Sandiford
                   ` (7 subsequent siblings)
  23 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-13  8:19 UTC (permalink / raw)
  To: gcc-patches

In some cases, it can be convenient to roll back the changes that
have been made by validate_change to see how things looked before,
then reroll the changes.  For example, this makes it possible
to defer calculating the cost of an instruction until we know that
the result is actually needed.  It can also make dumps easier to read.

This patch adds a couple of helper functions for doing that.
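
For example (illustrative only), a pass that wants to dump the insn
in its pre-change form while changes are still pending can do:

  temporarily_undo_changes (0);
  if (dump_file)
    print_rtl_single (dump_file, insn);
  redo_changes (0);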

gcc/
	* recog.h (temporarily_undo_changes, redo_changes): Declare.
	* recog.c (swap_change, temporarily_undo_changes): New functions.
	(redo_changes): Likewise.
---
 gcc/recog.c | 40 ++++++++++++++++++++++++++++++++++++++++
 gcc/recog.h |  2 ++
 2 files changed, 42 insertions(+)

diff --git a/gcc/recog.c b/gcc/recog.c
index 65125b8f0d1..309a578a151 100644
--- a/gcc/recog.c
+++ b/gcc/recog.c
@@ -577,6 +577,46 @@ cancel_changes (int num)
   num_changes = num;
 }
 
+/* Swap the status of change NUM from being applied to not being applied,
+   or vice versa.  */
+
+static void
+swap_change (int num)
+{
+  if (changes[num].old_len >= 0)
+    std::swap (XVECLEN (*changes[num].loc, 0), changes[num].old_len);
+  else
+    std::swap (*changes[num].loc, changes[num].old);
+  if (changes[num].object && !MEM_P (changes[num].object))
+    std::swap (INSN_CODE (changes[num].object), changes[num].old_code);
+}
+
+/* Temporarily undo all the changes numbered NUM and up, with a view
+   to reapplying them later.  The next call to the changes machinery
+   must be:
+
+      redo_changes (NUM)
+
+   otherwise things will end up in an invalid state.  */
+
+void
+temporarily_undo_changes (int num)
+{
+  for (int i = num_changes - 1; i >= num; i--)
+    swap_change (i);
+}
+
+/* Redo the changes that were temporarily undone by:
+
+      temporarily_undo_changes (NUM).  */
+
+void
+redo_changes (int num)
+{
+  for (int i = num; i < num_changes; ++i)
+    swap_change (i);
+}
+
 /* Reduce conditional compilation elsewhere.  */
 /* A subroutine of validate_replace_rtx_1 that tries to simplify the resulting
    rtx.  */
diff --git a/gcc/recog.h b/gcc/recog.h
index e152e2bb591..facf36e7c08 100644
--- a/gcc/recog.h
+++ b/gcc/recog.h
@@ -96,6 +96,8 @@ extern void confirm_change_group (void);
 extern int apply_change_group (void);
 extern int num_validated_changes (void);
 extern void cancel_changes (int);
+extern void temporarily_undo_changes (int);
+extern void redo_changes (int);
 extern int constrain_operands (int, alternative_mask);
 extern int constrain_operands_cached (rtx_insn *, int);
 extern int memory_address_addr_space_p (machine_mode, rtx, addr_space_t);
-- 
2.17.1


* [17/23] recog: Add a class for propagating into insns
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (15 preceding siblings ...)
  2020-11-13  8:19 ` [16/23] recog: Add a way of temporarily undoing changes Richard Sandiford
@ 2020-11-13  8:20 ` Richard Sandiford
  2020-12-03 22:32   ` Jeff Law
  2020-11-13  8:20 ` [18/23] recog: Add an RAII class for undoing insn changes Richard Sandiford
                   ` (6 subsequent siblings)
  23 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-13  8:20 UTC (permalink / raw)
  To: gcc-patches

This patch adds yet another way of propagating into an instruction and
simplifying the result.  (The net effect of the series is to keep the
total number of propagation approaches the same though, since a later
patch removes the fwprop.c routines.)

One of the drawbacks of the validate_replace_* routines is that
they only do simple simplifications, mostly canonicalisations:

  /* Do changes needed to keep rtx consistent.  Don't do any other
     simplifications, as it is not our job.  */
  if (simplify)
    simplify_while_replacing (loc, to, object, op0_mode);

But substituting can often lead to real simplification opportunities.
simplify-rtx.c:simplify_replace_rtx does fully simplify the result,
but it only operates on specific rvalues rather than full instruction
patterns.  It is also nondestructive, which means that it returns a
new rtx whenever a substitution or simplification was possible.
This can create quite a bit of garbage rtl in the context of a
speculative recog, where changing the contents of a pointer is
often enough.

The new routines are therefore supposed to provide simplify_replace_rtx-
style substitution in recog.  They go to some effort to prevent garbage
rtl from being created.

At the moment, the new routines fail if the pattern would still refer
to the old "from" value in some way.  That might be unnecessary in
some contexts; if so, it could be put behind a configuration parameter.
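
A sketch of the intended usage (the caller code here is hypothetical):

  insn_propagation prop (insn, from, to);
  if (!prop.apply_to_pattern (&PATTERN (insn)))
    /* prop.failure_reason says why; the routine has already
       cancelled its own changes.  */
    return false;
  if (prop.num_replacements > 0 && apply_change_group ())
    return true;	/* the propagated pattern was recognised */
  cancel_changes (0);
  return false;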

gcc/
	* recog.h (insn_propagation): New class.
	* recog.c (insn_propagation::apply_to_mem_1): New function.
	(insn_propagation::apply_to_rvalue_1): Likewise.
	(insn_propagation::apply_to_lvalue_1): Likewise.
	(insn_propagation::apply_to_pattern_1): Likewise.
	(insn_propagation::apply_to_pattern): Likewise.
	(insn_propagation::apply_to_rvalue): Likewise.
---
 gcc/recog.c | 372 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/recog.h | 100 ++++++++++++++
 2 files changed, 472 insertions(+)

diff --git a/gcc/recog.c b/gcc/recog.c
index 309a578a151..cb0952d8c6c 100644
--- a/gcc/recog.c
+++ b/gcc/recog.c
@@ -989,6 +989,378 @@ validate_simplify_insn (rtx_insn *insn)
   return ((num_changes_pending () > 0) && (apply_change_group () > 0));
 }
 
+/* Try to process the address of memory expression MEM.  Return true on
+   success; leave the caller to clean up on failure.  */
+
+bool
+insn_propagation::apply_to_mem_1 (rtx mem)
+{
+  auto old_num_changes = num_validated_changes ();
+  mem_depth += 1;
+  bool res = apply_to_rvalue_1 (&XEXP (mem, 0));
+  mem_depth -= 1;
+  if (!res)
+    return false;
+
+  if (old_num_changes != num_validated_changes ()
+      && should_check_mems
+      && !check_mem (old_num_changes, mem))
+    return false;
+
+  return true;
+}
+
+/* Try to process the rvalue expression at *LOC.  Return true on success;
+   leave the caller to clean up on failure.  */
+
+bool
+insn_propagation::apply_to_rvalue_1 (rtx *loc)
+{
+  rtx x = *loc;
+  enum rtx_code code = GET_CODE (x);
+  machine_mode mode = GET_MODE (x);
+
+  auto old_num_changes = num_validated_changes ();
+  if (from && GET_CODE (x) == GET_CODE (from) && rtx_equal_p (x, from))
+    {
+      if (should_unshare)
+	validate_unshare_change (insn, loc, to, 1);
+      else
+	validate_change (insn, loc, to, 1);
+      if (mem_depth && !REG_P (to) && !CONSTANT_P (to))
+	{
+	  /* We're substituting into an address, but TO will have the
+	     form expected outside an address.  Canonicalize it if
+	     necessary.  */
+	  insn_propagation subprop (insn);
+	  subprop.mem_depth += 1;
+	  if (!subprop.apply_to_rvalue (loc))
+	    gcc_unreachable ();
+	  if (should_unshare
+	      && num_validated_changes () != old_num_changes + 1)
+	    {
+	      /* TO is owned by someone else, so create a copy and
+		 return TO to its original form.  */
+	      rtx to = copy_rtx (*loc);
+	      cancel_changes (old_num_changes);
+	      validate_change (insn, loc, to, 1);
+	    }
+	}
+      num_replacements += 1;
+      should_unshare = true;
+      result_flags |= UNSIMPLIFIED;
+      return true;
+    }
+
+  /* Recursively apply the substitution and see if we can simplify
+     the result.  This specifically shouldn't use simplify_gen_* for
+     speculative simplifications, since we want to avoid generating new
+     expressions where possible.  */
+  auto old_result_flags = result_flags;
+  rtx newx = NULL_RTX;
+  bool recurse_p = false;
+  switch (GET_RTX_CLASS (code))
+    {
+    case RTX_UNARY:
+      {
+	machine_mode op0_mode = GET_MODE (XEXP (x, 0));
+	if (!apply_to_rvalue_1 (&XEXP (x, 0)))
+	  return false;
+	if (from && old_num_changes == num_validated_changes ())
+	  return true;
+
+	newx = simplify_unary_operation (code, mode, XEXP (x, 0), op0_mode);
+	break;
+      }
+
+    case RTX_BIN_ARITH:
+    case RTX_COMM_ARITH:
+      {
+	if (!apply_to_rvalue_1 (&XEXP (x, 0))
+	    || !apply_to_rvalue_1 (&XEXP (x, 1)))
+	  return false;
+	if (from && old_num_changes == num_validated_changes ())
+	  return true;
+
+	if (GET_RTX_CLASS (code) == RTX_COMM_ARITH
+	    && swap_commutative_operands_p (XEXP (x, 0), XEXP (x, 1)))
+	  newx = simplify_gen_binary (code, mode, XEXP (x, 1), XEXP (x, 0));
+	else
+	  newx = simplify_binary_operation (code, mode,
+					    XEXP (x, 0), XEXP (x, 1));
+	break;
+      }
+
+    case RTX_COMPARE:
+    case RTX_COMM_COMPARE:
+      {
+	machine_mode op_mode = (GET_MODE (XEXP (x, 0)) != VOIDmode
+				? GET_MODE (XEXP (x, 0))
+				: GET_MODE (XEXP (x, 1)));
+	if (!apply_to_rvalue_1 (&XEXP (x, 0))
+	    || !apply_to_rvalue_1 (&XEXP (x, 1)))
+	  return false;
+	if (from && old_num_changes == num_validated_changes ())
+	  return true;
+
+	newx = simplify_relational_operation (code, mode, op_mode,
+					      XEXP (x, 0), XEXP (x, 1));
+	break;
+      }
+
+    case RTX_TERNARY:
+    case RTX_BITFIELD_OPS:
+      {
+	machine_mode op0_mode = GET_MODE (XEXP (x, 0));
+	if (!apply_to_rvalue_1 (&XEXP (x, 0))
+	    || !apply_to_rvalue_1 (&XEXP (x, 1))
+	    || !apply_to_rvalue_1 (&XEXP (x, 2)))
+	  return false;
+	if (from && old_num_changes == num_validated_changes ())
+	  return true;
+
+	newx = simplify_ternary_operation (code, mode, op0_mode,
+					   XEXP (x, 0), XEXP (x, 1),
+					   XEXP (x, 2));
+	break;
+      }
+
+    case RTX_EXTRA:
+      if (code == SUBREG)
+	{
+	  machine_mode inner_mode = GET_MODE (SUBREG_REG (x));
+	  if (!apply_to_rvalue_1 (&SUBREG_REG (x)))
+	    return false;
+	  if (from && old_num_changes == num_validated_changes ())
+	    return true;
+
+	  rtx inner = SUBREG_REG (x);
+	  newx = simplify_subreg (mode, inner, inner_mode, SUBREG_BYTE (x));
+	  /* Reject the same cases that simplify_gen_subreg would.  */
+	  if (!newx
+	      && (GET_CODE (inner) == SUBREG
+		  || GET_CODE (inner) == CONCAT
+		  || GET_MODE (inner) == VOIDmode
+		  || !validate_subreg (mode, inner_mode,
+				       inner, SUBREG_BYTE (x))))
+	    {
+	      failure_reason = "would create an invalid subreg";
+	      return false;
+	    }
+	  break;
+	}
+      else
+	recurse_p = true;
+      break;
+
+    case RTX_OBJ:
+      if (code == LO_SUM)
+	{
+	  if (!apply_to_rvalue_1 (&XEXP (x, 0))
+	      || !apply_to_rvalue_1 (&XEXP (x, 1)))
+	    return false;
+	  if (from && old_num_changes == num_validated_changes ())
+	    return true;
+
+	  /* (lo_sum (high x) y) -> y where x and y have the same base.  */
+	  rtx op0 = XEXP (x, 0);
+	  rtx op1 = XEXP (x, 1);
+	  if (GET_CODE (op0) == HIGH)
+	    {
+	      rtx base0, base1, offset0, offset1;
+	      split_const (XEXP (op0, 0), &base0, &offset0);
+	      split_const (op1, &base1, &offset1);
+	      if (rtx_equal_p (base0, base1))
+		newx = op1;
+	    }
+	}
+      else if (code == REG)
+	{
+	  if (from && REG_P (from) && reg_overlap_mentioned_p (x, from))
+	    {
+	      failure_reason = "inexact register overlap";
+	      return false;
+	    }
+	}
+      else if (code == MEM)
+	return apply_to_mem_1 (x);
+      else
+	recurse_p = true;
+      break;
+
+    case RTX_CONST_OBJ:
+      break;
+
+    case RTX_AUTOINC:
+      if (from && reg_overlap_mentioned_p (XEXP (x, 0), from))
+	{
+	  failure_reason = "is subject to autoinc";
+	  return false;
+	}
+      recurse_p = true;
+      break;
+
+    case RTX_MATCH:
+    case RTX_INSN:
+      gcc_unreachable ();
+    }
+
+  if (recurse_p)
+    {
+      const char *fmt = GET_RTX_FORMAT (code);
+      for (int i = 0; fmt[i]; i++)
+	switch (fmt[i])
+	  {
+	  case 'E':
+	    for (int j = 0; j < XVECLEN (x, i); j++)
+	      if (!apply_to_rvalue_1 (&XVECEXP (x, i, j)))
+		return false;
+	    break;
+
+	  case 'e':
+	    if (XEXP (x, i) && !apply_to_rvalue_1 (&XEXP (x, i)))
+	      return false;
+	    break;
+	  }
+    }
+  else if (newx && !rtx_equal_p (x, newx))
+    {
+      /* All substitutions made by OLD_NUM_CHANGES onwards have been
+	 simplified.  */
+      result_flags = ((result_flags & ~UNSIMPLIFIED)
+		      | (old_result_flags & UNSIMPLIFIED));
+
+      if (should_note_simplifications)
+	note_simplification (old_num_changes, old_result_flags, x, newx);
+
+      /* There's no longer any point unsharing the substitutions made
+	 for subexpressions, since we'll just copy this one instead.  */
+      bool unshare = false;
+      for (int i = old_num_changes; i < num_changes; ++i)
+	{
+	  unshare |= changes[i].unshare;
+	  changes[i].unshare = false;
+	}
+      if (unshare)
+	validate_unshare_change (insn, loc, newx, 1);
+      else
+	validate_change (insn, loc, newx, 1);
+    }
+
+  return true;
+}
+
+/* Try to process the lvalue expression at *LOC.  Return true on success;
+   leave the caller to clean up on failure.  */
+
+bool
+insn_propagation::apply_to_lvalue_1 (rtx dest)
+{
+  rtx old_dest = dest;
+  while (GET_CODE (dest) == SUBREG
+	 || GET_CODE (dest) == ZERO_EXTRACT
+	 || GET_CODE (dest) == STRICT_LOW_PART)
+    {
+      if (GET_CODE (dest) == ZERO_EXTRACT
+	  && (!apply_to_rvalue_1 (&XEXP (dest, 1))
+	      || !apply_to_rvalue_1 (&XEXP (dest, 2))))
+	return false;
+      dest = XEXP (dest, 0);
+    }
+
+  if (MEM_P (dest))
+    return apply_to_mem_1 (dest);
+
+  /* Check whether the substitution is safe in the presence of this lvalue.  */
+  if (!from
+      || dest == old_dest
+      || !REG_P (dest)
+      || !reg_overlap_mentioned_p (dest, from))
+    return true;
+
+  if (SUBREG_P (old_dest)
+      && SUBREG_REG (old_dest) == dest
+      && !read_modify_subreg_p (old_dest))
+    return true;
+
+  failure_reason = "is part of a read-write destination";
+  return false;
+}
+
+/* Try to process the instruction pattern at *LOC.  Return true on success;
+   leave the caller to clean up on failure.  */
+
+bool
+insn_propagation::apply_to_pattern_1 (rtx *loc)
+{
+  rtx body = *loc;
+  switch (GET_CODE (body))
+    {
+    case COND_EXEC:
+      return (apply_to_rvalue_1 (&COND_EXEC_TEST (body))
+	      && apply_to_pattern_1 (&COND_EXEC_CODE (body)));
+
+    case PARALLEL:
+      {
+	int last = XVECLEN (body, 0) - 1;
+	for (int i = 0; i < last; ++i)
+	  if (!apply_to_pattern_1 (&XVECEXP (body, 0, i)))
+	    return false;
+	return apply_to_pattern_1 (&XVECEXP (body, 0, last));
+      }
+
+    case ASM_OPERANDS:
+      for (int i = 0, len = ASM_OPERANDS_INPUT_LENGTH (body); i < len; ++i)
+	if (!apply_to_rvalue_1 (&ASM_OPERANDS_INPUT (body, i)))
+	  return false;
+      return true;
+
+    case CLOBBER:
+      return apply_to_lvalue_1 (XEXP (body, 0));
+
+    case SET:
+      return (apply_to_lvalue_1 (SET_DEST (body))
+	      && apply_to_rvalue_1 (&SET_SRC (body)));
+
+    default:
+      /* All the other possibilities never store and can use a normal
+	 rtx walk.  This includes:
+
+	 - USE
+	 - TRAP_IF
+	 - PREFETCH
+	 - UNSPEC
+	 - UNSPEC_VOLATILE.  */
+      return apply_to_rvalue_1 (loc);
+    }
+}
+
+/* Apply this insn_propagation object's simplification or substitution
+   to the instruction pattern at LOC.  */
+
+bool
+insn_propagation::apply_to_pattern (rtx *loc)
+{
+  unsigned int num_changes = num_validated_changes ();
+  bool res = apply_to_pattern_1 (loc);
+  if (!res)
+    cancel_changes (num_changes);
+  return res;
+}
+
+/* Apply this insn_propagation object's simplification or substitution
+   to the rvalue expression at LOC.  */
+
+bool
+insn_propagation::apply_to_rvalue (rtx *loc)
+{
+  unsigned int num_changes = num_validated_changes ();
+  bool res = apply_to_rvalue_1 (loc);
+  if (!res)
+    cancel_changes (num_changes);
+  return res;
+}
+
 /* Check whether INSN matches a specific alternative of an .md pattern.  */
 
 bool
diff --git a/gcc/recog.h b/gcc/recog.h
index facf36e7c08..d6af2aa66d9 100644
--- a/gcc/recog.h
+++ b/gcc/recog.h
@@ -82,6 +82,106 @@ alternative_class (const operand_alternative *alt, int i)
   return alt[i].matches >= 0 ? alt[alt[i].matches].cl : alt[i].cl;
 }
 
+/* A class for substituting one rtx for another within an instruction,
+   or for recursively simplifying the instruction as-is.  Derived classes
+   can record or filter certain decisions.  */
+
+class insn_propagation : public simplify_context
+{
+public:
+  /* Assignments for RESULT_FLAGS.
+
+     UNSIMPLIFIED is true if a substitution has been made inside an rtx
+     X and if neither X nor its parent expressions could be simplified.
+
+     FIRST_SPARE_RESULT is the first flag available for derived classes.  */
+  static const uint16_t UNSIMPLIFIED = 1U << 0;
+  static const uint16_t FIRST_SPARE_RESULT = 1U << 1;
+
+  insn_propagation (rtx_insn *);
+  insn_propagation (rtx_insn *, rtx, rtx, bool = true);
+  bool apply_to_pattern (rtx *);
+  bool apply_to_rvalue (rtx *);
+
+  /* Return true if we should accept a substitution into the address of
+     memory expression MEM.  Undoing changes OLD_NUM_CHANGES and up restores
+     MEM's original address.  */
+  virtual bool check_mem (int /*old_num_changes*/,
+			  rtx /*mem*/) { return true; }
+
+  /* Note that we've simplified OLD_RTX into NEW_RTX.  When substituting,
+     this only happens if a substitution occurred within OLD_RTX.
+     Undoing OLD_NUM_CHANGES and up will restore the old form of OLD_RTX.
+     OLD_RESULT_FLAGS is the value that RESULT_FLAGS had before processing
+     OLD_RTX.  */
+  virtual void note_simplification (int /*old_num_changes*/,
+				    uint16_t /*old_result_flags*/,
+				    rtx /*old_rtx*/, rtx /*new_rtx*/) {}
+
+private:
+  bool apply_to_mem_1 (rtx);
+  bool apply_to_lvalue_1 (rtx);
+  bool apply_to_rvalue_1 (rtx *);
+  bool apply_to_pattern_1 (rtx *);
+
+public:
+  /* The instruction that we are simplifying or propagating into.  */
+  rtx_insn *insn;
+
+  /* If FROM is nonnull, we're replacing FROM with TO, otherwise we're
+     just doing a recursive simplification.  */
+  rtx from;
+  rtx to;
+
+  /* The number of times that we have replaced FROM with TO.  */
+  unsigned int num_replacements;
+
+  /* A bitmask of flags that describe the result of the simplification;
+     see above for details.  */
+  uint16_t result_flags : 16;
+
+  /* True if we should unshare TO when making the next substitution,
+     false if we can use TO itself.  */
+  uint16_t should_unshare : 1;
+
+  /* True if we should call check_mem after substituting into a memory.  */
+  uint16_t should_check_mems : 1;
+
+  /* True if we should call note_simplification after each simplification.  */
+  uint16_t should_note_simplifications : 1;
+
+  /* For future expansion.  */
+  uint16_t spare : 13;
+
+  /* Gives the reason that a substitution failed, for debug purposes.  */
+  const char *failure_reason;
+};
+
+/* Try to replace FROM with TO in INSN.  SHARED_P is true if TO is shared
+   with other instructions, false if INSN can use TO directly.  */
+
+inline insn_propagation::insn_propagation (rtx_insn *insn, rtx from, rtx to,
+					   bool shared_p)
+  : insn (insn),
+    from (from),
+    to (to),
+    num_replacements (0),
+    result_flags (0),
+    should_unshare (shared_p),
+    should_check_mems (false),
+    should_note_simplifications (false),
+    spare (0),
+    failure_reason (nullptr)
+{
+}
+
+/* Try to simplify INSN without performing a substitution.  */
+
+inline insn_propagation::insn_propagation (rtx_insn *insn)
+  : insn_propagation (insn, NULL_RTX, NULL_RTX)
+{
+}
+
 extern void init_recog (void);
 extern void init_recog_no_volatile (void);
 extern int check_asm_operands (rtx);
-- 
2.17.1


* [18/23] recog: Add an RAII class for undoing insn changes
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (16 preceding siblings ...)
  2020-11-13  8:20 ` [17/23] recog: Add a class for propagating into insns Richard Sandiford
@ 2020-11-13  8:20 ` Richard Sandiford
  2020-11-25 20:27   ` Jeff Law
  2020-11-13  8:20 ` [19/23] rtlanal: Add some new helper classes Richard Sandiford
                   ` (5 subsequent siblings)
  23 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-13  8:20 UTC (permalink / raw)
  To: gcc-patches

When using validate_change to make a group of changes, you have
to remember to cancel them if something goes wrong.  This patch
adds an RAII class to make that easier.  See the comments in the
patch for details and examples.

gcc/
	* recog.h (insn_change_watermark): New class.
---
 gcc/recog.h | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/gcc/recog.h b/gcc/recog.h
index d6af2aa66d9..b8de43b95bb 100644
--- a/gcc/recog.h
+++ b/gcc/recog.h
@@ -503,6 +503,57 @@ alternative_mask get_preferred_alternatives (rtx_insn *, basic_block);
 bool check_bool_attrs (rtx_insn *);
 
 void recog_init ();
+
+/* This RAII class can help to undo tentative insn changes on failure.
+   When an object of the class goes out of scope, it undoes all group
+   changes that have been made via the validate_change machinery and
+   not yet confirmed via confirm_change_group.
+
+   For example:
+
+      insn_change_watermark watermark;
+      validate_change (..., true); // A
+      ...
+      if (test)
+	// Undoes change A.
+	return false;
+      ...
+      validate_change (..., true); // B
+      ...
+      if (test)
+	// Undoes changes A and B.
+	return false;
+      ...
+      confirm_change_group ();
+
+   Code that wants to avoid this behavior can use keep ():
+
+      insn_change_watermark watermark;
+      validate_change (..., true); // A
+      ...
+      if (test)
+	// Undoes change A.
+	return false;
+      ...
+      watermark.keep ();
+      validate_change (..., true); // B
+      ...
+      if (test)
+	// Undoes change B, but not A.
+	return false;
+      ...
+      confirm_change_group ();  */
+class insn_change_watermark
+{
+public:
+  insn_change_watermark () : m_old_num_changes (num_validated_changes ()) {}
+  ~insn_change_watermark () { cancel_changes (m_old_num_changes); }
+  void keep () { m_old_num_changes = num_validated_changes (); }
+
+private:
+  int m_old_num_changes;
+};
+
 #endif
 
 #endif /* GCC_RECOG_H */
-- 
2.17.1


* [19/23] rtlanal: Add some new helper classes
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (17 preceding siblings ...)
  2020-11-13  8:20 ` [18/23] recog: Add an RAII class for undoing insn changes Richard Sandiford
@ 2020-11-13  8:20 ` Richard Sandiford
  2020-12-13 17:30   ` Jeff Law
  2020-11-13  8:21 ` [20/23] rtlanal: Add simple_regno_set Richard Sandiford
                   ` (4 subsequent siblings)
  23 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-13  8:20 UTC (permalink / raw)
  To: gcc-patches

This patch adds some classes for gathering the list of registers
and memory that are read and written by an instruction, along
with various properties about the accesses.  In some ways it's
similar to the information that DF collects for registers,
but extended to memory.  The main reason for using it instead
of DF is that it can analyse tentative changes to instructions
before they've been committed.

The classes also collect general information about the instruction,
since it's cheap to do and helps to avoid multiple walks of the same
RTL pattern.

I've tried to optimise the code quite a bit, since with later patches
it becomes relatively performance-sensitive.  See the discussion in
the comments for the trade-offs involved.

I put the declarations in a new rtlanal.h header file since it
seemed a bit excessive to put so much new inline stuff in rtl.h.
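
As a sketch of the intended usage (member names as declared in the
new header; treat this as illustrative only):

  vec_rtx_properties properties;
  properties.add_insn (insn, false);
  for (rtx_obj_reference ref : properties.refs ())
    if (ref.is_mem () && (ref.flags & rtx_obj_flags::IS_WRITE))
      /* INSN writes to memory.  */;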

gcc/
	* rtlanal.h: New file.
	(MEM_REGNO): New constant.
	(rtx_obj_flags): New namespace.
	(rtx_obj_reference, rtx_properties): New classes.
	(growing_rtx_properties, vec_rtx_properties_base): Likewise.
	(vec_rtx_properties): New alias.
	* rtlanal.c: Include it.
	(rtx_properties::try_to_add_reg): New function.
	(rtx_properties::try_to_add_dest): Likewise.
	(rtx_properties::try_to_add_src): Likewise.
	(rtx_properties::try_to_add_pattern): Likewise.
	(rtx_properties::try_to_add_insn): Likewise.
	(vec_rtx_properties_base::grow): Likewise.
---
 gcc/rtlanal.c | 282 ++++++++++++++++++++++++++++++++++++++++++
 gcc/rtlanal.h | 334 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 616 insertions(+)
 create mode 100644 gcc/rtlanal.h

diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 30d5b0c6b76..404813b7668 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -24,6 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "backend.h"
 #include "target.h"
 #include "rtl.h"
+#include "rtlanal.h"
 #include "tree.h"
 #include "predict.h"
 #include "df.h"
@@ -2049,6 +2050,287 @@ note_uses (rtx *pbody, void (*fun) (rtx *, void *), void *data)
       return;
     }
 }
+
+/* Try to add a description of REG X to this object, stopping once
+   the REF_END limit has been reached.  FLAGS is a bitmask of
+   rtx_obj_reference flags that describe the context.  */
+
+void
+rtx_properties::try_to_add_reg (const_rtx x, unsigned int flags)
+{
+  if (REG_NREGS (x) != 1)
+    flags |= rtx_obj_flags::IS_MULTIREG;
+  machine_mode mode = GET_MODE (x);
+  unsigned int start_regno = REGNO (x);
+  unsigned int end_regno = END_REGNO (x);
+  for (unsigned int regno = start_regno; regno < end_regno; ++regno)
+    if (ref_iter != ref_end)
+      *ref_iter++ = rtx_obj_reference (regno, flags, mode,
+				       regno - start_regno);
+}
+
+/* Try to add a description of destination X to this object, stopping
+   once the REF_END limit has been reached.  FLAGS is a bitmask of
+   rtx_obj_reference flags that describe the context.
+
+   This routine accepts all rtxes that can legitimately appear in a
+   SET_DEST.  */
+
+void
+rtx_properties::try_to_add_dest (const_rtx x, unsigned int flags)
+{
+  /* If we have a PARALLEL, SET_DEST is a list of EXPR_LIST expressions,
+     each of whose first operand is a register.  */
+  if (__builtin_expect (GET_CODE (x) == PARALLEL, 0))
+    {
+      for (int i = XVECLEN (x, 0) - 1; i >= 0; --i)
+	if (rtx dest = XEXP (XVECEXP (x, 0, i), 0))
+	  try_to_add_dest (dest, flags);
+      return;
+    }
+
+  unsigned int base_flags = flags & rtx_obj_flags::STICKY_FLAGS;
+  flags |= rtx_obj_flags::IS_WRITE;
+  for (;;)
+    if (GET_CODE (x) == ZERO_EXTRACT)
+      {
+	try_to_add_src (XEXP (x, 1), base_flags);
+	try_to_add_src (XEXP (x, 2), base_flags);
+	flags |= rtx_obj_flags::IS_READ;
+	x = XEXP (x, 0);
+      }
+    else if (GET_CODE (x) == STRICT_LOW_PART)
+      {
+	flags |= rtx_obj_flags::IS_READ;
+	x = XEXP (x, 0);
+      }
+    else if (GET_CODE (x) == SUBREG)
+      {
+	flags |= rtx_obj_flags::IN_SUBREG;
+	if (read_modify_subreg_p (x))
+	  flags |= rtx_obj_flags::IS_READ;
+	x = SUBREG_REG (x);
+      }
+    else
+      break;
+
+  if (MEM_P (x))
+    {
+      if (ref_iter != ref_end)
+	*ref_iter++ = rtx_obj_reference (MEM_REGNO, flags, GET_MODE (x));
+
+      unsigned int addr_flags = base_flags | rtx_obj_flags::IN_MEM_STORE;
+      if (flags & rtx_obj_flags::IS_READ)
+	addr_flags |= rtx_obj_flags::IN_MEM_LOAD;
+      try_to_add_src (XEXP (x, 0), addr_flags);
+      return;
+    }
+
+  if (__builtin_expect (REG_P (x), 1))
+    {
+      /* We want to keep sp alive everywhere - by making all
+	 writes to sp also use sp.  */
+      if (REGNO (x) == STACK_POINTER_REGNUM)
+	flags |= rtx_obj_flags::IS_READ;
+      try_to_add_reg (x, flags);
+      return;
+    }
+}
+
+/* Try to add a description of source X to this object, stopping once
+   the REF_END limit has been reached.  FLAGS is a bitmask of
+   rtx_obj_reference flags that describe the context.
+
+   This routine accepts all rtxes that can legitimately appear in a SET_SRC.  */
+
+void
+rtx_properties::try_to_add_src (const_rtx x, unsigned int flags)
+{
+  unsigned int base_flags = flags & rtx_obj_flags::STICKY_FLAGS;
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, x, NONCONST)
+    {
+      const_rtx x = *iter;
+      rtx_code code = GET_CODE (x);
+      if (code == REG)
+	try_to_add_reg (x, flags | rtx_obj_flags::IS_READ);
+      else if (code == MEM)
+	{
+	  if (MEM_VOLATILE_P (x))
+	    has_volatile_refs = true;
+
+	  if (!MEM_READONLY_P (x) && ref_iter != ref_end)
+	    {
+	      auto mem_flags = flags | rtx_obj_flags::IS_READ;
+	      *ref_iter++ = rtx_obj_reference (MEM_REGNO, mem_flags,
+					       GET_MODE (x));
+	    }
+
+	  try_to_add_src (XEXP (x, 0),
+			  base_flags | rtx_obj_flags::IN_MEM_LOAD);
+	  iter.skip_subrtxes ();
+	}
+      else if (code == SUBREG)
+	{
+	  try_to_add_src (SUBREG_REG (x), flags | rtx_obj_flags::IN_SUBREG);
+	  iter.skip_subrtxes ();
+	}
+      else if (code == UNSPEC_VOLATILE)
+	has_volatile_refs = true;
+      else if (code == ASM_INPUT || code == ASM_OPERANDS)
+	{
+	  has_asm = true;
+	  if (MEM_VOLATILE_P (x))
+	    has_volatile_refs = true;
+	}
+      else if (code == PRE_INC
+	       || code == PRE_DEC
+	       || code == POST_INC
+	       || code == POST_DEC
+	       || code == PRE_MODIFY
+	       || code == POST_MODIFY)
+	{
+	  has_pre_post_modify = true;
+
+	  unsigned int addr_flags = (base_flags
+				     | rtx_obj_flags::IS_PRE_POST_MODIFY
+				     | rtx_obj_flags::IS_READ);
+	  try_to_add_dest (XEXP (x, 0), addr_flags);
+	  if (code == PRE_MODIFY || code == POST_MODIFY)
+	    iter.substitute (XEXP (XEXP (x, 1), 1));
+	  else
+	    iter.skip_subrtxes ();
+	}
+      else if (code == CALL)
+	has_call = true;
+    }
+}
+
+/* Try to add a description of instruction pattern PAT to this object,
+   stopping once the REF_END limit has been reached.  */
+
+void
+rtx_properties::try_to_add_pattern (const_rtx pat)
+{
+  switch (GET_CODE (pat))
+    {
+    case COND_EXEC:
+      try_to_add_src (COND_EXEC_TEST (pat));
+      try_to_add_pattern (COND_EXEC_CODE (pat));
+      break;
+
+    case PARALLEL:
+      {
+	int last = XVECLEN (pat, 0) - 1;
+	for (int i = 0; i < last; ++i)
+	  try_to_add_pattern (XVECEXP (pat, 0, i));
+	try_to_add_pattern (XVECEXP (pat, 0, last));
+	break;
+      }
+
+    case ASM_OPERANDS:
+      for (int i = 0, len = ASM_OPERANDS_INPUT_LENGTH (pat); i < len; ++i)
+	try_to_add_src (ASM_OPERANDS_INPUT (pat, i));
+      break;
+
+    case CLOBBER:
+      try_to_add_dest (XEXP (pat, 0), rtx_obj_flags::IS_CLOBBER);
+      break;
+
+    case SET:
+      try_to_add_dest (SET_DEST (pat));
+      try_to_add_src (SET_SRC (pat));
+      break;
+
+    default:
+      /* All the other possibilities never store and can use a normal
+	 rtx walk.  This includes:
+
+	 - USE
+	 - TRAP_IF
+	 - PREFETCH
+	 - UNSPEC
+	 - UNSPEC_VOLATILE.  */
+      try_to_add_src (pat);
+      break;
+    }
+}
+
+/* Try to add a description of INSN to this object, stopping once
+   the REF_END limit has been reached.  INCLUDE_NOTES is true if the
+   description should include REG_EQUAL and REG_EQUIV notes; all such
+   references will then be marked with rtx_obj_flags::IN_NOTE.
+
+   For calls, this description includes all accesses in
+   CALL_INSN_FUNCTION_USAGE.  It also includes all implicit accesses
+   to global registers by the target function.  However, it does not
+   include clobbers performed by the target function; callers that want
+   this information should instead use the function_abi interface.  */
+
+void
+rtx_properties::try_to_add_insn (const rtx_insn *insn, bool include_notes)
+{
+  if (CALL_P (insn))
+    {
+      /* Adding the global registers first removes a situation in which
+	 a fixed-form clobber of register R could come before a real set
+	 of register R.  */
+      if (!hard_reg_set_empty_p (global_reg_set))
+	{
+	  unsigned int flags = (rtx_obj_flags::IS_READ
+				| rtx_obj_flags::IS_WRITE);
+	  for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
+	    if (global_regs[regno] && ref_iter != ref_end)
+	      *ref_iter++ = rtx_obj_reference (regno, flags,
+					       reg_raw_mode[regno], 0);
+	}
+      if (ref_iter != ref_end && !RTL_CONST_CALL_P (insn))
+	{
+	  auto mem_flags = rtx_obj_flags::IS_READ;
+	  if (!RTL_PURE_CALL_P (insn))
+	    mem_flags |= rtx_obj_flags::IS_WRITE;
+	  *ref_iter++ = rtx_obj_reference (MEM_REGNO, mem_flags, BLKmode);
+	}
+      try_to_add_pattern (PATTERN (insn));
+      for (rtx link = CALL_INSN_FUNCTION_USAGE (insn); link;
+	   link = XEXP (link, 1))
+	{
+	  rtx x = XEXP (link, 0);
+	  if (GET_CODE (x) == CLOBBER)
+	    try_to_add_dest (XEXP (x, 0), rtx_obj_flags::IS_CLOBBER);
+	  else if (GET_CODE (x) == USE)
+	    try_to_add_src (XEXP (x, 0));
+	}
+    }
+  else
+    try_to_add_pattern (PATTERN (insn));
+
+  if (include_notes)
+    for (rtx note = REG_NOTES (insn); note; note = XEXP (note, 1))
+      if (REG_NOTE_KIND (note) == REG_EQUAL
+	  || REG_NOTE_KIND (note) == REG_EQUIV)
+	try_to_add_note (XEXP (note, 0));
+}
+
+/* Grow the storage by a bit while keeping the contents of the first
+   START elements.  */
+
+void
+vec_rtx_properties_base::grow (ptrdiff_t start)
+{
+  /* The same heuristic that vec uses.  */
+  ptrdiff_t new_elems = (ref_end - ref_begin) * 3 / 2;
+  if (ref_begin == m_storage)
+    {
+      ref_begin = XNEWVEC (rtx_obj_reference, new_elems);
+      if (start)
+	memcpy (ref_begin, m_storage, start * sizeof (rtx_obj_reference));
+    }
+  else
+    ref_begin = reinterpret_cast<rtx_obj_reference *>
+      (xrealloc (ref_begin, new_elems * sizeof (rtx_obj_reference)));
+  ref_iter = ref_begin + start;
+  ref_end = ref_begin + new_elems;
+}
 \f
 /* Return nonzero if X's old contents don't survive after INSN.
    This will be true if X is (cc0) or if X is a register and
diff --git a/gcc/rtlanal.h b/gcc/rtlanal.h
new file mode 100644
index 00000000000..322761b8cc0
--- /dev/null
+++ b/gcc/rtlanal.h
@@ -0,0 +1,334 @@
+/* Analyze RTL for GNU compiler.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* Note that for historical reasons, many rtlanal.c functions are
+   declared in rtl.h rather than here.  */
+
+#ifndef GCC_RTLANAL_H
+#define GCC_RTLANAL_H
+
+/* A dummy register value that represents the whole of variable memory.
+   Using ~0U means that arrays that track both registers and memory can
+   be indexed by regno + 1.  */
+const unsigned int MEM_REGNO = ~0U;
+
+/* Bitmasks of flags describing an rtx_obj_reference.  See the accessors
+   in the class for details.  */
+namespace rtx_obj_flags
+{
+  const uint16_t IS_READ = 1U << 0;
+  const uint16_t IS_WRITE = 1U << 1;
+  const uint16_t IS_CLOBBER = 1U << 2;
+  const uint16_t IS_PRE_POST_MODIFY = 1U << 3;
+  const uint16_t IS_MULTIREG = 1U << 4;
+  const uint16_t IN_MEM_LOAD = 1U << 5;
+  const uint16_t IN_MEM_STORE = 1U << 6;
+  const uint16_t IN_SUBREG = 1U << 7;
+  const uint16_t IN_NOTE = 1U << 8;
+
+  /* Flags that apply to all subrtxes of the rtx they were originally
+     added for.  */
+  static const uint16_t STICKY_FLAGS = IN_NOTE;
+}
+
+/* Contains information about a reference to a register or variable memory.  */
+class rtx_obj_reference
+{
+public:
+  rtx_obj_reference () = default;
+  rtx_obj_reference (unsigned int regno, uint16_t flags,
+		     machine_mode mode, unsigned int multireg_offset = 0);
+
+  bool is_reg () const { return regno != MEM_REGNO; }
+  bool is_mem () const { return regno == MEM_REGNO; }
+
+  /* True if the reference is a read or a write respectively.
+     Both flags are set in a read-modify-write context, such as
+     for read_modify_subreg_p.  */
+  bool is_read () const { return flags & rtx_obj_flags::IS_READ; }
+  bool is_write () const { return flags & rtx_obj_flags::IS_WRITE; }
+
+  /* True if IS_WRITE and if the write is a clobber rather than a set.  */
+  bool is_clobber () const { return flags & rtx_obj_flags::IS_CLOBBER; }
+
+  /* True if the reference is updated by an RTX_AUTOINC.  Both IS_READ
+     and IS_WRITE are also true if so.  */
+  bool is_pre_post_modify () const
+  {
+    return flags & rtx_obj_flags::IS_PRE_POST_MODIFY;
+  }
+
+  /* True if the register is part of a multi-register hard REG.  */
+  bool is_multireg () const { return flags & rtx_obj_flags::IS_MULTIREG; }
+
+  /* True if the reference occurs in the address of a load MEM.  */
+  bool in_mem_load () const { return flags & rtx_obj_flags::IN_MEM_LOAD; }
+
+  /* True if the reference occurs in the address of a store MEM.  */
+  bool in_mem_store () const { return flags & rtx_obj_flags::IN_MEM_STORE; }
+
+  /* True if the reference occurs in any kind of MEM address.  */
+  bool in_address () const { return in_mem_load () || in_mem_store (); }
+
+  /* True if the reference occurs in a SUBREG.  */
+  bool in_subreg () const { return flags & rtx_obj_flags::IN_SUBREG; }
+
+  /* True if the reference occurs in a REG_EQUAL or REG_EQUIV note.  */
+  bool in_note () const { return flags & rtx_obj_flags::IN_NOTE; }
+
+  /* The referenced register, or MEM_REGNO for variable memory.  */
+  unsigned int regno;
+
+  /* A bitmask of rtx_obj_flags.  */
+  unsigned int flags : 16;
+
+  /* The mode of the reference.  If IS_MULTIREG, this is the mode of
+     REGNO - MULTIREG_OFFSET.  */
+  machine_mode mode : 8;
+
+  /* If IS_MULTIREG, the offset of REGNO from the start of the register.  */
+  unsigned int multireg_offset : 8;
+};
+
+/* Construct a reference with the given fields.  */
+
+inline rtx_obj_reference::rtx_obj_reference (unsigned int regno, uint16_t flags,
+					     machine_mode mode,
+					     unsigned int multireg_offset)
+  : regno (regno),
+    flags (flags),
+    mode (mode),
+    multireg_offset (multireg_offset)
+{
+}
+
+/* Contains information about an rtx or an instruction, including a
+   list of rtx_obj_references.  The storage backing the list needs
+   to be filled in by assigning to REF_BEGIN and REF_END.  */
+
+class rtx_properties
+{
+public:
+  rtx_properties ();
+
+  void try_to_add_reg (const_rtx x, unsigned int flags = 0);
+  void try_to_add_dest (const_rtx x, unsigned int flags = 0);
+  void try_to_add_src (const_rtx x, unsigned int flags = 0);
+  void try_to_add_pattern (const_rtx pat);
+  void try_to_add_note (const_rtx x);
+  void try_to_add_insn (const rtx_insn *insn, bool include_notes);
+
+  iterator_range<rtx_obj_reference *> refs () const;
+
+  /* Return the number of rtx_obj_references that have been recorded.  */
+  size_t num_refs () const { return ref_iter - ref_begin; }
+
+  bool has_side_effects () const;
+
+  /* [REF_BEGIN, REF_END) is the maximum extent of the memory available
+     for recording references.  REF_ITER is the first unused entry.  */
+  rtx_obj_reference *ref_begin;
+  rtx_obj_reference *ref_iter;
+  rtx_obj_reference *ref_end;
+
+  /* True if the rtx includes an asm.  */
+  unsigned int has_asm : 1;
+
+  /* True if the rtx includes a call.  */
+  unsigned int has_call : 1;
+
+  /* True if the rtx includes an RTX_AUTOINC expression.  */
+  unsigned int has_pre_post_modify : 1;
+
+  /* True if the rtx contains volatile references, in the sense of
+     volatile_refs_p.  */
+  unsigned int has_volatile_refs : 1;
+
+  /* For future expansion.  */
+  unsigned int spare : 28;
+};
+
+inline rtx_properties::rtx_properties ()
+  : ref_begin (nullptr),
+    ref_iter (nullptr),
+    ref_end (nullptr),
+    has_asm (false),
+    has_call (false),
+    has_pre_post_modify (false),
+    has_volatile_refs (false),
+    spare (0)
+{
+}
+
+/* Like try_to_add_src, but treat X as being part of a REG_EQUAL or
+   REG_EQUIV note.  */
+
+inline void
+rtx_properties::try_to_add_note (const_rtx x)
+{
+  try_to_add_src (x, rtx_obj_flags::IN_NOTE);
+}
+
+/* Return true if the rtx has side effects, in the sense of
+   side_effects_p (except for side_effects_p's special handling
+   of combine.c clobbers).  */
+
+inline bool
+rtx_properties::has_side_effects () const
+{
+  return has_volatile_refs || has_pre_post_modify || has_call;
+}
+
+/* Return an iterator range for all the references, suitable for
+   range-based for loops.  */
+
+inline iterator_range<rtx_obj_reference *>
+rtx_properties::refs () const
+{
+  return { ref_begin, ref_iter };
+}
+
+/* BASE is derived from rtx_properties and provides backing storage
+   for REF_BEGIN.  It has a grow () method that increases the amount
+   of memory available if the initial allocation was too small.  */
+
+template<typename Base>
+class growing_rtx_properties : public Base
+{
+public:
+  template<typename... Args>
+  growing_rtx_properties (Args...);
+
+  template<typename AddFn>
+  void repeat (AddFn add);
+
+  /* Wrappers around the try_to_* functions that always succeed.  */
+  void add_dest (const_rtx x, unsigned int flags = 0);
+  void add_src (const_rtx x, unsigned int flags = 0);
+  void add_pattern (const_rtx pat);
+  void add_note (const_rtx x);
+  void add_insn (const rtx_insn *insn, bool include_notes);
+};
+
+template<typename Base>
+template<typename... Args>
+growing_rtx_properties<Base>::growing_rtx_properties (Args... args)
+  : Base (std::forward<Args> (args)...)
+{
+}
+
+/* Perform ADD until there is enough room to hold the result.  */
+
+template<typename Base>
+template<typename AddFn>
+inline void
+growing_rtx_properties<Base>::repeat (AddFn add)
+{
+  ptrdiff_t count = this->num_refs ();
+  for (;;)
+    {
+      add ();
+      /* This retries if the storage happened to be exactly the right size,
+	 but that's expected to be a rare case and so isn't worth
+	 optimizing for.  */
+      if (__builtin_expect (this->ref_iter != this->ref_end, 1))
+	break;
+      this->grow (count);
+    }
+}
+
+template<typename Base>
+inline void
+growing_rtx_properties<Base>::add_dest (const_rtx x, unsigned int flags)
+{
+  repeat ([&]() { this->try_to_add_dest (x, flags); });
+}
+
+template<typename Base>
+inline void
+growing_rtx_properties<Base>::add_src (const_rtx x, unsigned int flags)
+{
+  repeat ([&]() { this->try_to_add_src (x, flags); });
+}
+
+template<typename Base>
+inline void
+growing_rtx_properties<Base>::add_pattern (const_rtx pat)
+{
+  repeat ([&]() { this->try_to_add_pattern (pat); });
+}
+
+template<typename Base>
+inline void
+growing_rtx_properties<Base>::add_note (const_rtx x)
+{
+  repeat ([&]() { this->try_to_add_note (x); });
+}
+
+template<typename Base>
+inline void
+growing_rtx_properties<Base>::add_insn (const rtx_insn *insn, bool include_notes)
+{
+  repeat ([&]() { this->try_to_add_insn (insn, include_notes); });
+}
+
+/* A base class for vec_rtx_properties; see there for details.  */
+
+class vec_rtx_properties_base : public rtx_properties
+{
+  static const size_t SIZE = 32;
+
+public:
+  vec_rtx_properties_base ();
+  ~vec_rtx_properties_base ();
+
+protected:
+  void grow (ptrdiff_t);
+
+private:
+  rtx_obj_reference m_storage[SIZE];
+};
+
+inline vec_rtx_properties_base::vec_rtx_properties_base ()
+{
+  ref_begin = ref_iter = m_storage;
+  ref_end = m_storage + SIZE;
+}
+
+inline vec_rtx_properties_base::~vec_rtx_properties_base ()
+{
+  if (__builtin_expect (ref_begin != m_storage, 0))
+    free (ref_begin);
+}
+
+/* An rtx_properties that stores its references in a temporary array.
+   Like auto_vec, the array is initially on the stack, but can switch
+   to the heap if necessary.
+
+   The reason for implementing this as a derived class is that the
+   default on-stack size should be enough for the vast majority of
+   expressions and instructions.  It's therefore not worth paying
+   the cost of a conditional call to the grow code at every site that
+   records a new reference.  Instead, the rtx_properties code can use
+   trivial iterator updates for the common case, and in the rare case
+   that the vector needs to be resized, we can pay the cost of
+   collecting the references a second time.  */
+using vec_rtx_properties = growing_rtx_properties<vec_rtx_properties_base>;
+
+#endif
-- 
2.17.1



* [20/23] rtlanal: Add simple_regno_set
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (18 preceding siblings ...)
  2020-11-13  8:20 ` [19/23] rtlanal: Add some new helper classes Richard Sandiford
@ 2020-11-13  8:21 ` Richard Sandiford
  2020-11-25 20:31   ` Jeff Law
  2020-11-13  8:22 ` [21/23] doc: Add documentation for rtl-ssa Richard Sandiford
                   ` (3 subsequent siblings)
  23 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-13  8:21 UTC (permalink / raw)
  To: gcc-patches

This patch adds a routine for finding a “simple” SET for a register
definition.  See the comment in the patch for details.
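
As an illustration (a sketch only; find_simple_src is a made-up name
rather than part of the patch), a pass could use it to extract the
value assigned to a register DEST by INSN, when that assignment is
“simple” in the sense above:

    static rtx
    find_simple_src (rtx_insn *insn, rtx dest)
    {
      if (rtx set = simple_regno_set (PATTERN (insn), REGNO (dest)))
        return SET_SRC (set);
      return NULL_RTX;
    }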

gcc/
	* rtl.h (simple_regno_set): Declare.
	* rtlanal.c (simple_regno_set): New function.
---
 gcc/rtl.h     |  1 +
 gcc/rtlanal.c | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+)

diff --git a/gcc/rtl.h b/gcc/rtl.h
index e9df95b02c4..3915fae61e7 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -3539,6 +3539,7 @@ extern void set_insn_deleted (rtx_insn *);
 /* Functions in rtlanal.c */
 
 extern rtx single_set_2 (const rtx_insn *, const_rtx);
+extern rtx simple_regno_set (rtx, unsigned int);
 extern bool contains_symbol_ref_p (const_rtx);
 extern bool contains_symbolic_reference_p (const_rtx);
 extern bool contains_constant_pool_address_p (const_rtx);
diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 404813b7668..80e72d6049d 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -1455,6 +1455,39 @@ set_of (const_rtx pat, const_rtx insn)
   return data.found;
 }
 
+/* Check whether instruction pattern PAT contains a SET with the following
+   properties:
+
+   - the SET is executed unconditionally;
+   - the destination of the SET is write-only rather than read-write; and
+   - either:
+     - the destination of the SET is a REG that contains REGNO; or
+     - the destination of the SET is a SUBREG of such a REG.
+
+   If PAT does have a SET like that, return the set, otherwise return null.
+
+   This is intended to be an alternative to single_set for passes that
+   can handle patterns with multiple sets.  */
+rtx
+simple_regno_set (rtx pat, unsigned int regno)
+{
+  if (GET_CODE (pat) == PARALLEL)
+    {
+      int last = XVECLEN (pat, 0) - 1;
+      for (int i = 0; i < last; ++i)
+	if (rtx set = simple_regno_set (XVECEXP (pat, 0, i), regno))
+	  return set;
+
+      pat = XVECEXP (pat, 0, last);
+    }
+
+  if (GET_CODE (pat) == SET
+      && covers_regno_no_parallel_p (SET_DEST (pat), regno))
+    return pat;
+
+  return nullptr;
+}
+
 /* Add all hard register in X to *PSET.  */
 void
 find_all_hard_regs (const_rtx x, HARD_REG_SET *pset)
-- 
2.17.1



* [21/23] doc: Add documentation for rtl-ssa
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (19 preceding siblings ...)
  2020-11-13  8:21 ` [20/23] rtlanal: Add simple_regno_set Richard Sandiford
@ 2020-11-13  8:22 ` Richard Sandiford
  2020-11-30  6:26   ` Jeff Law
  2020-11-13  8:23 ` [PATCH 22/23] Add rtl-ssa Richard Sandiford
                   ` (2 subsequent siblings)
  23 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-13  8:22 UTC (permalink / raw)
  To: gcc-patches

This patch adds some documentation to rtl.texi about the SSA form.
It only really describes the high-level structure -- I think for
API-level stuff it's better to rely on function comments instead.

gcc/
	* doc/rtl.texi (RTL SSA): New node.
---
 gcc/doc/rtl.texi | 787 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 787 insertions(+)

diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 22af5731bb6..66236204c5e 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -39,6 +39,7 @@ form uses nested parentheses to indicate the pointers in the internal form.
 * Debug Information:: Expressions representing debugging information.
 * Insns::             Expression types for entire insns.
 * Calls::             RTL representation of function call insns.
+* RTL SSA::           An on-the-side SSA form for RTL.
 * Sharing::           Some expressions are unique; others *must* be copied.
 * Reading RTL::       Reading textual RTL from a file.
 @end menu
@@ -4420,6 +4421,792 @@ function.  Similarly, if registers other than those in
 containing a single @code{clobber} follow immediately after the call to
 indicate which registers.
 
+@node RTL SSA
+@section On-the-Side SSA Form for RTL
+@cindex SSA, RTL form
+@cindex RTL SSA
+
+The pattern of an individual RTL instruction describes which registers
+are inputs to that instruction and which registers are outputs from
+that instruction.  However, it is often useful to know where the
+definition of a register input comes from and where the result of
+a register output is used.  One way of obtaining this information
+is to use the RTL SSA form, which provides a Static Single Assignment
+representation of the RTL instructions.
+
+The RTL SSA code is located in the @file{rtl-ssa} subdirectory of the GCC
+source tree.  This section only gives a brief overview of it; please
+see the comments in the source code for more details.
+
+@menu
+* Using RTL SSA::             What a pass needs to do to use the RTL SSA form
+* RTL SSA Instructions::      How instructions are represented and organized
+* RTL SSA Basic Blocks::      How instructions are grouped into blocks
+* RTL SSA Resources::         How registers and memory are represented
+* RTL SSA Accesses::          How register and memory accesses are represented
+* RTL SSA Phi Nodes::         How multiple sources are combined into one
+* RTL SSA Access Lists::      How accesses are chained together
+* Changing RTL Instructions:: How to use the RTL SSA framework to change insns
+@end menu
+
+@node Using RTL SSA
+@subsection Using RTL SSA in a pass
+
+A pass that wants to use the RTL SSA form should start with the following:
+
+@smallexample
+#define INCLUDE_ALGORITHM
+#define INCLUDE_FUNCTIONAL
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "df.h"
+#include "rtl-ssa.h"
+@end smallexample
+
+All the RTL SSA code is contained in the @code{rtl_ssa} namespace,
+so most passes will then want to do:
+
+@smallexample
+using namespace rtl_ssa;
+@end smallexample
+
+However, this is purely a matter of taste, and the examples in the rest of
+this section do not require it.
+
+The RTL SSA representation is an optional on-the-side feature that applies
+on top of the normal RTL instructions.  It is currently local to individual
+RTL passes and is not maintained across passes.
+
+However, in order to allow the RTL SSA information to be preserved across
+passes in future, @samp{crtl->ssa} points to the current function's
+SSA form (if any).  Passes that want to use the RTL SSA form should
+first do:
+
+@smallexample
+crtl->ssa = new rtl_ssa::function_info (@var{fn});
+@end smallexample
+
+where @var{fn} is the function that the pass is processing.
+(Passes that are @code{using namespace rtl_ssa} do not need
+the @samp{rtl_ssa::}.)
+
+Once the pass has finished with the SSA form, it should do the following:
+
+@smallexample
+free_dominance_info (CDI_DOMINATORS);
+if (crtl->ssa->perform_pending_updates ())
+  cleanup_cfg (0);
+
+delete crtl->ssa;
+crtl->ssa = nullptr;
+@end smallexample
+
+The @code{free_dominance_info} call is necessary because
+dominance information is not currently maintained between RTL passes.
+The next two lines commit any changes to the RTL instructions that
+were queued for later; see the comment above the declaration of
+@code{perform_pending_updates} for details.  The final two lines
+discard the RTL SSA form and free the associated memory.
+
+@node RTL SSA Instructions
+@subsection RTL SSA Instructions
+
+@cindex RPO
+@cindex reverse postorder
+@cindex instructions, RTL SSA
+@findex rtl_ssa::insn_info
+RTL SSA instructions are represented by an @code{rtl_ssa::insn_info}.
+These instructions are chained together in a single list that follows
+a reverse postorder (RPO) traversal of the function.  This means that
+if any path through the function can execute an instruction @var{I1}
+and then later execute an instruction @var{I2} for the first time,
+@var{I1} appears before @var{I2} in the list@footnote{Note that this
+order is different from the order of the underlying RTL instructions,
+which follow machine code order instead.}.
+
+Two RTL SSA instructions can be compared to find which instruction
+occurs earlier than the other in the RPO@.  One way to do this is
+to use the C++ comparison operators, such as:
+
+@example
+*@var{insn1} < *@var{insn2}
+@end example
+
+Another way is to use the @code{compare_with} function:
+
+@example
+@var{insn1}->compare_with (@var{insn2})
+@end example
+
+This expression is greater than zero if @var{insn1} comes after @var{insn2}
+in the RPO, less than zero if @var{insn1} comes before @var{insn2} in the
+RPO, or zero if @var{insn1} and @var{insn2} are the same.  This order is
+maintained even if instructions are added to the function or moved around.
+
+The main purpose of @code{rtl_ssa::insn_info} is to hold
+SSA information about an instruction.  However, it also caches
+certain properties of the instruction, such as whether it is an
+inline assembly instruction, whether it has volatile accesses, and so on.
+
+@node RTL SSA Basic Blocks
+@subsection RTL SSA Basic Blocks
+
+@cindex basic blocks, RTL SSA
+@findex basic_block
+@findex rtl_ssa::bb_info
+RTL SSA instructions (@pxref{RTL SSA Instructions}) are organized into
+basic blocks, with each block being represented by an @code{rtl_ssa::bb_info}.
+There is a one-to-one mapping between these @code{rtl_ssa::bb_info}
+structures and the underlying CFG @code{basic_block} structures
+(@pxref{Basic Blocks}).
+
+@cindex ``real'' instructions, RTL SSA
+@anchor{real RTL SSA insns}
+If a CFG basic block @var{bb} contains an RTL instruction @var{insn},
+the RTL SSA representation of @var{bb} also contains an RTL SSA representation
+of @var{insn}@footnote{Note that this excludes non-instruction things like
+@code{note}s and @code{barrier}s that also appear in the chain of RTL
+instructions.}.  Within RTL SSA, these instructions are referred to as
+``real'' instructions.  These real instructions fall into two groups:
+debug instructions and nondebug instructions.  Only nondebug instructions
+should affect code generation decisions.
+
+In addition, each RTL SSA basic block has two ``artificial''
+instructions: a ``head'' instruction that comes before all the real
+instructions and an ``end'' instruction that comes after all real
+instructions.  These instructions exist to represent things that
+are conceptually defined or used at the start and end of a basic block.
+The instructions always exist, even if they do not currently do anything.
+
+Like instructions, these blocks are chained together in a reverse
+postorder.  This list includes the entry block (which always comes
+first) and the exit block (which always comes last).
+
+@cindex extended basic blocks, RTL SSA
+@findex rtl_ssa::ebb_info
+RTL SSA basic blocks are chained together into ``extended basic blocks''
+(EBBs), represented by an @code{rtl_ssa::ebb_info}.  Extended basic
+blocks contain one or more basic blocks.  They have the property
+that if a block @var{bby} comes immediately after a block @var{bbx}
+in an EBB, then @var{bby} can only be reached by @var{bbx}; in other words,
+@var{bbx} is the sole predecessor of @var{bby}.
+
+Each extended basic block starts with an artificial ``phi node''
+instruction.  This instruction defines all phi nodes for the EBB
+(@pxref{RTL SSA Phi Nodes}).  (Individual blocks in an EBB do not
+need phi nodes because their live values can only come from one source.)
+
+The contents of a function are therefore represented using a
+four-level hierarchy:
+
+@itemize @bullet
+@item
+functions (@code{rtl_ssa::function_info}), which contain @dots{}
+
+@item
+extended basic blocks (@code{rtl_ssa::ebb_info}), which contain @dots{}
+
+@item
+basic blocks (@code{rtl_ssa::bb_info}), which contain @dots{}
+
+@item
+instructions (@code{rtl_ssa::insn_info})
+@end itemize
+
+In dumps, a basic block is identified as @code{bb@var{n}}, where @var{n}
+is the index of the associated CFG @code{basic_block} structure.
+An EBB is in turn identified by the index of its first block.
+For example, an EBB that contains @code{bb10}, @code{bb5}, @code{bb6}
+and @code{bb9} is identified as @code{ebb10}.
+
+@node RTL SSA Resources
+@subsection RTL SSA Resources
+
+The RTL SSA form tracks two types of ``resource'': registers and memory.
+Each hard and pseudo register is a separate resource.  Memory is a
+single unified resource, like it is in GIMPLE (@pxref{GIMPLE}).
+
+Each resource has a unique identifier.  The unique identifier for a
+register is simply its register number.  The unique identifier for
+memory is a special register number called @code{MEM_REGNO}.
+
+Since resource numbers so closely match register numbers, it is sometimes
+convenient to refer to them simply as register numbers, or ``regnos''
+for short.  However, the RTL SSA form also provides an abstraction
+of resources in the form of @code{rtl_ssa::resource_info}.
+This is a lightweight class that records both the regno of a resource
+and the @code{machine_mode} that the resource has (@pxref{Machine Modes}).
+It has functions for testing whether a resource is a register or memory.
+In principle it could be extended to other kinds of resource in future.
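+
+For example (a sketch of the intended usage; the exact member names
+are described in the source comments), code that is given a
+@code{rtl_ssa::resource_info} @var{res} might do:
+
+@smallexample
+if (@var{res}.is_reg ())
+  @dots{} treat @var{res}.regno as a register number @dots{}
+else if (@var{res}.is_mem ())
+  @dots{} treat the access as a reference to memory @dots{}
+@end smallexample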
+
+@node RTL SSA Accesses
+@subsection RTL SSA Register and Memory Accesses
+
+In the RTL SSA form, most reads or writes of a resource are
+represented as an @code{rtl_ssa::access_info}@footnote{The exceptions
+are call clobbers, which are generally represented separately.
+See the comment above @code{rtl_ssa::insn_info} for details.}.
+These @code{rtl_ssa::access_info}s are organized into the following
+class hierarchy:
+
+@findex rtl_ssa::access_info
+@findex rtl_ssa::use_info
+@findex rtl_ssa::def_info
+@findex rtl_ssa::clobber_info
+@findex rtl_ssa::set_info
+@findex rtl_ssa::phi_info
+@smallexample
+rtl_ssa::access_info
+  |
+  +-- rtl_ssa::use_info
+  |
+  +-- rtl_ssa::def_info
+        |
+        +-- rtl_ssa::clobber_info
+        |
+        +-- rtl_ssa::set_info
+              |
+              +-- rtl_ssa::phi_info
+@end smallexample
+
+A @code{rtl_ssa::use_info} represents a read or use of a resource and
+a @code{rtl_ssa::def_info} represents a write or definition of a resource.
+As in the main RTL representation, there are two basic types of
+definition: clobbers and sets.  The difference is that a clobber
+leaves the register with an unspecified value that cannot be used
+or relied on by later instructions, while a set leaves the register
+with a known value that later instructions could use if they wanted to.
+A @code{rtl_ssa::clobber_info} represents a clobber and
+a @code{rtl_ssa::set_info} represents a set.
+
+Each @code{rtl_ssa::use_info} records which single @code{rtl_ssa::set_info}
+provides the value of the resource; this is null if the resource is
+completely undefined at the point of use.  Each @code{rtl_ssa::set_info}
+in turn records all the @code{rtl_ssa::use_info}s that use its value.
+
+If a value of a resource can come from multiple sources,
+a @code{rtl_ssa::phi_info} brings those multiple sources together
+into a single definition (@pxref{RTL SSA Phi Nodes}).
+
+@node RTL SSA Phi Nodes
+@subsection RTL SSA Phi Nodes
+
+@cindex phi nodes, RTL SSA
+@findex rtl_ssa::phi_info
+If a resource is live on entry to an extended basic block and if the
+resource's value can come from multiple sources, the extended basic block
+has a ``phi node'' that collects together these multiple sources.
+The phi node conceptually has one input for each incoming edge of
+the extended basic block, with the input specifying the value of
+the resource on that edge.  For example, suppose a function contains
+the following RTL:
+
+@smallexample
+;; Basic block bb3
+@dots{}
+(set (reg:SI R1) (const_int 0))  ;; A
+(set (pc) (label_ref bb5))
+
+;; Basic block bb4
+@dots{}
+(set (reg:SI R1) (const_int 1))  ;; B
+;; Fall through
+
+;; Basic block bb5
+;; preds: bb3, bb4
+;; live in: R1 @dots{}
+(code_label bb5)
+@dots{}
+(set (reg:SI R2)
+     (plus:SI (reg:SI R1) @dots{}))  ;; C
+@end smallexample
+
+The value of R1 on entry to block 5 can come from either A or B@.
+The extended basic block that contains block 5 would therefore have a
+phi node with two inputs: the first input would have the value of
+R1 defined by A and the second input would have the value of
+R1 defined by B@.  This phi node would then provide the value of
+R1 for C (assuming that R1 does not change again between
+the start of block 5 and C).
+
+Since RTL is not a ``native'' SSA representation, these phi nodes
+simply collect together definitions that already exist.  Each input
+to a phi node for a resource @var{R} is itself a definition of
+resource @var{R} (or is null if the resource is completely
+undefined for a particular incoming edge).  This is in contrast
+to a native SSA representation like GIMPLE, where the phi inputs
+can be arbitrary expressions.  As a result, RTL SSA phi nodes
+never involve ``hidden'' moves: all moves are instead explicit.
+
+Phi nodes are represented as a @code{rtl_ssa::phi_info}.
+Each input to a phi node is represented as an @code{rtl_ssa::use_info}.
+
+@node RTL SSA Access Lists
+@subsection RTL SSA Access Lists
+
+All the definitions of a resource are chained together in reverse postorder.
+In general, this list can contain an arbitrary mix of both sets
+(@code{rtl_ssa::set_info}) and clobbers (@code{rtl_ssa::clobber_info}).
+However, it is often useful to skip over all intervening clobbers
+of a resource in order to find the next set.  The list is constructed
+in such a way that this can be done in amortized constant time.
+
+All uses (@code{rtl_ssa::use_info}) of a given set are also chained
+together into a list.  This list of uses is divided into three parts:
+
+@enumerate
+@item
+uses by ``real'' nondebug instructions (@pxref{real RTL SSA insns})
+
+@item
+uses by real debug instructions
+
+@item
+uses by phi nodes (@pxref{RTL SSA Phi Nodes})
+@end enumerate
+
+The first and second parts individually follow reverse postorder.
+The third part has no particular order.
+
+@cindex degenerate phi node, RTL SSA
+The last use by a real nondebug instruction always comes earlier in
+the reverse postorder than the next definition of the resource (if any).
+This means that the accesses follow a linear sequence of the form:
+
+@itemize @bullet
+@item
+first definition of resource R
+
+@itemize @bullet
+@item
+first use by a real nondebug instruction of the first definition of resource R
+
+@item
+@dots{}
+
+@item
+last use by a real nondebug instruction of the first definition of resource R
+@end itemize
+
+@item
+second definition of resource R
+
+@itemize @bullet
+@item
+first use by a real nondebug instruction of the second definition of resource R
+
+@item
+@dots{}
+
+@item
+last use by a real nondebug instruction of the second definition of resource R
+@end itemize
+
+@item
+@dots{}
+
+@item
+last definition of resource R
+
+@itemize @bullet
+@item
+first use by a real nondebug instruction of the last definition of resource R
+
+@item
+@dots{}
+
+@item
+last use by a real nondebug instruction of the last definition of resource R
+@end itemize
+@end itemize
+
+(Note that clobbers never have uses; only sets do.)
+
+This linear view is easy to achieve when there is only a single definition
+of a resource, which is commonly true for pseudo registers.  However,
+things are more complex if code has a structure like the following:
+
+@smallexample
+// ebb2, bb2
+R = @var{va};        // A
+if (@dots{})
+  @{
+    // ebb2, bb3
+    use1 (R);  // B
+    @dots{}
+    R = @var{vc};    // C
+  @}
+else
+  @{
+    // ebb4, bb4
+    use2 (R);  // D
+  @}
+@end smallexample
+
+The list of accesses would begin as follows:
+
+@itemize @bullet
+@item
+definition of R by A
+
+@itemize @bullet
+@item
+use of A's definition of R by B
+@end itemize
+
+@item
+definition of R by C
+@end itemize
+
+The next access to R is in D, but the value of R that D uses comes from
+A rather than C@.
+
+This is resolved by adding a phi node for @code{ebb4}.  All inputs to this
+phi node have the same value, which in the example above is A's definition
+of R@.  A phi node would not normally be necessary when all of its
+inputs are equal, so phi nodes like this are referred to as
+``degenerate'' phi nodes.
+
+The full list of accesses to R is therefore:
+
+@itemize @bullet
+@item
+definition of R by A
+
+@itemize @bullet
+@item
+use of A's definition of R by B
+@end itemize
+
+@item
+definition of R by C
+
+@item
+definition of R by ebb4's phi instruction, with the input coming from A
+
+@itemize @bullet
+@item
+use of ebb4's phi definition of R by D
+@end itemize
+@end itemize
+
+Note that A's definition is also used by ebb4's phi node, but this
+use belongs to the third part of the use list described above and
+so does not form part of the linear sequence.
+
+It is possible to ``look through'' any degenerate phi to the ultimate
+definition using the function @code{look_through_degenerate_phi}.
+Note that the input to a degenerate phi is never itself provided
+by a degenerate phi.
+
+At present, the SSA form takes this principle one step further
+and guarantees that, for any given resource @var{res}, one of the
+following is true:
+
+@itemize
+@item
+The resource has a single definition @var{def}, which is not a phi node.
+Excluding uses of undefined registers, all uses of @var{res} by real
+nondebug instructions use the value provided by @var{def}.
+
+@item
+Excluding uses of undefined registers, all uses of @var{res} use
+values provided by definitions that occur earlier in the same
+extended basic block.  These definitions might come from phi nodes
+or from real instructions.
+@end itemize
+
+@node Changing RTL Instructions
+@subsection Using the RTL SSA framework to change instructions
+
+@findex rtl_ssa::insn_change
+There are various routines that help to change a single RTL instruction
+or a group of RTL instructions while keeping the RTL SSA form up-to-date.
+This section first describes the process for changing a single instruction,
+then goes on to describe the differences when changing multiple instructions.
+
+@menu
+* Changing One RTL SSA Instruction::
+* Changing Multiple RTL SSA Instructions::
+@end menu
+
+@node Changing One RTL SSA Instruction
+@subsubsection Changing One RTL SSA Instruction
+
+Before making a change, passes should first use a statement like the
+following:
+
+@smallexample
+auto attempt = crtl->ssa->new_change_attempt ();
+@end smallexample
+
+Here, @code{attempt} is an RAII object that should remain in scope
+for the entire change attempt.  It automatically frees temporary
+memory related to the changes when it goes out of scope.
+
+Next, the pass should create an @code{rtl_ssa::insn_change} object
+for the instruction that it wants to change.  This object specifies
+several things:
+
+@itemize @bullet
+@item
+what the instruction's new list of uses should be (@code{new_uses}).
+By default this is the same as the instruction's current list of uses.
+
+@item
+what the instruction's new list of definitions should be (@code{new_defs}).
+By default this is the same as the instruction's current list of
+definitions.
+
+@item
+where the instruction should be located (@code{move_range}).
+This is a range of instructions after which the instruction could
+be placed, represented as an @code{rtl_ssa::insn_range}.
+By default the instruction must remain at its current position.
+@end itemize
+
+If a pass was attempting to change all these properties of an instruction
+@code{insn}, it might do something like this:
+
+@smallexample
+rtl_ssa::insn_change change (insn);
+change.new_defs = @dots{};
+change.new_uses = @dots{};
+change.move_range = @dots{};
+@end smallexample
+
+This @code{rtl_ssa::insn_change} only describes something that the
+pass @emph{might} do; at this stage, nothing has actually changed.
+
+As noted above, the default @code{move_range} requires the instruction
+to remain where it is.  At the other extreme, it is possible to allow
+the instruction to move anywhere within its extended basic block,
+provided that all the new uses and definitions can be performed
+at the new location.  The way to do this is:
+
+@smallexample
+change.move_range = insn->ebb ()->insn_range ();
+@end smallexample
+
+In either case, the next step is to make sure that the move range is
+consistent with the new uses and definitions.  The way to do this is:
+
+@smallexample
+if (!rtl_ssa::restrict_movement (change))
+  return false;
+@end smallexample
+
+This function tries to limit @code{move_range} to a range of instructions
+at which @code{new_uses} and @code{new_defs} can be correctly performed.
+It returns true on success or false if no suitable location exists.
+
+The pass should also tentatively change the pattern of the instruction
+to whatever form the pass wants the instruction to have.  This should use
+the facilities provided by @file{recog.c}.  For example:
+
+@smallexample
+rtl_insn *rtl = insn->rtl ();
+insn_change_watermark watermark;
+validate_change (rtl, &PATTERN (rtl), new_pat, 1);
+@end smallexample
+
+will tentatively replace @code{insn}'s pattern with @code{new_pat}.
+
+These changes and the construction of the @code{rtl_ssa::insn_change}
+can happen in either order or be interleaved.
+
+After the tentative changes to the instruction are complete,
+the pass should check whether the new pattern matches a target
+instruction or satisfies the requirements of an inline asm:
+
+@smallexample
+if (!rtl_ssa::recog (change))
+  return false;
+@end smallexample
+
+This step might change the instruction pattern further in order to
+make it match.  It might also add new definitions or restrict the range
+of the move.  For example, if the new pattern did not match in its original
+form, but could be made to match by adding a clobber of the flags
+register, @code{rtl_ssa::recog} will check whether the flags register
+is free at an appropriate point.  If so, it will add a clobber of the
+flags register to @code{new_defs} and restrict @code{move_range} to
+the locations at which the flags register can be safely clobbered.
+
+Even if the proposed new instruction is valid according to
+@code{rtl_ssa::recog}, the change might not be worthwhile.
+For example, when optimizing for speed, the new instruction might
+turn out to be slower than the original one.  When optimizing for
+size, the new instruction might turn out to be bigger than the
+original one.
+
+Passes should check for this case using @code{change_is_worthwhile}.
+For example:
+
+@smallexample
+if (!rtl_ssa::change_is_worthwhile (change))
+  return false;
+@end smallexample
+
+If the change passes this test too, the pass can perform the change using:
+
+@smallexample
+confirm_change_group ();
+crtl->ssa->change_insn (change);
+@end smallexample
+
+Putting all this together, the change has the following form:
+
+@smallexample
+auto attempt = crtl->ssa->new_change_attempt ();
+
+rtl_ssa::insn_change change (insn);
+change.new_defs = @dots{};
+change.new_uses = @dots{};
+change.move_range = @dots{};
+
+if (!rtl_ssa::restrict_movement (change))
+  return false;
+
+insn_change_watermark watermark;
+// Use validate_change etc. to change INSN's pattern.
+@dots{}
+if (!rtl_ssa::recog (change)
+    || !rtl_ssa::change_is_worthwhile (change))
+  return false;
+
+confirm_change_group ();
+crtl->ssa->change_insn (change);
+@end smallexample
+
+@node Changing Multiple RTL SSA Instructions
+@subsubsection Changing Multiple RTL SSA Instructions
+
+The process for changing multiple instructions is similar
+to the process for changing single instructions
+(@pxref{Changing One RTL SSA Instruction}).  The pass should
+again start the change attempt with:
+
+@smallexample
+auto attempt = crtl->ssa->new_change_attempt ();
+@end smallexample
+
+and keep @code{attempt} in scope for the duration of the change
+attempt.  It should then construct an @code{rtl_ssa::insn_change}
+for each change that it wants to make.
+
+After this, it should combine the changes into a sequence of
+@code{rtl_ssa::insn_change} pointers.  This sequence must be in
+reverse postorder; the instructions will remain strictly in the
+order that the sequence specifies.
+
+For example, if a pass is changing exactly two instructions,
+it might do:
+
+@smallexample
+rtl_ssa::insn_change *changes[] = @{ &change1, &change2 @};
+@end smallexample
+
+where @code{change1}'s instruction must come before @code{change2}'s.
+Alternatively, if the pass is changing a variable number of
+instructions, it might build up the sequence in a
+@code{vec<rtl_ssa::insn_change *>}.
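+
+For example (a sketch using the normal GCC vector routines):
+
+@smallexample
+auto_vec<rtl_ssa::insn_change *> changes;
+changes.safe_push (&change1);
+@dots{}
+changes.safe_push (&change2);
+@end smallexample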
+
+By default, @code{rtl_ssa::restrict_movement} assumes that all
+instructions other than the one passed to it will remain in their
+current positions and will retain their current uses and definitions.
+When changing multiple instructions, it is usually more effective
+to ignore the other instructions that are changing.  The sequencing
+described above ensures that the changing instructions remain
+in the correct order with respect to each other.
+The way to do this is:
+
+@smallexample
+if (!rtl_ssa::restrict_movement (change, insn_is_changing (changes)))
+  return false;
+@end smallexample
+
+Similarly, when @code{rtl_ssa::recog} is detecting
+whether a register can be clobbered, it by default assumes that
+all other instructions will remain in their current positions and
+retain their current form.  It is again more effective to ignore
+changing instructions (which might, for example, no longer need
+to clobber the flags register).  The way to do this is:
+
+@smallexample
+if (!rtl_ssa::recog (change, insn_is_changing (changes)))
+  return false;
+@end smallexample
+
+When changing multiple instructions, the important question is usually
+not whether each individual change is worthwhile, but whether the changes
+as a whole are worthwhile.  The way to test this is:
+
+@smallexample
+if (!rtl_ssa::changes_are_worthwhile (changes))
+  return false;
+@end smallexample
+
+The process for changing single instructions makes sure that one
+@code{rtl_ssa::insn_change} in isolation is valid.  But when changing
+multiple instructions, it is also necessary to test whether the
+sequence as a whole is valid.  For example, it might be impossible
+to satisfy all of the @code{move_range}s at once.
+
+Therefore, once the pass has a sequence of changes that are
+individually correct, it should use:
+
+@smallexample
+if (!crtl->ssa->verify_insn_changes (changes))
+  return false;
+@end smallexample
+
+to check whether the sequence as a whole is valid.  If all checks pass,
+the final step is:
+
+@smallexample
+confirm_change_group ();
+crtl->ssa->change_insns (changes);
+@end smallexample
+
+Putting all this together, the process for a two-instruction change is:
+
+@smallexample
+auto attempt = crtl->ssa->new_change_attempt ();
+
+rtl_ssa::insn_change change (insn1);
+change1.new_defs = @dots{};
+change1.new_uses = @dots{};
+change1.move_range = @dots{};
+
+rtl_ssa::insn_change change2 (insn2);
+change2.new_defs = @dots{};
+change2.new_uses = @dots{};
+change2.move_range = @dots{};
+
+rtl_ssa::insn_change *changes[] = @{ &change1, &change2 @};
+
+auto is_changing = insn_is_changing (changes);
+if (!rtl_ssa::restrict_movement (change1, is_changing)
+    || !rtl_ssa::restrict_movement (change2, is_changing))
+  return false;
+
+insn_change_watermark watermark;
+// Use validate_change etc. to change INSN1's and INSN2's patterns.
+@dots{}
+if (!rtl_ssa::recog (change1, is_changing)
+    || !rtl_ssa::recog (change2, is_changing)
+    || !rtl_ssa::changes_are_worthwhile (changes)
+    || !crtl->ssa->verify_insn_changes (changes))
+  return false;
+
+confirm_change_group ();
+crtl->ssa->change_insns (changes);
+@end smallexample
+
 @node Sharing
 @section Structure Sharing Assumptions
 @cindex sharing of RTL components
-- 
2.17.1



* [PATCH 22/23] Add rtl-ssa
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (20 preceding siblings ...)
  2020-11-13  8:22 ` [21/23] doc: Add documentation for rtl-ssa Richard Sandiford
@ 2020-11-13  8:23 ` Richard Sandiford
  2020-12-16  3:31   ` Jeff Law
  2020-11-13  8:24 ` [PATCH 23/23] fwprop: Rewrite to use RTL SSA Richard Sandiford
  2020-11-25 19:58 ` [00/23] Make fwprop use an on-the-side RTL SSA representation Jeff Law
  23 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-13  8:23 UTC (permalink / raw)
  To: gcc-patches

This patch adds the RTL SSA infrastructure itself.  The following
fwprop.c patch will make use of it.

gcc/
	* configure.ac: Add rtl-ssa to the list of dependency directories.
	* configure: Regenerate.
	* Makefile.in (rtl-ssa-warn): New variable.
	(OBJS): Add the rtl-ssa object files.
	* emit-rtl.h (rtl_data::ssa): New field.
	* rtl-ssa.h: New file.
	* system.h: Include <functional> when INCLUDE_FUNCTIONAL is defined.
	* rtl-ssa: New directory.
---
 gcc/Makefile.in            |    6 +
 gcc/configure              |    2 +-
 gcc/configure.ac           |    2 +-
 gcc/emit-rtl.h             |    3 +
 gcc/rtl-ssa.h              |   71 ++
 gcc/rtl-ssa/access-utils.h |  553 +++++++++++++
 gcc/rtl-ssa/accesses.cc    | 1592 ++++++++++++++++++++++++++++++++++++
 gcc/rtl-ssa/accesses.h     | 1032 +++++++++++++++++++++++
 gcc/rtl-ssa/blocks.cc      | 1146 ++++++++++++++++++++++++++
 gcc/rtl-ssa/blocks.h       |  301 +++++++
 gcc/rtl-ssa/change-utils.h |  137 ++++
 gcc/rtl-ssa/changes.cc     | 1025 +++++++++++++++++++++++
 gcc/rtl-ssa/changes.h      |  118 +++
 gcc/rtl-ssa/functions.cc   |  325 ++++++++
 gcc/rtl-ssa/functions.h    |  433 ++++++++++
 gcc/rtl-ssa/insn-utils.h   |   46 ++
 gcc/rtl-ssa/insns.cc       |  718 ++++++++++++++++
 gcc/rtl-ssa/insns.h        |  505 ++++++++++++
 gcc/rtl-ssa/internals.inl  |  682 +++++++++++++++
 gcc/rtl-ssa/is-a.inl       |   98 +++
 gcc/rtl-ssa/member-fns.inl |  928 +++++++++++++++++++++
 gcc/rtl-ssa/movement.h     |  335 ++++++++
 gcc/system.h               |    3 +
 23 files changed, 10059 insertions(+), 2 deletions(-)
 create mode 100644 gcc/rtl-ssa.h
 create mode 100644 gcc/rtl-ssa/access-utils.h
 create mode 100644 gcc/rtl-ssa/accesses.cc
 create mode 100644 gcc/rtl-ssa/accesses.h
 create mode 100644 gcc/rtl-ssa/blocks.cc
 create mode 100644 gcc/rtl-ssa/blocks.h
 create mode 100644 gcc/rtl-ssa/change-utils.h
 create mode 100644 gcc/rtl-ssa/changes.cc
 create mode 100644 gcc/rtl-ssa/changes.h
 create mode 100644 gcc/rtl-ssa/functions.cc
 create mode 100644 gcc/rtl-ssa/functions.h
 create mode 100644 gcc/rtl-ssa/insn-utils.h
 create mode 100644 gcc/rtl-ssa/insns.cc
 create mode 100644 gcc/rtl-ssa/insns.h
 create mode 100644 gcc/rtl-ssa/internals.inl
 create mode 100644 gcc/rtl-ssa/is-a.inl
 create mode 100644 gcc/rtl-ssa/member-fns.inl
 create mode 100644 gcc/rtl-ssa/movement.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 900bf11b0ba..e6907845118 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -207,6 +207,7 @@ VALGRIND_DRIVER_DEFINES = @valgrind_path_defines@
 # This is how we control whether or not the additional warnings are applied.
 .-warn = $(STRICT_WARN)
 build-warn = $(STRICT_WARN)
+rtl-ssa-warn = $(STRICT_WARN)
 GCC_WARN_CFLAGS = $(LOOSE_WARN) $(C_LOOSE_WARN) $($(@D)-warn) $(if $(filter-out $(STRICT_WARN),$($(@D)-warn)),,$(C_STRICT_WARN)) $(NOCOMMON_FLAG) $($@-warn)
 GCC_WARN_CXXFLAGS = $(LOOSE_WARN) $($(@D)-warn) $(NOCOMMON_FLAG) $($@-warn)
 
@@ -1519,6 +1520,11 @@ OBJS = \
 	reorg.o \
 	resource.o \
 	rtl-error.o \
+	rtl-ssa/accesses.o \
+	rtl-ssa/blocks.o \
+	rtl-ssa/changes.o \
+	rtl-ssa/functions.o \
+	rtl-ssa/insns.o \
 	rtl-tests.o \
 	rtl.o \
 	rtlhash.o \
diff --git a/gcc/configure b/gcc/configure
index 9d2fd0dc30b..04ba4617e0d 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -32769,7 +32769,7 @@ $as_echo "$as_me: executing $ac_file commands" >&6;}
     "depdir":C) $SHELL $ac_aux_dir/mkinstalldirs $DEPDIR ;;
     "gccdepdir":C)
   ${CONFIG_SHELL-/bin/sh} $ac_aux_dir/mkinstalldirs build/$DEPDIR
-  for lang in $subdirs c-family common analyzer
+  for lang in $subdirs c-family common analyzer rtl-ssa
   do
       ${CONFIG_SHELL-/bin/sh} $ac_aux_dir/mkinstalldirs $lang/$DEPDIR
   done ;;
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 73034bb902b..716679d0096 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -1242,7 +1242,7 @@ AC_CHECK_HEADERS(ext/hash_map)
 ZW_CREATE_DEPDIR
 AC_CONFIG_COMMANDS([gccdepdir],[
   ${CONFIG_SHELL-/bin/sh} $ac_aux_dir/mkinstalldirs build/$DEPDIR
-  for lang in $subdirs c-family common analyzer
+  for lang in $subdirs c-family common analyzer rtl-ssa
   do
       ${CONFIG_SHELL-/bin/sh} $ac_aux_dir/mkinstalldirs $lang/$DEPDIR
   done], [subdirs="$subdirs" ac_aux_dir=$ac_aux_dir DEPDIR=$DEPDIR])
diff --git a/gcc/emit-rtl.h b/gcc/emit-rtl.h
index 4658c4aa3fd..b20cd76ea74 100644
--- a/gcc/emit-rtl.h
+++ b/gcc/emit-rtl.h
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 class temp_slot;
 typedef class temp_slot *temp_slot_p;
 class predefined_function_abi;
+namespace rtl_ssa { class function_info; }
 
 /* Information mainlined about RTL representation of incoming arguments.  */
 struct GTY(()) incoming_args {
@@ -73,6 +74,8 @@ struct GTY(()) rtl_data {
      different ABIs.  */
   const predefined_function_abi *GTY((skip)) abi;
 
+  rtl_ssa::function_info *GTY((skip)) ssa;
+
   /* For function.c  */
 
   /* # of bytes of outgoing arguments.  If ACCUMULATE_OUTGOING_ARGS is
diff --git a/gcc/rtl-ssa.h b/gcc/rtl-ssa.h
new file mode 100644
index 00000000000..60cdad03dc0
--- /dev/null
+++ b/gcc/rtl-ssa.h
@@ -0,0 +1,71 @@
+// On-the-side RTL SSA representation                               -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef GCC_RTL_SSA_H
+#define GCC_RTL_SSA_H 1
+
+// This is an aggregation header file.  This means it should contain only
+// other include files.
+
+#if 0
+// Files that use this one should first have:
+#define INCLUDE_ALGORITHM
+#define INCLUDE_FUNCTIONAL
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "df.h"
+#endif
+
+// Needed by splay-tree-utils.h and directly by rtl-ssa.
+#include "pretty-print.h"
+
+// Needed directly by recog.h.
+#include "insn-config.h"
+
+// Needed directly by rtl-ssa.
+#include "splay-tree-utils.h"
+#include "recog.h"
+#include "regs.h"
+#include "function-abi.h"
+#include "obstack-utils.h"
+#include "mux-utils.h"
+#include "rtlanal.h"
+
+// Provides the global crtl->ssa.
+#include "tm_p.h"
+#include "memmodel.h"
+#include "emit-rtl.h"
+
+// The rtl-ssa files themselves.
+#include "rtl-ssa/accesses.h"
+#include "rtl-ssa/insns.h"
+#include "rtl-ssa/blocks.h"
+#include "rtl-ssa/changes.h"
+#include "rtl-ssa/functions.h"
+#include "rtl-ssa/is-a.inl"
+#include "rtl-ssa/access-utils.h"
+#include "rtl-ssa/insn-utils.h"
+#include "rtl-ssa/movement.h"
+#include "rtl-ssa/change-utils.h"
+#include "rtl-ssa/member-fns.inl"
+
+#endif
diff --git a/gcc/rtl-ssa/access-utils.h b/gcc/rtl-ssa/access-utils.h
new file mode 100644
index 00000000000..b200e3416d7
--- /dev/null
+++ b/gcc/rtl-ssa/access-utils.h
@@ -0,0 +1,553 @@
+// Access-related utilities for RTL SSA                             -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+namespace rtl_ssa {
+
+// Return a reference to the whole of register REGNO.
+inline resource_info
+full_register (unsigned int regno)
+{
+  return { reg_raw_mode[regno], regno };
+}
+
+// Return true if sorted array ACCESSES includes an access to hard registers.
+inline bool
+accesses_include_hard_registers (const access_array &accesses)
+{
+  return accesses.size () && HARD_REGISTER_NUM_P (accesses.front ()->regno ());
+}
+
+// Return true if sorted array ACCESSES includes an access to memory.
+inline bool
+accesses_include_memory (const access_array &accesses)
+{
+  return accesses.size () && accesses.back ()->is_mem ();
+}
+
+// If sorted array ACCESSES includes an access to memory, return the access,
+// otherwise return null.
+template<typename T>
+inline auto
+memory_access (T accesses) -> decltype (accesses[0])
+{
+  if (accesses.size () && accesses.back ()->is_mem ())
+    return accesses.back ();
+  return nullptr;
+}
+
+// If sorted array ACCESSES includes a reference to REGNO, return the
+// access, otherwise return null.
+template<typename T>
+inline auto
+find_access (T accesses, unsigned int regno) -> decltype (accesses[0])
+{
+  unsigned int start = 0;
+  unsigned int end = accesses.size ();
+  while (start < end)
+    {
+      unsigned int mid = (start + end) / 2;
+      unsigned int found = accesses[mid]->regno ();
+      if (found == regno)
+	return accesses[mid];
+      if (found < regno)
+	start = mid + 1;
+      else
+	end = mid;
+    }
+  return nullptr;
+}
+
+// If sorted array ACCESSES includes a reference to REGNO, return the
+// index of the access, otherwise return -1.
+inline int
+find_access_index (access_array accesses, unsigned int regno)
+{
+  unsigned int start = 0;
+  unsigned int end = accesses.size ();
+  while (start < end)
+    {
+      unsigned int mid = (start + end) / 2;
+      unsigned int found = accesses[mid]->regno ();
+      if (found == regno)
+	return mid;
+      if (found < regno)
+	start = mid + 1;
+      else
+	end = mid;
+    }
+  return -1;
+}
+
+// If ACCESS is a set whose result is used by at least one nondebug
+// instruction, return the access as a set_info, otherwise return null.
+inline const set_info *
+set_with_nondebug_insn_uses (const access_info *access)
+{
+  if (access->is_set_with_nondebug_insn_uses ())
+    // No need for as_a; this test is just as definitive.
+    return static_cast<const set_info *> (access);
+  return nullptr;
+}
+
+// A non-const version of the above.
+inline set_info *
+set_with_nondebug_insn_uses (access_info *access)
+{
+  if (access->is_set_with_nondebug_insn_uses ())
+    return static_cast<set_info *> (access);
+  return nullptr;
+}
+
+// Return true if SET is the only set of SET->resource () and if it
+// dominates all uses (excluding uses of SET->resource () at points
+// where SET->resource () is always undefined).
+inline bool
+is_single_dominating_def (const set_info *set)
+{
+  return set->is_first_def () && set->is_last_def ();
+}
+
+// SET is known to be available on entry to BB.  Return true if it is
+// also available on exit from BB.  (The value might or might not be live.)
+inline bool
+remains_available_on_exit (const set_info *set, bb_info *bb)
+{
+  return (set->is_last_def ()
+	  || *set->next_def ()->insn () > *bb->end_insn ());
+}
+
+// ACCESS is known to be associated with an instruction rather than
+// a phi node.  Return which instruction that is.
+inline insn_info *
+access_insn (const access_info *access)
+{
+  // In release builds this function reduces to a single pointer reference.
+  if (auto *def = dyn_cast<const def_info *> (access))
+    return def->insn ();
+  return as_a<const use_info *> (access)->insn ();
+}
+
+// If ACCESS records a use, return the value that it uses.  If ACCESS records
+// a set, return that set.  If ACCESS records a clobber, return null.
+inline const set_info *
+access_value (const access_info *access)
+{
+  if (!access)
+    return nullptr;
+
+  if (auto *use = dyn_cast<const use_info *> (access))
+    return use->def ();
+
+  return dyn_cast<const set_info *> (access);
+}
+
+// A non-const version of the above.
+inline set_info *
+access_value (access_info *access)
+{
+  auto *const_access = const_cast<const access_info *> (access);
+  return const_cast<set_info *> (access_value (const_access));
+}
+
+// If ACCESS is a degenerate phi, return the set_info that defines its input,
+// otherwise return ACCESS itself.
+template<typename T>
+inline const T *
+look_through_degenerate_phi (const T *access)
+{
+  if (auto *phi = dyn_cast<const phi_info *> (access))
+    if (phi->is_degenerate ())
+      return phi->input_value (0);
+  return access;
+}
+
+// A non-const version of the above.
+template<typename T>
+inline T *
+look_through_degenerate_phi (T *access)
+{
+  auto *const_access = const_cast<const T *> (access);
+  return const_cast<T *> (look_through_degenerate_phi (const_access));
+}
+
+// If CLOBBER is in a group, return the first clobber in the group,
+// otherwise return CLOBBER itself.
+inline clobber_info *
+first_clobber_in_group (clobber_info *clobber)
+{
+  if (clobber->is_in_group ())
+    return clobber->group ()->first_clobber ();
+  return clobber;
+}
+
+// If CLOBBER is in a group, return the last clobber in the group,
+// otherwise return CLOBBER itself.
+inline clobber_info *
+last_clobber_in_group (clobber_info *clobber)
+{
+  if (clobber->is_in_group ())
+    return clobber->group ()->last_clobber ();
+  return clobber;
+}
+
+// If DEF is a clobber in a group, return the containing group,
+// otherwise return DEF.
+inline def_mux
+clobber_group_or_single_def (def_info *def)
+{
+  if (auto *clobber = dyn_cast<clobber_info *> (def))
+    if (clobber->is_in_group ())
+      return clobber->group ();
+  return def;
+}
+
+// Return the first definition associated with NODE.  If NODE holds
+// a single set, the result is that set.  If NODE holds a clobber_group,
+// the result is the first clobber in the group.
+inline def_info *
+first_def (def_node *node)
+{
+  return node->first_def ();
+}
+
+// Likewise for something that is either a node or a single definition.
+inline def_info *
+first_def (def_mux mux)
+{
+  return mux.first_def ();
+}
+
+// Return the last definition associated with NODE.  If NODE holds
+// a single set, the result is that set.  If NODE holds a clobber_group,
+// the result is the last clobber in the group.
+inline def_info *
+last_def (def_node *node)
+{
+  if (auto *group = dyn_cast<clobber_group *> (node))
+    return group->last_clobber ();
+  return node->first_def ();
+}
+
+// Likewise for something that is either a node or a single definition.
+inline def_info *
+last_def (def_mux mux)
+{
+  return mux.last_def ();
+}
+
+int lookup_use (splay_tree<use_info *> &, insn_info *);
+int lookup_def (def_splay_tree &, insn_info *);
+int lookup_clobber (clobber_tree &, insn_info *);
+int lookup_call_clobbers (insn_call_clobbers_tree &, insn_info *);
+
+// Search backwards from immediately before INSN for the first instruction
+// recorded in TREE, ignoring any instruction I for which IGNORE (I) is true.
+// Return null if no such instruction exists.
+template<typename IgnorePredicate>
+insn_info *
+prev_call_clobbers_ignoring (insn_call_clobbers_tree &tree, insn_info *insn,
+			     IgnorePredicate ignore)
+{
+  if (!tree)
+    return nullptr;
+
+  int comparison = lookup_call_clobbers (tree, insn);
+  while (comparison <= 0 || ignore (tree->insn ()))
+    {
+      if (!tree.splay_prev_node ())
+	return nullptr;
+
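+      // The new root is known to come before INSN, so the next
+      // iteration only needs to apply the IGNORE check.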
+      comparison = 1;
+    }
+  return tree->insn ();
+}
+
+// Search forwards from immediately after INSN for the first instruction
+// recorded in TREE, ignoring any instruction I for which IGNORE (I) is true.
+// Return null if no such instruction exists.
+template<typename IgnorePredicate>
+insn_info *
+next_call_clobbers_ignoring (insn_call_clobbers_tree &tree, insn_info *insn,
+			     IgnorePredicate ignore)
+{
+  if (!tree)
+    return nullptr;
+
+  int comparison = lookup_call_clobbers (tree, insn);
+  while (comparison >= 0 || ignore (tree->insn ()))
+    {
+      if (!tree.splay_next_node ())
+	return nullptr;
+
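+      // The new root is known to come after INSN, so the next
+      // iteration only needs to apply the IGNORE check.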
+      comparison = -1;
+    }
+  return tree->insn ();
+}
+
+// If ACCESS is a set, return the first use of ACCESS by a nondebug insn I
+// for which IGNORE (I) is false.  Return null if ACCESS is not a set or if
+// no such use exists.
+template<typename IgnorePredicate>
+inline use_info *
+first_nondebug_insn_use_ignoring (const access_info *access,
+				  IgnorePredicate ignore)
+{
+  if (const set_info *set = set_with_nondebug_insn_uses (access))
+    {
+      // Written this way to emphasize to the compiler that first_use
+      // must be nonnull in this situation.
+      use_info *use = set->first_use ();
+      do
+	{
+	  if (!ignore (use->insn ()))
+	    return use;
+	  use = use->next_nondebug_insn_use ();
+	}
+      while (use);
+    }
+  return nullptr;
+}
+
+// If ACCESS is a set, return the last use of ACCESS by a nondebug insn I for
+// which IGNORE (I) is false.  Return null if ACCESS is not a set or if no
+// such use exists.
+template<typename IgnorePredicate>
+inline use_info *
+last_nondebug_insn_use_ignoring (const access_info *access,
+				 IgnorePredicate ignore)
+{
+  if (const set_info *set = set_with_nondebug_insn_uses (access))
+    {
+      // Written this way to emphasize to the compiler that
+      // last_nondebug_insn_use must be nonnull in this situation.
+      use_info *use = set->last_nondebug_insn_use ();
+      do
+	{
+	  if (!ignore (use->insn ()))
+	    return use;
+	  use = use->prev_use ();
+	}
+      while (use);
+    }
+  return nullptr;
+}
+
+// If DEF is null, return null.
+//
+// Otherwise, search backwards for an access to DEF->resource (), starting at
+// the end of DEF's live range.  Ignore clobbers if IGNORE_CLOBBERS_SETTING
+// is YES, otherwise treat them like any other access.  Also ignore any
+// access A for which IGNORE (access_insn (A)) is true.
+//
+// Thus if DEF is a set that is used by nondebug insns, the first access
+// that the function considers is the last such use of the set.  Otherwise,
+// the first access that the function considers is DEF itself.
+//
+// Return the access found, or null if there is no access that meets
+// the criteria.
+//
+// Note that this function does not consider separately-recorded call clobbers,
+// although such clobbers are only relevant if IGNORE_CLOBBERS_SETTING is NO.
+template<typename IgnorePredicate>
+access_info *
+last_access_ignoring (def_info *def, ignore_clobbers ignore_clobbers_setting,
+		      IgnorePredicate ignore)
+{
+  while (def)
+    {
+      auto *clobber = dyn_cast<clobber_info *> (def);
+      if (clobber && ignore_clobbers_setting == ignore_clobbers::YES)
+	def = first_clobber_in_group (clobber);
+      else
+	{
+	  if (use_info *use = last_nondebug_insn_use_ignoring (def, ignore))
+	    return use;
+
+	  insn_info *insn = def->insn ();
+	  if (!ignore (insn))
+	    return def;
+	}
+      def = def->prev_def ();
+    }
+  return nullptr;
+}
+
+// Search backwards for an access to DEF->resource (), starting
+// immediately before the point at which DEF occurs.  Ignore clobbers
+// if IGNORE_CLOBBERS_SETTING is YES, otherwise treat them like any other
+// access.  Also ignore any access A for which IGNORE (access_insn (A))
+// is true.
+//
+// Thus if DEF->insn () uses DEF->resource (), that use is the first access
+// that the function considers, since an instruction's uses occur strictly
+// before its definitions.
+//
+// Note that this function does not consider separately-recorded call clobbers,
+// although such clobbers are only relevant if IGNORE_CLOBBERS_SETTING is NO.
+template<typename IgnorePredicate>
+inline access_info *
+prev_access_ignoring (def_info *def, ignore_clobbers ignore_clobbers_setting,
+		      IgnorePredicate ignore)
+{
+  return last_access_ignoring (def->prev_def (), ignore_clobbers_setting,
+			       ignore);
+}
+
+// If DEF is null, return null.
+//
+// Otherwise, search forwards for a definition of DEF->resource (),
+// starting at DEF itself.  Ignore clobbers if IGNORE_CLOBBERS_SETTING
+// is YES, otherwise treat them like any other access.  Also ignore any
+// definition D for which IGNORE (D->insn ()) is true.
+//
+// Return the definition found, or null if there is no access that meets
+// the criteria.
+//
+// Note that this function does not consider separately-recorded call clobbers,
+// although such clobbers are only relevant if IGNORE_CLOBBERS_SETTING is NO.
+template<typename IgnorePredicate>
+def_info *
+first_def_ignoring (def_info *def, ignore_clobbers ignore_clobbers_setting,
+		    IgnorePredicate ignore)
+{
+  while (def)
+    {
+      auto *clobber = dyn_cast<clobber_info *> (def);
+      if (clobber && ignore_clobbers_setting == ignore_clobbers::YES)
+	def = last_clobber_in_group (clobber);
+      else if (!ignore (def->insn ()))
+	return def;
+
+      def = def->next_def ();
+    }
+  return nullptr;
+}
+
+// Search forwards for the next access to DEF->resource (),
+// starting immediately after DEF's instruction.  Ignore clobbers if
+// IGNORE_CLOBBERS_SETTING is YES, otherwise treat them like any other access.
+// Also ignore any access A for which IGNORE (access_insn (A)) is true;
+// in this context, ignoring a set includes ignoring all uses of the set.
+//
+// Thus if DEF is a set with uses by nondebug insns, the first access that the
+// function considers is the first such use of the set.
+//
+// Return the access found, or null if there is no access that meets the
+// criteria.
+//
+// Note that this function does not consider separately-recorded call clobbers,
+// although such clobbers are only relevant if IGNORE_CLOBBERS_SETTING is NO.
+template<typename IgnorePredicate>
+access_info *
+next_access_ignoring (def_info *def, ignore_clobbers ignore_clobbers_setting,
+		      IgnorePredicate ignore)
+{
+  if (use_info *use = first_nondebug_insn_use_ignoring (def, ignore))
+    return use;
+
+  return first_def_ignoring (def->next_def (), ignore_clobbers_setting,
+			     ignore);
+}
+
+// Return true if ACCESS1 should come before ACCESS2 in an access_array.
+inline bool
+compare_access_infos (const access_info *access1, const access_info *access2)
+{
+  gcc_checking_assert (access1 == access2
+		       || access1->regno () != access2->regno ());
+  return access1->regno () < access2->regno ();
+}
+
+// Sort [BEGIN, END) into ascending regno order.  The sequence must have
+// at most one access to a given regno.
+inline void
+sort_accesses (access_info **begin, access_info **end)
+{
+  auto count = end - begin;
+  if (count <= 1)
+    return;
+
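+  // Special-case two accesses, to avoid the overhead of std::sort
+  // in a very common case.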
+  if (count == 2)
+    {
+      gcc_checking_assert (begin[0]->regno () != begin[1]->regno ());
+      if (begin[0]->regno () > begin[1]->regno ())
+	std::swap (begin[0], begin[1]);
+      return;
+    }
+
+  std::sort (begin, end, compare_access_infos);
+}
+
+// Sort the accesses in CONTAINER, which contains pointers to access_infos.
+template<typename T>
+inline void
+sort_accesses (T &container)
+{
+  return sort_accesses (container.begin (), container.end ());
+}
+
+// The underlying non-template implementation of merge_access_arrays.
+access_array merge_access_arrays_base (obstack_watermark &, access_array,
+				       access_array);
+// Merge access arrays ACCESSES1 and ACCESSES2, allocating any new
+// array in the area governed by WATERMARK.  Return an invalid access_array if
+// ACCESSES1 and ACCESSES2 contain conflicting accesses to the same resource.
+//
+// T can be an access_array, a def_array or a use_array.
+template<typename T>
+inline T
+merge_access_arrays (obstack_watermark &watermark, T accesses1, T accesses2)
+{
+  return T (merge_access_arrays_base (watermark, accesses1, accesses2));
+}
+
+// The underlying non-template implementation of insert_access.
+access_array insert_access_base (obstack_watermark &, access_info *,
+				 access_array);
+
+// Return a new access_array that contains the result of inserting ACCESS1
+// into sorted access array ACCESSES2.  Allocate the returned array in the
+// area governed by WATERMARK.  Return an invalid access_array if ACCESSES2
+// contains a conflicting access to the same resource as ACCESS1.
+//
+// T can be an access_array, a def_array or a use_array.
+template<typename T>
+inline T
+insert_access (obstack_watermark &watermark,
+	       typename T::value_type access1, T accesses2)
+{
+  return T (insert_access_base (watermark, access1, accesses2));
+}
+
+// The underlying non-template implementation of remove_note_accesses.
+access_array remove_note_accesses_base (obstack_watermark &, access_array);
+
+// If ACCESSES contains accesses that only occur in notes, return a new
+// array without such accesses, allocating it in the area governed by
+// WATERMARK.  Return ACCESSES itself otherwise.
+//
+// T can be an access_array, a def_array or a use_array.
+template<typename T>
+inline T
+remove_note_accesses (obstack_watermark &watermark, T accesses)
+{
+  return T (remove_note_accesses_base (watermark, accesses));
+}
+
+}
diff --git a/gcc/rtl-ssa/accesses.cc b/gcc/rtl-ssa/accesses.cc
new file mode 100644
index 00000000000..bd375177866
--- /dev/null
+++ b/gcc/rtl-ssa/accesses.cc
@@ -0,0 +1,1592 @@
+// Implementation of access-related functions for RTL SSA           -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#define INCLUDE_ALGORITHM
+#define INCLUDE_FUNCTIONAL
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "df.h"
+#include "rtl-ssa.h"
+#include "rtl-ssa/internals.inl"
+
+using namespace rtl_ssa;
+
+// This clobber belongs to a clobber_group but m_group appears to be
+// out of date.  Update it and return the new (correct) value.
+clobber_group *
+clobber_info::recompute_group ()
+{
+  using splay_tree = clobber_info::splay_tree;
+
+  // Splay this clobber to the root of the tree while searching for a node
+  // that has the correct group.  The root always has the correct group,
+  // so the search always breaks early and does not install this clobber
+  // as the root.
+  clobber_info *cursor = m_parent;
+  auto find_group = [](clobber_info *node, unsigned int)
+    {
+      return node->m_group->has_been_superceded () ? nullptr : node->m_group;
+    };
+  clobber_group *group = splay_tree::splay_and_search (this, nullptr,
+						       find_group);
+  gcc_checking_assert (m_parent);
+
+  // If the previous splay operation did anything, this clobber is now an
+  // ancestor of CURSOR, and all the nodes in between have a stale group.
+  // Since we have visited the nodes, we might as well update them too.
+  //
+  // If the previous splay operation did nothing, start the update from
+  // this clobber instead.  In that case we change at most two clobbers:
+  // this clobber and possibly its parent.
+  if (cursor == m_parent)
+    cursor = this;
+
+  // Walk up the tree from CURSOR updating clobbers that need it.
+  // This walk always includes this clobber.
+  while (cursor->m_group != group)
+    {
+      cursor->m_group = group;
+      cursor = cursor->m_parent;
+    }
+
+  gcc_checking_assert (m_group == group);
+  return group;
+}
+
+// See the comment above the declaration.
+void
+resource_info::print_identifier (pretty_printer *pp) const
+{
+  if (is_mem ())
+    pp_string (pp, "mem");
+  else
+    {
+      char tmp[3 * sizeof (regno) + 2];
+      snprintf (tmp, sizeof (tmp), "r%d", regno);
+      pp_string (pp, tmp);
+    }
+}
+
+// See the comment above the declaration.
+void
+resource_info::print_context (pretty_printer *pp) const
+{
+  if (HARD_REGISTER_NUM_P (regno))
+    {
+      if (const char *name = reg_names[regno])
+	{
+	  pp_space (pp);
+	  pp_left_paren (pp);
+	  pp_string (pp, name);
+	  if (mode != E_BLKmode)
+	    {
+	      pp_colon (pp);
+	      pp_string (pp, GET_MODE_NAME (mode));
+	    }
+	  pp_right_paren (pp);
+	}
+    }
+  else if (is_reg ())
+    {
+      pp_space (pp);
+      pp_left_paren (pp);
+      if (mode != E_BLKmode)
+	{
+	  pp_string (pp, GET_MODE_NAME (mode));
+	  pp_space (pp);
+	}
+      pp_string (pp, "pseudo");
+      pp_right_paren (pp);
+    }
+}
+
+// See the comment above the declaration.
+void
+resource_info::print (pretty_printer *pp) const
+{
+  print_identifier (pp);
+  print_context (pp);
+}
+
+// Some properties can naturally be described using adjectives that attach
+// to nouns like "use" or "definition".  Print such adjectives to PP.
+void
+access_info::print_prefix_flags (pretty_printer *pp) const
+{
+  if (m_is_temp)
+    pp_string (pp, "temporary ");
+  if (m_has_been_superceded)
+    pp_string (pp, "superceded ");
+}
+
+// Print properties not handled by print_prefix_flags to PP, putting
+// each property on a new line indented by two extra spaces.
+void
+access_info::print_properties_on_new_lines (pretty_printer *pp) const
+{
+  if (m_is_pre_post_modify)
+    {
+      pp_newline_and_indent (pp, 2);
+      pp_string (pp, "set by a pre/post-modify");
+      pp_indentation (pp) -= 2;
+    }
+  if (m_includes_address_uses)
+    {
+      pp_newline_and_indent (pp, 2);
+      pp_string (pp, "appears inside an address");
+      pp_indentation (pp) -= 2;
+    }
+  if (m_includes_read_writes)
+    {
+      pp_newline_and_indent (pp, 2);
+      pp_string (pp, "appears in a read/write context");
+      pp_indentation (pp) -= 2;
+    }
+  if (m_includes_subregs)
+    {
+      pp_newline_and_indent (pp, 2);
+      pp_string (pp, "appears inside a subreg");
+      pp_indentation (pp) -= 2;
+    }
+}
+
+// Return true if there are no known issues with the integrity of the
+// link information.
+inline bool
+use_info::check_integrity ()
+{
+  auto subsequence_id = [](use_info *use)
+    {
+      if (use->is_in_nondebug_insn ())
+	return 1;
+      if (use->is_in_debug_insn ())
+	return 2;
+      return 3;
+    };
+
+  use_info *prev = prev_use ();
+  use_info *next = next_use ();
+
+  if (prev && subsequence_id (prev) > subsequence_id (this))
+    return false;
+  if (next && subsequence_id (next) < subsequence_id (this))
+    return false;
+  if (m_is_last_nondebug_insn_use != calculate_is_last_nondebug_insn_use ())
+    return false;
+
+  if (!prev && last_use ()->next_use ())
+    return false;
+  if (!next)
+    if (use_info *use = last_nondebug_insn_use ())
+      if (!use->m_is_last_nondebug_insn_use)
+	return false;
+
+  return true;
+}
+
+// See the comment above the declaration.
+void
+use_info::print_location (pretty_printer *pp) const
+{
+  if (is_in_phi ())
+    pp_access (pp, phi (), PP_ACCESS_INCLUDE_LOCATION);
+  else
+    insn ()->print_identifier_and_location (pp);
+}
+
+// See the comment above the declaration.
+void
+use_info::print_def (pretty_printer *pp) const
+{
+  if (const set_info *set = def ())
+    pp_access (pp, set, 0);
+  else
+    {
+      pp_string (pp, "undefined ");
+      resource ().print (pp);
+    }
+}
+
+// See the comment above the declaration.
+void
+use_info::print (pretty_printer *pp, unsigned int flags) const
+{
+  print_prefix_flags (pp);
+
+  const set_info *set = def ();
+  if (set && set->mode () != mode ())
+    {
+      pp_string (pp, GET_MODE_NAME (mode ()));
+      pp_space (pp);
+    }
+
+  pp_string (pp, "use of ");
+  print_def (pp);
+  if (flags & PP_ACCESS_INCLUDE_LOCATION)
+    {
+      pp_string (pp, " by ");
+      print_location (pp);
+    }
+  if (set && (flags & PP_ACCESS_INCLUDE_LINKS))
+    {
+      pp_newline_and_indent (pp, 2);
+      pp_string (pp, "defined in ");
+      set->insn ()->print_location (pp);
+      pp_indentation (pp) -= 2;
+    }
+  if (flags & PP_ACCESS_INCLUDE_PROPERTIES)
+    print_properties_on_new_lines (pp);
+}
+
+// See the comment above the declaration.
+void
+def_info::print_identifier (pretty_printer *pp) const
+{
+  resource ().print_identifier (pp);
+  pp_colon (pp);
+  insn ()->print_identifier (pp);
+  resource ().print_context (pp);
+}
+
+// See the comment above the declaration.
+void
+def_info::print_location (pretty_printer *pp) const
+{
+  insn ()->print_identifier_and_location (pp);
+}
+
+// See the comment above the declaration.
+void
+clobber_info::print (pretty_printer *pp, unsigned int flags) const
+{
+  print_prefix_flags (pp);
+  if (is_call_clobber ())
+    pp_string (pp, "call ");
+  pp_string (pp, "clobber ");
+  print_identifier (pp);
+  if (flags & PP_ACCESS_INCLUDE_LOCATION)
+    {
+      pp_string (pp, " in ");
+      insn ()->print_location (pp);
+    }
+  if (flags & PP_ACCESS_INCLUDE_PROPERTIES)
+    print_properties_on_new_lines (pp);
+}
+
+// See the comment above the declaration.
+void
+set_info::print_uses_on_new_lines (pretty_printer *pp) const
+{
+  for (const use_info *use : all_uses ())
+    {
+      pp_newline_and_indent (pp, 2);
+      if (use->is_live_out_use ())
+	{
+	  pp_string (pp, "live out from ");
+	  use->insn ()->print_location (pp);
+	}
+      else
+	{
+	  pp_string (pp, "used by ");
+	  use->print_location (pp);
+	}
+      pp_indentation (pp) -= 2;
+    }
+  if (m_use_tree)
+    {
+      pp_newline_and_indent (pp, 2);
+      pp_string (pp, "splay tree:");
+      pp_newline_and_indent (pp, 2);
+      auto print_use = [](pretty_printer *pp,
+			  splay_tree_node<use_info *> *node)
+	{
+	  pp_string (pp, "use by ");
+	  node->value ()->print_location (pp);
+	};
+      m_use_tree.print (pp, m_use_tree.root (), print_use);
+      pp_indentation (pp) -= 4;
+    }
+}
+
+// See the comment above the declaration.
+void
+set_info::print (pretty_printer *pp, unsigned int flags) const
+{
+  print_prefix_flags (pp);
+  pp_string (pp, "set ");
+  print_identifier (pp);
+  if (flags & PP_ACCESS_INCLUDE_LOCATION)
+    {
+      pp_string (pp, " in ");
+      insn ()->print_location (pp);
+    }
+  if (flags & PP_ACCESS_INCLUDE_PROPERTIES)
+    print_properties_on_new_lines (pp);
+  if (flags & PP_ACCESS_INCLUDE_LINKS)
+    print_uses_on_new_lines (pp);
+}
+
+// See the comment above the declaration.
+void
+phi_info::print (pretty_printer *pp, unsigned int flags) const
+{
+  print_prefix_flags (pp);
+  pp_string (pp, "phi node ");
+  print_identifier (pp);
+  if (flags & PP_ACCESS_INCLUDE_LOCATION)
+    {
+      pp_string (pp, " in ");
+      insn ()->print_location (pp);
+    }
+
+  if (flags & PP_ACCESS_INCLUDE_PROPERTIES)
+    print_properties_on_new_lines (pp);
+
+  if (flags & PP_ACCESS_INCLUDE_LINKS)
+    {
+      basic_block cfg_bb = bb ()->cfg_bb ();
+      pp_newline_and_indent (pp, 2);
+      pp_string (pp, "inputs:");
+      unsigned int i = 0;
+      for (const use_info *input : inputs ())
+	{
+	  basic_block pred_cfg_bb = EDGE_PRED (cfg_bb, i)->src;
+	  pp_newline_and_indent (pp, 2);
+	  pp_string (pp, "bb");
+	  pp_decimal_int (pp, pred_cfg_bb->index);
+	  pp_colon (pp);
+	  pp_space (pp);
+	  input->print_def (pp);
+	  pp_indentation (pp) -= 2;
+	  i += 1;
+	}
+      pp_indentation (pp) -= 2;
+
+      print_uses_on_new_lines (pp);
+    }
+}
+
+// See the comment above the declaration.
+void
+set_node::print (pretty_printer *pp) const
+{
+  pp_access (pp, first_def ());
+}
+
+// See the comment above the declaration.
+void
+clobber_group::print (pretty_printer *pp) const
+{
+  auto print_clobber = [](pretty_printer *pp, const def_info *clobber)
+    {
+      pp_access (pp, clobber);
+    };
+  pp_string (pp, "grouped clobber");
+  for (const def_info *clobber : clobbers ())
+    {
+      pp_newline_and_indent (pp, 2);
+      print_clobber (pp, clobber);
+      pp_indentation (pp) -= 2;
+    }
+  pp_newline_and_indent (pp, 2);
+  pp_string (pp, "splay tree");
+  pp_newline_and_indent (pp, 2);
+  m_clobber_tree.print (pp, print_clobber);
+  pp_indentation (pp) -= 4;
+}
+
+// Return a clobber_group for CLOBBER, creating one if CLOBBER doesn't
+// already belong to a group.
+clobber_group *
+function_info::need_clobber_group (clobber_info *clobber)
+{
+  if (clobber->is_in_group ())
+    return clobber->group ();
+  return allocate<clobber_group> (clobber);
+}
+
+// Return a def_node for inserting DEF into the associated resource's
+// splay tree.  Use a clobber_group if DEF is a clobber and a set_node
+// otherwise.
+def_node *
+function_info::need_def_node (def_info *def)
+{
+  if (auto *clobber = dyn_cast<clobber_info *> (def))
+    return need_clobber_group (clobber);
+  return allocate<set_node> (as_a<set_info *> (def));
+}
+
+// LAST is the last thing to define LAST->resource (), and is where any
+// splay tree root for LAST->resource () is stored.  Require such a splay tree
+// to exist, creating a new one if necessary.  Return the root of the tree.
+//
+// The caller must call LAST->set_splay_root after it has finished with
+// the splay tree.
+def_splay_tree
+function_info::need_def_splay_tree (def_info *last)
+{
+  if (def_node *root = last->splay_root ())
+    return root;
+
+  // Use a left-spine rooted at the last node.
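+  // Each definition becomes the left child of the next one, so an
+  // in-order traversal visits the definitions in program order.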
+  def_node *root = need_def_node (last);
+  def_node *parent = root;
+  while (def_info *prev = first_def (parent)->prev_def ())
+    {
+      def_node *node = need_def_node (prev);
+      def_splay_tree::insert_child (parent, 0, node);
+      parent = node;
+    }
+  return root;
+}
+
+// Search TREE for either:
+//
+// - a set_info at INSN or
+// - a clobber_group whose range includes INSN
+//
+// If such a node exists, install it as the root of TREE and return 0.
+// Otherwise arbitrarily choose between:
+//
+// (1) Installing the closest preceding node as the root and returning 1.
+// (2) Installing the closest following node as the root and returning -1.
+//
+// Note that this routine should not be used to check whether INSN
+// itself defines a resource; that can be checked more cheaply using
+// find_access_index.
+int
+rtl_ssa::lookup_def (def_splay_tree &tree, insn_info *insn)
+{
+  auto go_left = [&](def_node *node)
+    {
+      return *insn < *first_def (node)->insn ();
+    };
+  auto go_right = [&](def_node *node)
+    {
+      return *insn > *last_def (node)->insn ();
+    };
+  return tree.lookup (go_left, go_right);
+}
+
+// Search TREE for a clobber in INSN.  If such a clobber exists, install
+// it as the root of TREE and return 0.  Otherwise arbitrarily choose between:
+//
+// (1) Installing the closest preceding clobber as the root and returning 1.
+// (2) Installing the closest following clobber as the root and returning -1.
+int
+rtl_ssa::lookup_clobber (clobber_tree &tree, insn_info *insn)
+{
+  auto compare = [&](clobber_info *clobber)
+    {
+      return insn->compare_with (clobber->insn ());
+    };
+  return tree.lookup (compare);
+}
+
+// Search for a definition of RESOURCE at INSN and return the result of
+// the search as a def_lookup.  See the comment above the class for more
+// details.
+def_lookup
+function_info::find_def (resource_info resource, insn_info *insn)
+{
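+  // m_defs is indexed by regno + 1; index 0 holds the definitions of memory.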
+  def_info *first = m_defs[resource.regno + 1];
+  if (!first)
+    // There are no nodes.  The comparison result is pretty meaningless
+    // in this case.
+    return { nullptr, -1 };
+
+  // See whether the first node matches.
+  auto first_result = clobber_group_or_single_def (first);
+  if (*insn <= *last_def (first_result)->insn ())
+    {
+      int comparison = (*insn >= *first->insn () ? 0 : -1);
+      return { first_result, comparison };
+    }
+
+  // See whether the last node matches.
+  def_info *last = first->last_def ();
+  auto last_result = clobber_group_or_single_def (last);
+  if (*insn >= *first_def (last_result)->insn ())
+    {
+      int comparison = (*insn <= *last->insn () ? 0 : 1);
+      return { last_result, comparison };
+    }
+
+  // Resort to using a splay tree to search for the result.
+  def_splay_tree tree = need_def_splay_tree (last);
+  int comparison = lookup_def (tree, insn);
+  last->set_splay_root (tree.root ());
+  return { tree.root (), comparison };
+}
+
+// Add DEF to the function's list of definitions of DEF->resource (),
+// inserting DEF immediately before BEFORE.  DEF is not currently in the list.
+void
+function_info::insert_def_before (def_info *def, def_info *before)
+{
+  gcc_checking_assert (!def->has_def_links ()
+		       && *before->insn () > *def->insn ());
+
+  def->copy_prev_from (before);
+  if (def_info *prev = def->prev_def ())
+    {
+      gcc_checking_assert (*prev->insn () < *def->insn ());
+      prev->set_next_def (def);
+    }
+  else
+    m_defs[def->regno () + 1] = def;
+
+  def->set_next_def (before);
+  before->set_prev_def (def);
+}
+
+// Add DEF to the function's list of definitions of DEF->resource (),
+// inserting DEF immediately after AFTER.  DEF is not currently in the list.
+void
+function_info::insert_def_after (def_info *def, def_info *after)
+{
+  gcc_checking_assert (!def->has_def_links ()
+		       && *after->insn () < *def->insn ());
+
+  def->copy_next_from (after);
+  if (def_info *next = def->next_def ())
+    {
+      gcc_checking_assert (*next->insn () > *def->insn ());
+      next->set_prev_def (def);
+    }
+  else
+    m_defs[def->regno () + 1]->set_last_def (def);
+
+  def->set_prev_def (after);
+  after->set_next_def (def);
+}
+
+// Remove DEF from the function's list of definitions of DEF->resource ().
+void
+function_info::remove_def_from_list (def_info *def)
+{
+  def_info *prev = def->prev_def ();
+  def_info *next = def->next_def ();
+
+  if (next)
+    next->copy_prev_from (def);
+  else
+    m_defs[def->regno () + 1]->set_last_def (prev);
+
+  if (prev)
+    prev->copy_next_from (def);
+  else
+    m_defs[def->regno () + 1] = next;
+
+  def->clear_def_links ();
+}
+
+// Add CLOBBER to GROUP and insert it into the function's list of
+// accesses to CLOBBER->resource ().  CLOBBER is not currently part
+// of an active group and is not currently in the list.
+void
+function_info::add_clobber (clobber_info *clobber, clobber_group *group)
+{
+  // Search for either the previous or next clobber in the group.
+  // The result is less than zero if CLOBBER should come before NEIGHBOR
+  // or greater than zero if CLOBBER should come after NEIGHBOR.
+  int comparison = lookup_clobber (group->m_clobber_tree, clobber->insn ());
+  gcc_checking_assert (comparison != 0);
+  clobber_info *neighbor = group->m_clobber_tree.root ();
+
+  // Since NEIGHBOR is now the root of the splay tree, its group needs
+  // to be up-to-date.
+  neighbor->update_group (group);
+
+  // If CLOBBER comes before NEIGHBOR, insert CLOBBER to NEIGHBOR's left,
+  // otherwise insert CLOBBER to NEIGHBOR's right.
+  clobber_info::splay_tree::insert_child (neighbor, comparison > 0, clobber);
+  clobber->set_group (group);
+
+  // Insert the clobber into the function-wide list and update the
+  // bounds of the group.
+  if (comparison > 0)
+    {
+      insert_def_after (clobber, neighbor);
+      if (neighbor == group->last_clobber ())
+	group->set_last_clobber (clobber);
+    }
+  else
+    {
+      insert_def_before (clobber, neighbor);
+      if (neighbor == group->first_clobber ())
+	group->set_first_clobber (clobber);
+    }
+}
+
+// Remove CLOBBER from GROUP, given that GROUP contains other clobbers too.
+// Also remove CLOBBER from the function's list of accesses to
+// CLOBBER->resource ().
+void
+function_info::remove_clobber (clobber_info *clobber, clobber_group *group)
+{
+  if (clobber == group->first_clobber ())
+    {
+      auto *new_first = as_a<clobber_info *> (clobber->next_def ());
+      group->set_first_clobber (new_first);
+      new_first->update_group (group);
+    }
+  else if (clobber == group->last_clobber ())
+    {
+      auto *new_last = as_a<clobber_info *> (clobber->prev_def ());
+      group->set_last_clobber (new_last);
+      new_last->update_group (group);
+    }
+
+  clobber_info *replacement = clobber_info::splay_tree::remove_node (clobber);
+  if (clobber == group->m_clobber_tree.root ())
+    {
+      group->m_clobber_tree = replacement;
+      replacement->update_group (group);
+    }
+  clobber->set_group (nullptr);
+
+  remove_def_from_list (clobber);
+}
+
+// Add CLOBBER immediately before the first clobber in GROUP, given that
+// CLOBBER is not currently part of any group.
+void
+function_info::prepend_clobber_to_group (clobber_info *clobber,
+					 clobber_group *group)
+{
+  clobber_info *next = group->first_clobber ();
+  clobber_info::splay_tree::insert_child (next, 0, clobber);
+  group->set_first_clobber (clobber);
+  clobber->set_group (group);
+}
+
+// Add CLOBBER immediately after the last clobber in GROUP, given that
+// CLOBBER is not currently part of any group.
+void
+function_info::append_clobber_to_group (clobber_info *clobber,
+					clobber_group *group)
+{
+  clobber_info *prev = group->last_clobber ();
+  clobber_info::splay_tree::insert_child (prev, 1, clobber);
+  group->set_last_clobber (clobber);
+  clobber->set_group (group);
+}
+
+// Put CLOBBER1 and CLOBBER2 into the same clobber_group, given that
+// CLOBBER1 occurs immediately before CLOBBER2 and that the two clobbers
+// are not currently in the same group.  LAST is the last definition of
+// the associated resource, and is where any splay tree is stored.
+void
+function_info::merge_clobber_groups (clobber_info *clobber1,
+				     clobber_info *clobber2,
+				     def_info *last)
+{
+  if (clobber1->is_in_group () && clobber2->is_in_group ())
+    {
+      clobber_group *group1 = clobber1->group ();
+      clobber_group *group2 = clobber2->group ();
+      gcc_checking_assert (clobber1 == group1->last_clobber ()
+			   && clobber2 == group2->first_clobber ());
+
+      if (def_splay_tree tree = last->splay_root ())
+	{
+	  // Remove GROUP2 from the splay tree.
+	  int comparison = lookup_def (tree, clobber2->insn ());
+	  gcc_checking_assert (comparison == 0);
+	  tree.remove_root ();
+	  last->set_splay_root (tree.root ());
+	}
+
+      // Splice the trees together.
+      group1->m_clobber_tree.splice_next_tree (group2->m_clobber_tree);
+
+      // Bring the two extremes of GROUP2 under GROUP1.  Any other
+      // clobbers in the group are updated lazily on demand.
+      clobber2->set_group (group1);
+      group2->last_clobber ()->set_group (group1);
+      group1->set_last_clobber (group2->last_clobber ());
+
+      // Record that GROUP2 is no more.
+      group2->set_first_clobber (nullptr);
+      group2->set_last_clobber (nullptr);
+      group2->m_clobber_tree = nullptr;
+    }
+  else
+    {
+      // In this case there can be no active splay tree.
+      gcc_assert (!last->splay_root ());
+      if (clobber2->is_in_group ())
+	prepend_clobber_to_group (clobber1, clobber2->group ());
+      else
+	append_clobber_to_group (clobber2, need_clobber_group (clobber1));
+    }
+}
+
+// GROUP spans INSN, and INSN now sets the resource that GROUP clobbers.
+// Split GROUP around INSN and return the clobber that comes immediately
+// before INSN.
+clobber_info *
+function_info::split_clobber_group (clobber_group *group, insn_info *insn)
+{
+  // Search for either the previous or next clobber in the group.
+  // The result is less than zero if INSN comes before NEIGHBOR
+  // or greater than zero if INSN comes after NEIGHBOR.
+  int comparison = lookup_clobber (group->m_clobber_tree, insn);
+  gcc_checking_assert (comparison != 0);
+  clobber_info *neighbor = group->m_clobber_tree.root ();
+
+  clobber_tree tree1, tree2;
+  clobber_info *prev;
+  clobber_info *next;
+  if (comparison > 0)
+    {
+      // NEIGHBOR is the last clobber in what will become the first group.
+      tree1 = neighbor;
+      tree2 = tree1.split_after_root ();
+      prev = neighbor;
+      next = as_a<clobber_info *> (prev->next_def ());
+    }
+  else
+    {
+      // NEIGHBOR is the first clobber in what will become the second group.
+      tree2 = neighbor;
+      tree1 = tree2.split_before_root ();
+      next = neighbor;
+      prev = as_a<clobber_info *> (next->prev_def ());
+    }
+
+  // Use GROUP to hold PREV and earlier clobbers.  Create a new group for
+  // NEXT onwards.
+  clobber_info *last_clobber = group->last_clobber ();
+  clobber_group *group1 = group;
+  clobber_group *group2 = allocate<clobber_group> (next);
+
+  // Finish setting up GROUP1, making sure that the roots and extremities
+  // have a correct group pointer.  Leave the rest to be updated lazily.
+  group1->set_last_clobber (prev);
+  tree1->set_group (group1);
+  prev->set_group (group1);
+
+  // Finish setting up GROUP2, with the same approach as for GROUP1.
+  group2->set_first_clobber (next);
+  group2->set_last_clobber (last_clobber);
+  next->set_group (group2);
+  tree2->set_group (group2);
+  last_clobber->set_group (group2);
+
+  return prev;
+}
+
+// Add DEF to the end of the function's list of definitions of
+// DEF->resource ().  There is known to be no associated splay tree yet.
+void
+function_info::append_def (def_info *def)
+{
+  gcc_checking_assert (!def->has_def_links ());
+  def_info **head = &m_defs[def->regno () + 1];
+  def_info *first = *head;
+  if (!first)
+    {
+      // This is the only definition of the resource.
+      def->set_last_def (def);
+      *head = def;
+      return;
+    }
+
+  def_info *prev = first->last_def ();
+  gcc_checking_assert (!prev->splay_root ());
+
+  // Maintain the invariant that two clobbers must not appear in
+  // neighboring nodes of the splay tree.
+  auto *clobber = dyn_cast<clobber_info *> (def);
+  auto *prev_clobber = dyn_cast<clobber_info *> (prev);
+  if (clobber && prev_clobber)
+    append_clobber_to_group (clobber, need_clobber_group (prev_clobber));
+
+  prev->set_next_def (def);
+  def->set_prev_def (prev);
+  first->set_last_def (def);
+}
+
+// Add DEF to the function's list of definitions of DEF->resource ().
+// Also insert it into the associated splay tree, if there is one.
+// DEF is not currently part of the list and is not in the splay tree.
+void
+function_info::add_def (def_info *def)
+{
+  gcc_checking_assert (!def->has_def_links ()
+		       && !def->m_is_temp
+		       && !def->m_has_been_superceded);
+  def_info **head = &m_defs[def->regno () + 1];
+  def_info *first = *head;
+  if (!first)
+    {
+      // This is the only definition of the resource.
+      def->set_last_def (def);
+      *head = def;
+      return;
+    }
+
+  def_info *last = first->last_def ();
+  insn_info *insn = def->insn ();
+
+  int comparison;
+  def_node *root = nullptr;
+  def_info *prev = nullptr;
+  def_info *next = nullptr;
+  if (*insn > *last->insn ())
+    {
+      // This definition comes after all other definitions.
+      comparison = 1;
+      if (def_splay_tree tree = last->splay_root ())
+	{
+	  tree.splay_max_node ();
+	  root = tree.root ();
+	  last->set_splay_root (root);
+	}
+      prev = last;
+    }
+  else if (*insn < *first->insn ())
+    {
+      // This definition comes before all other definitions.
+      comparison = -1;
+      if (def_splay_tree tree = last->splay_root ())
+	{
+	  tree.splay_min_node ();
+	  root = tree.root ();
+	  last->set_splay_root (root);
+	}
+      next = first;
+    }
+  else
+    {
+      // Search the splay tree for an insertion point.
+      def_splay_tree tree = need_def_splay_tree (last);
+      comparison = lookup_def (tree, insn);
+      root = tree.root ();
+      last->set_splay_root (root);
+
+      // Deal with cases in which we found an overlapping live range.
+      if (comparison == 0)
+	{
+	  auto *group = as_a<clobber_group *> (tree.root ());
+	  if (auto *clobber = dyn_cast<clobber_info *> (def))
+	    {
+	      add_clobber (clobber, group);
+	      return;
+	    }
+	  prev = split_clobber_group (group, insn);
+	  next = prev->next_def ();
+	}
+      // COMPARISON is < 0 if DEF comes before ROOT or > 0 if DEF comes
+      // after ROOT.
+      else if (comparison < 0)
+	{
+	  next = first_def (root);
+	  prev = next->prev_def ();
+	}
+      else
+	{
+	  prev = last_def (root);
+	  next = prev->next_def ();
+	}
+    }
+
+  // See if we should merge CLOBBER with a neighboring clobber.
+  auto *clobber = dyn_cast<clobber_info *> (def);
+  auto *prev_clobber = safe_dyn_cast<clobber_info *> (prev);
+  auto *next_clobber = safe_dyn_cast<clobber_info *> (next);
+  // We shouldn't have consecutive clobber_groups.
+  gcc_checking_assert (!(clobber && prev_clobber && next_clobber));
+  if (clobber && prev_clobber)
+    append_clobber_to_group (clobber, need_clobber_group (prev_clobber));
+  else if (clobber && next_clobber)
+    prepend_clobber_to_group (clobber, need_clobber_group (next_clobber));
+  else if (root)
+    {
+      // If DEF comes before ROOT, insert DEF to ROOT's left,
+      // otherwise insert DEF to ROOT's right.
+      def_node *node = need_def_node (def);
+      def_splay_tree::insert_child (root, comparison >= 0, node);
+    }
+  if (prev)
+    insert_def_after (def, prev);
+  else
+    insert_def_before (def, next);
+}
+
+// Remove DEF from the function's list of definitions of DEF->resource ().
+// Also remove DEF from the associated splay tree, if there is one.
+void
+function_info::remove_def (def_info *def)
+{
+  def_info **head = &m_defs[def->regno () + 1];
+  def_info *first = *head;
+  gcc_checking_assert (first);
+  if (first->is_last_def ())
+    {
+      // DEF is the only definition of the resource.
+      gcc_checking_assert (first == def);
+      *head = nullptr;
+      def->clear_def_links ();
+      return;
+    }
+
+  // If CLOBBER belongs to a clobber_group that contains other clobbers
+  // too, then we need to update the clobber_group and the list, but any
+  // splay tree that contains the clobber_group is unaffected.
+  if (auto *clobber = dyn_cast<clobber_info *> (def))
+    if (clobber->is_in_group ())
+      {
+	clobber_group *group = clobber->group ();
+	if (group->first_clobber () != group->last_clobber ())
+	  {
+	    remove_clobber (clobber, group);
+	    return;
+	  }
+      }
+
+  // If we've created a splay tree for this resource, remove the entry
+  // for DEF.
+  def_info *last = first->last_def ();
+  if (def_splay_tree tree = last->splay_root ())
+    {
+      int comparison = lookup_def (tree, def->insn ());
+      gcc_checking_assert (comparison == 0);
+      tree.remove_root ();
+      last->set_splay_root (tree.root ());
+    }
+
+  // If the definition came between two clobbers, merge them into a single
+  // group.
+  auto *prev_clobber = safe_dyn_cast<clobber_info *> (def->prev_def ());
+  auto *next_clobber = safe_dyn_cast<clobber_info *> (def->next_def ());
+  if (prev_clobber && next_clobber)
+    merge_clobber_groups (prev_clobber, next_clobber, last);
+
+  remove_def_from_list (def);
+}
+
+// Require DEF to have a splay tree that contains all non-phi uses.
+void
+function_info::need_use_splay_tree (set_info *def)
+{
+  if (!def->m_use_tree)
+    for (use_info *use : def->all_insn_uses ())
+      {
+	auto *use_node = allocate<splay_tree_node<use_info *>> (use);
+	def->m_use_tree.insert_max_node (use_node);
+      }
+}
+
+// Compare two instructions by their position in a use splay tree.  Return >0
+// if INSN1 comes after INSN2, <0 if INSN1 comes before INSN2, or 0 if they are
+// the same instruction.
+static inline int
+compare_use_insns (insn_info *insn1, insn_info *insn2)
+{
+  // Debug instructions go after nondebug instructions.
+  int diff = insn1->is_debug_insn () - insn2->is_debug_insn ();
+  if (diff != 0)
+    return diff;
+  return insn1->compare_with (insn2);
+}
+
+// Search TREE for a use in INSN.  If such a use exists, install it as
+// the root of TREE and return 0.  Otherwise arbitrarily choose between:
+//
+// (1) Installing the closest preceding use as the root and returning 1.
+// (2) Installing the closest following use as the root and returning -1.
+int
+rtl_ssa::lookup_use (splay_tree<use_info *> &tree, insn_info *insn)
+{
+  auto compare = [&](splay_tree_node<use_info *> *node)
+    {
+      return compare_use_insns (insn, node->value ()->insn ());
+    };
+  return tree.lookup (compare);
+}
+
+// Add USE to USE->def ()'s list of uses, inserting USE immediately before
+// BEFORE.  USE is not currently in the list.
+//
+// This routine should not be used for inserting phi uses.
+void
+function_info::insert_use_before (use_info *use, use_info *before)
+{
+  gcc_checking_assert (!use->has_use_links () && use->is_in_any_insn ());
+
+  set_info *def = use->def ();
+
+  use->copy_prev_from (before);
+  use->set_next_use (before);
+
+  if (use_info *prev = use->prev_use ())
+    prev->set_next_use (use);
+  else
+    use->def ()->set_first_use (use);
+
+  before->set_prev_use (use);
+  if (use->is_in_nondebug_insn () && before->is_in_debug_insn_or_phi ())
+    def->last_use ()->set_last_nondebug_insn_use (use);
+
+  gcc_checking_assert (use->check_integrity () && before->check_integrity ());
+}
+
+// Add USE to USE->def ()'s list of uses, inserting USE immediately after
+// AFTER.  USE is not currently in the list.
+//
+// This routine should not be used for inserting phi uses.
+void
+function_info::insert_use_after (use_info *use, use_info *after)
+{
+  set_info *def = use->def ();
+  gcc_checking_assert (after->is_in_any_insn ()
+		       && !use->has_use_links ()
+		       && use->is_in_any_insn ());
+
+  use->set_prev_use (after);
+  use->copy_next_from (after);
+
+  after->set_next_use (use);
+
+  if (use_info *next = use->next_use ())
+    {
+      // The last node doesn't change, but we might need to update its
+      // last_nondebug_insn_use record.
+      if (use->is_in_nondebug_insn () && next->is_in_debug_insn_or_phi ())
+	def->last_use ()->set_last_nondebug_insn_use (use);
+      next->set_prev_use (use);
+    }
+  else
+    {
+      // USE is now the last node.
+      if (use->is_in_nondebug_insn ())
+	use->set_last_nondebug_insn_use (use);
+      def->first_use ()->set_last_use (use);
+    }
+
+  gcc_checking_assert (use->check_integrity () && after->check_integrity ());
+}
+
+// If USE has a known definition, add USE to that definition's list of uses.
+// Also update the associated splay tree, if any.
+void
+function_info::add_use (use_info *use)
+{
+  gcc_checking_assert (!use->has_use_links ()
+		       && !use->m_is_temp
+		       && !use->m_has_been_superceded);
+
+  set_info *def = use->def ();
+  if (!def)
+    return;
+
+  use_info *first = def->first_use ();
+  if (!first)
+    {
+      // This is the only use of the definition.
+      use->set_last_use (use);
+      if (use->is_in_nondebug_insn ())
+	use->set_last_nondebug_insn_use (use);
+
+      def->set_first_use (use);
+
+      gcc_checking_assert (use->check_integrity ());
+      return;
+    }
+
+  if (use->is_in_phi ())
+    {
+      // Add USE at the end of the list, as the new first phi.
+      use_info *last = first->last_use ();
+
+      use->set_prev_use (last);
+      use->copy_next_from (last);
+
+      last->set_next_use (use);
+      first->set_last_use (use);
+
+      gcc_checking_assert (use->check_integrity ());
+      return;
+    }
+
+  // If there is currently no splay tree for this definition, see if we can
+  // get away with a pure list-based update.
+  insn_info *insn = use->insn ();
+  auto quick_path = [&]()
+    {
+      // Check if USE should come before all current uses.
+      if (first->is_in_phi () || compare_use_insns (insn, first->insn ()) < 0)
+	{
+	  insert_use_before (use, first);
+	  return true;
+	}
+
+      // Check if USE should come after all current uses in the same
+      // subsequence (i.e. the list of nondebug insn uses or the list
+      // of debug insn uses).
+      use_info *last = first->last_use ();
+      if (use->is_in_debug_insn ())
+	{
+	  if (last->is_in_phi ())
+	    return false;
+	}
+      else
+	last = last->last_nondebug_insn_use ();
+
+      if (compare_use_insns (insn, last->insn ()) > 0)
+	{
+	  insert_use_after (use, last);
+	  return true;
+	}
+
+      return false;
+    };
+  if (!def->m_use_tree && quick_path ())
+    return;
+
+  // Search the splay tree for an insertion point.  COMPARISON is less
+  // than zero if USE should come before NEIGHBOR, or greater than zero
+  // if USE should come after NEIGHBOR.
+  need_use_splay_tree (def);
+  int comparison = lookup_use (def->m_use_tree, insn);
+  gcc_checking_assert (comparison != 0);
+  splay_tree_node<use_info *> *neighbor = def->m_use_tree.root ();
+
+  // If USE comes before NEIGHBOR, insert USE to NEIGHBOR's left,
+  // otherwise insert USE to NEIGHBOR's right.
+  auto *use_node = allocate<splay_tree_node<use_info *>> (use);
+  def->m_use_tree.insert_child (neighbor, comparison > 0, use_node);
+  if (comparison > 0)
+    insert_use_after (use, neighbor->value ());
+  else
+    insert_use_before (use, neighbor->value ());
+}
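+
+// For example, suppose a definition's use list is currently
+// [i1, i3 | d1 | p1], where i1 and i3 are uses in nondebug instructions,
+// d1 is a use in a debug instruction and p1 is a phi use (a purely
+// hypothetical layout).  Adding a nondebug use in an insn i4 that comes
+// after i3 hits the list-based quick path above, giving
+// [i1, i3, i4 | d1 | p1], whereas adding a use in an insn i2 between
+// i1 and i3 falls through to the splay tree search.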
+
+// If USE has a known definition, remove USE from that definition's list
+// of uses.  Also remove if it from the associated splay tree, if any.
+void
+function_info::remove_use (use_info *use)
+{
+  set_info *def = use->def ();
+  if (!def)
+    return;
+
+  // Remove USE from the splay tree.
+  if (def->m_use_tree && use->is_in_any_insn ())
+    {
+      int comparison = lookup_use (def->m_use_tree, use->insn ());
+      gcc_checking_assert (comparison == 0);
+      def->m_use_tree.remove_root ();
+    }
+
+  use_info *prev = use->prev_use ();
+  use_info *next = use->next_use ();
+
+  use_info *first = def->first_use ();
+  use_info *last = first->last_use ();
+  if (last->last_nondebug_insn_use () == use)
+    last->set_last_nondebug_insn_use (prev);
+
+  if (next)
+    next->copy_prev_from (use);
+  else
+    first->set_last_use (prev);
+
+  if (prev)
+    prev->copy_next_from (use);
+  else
+    def->set_first_use (next);
+
+  use->clear_use_links ();
+  gcc_checking_assert ((!prev || prev->check_integrity ())
+		       && (!next || next->check_integrity ()));
+}
+
+// Allocate a temporary clobber_info for register REGNO in insn INSN,
+// including it in the region of the obstack governed by WATERMARK.
+// Return a new def_array that contains OLD_DEFS and the new clobber.
+//
+// OLD_DEFS is known not to define REGNO.
+def_array
+function_info::insert_temp_clobber (obstack_watermark &watermark,
+				    insn_info *insn, unsigned int regno,
+				    def_array old_defs)
+{
+  gcc_checking_assert (watermark == &m_temp_obstack);
+  auto *clobber = allocate_temp<clobber_info> (insn, regno);
+  clobber->m_is_temp = true;
+  return insert_access (watermark, clobber, old_defs);
+}
+
+// A subroutine of make_uses_available.  Try to make USE's definition
+// available at the head of BB.  On success:
+//
+// - If the use would have the same def () as USE, return USE.
+//
+// - If BB already has a degenerate phi for the same definition,
+//   return a temporary use of that phi.
+//
+// - Otherwise, the use would need a new degenerate phi.  Allocate a
+//   temporary phi and return a temporary use of it.
+//
+// Return null on failure.
+use_info *
+function_info::make_use_available (use_info *use, bb_info *bb)
+{
+  set_info *def = use->def ();
+  if (!def)
+    return use;
+
+  if (is_single_dominating_def (def))
+    return use;
+
+  // FIXME: Deliberately limited for fwprop compatibility testing.
+  basic_block cfg_bb = bb->cfg_bb ();
+  bb_info *use_bb = use->bb ();
+  if (single_pred_p (cfg_bb)
+      && single_pred (cfg_bb) == use_bb->cfg_bb ()
+      && remains_available_on_exit (def, use_bb))
+    {
+      if (def->ebb () == bb->ebb ())
+	return use;
+
+      resource_info resource = use->resource ();
+      set_info *ultimate_def = look_through_degenerate_phi (def);
+
+      // See if there is already a (degenerate) phi for DEF.
+      insn_info *phi_insn = bb->ebb ()->phi_insn ();
+      phi_info *phi;
+      def_lookup dl = find_def (resource, phi_insn);
+      if (set_info *set = dl.matching_set ())
+	{
+	  // There is an existing phi.
+	  phi = as_a<phi_info *> (set);
+	  gcc_checking_assert (phi->input_value (0) == ultimate_def);
+	}
+      else
+	{
+	  // Create a temporary placeholder phi.  This will become
+	  // permanent if the change is later committed.
+	  phi = allocate_temp<phi_info> (phi_insn, resource, 0);
+	  auto *input = allocate<use_info> (phi, resource, ultimate_def);
+	  input->m_is_temp = true;
+	  phi->m_is_temp = true;
+	  phi->make_degenerate (input);
+	  phi->set_prev_def (dl.prev_def ());
+	  phi->set_next_def (dl.next_def ());
+	}
+
+      // Create a temporary use of the phi at the head of the first
+      // block, since we know for sure that it's available there.
+      insn_info *use_insn = bb->ebb ()->first_bb ()->head_insn ();
+      auto *new_use = allocate_temp<use_info> (use_insn, resource, phi);
+      new_use->m_is_temp = true;
+      return new_use;
+    }
+  return nullptr;
+}
+
+// See the comment above the declaration.
+use_array
+function_info::make_uses_available (obstack_watermark &watermark,
+				    use_array uses, bb_info *bb)
+{
+  unsigned int num_uses = uses.size ();
+  if (num_uses == 0)
+    return uses;
+
+  auto **new_uses = XOBNEWVEC (watermark, access_info *, num_uses);
+  for (unsigned int i = 0; i < num_uses; ++i)
+    {
+      use_info *use = make_use_available (uses[i], bb);
+      if (!use)
+	return use_array (access_array::invalid ());
+      new_uses[i] = use;
+    }
+  return use_array (new_uses, num_uses);
+}
+
+// Return true if ACCESS1 can represent ACCESS2 and if ACCESS2 can
+// represent ACCESS1.
+static bool
+can_merge_accesses (access_info *access1, access_info *access2)
+{
+  if (access1 == access2)
+    return true;
+
+  auto *use1 = dyn_cast<use_info *> (access1);
+  auto *use2 = dyn_cast<use_info *> (access2);
+  return use1 && use2 && use1->def () == use2->def ();
+}
+
+// See the comment above the declaration.
+access_array
+rtl_ssa::merge_access_arrays_base (obstack_watermark &watermark,
+				   access_array accesses1,
+				   access_array accesses2)
+{
+  if (accesses1.empty ())
+    return accesses2;
+  if (accesses2.empty ())
+    return accesses1;
+
+  auto i1 = accesses1.begin ();
+  auto end1 = accesses1.end ();
+  auto i2 = accesses2.begin ();
+  auto end2 = accesses2.end ();
+
+  access_array_builder builder (watermark);
+  builder.reserve (accesses1.size () + accesses2.size ());
+
+  while (i1 != end1 && i2 != end2)
+    {
+      access_info *access1 = *i1;
+      access_info *access2 = *i2;
+
+      unsigned int regno1 = access1->regno ();
+      unsigned int regno2 = access2->regno ();
+      if (regno1 == regno2)
+	{
+	  if (!can_merge_accesses (access1, access2))
+	    return access_array::invalid ();
+
+	  builder.quick_push (access1);
+	  ++i1;
+	  ++i2;
+	}
+      else if (regno1 < regno2)
+	{
+	  builder.quick_push (access1);
+	  ++i1;
+	}
+      else
+	{
+	  builder.quick_push (access2);
+	  ++i2;
+	}
+    }
+  for (; i1 != end1; ++i1)
+    builder.quick_push (*i1);
+  for (; i2 != end2; ++i2)
+    builder.quick_push (*i2);
+
+  return builder.finish ();
+}
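+
+// For example, merging use arrays for registers {r1, r3} and {r2, r3}
+// (a hypothetical case, with entries written by register number) gives
+// {r1, r2, r3}, provided that the two accesses to r3 are uses of the
+// same set_info; if they are not, the merge fails and the invalid
+// array is returned.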
+
+// See the comment above the declaration.
+access_array
+rtl_ssa::insert_access_base (obstack_watermark &watermark,
+			     access_info *access1, access_array accesses2)
+{
+  access_array_builder builder (watermark);
+  builder.reserve (1 + accesses2.size ());
+
+  unsigned int regno1 = access1->regno ();
+  auto i2 = accesses2.begin ();
+  auto end2 = accesses2.end ();
+  while (i2 != end2)
+    {
+      access_info *access2 = *i2;
+
+      unsigned int regno2 = access2->regno ();
+      if (regno1 == regno2)
+	{
+	  if (!can_merge_accesses (access1, access2))
+	    return access_array::invalid ();
+
+	  builder.quick_push (access1);
+	  access1 = nullptr;
+	  ++i2;
+	  break;
+	}
+      else if (regno1 < regno2)
+	{
+	  builder.quick_push (access1);
+	  access1 = nullptr;
+	  break;
+	}
+      else
+	{
+	  builder.quick_push (access2);
+	  ++i2;
+	}
+    }
+  if (access1)
+    builder.quick_push (access1);
+  for (; i2 != end2; ++i2)
+    builder.quick_push (*i2);
+
+  return builder.finish ();
+}
+
+// See the comment above the declaration.
+access_array
+rtl_ssa::remove_note_accesses_base (obstack_watermark &watermark,
+				    access_array accesses)
+{
+  for (access_info *access : accesses)
+    if (access->only_occurs_in_notes ())
+      {
+	access_array_builder builder (watermark);
+	builder.reserve (accesses.size ());
+	for (access_info *access2 : accesses)
+	  if (!access2->only_occurs_in_notes ())
+	    builder.quick_push (access2);
+	return builder.finish ();
+      }
+  return accesses;
+}
+
+// Print RESOURCE to PP.
+void
+rtl_ssa::pp_resource (pretty_printer *pp, resource_info resource)
+{
+  resource.print (pp);
+}
+
+// Print ACCESS to PP.  FLAGS is a bitmask of PP_ACCESS_* flags.
+void
+rtl_ssa::pp_access (pretty_printer *pp, const access_info *access,
+		    unsigned int flags)
+{
+  if (!access)
+    pp_string (pp, "<null>");
+  else if (auto *phi = dyn_cast<const phi_info *> (access))
+    phi->print (pp, flags);
+  else if (auto *set = dyn_cast<const set_info *> (access))
+    set->print (pp, flags);
+  else if (auto *clobber = dyn_cast<const clobber_info *> (access))
+    clobber->print (pp, flags);
+  else if (auto *use = dyn_cast<const use_info *> (access))
+    use->print (pp, flags);
+  else
+    pp_string (pp, "??? Unknown access");
+}
+
+// Print ACCESSES to PP.  FLAGS is a bitmask of PP_ACCESS_* flags.
+void
+rtl_ssa::pp_accesses (pretty_printer *pp, access_array accesses,
+		      unsigned int flags)
+{
+  if (accesses.empty ())
+    pp_string (pp, "none");
+  else
+    {
+      bool is_first = true;
+      for (access_info *access : accesses)
+	{
+	  if (is_first)
+	    is_first = false;
+	  else
+	    pp_newline_and_indent (pp, 0);
+	  pp_access (pp, access, flags);
+	}
+    }
+}
+
+// Print NODE to PP.
+void
+rtl_ssa::pp_def_node (pretty_printer *pp, const def_node *node)
+{
+  if (!node)
+    pp_string (pp, "<null>");
+  else if (auto *group = dyn_cast<const clobber_group *> (node))
+    group->print (pp);
+  else if (auto *set = dyn_cast<const set_node *> (node))
+    set->print (pp);
+  else
+    pp_string (pp, "??? Unknown def node");
+}
+
+// Print MUX to PP.
+void
+rtl_ssa::pp_def_mux (pretty_printer *pp, def_mux mux)
+{
+  if (auto *node = mux.dyn_cast<def_node *> ())
+    pp_def_node (pp, node);
+  else
+    pp_access (pp, mux.as_a<def_info *> ());
+}
+
+// Print DL to PP.
+void
+rtl_ssa::pp_def_lookup (pretty_printer *pp, def_lookup dl)
+{
+  pp_string (pp, "comparison result of ");
+  pp_decimal_int (pp, dl.comparison);
+  pp_string (pp, " for ");
+  pp_newline_and_indent (pp, 0);
+  pp_def_mux (pp, dl.mux);
+}
+
+// Dump RESOURCE to FILE.
+void
+dump (FILE *file, resource_info resource)
+{
+  dump_using (file, pp_resource, resource);
+}
+
+// Dump ACCESS to FILE.  FLAGS is a bitmask of PP_ACCESS_* flags.
+void
+dump (FILE *file, const access_info *access, unsigned int flags)
+{
+  dump_using (file, pp_access, access, flags);
+}
+
+// Dump ACCESSES to FILE.  FLAGS is a bitmask of PP_ACCESS_* flags.
+void
+dump (FILE *file, access_array accesses, unsigned int flags)
+{
+  dump_using (file, pp_accesses, accesses, flags);
+}
+
+// Print NODE to FILE.
+void
+dump (FILE *file, const def_node *node)
+{
+  dump_using (file, pp_def_node, node);
+}
+
+// Print MUX to FILE.
+void
+dump (FILE *file, def_mux mux)
+{
+  dump_using (file, pp_def_mux, mux);
+}
+
+// Print RESULT to FILE.
+void
+dump (FILE *file, def_lookup result)
+{
+  dump_using (file, pp_def_lookup, result);
+}
+
+// Debug interfaces to the dump routines above.
+void debug (const resource_info &x) { dump (stderr, x); }
+void debug (const access_info *x) { dump (stderr, x); }
+void debug (const access_array &x) { dump (stderr, x); }
+void debug (const def_node *x) { dump (stderr, x); }
+void debug (const def_mux &x) { dump (stderr, x); }
+void debug (const def_lookup &x) { dump (stderr, x); }
diff --git a/gcc/rtl-ssa/accesses.h b/gcc/rtl-ssa/accesses.h
new file mode 100644
index 00000000000..fdb4a646183
--- /dev/null
+++ b/gcc/rtl-ssa/accesses.h
@@ -0,0 +1,1032 @@
+// Access-related classes for RTL SSA                               -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+namespace rtl_ssa {
+
+// Forward declarations.
+class bb_info;
+class clobber_group;
+class def_node;
+class ebb_info;
+class insn_info;
+class phi_info;
+class set_info;
+
+// Used as a boolean argument to certain routines.
+enum class ignore_clobbers { NO, YES };
+
+// Represents something that the SSA form tracks: either a register
+// or memory.
+class resource_info
+{
+public:
+  // Return true if this resource represents memory.
+  bool is_mem () const { return regno == MEM_REGNO; }
+
+  // Return true if this resource represents a register.
+  bool is_reg () const { return regno != MEM_REGNO; }
+
+  // Print the name of the resource to PP.
+  void print_identifier (pretty_printer *pp) const;
+
+  // Possibly print additional information about the resource to PP.
+  void print_context (pretty_printer *pp) const;
+
+  // A combination of print_identifier and print_context.
+  void print (pretty_printer *pp) const;
+
+  // The mode with which the resource is being defined or used.  This is
+  // always BLKmode for memory.  It can also be BLKmode for registers if
+  // we don't yet know the real mode, or if the mode is not relevant for
+  // some reason.
+  machine_mode mode;
+
+  // The pseudo register or single hard register that the resource represents,
+  // or MEM_REGNO for memory.
+  unsigned int regno;
+};
+
+// For simplicity, we treat memory as a single unified entity.
+const resource_info memory = { E_BLKmode, MEM_REGNO };
+
+// Flags used when printing access_infos.
+//
+// Print the location at which the access occurs.  This is redundant
+// when the access is being printed as part of the instruction or phi node
+// that contains the access.
+const unsigned int PP_ACCESS_INCLUDE_LOCATION = 1U << 0;
+//
+// Print links to other accesses: the definition that defines a use,
+// the uses of a definition, and the inputs of a phi node.
+const unsigned int PP_ACCESS_INCLUDE_LINKS = 1U << 1;
+//
+// Print additional properties about the access.
+const unsigned int PP_ACCESS_INCLUDE_PROPERTIES = 1U << 2;
+//
+// The usual flags when printing an access in isolation.
+const unsigned int PP_ACCESS_DEFAULT = (PP_ACCESS_INCLUDE_LOCATION
+					| PP_ACCESS_INCLUDE_LINKS
+					| PP_ACCESS_INCLUDE_PROPERTIES);
+//
+// The usual flags when printing a def_info from its defining instruction.
+const unsigned int PP_ACCESS_SETTER = (PP_ACCESS_INCLUDE_LINKS
+				       | PP_ACCESS_INCLUDE_PROPERTIES);
+//
+// The usual flags when printing a use_info from its user.
+const unsigned int PP_ACCESS_USER = PP_ACCESS_INCLUDE_PROPERTIES;
+
+// The various ways of accessing a resource.  The two range checks that
+// we need to perform are [SET, PHI] (for set_info) and [SET, CLOBBER]
+// (for def_info), so the ordering tries to make those tests as
+// efficient as possible.
+enum class access_kind : uint8_t
+{
+  // Set the resource to a useful value.
+  SET,
+
+  // A form of SET that collects the possible incoming values of the
+  // resource using a phi node; the resource does not actually change value.
+  PHI,
+
+  // Set the resource to a value that is both unknown and not useful.
+  CLOBBER,
+
+  // Use the current value of the resource.
+  USE
+};
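+
+// An illustrative sketch (not actual interface code) of the range
+// checks described above, for an access_kind value KIND:
+//
+//   bool is_set_like = (kind >= access_kind::SET
+//                       && kind <= access_kind::PHI);
+//   bool is_def_like = (kind >= access_kind::SET
+//                       && kind <= access_kind::CLOBBER);
+//
+// Since SET is the first enumerator, each test reduces to a single
+// comparison on the underlying value.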
+
+// A base class that represents an access to a resource.
+class access_info
+{
+  // Size: 1 LP64 word
+  friend class function_info;
+
+public:
+  // Return the resource that is being accessed.
+  resource_info resource () const { return { m_mode, m_regno }; }
+
+  // Return true if the access is to memory.
+  bool is_mem () const { return m_regno == MEM_REGNO; }
+
+  // Return true if the access is to a register.
+  bool is_reg () const { return m_regno != MEM_REGNO; }
+
+  // If the access is to a register, return the register number,
+  // otherwise return MEM_REGNO.
+  unsigned int regno () const { return m_regno; }
+
+  // For sets, return the mode of the value to which the resource is being set.
+  // For uses, return the mode in which the resource is being used (which for
+  // hard registers might be different from the mode in which the resource
+  // was set).
+  //
+  // When accessing memory, the mode is always BLKmode.  When accessing
+  // pseudo registers, the mode is always the mode of the pseudo register
+  // (and so doesn't, for example, take subregs into account).
+  machine_mode mode () const { return m_mode; }
+
+  // Return the kind of access that this is.
+  access_kind kind () const { return m_kind; }
+
+  // Return true if the access occurs in a phi node or an "artificial"
+  // instruction (see insn_info), false if it occurs in a real instruction.
+  bool is_artificial () const { return m_is_artificial; }
+
+  // Return the opposite of is_artificial.
+  bool is_real () const { return !m_is_artificial; }
+
+  // Return true if this access is a set_info whose result is used by at least
+  // one nondebug instruction.
+  bool is_set_with_nondebug_insn_uses () const;
+
+  // Return true if the access describes a set_info and if the value
+  // is defined by an RTX_AUTOINC rtx.
+  bool is_pre_post_modify () const { return m_is_pre_post_modify; }
+
+  // Return true if the access is a clobber_info that describes the effect
+  // of a called function.  This kind of clobber is added for -fipa-ra
+  // functions that clobber only a strict subset of the normal ABI set.
+  bool is_call_clobber () const { return m_is_call_clobber; }
+
+  // Return true if the access is a use_info that simply marks a point in
+  // the live range of a set_info at which the value is live out from
+  // the containing EBB.
+  bool is_live_out_use () const { return m_is_live_out_use; }
+
+  // Return true if the access is a use_info for an instruction and if
+  // at least some of the uses occur within a MEM address.
+  //
+  // There shouldn't be a need to check whether *all* uses occur within
+  // a MEM address, since in principle:
+  //
+  // A: (set (reg:SI R1) (mem:SI (post_inc:SI (reg:SI R2))))
+  //
+  // should be semantically equivalent to:
+  //
+  // B: (parallel [(set (reg:SI R1) (mem:SI (reg:SI R2)))
+  //               (set (reg:SI R2) (plus:SI (reg:SI R2) (const_int 4)))])
+  //
+  // even though R2 occurs only in MEMs for A but occurs outside MEMs for B.
+  bool includes_address_uses () const { return m_includes_address_uses; }
+
+  // Return true if the access occurs in an instruction and if at least
+  // some accesses to resource () occur in a read-modify-write context.
+  // This is equivalent to the DF_REF_READ_WRITE flag.
+  bool includes_read_writes () const { return m_includes_read_writes; }
+
+  // Return true if the access occurs in an instruction and if at least
+  // some accesses to resource () occur in a subreg context.
+  bool includes_subregs () const { return m_includes_subregs; }
+
+  // Return true if the access occurs in an instruction and if at least
+  // some accesses to resource () occur in a multi-register REG.
+  // This implies that resource () is a hard register.
+  bool includes_multiregs () const { return m_includes_multiregs; }
+
+  // Return true if the access occurs in a real nondebug instruction
+  // and if all accesses to resource () occur in notes, rather than
+  // in the main instruction pattern.
+  bool only_occurs_in_notes () const { return m_only_occurs_in_notes; }
+
+protected:
+  access_info (resource_info, access_kind);
+
+  void print_prefix_flags (pretty_printer *) const;
+  void print_properties_on_new_lines (pretty_printer *) const;
+
+private:
+  void set_mode (machine_mode mode) { m_mode = mode; }
+
+  // The values returned by the accessors above.
+  unsigned int m_regno;
+  access_kind m_kind : 8;
+
+protected:
+  // The value returned by the accessors above.
+  unsigned int m_is_artificial : 1;
+  unsigned int m_is_set_with_nondebug_insn_uses : 1;
+  unsigned int m_is_pre_post_modify : 1;
+  unsigned int m_is_call_clobber : 1;
+  unsigned int m_is_live_out_use : 1;
+  unsigned int m_includes_address_uses : 1;
+  unsigned int m_includes_read_writes : 1;
+  unsigned int m_includes_subregs : 1;
+  unsigned int m_includes_multiregs : 1;
+  unsigned int m_only_occurs_in_notes : 1;
+
+  // True if this access is a use_insn that occurs in a nondebug instruction,
+  // and if there are no following uses by nondebug instructions.  The next use
+  // is null, a use_info for a debug instruction, or a use_info for a phi node.
+  //
+  // Providing this helps to optimize use_info::next_nondebug_insn_use.
+  unsigned int m_is_last_nondebug_insn_use : 1;
+
+  // True if this access is a use_info for a debug instruction or
+  // a phi node.
+  unsigned int m_is_in_debug_insn_or_phi : 1;
+
+private:
+  // Used as a flag during various update routines; has no long-lasting
+  // meaning.
+  unsigned int m_has_been_superceded : 1;
+
+  // Indicates that this access has been allocated on the function_info's
+  // temporary obstack and so is not (yet) part of the proper SSA form.
+  unsigned int m_is_temp : 1;
+
+  // Bits for future expansion.
+  unsigned int m_spare : 2;
+
+  // The value returned by the accessor above.
+  machine_mode m_mode : 8;
+};
+
+// A contiguous array of access_info pointers.  Used to represent a
+// (mostly small) number of definitions and/or uses.
+using access_array = array_slice<access_info *const>;
+
+// A class for building an access_array on an obstack.  It automatically
+// frees any in-progress array if the build attempt fails before finish ()
+// has been called.
+class access_array_builder : public obstack_watermark
+{
+public:
+  using obstack_watermark::obstack_watermark;
+
+  // Make sure that the array has enough for NUM_ACCESSES accesses.
+  void reserve (unsigned int num_accesses);
+
+  // Add ACCESS to the end of the array that we're building, given that
+  // reserve () has already made room.
+  void quick_push (access_info *access);
+
+  // Finish and return the new array.  The array survives the destruction
+  // of the builder.
+  array_slice<access_info *> finish ();
+};
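+
+// A usage sketch for the builder, assuming an obstack_watermark
+// WATERMARK and two accesses A and B (the names are illustrative):
+//
+//   access_array_builder builder (watermark);
+//   builder.reserve (2);
+//   builder.quick_push (a);
+//   builder.quick_push (b);
+//   access_array accesses = builder.finish ();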
+
+// An access_info that represents the use of a resource in either a phi node
+// or an instruction.  It records which set_info (if any) provides the
+// resource's value.
+class use_info : public access_info
+{
+  // Overall size: 5 LP64 words.
+  friend class set_info;
+  friend class function_info;
+
+public:
+  // Return true if the access occurs in an instruction rather than a phi node.
+  // The instruction might be a debug instruction or a nondebug instruction.
+  bool is_in_any_insn () const { return m_insn_or_phi.is_first (); }
+
+  // Return true if the access occurs in a nondebug instruction,
+  // false if it occurs in a debug instruction or a phi node.
+  bool is_in_nondebug_insn () const { return !m_is_in_debug_insn_or_phi; }
+
+  // Return true if the instruction occurs in a debug instruction.
+  bool is_in_debug_insn () const;
+
+  // Return true if the access occurs in a phi node rather than in an
+  // instruction.
+  bool is_in_phi () const { return m_insn_or_phi.is_second (); }
+
+  // Return true if the access occurs in a debug instruction or a phi node,
+  // false if it occurs in a nondebug instruction.
+  bool is_in_debug_insn_or_phi () const { return m_is_in_debug_insn_or_phi; }
+
+  // Return the instruction that uses the resource.  Only valid if
+  // is_in_any_insn ().
+  insn_info *insn () const { return m_insn_or_phi.known_first (); }
+
+  // Return the phi node that uses the resource.  Only valid if is_in_phi ().
+  phi_info *phi () const { return m_insn_or_phi.known_second (); }
+
+  // Return the basic block that contains the access.
+  bb_info *bb () const;
+
+  // Return the extended basic block that contains the access.
+  ebb_info *ebb () const;
+
+  // Return the set_info whose result the access uses, or null if the
+  // value of the resource is completely undefined.
+  //
+  // The value is undefined if the use is completely upwards exposed
+  // (i.e. has no preceding definition) or if the preceding definition
+  // is a clobber rather than a set.
+  //
+  // The mode of the definition can be different from the mode of the use;
+  // for example, a hard register might be set in DImode and used in SImode.
+  set_info *def () const { return m_def; }
+
+  // Return the previous and next uses of the definition.  See set_info
+  // for details about the ordering.
+  //
+  // These routines are only meaningful when def () is nonnull.
+  use_info *prev_use () const;
+  use_info *next_use () const;
+
+  // Return the next use by a nondebug instruction, or null if none.
+  //
+  // This is only valid if is_in_nondebug_insn ().  It is equivalent to,
+  // but more efficient than:
+  //
+  //    next_use () && next_use ()->is_in_nondebug_insn ()
+  //    ? next_use () : nullptr
+  use_info *next_nondebug_insn_use () const;
+
+  // Return the next use by an instruction, or null if none.  The use might
+  // be by a debug instruction or a nondebug instruction.
+  //
+  // This is only valid if is_in_any_insn ().  It is equivalent to:
+  //
+  //    next_use () && next_use ()->is_in_any_insn () ? next_use () : nullptr
+  use_info *next_any_insn_use () const;
+
+  // Return the previous use by a phi node in the list, or null if none.
+  //
+  // This is only valid if is_in_phi ().  It is equivalent to:
+  //
+  //    prev_use () && prev_use ()->is_in_phi () ? prev_use () : nullptr
+  use_info *prev_phi_use () const;
+
+  // Return true if this is the first use of the definition.  See set_info
+  // for details about the ordering.
+  //
+  // This routine is only meaningful when def () is nonnull.
+  bool is_first_use () const;
+
+  // Return true if this is the last use of the definition.  See set_info
+  // for details about the ordering.
+  //
+  // This routine is only meaningful when def () is nonnull.
+  bool is_last_use () const;
+
+  // Print a description of def () to PP.
+  void print_def (pretty_printer *pp) const;
+
+  // Print a description of the location of the use to PP.
+  void print_location (pretty_printer *pp) const;
+
+  // Print a description of the use to PP under the control of
+  // PP_ACCESS_* flags FLAGS.
+  void print (pretty_printer *pp,
+	      unsigned int flags = PP_ACCESS_DEFAULT) const;
+
+private:
+  // If we only create a set_info splay tree for sets that are used by
+  // three instructions or more, then only about 16% of uses need to be in
+  // a splay tree.  It is therefore more memory-efficient to use separate
+  // nodes for the splay tree, instead of storing the child nodes
+  // directly in the use_info.
+
+  // Make insn_info the first (and thus directly-encoded) choice since
+  // insn () is read much more often than phi ().
+  using insn_or_phi = pointer_mux<insn_info, phi_info>;
+
+  // The use belongs to a list that is partitioned into three sections:
+  //
+  // (1) all uses in nondebug instructions, in reverse postorder
+  //
+  // (2) all uses in debug instructions, in reverse postorder
+  //
+  // (3) all phi nodes, in no particular order.
+  //
+  // In order to preserve memory:
+  //
+  // - The set_info just has a pointer to the first use.
+  //
+  // - The first use's "prev" pointer points to the last use.
+  //
+  // - The last use's "next" pointer points to the last use in a nondebug
+  //   instruction, or null if there are no such uses.
+  using last_use_or_prev_use = pointer_mux<use_info>;
+  using last_nondebug_insn_use_or_next_use = pointer_mux<use_info>;
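+
+  // For example, in a list [i1, i2 | d1 | p1] with two uses in nondebug
+  // instructions, one use in a debug instruction and one phi use
+  // (a hypothetical layout), i1's "prev" mux holds the last use p1,
+  // p1's "next" mux holds the last nondebug insn use i2, and all other
+  // links are ordinary prev/next pointers.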
+
+  use_info (insn_or_phi, resource_info, set_info *);
+
+  use_info *last_use () const;
+  use_info *last_nondebug_insn_use () const;
+  bool calculate_is_last_nondebug_insn_use () const;
+
+  void record_reference (rtx_obj_reference, bool);
+  void set_insn (insn_info *);
+  void set_def (set_info *set) { m_def = set; }
+  void set_is_live_out_use (bool value) { m_is_live_out_use = value; }
+  void copy_prev_from (use_info *);
+  void copy_next_from (use_info *);
+  void set_last_use (use_info *);
+  void set_prev_use (use_info *);
+  void set_last_nondebug_insn_use (use_info *);
+  void set_next_use (use_info *);
+  void clear_use_links ();
+  bool has_use_links ();
+  bool check_integrity ();
+
+  // The location of the use.
+  insn_or_phi m_insn_or_phi;
+
+  // The overloaded "prev" and "next" pointers, as described above.
+  last_use_or_prev_use m_last_use_or_prev_use;
+  last_nondebug_insn_use_or_next_use m_last_nondebug_insn_use_or_next_use;
+
+  // The value of def ().
+  set_info *m_def;
+};
+
+// Iterators for lists of uses.
+using use_iterator = list_iterator<use_info, &use_info::next_use>;
+using reverse_use_iterator = list_iterator<use_info, &use_info::prev_use>;
+
+// Like use_iterator, but specifically for uses by nondebug instructions,
+// uses by any kind of instruction, and uses by phi nodes respectively.
+// These iterators allow a nullptr end point even if there are other types
+// of use in the same definition.
+using nondebug_insn_use_iterator
+  = list_iterator<use_info, &use_info::next_nondebug_insn_use>;
+using any_insn_use_iterator
+  = list_iterator<use_info, &use_info::next_any_insn_use>;
+using phi_use_iterator = list_iterator<use_info, &use_info::prev_phi_use>;
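+
+// A usage sketch: walking only the uses of a set_info *DEF that occur
+// in nondebug instructions (process () is a placeholder):
+//
+//   for (use_info *use : def->nondebug_insn_uses ())
+//     process (use->insn ());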
+
+// A view of an access_array in which every entry is known to be a use_info.
+using use_array = const_derived_container<use_info *, access_array>;
+
+// An access_info that describes a definition of a resource.  The definition
+// can be a set or a clobber; the difference is that a set provides a known
+// and potentially useful value, while a clobber provides an unknown and
+// unusable value.
+//
+// Every definition is associated with an insn_info.  All definitions of
+// a given resource are stored in a linked list, maintained in reverse
+// postorder.
+class def_info : public access_info
+{
+  // Overall size: 4 LP64 words
+  friend class function_info;
+  friend class clobber_group;
+
+public:
+  // Return the instruction that contains the definition.
+  insn_info *insn () const { return m_insn; }
+
+  // Return the basic block that contains the definition.
+  bb_info *bb () const;
+
+  // Return the extended basic block that contains the access.
+  ebb_info *ebb () const;
+
+  // Return the previous and next definitions of the same resource,
+  // in reverse postorder, or null if no such definition exists.
+  def_info *prev_def () const;
+  def_info *next_def () const;
+
+  // Return true if this is the first definition in the list.
+  bool is_first_def () const;
+
+  // Return true if this is the last definition in the list.
+  bool is_last_def () const;
+
+  // Print the location of the definition to PP.
+  void print_location (pretty_printer *pp) const;
+
+  // Print a unique identifier for this definition to PP.  The identifier has
+  // the form <resource>:<insn uid>.
+  void print_identifier (pretty_printer *pp) const;
+
+protected:
+  def_info (insn_info *insn, resource_info resource, access_kind kind);
+
+private:
+  // In order to preserve memory, the list head only points to the first
+  // definition in the list.  The "prev" entry of the first definition
+  // then points to the last definition.
+  using last_def_or_prev_def = pointer_mux<def_info>;
+
+  // For similar memory-saving reasons, if we want to create a splay tree
+  // of accesses to a resource, we hang the root off the "next" entry of
+  // the last definition in the list.
+  using splay_root_or_next_def = pointer_mux<def_node, def_info>;
+
+  void set_insn (insn_info *insn) { m_insn = insn; }
+
+  def_info *last_def () const;
+  def_node *splay_root () const;
+
+  void record_reference (rtx_obj_reference, bool);
+  void copy_prev_from (def_info *);
+  void copy_next_from (def_info *);
+  void set_last_def (def_info *);
+  void set_prev_def (def_info *);
+  void set_splay_root (def_node *);
+  void set_next_def (def_info *);
+  void clear_def_links ();
+  bool has_def_links ();
+
+  // The location of the definition.
+  insn_info *m_insn;
+
+  // The overloaded "prev" and "next" pointers, as described above.
+  last_def_or_prev_def m_last_def_or_prev_def;
+  splay_root_or_next_def m_splay_root_or_next_def;
+};
+
+// Iterators for lists of definitions.
+using def_iterator = list_iterator<def_info, &def_info::next_def>;
+using reverse_def_iterator = list_iterator<def_info, &def_info::prev_def>;
+
+// A view of an access_array in which every entry is known to be a
+// def_info.
+using def_array = const_derived_container<def_info *, access_array>;
+
+// A def_info that sets the resource to a value that is both
+// unknown and not useful.  This is only ever used for registers,
+// since memory always has some useful contents.
+//
+// Neighboring clobbers are grouped into clobber_groups, so that it's
+// possible to skip over all neighboring clobbers in a single step.
+class clobber_info : public def_info
+{
+  // Overall size: 8 LP64 words
+  friend class default_splay_tree_accessors<clobber_info *>;
+  friend class default_splay_tree_accessors_with_parent<clobber_info *>;
+  friend class function_info;
+  friend class clobber_group;
+
+public:
+  using splay_tree = default_rootless_splay_tree<clobber_info *>;
+
+  // Return true if the clobber belongs to a clobber_group, false if it
+  // is standalone.
+  bool is_in_group () const { return m_group; }
+
+  // Return the group that the clobber is in, or null if none.
+  //
+  // Complexity: amortized O(1), worst case O(N), where N is the number
+  // of clobbers in the containing clobber_group.
+  clobber_group *group () const;
+
+  // Print a description of the clobber to PP under the control of
+  // PP_ACCESS_* flags FLAGS.
+  void print (pretty_printer *pp,
+	      unsigned int flags = PP_ACCESS_DEFAULT) const;
+
+private:
+  // Once normal call clobbers are taken out of the equation by
+  // insn_call_clobbers_notes, clobber_infos account for roughly 6% of all
+  // def_infos, with the rest being set_infos.  clobber_infos are
+  // therefore much less size-sensitive than set_infos are.
+  //
+  // As noted above, we want to group neighboring clobbers together so that
+  // we can quickly step over them to find the previous or next "real" set.
+  // We also want to be able to split the group in sublinear time,
+  // for example when inserting a set/use pair between two clobbers
+  // in a group.
+  //
+  // So:
+  //
+  // - Clobbers need to have ready access to their group, so that we
+  //   can cheaply skip over the whole group.  This means that they
+  //   need a group pointer.
+  //
+  // - We need to be able to update the group pointer lazily, so that
+  //   the cost of updating it is counted against accesses to the clobbers
+  //   that need updating.
+  //
+  // We also want to be able to insert clobbers into a group in
+  // amortized logarithmic time.
+  //
+  // We therefore use a splay tree to represent the clobbers in a group,
+  // with the nodes storing their parent node.  It is then possible to
+  // perform splay operations without first getting hold of the root.
+  // The root of the splay tree always has a valid, up-to-date group,
+  // so lazy group updates can get the new group from there.
+  //
+  // Roughly 90% of clobbers have a neighboring definition in the same
+  // block, which means that most need to be stored in a splay tree.
+  // We therefore store the splay tree fields directly in the clobber_info
+  // rather than using a separate node object.
+
+  clobber_info (insn_info *, unsigned int);
+
+  void set_group (clobber_group *group) { m_group = group; }
+  void update_group (clobber_group *);
+  clobber_group *recompute_group ();
+
+  // The child and parent nodes in the splay tree.
+  clobber_info *m_children[2];
+  clobber_info *m_parent;
+
+  // The last known value of group (), which might now be out of date.
+  clobber_group *m_group;
+};
+
+using clobber_tree = clobber_info::splay_tree::rooted;
+
+// A def_info that sets the resource to a useful value.  It records
+// all uses of the value in a linked list.  The list is partitioned
+// into three sections:
+//
+// (1) all uses by nondebug instructions, in reverse postorder, followed by
+// (2) all uses by debug instructions, in reverse postorder, followed by
+// (3) all uses by phi nodes, in no particular order.
+//
+// There are two cases:
+//
+// - If we know in advance that there is a single definition of a resource R
+//   and therefore decide not to use phi nodes for R, (1) and (2) contain
+//   all uses of R, regardless of which blocks contain the uses.  (3) is
+//   then empty.
+//
+// - Otherwise, (1) only contains uses in the same extended basic block
+//   as the definition, and it is terminated by a use that marks the end
+//   of the live range for the EBB.  In other words, if the resource dies
+//   in the EBB, the last use by a nondebug instruction marks the point at
+//   which it dies, otherwise there is a fake live-out use at the end of
+//   the EBB.
+//
+// Since debug instructions should not affect codegen, they opportunistically
+// attach to the same set_info as nondebug instructions where possible.
+// If a nondebug instruction would attach to a degenerate phi and if no
+// such phi exists, debug instructions instead attach to whichever set_info
+// provides the value, regardless of where that set_info is.
+class set_info : public def_info
+{
+  // Overall size: 6 LP64 words.
+  friend class function_info;
+  using use_splay_tree = splay_tree<use_info *>;
+
+public:
+  // Return the first and last uses of the set, or null if the list is empty.
+  // See the comment above for details about the order.
+  use_info *first_use () const { return m_first_use; }
+  use_info *last_use () const;
+
+  // Return the first and last uses of the set by nondebug instructions,
+  // or null if there are no such uses.  The uses are in reverse postorder.
+  use_info *first_nondebug_insn_use () const;
+  use_info *last_nondebug_insn_use () const;
+
+  // Return the first use of the set by any kind of instruction, or null
+  // if there are no such uses.  The uses are in the order described above.
+  use_info *first_any_insn_use () const;
+
+  // Return the last use of the set by phi inputs, or null if there are no
+  // such uses.  The phi input uses are in no particular order.
+  use_info *last_phi_use () const;
+
+  // Return true if at least one nondebug instruction or phi node uses
+  // the set's result.  This is equivalent to testing whether the set is
+  // ever live.
+  bool has_nondebug_uses () const;
+
+  // Return true if anything uses the set's result.  Note that this includes
+  // uses by debug instructions, so it should not be used for optimization
+  // decisions.
+  bool has_any_uses () const { return m_first_use; }
+
+  // Return true if at least one nondebug instruction uses the set's result.
+  bool has_nondebug_insn_uses () const;
+
+  // Return true if at least one phi node uses the set's result.
+  bool has_phi_uses () const;
+
+  // Return true if the set and its uses are contained within a single
+  // extended basic block, with the set coming first.  This implies
+  // that all uses are by instructions rather than phi nodes.
+  bool is_local_to_ebb () const;
+
+  // List all the uses of the set, in the order described above.
+  iterator_range<use_iterator> all_uses () const;
+
+  // Return all_uses () in reverse order.
+  iterator_range<reverse_use_iterator> reverse_all_uses () const;
+
+  // List the uses of the set by nondebug instructions, in reverse postorder.
+  iterator_range<nondebug_insn_use_iterator> nondebug_insn_uses () const;
+
+  // Return nondebug_insn_uses () in reverse order.
+  iterator_range<reverse_use_iterator> reverse_nondebug_insn_uses () const;
+
+  // List the uses of the set by any kind of instruction.  The list follows
+  // the order described above.
+  iterator_range<any_insn_use_iterator> all_insn_uses () const;
+
+  // List the uses of the set by phi nodes, in no particular order.
+  // There is therefore no reversed equivalent of this list.
+  iterator_range<phi_use_iterator> phi_uses () const;
+
+  // Print a description of the set to PP under the control of
+  // PP_ACCESS_* flags FLAGS.
+  void print (pretty_printer *pp,
+	      unsigned int flags = PP_ACCESS_DEFAULT) const;
+
+protected:
+  set_info (insn_info *, resource_info, access_kind);
+
+  // Print information about uses () to PP, continuing information printed
+  // about the set itself.
+  void print_uses_on_new_lines (pretty_printer *pp) const;
+
+private:
+  // Sets (including phis) account for about 94% of all definitions.
+
+  set_info (insn_info *, resource_info);
+
+  void set_first_use (use_info *);
+
+  // The first use in the list.
+  use_info *m_first_use;
+
+  // The root of a splay tree of all uses, built lazily when we first
+  // think it's needed.
+  use_splay_tree m_use_tree;
+};
+
+// A set_info for an on-the-side phi node.  The phi node is attached
+// to an extended basic block EBB and has one input for each incoming edge.
+// The inputs are represented as an array of use_infos, with input I
+// corresponding to EDGE_PRED (EBB->first_bb ()->cfg_bb (), I).
+//
+// Each phi node has a densely-allocated unique identifier, which is intended
+// to be suitable for bitmaps or sbitmaps.
+//
+// All the phi nodes in an extended basic block are chained together
+// into a linked list.  The list has no particular order.
+class phi_info : public set_info
+{
+  // Overall size: 8 LP64 words
+  friend class function_info;
+
+public:
+  // Return the previous and next phi nodes in the extended basic block's list,
+  // or null if none.
+  phi_info *prev_phi () const { return m_prev_phi; }
+  phi_info *next_phi () const { return m_next_phi; }
+
+  // Return the number of phi inputs.  This is 1 for degenerate phis,
+  // otherwise it is equal to the number of incoming edges.
+  unsigned int num_inputs () const { return m_num_inputs; }
+
+  // Return true if the phi node is degenerate, i.e. if it has only a
+  // single input.
+  bool is_degenerate () const { return m_num_inputs == 1; }
+
+  // Return the phi node's unique identifier.
+  unsigned int uid () const { return m_uid; }
+
+  // Return the array of inputs.  For degenerate phi nodes, this array contains
+  // a single element, otherwise it has one input per incoming edge,
+  // with element E corresponding to incoming edge E.
+  use_array inputs () const;
+
+  // Return the use_info that describes the phi input for incoming edge E.
+  use_info *input_use (unsigned int e) const;
+
+  // Return the value of resource () on incoming edge E, or null if the
+  // value is completely undefined for that edge.
+  set_info *input_value (unsigned int e) const;
+
+  // Print a description of the phi node to PP under the control of
+  // PP_ACCESS_* flags FLAGS.
+  void print (pretty_printer *pp,
+	      unsigned int flags = PP_ACCESS_DEFAULT) const;
+
+private:
+  phi_info (insn_info *insn, resource_info resource, unsigned int uid);
+
+  void make_degenerate (use_info *);
+  void set_inputs (use_array inputs);
+  void set_prev_phi (phi_info *prev_phi) { m_prev_phi = prev_phi; }
+  void set_next_phi (phi_info *next_phi) { m_next_phi = next_phi; }
+  void clear_phi_links () { m_prev_phi = m_next_phi = nullptr; }
+  bool has_phi_links () { return m_prev_phi || m_next_phi; }
+
+  // The values returned by the accessors above.
+  unsigned int m_uid;
+  unsigned int m_num_inputs;
+  union
+  {
+    access_info *const *m_inputs;
+    access_info *m_single_input;
+  };
+  phi_info *m_prev_phi;
+  phi_info *m_next_phi;
+};
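+
+// A usage sketch: scanning the inputs of a phi_info *PHI for known
+// values (the loop body is a placeholder):
+//
+//   for (unsigned int e = 0; e < phi->num_inputs (); ++e)
+//     if (set_info *value = phi->input_value (e))
+//       ...;  // VALUE provides the resource on incoming edge E
+//
+// A null input_value indicates that the resource is completely
+// undefined for that edge.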
+
+// An iterator for lists of phi nodes.
+using phi_iterator = list_iterator<phi_info, &phi_info::next_phi>;
+
+// One node in a splay tree of definitions.  This base class represents
+// a single def_info, but it is structured to allow derived classes
+// to add a range.
+class def_node
+{
+  // Size: 3 LP64 words.
+  friend class function_info;
+  friend class default_splay_tree_accessors<def_node *>;
+
+public:
+  // Return the first definition that the node represents.
+  def_info *first_def () const;
+
+  // Return which type of access first_def () is.
+  bool contains_clobber () const { return m_clobber_or_set.is_first (); }
+  bool contains_set () const { return m_clobber_or_set.is_second (); }
+
+protected:
+  // More nodes are clobbers rather than sets, so put clobbers first.
+  // Neither choice can be null.
+  using clobber_or_set = pointer_mux<clobber_info, set_info>;
+
+  // Construct a node that represents FIRST_DEF (and possibly later
+  // definitions too, if called from a derived class).
+  def_node (clobber_or_set first_def);
+
+  // The first definition in the node.
+  clobber_or_set m_clobber_or_set;
+
+private:
+  // The splay tree child nodes.
+  def_node *m_children[2];
+};
+
+// One node in a splay tree of def_infos, representing a single set_info.
+class set_node : public def_node
+{
+  // Overall size: 3 LP64 words.
+  friend class function_info;
+
+public:
+  // Return the set that the node contains.
+  set_info *set () const { return m_clobber_or_set.known_second (); }
+
+  // Print a description of the node to PP.
+  void print (pretty_printer *pp) const;
+
+private:
+  // Construct a node for SET.
+  set_node (set_info *set) : def_node (set) {}
+};
+
+// One node in a splay tree of def_infos.  This class represents
+// a list of contiguous clobber_infos, in execution order.
+class clobber_group : public def_node
+{
+  // Overall size: 5 LP64 words.
+  friend class function_info;
+
+public:
+  // Return the first and last clobbers in the group.  The results are
+  // always nonnull.
+  clobber_info *first_clobber () const;
+  clobber_info *last_clobber () const { return m_last_clobber; }
+
+  // Return true if this group has been replaced by new clobber_groups.
+  bool has_been_superceded () const { return !m_last_clobber; }
+
+  // Return a list of the clobbers in the group, in execution order.
+  iterator_range<def_iterator> clobbers () const;
+
+  // Print a description of the group to PP.
+  void print (pretty_printer *pp) const;
+
+private:
+  clobber_group (clobber_info *clobber);
+
+  // Set the values of first_clobber () and last_clobber ().
+  void set_first_clobber (clobber_info *c) { m_clobber_or_set = c; }
+  void set_last_clobber (clobber_info *c) { m_last_clobber = c; }
+
+  // The value returned by last_clobber ().
+  clobber_info *m_last_clobber;
+
+  // A splay tree that contains all the clobbers in the group.
+  // The root of the splay tree always has an up-to-date group
+  // pointer, but the other clobbers in the tree might not.
+  clobber_tree m_clobber_tree;
+};
+
+// A splay tree in which one node represents a standalone set_info or a
+// range of consecutive clobber_infos.  The nodes follow execution order
+// and maintain the invariant that no two groups of clobber_infos appear
+// next to each other (instead, the groups are merged).
+using def_splay_tree = default_splay_tree<def_node *>;
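+
+// For example, a resource defined by the sequence
+//
+//   S1: set, C2: clobber, C3: clobber, S4: set
+//
+// (a hypothetical sequence, in execution order) is represented by three
+// nodes: a set_node for S1, a clobber_group for {C2, C3}, and a set_node
+// for S4, so that stepping from S1 to S4 skips over both clobbers at once.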
+
+// This type represents a choice between:
+//
+// (1) a single definition of a resource
+// (2) a node in a def_splay_tree that represents either a single
+//     set or a group of clobbers.
+class def_mux : public pointer_mux<def_info, def_node>
+{
+  using parent = pointer_mux<def_info, def_node>;
+
+  // Provide the same constructors as the pointer_mux.
+  using parent::parent;
+
+public:
+  // Return the first definition associated with this mux.  If the mux holds
+  // a single definition, the result is that definition.  If the mux holds
+  // a clobber_group, the result is the first clobber in the group.
+  def_info *first_def () const;
+
+  // Return the last definition associated with this mux.  If the mux holds
+  // a single definition, the result is that definition.  If the mux holds
+  // a clobber_group, the result is the last clobber in the group.
+  def_info *last_def () const;
+
+  // If the pointer represents a set_info, return that set_info,
+  // otherwise return null.
+  set_info *set () const;
+};
+
+// This class represents the result of looking up the definition of a
+// resource at a particular point, here referred to as point P.
+// There are four states:
+//
+// - MUX is null if there were no definitions to search.
+//
+// - Otherwise, COMPARISON is 0 if we found a definition at P or a
+//   clobber_group that spans P.  MUX then contains this definition
+//   or clobber_group.
+//
+// - Otherwise, COMPARISON is less than 0 if we found the definition
+//   that precedes P or the group of clobbers that precedes P.  MUX then
+//   contains this definition or clobber_group.
+//
+// - Otherwise, COMPARISON is greater than 0 and we found the
+//   definition that follows P, or the group of clobbers that follows P.
+//   MUX then contains this definition or clobber_group.
+class def_lookup
+{
+public:
+  // If we found a clobber_group that spans P, return the definition
+  // that precedes the start of the group, or null if none.
+  //
+  // Otherwise, return the last definition that occurs before P,
+  // or null if none.
+  def_info *prev_def () const;
+
+  // If we found a clobber_group that spans P, return the definition
+  // that follows the end of the group, or null if none.
+  //
+  // Otherwise, return the first definition that occurs after P,
+  // or null if none.
+  def_info *next_def () const;
+
+  // If we found a set_info at P, return that set_info, otherwise return null.
+  set_info *matching_set () const;
+
+  // If we found a set_info at P, return that set_info, otherwise return
+  // prev_def ().
+  def_info *matching_or_prev_def () const;
+
+  // If we found a set_info at P, return that set_info, otherwise return
+  // next_def ().
+  def_info *matching_or_next_def () const;
+
+  def_mux mux;
+  int comparison;
+};
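+
+// A usage sketch, modeled on the find_def call in
+// function_info::make_use_available (the handling shown is
+// illustrative):
+//
+//   def_lookup dl = find_def (resource, insn);
+//   if (set_info *set = dl.matching_set ())
+//     ...;  // INSN's point already has a set of the resource
+//   else if (def_info *prev = dl.prev_def ())
+//     ...;  // PREV is the last definition before that point
+//           // (or before a clobber group that spans it)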
+
+void pp_resource (pretty_printer *, resource_info);
+void pp_access (pretty_printer *, const access_info *,
+		unsigned int flags = PP_ACCESS_DEFAULT);
+void pp_accesses (pretty_printer *, access_array,
+		  unsigned int flags = PP_ACCESS_DEFAULT);
+void pp_def_node (pretty_printer *, const def_node *);
+void pp_def_mux (pretty_printer *, def_mux);
+void pp_def_lookup (pretty_printer *, def_lookup);
+
+}
+
+void dump (FILE *, rtl_ssa::resource_info);
+void dump (FILE *, const rtl_ssa::access_info *,
+	   unsigned int flags = rtl_ssa::PP_ACCESS_DEFAULT);
+void dump (FILE *, rtl_ssa::access_array,
+	   unsigned int flags = rtl_ssa::PP_ACCESS_DEFAULT);
+void dump (FILE *, const rtl_ssa::def_node *);
+void dump (FILE *, rtl_ssa::def_mux);
+void dump (FILE *, rtl_ssa::def_lookup);
+
+void DEBUG_FUNCTION debug (const rtl_ssa::resource_info &);
+void DEBUG_FUNCTION debug (const rtl_ssa::access_info *);
+void DEBUG_FUNCTION debug (const rtl_ssa::access_array &);
+void DEBUG_FUNCTION debug (const rtl_ssa::def_node *);
+void DEBUG_FUNCTION debug (const rtl_ssa::def_mux &);
+void DEBUG_FUNCTION debug (const rtl_ssa::def_lookup &);
diff --git a/gcc/rtl-ssa/blocks.cc b/gcc/rtl-ssa/blocks.cc
new file mode 100644
index 00000000000..5436305b11c
--- /dev/null
+++ b/gcc/rtl-ssa/blocks.cc
@@ -0,0 +1,1146 @@
+// Implementation of basic-block-related functions for RTL SSA      -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#define INCLUDE_ALGORITHM
+#define INCLUDE_FUNCTIONAL
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "df.h"
+#include "rtl-ssa.h"
+#include "rtl-ssa/internals.inl"
+#include "cfganal.h"
+#include "cfgrtl.h"
+#include "predict.h"
+
+using namespace rtl_ssa;
+
+// See the comment above the declaration.
+void
+bb_info::print_identifier (pretty_printer *pp) const
+{
+  char tmp[3 * sizeof (index ()) + 3];
+  snprintf (tmp, sizeof (tmp), "bb%d", index ());
+  pp_string (pp, tmp);
+  if (ebb_info *ebb = this->ebb ())
+    {
+      pp_space (pp);
+      pp_left_bracket (pp);
+      ebb->print_identifier (pp);
+      pp_right_bracket (pp);
+    }
+}
+
+// See the comment above the declaration.
+void
+bb_info::print_full (pretty_printer *pp) const
+{
+  pp_string (pp, "basic block ");
+  print_identifier (pp);
+  pp_colon (pp);
+
+  auto print_insn = [pp](const char *header, const insn_info *insn)
+    {
+      pp_newline_and_indent (pp, 2);
+      pp_string (pp, header);
+      pp_newline_and_indent (pp, 2);
+      if (insn)
+	pp_insn (pp, insn);
+      else
+	pp_string (pp, "<uninitialized>");
+      pp_indentation (pp) -= 4;
+    };
+
+  print_insn ("head:", head_insn ());
+
+  pp_newline (pp);
+  pp_newline_and_indent (pp, 2);
+  pp_string (pp, "contents:");
+  if (!head_insn ())
+    {
+      pp_newline_and_indent (pp, 2);
+      pp_string (pp, "<uninitialized>");
+      pp_indentation (pp) -= 2;
+    }
+  else if (auto insns = real_insns ())
+    {
+      bool is_first = true;
+      for (const insn_info *insn : insns)
+	{
+	  if (is_first)
+	    is_first = false;
+	  else
+	    pp_newline (pp);
+	  pp_newline_and_indent (pp, 2);
+	  pp_insn (pp, insn);
+	  pp_indentation (pp) -= 2;
+	}
+    }
+  else
+    {
+      pp_newline_and_indent (pp, 2);
+      pp_string (pp, "none");
+      pp_indentation (pp) -= 2;
+    }
+  pp_indentation (pp) -= 2;
+
+  pp_newline (pp);
+  print_insn ("end:", end_insn ());
+}
+
+// See the comment above the declaration.
+void
+ebb_call_clobbers_info::print_summary (pretty_printer *pp) const
+{
+  pp_string (pp, "call clobbers for ABI ");
+  if (m_abi)
+    pp_decimal_int (pp, m_abi->id ());
+  else
+    pp_string (pp, "<null>");
+}
+
+// See the comment above the declaration.
+void
+ebb_call_clobbers_info::print_full (pretty_printer *pp) const
+{
+  print_summary (pp);
+  pp_colon (pp);
+  pp_newline_and_indent (pp, 2);
+  auto print_node = [](pretty_printer *pp,
+		       const insn_call_clobbers_note *note)
+    {
+      if (insn_info *insn = note->insn ())
+	insn->print_identifier_and_location (pp);
+      else
+	pp_string (pp, "<null>");
+    };
+  print (pp, root (), print_node);
+  pp_indentation (pp) -= 2;
+}
+
+// See the comment above the declaration.
+void
+ebb_info::print_identifier (pretty_printer *pp) const
+{
+  // first_bb is populated by the constructor and so should always
+  // be nonnull.
+  auto index = first_bb ()->index ();
+  char tmp[3 * sizeof (index) + 4];
+  snprintf (tmp, sizeof (tmp), "ebb%d", index);
+  pp_string (pp, tmp);
+}
+
+// See the comment above the declaration.
+void
+ebb_info::print_full (pretty_printer *pp) const
+{
+  pp_string (pp, "extended basic block ");
+  print_identifier (pp);
+  pp_colon (pp);
+
+  pp_newline_and_indent (pp, 2);
+  if (insn_info *phi_insn = this->phi_insn ())
+    {
+      phi_insn->print_identifier_and_location (pp);
+      pp_colon (pp);
+      if (auto phis = this->phis ())
+	{
+	  bool is_first = true;
+	  for (const phi_info *phi : phis)
+	    {
+	      if (is_first)
+		is_first = false;
+	      else
+		pp_newline (pp);
+	      pp_newline_and_indent (pp, 2);
+	      pp_access (pp, phi, PP_ACCESS_SETTER);
+	      pp_indentation (pp) -= 2;
+	    }
+	}
+      else
+	{
+	  pp_newline_and_indent (pp, 2);
+	  pp_string (pp, "no phi nodes");
+	  pp_indentation (pp) -= 2;
+	}
+    }
+  else
+    pp_string (pp, "no phi insn");
+  pp_indentation (pp) -= 2;
+
+  for (const bb_info *bb : bbs ())
+    {
+      pp_newline (pp);
+      pp_newline_and_indent (pp, 2);
+      pp_bb (pp, bb);
+      pp_indentation (pp) -= 2;
+    }
+
+  for (ebb_call_clobbers_info *ecc : call_clobbers ())
+    {
+      pp_newline (pp);
+      pp_newline_and_indent (pp, 2);
+      pp_ebb_call_clobbers (pp, ecc);
+      pp_indentation (pp) -= 2;
+    }
+}
+
+// Add a dummy use to mark that DEF is live out of BB's EBB at the end of BB.
+void
+function_info::add_live_out_use (bb_info *bb, set_info *def)
+{
+  // There is nothing to do if DEF is an artificial definition at the end
+  // of BB.  In that case the definition is rooted at the end of the block
+  // and we wouldn't gain anything by inserting a use immediately after it.
+  // If we did want to insert a use, we'd need to associate it with a new
+  // instruction that comes after bb->end_insn ().
+  if (def->insn () == bb->end_insn ())
+    return;
+
+  // If the end of the block already has an artificial use, that use
+  // acts to make DEF live at the appropriate point.
+  unsigned int regno = def->regno ();
+  if (find_access (bb->end_insn ()->uses (), regno))
+    return;
+
+  // Currently there is no need to maintain a backward link from the end
+  // instruction to the list of live-out uses.  Such a list would be
+  // expensive to update if it was represented using the usual insn_info
+  // access arrays.
+  use_info *use = allocate<use_info> (bb->end_insn (), def->resource (), def);
+  use->set_is_live_out_use (true);
+  add_use (use);
+}
+
+// Return true if all nondebug uses of DEF are live-out uses.
+static bool
+all_uses_are_live_out_uses (set_info *def)
+{
+  for (use_info *use : def->all_uses ())
+    if (!use->is_in_debug_insn () && !use->is_live_out_use ())
+      return false;
+  return true;
+}
+
+// SET, if nonnull, is a definition of something that is live out from BB.
+// Return the live-out value itself.
+set_info *
+function_info::live_out_value (bb_info *bb, set_info *set)
+{
+  // Degenerate phis only exist to provide a definition for uses in the
+  // same EBB.  The live-out value is the same as the live-in value.
+  if (auto *phi = safe_dyn_cast<phi_info *> (set))
+    if (phi->is_degenerate ())
+      {
+	set = phi->input_value (0);
+
+	// Remove the phi if it turned out to be useless.  This is
+	// mainly useful for memory, because we don't know ahead of time
+	// whether a block will use memory or not.
+	if (bb == bb->ebb ()->last_bb () && all_uses_are_live_out_uses (phi))
+	  replace_phi (phi, set);
+      }
+
+  return set;
+}
+
+// Add PHI to EBB and enter it into the function's hash table.
+void
+function_info::append_phi (ebb_info *ebb, phi_info *phi)
+{
+  phi_info *first_phi = ebb->first_phi ();
+  if (first_phi)
+    first_phi->set_prev_phi (phi);
+  phi->set_next_phi (first_phi);
+  ebb->set_first_phi (phi);
+  add_def (phi);
+}
+
+// Remove PHI from its current position in the SSA graph.
+void
+function_info::remove_phi (phi_info *phi)
+{
+  phi_info *next = phi->next_phi ();
+  phi_info *prev = phi->prev_phi ();
+
+  if (next)
+    next->set_prev_phi (prev);
+
+  if (prev)
+    prev->set_next_phi (next);
+  else
+    phi->ebb ()->set_first_phi (next);
+
+  remove_def (phi);
+  phi->clear_phi_links ();
+}
+
+// Remove PHI from the SSA graph and free its memory.
+void
+function_info::delete_phi (phi_info *phi)
+{
+  gcc_assert (!phi->has_any_uses ());
+
+  // Remove the inputs to the phi.
+  for (use_info *input : phi->inputs ())
+    remove_use (input);
+
+  remove_phi (phi);
+
+  phi->set_next_phi (m_free_phis);
+  m_free_phis = phi;
+}
+
+// If possible, remove PHI and replace all uses with NEW_VALUE.
+void
+function_info::replace_phi (phi_info *phi, set_info *new_value)
+{
+  auto update_use = [&](use_info *use)
+    {
+      remove_use (use);
+      use->set_def (new_value);
+      add_use (use);
+    };
+
+  if (new_value)
+    for (use_info *use : phi->nondebug_insn_uses ())
+      if (!use->is_live_out_use ())
+	{
+	  // We need to keep the phi around for its local uses.
+	  // Turn it into a degenerate phi, if it isn't already.
+	  use_info *use = phi->input_use (0);
+	  if (use->def () != new_value)
+	    update_use (use);
+
+	  if (phi->is_degenerate ())
+	    return;
+
+	  phi->make_degenerate (use);
+
+	  // Redirect all phi users to NEW_VALUE.
+	  while (use_info *phi_use = phi->last_phi_use ())
+	    update_use (phi_use);
+
+	  return;
+	}
+
+  // Replace the uses.  We can discard uses that only existed for the
+  // sake of marking live-out values, since the resource is now transparent
+  // in the phi's EBB.
+  while (use_info *use = phi->last_use ())
+    if (use->is_live_out_use ())
+      remove_use (use);
+    else
+      update_use (use);
+
+  delete_phi (phi);
+}
+
+// Create and return a phi node for EBB.  RESOURCE is the resource that
+// the phi node sets (and thus that all the inputs set too).  NUM_INPUTS
+// is the number of inputs, which is 1 for a degenerate phi.  INPUTS[I]
+// is a set_info that gives the value of input I, or null if the value
+// is either unknown or uninitialized.  If NUM_INPUTS > 1, this array
+// is allocated on the main obstack and can be reused for the use array.
+//
+// Add the created phi node to its basic block and enter it into the
+// function's hash table.
+phi_info *
+function_info::create_phi (ebb_info *ebb, resource_info resource,
+			   access_info **inputs, unsigned int num_inputs)
+{
+  phi_info *phi = m_free_phis;
+  if (phi)
+    {
+      m_free_phis = phi->next_phi ();
+      *phi = phi_info (ebb->phi_insn (), resource, phi->uid ());
+    }
+  else
+    {
+      phi = allocate<phi_info> (ebb->phi_insn (), resource, m_next_phi_uid);
+      m_next_phi_uid += 1;
+    }
+
+  // Convert the array of set_infos into an array of use_infos.  Also work
+  // out what mode the phi should have.
+  machine_mode new_mode = resource.mode;
+  for (unsigned int i = 0; i < num_inputs; ++i)
+    {
+      auto *input = safe_as_a<set_info *> (inputs[i]);
+      auto *use = allocate<use_info> (phi, resource, input);
+      add_use (use);
+      inputs[i] = use;
+      if (input)
+	new_mode = combine_modes (new_mode, input->mode ());
+    }
+
+  phi->set_inputs (use_array (inputs, num_inputs));
+  phi->set_mode (new_mode);
+
+  append_phi (ebb, phi);
+
+  return phi;
+}
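+
+// A sketch of a typical call, mirroring what finish_phi in add_phi_nodes
+// does below (the names PREDS, NUM_PREDS and REGNO are hypothetical).
+// The input array for a non-degenerate phi must live on the main obstack,
+// and BLKmode acts as a placeholder mode that is corrected later:
+//
+//   obstack_grow (&m_obstack, preds, num_preds * sizeof (access_info *));
+//   auto **inputs
+//     = static_cast<access_info **> (obstack_finish (&m_obstack));
+//   phi_info *phi = create_phi (ebb, { E_BLKmode, regno },
+//				  inputs, num_preds);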
+
+// Create and return a degenerate phi for EBB whose input comes from DEF.
+// This is used in cases where DEF is known to be available on entry to
+// EBB but was not previously used within it.  If DEF is for a register,
+// there are two cases:
+//
+// (1) DEF was already live on entry to EBB but was previously transparent
+//     within it.
+//
+// (2) DEF was not previously live on entry to EBB and is being made live
+//     by this update.
+//
+// At the moment, this function only handles the case in which EBB has a
+// single predecessor block and DEF is defined in that block's EBB.
+phi_info *
+function_info::create_degenerate_phi (ebb_info *ebb, set_info *def)
+{
+  access_info *input = def;
+  phi_info *phi = create_phi (ebb, def->resource (), &input, 1);
+  if (def->is_reg ())
+    {
+      unsigned int regno = def->regno ();
+
+      // Find the single predecessor mentioned above.
+      basic_block pred_cfg_bb = single_pred (ebb->first_bb ()->cfg_bb ());
+      bb_info *pred_bb = this->bb (pred_cfg_bb);
+
+      if (!bitmap_set_bit (DF_LR_IN (ebb->first_bb ()->cfg_bb ()), regno))
+	{
+	  // The register was not previously live on entry to EBB and
+	  // might not have been live on exit from PRED_BB either.
+	  if (bitmap_set_bit (DF_LR_OUT (pred_cfg_bb), regno))
+	    add_live_out_use (pred_bb, def);
+	}
+      else
+	{
+	  // The register was previously live on entry to EBB.  Add live-out uses
+	  // at the appropriate points.
+	  insn_info *next_insn = nullptr;
+	  if (def_info *next_def = phi->next_def ())
+	    next_insn = next_def->insn ();
+	  for (bb_info *bb : ebb->bbs ())
+	    {
+	      if ((next_insn && *next_insn <= *bb->end_insn ())
+		  || !bitmap_bit_p (DF_LR_OUT (bb->cfg_bb ()), regno))
+		break;
+	      add_live_out_use (bb, def);
+	    }
+	}
+    }
+  return phi;
+}
+
+// Create a bb_info for CFG_BB, given that no such structure currently exists.
+bb_info *
+function_info::create_bb_info (basic_block cfg_bb)
+{
+  bb_info *bb = allocate<bb_info> (cfg_bb);
+  gcc_checking_assert (!m_bbs[cfg_bb->index]);
+  m_bbs[cfg_bb->index] = bb;
+  return bb;
+}
+
+// Add BB to the end of the list of blocks.
+void
+function_info::append_bb (bb_info *bb)
+{
+  if (m_last_bb)
+    m_last_bb->set_next_bb (bb);
+  else
+    m_first_bb = bb;
+  bb->set_prev_bb (m_last_bb);
+  m_last_bb = bb;
+}
+
+// Called while building SSA form using BI, with BI.current_bb being
+// the entry block.
+//
+// Create the entry block instructions and their definitions.  The only
+// useful instruction is the end instruction, which carries definitions
+// for the values that are live on entry to the function.  However, it
+// seems simpler to create a head instruction too, rather than force all
+// users of the block information to treat the entry block as a special case.
+void
+function_info::add_entry_block_defs (build_info &bi)
+{
+  bb_info *bb = bi.current_bb;
+  basic_block cfg_bb = bi.current_bb->cfg_bb ();
+  auto *lr_info = DF_LR_BB_INFO (cfg_bb);
+
+  bb->set_head_insn (append_artificial_insn (bb));
+  insn_info *insn = append_artificial_insn (bb);
+  bb->set_end_insn (insn);
+
+  start_insn_accesses ();
+
+  // Using LR to derive the liveness information means that we create an
+  // entry block definition for upwards exposed registers.  These registers
+  // are sometimes genuinely uninitialized.  However, some targets also
+  // create a pseudo PIC base register and only initialize it later.
+  // Handling that case correctly seems more important than optimizing
+  // uninitialized uses.
+  unsigned int regno;
+  bitmap_iterator in_bi;
+  EXECUTE_IF_SET_IN_BITMAP (&lr_info->out, 0, regno, in_bi)
+    {
+      auto *set = allocate<set_info> (insn, full_register (regno));
+      append_def (set);
+      m_temp_defs.safe_push (set);
+      bi.record_reg_def (regno, set);
+    }
+
+  // Create a definition that reflects the state of memory on entry to
+  // the function.
+  auto *set = allocate<set_info> (insn, memory);
+  append_def (set);
+  m_temp_defs.safe_push (set);
+  bi.record_mem_def (set);
+
+  finish_insn_accesses (insn);
+}
+
+// Called while building SSA form using BI.  Create phi nodes for the
+// current EBB, leaving backedge inputs to be filled in later.  Set
+// bi.last_access to the values that are live on entry to the EBB,
+// regardless of whether or not they are phi nodes.
+void
+function_info::add_phi_nodes (build_info &bi)
+{
+  ebb_info *ebb = bi.current_ebb;
+  basic_block cfg_bb = ebb->first_bb ()->cfg_bb ();
+  auto *lr_info = DF_LR_BB_INFO (cfg_bb);
+
+  // Get a local cache of the predecessor blocks' live out values.
+  unsigned int num_preds = EDGE_COUNT (cfg_bb->preds);
+  auto_vec<const bb_live_out_info *, 16> pred_live_outs (num_preds);
+  bool has_backedge = false;
+  bool has_eh_edge = false;
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_EDGE (e, ei, cfg_bb->preds)
+    {
+      bb_info *pred_bb = this->bb (e->src);
+      const bb_live_out_info *live_out = &bi.bb_live_out[e->src->index];
+
+      // In LR (but not LIVE), the registers live on entry to a block must
+      // normally be a subset of the registers live on exit from any
+      // given predecessor block.  The exceptions are EH edges, which
+      // implicitly clobber all registers in eh_edge_abi.full_reg_clobbers ().
+      // Thus if a register is upwards exposed in an EH handler, it won't
+      // be propagated across the EH edge.
+      //
+      // Excluding that special case, all registers live on entry to
+      // EBB are also live on exit from PRED_BB and were (or will be)
+      // considered when creating LIVE_OUT.
+      gcc_checking_assert ((e->flags & EDGE_EH)
+			   || !bitmap_intersect_compl_p (&lr_info->in,
+							 DF_LR_OUT (e->src)));
+      if (!pred_bb || !pred_bb->head_insn ())
+	{
+	  has_backedge = true;
+	  live_out = nullptr;
+	}
+      has_eh_edge |= (e->flags & EDGE_EH);
+      pred_live_outs.quick_push (live_out);
+    }
+
+  // PRED_REG_INDICES[I] tracks the index into PRED_LIVE_OUTS[I]->reg_values
+  // of the first unused entry.
+  auto_vec<unsigned int, 16> pred_reg_indices (num_preds);
+  pred_reg_indices.quick_grow_cleared (num_preds);
+
+  // Use this array to build up the list of inputs to each phi.
+  m_temp_defs.safe_grow (num_preds);
+
+  // Return true if the current phi is degenerate, i.e. if all its inputs
+  // are the same.
+  auto is_degenerate_phi = [&]()
+    {
+      if (has_backedge)
+	return false;
+
+      for (unsigned int i = 1; i < num_preds; ++i)
+	if (m_temp_defs[i] != m_temp_defs[0])
+	  return false;
+
+      return true;
+    };
+
+  // Finish calculating the live-in value for RESOURCE.  Decide how to
+  // represent the value of RESOURCE on entry to EBB and return its definition.
+  auto finish_phi = [&](resource_info resource) -> set_info *
+    {
+      access_info **inputs;
+      unsigned int num_inputs;
+      if (is_degenerate_phi ())
+	{
+	  auto *input = safe_as_a<set_info *> (m_temp_defs[0]);
+	  if (!input)
+	    // The live-in value is completely uninitialized.
+	    return nullptr;
+
+	  unsigned int regno = input->regno ();
+	  if (input->is_reg () && !bitmap_bit_p (bi.ebb_use, regno))
+	    // The live-in value comes from a single source and there
+	    // are no uses of it within the EBB itself.  We therefore
+	    // don't need a phi node.
+	    return input;
+
+	  // The live-in value comes from a single source and might be
+	  // used by the EBB itself.  Create a degenerate phi for it.
+	  inputs = m_temp_defs.begin ();
+	  num_inputs = 1;
+	}
+      else
+	{
+	  obstack_grow (&m_obstack, m_temp_defs.address (),
+			num_preds * sizeof (access_info *));
+	  inputs = static_cast<access_info **> (obstack_finish (&m_obstack));
+	  num_inputs = num_preds;
+	}
+      return create_phi (ebb, resource, inputs, num_inputs);
+    };
+
+  if (bi.ebb_live_in_for_debug)
+    bitmap_clear (bi.ebb_live_in_for_debug);
+
+  // Get the definition of each live input register, excluding registers
+  // that are known to have a single definition that dominates all uses.
+  unsigned int regno;
+  bitmap_iterator in_bi;
+  EXECUTE_IF_AND_IN_BITMAP (&lr_info->in, m_potential_phi_regs,
+			    0, regno, in_bi)
+    {
+      for (unsigned int pred_i = 0; pred_i < num_preds; ++pred_i)
+	{
+	  set_info *input = nullptr;
+	  if (const bb_live_out_info *pred_live_out = pred_live_outs[pred_i])
+	    {
+	      // Skip over registers that aren't live on entry to this block.
+	      unsigned int reg_i = pred_reg_indices[pred_i];
+	      while (reg_i < pred_live_out->num_reg_values
+		     && pred_live_out->reg_values[reg_i]->regno () < regno)
+		reg_i += 1;
+
+	      // As we asserted above, REGNO is live out from the predecessor
+	      // block, at least by the LR reckoning.  But there are three
+	      // cases:
+	      //
+	      // (1) The live-out value is well-defined (the normal case),
+	      //     with the definition coming either from the block itself
+	      //     or from a predecessor block.  In this case reg_values
+	      //     has a set_info entry for the register.
+	      //
+	      // (2) The live-out value was not modified by the predecessor
+	      //     EBB and did not have a defined value on input to that
+	      //     EBB either.  In this case reg_values has no entry for
+	      //     the register.
+	      //
+	      // (3) The live-out value was modified by the predecessor EBB,
+	      //     but the final modification was a clobber rather than
+	      //     a set.  In this case reg_values again has no entry for
+	      //     the register.
+	      //
+	      // The phi input for (2) and (3) is undefined, which we
+	      // represent as a null set_info.
+	      if (reg_i < pred_live_out->num_reg_values)
+		{
+		  set_info *set = pred_live_out->reg_values[reg_i];
+		  if (set->regno () == regno)
+		    {
+		      input = set;
+		      reg_i += 1;
+		    }
+		}
+
+	      // Fully call-clobbered values do not survive across EH edges.
+	      // In particular, if a call that normally sets a result register
+	      // throws an exception, the set of the result register should
+	      // not be treated as live on entry to the EH handler.
+	      if (has_eh_edge
+		  && HARD_REGISTER_NUM_P (regno)
+		  && eh_edge_abi.clobbers_full_reg_p (regno)
+		  && (EDGE_PRED (cfg_bb, pred_i)->flags & EDGE_EH))
+		input = nullptr;
+
+	      pred_reg_indices[pred_i] = reg_i;
+	    }
+	  m_temp_defs[pred_i] = input;
+	}
+      // Later code works out the correct mode of the phi.  Use BLKmode
+      // as a placeholder for now.
+      bi.record_reg_def (regno, finish_phi ({ E_BLKmode, regno }));
+      if (bi.ebb_live_in_for_debug)
+	bitmap_set_bit (bi.ebb_live_in_for_debug, regno);
+    }
+
+  // Repeat the process above for memory.
+  for (unsigned int pred_i = 0; pred_i < num_preds; ++pred_i)
+    {
+      set_info *input = nullptr;
+      if (const bb_live_out_info *pred_live_out = pred_live_outs[pred_i])
+	input = pred_live_out->mem_value;
+      m_temp_defs[pred_i] = input;
+    }
+  bi.record_mem_def (finish_phi (memory));
+
+  m_temp_defs.truncate (0);
+}
+
+// Called while building SSA form using BI.
+//
+// If FLAGS is DF_REF_AT_TOP, create the head insn for BI.current_bb
+// and populate its uses and definitions.  If FLAGS is 0, do the same
+// for the end insn.
+void
+function_info::add_artificial_accesses (build_info &bi, df_ref_flags flags)
+{
+  bb_info *bb = bi.current_bb;
+  basic_block cfg_bb = bb->cfg_bb ();
+  auto *lr_info = DF_LR_BB_INFO (cfg_bb);
+  df_ref ref;
+
+  insn_info *insn;
+  if (flags == DF_REF_AT_TOP)
+    {
+      if (cfg_bb->index == EXIT_BLOCK)
+	insn = append_artificial_insn (bb);
+      else
+	insn = append_artificial_insn (bb, bb_note (cfg_bb));
+      bb->set_head_insn (insn);
+    }
+  else
+    {
+      insn = append_artificial_insn (bb);
+      bb->set_end_insn (insn);
+    }
+
+  start_insn_accesses ();
+
+  FOR_EACH_ARTIFICIAL_USE (ref, cfg_bb->index)
+    if ((DF_REF_FLAGS (ref) & DF_REF_AT_TOP) == flags)
+      {
+	unsigned int regno = DF_REF_REGNO (ref);
+	machine_mode mode = GET_MODE (DF_REF_REAL_REG (ref));
+	resource_info resource { mode, regno };
+
+	// A definition must be available.
+	gcc_checking_assert (bitmap_bit_p (&lr_info->in, regno)
+			     || (flags != DF_REF_AT_TOP
+				 && bitmap_bit_p (&lr_info->def, regno)));
+	set_info *def = bi.current_reg_value (regno);
+	auto *use = allocate<use_info> (insn, resource, def);
+	add_use (use);
+	m_temp_uses.safe_push (use);
+      }
+
+  // Track the return value of memory by adding an artificial use of
+  // memory at the end of the exit block.
+  if (flags == 0 && cfg_bb->index == EXIT_BLOCK)
+    {
+      auto *use = allocate<use_info> (insn, memory, bi.current_mem_value ());
+      add_use (use);
+      m_temp_uses.safe_push (use);
+    }
+
+  FOR_EACH_ARTIFICIAL_DEF (ref, cfg_bb->index)
+    if ((DF_REF_FLAGS (ref) & DF_REF_AT_TOP) == flags)
+      {
+	unsigned int regno = DF_REF_REGNO (ref);
+	machine_mode mode = GET_MODE (DF_REF_REAL_REG (ref));
+	resource_info resource { mode, regno };
+
+	// If the value isn't used later in the block and isn't live
+	// on exit, we could instead represent the definition as a
+	// clobber_info.  However, that case should be relatively
+	// rare, and set_info is in any case more compact than clobber_info.
+	set_info *def = allocate<set_info> (insn, resource);
+	append_def (def);
+	m_temp_defs.safe_push (def);
+	bi.record_reg_def (regno, def);
+      }
+
+  // Model the effect of a memory clobber on an incoming edge by adding
+  // a fake definition of memory at the start of the block.  We don't need
+  // to add a use of the phi node because memory is implicitly always live.
+  if (flags == DF_REF_AT_TOP && has_abnormal_call_or_eh_pred_edge_p (cfg_bb))
+    {
+      set_info *def = allocate<set_info> (insn, memory);
+      append_def (def);
+      m_temp_defs.safe_push (def);
+      bi.record_mem_def (def);
+    }
+
+  finish_insn_accesses (insn);
+}
+
+// Called while building SSA form using BI.  Create insn_infos for all
+// relevant instructions in BI.current_bb.
+void
+function_info::add_block_contents (build_info &bi)
+{
+  basic_block cfg_bb = bi.current_bb->cfg_bb ();
+  rtx_insn *insn;
+  FOR_BB_INSNS (cfg_bb, insn)
+    if (INSN_P (insn))
+      add_insn_to_block (bi, insn);
+}
+
+// Called while building SSA form using BI.  Use BI.bb_live_out to record
+// the values that are live out from BI.current_bb.
+void
+function_info::record_block_live_out (build_info &bi)
+{
+  bb_info *bb = bi.current_bb;
+  ebb_info *ebb = bi.current_ebb;
+  basic_block cfg_bb = bb->cfg_bb ();
+  bb_live_out_info *live_out = &bi.bb_live_out[bb->index ()];
+  auto *lr_info = DF_LR_BB_INFO (bb->cfg_bb ());
+
+  // Calculate which subset of m_potential_phi_regs is live out from EBB
+  // at the end of BB.
+  auto_bitmap live_out_from_ebb;
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_EDGE (e, ei, cfg_bb->succs)
+    {
+      bb_info *dest_bb = this->bb (e->dest);
+      if (!dest_bb || dest_bb->ebb () != ebb)
+	bitmap_ior_and_into (live_out_from_ebb, DF_LR_IN (e->dest),
+			     m_potential_phi_regs);
+    }
+
+  // Record the live-out register values.
+  unsigned int regno;
+  bitmap_iterator out_bi;
+  EXECUTE_IF_AND_IN_BITMAP (&lr_info->out, m_potential_phi_regs,
+			    0, regno, out_bi)
+    if (set_info *value = live_out_value (bb, bi.current_reg_value (regno)))
+      {
+	if (value->ebb () == ebb && bitmap_bit_p (live_out_from_ebb, regno))
+	  add_live_out_use (bb, value);
+	obstack_ptr_grow (&m_temp_obstack, value);
+      }
+
+  live_out->num_reg_values = (obstack_object_size (&m_temp_obstack)
+			      / sizeof (set_info *));
+  auto *data = obstack_finish (&m_temp_obstack);
+  live_out->reg_values = static_cast<set_info **> (data);
+
+  live_out->mem_value = live_out_value (bb, bi.current_mem_value ());
+}
+
+// Called while building SSA form using BI.  Check if BI.current_bb has
+// any outgoing backedges.  If so, use the up-to-date contents of
+// BI.bb_live_out to populate the associated inputs of any phi nodes.
+void
+function_info::populate_backedge_phis (build_info &bi)
+{
+  bb_info *bb = bi.current_bb;
+  basic_block cfg_bb = bb->cfg_bb ();
+  const bb_live_out_info *live_out = &bi.bb_live_out[bb->index ()];
+
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_EDGE (e, ei, cfg_bb->succs)
+    {
+      // Check if this edge counts as a backedge in the current traversal.
+      bb_info *succ_bb = this->bb (e->dest);
+      if (!succ_bb || !succ_bb->head_insn ())
+	continue;
+
+      // Although the phis do not keep a defined order long-term, they are
+      // still in reverse regno order at this point.  We can therefore use
+      // a merge operation on the phis and the live-out values.
+      unsigned int input_i = e->dest_idx;
+      int reg_i = live_out->num_reg_values - 1;
+      for (phi_info *phi : succ_bb->ebb ()->phis ())
+	{
+	  set_info *input = nullptr;
+	  if (phi->is_mem ())
+	    input = live_out->mem_value;
+	  else
+	    {
+	      // Skip over any intervening live-out values.
+	      unsigned int regno = phi->regno ();
+	      while (reg_i >= 0)
+		{
+		  set_info *reg_value = live_out->reg_values[reg_i];
+		  if (reg_value->regno () < regno)
+		    break;
+		  reg_i -= 1;
+		  if (reg_value->regno () == regno)
+		    {
+		      input = reg_value;
+		      break;
+		    }
+		}
+	    }
+	  if (input)
+	    {
+	      use_info *use = phi->input_use (input_i);
+	      gcc_assert (!use->def ());
+	      use->set_def (input);
+	      add_use (use);
+	    }
+	}
+    }
+}
+
+// Return true if it would be better to continue an EBB across NEW_EDGE
+// rather than across OLD_EDGE, given that both edges are viable candidates.
+// This is not a total ordering.
+static bool
+better_ebb_edge_p (edge new_edge, edge old_edge)
+{
+  // Prefer the likeliest edge.
+  if (new_edge->probability.initialized_p ()
+      && old_edge->probability.initialized_p ()
+      && !(old_edge->probability == new_edge->probability))
+    return old_edge->probability < new_edge->probability;
+
+  // If both edges are equally likely, prefer a fallthru edge.
+  if (new_edge->flags & EDGE_FALLTHRU)
+    return true;
+  if (old_edge->flags & EDGE_FALLTHRU)
+    return false;
+
+  // Otherwise just stick with OLD_EDGE.
+  return false;
+}
+
+// Pick and return the next basic block in an EBB that currently ends with BB.
+// Return null if the EBB must end with BB.
+static basic_block
+choose_next_block_in_ebb (basic_block bb)
+{
+  // Although there's nothing in principle wrong with having an EBB that
+  // starts with the entry block and includes later blocks, there's not
+  // really much point either.  Keeping the entry block separate means
+  // that uses of arguments consistently occur through phi nodes, rather
+  // than the arguments sometimes appearing to come from an EBB-local
+  // definition instead.
+  if (bb->index == ENTRY_BLOCK)
+    return nullptr;
+
+  bool optimize_for_speed_p = optimize_bb_for_speed_p (bb);
+  edge best_edge = nullptr;
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_COMPLEX)
+	&& e->dest->index != EXIT_BLOCK
+	&& single_pred_p (e->dest)
+	&& optimize_for_speed_p == optimize_bb_for_speed_p (e->dest)
+	&& (!best_edge || better_ebb_edge_p (e, best_edge)))
+      best_edge = e;
+
+  return best_edge ? best_edge->dest : nullptr;
+}
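+
+// For example, in a full diamond (an illustrative CFG):
+//
+//        A
+//       / \
+//      B   C
+//       \ /
+//        D
+//
+// B and C each have a single predecessor, so either could continue A's
+// EBB; better_ebb_edge_p chooses between the two edges.  D has two
+// predecessors and therefore always starts a new EBB.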
+
+// Partition the function's blocks into EBBs and build SSA form for all
+// EBBs in the function.
+void
+function_info::process_all_blocks ()
+{
+  auto temps = temp_watermark ();
+  unsigned int num_bb_indices = last_basic_block_for_fn (m_fn);
+
+  // Compute the starting reverse postorder.  We tweak this later to try
+  // to get better EBB assignments.
+  auto *postorder = new int[n_basic_blocks_for_fn (m_fn)];
+  unsigned int postorder_num
+    = pre_and_rev_post_order_compute (nullptr, postorder, true);
+  gcc_assert (int (postorder_num) <= n_basic_blocks_for_fn (m_fn));
+
+  // Construct the working state for this function and its subroutines.
+  build_info bi;
+  bi.last_access = XOBNEWVEC (&m_temp_obstack, access_info *, m_num_regs + 1);
+  memset (bi.last_access, 0, (m_num_regs + 1) * sizeof (access_info *));
+
+  // The bb_live_out array shouldn't need to be initialized, since we'll
+  // always write to an entry before reading from it.  But poison the
+  // contents when checking, just to make sure we don't accidentally use
+  // an uninitialized value.
+  bi.bb_live_out = XOBNEWVEC (&m_temp_obstack, bb_live_out_info,
+			      num_bb_indices);
+  if (flag_checking)
+    memset (bi.bb_live_out, 0xaf,
+	    num_bb_indices * sizeof (bb_live_out_info));
+
+  // Only pay the overhead of recording a separate live-in bitmap if
+  // there are debug instructions that might need it.
+  auto_bitmap ebb_live_in;
+  if (MAY_HAVE_DEBUG_INSNS)
+    {
+      bi.ebb_live_in_for_debug = ebb_live_in;
+      // The bitmap is tested using individual bit operations, so optimize
+      // for that case.
+      bitmap_tree_view (ebb_live_in);
+    }
+  else
+    bi.ebb_live_in_for_debug = nullptr;
+
+  // Iterate over the blocks in reverse postorder.  In cases where
+  // multiple possible orders exist, prefer orders that chain blocks
+  // together into EBBs.  If multiple possible EBBs exist, try to pick
+  // the ones that are most likely to be profitable.
+  auto_vec<bb_info *, 16> ebb;
+  auto_bitmap ebb_use_tmp;
+  auto_bitmap ebb_def_tmp;
+  for (unsigned int i = 0; i < postorder_num; ++i)
+    if (!m_bbs[postorder[i]])
+      {
+	// Choose and create the blocks that should form the next EBB,
+	// and calculate the set of registers that the EBB uses and defines.
+	// Only do actual bitmap operations if the EBB contains multiple
+	// blocks.
+	basic_block cfg_bb = BASIC_BLOCK_FOR_FN (m_fn, postorder[i]);
+	bi.ebb_use = &DF_LR_BB_INFO (cfg_bb)->use;
+	bi.ebb_def = &DF_LR_BB_INFO (cfg_bb)->def;
+	ebb.safe_push (create_bb_info (cfg_bb));
+	cfg_bb = choose_next_block_in_ebb (cfg_bb);
+	if (cfg_bb)
+	  {
+	    // An EBB with two blocks.
+	    bitmap_ior (ebb_use_tmp, bi.ebb_use, &DF_LR_BB_INFO (cfg_bb)->use);
+	    bitmap_ior (ebb_def_tmp, bi.ebb_def, &DF_LR_BB_INFO (cfg_bb)->def);
+	    bi.ebb_use = ebb_use_tmp;
+	    bi.ebb_def = ebb_def_tmp;
+	    ebb.safe_push (create_bb_info (cfg_bb));
+	    cfg_bb = choose_next_block_in_ebb (cfg_bb);
+	    while (cfg_bb)
+	      {
+		// An EBB with three or more blocks.
+		bitmap_ior_into (bi.ebb_use, &DF_LR_BB_INFO (cfg_bb)->use);
+		bitmap_ior_into (bi.ebb_def, &DF_LR_BB_INFO (cfg_bb)->def);
+		ebb.safe_push (create_bb_info (cfg_bb));
+		cfg_bb = choose_next_block_in_ebb (cfg_bb);
+	      }
+	  }
+
+	// Create the EBB itself.
+	bi.current_ebb = allocate<ebb_info> (ebb[0], ebb.last ());
+	for (bb_info *bb : ebb)
+	  {
+	    bb->set_ebb (bi.current_ebb);
+	    append_bb (bb);
+	  }
+
+	// Populate the contents of the EBB.
+	bi.current_ebb->set_phi_insn (append_artificial_insn (ebb[0]));
+	if (ebb[0]->index () == ENTRY_BLOCK)
+	  {
+	    gcc_assert (ebb.length () == 1);
+	    bi.current_bb = ebb[0];
+	    add_entry_block_defs (bi);
+	    record_block_live_out (bi);
+	  }
+	else if (EDGE_COUNT (ebb[0]->cfg_bb ()->preds) == 0)
+	  // Leave unreachable blocks empty, since there is no useful
+	  // liveness information for them, and anything they do will
+	  // be wasted work.  In a cleaned-up cfg, the only unreachable
+	  // block we should see is the exit block of a noreturn function.
+	  for (bb_info *bb : ebb)
+	    {
+	      bb->set_head_insn (append_artificial_insn (bb));
+	      bb->set_end_insn (append_artificial_insn (bb));
+	    }
+	else
+	  {
+	    add_phi_nodes (bi);
+	    for (bb_info *bb : ebb)
+	      {
+		bi.current_bb = bb;
+		add_artificial_accesses (bi, DF_REF_AT_TOP);
+		if (bb->index () != EXIT_BLOCK)
+		  add_block_contents (bi);
+		add_artificial_accesses (bi, df_ref_flags ());
+		record_block_live_out (bi);
+		populate_backedge_phis (bi);
+	      }
+	  }
+	ebb.truncate (0);
+      }
+
+  delete[] postorder;
+}
+
+// Print a description of CALL_CLOBBERS to PP.
+void
+rtl_ssa::pp_ebb_call_clobbers (pretty_printer *pp,
+			       const ebb_call_clobbers_info *call_clobbers)
+{
+  if (!call_clobbers)
+    pp_string (pp, "<null>");
+  else
+    call_clobbers->print_full (pp);
+}
+
+// Print a description of BB to PP.
+void
+rtl_ssa::pp_bb (pretty_printer *pp, const bb_info *bb)
+{
+  if (!bb)
+    pp_string (pp, "<null>");
+  else
+    bb->print_full (pp);
+}
+
+// Print a description of EBB to PP.
+void
+rtl_ssa::pp_ebb (pretty_printer *pp, const ebb_info *ebb)
+{
+  if (!ebb)
+    pp_string (pp, "<null>");
+  else
+    ebb->print_full (pp);
+}
+
+// Print a description of CALL_CLOBBERS to FILE.
+void
+dump (FILE *file, const ebb_call_clobbers_info *call_clobbers)
+{
+  dump_using (file, pp_ebb_call_clobbers, call_clobbers);
+}
+
+// Print a description of BB to FILE.
+void
+dump (FILE *file, const bb_info *bb)
+{
+  dump_using (file, pp_bb, bb);
+}
+
+// Print a description of EBB to FILE.
+void
+dump (FILE *file, const ebb_info *ebb)
+{
+  dump_using (file, pp_ebb, ebb);
+}
+
+// Debug interfaces to the dump routines above.
+void debug (const ebb_call_clobbers_info *x) { dump (stderr, x); }
+void debug (const bb_info *x) { dump (stderr, x); }
+void debug (const ebb_info *x) { dump (stderr, x); }
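+
+// The debug routines above are intended to be called by hand from a
+// debugger, for example (a sketch of a gdb session):
+//
+//   (gdb) call debug (bb)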
diff --git a/gcc/rtl-ssa/blocks.h b/gcc/rtl-ssa/blocks.h
new file mode 100644
index 00000000000..f173e6ff8da
--- /dev/null
+++ b/gcc/rtl-ssa/blocks.h
@@ -0,0 +1,301 @@
+// Basic-block-related classes for RTL SSA                          -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+namespace rtl_ssa {
+
+// SSA-related information about a basic block.  Each block contains
+// the following, which are conceptually executed in order:
+//
+// - an artificial "head" insn_info that holds artificial uses and definitions
+//   for the start of the block.
+//
+// - one insn_info for each "real" instruction in the block
+//   (i.e. those that have an RTL pattern).
+//
+// - an artificial "end" insn_info that holds artificial uses and definitions
+//   for the end of the block.
+//
+// Blocks are grouped together into extended basic blocks.  In cases where
+// multiple EBBs exist (such as in a full diamond), we try to pick the one
+// that's most frequently executed.
+//
+// Blocks are chained together in reverse postorder.  (Rather than use a
+// list, we could instead have stored the index of the block in the overall
+// postorder.  However, using lists should make it cheaper to update the
+// information after trivial CFG manipulations.)
+class bb_info
+{
+  // Size: 6 LP64 words.
+  friend class function_info;
+
+public:
+  // Return the previous basic block in reverse postorder, or null if this
+  // is the entry block.
+  bb_info *prev_bb () const { return m_prev_bb; }
+
+  // Return the next basic block in reverse postorder, or null if this
+  // is the exit block.
+  bb_info *next_bb () const { return m_next_bb; }
+
+  // Return true if this block is the function's entry block.
+  bool is_entry_block () const { return !m_prev_bb; }
+
+  // Return true if this block is the function's exit block.
+  bool is_exit_block () const { return !m_next_bb; }
+
+  // Return the underlying basic_block structure.
+  basic_block cfg_bb () const { return m_cfg_bb; }
+
+  // Return the unique identifier of the underlying basic_block.  These uids
+  // do not follow any particular order.
+  unsigned int index () const { return m_cfg_bb->index; }
+
+  // Return the EBB that contains this block.
+  ebb_info *ebb () const { return m_ebb; }
+
+  // Return a list of all the instructions in the block, in execution order.
+  // The list includes the head and end instructions described above.
+  //
+  // Iterations over the list will pick up any new instructions that are
+  // inserted after the iterator's current instruction.
+  iterator_range<any_insn_iterator> all_insns () const;
+
+  // Like all_insns (), except that the instructions are in reverse order.
+  //
+  // Iterations over the list will pick up any new instructions that are
+  // inserted before the iterator's current instruction.
+  iterator_range<reverse_any_insn_iterator> reverse_all_insns () const;
+
+  // Like all_insns (), but without the debug instructions.
+  iterator_range<nondebug_insn_iterator> nondebug_insns () const;
+
+  // Like reverse_all_insns (), but without the debug instructions.
+  iterator_range<reverse_nondebug_insn_iterator>
+    reverse_nondebug_insns () const;
+
+  // Like all_insns (), but without the artificial instructions.
+  iterator_range<any_insn_iterator> real_insns () const;
+
+  // Like reverse_all_insns (), but without the artificial instructions.
+  iterator_range<reverse_any_insn_iterator> reverse_real_insns () const;
+
+  // Like real_insns (), but without the debug instructions.
+  iterator_range<nondebug_insn_iterator> real_nondebug_insns () const;
+
+  // Like reverse_real_insns (), but without the debug instructions.
+  iterator_range<reverse_nondebug_insn_iterator>
+    reverse_real_nondebug_insns () const;
+
+  // Return the instruction that holds the artificial uses and
+  // definitions at the head of the block.  The associated RTL insn
+  // is the block head note.
+  //
+  // This instruction always exists, even if it has no uses and definitions.
+  insn_info *head_insn () const { return m_head_insn; }
+
+  // Return the instruction that holds the artificial uses and definitions
+  // at the end of the block.  There is no associated RTL insn.
+  //
+  // This instruction always exists, even if it has no uses and definitions.
+  insn_info *end_insn () const { return m_end_insn; }
+
+  // Print "bb" + index () to PP.
+  void print_identifier (pretty_printer *pp) const;
+
+  // Print a full description of the block to PP.
+  void print_full (pretty_printer *) const;
+
+private:
+  bb_info (basic_block);
+
+  void set_prev_bb (bb_info *bb) { m_prev_bb = bb; }
+  void set_next_bb (bb_info *bb) { m_next_bb = bb; }
+  void set_cfg_bb (basic_block cfg_bb) { m_cfg_bb = cfg_bb; }
+  void set_ebb (ebb_info *ebb) { m_ebb = ebb; }
+  void set_head_insn (insn_info *insn) { m_head_insn = insn; }
+  void set_end_insn (insn_info *insn) { m_end_insn = insn; }
+
+  // The values returned by the functions above.
+  bb_info *m_prev_bb;
+  bb_info *m_next_bb;
+  basic_block m_cfg_bb;
+  ebb_info *m_ebb;
+  insn_info *m_head_insn;
+  insn_info *m_end_insn;
+};
+
+// Iterators for lists of basic blocks.
+using bb_iterator = list_iterator<bb_info, &bb_info::next_bb>;
+using reverse_bb_iterator = list_iterator<bb_info, &bb_info::prev_bb>;
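+
+// For example, the following sketch walks the "real" (pattern-carrying)
+// instructions of a block BB using the accessors above, much as
+// bb_info::print_full does in blocks.cc:
+//
+//   for (const insn_info *insn : bb->real_insns ())
+//     pp_insn (pp, insn);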
+
+// This class collects together instructions for which has_call_clobbers ()
+// is true, storing them in a splay tree that follows reverse postorder.
+// Instances of the class form a singly-linked list, with one instance
+// per predefined_function_abi.
+class ebb_call_clobbers_info : public insn_call_clobbers_tree
+{
+  // Size: 3 LP64 words.
+  friend class function_info;
+
+public:
+  // Return the next group in the list.
+  ebb_call_clobbers_info *next () const { return m_next; }
+
+  // Return the function ABI used by all the calls in the group.
+  const predefined_function_abi *abi () const { return m_abi; }
+
+  // Return true if at least one call in the group should conservatively
+  // be assumed to clobber RESOURCE.
+  bool clobbers (resource_info) const;
+
+  // Print a summary of what the class describes to PP, without printing
+  // the actual instructions.
+  void print_summary (pretty_printer *pp) const;
+
+  // Print a full description of the object to PP, including the
+  // instructions it contains.
+  void print_full (pretty_printer *) const;
+
+private:
+  ebb_call_clobbers_info (const predefined_function_abi *);
+
+  // The values returned by the accessors above.
+  ebb_call_clobbers_info *m_next;
+  const predefined_function_abi *m_abi;
+};
+
+// A list of ebb_call_clobbers_infos.
+using ebb_call_clobbers_iterator
+  = list_iterator<ebb_call_clobbers_info, &ebb_call_clobbers_info::next>;
+
+// Information about an extended basic block.
+//
+// Each EBB has a list of phi nodes and starts with an artificial phi
+// instruction that conceptually "executes" the phi nodes.  The phi
+// nodes are independent of one another and so can be executed in any
+// order.  The order of the phi nodes in the list is not significant.
+//
+// Each EBB also maintains a list of ebb_call_clobbers_info structures
+// that describe all instructions for which has_call_clobbers () is true.
+// See the comment above that class for details.
+class ebb_info
+{
+  // Size: 5 LP64 words.
+  friend class function_info;
+
+public:
+  // Return the previous EBB in reverse postorder, or null if this EBB
+  // contains the entry block.
+  ebb_info *prev_ebb () const;
+
+  // Return the next EBB in reverse postorder, or null if this EBB contains
+  // the exit block.
+  ebb_info *next_ebb () const;
+
+  // Return the instruction that holds the EBB's phi nodes (and does
+  // nothing else).  There is no associated RTL insn.
+  //
+  // This instruction always exists, even if the EBB does not currently
+  // need any phi nodes.
+  insn_info *phi_insn () const { return m_phi_insn; }
+
+  // Return the first and last blocks in the EBB.
+  bb_info *first_bb () const { return m_first_bb; }
+  bb_info *last_bb () const { return m_last_bb; }
+
+  // Return the first of the EBB's phi nodes.
+  phi_info *first_phi () const { return m_first_phi; }
+
+  // Return the head of the list of ebb_call_clobbers_infos.
+  ebb_call_clobbers_info *first_call_clobbers () const;
+
+  // Return the list of ebb_call_clobbers_infos.
+  iterator_range<ebb_call_clobbers_iterator> call_clobbers () const;
+
+  // Return a list of the EBB's phi nodes, in arbitrary order.
+  iterator_range<phi_iterator> phis () const;
+
+  // Return a list of the blocks in the EBB, in execution order.
+  iterator_range<bb_iterator> bbs () const;
+
+  // Return a list of the blocks in the EBB, in reverse execution order.
+  iterator_range<reverse_bb_iterator> reverse_bbs () const;
+
+  // Return a list of all the instructions in the EBB, in execution order.
+  // The list includes phi_insn (), the head and end of each block,
+  // and the real instructions in each block.
+  //
+  // Iterations over the list will pick up any new instructions that are
+  // inserted after the iterator's current instruction.
+  iterator_range<any_insn_iterator> all_insns () const;
+
+  // Like all_insns (), except that the instructions are in reverse order.
+  //
+  // Iterations over the list will pick up any new instructions that are
+  // inserted before the iterator's current instruction.
+  iterator_range<reverse_any_insn_iterator> reverse_all_insns () const;
+
+  // Like all_insns (), but without the debug instructions.
+  iterator_range<nondebug_insn_iterator> nondebug_insns () const;
+
+  // Like reverse_all_insns (), but without the debug instructions.
+  iterator_range<reverse_nondebug_insn_iterator>
+    reverse_nondebug_insns () const;
+
+  // Return an insn_range that covers the same instructions as all_insns ().
+  insn_range_info insn_range () const;
+
+  // Print "ebb" + first_bb ()->index () to PP.
+  void print_identifier (pretty_printer *pp) const;
+
+  // Print a full description of the EBB to PP.
+  void print_full (pretty_printer *pp) const;
+
+private:
+  ebb_info (bb_info *, bb_info *);
+
+  void set_first_phi (phi_info *phi) { m_first_phi = phi; }
+  void set_phi_insn (insn_info *insn) { m_phi_insn = insn; }
+  void set_first_call_clobbers (ebb_call_clobbers_info *);
+
+  // The values returned by the functions above.
+  phi_info *m_first_phi;
+  insn_info *m_phi_insn;
+  bb_info *m_first_bb;
+  bb_info *m_last_bb;
+  ebb_call_clobbers_info *m_first_call_clobbers;
+};
+
+// Iterators for lists of extended basic blocks.
+using ebb_iterator = list_iterator<ebb_info, &ebb_info::next_ebb>;
+using reverse_ebb_iterator = list_iterator<ebb_info, &ebb_info::prev_ebb>;
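+
+// For example, a pass can visit an EBB's phi nodes and constituent
+// blocks with the accessors above (a sketch):
+//
+//   for (phi_info *phi : ebb->phis ())
+//     ...process phi...;
+//   for (bb_info *bb : ebb->bbs ())
+//     ...process bb...;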
+
+void pp_bb (pretty_printer *, const bb_info *);
+void pp_ebb_call_clobbers (pretty_printer *, const ebb_call_clobbers_info *);
+void pp_ebb (pretty_printer *, const ebb_info *);
+
+}
+
+void dump (FILE *, const rtl_ssa::bb_info *);
+void dump (FILE *, const rtl_ssa::ebb_call_clobbers_info *);
+void dump (FILE *, const rtl_ssa::ebb_info *);
+
+void DEBUG_FUNCTION debug (const rtl_ssa::bb_info *);
+void DEBUG_FUNCTION debug (const rtl_ssa::ebb_call_clobbers_info *);
+void DEBUG_FUNCTION debug (const rtl_ssa::ebb_info *);
diff --git a/gcc/rtl-ssa/change-utils.h b/gcc/rtl-ssa/change-utils.h
new file mode 100644
index 00000000000..824533076e9
--- /dev/null
+++ b/gcc/rtl-ssa/change-utils.h
@@ -0,0 +1,137 @@
+// RTL SSA utility functions for changing instructions              -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+namespace rtl_ssa {
+
+// Return true if INSN is one of the instructions being changed by CHANGES.
+inline bool
+insn_is_changing (array_slice<insn_change *const> changes,
+		  const insn_info *insn)
+{
+  for (const insn_change *change : changes)
+    if (change->insn () == insn)
+      return true;
+  return false;
+}
+
+// Return a closure of insn_is_changing, for use as a predicate.
+// This could be done using local lambdas instead, but the predicate is
+// used often enough that having a class should be more convenient and allow
+// reuse of template instantiations.
+//
+// We don't use std::bind because it would involve an indirect function call,
+// whereas this function is used in relatively performance-critical code.
+inline insn_is_changing_closure
+insn_is_changing (array_slice<insn_change *const> changes)
+{
+  return insn_is_changing_closure (changes);
+}
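+
+// For example (a sketch): a pass that is changing several instructions
+// as a group can pass the closure to the "_ignoring" routines below:
+//
+//   restrict_movement_ignoring (*change, insn_is_changing (changes));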
+
+// Restrict CHANGE.move_range so that the changed instruction can perform
+// all its definitions and uses.  Assume that if:
+//
+// - CHANGE contains an access A1 of resource R;
+// - an instruction I2 contains another access A2 to R; and
+// - IGNORE (I2) is true
+//
+// then either:
+//
+// - A2 will be removed; or
+// - something will ensure that A1 and A2 maintain their current order,
+//   without this having to be enforced by CHANGE's move range.
+//
+// IGNORE should return true for CHANGE.insn ().
+//
+// Return true on success, otherwise leave CHANGE.move_range in an invalid
+// state.
+//
+// This function only works correctly for instructions that remain within
+// the same extended basic block.
+template<typename IgnorePredicate>
+bool
+restrict_movement_ignoring (insn_change &change, IgnorePredicate ignore)
+{
+  // Uses generally lead to failure more quickly, so test those first.
+  return (restrict_movement_for_uses_ignoring (change.move_range,
+					       change.new_uses, ignore)
+	  && restrict_movement_for_defs_ignoring (change.move_range,
+						  change.new_defs, ignore)
+	  && canonicalize_move_range (change.move_range, change.insn ()));
+}
+
+// Like restrict_movement_ignoring, but ignore only the instruction
+// that is being changed.
+inline bool
+restrict_movement (insn_change &change)
+{
+  return restrict_movement_ignoring (change, insn_is (change.insn ()));
+}
+
+using add_regno_clobber_fn = std::function<bool (insn_change &,
+						 unsigned int)>;
+bool recog_internal (insn_change &, add_regno_clobber_fn);
+
+// Try to recognize the new instruction pattern for CHANGE, potentially
+// tweaking the pattern or adding extra clobbers in order to make it match.
+//
+// When adding an extra clobber for register R, restrict CHANGE.move_range
+// to a range of instructions for which R is not live.  When determining
+// whether R is live, ignore accesses made by an instruction I if
+// IGNORE (I) is true.  The caller then assumes the responsibility
+// of ensuring that CHANGE and I are placed in a valid order.
+//
+// IGNORE should return true for CHANGE.insn ().
+//
+// Return true on success.  Leave CHANGE unmodified on failure.
+template<typename IgnorePredicate>
+inline bool
+recog_ignoring (obstack_watermark &watermark, insn_change &change,
+		IgnorePredicate ignore)
+{
+  auto add_regno_clobber = [&](insn_change &change, unsigned int regno)
+    {
+      return crtl->ssa->add_regno_clobber (watermark, change, regno, ignore);
+    };
+  return recog_internal (change, add_regno_clobber);
+}
+
+// As for recog_ignoring, but ignore only the instruction that is being
+// changed.
+inline bool
+recog (obstack_watermark &watermark, insn_change &change)
+{
+  return recog_ignoring (watermark, change, insn_is (change.insn ()));
+}
+
+// Check whether insn costs indicate that the net effect of the changes
+// in CHANGES is worthwhile.  Require a strict improvement if STRICT_P,
+// otherwise allow the new instructions to be the same cost as the old
+// instructions.
+bool changes_are_worthwhile (array_slice<insn_change *const> changes,
+			     bool strict_p = false);
+
+// Like changes_are_worthwhile, but for a single change.
+inline bool
+change_is_worthwhile (insn_change &change, bool strict_p = false)
+{
+  insn_change *changes[] = { &change };
+  return changes_are_worthwhile (changes, strict_p);
+}
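+
+// A sketch of how a pass might chain the utilities above for a single
+// instruction change.  Building the insn_change and committing the
+// result are pass-specific, and the commit step is assumed to go
+// through the function_info change routines:
+//
+//   insn_change change (insn);
+//   // ...fill in change.new_defs, change.new_uses and the new pattern...
+//   if (restrict_movement (change)
+//       && recog (watermark, change)
+//       && change_is_worthwhile (change))
+//     // ...commit the change...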
+
+}
diff --git a/gcc/rtl-ssa/changes.cc b/gcc/rtl-ssa/changes.cc
new file mode 100644
index 00000000000..1885a80e2e9
--- /dev/null
+++ b/gcc/rtl-ssa/changes.cc
@@ -0,0 +1,1025 @@
+// RTL SSA routines for changing instructions                       -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#define INCLUDE_ALGORITHM
+#define INCLUDE_FUNCTIONAL
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "df.h"
+#include "rtl-ssa.h"
+#include "rtl-ssa/internals.inl"
+#include "target.h"
+#include "predict.h"
+#include "memmodel.h" // Needed by emit-rtl.h
+#include "emit-rtl.h"
+#include "cfghooks.h"
+#include "cfgrtl.h"
+
+using namespace rtl_ssa;
+
+// See the comment above the declaration.
+void
+insn_change::print (pretty_printer *pp) const
+{
+  if (m_is_deletion)
+    {
+      pp_string (pp, "deletion of ");
+      pp_insn (pp, m_insn);
+    }
+  else
+    {
+      pp_string (pp, "change to ");
+      pp_insn (pp, m_insn);
+      pp_newline_and_indent (pp, 2);
+      pp_string (pp, "~~~~~~~");
+
+      pp_newline_and_indent (pp, 0);
+      pp_string (pp, "new cost: ");
+      pp_decimal_int (pp, new_cost);
+
+      pp_newline_and_indent (pp, 0);
+      pp_string (pp, "new uses:");
+      pp_newline_and_indent (pp, 2);
+      pp_accesses (pp, new_uses);
+      pp_indentation (pp) -= 2;
+
+      pp_newline_and_indent (pp, 0);
+      pp_string (pp, "new defs:");
+      pp_newline_and_indent (pp, 2);
+      pp_accesses (pp, new_defs);
+      pp_indentation (pp) -= 2;
+
+      pp_newline_and_indent (pp, 0);
+      pp_string (pp, "first insert-after candidate: ");
+      move_range.first->print_identifier_and_location (pp);
+
+      pp_newline_and_indent (pp, 0);
+      pp_string (pp, "last insert-after candidate: ");
+      move_range.last->print_identifier_and_location (pp);
+    }
+}
+
+// Return a copy of access_array ACCESSES, allocating it on the
+// temporary obstack.
+access_array
+function_info::temp_access_array (access_array accesses)
+{
+  if (accesses.empty ())
+    return accesses;
+
+  gcc_assert (obstack_object_size (&m_temp_obstack) == 0);
+  obstack_grow (&m_temp_obstack, accesses.begin (), accesses.size_bytes ());
+  return { static_cast<access_info **> (obstack_finish (&m_temp_obstack)),
+	   accesses.size () };
+}
+
+// See the comment above the declaration.
+bool
+function_info::verify_insn_changes (array_slice<insn_change *const> changes)
+{
+  HARD_REG_SET defined_hard_regs, clobbered_hard_regs;
+  CLEAR_HARD_REG_SET (defined_hard_regs);
+  CLEAR_HARD_REG_SET (clobbered_hard_regs);
+
+  insn_info *min_insn = m_first_insn;
+  for (insn_change *change : changes)
+    if (!change->is_deletion ())
+      {
+	// Make sure that the changes can be kept in their current order
+	// while honoring all of the move ranges.
+	min_insn = later_insn (min_insn, change->move_range.first);
+	while (min_insn != change->insn () && !can_insert_after (min_insn))
+	  min_insn = min_insn->next_nondebug_insn ();
+	if (*min_insn > *change->move_range.last)
+	  {
+	    if (dump_file && (dump_flags & TDF_DETAILS))
+	      fprintf (dump_file, "no viable insn position assignment\n");
+	    return false;
+	  }
+
+	// If recog introduced new clobbers of a register as part of
+	// the matching process, make sure that they don't conflict
+	// with any other new definitions or uses of the register.
+	// (We have already checked that they don't conflict with
+	// unchanging definitions and uses.)
+	for (use_info *use : change->new_uses)
+	  {
+	    unsigned int regno = use->regno ();
+	    if (HARD_REGISTER_NUM_P (regno)
+		&& TEST_HARD_REG_BIT (clobbered_hard_regs, regno))
+	      {
+		if (dump_file && (dump_flags & TDF_DETAILS))
+		  fprintf (dump_file, "register %d would be clobbered"
+			   " while it is still live\n", regno);
+		return false;
+	      }
+	  }
+	for (def_info *def : change->new_defs)
+	  {
+	    unsigned int regno = def->regno ();
+	    if (HARD_REGISTER_NUM_P (regno))
+	      {
+		if (def->m_is_temp)
+		  {
+		    // This is a clobber introduced by recog.
+		    gcc_checking_assert (is_a<clobber_info *> (def));
+		    if (TEST_HARD_REG_BIT (defined_hard_regs, regno))
+		      {
+			if (dump_file && (dump_flags & TDF_DETAILS))
+			  fprintf (dump_file, "conflicting definitions of"
+				   " register %d\n", regno);
+			return false;
+		      }
+		    SET_HARD_REG_BIT (clobbered_hard_regs, regno);
+		  }
+		else if (is_a<set_info *> (def))
+		  {
+		    // REGNO now has a defined value.
+		    SET_HARD_REG_BIT (defined_hard_regs, regno);
+		    CLEAR_HARD_REG_BIT (clobbered_hard_regs, regno);
+		  }
+	      }
+	  }
+      }
+  return true;
+}
+
+// See the comment above the declaration.
+bool
+rtl_ssa::changes_are_worthwhile (array_slice<insn_change *const> changes,
+				 bool strict_p)
+{
+  unsigned int old_cost = 0;
+  unsigned int new_cost = 0;
+  for (insn_change *change : changes)
+    {
+      old_cost += change->old_cost ();
+      if (!change->is_deletion ())
+	{
+	  basic_block cfg_bb = change->bb ()->cfg_bb ();
+	  change->new_cost = insn_cost (change->rtl (),
+					optimize_bb_for_speed_p (cfg_bb));
+	  new_cost += change->new_cost;
+	}
+    }
+  bool ok_p = (strict_p ? new_cost < old_cost : new_cost <= old_cost);
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "original cost");
+      char sep = '=';
+      for (const insn_change *change : changes)
+	{
+	  fprintf (dump_file, " %c %d", sep, change->old_cost ());
+	  sep = '+';
+	}
+      fprintf (dump_file, ", replacement cost");
+      sep = '=';
+      for (const insn_change *change : changes)
+	if (!change->is_deletion ())
+	  {
+	    fprintf (dump_file, " %c %d", sep, change->new_cost);
+	    sep = '+';
+	  }
+      fprintf (dump_file, "; %s\n",
+	       ok_p ? "keeping replacement" : "rejecting replacement");
+    }
+  return ok_p;
+}
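+
+// For example, if two instructions costing 4 each are replaced by a
+// single instruction costing 8 (illustrative costs, with STRICT_P
+// false), the dump output above would read:
+//
+//   original cost = 4 + 4, replacement cost = 8; keeping replacement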
+
+// Update the REG_NOTES of INSN, whose pattern has just been changed.
+static void
+update_notes (rtx_insn *insn)
+{
+  for (rtx *note_ptr = &REG_NOTES (insn); *note_ptr; )
+    {
+      rtx note = *note_ptr;
+      bool keep_p = true;
+      switch (REG_NOTE_KIND (note))
+	{
+	case REG_EQUAL:
+	case REG_EQUIV:
+	case REG_NOALIAS:
+	  keep_p = (single_set (insn) != nullptr);
+	  break;
+
+	case REG_UNUSED:
+	case REG_DEAD:
+	  // These notes are stale.  We'll recompute REG_UNUSED notes
+	  // after the update.
+	  keep_p = false;
+	  break;
+
+	default:
+	  break;
+	}
+      if (keep_p)
+	note_ptr = &XEXP (*note_ptr, 1);
+      else
+	{
+	  *note_ptr = XEXP (*note_ptr, 1);
+	  free_EXPR_LIST_node (note);
+	}
+    }
+}
+
+// Pick a location for CHANGE's instruction and return the instruction
+// after which it should be placed.
+static insn_info *
+choose_insn_placement (insn_change &change)
+{
+  gcc_checking_assert (change.move_range);
+
+  insn_info *insn = change.insn ();
+  insn_info *first = change.move_range.first;
+  insn_info *last = change.move_range.last;
+
+  // Quick(ish) exit if there is only one possible choice.
+  if (first == last)
+    return first;
+  if (first == insn->prev_nondebug_insn () && last == insn)
+    return insn;
+
+  // For now just use the closest valid choice to the original instruction.
+  // If the register usage has changed significantly, it might instead be
+  // better to try to take register pressure into account.
+  insn_info *closest = change.move_range.clamp_insn_to_range (insn);
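+  // Skip over any positions after which new instructions cannot be
+  // inserted (see can_insert_after).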
+  while (closest != insn && !can_insert_after (closest))
+    closest = closest->next_nondebug_insn ();
+  return closest;
+}
+
+// Record any changes related to CHANGE that need to be queued for later.
+void
+function_info::possibly_queue_changes (insn_change &change)
+{
+  insn_info *insn = change.insn ();
+  rtx_insn *rtl = insn->rtl ();
+
+  // If the instruction could previously throw, we eventually need to call
+  // purge_dead_edges to check whether things have changed.
+  if (find_reg_note (rtl, REG_EH_REGION, nullptr))
+    bitmap_set_bit (m_need_to_purge_dead_edges, insn->bb ()->index ());
+
+  auto needs_pending_update = [&]()
+    {
+      // If an instruction became a no-op without the pass explicitly
+      // deleting it, queue the deletion for later.  Removing the
+      // instruction on the fly would require an update to all instructions
+      // that use the result of the move, which would be a potential source
+      // of quadraticness.  Also, definitions shouldn't disappear under
+      // the pass's feet.
+      if (INSN_CODE (rtl) == NOOP_MOVE_INSN_CODE)
+	return true;
+
+      // If any jumps got turned into unconditional jumps or nops, we need
+      // to update the CFG accordingly.
+      if (JUMP_P (rtl)
+	  && (returnjump_p (rtl) || any_uncondjump_p (rtl))
+	  && !single_succ_p (insn->bb ()->cfg_bb ()))
+	return true;
+
+      // If a previously conditional trap now always fires, execution
+      // terminates at that point.
+      rtx pattern = PATTERN (rtl);
+      if (GET_CODE (pattern) == TRAP_IF
+	  && XEXP (pattern, 0) == const1_rtx)
+	return true;
+
+      return false;
+    };
+
+  if (needs_pending_update ()
+      && bitmap_set_bit (m_queued_insn_update_uids, insn->uid ()))
+    {
+      gcc_assert (!change.is_deletion ());
+      m_queued_insn_updates.safe_push (insn);
+    }
+}
+
+// Remove the instruction described by CHANGE from the underlying RTL
+// and from the insn_info list.
+static void
+delete_insn (insn_change &change)
+{
+  insn_info *insn = change.insn ();
+  rtx_insn *rtl = change.rtl ();
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    fprintf (dump_file, "deleting insn %d\n", insn->uid ());
+  set_insn_deleted (rtl);
+}
+
+// Move the RTL instruction associated with CHANGE so that it comes
+// immediately after AFTER.
+static void
+move_insn (insn_change &change, insn_info *after)
+{
+  rtx_insn *rtl = change.rtl ();
+  rtx_insn *after_rtl = after->rtl ();
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    fprintf (dump_file, "moving insn %d after insn %d\n",
+	     INSN_UID (rtl), INSN_UID (after_rtl));
+
+  // At the moment we don't support moving instructions between EBBs,
+  // but this would be worth adding if it's useful.
+  insn_info *insn = change.insn ();
+  gcc_assert (after->ebb () == insn->ebb ());
+  bb_info *bb = after->bb ();
+  basic_block cfg_bb = bb->cfg_bb ();
+
+  if (insn->bb () != bb)
+    // Force DF to mark the old block as dirty.
+    df_insn_delete (rtl);
+  ::remove_insn (rtl);
+  ::add_insn_after (rtl, after_rtl, cfg_bb);
+}
+
+// The instruction associated with CHANGE is being changed in-place.
+// Update the DF information for its new pattern.
+static void
+update_insn_in_place (insn_change &change)
+{
+  insn_info *insn = change.insn ();
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    fprintf (dump_file, "updating insn %d in-place\n", insn->uid ());
+  df_insn_rescan (change.rtl ());
+}
+
+// Finalize the new list of definitions and uses in CHANGE, removing
+// any uses and definitions that are no longer needed, and converting
+// pending clobbers into actual definitions.
+void
+function_info::finalize_new_accesses (insn_change &change)
+{
+  insn_info *insn = change.insn ();
+
+  // Get a list of all the things that the instruction now references.
+  vec_rtx_properties properties;
+  properties.add_insn (insn->rtl (), true);
+
+  // Build up the new list of definitions.
+  for (rtx_obj_reference ref : properties.refs ())
+    if (ref.is_write ())
+      {
+	def_info *def = find_access (change.new_defs, ref.regno);
+	gcc_assert (def);
+	if (def->m_is_temp)
+	  {
+	    // At present, the only temporary instruction definitions we
+	    // create are clobbers, such as those added during recog.
+	    gcc_assert (is_a<clobber_info *> (def));
+	    def = allocate<clobber_info> (change.insn (), ref.regno);
+	  }
+	else if (!def->m_has_been_superceded)
+	  {
+	    // This is a second or subsequent definition.
+	    // See function_info::record_def for a discussion of when
+	    // this can happen.
+	    def->record_reference (ref, false);
+	    continue;
+	  }
+	else
+	  {
+	    def->m_has_been_superceded = false;
+
+	    // Clobbers can move around, so remove them from their current
+	    // position and add them back in their final position.
+	    //
+	    // At the moment, we don't allow sets to move relative to other
+	    // definitions of the same resource, so we can leave those where
+	    // they are.  It might be useful to relax this in future.
+	    // The main complication is that removing a set would potentially
+	    // fuse two adjoining clobber_groups, and adding the set back
+	    // would require the group to be split again.
+	    if (is_a<clobber_info *> (def))
+	      remove_def (def);
+	    else if (ref.is_reg ())
+	      def->set_mode (ref.mode);
+	    def->set_insn (insn);
+	  }
+	def->record_reference (ref, true);
+	m_temp_defs.safe_push (def);
+      }
+
+  // Also keep any explicitly-recorded call clobbers, which are deliberately
+  // excluded from the vec_rtx_properties.
+  for (def_info *def : change.new_defs)
+    if (def->m_has_been_superceded && def->is_call_clobber ())
+      {
+	def->m_has_been_superceded = false;
+	def->set_insn (insn);
+	m_temp_defs.safe_push (def);
+      }
+
+  // Install the new list of definitions in CHANGE.
+  sort_accesses (m_temp_defs);
+  access_array accesses = temp_access_array (m_temp_defs);
+  change.new_defs = def_array (accesses);
+  m_temp_defs.truncate (0);
+
+  // Create temporary copies of use_infos that are already attached to
+  // other insns, which could happen if the uses come from unchanging
+  // insns or if they have been used by earlier changes.  Doing this
+  // makes it easier to detect multiple reads below.
+  auto *unshared_uses_base = XOBNEWVEC (&m_temp_obstack, access_info *,
+					change.new_uses.size ());
+  unsigned int i = 0;
+  for (use_info *use : change.new_uses)
+    {
+      if (!use->m_has_been_superceded)
+	{
+	  use = allocate_temp<use_info> (insn, use->resource (), use->def ());
+	  use->m_has_been_superceded = true;
+	  use->m_is_temp = true;
+	}
+      unshared_uses_base[i++] = use;
+    }
+  auto unshared_uses = use_array (unshared_uses_base, change.new_uses.size ());
+
+  // Add (possibly temporary) uses to m_temp_uses for each resource.
+  // If there are multiple references to the same resource, aggregate
+  // information in the modes and flags.
+  for (rtx_obj_reference ref : properties.refs ())
+    if (ref.is_read ())
+      {
+	unsigned int regno = ref.regno;
+	machine_mode mode = ref.is_reg () ? ref.mode : BLKmode;
+	use_info *use = find_access (unshared_uses, ref.regno);
+	gcc_assert (use);
+	if (use->m_has_been_superceded)
+	  {
+	    // This is the first reference to the resource.
+	    bool is_temp = use->m_is_temp;
+	    *use = use_info (insn, resource_info { mode, regno }, use->def ());
+	    use->m_is_temp = is_temp;
+	    use->record_reference (ref, true);
+	    m_temp_uses.safe_push (use);
+	  }
+	else
+	  {
+	    // Record the mode of the largest use.  The choice is arbitrary if
+	    // the instruction (unusually) references the same register in two
+	    // different but equal-sized modes.
+	    if (HARD_REGISTER_NUM_P (regno)
+		&& partial_subreg_p (use->mode (), mode))
+	      use->set_mode (mode);
+	    use->record_reference (ref, false);
+	  }
+      }
+
+  // Replace any temporary uses and definitions with real ones.
+  for (unsigned int i = 0; i < m_temp_uses.length (); ++i)
+    {
+      auto *use = as_a<use_info *> (m_temp_uses[i]);
+      if (use->m_is_temp)
+	{
+	  m_temp_uses[i] = use = allocate<use_info> (*use);
+	  use->m_is_temp = false;
+	  set_info *def = use->def ();
+	  // Handle cases in which the value was previously not used
+	  // within the block.
+	  if (def && def->m_is_temp)
+	    {
+	      phi_info *phi = as_a<phi_info *> (def);
+	      gcc_assert (phi->is_degenerate ());
+	      phi = create_degenerate_phi (phi->ebb (), phi->input_value (0));
+	      use->set_def (phi);
+	    }
+	}
+    }
+
+  // Install the new list of definitions in CHANGE.
+  sort_accesses (m_temp_uses);
+  change.new_uses = use_array (temp_access_array (m_temp_uses));
+  m_temp_uses.truncate (0);
+
+  // Record the new instruction-wide properties.
+  insn->set_properties (properties);
+}
+
+// Copy information from CHANGE to its underlying insn_info, given that
+// the insn_info has already been placed appropriately.
+void
+function_info::apply_changes_to_insn (insn_change &change)
+{
+  insn_info *insn = change.insn ();
+  if (change.is_deletion ())
+    {
+      insn->set_accesses (nullptr, 0, 0);
+      return;
+    }
+
+  // Copy the cost.
+  insn->set_cost (change.new_cost);
+
+  // Add all clobbers.  Sets never move relative to other definitions,
+  // so they are OK as-is.
+  for (def_info *def : change.new_defs)
+    if (is_a<clobber_info *> (def))
+      add_def (def);
+
+  // Add all uses, now that their position is final.
+  for (use_info *use : change.new_uses)
+    add_use (use);
+
+  // Copy the uses and definitions.
+  unsigned int num_defs = change.new_defs.size ();
+  unsigned int num_uses = change.new_uses.size ();
+  if (num_defs + num_uses <= insn->num_defs () + insn->num_uses ())
+    insn->copy_accesses (change.new_defs, change.new_uses);
+  else
+    {
+      access_array_builder builder (&m_obstack);
+      builder.reserve (num_defs + num_uses);
+
+      for (def_info *def : change.new_defs)
+	builder.quick_push (def);
+      for (use_info *use : change.new_uses)
+	builder.quick_push (use);
+
+      insn->set_accesses (builder.finish ().begin (), num_defs, num_uses);
+    }
+
+  add_reg_unused_notes (insn);
+}
+
+// Add a temporary placeholder instruction after AFTER.
+insn_info *
+function_info::add_placeholder_after (insn_info *after)
+{
+  insn_info *insn = allocate_temp<insn_info> (after->bb (), nullptr, -1);
+  add_insn_after (insn, after);
+  return insn;
+}
+
+// See the comment above the declaration.
+void
+function_info::change_insns (array_slice<insn_change *> changes)
+{
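+  // The update is carried out in phases:
+  //
+  // (1) Tentatively mark the old uses and definitions of each change
+  //     for deletion and make sure that each move range still leaves
+  //     room for earlier instructions.
+  //
+  // (2) Walk backwards through the changes, assigning each instruction
+  //     a position and updating the underlying RTL and DF information.
+  //
+  // (3) Remove definitions that are no longer needed.
+  //
+  // (4) Move each insn_info to its final position.
+  //
+  // (5) Apply the changes to the insn_infos themselves.
+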
+  auto watermark = temp_watermark ();
+
+  insn_info *min_insn = m_first_insn;
+  for (insn_change *change : changes)
+    {
+      // Tentatively mark all the old uses and definitions for deletion.
+      for (use_info *use : change->old_uses ())
+	{
+	  use->m_has_been_superceded = true;
+	  remove_use (use);
+	}
+      for (def_info *def : change->old_defs ())
+	def->m_has_been_superceded = true;
+
+      if (!change->is_deletion ())
+	{
+	  // Remove any notes that are no longer relevant.
+	  update_notes (change->rtl ());
+
+	  // Make sure that the placement of this instruction would still
+	  // leave room for previous instructions.
+	  change->move_range = move_later_than (change->move_range, min_insn);
+	  if (!canonicalize_move_range (change->move_range, change->insn ()))
+	    // verify_insn_changes is supposed to make sure that this holds.
+	    gcc_unreachable ();
+	  min_insn = later_insn (min_insn, change->move_range.first);
+	}
+    }
+
+  // Walk backwards through the changes, allocating specific positions
+  // to each one.  Update the underlying RTL and its associated DF
+  // information.
+  insn_info *following_insn = nullptr;
+  auto_vec<insn_info *, 16> placeholders;
+  placeholders.safe_grow_cleared (changes.size ());
+  for (unsigned int i = changes.size (); i-- > 0;)
+    {
+      insn_change &change = *changes[i];
+      insn_info *placeholder = nullptr;
+      possibly_queue_changes (change);
+      if (change.is_deletion ())
+	delete_insn (change);
+      else
+	{
+	  // Make sure that this instruction comes before later ones.
+	  if (following_insn)
+	    {
+	      change.move_range = move_earlier_than (change.move_range,
+						     following_insn);
+	      if (!canonicalize_move_range (change.move_range,
+					    change.insn ()))
+		// verify_insn_changes is supposed to make sure that this
+		// holds.
+		gcc_unreachable ();
+	    }
+
+	  // Decide which instruction INSN should go after.
+	  insn_info *after = choose_insn_placement (change);
+
+	  // If INSN is moving, insert a placeholder insn_info at the
+	  // new location.  We can't move INSN itself yet because it
+	  // might still be referenced by earlier move ranges.
+	  insn_info *insn = change.insn ();
+	  if (after == insn || after == insn->prev_nondebug_insn ())
+	    {
+	      update_insn_in_place (change);
+	      following_insn = insn;
+	    }
+	  else
+	    {
+	      move_insn (change, after);
+	      placeholder = add_placeholder_after (after);
+	      following_insn = placeholder;
+	    }
+
+	  // Finalize the new list of accesses for the change.  Don't install
+	  // them yet, so that we still have access to the old lists below.
+	  finalize_new_accesses (change);
+	}
+      placeholders[i] = placeholder;
+    }
+
+  // Remove all definitions that are no longer needed.  After the above,
+  // such definitions should no longer have any registered users.
+  //
+  // In particular, this means that consumers must handle debug
+  // instructions before removing a set.
+  for (insn_change *change : changes)
+    for (def_info *def : change->old_defs ())
+      if (def->m_has_been_superceded)
+	{
+	  auto *set = dyn_cast<set_info *> (def);
+	  gcc_assert (!set || !set->has_any_uses ());
+	  remove_def (def);
+	}
+
+  // Move the insn_infos to their new locations.
+  for (unsigned int i = 0; i < changes.size (); ++i)
+    {
+      insn_change &change = *changes[i];
+      insn_info *insn = change.insn ();
+      if (change.is_deletion ())
+	remove_insn (insn);
+      else if (insn_info *placeholder = placeholders[i])
+	{
+	  // Check if earlier movements turned a move into a no-op.
+	  if (placeholder->prev_nondebug_insn () == insn
+	      || placeholder->next_nondebug_insn () == insn)
+	    {
+	      remove_insn (placeholder);
+	      placeholders[i] = nullptr;
+	    }
+	  else
+	    {
+	      // Remove the placeholder first so that we have a wider range of
+	      // program points when inserting INSN.
+	      insn_info *after = placeholder->prev_any_insn ();
+	      remove_insn (insn);
+	      remove_insn (placeholder);
+	      insn->set_bb (after->bb ());
+	      add_insn_after (insn, after);
+	    }
+	}
+    }
+
+  // Finally apply the changes to the underlying insn_infos.
+  for (insn_change *change : changes)
+    apply_changes_to_insn (*change);
+}
+
+// See the comment above the declaration.
+void
+function_info::change_insn (insn_change &change)
+{
+  insn_change *changes[] = { &change };
+  return change_insns (changes);
+}
+
+// Try to adjust CHANGE so that its pattern can include clobber rtx CLOBBER.
+// Return true on success.
+//
+// ADD_REGNO_CLOBBER is a specialization of function_info::add_regno_clobber
+// for a specific caller-provided predicate.
+static bool
+add_clobber (insn_change &change, add_regno_clobber_fn add_regno_clobber,
+	     rtx clobber)
+{
+  rtx pat = PATTERN (change.rtl ());
+  gcc_assert (GET_CODE (clobber) == CLOBBER);
+  rtx dest = XEXP (clobber, 0);
+  if (GET_CODE (dest) == SCRATCH)
+    {
+      if (reload_completed)
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    {
+	      // ??? Maybe we could try to do some RA here?
+	      fprintf (dump_file, "instruction requires a scratch"
+		       " after reload:\n");
+	      print_rtl_single (dump_file, pat);
+	    }
+	  return false;
+	}
+      return true;
+    }
+
+  gcc_assert (REG_P (dest));
+  for (unsigned int regno = REGNO (dest); regno != END_REGNO (dest); ++regno)
+    if (!add_regno_clobber (change, regno))
+      {
+	if (dump_file && (dump_flags & TDF_DETAILS))
+	  {
+	    fprintf (dump_file, "cannot clobber live register %d in:\n",
+		     regno);
+	    print_rtl_single (dump_file, pat);
+	  }
+	return false;
+      }
+  return true;
+}
+
+// Try to recognize the new form of the insn associated with CHANGE,
+// adding any clobbers that are necessary to make the instruction match
+// an .md pattern.  Return true on success.
+//
+// ADD_REGNO_CLOBBER is a specialization of function_info::add_regno_clobber
+// for a specific caller-provided predicate.
+static bool
+recog_level2 (insn_change &change, add_regno_clobber_fn add_regno_clobber)
+{
+  insn_change_watermark insn_watermark;
+  rtx_insn *rtl = change.rtl ();
+  rtx pat = PATTERN (rtl);
+  int num_clobbers = 0;
+  int icode = -1;
+  bool asm_p = asm_noperands (pat) >= 0;
+  if (asm_p)
+    {
+      if (!check_asm_operands (pat))
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    {
+	      fprintf (dump_file, "failed to match this asm instruction:\n");
+	      print_rtl_single (dump_file, pat);
+	    }
+	  return false;
+	}
+    }
+  else if (noop_move_p (rtl))
+    {
+      INSN_CODE (rtl) = NOOP_MOVE_INSN_CODE;
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	{
+	  fprintf (dump_file, "instruction becomes a no-op:\n");
+	  print_rtl_single (dump_file, pat);
+	}
+      insn_watermark.keep ();
+      return true;
+    }
+  else
+    {
+      icode = ::recog (pat, rtl, &num_clobbers);
+      if (icode < 0)
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    {
+	      fprintf (dump_file, "failed to match this instruction:\n");
+	      print_rtl_single (dump_file, pat);
+	    }
+	  return false;
+	}
+    }
+
+  auto prev_new_defs = change.new_defs;
+  auto prev_move_range = change.move_range;
+  if (num_clobbers > 0)
+    {
+      // ??? It would be good to have a way of recycling the rtxes on failure,
+      // but any attempt to cache old PARALLELs would at best be a half
+      // measure, since add_clobbers would still generate fresh clobbers
+      // each time.  It would be better to have a more general recycling
+      // mechanism that all rtx passes can use.
+      rtvec newvec;
+      int oldlen;
+      if (GET_CODE (pat) == PARALLEL)
+	{
+	  oldlen = XVECLEN (pat, 0);
+	  newvec = rtvec_alloc (num_clobbers + oldlen);
+	  for (int i = 0; i < oldlen; ++i)
+	    RTVEC_ELT (newvec, i) = XVECEXP (pat, 0, i);
+	}
+      else
+	{
+	  oldlen = 1;
+	  newvec = rtvec_alloc (num_clobbers + oldlen);
+	  RTVEC_ELT (newvec, 0) = pat;
+	}
+      rtx newpat = gen_rtx_PARALLEL (VOIDmode, newvec);
+      add_clobbers (newpat, icode);
+      validate_change (rtl, &PATTERN (rtl), newpat, true);
+      for (int i = 0; i < num_clobbers; ++i)
+	if (!add_clobber (change, add_regno_clobber,
+			  XVECEXP (newpat, 0, oldlen + i)))
+	  {
+	    change.new_defs = prev_new_defs;
+	    change.move_range = prev_move_range;
+	    return false;
+	  }
+
+      pat = newpat;
+    }
+
+  INSN_CODE (rtl) = icode;
+  if (reload_completed)
+    {
+      extract_insn (rtl);
+      if (!constrain_operands (1, get_preferred_alternatives (rtl)))
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    {
+	      if (asm_p)
+		fprintf (dump_file, "asm does not match its constraints:\n");
+	      else if (const char *name = get_insn_name (icode))
+		fprintf (dump_file, "instruction does not match the"
+			 " constraints for %s:\n", name);
+	      else
+		fprintf (dump_file, "instruction does not match its"
+			 " constraints:\n");
+	      print_rtl_single (dump_file, pat);
+	    }
+	  change.new_defs = prev_new_defs;
+	  change.move_range = prev_move_range;
+	  return false;
+	}
+    }
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      const char *name;
+      if (!asm_p && (name = get_insn_name (icode)))
+	fprintf (dump_file, "successfully matched this instruction "
+		 "to %s:\n", name);
+      else
+	fprintf (dump_file, "successfully matched this instruction:\n");
+      print_rtl_single (dump_file, pat);
+    }
+
+  insn_watermark.keep ();
+  return true;
+}
+
+// Try to recognize the new form of the insn associated with CHANGE,
+// adding and removing clobbers as necessary to make the instruction
+// match an .md pattern.  Return true on success, otherwise leave
+// CHANGE as it was on entry.
+//
+// ADD_REGNO_CLOBBER is a specialization of function_info::add_regno_clobber
+// for a specific caller-provided predicate.
+bool
+rtl_ssa::recog_internal (insn_change &change,
+			 add_regno_clobber_fn add_regno_clobber)
+{
+  // Accept all changes to debug instructions.
+  insn_info *insn = change.insn ();
+  if (insn->is_debug_insn ())
+    return true;
+
+  rtx_insn *rtl = insn->rtl ();
+  rtx pat = PATTERN (rtl);
+  if (GET_CODE (pat) == PARALLEL && asm_noperands (pat) < 0)
+    {
+      // Try to remove trailing (clobber (scratch)) rtxes, since the new form
+      // of the instruction might not need those scratches.  recog will add
+      // back any that are needed.
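+      //
+      // For example, a parallel such as:
+      //
+      //   (parallel [(set ...)
+      //              (clobber (scratch:SI))])
+      //
+      // might, in its new form, match an alternative that no longer
+      // needs the scratch register.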
+      int len = XVECLEN (pat, 0);
+      int new_len = len;
+      while (new_len > 0
+	     && GET_CODE (XVECEXP (pat, 0, new_len - 1)) == CLOBBER
+	     && GET_CODE (XEXP (XVECEXP (pat, 0, new_len - 1), 0)) == SCRATCH)
+	new_len -= 1;
+
+      int old_num_changes = num_validated_changes ();
+      validate_change_xveclen (rtl, &PATTERN (rtl), new_len, true);
+      if (recog_level2 (change, add_regno_clobber))
+	return true;
+      cancel_changes (old_num_changes);
+
+      // Try to remove all trailing clobbers.  For example, a pattern that
+      // used to clobber the flags might no longer need to do so.
+      int prev_len = new_len;
+      while (new_len > 0
+	     && GET_CODE (XVECEXP (pat, 0, new_len - 1)) == CLOBBER)
+	new_len -= 1;
+      if (new_len != prev_len)
+	{
+	  validate_change_xveclen (rtl, &PATTERN (rtl), new_len, true);
+	  if (recog_level2 (change, add_regno_clobber))
+	    return true;
+	  cancel_changes (old_num_changes);
+	}
+      return false;
+    }
+
+  return recog_level2 (change, add_regno_clobber);
+}
+
+// See the comment above the declaration.
+bool
+function_info::perform_pending_updates ()
+{
+  bool changed_cfg = false;
+  bool changed_jumps = false;
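+  // Handle the instructions that possibly_queue_changes queued:
+  // no-op moves, jumps that became unconditional or no-ops, and
+  // conditional traps that now always fire.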
+  for (insn_info *insn : m_queued_insn_updates)
+    {
+      rtx_insn *rtl = insn->rtl ();
+      if (JUMP_P (rtl))
+	{
+	  if (INSN_CODE (rtl) == NOOP_MOVE_INSN_CODE)
+	    {
+	      ::delete_insn (rtl);
+	      bitmap_set_bit (m_need_to_purge_dead_edges,
+			      insn->bb ()->index ());
+	    }
+	  else if (returnjump_p (rtl) || any_uncondjump_p (rtl))
+	    {
+	      mark_jump_label (PATTERN (rtl), rtl, 0);
+	      update_cfg_for_uncondjump (rtl);
+	      changed_cfg = true;
+	      changed_jumps = true;
+	    }
+	}
+      else if (INSN_CODE (rtl) == NOOP_MOVE_INSN_CODE)
+	::delete_insn (rtl);
+      else
+	{
+	  rtx pattern = PATTERN (rtl);
+	  if (GET_CODE (pattern) == TRAP_IF
+	      && XEXP (pattern, 0) == const1_rtx)
+	    {
+	      remove_edge (split_block (BLOCK_FOR_INSN (rtl), rtl));
+	      emit_barrier_after_bb (BLOCK_FOR_INSN (rtl));
+	      changed_cfg = true;
+	    }
+	}
+    }
+
+  unsigned int index;
+  bitmap_iterator bi;
+  EXECUTE_IF_SET_IN_BITMAP (m_need_to_purge_dead_edges, 0, index, bi)
+    if (purge_dead_edges (BASIC_BLOCK_FOR_FN (m_fn, index)))
+      changed_cfg = true;
+
+  if (changed_jumps)
+    // This uses its own timevar internally, so we don't need to push
+    // one ourselves.
+    rebuild_jump_labels (get_insns ());
+
+  bitmap_clear (m_need_to_purge_dead_edges);
+  bitmap_clear (m_queued_insn_update_uids);
+  m_queued_insn_updates.truncate (0);
+
+  if (changed_cfg)
+    {
+      free_dominance_info (CDI_DOMINATORS);
+      free_dominance_info (CDI_POST_DOMINATORS);
+    }
+
+  return changed_cfg;
+}
+
+// Print a description of CHANGE to PP.
+void
+rtl_ssa::pp_insn_change (pretty_printer *pp, const insn_change &change)
+{
+  change.print (pp);
+}
+
+// Print a description of CHANGE to FILE.
+void
+dump (FILE *file, const insn_change &change)
+{
+  dump_using (file, pp_insn_change, change);
+}
+
+// Debug interface to the dump routine above.
+void debug (const insn_change &x) { dump (stderr, x); }
diff --git a/gcc/rtl-ssa/changes.h b/gcc/rtl-ssa/changes.h
new file mode 100644
index 00000000000..308c5edc409
--- /dev/null
+++ b/gcc/rtl-ssa/changes.h
@@ -0,0 +1,118 @@
+// RTL SSA classes related to changing instructions                 -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+namespace rtl_ssa {
+
+// A class that describes a change that we're considering making to an
+// instruction.  There are three choices:
+//
+// (1) delete the instruction
+// (2) replace the instruction with a new instruction in-place
+// (3) replace the instruction with a new instruction at a different location
+//
+// Anything related to the "new instruction" is irrelevant for (1).
+//
+// The class doesn't actually change anything itself; it simply records
+// something that we might do.
+class insn_change
+{
+public:
+  enum delete_action { DELETE };
+
+  // Construct a possible change to INSN.
+  insn_change (insn_info *insn);
+
+  // Construct a possible deletion of INSN.
+  insn_change (insn_info *insn, delete_action);
+
+  // The instruction that we would change.
+  insn_info *insn () const { return m_insn; }
+
+  // The rtx_insn of the instruction that we would change.
+  rtx_insn *rtl () const { return m_insn->rtl (); }
+
+  // The basic block that contains insn ().
+  bb_info *bb () const { return m_insn->bb (); }
+
+  // The extended basic block that contains insn ().
+  ebb_info *ebb () const { return m_insn->ebb (); }
+
+  // The uid of the instruction that we would change.
+  unsigned int insn_uid () const { return m_insn->uid (); }
+
+  // The list of things that the original instruction defined and used.
+  def_array old_defs () const { return m_insn->defs (); }
+  use_array old_uses () const { return m_insn->uses (); }
+
+  // The cost of the original instruction, as calculated by the target.
+  unsigned int old_cost () const { return m_insn->cost (); }
+
+  // Return true if the original instruction would simply be deleted,
+  // rather than being replaced by a new instruction.
+  bool is_deletion () const { return m_is_deletion; }
+
+  // Print a description of the change to PP.
+  void print (pretty_printer *pp) const;
+
+  // Return an insn_change for deleting INSN.
+  static insn_change delete_insn (insn_info *insn) { return { insn, DELETE }; }
+
+private:
+  // The value returned by insn ().
+  insn_info *m_insn;
+
+public:
+  // The list of things that the new instruction would define and use.
+  def_array new_defs;
+  use_array new_uses;
+
+  // The range of instructions after which the instruction could be placed.
+  // The range can include INSN itself: placing the instruction after either
+  // INSN or INSN->prev_nondebug_insn () is equivalent to not moving the
+  // instruction.
+  insn_range_info move_range;
+
+  // The cost that the new instruction would have, as calculated by the target.
+  unsigned int new_cost;
+
+private:
+  // The value returned by is_deletion ().
+  bool m_is_deletion;
+};
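+
+// A sketch of how a pass might use insn_change to replace the pattern
+// of a single instruction, using the entry points declared here and in
+// functions.h.  SSA stands for the pass's function_info, and the elided
+// set-up of the new pattern and access lists is left to the pass; a real
+// pass would also re-recognize the new pattern and tighten move_range
+// before committing:
+//
+//   insn_change change (insn);
+//   change.new_defs = ...;    // accesses computed by the pass
+//   change.new_uses = ...;
+//   change.move_range = ...;  // where the new instruction may be placed
+//   insn_change *changes[] = { &change };
+//   if (ssa->verify_insn_changes (changes)
+//       && changes_are_worthwhile (changes, false))
+//     ssa->change_insns (changes);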
+
+// A class that represents a closure of the two-argument form of
+// insn_is_changing.  See the comment above the one-argument form
+// for details.
+class insn_is_changing_closure
+{
+public:
+  insn_is_changing_closure (array_slice<insn_change *const> changes);
+  bool operator() (const insn_info *) const;
+
+private:
+  array_slice<insn_change *const> m_changes;
+};
+
+void pp_insn_change (pretty_printer *, const insn_change &);
+
+}
+
+void dump (FILE *, const rtl_ssa::insn_change &);
+
+void DEBUG_FUNCTION debug (const rtl_ssa::insn_change &);
diff --git a/gcc/rtl-ssa/functions.cc b/gcc/rtl-ssa/functions.cc
new file mode 100644
index 00000000000..50595ac8ed6
--- /dev/null
+++ b/gcc/rtl-ssa/functions.cc
@@ -0,0 +1,325 @@
+// Implementation of function-related RTL SSA functions             -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#define INCLUDE_ALGORITHM
+#define INCLUDE_FUNCTIONAL
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "df.h"
+#include "rtl-ssa.h"
+#include "rtl-ssa/internals.inl"
+
+using namespace rtl_ssa;
+
+function_info::function_info (function *fn)
+  : m_fn (fn)
+{
+  // Force the alignment to be obstack_alignment.  Everything else is normal.
+  obstack_specify_allocation (&m_obstack, OBSTACK_CHUNK_SIZE,
+			      obstack_alignment, obstack_chunk_alloc,
+			      obstack_chunk_free);
+  obstack_specify_allocation (&m_temp_obstack, OBSTACK_CHUNK_SIZE,
+			      obstack_alignment, obstack_chunk_alloc,
+			      obstack_chunk_free);
+
+  // Record the start of the obstacks.
+  m_obstack_start = XOBNEWVAR (&m_obstack, char, 0);
+  m_temp_obstack_start = XOBNEWVAR (&m_temp_obstack, char, 0);
+
+  init_function_data ();
+  process_all_blocks ();
+  simplify_phis ();
+}
+
+function_info::~function_info ()
+{
+  // Anything using the temporary obstack should free it afterwards,
+  // preferably via temp_watermark ().
+  gcc_assert (XOBNEWVAR (&m_temp_obstack, char, 0) == m_temp_obstack_start);
+
+  obstack_free (&m_temp_obstack, nullptr);
+  obstack_free (&m_obstack, nullptr);
+}
+
+// See the comment above the declaration.
+void
+function_info::print (pretty_printer *pp) const
+{
+  pp_string (pp, "Function: ");
+  pp_string (pp, function_name (m_fn));
+  for (ebb_info *ebb : ebbs ())
+    {
+      pp_newline (pp);
+      pp_newline_and_indent (pp, 0);
+      pp_ebb (pp, ebb);
+    }
+}
+
+// Calculate m_potential_phi_regs.
+void
+function_info::calculate_potential_phi_regs ()
+{
+  auto *lr_info = DF_LR_BB_INFO (ENTRY_BLOCK_PTR_FOR_FN (m_fn));
+  for (unsigned int regno = 0; regno < m_num_regs; ++regno)
+    if (regno >= DF_REG_SIZE (DF)
+	// Exclude registers that have a single definition that dominates
+	// all uses.  If the definition does not dominate all uses,
+	// the register will be exposed upwards to the entry block but
+	// will not be defined by the entry block.
+	|| DF_REG_DEF_COUNT (regno) > 1
+	|| (!bitmap_bit_p (&lr_info->def, regno)
+	    && bitmap_bit_p (&lr_info->out, regno)))
+      bitmap_set_bit (m_potential_phi_regs, regno);
+}
+
+// Initialize all member variables in preparation for (re)building
+// SSA form from scratch.
+void
+function_info::init_function_data ()
+{
+  m_next_artificial_uid = -1;
+  m_next_phi_uid = 0;
+  m_num_regs = max_reg_num ();
+  m_defs.safe_grow_cleared (m_num_regs + 1);
+  m_bbs.safe_grow_cleared (last_basic_block_for_fn (m_fn));
+  m_first_bb = nullptr;
+  m_last_bb = nullptr;
+  m_first_insn = nullptr;
+  m_last_insn = nullptr;
+  m_last_nondebug_insn = nullptr;
+  m_free_phis = nullptr;
+
+  calculate_potential_phi_regs ();
+}
+
+// The initial phase of the phi simplification process.  Its cumulative
+// effect is to set up ASSUMED_VALUES such that,
+// for a phi P with uid ID:
+//
+// - if we think all inputs to P have the same value, ASSUMED_VALUES[ID]
+//   is that value
+//
+// - otherwise, ASSUMED_VALUES[ID] is P.
+//
+// This has already been done for phis with a lower uid than PHI,
+// initially making optimistic assumptions about backedge inputs.
+// Now do the same for PHI.  If this might invalidate any assumptions
+// made for earlier phis, add the uids of those phis to WORKLIST.
+void
+function_info::simplify_phi_setup (phi_info *phi, set_info **assumed_values,
+				   bitmap worklist)
+{
+  // If all non-backedge inputs have the same value, set NEW_VALUE
+  // to that value.  Otherwise set NEW_VALUE to PHI, to indicate
+  // that PHI cannot be simplified.
+  unsigned int phi_uid = phi->uid ();
+  bool is_first_input = true;
+  set_info *new_value = nullptr;
+  machine_mode phi_mode = phi->mode ();
+  for (use_info *input : phi->inputs ())
+    {
+      set_info *def = input->def ();
+
+      if (auto *input_phi = safe_dyn_cast<phi_info *> (def))
+	{
+	  // Ignore backedges for now.
+	  unsigned int input_phi_uid = input_phi->uid ();
+	  if (phi_uid <= input_phi_uid)
+	    continue;
+
+	  def = assumed_values[input_phi_uid];
+	}
+
+      // Compare this definition with previous ones.
+      if (is_first_input)
+	{
+	  new_value = def;
+	  is_first_input = false;
+	}
+      else if (new_value != def)
+	new_value = phi;
+
+      // If the input has a known mode (i.e. not BLKmode), make sure
+      // that the phi's mode is at least as large.
+      if (def)
+	phi_mode = combine_modes (phi_mode, def->mode ());
+    }
+  if (phi->mode () != phi_mode)
+    phi->set_mode (phi_mode);
+
+  // Since we use a reverse postorder traversal, no phi can consist
+  // entirely of backedges.
+  gcc_checking_assert (!is_first_input);
+  assumed_values[phi_uid] = new_value;
+
+  // See whether any assumptions for earlier phis are now invalid.
+  simplify_phi_propagate (phi, assumed_values, nullptr, worklist);
+}
+
+// The propagation phase of the phi simplification process, with
+// ASSUMED_VALUES as described above simplify_phi_setup.  Iteratively
+// update the phis that use PHI based on PHI's entry in ASSUMED_VALUES.
+// If CURR_WORKLIST is null, consider only phi uses with a lower uid
+// than PHI, otherwise consider all phi uses.
+//
+// If a phi with a higher uid than PHI needs updating, add its uid to
+// CURR_WORKLIST; if a phi with a lower uid than PHI needs updating,
+// add its uid to NEXT_WORKLIST.
+void
+function_info::simplify_phi_propagate (phi_info *phi,
+				       set_info **assumed_values,
+				       bitmap curr_worklist,
+				       bitmap next_worklist)
+{
+  // Go through each phi user of PHI to see whether it needs updating.
+  unsigned int phi_uid = phi->uid ();
+  machine_mode phi_mode = phi->mode ();
+  set_info *phi_value = assumed_values[phi_uid];
+  for (use_info *use : phi->phi_uses ())
+    {
+      phi_info *user_phi = use->phi ();
+
+      // Propagate the phi's new mode to all phi users.  Insn uses should
+      // not be updated, since their modes reflect a property of the insns
+      // rather than the phi.
+      if (use->mode () != phi_mode)
+	use->set_mode (phi_mode);
+
+      if (user_phi == phi)
+	continue;
+
+      // If this is a phi we should be looking at, see whether it needs
+      // an update.
+      unsigned int user_phi_uid = user_phi->uid ();
+      if (user_phi_uid < phi_uid || curr_worklist)
+	{
+	  bool needs_update = false;
+
+	  // Make sure that USER_PHI's mode is at least as big as PHI_MODE.
+	  machine_mode user_phi_mode = user_phi->mode ();
+	  machine_mode new_mode = combine_modes (user_phi_mode, phi_mode);
+	  if (user_phi_mode != new_mode)
+	    {
+	      user_phi->set_mode (new_mode);
+	      needs_update = true;
+	    }
+
+	  // If USER_PHI optimistically assumed an incorrect value,
+	  // adjust it now.
+	  if (assumed_values[user_phi_uid] != user_phi
+	      && assumed_values[user_phi_uid] != phi_value)
+	    {
+	      assumed_values[user_phi_uid] = user_phi;
+	      needs_update = true;
+	    }
+
+	  if (needs_update)
+	    {
+	      if (user_phi_uid < phi_uid)
+		bitmap_set_bit (next_worklist, user_phi_uid);
+	      else
+		bitmap_set_bit (curr_worklist, user_phi_uid);
+	    }
+	}
+    }
+}
+
+// Update the modes of all phis so that they are at least as big as
+// all inputs.  Remove any non-degenerate phis whose inputs are all equal.
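+// For example, if every input to a phi P for register R turns out to
+// provide the same value D, uses of P can use D directly and P itself
+// becomes redundant.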
+void
+function_info::simplify_phis ()
+{
+  auto temps = temp_watermark ();
+
+  // See the comment above simplify_phi_setup for details about this array.
+  auto *assumed_values = XOBNEWVEC (&m_temp_obstack, set_info *,
+				    m_next_phi_uid);
+
+  // An array of all phis, indexed by uid.
+  auto *phis = XOBNEWVEC (&m_temp_obstack, phi_info *, m_next_phi_uid);
+
+  // Which phi uids are actually in use.
+  auto_sbitmap valid_phi_uids (m_next_phi_uid);
+  bitmap_clear (valid_phi_uids);
+
+  // Bitmaps used for the main double-queue propagation phase.
+  auto_bitmap worklist1;
+  auto_bitmap worklist2;
+  bitmap curr_worklist = worklist1;
+  bitmap next_worklist = worklist2;
+
+  // Perform the set-up phase; see simplify_phi_setup for details.
+  for (ebb_info *ebb : ebbs ())
+    for (phi_info *phi : ebb->phis ())
+      {
+	bitmap_set_bit (valid_phi_uids, phi->uid ());
+	phis[phi->uid ()] = phi;
+	simplify_phi_setup (phi, assumed_values, curr_worklist);
+      }
+
+  // Iteratively process any phis that need updating; see
+  // simplify_phi_propagate for details.  Using a double queue
+  // should reduce the number of times that any given phi node
+  // needs to be revisited.
+  while (!bitmap_empty_p (curr_worklist))
+    {
+      do
+	{
+	  unsigned int uid = bitmap_first_set_bit (curr_worklist);
+	  bitmap_clear_bit (curr_worklist, uid);
+	  simplify_phi_propagate (phis[uid], assumed_values,
+				  curr_worklist, next_worklist);
+	}
+      while (!bitmap_empty_p (curr_worklist));
+      std::swap (next_worklist, curr_worklist);
+    }
+
+  // Make sure that assumed_values is a transitive closure.  This ensures
+  // that each use_info is only updated once.
+  if (flag_checking)
+    for (unsigned int i = 0; i < m_next_phi_uid; ++i)
+      if (bitmap_bit_p (valid_phi_uids, i))
+	if (auto *new_phi = safe_dyn_cast<phi_info *> (assumed_values[i]))
+	  gcc_assert (assumed_values[new_phi->uid ()] == new_phi);
+
+  // Update any phis that turned out to be equivalent to a single input.
+  for (unsigned int i = 0; i < m_next_phi_uid; ++i)
+    if (bitmap_bit_p (valid_phi_uids, i) && phis[i] != assumed_values[i])
+      replace_phi (phis[i], assumed_values[i]);
+}
+
+// Print a description of FUNCTION to PP.
+void
+rtl_ssa::pp_function (pretty_printer *pp, const function_info *function)
+{
+  function->print (pp);
+}
+
+// Print a description of FUNCTION to FILE.
+void
+dump (FILE *file, const function_info *function)
+{
+  dump_using (file, pp_function, function);
+}
+
+// Debug interface to the dump routine above.
+void debug (const function_info *x) { dump (stderr, x); }
diff --git a/gcc/rtl-ssa/functions.h b/gcc/rtl-ssa/functions.h
new file mode 100644
index 00000000000..b09d50e86b0
--- /dev/null
+++ b/gcc/rtl-ssa/functions.h
@@ -0,0 +1,433 @@
+// Function-related RTL SSA classes                                 -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+namespace rtl_ssa {
+
+// SSA-related information about a function.  It contains three levels
+// of information, each in reverse postorder:
+//
+// - a list of extended basic blocks
+// - a list of basic blocks
+// - a list of instructions
+//
+// It also maintains a list of definitions of memory, and a list of
+// definitions of each register.
+//
+// See doc/rtl.texi for more details about the way this information
+// is organized and how changes to it are made.
+class function_info
+{
+  // The default obstack alignment takes long double into account.
+  // Since we have no use for that here, and since we allocate many
+  // relatively small objects, it's better to specify an alignment
+  // explicitly.  The allocation routines assert that the alignment
+  // is enough for the objects being allocated.
+  //
+  // Because various structures use pointer_mux, we need at least 2 bytes
+  // of alignment.
+  static const size_t obstack_alignment = sizeof (void *);
+
+public:
+  // Construct SSA form for function FN.
+  function_info (function *fn);
+  ~function_info ();
+
+  // Return a list of all the extended basic blocks in the function, in reverse
+  // postorder.  The list includes the entry and exit blocks.
+  iterator_range<ebb_iterator> ebbs () const;
+
+  // Like ebbs (), but in the reverse order.
+  iterator_range<reverse_ebb_iterator> reverse_ebbs () const;
+
+  // Return a list of all the basic blocks in the function, in reverse
+  // postorder.  The list includes the entry and exit blocks.
+  iterator_range<bb_iterator> bbs () const;
+
+  // Like bbs (), but in the reverse order.
+  iterator_range<reverse_bb_iterator> reverse_bbs () const;
+
+  // Return the SSA information for the basic block with index INDEX.
+  bb_info *bb (unsigned int index) const { return m_bbs[index]; }
+
+  // Return the SSA information for CFG_BB.
+  bb_info *bb (basic_block cfg_bb) const { return m_bbs[cfg_bb->index]; }
+
+  // Return a list of all the instructions in the function, in reverse
+  // postorder.  The list includes both real and artificial instructions.
+  //
+  // Iterations over the list will pick up any new instructions that are
+  // inserted after the iterator's current instruction.
+  iterator_range<any_insn_iterator> all_insns () const;
+
+  // Like all_insns (), but in the reverse order.
+  //
+  // Iterations over the list will pick up any new instructions that are
+  // inserted before the iterator's current instruction.
+  iterator_range<reverse_any_insn_iterator> reverse_all_insns () const;
+
+  // Like all_insns (), but without the debug instructions.
+  iterator_range<nondebug_insn_iterator> nondebug_insns () const;
+
+  // Like reverse_all_insns (), but without the debug instructions.
+  iterator_range<reverse_nondebug_insn_iterator>
+    reverse_nondebug_insns () const;
+
+  // Return the first and last instructions in insns ().
+  insn_info *first_insn () const { return m_first_insn; }
+  insn_info *last_insn () const { return m_last_insn; }
+
+  // Return a list of all definitions of memory, in reverse postorder.
+  // This includes both real stores by instructions and artificial
+  // definitions by things like phi nodes.
+  iterator_range<def_iterator> mem_defs () const;
+
+  // Return a list of all definitions of register REGNO, in reverse postorder.
+  // This includes both real stores by instructions and artificial
+  // definitions by things like phi nodes.
+  iterator_range<def_iterator> ref_defs (unsigned int regno) const;
+
+  // Check if all uses of register REGNO are either unconditionally undefined
+  // or use the same single dominating definition.  Return the definition
+  // if so, otherwise return null.
+  set_info *single_dominating_def (unsigned int regno) const;
+
+  // Look for a definition of RESOURCE at INSN.  Return the result of the
+  // search as a def_lookup; see the comments there for more details.
+  def_lookup find_def (resource_info resource, insn_info *insn);
+
+  // Return an RAII object that owns all temporary RTL SSA memory
+  // allocated during a change attempt.  The object should remain in
+  // scope until the change has been aborted or successfully completed.
+  obstack_watermark new_change_attempt () { return &m_temp_obstack; }
+
+  // Make a best attempt to check whether the values used by USES are
+  // available on entry to BB, without solving a full dataflow problem.
+  // If all the values are already live on entry to BB or can be made
+  // available there, return a use_array that describes the uses as
+  // if they occurred at the start of BB.  These uses are purely temporary,
+  // and will not become permanent unless applied using change_insns.
+  //
+  // If the operation fails, return an invalid use_array.
+  //
+  // WATERMARK is a watermark returned by new_change_attempt ().
+  use_array make_uses_available (obstack_watermark &watermark,
+				 use_array uses, bb_info *bb);
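+
+  // For example, a pass that wants to move instruction I to the start
+  // of block BB can pass I->uses () here and, if the result is valid,
+  // install it as the new_uses of an insn_change for I.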
+
+  // If CHANGE doesn't already clobber REGNO, try to add such a clobber,
+  // limiting the movement range in order to make the clobber valid.
+  // When determining whether REGNO is live, ignore accesses made by an
+  // instruction I if IGNORE (I) is true.  The caller then assumes the
+  // responsibility of ensuring that CHANGE and I are placed in a valid order.
+  //
+  // Return true on success.  Leave CHANGE unmodified when returning false.
+  //
+  // WATERMARK is a watermark returned by new_change_attempt ().
+  template<typename IgnorePredicate>
+  bool add_regno_clobber (obstack_watermark &watermark, insn_change &change,
+			  unsigned int regno, IgnorePredicate ignore);
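+
+  // For example, a caller that is changing only CHANGE's instruction
+  // might ignore that instruction itself when testing liveness
+  // (REGNO_TO_CLOBBER is a stand-in for whatever register the caller
+  // wants to clobber):
+  //
+  //   auto watermark = ssa->new_change_attempt ();
+  //   if (!ssa->add_regno_clobber (watermark, change, REGNO_TO_CLOBBER,
+  //                                insn_is (change.insn ())))
+  //     return false;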
+
+  // Return true if change_insns will be able to perform the changes
+  // described by CHANGES.
+  bool verify_insn_changes (array_slice<insn_change *const> changes);
+
+  // Perform all the changes in CHANGES, keeping the instructions in the
+  // order specified by the CHANGES array.  On return, the SSA information
+  // remains up-to-date.  The same is true for instruction-level DF
+  // information, although the block-level DF information might be
+  // marked dirty.
+  void change_insns (array_slice<insn_change *> changes);
+
+  // Like change_insns, but for a single change CHANGE.
+  void change_insn (insn_change &change);
+
+  // If the changes that have been made to instructions require updates
+  // to the CFG, perform those updates now.  Return true if something changed.
+  // If it did:
+  //
+  // - The SSA information is now invalid and needs to be recomputed.
+  //
+  // - Dominance information is no longer available (in either direction).
+  //
+  // - The caller will need to call cleanup_cfg at some point.
+  //
+  // ??? We could probably update the SSA information for simple updates,
+  // but currently nothing would benefit.  These late CFG changes are
+  // relatively rare anyway, since gimple optimisers should remove most
+  // unnecessary control flow.
+  bool perform_pending_updates ();
+
+  // Print the contents of the function to PP.
+  void print (pretty_printer *pp) const;
+
+private:
+  // Information about the values that are live on exit from a basic block.
+  // This class is only used when constructing the SSA form, it isn't
+  // designed for being kept up-to-date.
+  class bb_live_out_info
+  {
+  public:
+    // REG_VALUES contains all the registers that live out from the block,
+    // in order of increasing register number.  There are NUM_REG_VALUES
+    // in total.  Registers do not appear here if their values are known
+    // to be completely undefined; in that sense, the information is
+    // closer to DF_LIVE than to DF_LR.
+    unsigned int num_reg_values;
+    set_info **reg_values;
+
+    // The memory value that is live on exit from the block.
+    set_info *mem_value;
+  };
+
+  // Information used while constructing the SSA form and discarded
+  // afterwards.
+  class build_info
+  {
+  public:
+    set_info *current_reg_value (unsigned int) const;
+    set_info *current_mem_value () const;
+
+    void record_reg_def (unsigned int, def_info *);
+    void record_mem_def (def_info *);
+
+    // The block that we're currently processing.
+    bb_info *current_bb;
+
+    // The EBB that contains CURRENT_BB.
+    ebb_info *current_ebb;
+
+    // Except for the local exception noted below:
+    //
+    // - If register R has been defined in the current EBB, LAST_ACCESS[R + 1]
+    //   is the last definition of R in the EBB.
+    //
+    // - If register R is currently live but has not yet been defined
+    //   in the EBB, LAST_ACCESS[R + 1] is the current value of R,
+    //   or null if the register's value is completely undefined.
+    //
+    // - The contents are not meaningful for other registers.
+    //
+    // Similarly:
+    //
+    // - If the current EBB has defined memory, LAST_ACCESS[0] is the last
+    //   definition of memory in the EBB.
+    //
+    // - Otherwise LAST_ACCESS[0] is the value of memory that is live on
+    //   entry to the EBB.
+    //
+    // The exception is that while building instructions, LAST_ACCESS[I]
+    // can temporarily be the use of regno I - 1 by that instruction.
+    access_info **last_access;
+
+    // A bitmap of registers that are live on entry to this EBB, with a tree
+    // view for quick lookup.  Only used if MAY_HAVE_DEBUG_INSNS.
+    bitmap ebb_live_in_for_debug;
+
+    // A conservative superset of the registers that are used by
+    // instructions in CURRENT_EBB.  That is, all used registers
+    // are in the set, but some unused registers might be too.
+    bitmap ebb_use;
+
+    // A similarly conservative superset of the registers that are defined
+    // by instructions in CURRENT_EBB.
+    bitmap ebb_def;
+
+    // BB_LIVE_OUT[BI] gives the live-out values for the basic block
+    // with index BI.
+    bb_live_out_info *bb_live_out;
+  };
+
+  // Return an RAII object that owns all objects allocated by
+  // allocate_temp during its lifetime.
+  obstack_watermark temp_watermark () { return &m_temp_obstack; }
+
+  template<typename T, typename... Ts>
+  T *allocate (Ts... args);
+
+  template<typename T, typename... Ts>
+  T *allocate_temp (Ts... args);
+
+  access_array temp_access_array (access_array accesses);
+
+  clobber_group *need_clobber_group (clobber_info *);
+  def_node *need_def_node (def_info *);
+  def_splay_tree need_def_splay_tree (def_info *);
+
+  use_info *make_use_available (use_info *, bb_info *);
+  def_array insert_temp_clobber (obstack_watermark &, insn_info *,
+				 unsigned int, def_array);
+
+  void insert_def_before (def_info *, def_info *);
+  void insert_def_after (def_info *, def_info *);
+  void remove_def_from_list (def_info *);
+
+  void add_clobber (clobber_info *, clobber_group *);
+  void remove_clobber (clobber_info *, clobber_group *);
+  void prepend_clobber_to_group (clobber_info *, clobber_group *);
+  void append_clobber_to_group (clobber_info *, clobber_group *);
+  void merge_clobber_groups (clobber_info *, clobber_info *,
+			     def_info *);
+  clobber_info *split_clobber_group (clobber_group *, insn_info *);
+
+  void append_def (def_info *);
+  void add_def (def_info *);
+  void remove_def (def_info *);
+
+  void need_use_splay_tree (set_info *);
+
+  static void insert_use_before (use_info *, use_info *);
+  static void insert_use_after (use_info *, use_info *);
+
+  void add_use (use_info *);
+  void remove_use (use_info *);
+
+  insn_info::order_node *need_order_node (insn_info *);
+
+  void add_insn_after (insn_info *, insn_info *);
+  void append_insn (insn_info *);
+  void remove_insn (insn_info *);
+
+  insn_info *append_artificial_insn (bb_info *, rtx_insn * = nullptr);
+
+  void start_insn_accesses ();
+  void finish_insn_accesses (insn_info *);
+
+  void record_use (build_info &, insn_info *, rtx_obj_reference);
+  void record_call_clobbers (build_info &, insn_info *, rtx_call_insn *);
+  void record_def (build_info &, insn_info *, rtx_obj_reference);
+  void add_insn_to_block (build_info &, rtx_insn *);
+
+  void add_reg_unused_notes (insn_info *);
+
+  void add_live_out_use (bb_info *, set_info *);
+  set_info *live_out_value (bb_info *, set_info *);
+
+  void append_phi (ebb_info *, phi_info *);
+  void remove_phi (phi_info *);
+  void delete_phi (phi_info *);
+  void replace_phi (phi_info *, set_info *);
+  phi_info *create_phi (ebb_info *, resource_info, access_info **,
+			unsigned int);
+  phi_info *create_degenerate_phi (ebb_info *, set_info *);
+
+  bb_info *create_bb_info (basic_block);
+  void append_bb (bb_info *);
+  void calculate_potential_phi_regs ();
+
+  insn_info *add_placeholder_after (insn_info *);
+  void possibly_queue_changes (insn_change &);
+  void finalize_new_accesses (insn_change &);
+  void apply_changes_to_insn (insn_change &);
+
+  void init_function_data ();
+  void add_entry_block_defs (build_info &);
+  void add_phi_nodes (build_info &);
+  void add_artificial_accesses (build_info &, df_ref_flags);
+  void add_block_contents (build_info &);
+  void record_block_live_out (build_info &);
+  void populate_backedge_phis (build_info &);
+  void process_all_blocks ();
+
+  void simplify_phi_setup (phi_info *, set_info **, bitmap);
+  void simplify_phi_propagate (phi_info *, set_info **, bitmap, bitmap);
+  void simplify_phis ();
+
+  // The function that this object describes.
+  function *m_fn;
+
+  // The lowest (negative) in-use artificial insn uid minus one.
+  int m_next_artificial_uid;
+
+  // The highest in-use phi uid plus one.
+  unsigned int m_next_phi_uid;
+
+  // The highest in-use register number plus one.
+  unsigned int m_num_regs;
+
+  // M_DEFS[R + 1] is the first definition of register R in a reverse
+  // postorder traversal of the function, or null if the function has
+  // no definition of R.  Applying last () gives the last definition of R.
+  //
+  // M_DEFS[0] is for memory; MEM_REGNO + 1 == 0.
+  auto_vec<def_info *> m_defs;
+
+  // M_BBS[BI] gives the SSA information about the block with index BI.
+  auto_vec<bb_info *> m_bbs;
+
+  // An obstack used to allocate the main RTL SSA information.
+  obstack m_obstack;
+
+  // An obstack used for temporary work, such as while building up a list
+  // of possible instruction changes.
+  obstack m_temp_obstack;
+
+  // The start of each obstack, so that all memory in them can be freed.
+  char *m_obstack_start;
+  char *m_temp_obstack_start;
+
+  // The entry and exit blocks.
+  bb_info *m_first_bb;
+  bb_info *m_last_bb;
+
+  // The first and last instructions in a reverse postorder traversal
+  // of the function.
+  insn_info *m_first_insn;
+  insn_info *m_last_insn;
+
+  // The last nondebug instruction in the list of instructions.
+  // This is only different from m_last_insn when building the initial
+  // SSA information; after that, the last instruction is always a
+  // BB end instruction.
+  insn_info *m_last_nondebug_insn;
+
+  // Temporary working state when building up lists of definitions and uses.
+  // Keeping them around should reduce the number of unnecessary reallocations.
+  auto_vec<access_info *> m_temp_defs;
+  auto_vec<access_info *> m_temp_uses;
+
+  // The set of registers that might need to have phis associated with them.
+  // Registers outside this set are known to have a single definition that
+  // dominates all uses.
+  //
+  // Before RA, about 5% of registers are typically in the set.
+  auto_bitmap m_potential_phi_regs;
+
+  // A list of phis that are no longer in use.  Their uids are still unique
+  // and so can be recycled.
+  phi_info *m_free_phis;
+
+  // A list of instructions that have been changed in ways that need
+  // further processing later, such as removing dead instructions or
+  // altering the CFG.
+  auto_vec<insn_info *> m_queued_insn_updates;
+
+  // The INSN_UIDs of all instructions in M_QUEUED_INSN_UPDATES.
+  auto_bitmap m_queued_insn_update_uids;
+
+  // A basic_block is in this bitmap if we need to call purge_dead_edges
+  // on it.  As with M_QUEUED_INSN_UPDATES, these updates are queued until
+  // a convenient point.
+  auto_bitmap m_need_to_purge_dead_edges;
+};
+
+void pp_function (pretty_printer *, const function_info *);
+}
+
+void dump (FILE *, const rtl_ssa::function_info *);
+
+void DEBUG_FUNCTION debug (const rtl_ssa::function_info *);
diff --git a/gcc/rtl-ssa/insn-utils.h b/gcc/rtl-ssa/insn-utils.h
new file mode 100644
index 00000000000..d7705e96ac8
--- /dev/null
+++ b/gcc/rtl-ssa/insn-utils.h
@@ -0,0 +1,46 @@
+// Instruction-related utilities for RTL SSA                        -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+namespace rtl_ssa {
+
+// Return whichever of INSN1 and INSN2 occurs earlier in the function's
+// reverse postorder.
+inline insn_info *
+earlier_insn (insn_info *insn1, insn_info *insn2)
+{
+  return *insn1 < *insn2 ? insn1 : insn2;
+}
+
+// Return whichever of INSN1 and INSN2 occurs later in the function's
+// reverse postorder.
+inline insn_info *
+later_insn (insn_info *insn1, insn_info *insn2)
+{
+  return *insn1 < *insn2 ? insn2 : insn1;
+}
+
+// Return a closure of operator== for INSN.  See insn_is_changing for
+// the rationale for defining the function this way.
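+//
+// For example, a caller searching a list INSNS of insn_info pointers
+// for INSN could write something like:
+//
+//   std::find_if (insns.begin (), insns.end (), insn_is (insn))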
+inline insn_is_closure
+insn_is (const insn_info *insn)
+{
+  return insn_is_closure (insn);
+}
+
+}
diff --git a/gcc/rtl-ssa/insns.cc b/gcc/rtl-ssa/insns.cc
new file mode 100644
index 00000000000..e8300e036f1
--- /dev/null
+++ b/gcc/rtl-ssa/insns.cc
@@ -0,0 +1,718 @@
+// Implementation of instruction-related RTL SSA functions          -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#define INCLUDE_ALGORITHM
+#define INCLUDE_FUNCTIONAL
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "df.h"
+#include "rtl-ssa.h"
+#include "rtl-ssa/internals.inl"
+#include "predict.h"
+#include "print-rtl.h"
+#include "rtl-iter.h"
+
+using namespace rtl_ssa;
+
+// The gap to leave between program points when building up the list
+// of instructions for the first time.  Using 2 allows an instruction
+// to be inserted between two others without resorting to splay tree
+// ordering.  Using 0 is useful as a debugging aid to stress the
+// splay tree code.
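+//
+// For example, with a gap of 2, a single instruction inserted between
+// two adjacent instructions can take the unused midpoint; only a
+// second insertion between the same two neighbors forces instructions
+// to share a point and fall back on the splay tree.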
+static const unsigned int POINT_INCREASE = 2;
+
+// Calculate and record the cost of the instruction, based on the
+// form it had before any in-progress changes were made.
+void
+insn_info::calculate_cost () const
+{
+  basic_block cfg_bb = BLOCK_FOR_INSN (m_rtl);
+  temporarily_undo_changes (0);
+  m_cost_or_uid = insn_cost (m_rtl, optimize_bb_for_speed_p (cfg_bb));
+  redo_changes (0);
+}
+
+// Add NOTE to the instruction's notes.
+void
+insn_info::add_note (insn_note *note)
+{
+  insn_note **ptr = &m_first_note;
+  // Always put the order node first, since it's the one that's likely
+  // to be used most often.
+  if (*ptr && (*ptr)->kind () == insn_note_kind::ORDER_NODE)
+    ptr = &(*ptr)->m_next_note;
+  note->m_next_note = *ptr;
+  *ptr = note;
+}
+
+// Implement compare_with for the case in which this insn and OTHER
+// have the same program point.
+int
+insn_info::slow_compare_with (const insn_info &other) const
+{
+  return order_splay_tree::compare_nodes (get_known_order_node (),
+					  other.get_known_order_node ());
+}
+
+// Print insn uid UID to PP, where UID has the same form as insn_info::uid.
+void
+insn_info::print_uid (pretty_printer *pp, int uid)
+{
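+  // Space for the "a" or "i" prefix, up to 3 * sizeof (uid) decimal
+  // digits, and the terminating null.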
+  char tmp[3 * sizeof (uid) + 2];
+  if (uid < 0)
+    // An artificial instruction.
+    snprintf (tmp, sizeof (tmp), "a%d", -uid);
+  else
+    // A real RTL instruction.
+    snprintf (tmp, sizeof (tmp), "i%d", uid);
+  pp_string (pp, tmp);
+}
+
+// See comment above declaration.
+void
+insn_info::print_identifier (pretty_printer *pp) const
+{
+  print_uid (pp, uid ());
+}
+
+// See comment above declaration.
+void
+insn_info::print_location (pretty_printer *pp) const
+{
+  if (bb_info *bb = this->bb ())
+    {
+      ebb_info *ebb = bb->ebb ();
+      if (ebb && is_phi ())
+	ebb->print_identifier (pp);
+      else
+	bb->print_identifier (pp);
+      pp_string (pp, " at point ");
+      pp_decimal_int (pp, m_point);
+    }
+  else
+    pp_string (pp, "<unknown location>");
+}
+
+// See comment above declaration.
+void
+insn_info::print_identifier_and_location (pretty_printer *pp) const
+{
+  if (m_is_asm)
+    pp_string (pp, "asm ");
+  if (m_is_debug_insn)
+    pp_string (pp, "debug ");
+  pp_string (pp, "insn ");
+  print_identifier (pp);
+  pp_string (pp, " in ");
+  print_location (pp);
+}
+
+// See comment above declaration.
+void
+insn_info::print_full (pretty_printer *pp) const
+{
+  print_identifier_and_location (pp);
+  pp_colon (pp);
+  if (is_real ())
+    {
+      pp_newline_and_indent (pp, 2);
+      if (has_been_deleted ())
+	pp_string (pp, "deleted");
+      else
+	{
+	  // Print the insn pattern to a temporary printer.
+	  pretty_printer sub_pp;
+	  print_insn_with_notes (&sub_pp, rtl ());
+	  const char *text = pp_formatted_text (&sub_pp);
+
+	  // Calculate the length of the maximum line in the pattern.
+	  unsigned int max_len = 0;
+	  const char *start = text;
+	  while (const char *end = strchr (start, '\n'))
+	    {
+	      max_len = MAX (max_len, end - start);
+	      start = end + 1;
+	    }
+
+	  // Print a separator before or after the pattern.
+	  auto print_top_bottom = [&]()
+	    {
+	      pp_character (pp, '+');
+	      for (unsigned int i = 0; i < max_len + 2; ++i)
+		pp_character (pp, '-');
+	    };
+
+	  print_top_bottom ();
+	  start = text;
+	  while (const char *end = strchr (start, '\n'))
+	    {
+	      pp_newline_and_indent (pp, 0);
+	      pp_character (pp, '|');
+	      // Each line of the pattern already starts with a space,
+	      // so we don't need to add another one here.
+	      pp_append_text (pp, start, end);
+	      start = end + 1;
+	    }
+	  pp_newline_and_indent (pp, 0);
+	  print_top_bottom ();
+
+	  if (m_cost_or_uid != UNKNOWN_COST)
+	    {
+	      pp_newline_and_indent (pp, 0);
+	      pp_string (pp, "cost: ");
+	      pp_decimal_int (pp, m_cost_or_uid);
+	    }
+	  if (m_has_pre_post_modify)
+	    {
+	      pp_newline_and_indent (pp, 0);
+	      pp_string (pp, "has pre/post-modify operations");
+	    }
+	  if (m_has_volatile_refs)
+	    {
+	      pp_newline_and_indent (pp, 0);
+	      pp_string (pp, "has volatile refs");
+	    }
+	}
+      pp_indentation (pp) -= 2;
+    }
+
+  auto print_accesses = [&](const char *heading, access_array accesses,
+			    unsigned int flags)
+    {
+      if (!accesses.empty ())
+	{
+	  pp_newline_and_indent (pp, 2);
+	  pp_string (pp, heading);
+	  pp_newline_and_indent (pp, 2);
+	  pp_accesses (pp, accesses, flags);
+	  pp_indentation (pp) -= 4;
+	}
+    };
+
+  print_accesses ("uses:", uses (), PP_ACCESS_USER);
+  auto *call_clobbers_note = find_note<insn_call_clobbers_note> ();
+  if (call_clobbers_note)
+    {
+      pp_newline_and_indent (pp, 2);
+      pp_string (pp, "has call clobbers for ABI ");
+      pp_decimal_int (pp, call_clobbers_note->abi_id ());
+      pp_indentation (pp) -= 2;
+    }
+  print_accesses ("defines:", defs (), PP_ACCESS_SETTER);
+  if (num_uses () == 0 && !call_clobbers_note && num_defs () == 0)
+    {
+      pp_newline_and_indent (pp, 2);
+      pp_string (pp, "has no uses or defs");
+      pp_indentation (pp) -= 2;
+    }
+
+  if (order_node *node = get_order_node ())
+    {
+      while (node->m_parent)
+	node = node->m_parent;
+
+      pp_newline_and_indent (pp, 2);
+      pp_string (pp, "insn order: ");
+      pp_newline_and_indent (pp, 2);
+      auto print_order = [](pretty_printer *pp, order_node *node)
+	{
+	  print_uid (pp, node->uid ());
+	};
+      order_splay_tree::print (pp, node, print_order);
+      pp_indentation (pp) -= 4;
+    }
+}
+
+// Return an insn_info::order_node for INSN, creating one if necessary.
+insn_info::order_node *
+function_info::need_order_node (insn_info *insn)
+{
+  insn_info::order_node *order = insn->get_order_node ();
+  if (!order)
+    {
+      order = allocate<insn_info::order_node> (insn->uid ());
+      insn->add_note (order);
+    }
+  return order;
+}
+
+// Add instruction INSN immediately after AFTER in the reverse postorder list.
+// INSN is not currently in the list.
+void
+function_info::add_insn_after (insn_info *insn, insn_info *after)
+{
+  gcc_checking_assert (!insn->has_insn_links ());
+
+  insn->copy_next_from (after);
+  after->set_next_any_insn (insn);
+
+  // The prev link is easy if AFTER and INSN are the same type.
+  // Handle the other cases below.
+  if (after->is_debug_insn () == insn->is_debug_insn ())
+    insn->set_prev_sametype_insn (after);
+
+  if (insn_info *next = insn->next_any_insn ())
+    {
+      if (insn->is_debug_insn () == next->is_debug_insn ())
+	{
+	  // INSN might now be the start of the subsequence of debug insns,
+	  // and so its prev pointer might point to the end of the subsequence
+	  // instead of AFTER.
+	  insn->copy_prev_from (next);
+	  next->set_prev_sametype_insn (insn);
+	}
+      else if (insn->is_debug_insn ()) // && !next->is_debug_insn ()
+	{
+	  // INSN ends a subsequence of debug instructions.  Find the
+	  // first debug instruction in the subsequence, which might
+	  // be INSN itself.  (If it isn't, then AFTER is also a debug
+	  // instruction and we updated INSN's prev link above.)
+	  insn_info *first = next->prev_nondebug_insn ()->next_any_insn ();
+	  first->set_last_debug_insn (insn);
+	}
+      else // !insn->is_debug_insn () && next->is_debug_insn ()
+	// At present we don't (need to) support inserting a nondebug
+	// instruction between two existing debug instructions.
+	gcc_assert (!after->is_debug_insn ());
+
+      // If AFTER and NEXT are separated by at least two points, we can
+      // use a unique point number for INSN.  Otherwise INSN will have
+      // the same point number as AFTER.
+      insn->set_point ((next->point () + after->point ()) / 2);
+    }
+  else
+    {
+      if (!insn->is_debug_insn ())
+	{
+	  insn->set_prev_sametype_insn (m_last_nondebug_insn);
+	  m_last_nondebug_insn = insn;
+	}
+      else
+	// There is now at least one debug instruction after
+	// m_last_nondebug_insn: either INSN itself, or the start of
+	// a longer subsequence of debug insns that now ends with AFTER
+	// followed by INSN.
+	m_last_nondebug_insn->next_any_insn ()->set_last_debug_insn (insn);
+      m_last_insn = insn;
+
+      insn->set_point (after->point () + POINT_INCREASE);
+    }
+
+  // If INSN's program point is the same as AFTER's, we need to use the
+  // splay tree to record their relative order.
+  if (insn->point () == after->point ())
+    {
+      insn_info::order_node *after_node = need_order_node (after);
+      insn_info::order_node *insn_node = need_order_node (insn);
+      insn_info::order_splay_tree::insert_child (after_node, 1, insn_node);
+    }
+}
+
+// Remove INSN from the function's list of instructions.
+void
+function_info::remove_insn (insn_info *insn)
+{
+  if (insn_info::order_node *order = insn->get_order_node ())
+    insn_info::order_splay_tree::remove_node (order);
+
+  if (auto *note = insn->find_note<insn_call_clobbers_note> ())
+    {
+      ebb_call_clobbers_info *ecc = insn->ebb ()->first_call_clobbers ();
+      while (ecc->abi ()->id () != note->abi_id ())
+	ecc = ecc->next ();
+      int comparison = lookup_call_clobbers (*ecc, insn);
+      gcc_assert (comparison == 0);
+      ecc->remove_root ();
+    }
+
+  insn_info *prev = insn->prev_any_insn ();
+  insn_info *next = insn->next_any_insn ();
+  insn_info *prev_nondebug = insn->prev_nondebug_insn ();
+  insn_info *next_nondebug = insn->next_nondebug_insn ();
+
+  // We should never remove the entry or exit block's instructions.
+  // At present we also don't remove entire blocks, so should never
+  // remove debug instructions.
+  gcc_checking_assert (prev_nondebug
+		       && next_nondebug
+		       && !insn->is_debug_insn ());
+
+  if (prev->is_debug_insn () && next->is_debug_insn ())
+    {
+      // We need to stitch together two subsequences of debug insns.
+      insn_info *last = next->last_debug_insn ();
+      next->set_prev_sametype_insn (prev);
+      prev_nondebug->next_any_insn ()->set_last_debug_insn (last);
+    }
+  prev->set_next_any_insn (next);
+  next_nondebug->set_prev_sametype_insn (prev_nondebug);
+
+  insn->clear_insn_links ();
+}
+
+// Create an artificial instruction for BB, associating it with RTL (which can
+// be null).  Add the new instruction to the end of the function's list and
+// return the new instruction.
+insn_info *
+function_info::append_artificial_insn (bb_info *bb, rtx_insn *rtl)
+{
+  insn_info *insn = allocate<insn_info> (bb, rtl, m_next_artificial_uid);
+  m_next_artificial_uid -= 1;
+  append_insn (insn);
+  return insn;
+}
+
+// Finish building a new list of uses and definitions for instruction INSN.
+void
+function_info::finish_insn_accesses (insn_info *insn)
+{
+  unsigned int num_defs = m_temp_defs.length ();
+  unsigned int num_uses = m_temp_uses.length ();
+  obstack_make_room (&m_obstack, num_defs + num_uses);
+  if (num_defs)
+    {
+      sort_accesses (m_temp_defs);
+      obstack_grow (&m_obstack, m_temp_defs.address (),
+		    num_defs * sizeof (access_info *));
+      m_temp_defs.truncate (0);
+    }
+  if (num_uses)
+    {
+      sort_accesses (m_temp_uses);
+      obstack_grow (&m_obstack, m_temp_uses.address (),
+		    num_uses * sizeof (access_info *));
+      m_temp_uses.truncate (0);
+    }
+  void *addr = obstack_finish (&m_obstack);
+  insn->set_accesses (static_cast<access_info **> (addr), num_defs, num_uses);
+}
+
+// Called while building SSA form using BI.  Record that INSN contains
+// read reference REF.  If this requires new entries to be added to
+// INSN->uses (), add those entries to the list we're building in
+// m_temp_uses.
+void
+function_info::record_use (build_info &bi, insn_info *insn,
+			   rtx_obj_reference ref)
+{
+  unsigned int regno = ref.regno;
+  machine_mode mode = ref.is_reg () ? ref.mode : BLKmode;
+  access_info *access = bi.last_access[ref.regno + 1];
+  use_info *use = safe_dyn_cast<use_info *> (access);
+  if (!use)
+    {
+      set_info *value = safe_dyn_cast<set_info *> (access);
+      // In order to ensure that -g doesn't affect codegen, uses in debug
+      // instructions do not affect liveness, either in DF or here.
+      // This means that there might be no correct definition of the resource
+      // available (e.g. if it would require a phi node that the nondebug
+      // code doesn't need).  Perhaps we could have "debug phi nodes" as
+      // well as "debug instructions", but that would require a method
+      // of building phi nodes that didn't depend on DF liveness information,
+      // and so might be significantly more expensive.
+      //
+      // Therefore, the only value we try to attach to a use by a debug
+      // instruction is VALUE itself (as we would for nondebug instructions).
+      // We then need to make a conservative check for whether VALUE is
+      // actually correct.
+      auto value_is_valid = [&]()
+	{
+	  // Memory always has a valid definition.
+	  if (ref.is_mem ())
+	    return true;
+
+	  // If VALUE would lead to an uninitialized use anyway, there's
+	  // nothing to check.
+	  if (!value)
+	    return false;
+
+	  // If the previous definition occurs in the same EBB then it
+	  // is certainly correct.
+	  if (value->ebb () == bi.current_ebb)
+	    return true;
+
+	  // If the register is live on entry to the EBB but not used
+	  // within it, VALUE is the correct live-in value.
+	  if (bitmap_bit_p (bi.ebb_live_in_for_debug, regno))
+	    return true;
+
+	  // Check if VALUE is the function's only definition of REGNO
+	  // and if it dominates the use.
+	  if (regno != MEM_REGNO
+	      && regno < DF_REG_SIZE (DF)
+	      && DF_REG_DEF_COUNT (regno) == 1
+	      && dominated_by_p (CDI_DOMINATORS, insn->bb ()->cfg_bb (),
+				 value->bb ()->cfg_bb ()))
+	    return true;
+
+	  // Punt for other cases.
+	  return false;
+	};
+      if (insn->is_debug_insn () && !value_is_valid ())
+	value = nullptr;
+
+      use = allocate<use_info> (insn, resource_info { mode, regno }, value);
+      add_use (use);
+      m_temp_uses.safe_push (use);
+      bi.last_access[ref.regno + 1] = use;
+      use->record_reference (ref, true);
+    }
+  else
+    {
+      // Record the mode of the largest use.  The choice is arbitrary if
+      // the instruction (unusually) references the same register in two
+      // different but equal-sized modes.
+      gcc_checking_assert (use->insn () == insn);
+      if (HARD_REGISTER_NUM_P (regno)
+	  && partial_subreg_p (use->mode (), mode))
+	use->set_mode (mode);
+      use->record_reference (ref, false);
+    }
+}
+
+// Called while building SSA form for INSN using BI.  Record the effect
+// of call clobbers in RTL.  We have already added the explicit sets and
+// clobbers for RTL, which have priority over any call clobbers.
+void
+function_info::record_call_clobbers (build_info &bi, insn_info *insn,
+				     rtx_call_insn *rtl)
+{
+  // See whether we should record this call in the EBB's list of
+  // call clobbers.  Three things affect this choice:
+  //
+  // (1) The list is the only way we have of recording partial clobbers.
+  //     All calls that only partially clobber registers must therefore
+  //     be in the list.
+  //
+  // (2) Adding calls to the list is much more memory-efficient than
+  //     creating a long list of clobber_infos.
+  //
+  // (3) Adding calls to the list limits the ability to move definitions
+  //     of registers that are normally fully or partially clobbered
+  //     by the associated predefined ABI.  So adding calls to the list
+  //     can hamper optimization if (thanks to -fipa-ra) the number of
+  //     clobbers is much smaller than the usual set.
+  //
+  // The trade-off that we currently take is to use the list if there
+  // are some registers that the call only partially clobbers or if
+  // the set of clobbers is the standard set.
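+  //
+  // Equivalently: individual clobber_infos are used only when -fipa-ra
+  // has reduced the call's full clobbers to a proper subset of the
+  // ABI's full clobbers and the call has no partial clobbers.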
+  function_abi abi = insn_callee_abi (rtl);
+  if (abi.base_abi ().full_reg_clobbers () == abi.full_reg_clobbers ()
+      || abi.full_and_partial_reg_clobbers () != abi.full_reg_clobbers ())
+    {
+      // Find an entry for this predefined ABI, creating one if necessary.
+      ebb_call_clobbers_info *ecc = bi.current_ebb->first_call_clobbers ();
+      while (ecc && ecc->abi () != &abi.base_abi ())
+	ecc = ecc->next ();
+      if (!ecc)
+	{
+	  ecc = allocate<ebb_call_clobbers_info> (&abi.base_abi ());
+	  ecc->m_next = bi.current_ebb->first_call_clobbers ();
+	  bi.current_ebb->set_first_call_clobbers (ecc);
+	}
+
+      auto abi_id = abi.base_abi ().id ();
+      auto *insn_clobbers = allocate<insn_call_clobbers_note> (abi_id, insn);
+      insn->add_note (insn_clobbers);
+
+      ecc->insert_max_node (insn_clobbers);
+    }
+  else
+    for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
+      if (TEST_HARD_REG_BIT (abi.full_reg_clobbers (), regno))
+	{
+	  def_info *def = m_defs[regno + 1];
+	  if (!def || def->last_def ()->insn () != insn)
+	    {
+	      def = allocate<clobber_info> (insn, regno);
+	      def->m_is_call_clobber = true;
+	      append_def (def);
+	      m_temp_defs.safe_push (def);
+	      bi.last_access[regno + 1] = def;
+	    }
+	}
+}
+
+// Called while building SSA form using BI.  Record that INSN contains
+// write reference REF.  Add associated def_infos to the list of accesses
+// that we're building in m_temp_defs.  Record the register's new live
+// value in BI.
+void
+function_info::record_def (build_info &bi, insn_info *insn,
+			   rtx_obj_reference ref)
+{
+  // Punt if we see multiple definitions of the same resource.
+  // This can happen for several reasons:
+  //
+  // - An instruction might store two values to memory at once, giving two
+  //   distinct memory references.
+  //
+  // - An instruction might assign to multiple pieces of a wide pseudo
+  //   register.  For example, on 32-bit targets, an instruction might
+  //   assign to both the upper and lower halves of a 64-bit pseudo register.
+  //
+  // - It's possible for the same register to be clobbered by the
+  //   CALL_INSN_FUNCTION_USAGE and to be set by the main instruction
+  //   pattern as well.  In that case, the clobber conceptually happens
+  //   before the set and can essentially be ignored.
+  //
+  // - Similarly, global registers are implicitly set by a call but can
+  //   be explicitly set or clobbered as well.  In that situation, the sets
+  //   are listed first and should win over a clobber.
+  unsigned int regno = ref.regno;
+  machine_mode mode = ref.is_reg () ? ref.mode : BLKmode;
+  def_info *def = safe_dyn_cast<def_info *> (bi.last_access[ref.regno + 1]);
+  if (def && def->insn () == insn)
+    {
+      if (!ref.is_clobber ())
+	{
+	  gcc_checking_assert (!is_a<clobber_info *> (def));
+	  def->record_reference (ref, false);
+	}
+      return;
+    }
+
+  // Memory is always well-defined, so only use clobber_infos for registers.
+  if (ref.is_reg () && ref.is_clobber ())
+    def = allocate<clobber_info> (insn, regno);
+  else
+    def = allocate<set_info> (insn, resource_info { mode, regno });
+  def->record_reference (ref, true);
+  append_def (def);
+  m_temp_defs.safe_push (def);
+  bi.last_access[ref.regno + 1] = def;
+}
+
+// Called while building SSA form using BI.  Add an insn_info for RTL
+// to the block that we're currently building.
+void
+function_info::add_insn_to_block (build_info &bi, rtx_insn *rtl)
+{
+  insn_info *insn = allocate<insn_info> (bi.current_bb, rtl, UNKNOWN_COST);
+  append_insn (insn);
+
+  vec_rtx_properties properties;
+  properties.add_insn (rtl, true);
+  insn->set_properties (properties);
+
+  start_insn_accesses ();
+
+  // Record the uses.
+  for (rtx_obj_reference ref : properties.refs ())
+    if (ref.is_read ())
+      record_use (bi, insn, ref);
+
+  // Restore the contents of bi.last_access, which we used as a cache
+  // when assembling the uses.
+  for (access_info *access : m_temp_uses)
+    {
+      unsigned int regno = access->regno ();
+      gcc_checking_assert (bi.last_access[regno + 1] == access);
+      bi.last_access[regno + 1] = as_a<use_info *> (access)->def ();
+    }
+
+  // Record the definitions.
+  for (rtx_obj_reference ref : properties.refs ())
+    if (ref.is_write ())
+      record_def (bi, insn, ref);
+
+  // Logically these happen before the explicit definitions, but if the
+  // explicit definitions and call clobbers reference the same register,
+  // the explicit definition should win.
+  if (auto *call_rtl = dyn_cast<rtx_call_insn *> (rtl))
+    record_call_clobbers (bi, insn, call_rtl);
+
+  finish_insn_accesses (insn);
+}
+
+// Check whether INSN sets any registers that are never subsequently used.
+// If so, add REG_UNUSED notes for them.  The caller has already removed
+// any previous REG_UNUSED notes.
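+//
+// For example, if INSN sets (reg:SI 100) and register 100 has no
+// remaining nondebug uses, a REG_UNUSED note for (reg:SI 100) is
+// added to INSN.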
+void
+function_info::add_reg_unused_notes (insn_info *insn)
+{
+  rtx_insn *rtl = insn->rtl ();
+
+  auto handle_potential_set = [&](rtx pattern)
+    {
+      if (GET_CODE (pattern) != SET)
+	return;
+
+      rtx dest = SET_DEST (pattern);
+      if (!REG_P (dest))
+	return;
+
+      def_array defs = insn->defs ();
+      unsigned int index = find_access_index (defs, REGNO (dest));
+      for (unsigned int i = 0; i < REG_NREGS (dest); ++i)
+	{
+	  def_info *def = defs[index + i];
+	  gcc_checking_assert (def->regno () == REGNO (dest) + i);
+	  set_info *set = dyn_cast<set_info *> (def);
+	  if (set && set->has_nondebug_uses ())
+	    return;
+	}
+      add_reg_note (rtl, REG_UNUSED, dest);
+    };
+
+  rtx pattern = PATTERN (rtl);
+  if (GET_CODE (pattern) == PARALLEL)
+    for (int i = 0; i < XVECLEN (pattern, 0); ++i)
+      handle_potential_set (XVECEXP (pattern, 0, i));
+  else
+    handle_potential_set (pattern);
+}
+
+// Search TREE for call clobbers at INSN.  Return:
+//
+// - less than zero if INSN occurs before the root of TREE
+// - 0 if INSN is the root of TREE
+// - greater than zero if INSN occurs after the root of TREE
+int
+rtl_ssa::lookup_call_clobbers (insn_call_clobbers_tree &tree, insn_info *insn)
+{
+  auto compare = [&](insn_call_clobbers_note *clobbers)
+    {
+      return insn->compare_with (clobbers->insn ());
+    };
+  return tree.lookup (compare);
+}
+
+// Print a description of INSN to PP.
+void
+rtl_ssa::pp_insn (pretty_printer *pp, const insn_info *insn)
+{
+  if (!insn)
+    pp_string (pp, "<null>");
+  else
+    insn->print_full (pp);
+}
+
+// Print a description of INSN to FILE.
+void
+dump (FILE *file, const insn_info *insn)
+{
+  dump_using (file, pp_insn, insn);
+}
+
+// Debug interface to the dump routine above.
+void debug (const insn_info *x) { dump (stderr, x); }
diff --git a/gcc/rtl-ssa/insns.h b/gcc/rtl-ssa/insns.h
new file mode 100644
index 00000000000..a663103c1d9
--- /dev/null
+++ b/gcc/rtl-ssa/insns.h
@@ -0,0 +1,505 @@
+// Instruction-related RTL SSA classes                              -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+namespace rtl_ssa {
+
+// A fake cost for instructions that we haven't costed yet.
+const int UNKNOWN_COST = INT_MAX;
+
+// Enumerates the kinds of note that can be added to an instruction.
+// See the comment above insn_info for details.
+enum class insn_note_kind : uint8_t
+{
+  ORDER_NODE,
+  CALL_CLOBBERS
+};
+
+// The base class for notes that can be added to an instruction.
+// See the comment above insn_info for details.
+class insn_note
+{
+  // Size: 2 LP64 words.
+  friend class insn_info;
+  friend class function_info;
+
+public:
+  // Return what kind of note this is.
+  insn_note_kind kind () const { return m_kind; }
+
+  // Return the next note in the list, or null if none.
+  insn_note *next_note () const { return m_next_note; }
+
+  // Used with T = Derived *, where Derived is derived from insn_note.
+  // Convert the note to Derived, asserting that it has the right kind.
+  template<typename T>
+  T as_a ();
+
+  // Used with T = Derived *, where Derived is derived from insn_note.
+  // If the note is a Derived note, return it in that form, otherwise
+  // return null.
+  template<typename T>
+  T dyn_cast ();
+
+protected:
+  // Construct a note with the given kind.
+  insn_note (insn_note_kind);
+
+private:
+  // The next note in the list, or null if none.
+  insn_note *m_next_note;
+
+  // The kind of note this is.
+  insn_note_kind m_kind : 8;
+
+protected:
+  // Fill in the remaining LP64 word with data that derived classes can use.
+  unsigned int m_data8 : 8;
+  unsigned int m_data16 : 16;
+  unsigned int m_data32 : 32;
+};
+
+// Instructions have one of these notes if insn_info::has_call_clobbers ()
+// is true.  All such instructions in an EBB are first grouped together
+// by the predefined_function_abis of the functions that they call.
+// Then, for each such predefined ABI, the call_clobbers notes are put
+// into a splay tree whose nodes follow execution order.
+class insn_call_clobbers_note : public insn_note
+{
+  friend class function_info;
+  friend class default_splay_tree_accessors<insn_call_clobbers_note *>;
+
+public:
+  static const insn_note_kind kind = insn_note_kind::CALL_CLOBBERS;
+
+  // Return the identifier of the predefined_function_abi.
+  unsigned int abi_id () const { return m_data32; }
+
+  // Return the instruction to which the note is attached.
+  insn_info *insn () const { return m_insn; }
+
+protected:
+  insn_call_clobbers_note (unsigned int abi_id, insn_info *insn);
+
+  // The splay tree pointers.
+  insn_call_clobbers_note *m_children[2];
+
+  // The value returned by insn ().
+  insn_info *m_insn;
+};
+
+// A splay tree of insn_call_clobbers_notes.
+using insn_call_clobbers_tree = default_splay_tree<insn_call_clobbers_note *>;
+
+// SSA-related information about an instruction.  It also represents
+// artificial instructions that are added to make the dataflow correct;
+// these artificial instructions fall into three categories:
+//
+// - Instructions that hold the phi nodes for an extended basic block (is_phi).
+//
+// - Instructions that represent the head of a basic block and that hold
+//   all the associated artificial uses and definitions.
+//
+// - Instructions that represent the end of a basic block and that again
+//   hold all the associated artificial uses and definitions.
+//
+// Dataflow-wise, each instruction goes through three stages:
+//
+// (1) Use all the values in uses ().
+//
+// (2) If has_call_clobbers (), clobber the registers indicated by
+//     insn_callee_abi.
+//
+// (3) Define all the values in defs ().
+//
+// Having stage (2) is a trade-off: it makes processing the instructions
+// more complicated, but it saves having to allocate memory for every
+// individual call clobber.  Without it, clobbers for calls would often
+// make up a large proportion of the total definitions in a function.
+//
+// All the instructions in a function are chained together in a list
+// that follows a reverse postorder traversal of the CFG.  The list
+// contains both debug and nondebug instructions, but it is possible
+// to hop from one nondebug instruction to the next with constant complexity.
+//
+// Instructions can have supplemental information attached in the form
+// of "notes", a bit like REG_NOTES for the underlying RTL insns.
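+//
+// For example, a call instruction whose clobbers are recorded at the
+// EBB level carries an insn_call_clobbers_note, which can be retrieved
+// with find_note<insn_call_clobbers_note> ().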
+class insn_info
+{
+  // Size: 8 LP64 words.
+  friend class ebb_info;
+  friend class function_info;
+
+public:
+  // Compare instructions by their positions in the function list described
+  // above.  Thus for two instructions in the same basic block, I1 < I2 if
+  // I1 comes before I2 in the block.
+  bool operator< (const insn_info &) const;
+  bool operator<= (const insn_info &) const;
+  bool operator>= (const insn_info &) const;
+  bool operator> (const insn_info &) const;
+
+  // Return -1 if this instruction comes before INSN in the reverse
+  // postorder, 0 if this instruction is INSN, or 1 if this instruction
+  // comes after INSN in the reverse postorder.
+  int compare_with (const insn_info *insn) const;
+
+  // Return the previous and next instructions in the list described above,
+  // or null if there are no such instructions.
+  insn_info *prev_any_insn () const;
+  insn_info *next_any_insn () const;
+
+  // Only valid if !is_debug_insn ().  Return the previous and next
+  // nondebug instructions in the list described above, skipping over
+  // any intervening debug instructions.  These are constant-time operations.
+  insn_info *prev_nondebug_insn () const;
+  insn_info *next_nondebug_insn () const;
+
+  // Return the underlying RTL insn.  The return value is null if
+  // is_phi () or is_bb_end () is true, and is a basic block note if
+  // is_bb_head () is true.
+  rtx_insn *rtl () const { return m_rtl; }
+
+  // Return true if the instruction is a real insn with an rtl pattern.
+  // Return false if it is an artificial instruction that represents the
+  // phi nodes in an extended basic block or the head or end of a basic block.
+  bool is_real () const { return m_cost_or_uid >= 0; }
+
+  // Return the opposite of is_real ().
+  bool is_artificial () const { return m_cost_or_uid < 0; }
+
+  // Return true if the instruction was a real instruction but has now
+  // been deleted.  In this case the instruction is no longer part of
+  // the SSA information.
+  bool has_been_deleted () const { return m_rtl && !INSN_P (m_rtl); }
+
+  // Return true if the instruction is a debug instruction (and thus
+  // also a real instruction).
+  bool is_debug_insn () const { return m_is_debug_insn; }
+
+  // Return true if the instruction is something that we can optimize.
+  // This implies that it is a real instruction that contains an asm
+  // or that contains something that matches an .md define_insn pattern.
+  bool can_be_optimized () const { return m_can_be_optimized; }
+
+  // Return true if the instruction is a call instruction.
+  //
+  // ??? We could cache this information, but since most callers would
+  // go on to access PATTERN (rtl ()), a cache might not be helpful and
+  // could even be counterproductive.
+  bool is_call () const { return CALL_P (m_rtl); }
+
+  // Return true if the instruction is a jump instruction.
+  //
+  // ??? See is_call for the reason we don't cache this.
+  bool is_jump () const { return JUMP_P (m_rtl); }
+
+  // Return true if the instruction is real and contains an inline asm.
+  bool is_asm () const { return m_is_asm; }
+
+  // Return true if the instruction is real and includes an RTX_AUTOINC
+  // operation.
+  bool has_pre_post_modify () const { return m_has_pre_post_modify; }
+
+  // Return true if the instruction is real and has volatile references,
+  // in the sense of volatile_refs_p.  This includes volatile memory,
+  // volatile asms and UNSPEC_VOLATILEs.
+  bool has_volatile_refs () const { return m_has_volatile_refs; }
+
+  // Return true if the instruction is artificial and if its (sole)
+  // purpose is to hold the phi nodes in an extended basic block.
+  bool is_phi () const;
+
+  // Return true if the instruction is artificial and if it represents
+  // the head of a basic block.  If so, the instruction conceptually
+  // executes before the real instructions in the block.  The uses
+  // and definitions represent the df_get_artificial_uses and
+  // df_get_artificial_defs entries for the head of the block.
+  bool is_bb_head () const;
+
+  // Return true if the instruction is artificial and if it represents
+  // the end of a basic block.  The uses and definitions represent the
+  // df_get_artificial_uses and df_get_artificial_defs entries for
+  // the end of the block.
+  bool is_bb_end () const;
+
+  // Return the basic block that the instruction is in.
+  bb_info *bb () const { return m_bb; }
+
+  // Return the extended basic block that the instruction is in;
+  // see bb_info for details.
+  ebb_info *ebb () const;
+
+  // If the instruction is real, return the unique identifier of the
+  // underlying RTL insn.  If the instruction is artificial, return
+  // a unique negative identifier for the instruction.
+  //
+  // Note that the identifiers are not linear: it can be the case that
+  // an instruction with a higher uid comes earlier in a block than an
+  // instruction with a lower uid.  The identifiers are however persistent;
+  // the identifier remains the same after the instruction has been moved
+  // or changed.
+  int uid () const;
+
+  // Return the list of things that this instruction uses.  Registers
+  // come first, in register number order, followed by memory.
+  use_array uses () const;
+
+  // Return true if the instruction is a call and if the clobbers
+  // described by insn_callee_abi have been omitted from the list
+  // of definitions.
+  bool has_call_clobbers () const;
+
+  // Return the list of things that this instruction sets or clobbers.
+  // Registers come first, in register number order, followed by memory.
+  //
+  // If has_call_clobbers () is true, the list omits both the full and
+  // partial register clobbers described by insn_callee_abi.
+  def_array defs () const;
+
+  // The number of entries in uses ().
+  unsigned int num_uses () const { return m_num_uses; }
+
+  // The number of entries in defs ().
+  unsigned int num_defs () const { return m_num_defs; }
+
+  // Return the cost of the instruction, as calculated by the target.
+  // For performance reasons, the cost is evaluated lazily on first use.
+  //
+  // Artificial instructions have a cost of 0.
+  unsigned int cost () const;
+
+  // Return the first insn_note attached to the instruction, or null
+  // if none.
+  insn_note *first_note () const { return m_first_note; }
+
+  // See if a note of type T is attached to the instruction.  Return it
+  // if so, otherwise return null.
+  template<typename T>
+  const T *find_note () const;
+
+  // Print "i" + uid () for real instructions and "a" + -uid () for
+  // artificial instructions.
+  void print_identifier (pretty_printer *) const;
+
+  // Print a short(ish) description of where the instruction is.
+  void print_location (pretty_printer *) const;
+
+  // Combine print_identifier and print_location.
+  void print_identifier_and_location (pretty_printer *) const;
+
+  // Print a full description of the instruction.
+  void print_full (pretty_printer *) const;
+
+private:
+  // The first-order way of representing the order between instructions
+  // is to assign "program points", with higher point numbers coming
+  // later in the reverse postorder than lower point numbers.  However,
+  // after a sequence of instruction movements, we may end up in a situation
+  // that adjacent instructions have the same program point.
+  //
+  // When that happens, we put the instructions into a splay tree that
+  // records their relative order.  Each node of the splay tree is an
+  // order_node note that is attached to its respective instruction.
+  // The root of the splay tree is not stored, since the only thing
+  // we need the tree for is to compare two nodes.
+  class order_node : public insn_note
+  {
+  public:
+    static const insn_note_kind kind = insn_note_kind::ORDER_NODE;
+
+    order_node (int uid);
+
+    // Return the uid of the instruction that this node describes.
+    int uid () const { return m_data32; }
+
+    // The splay tree pointers.
+    order_node *m_children[2];
+    order_node *m_parent;
+  };
+  using order_splay_tree = default_rootless_splay_tree<order_node *>;
+
+  // prev_insn_or_last_debug_insn represents a choice between two things:
+  //
+  // (1) A pointer to the previous instruction in the list that has the
+  //     same is_debug_insn () value, or null if no such instruction exists.
+  //
+  // (2) A pointer to the end of a sublist of debug instructions.
+  //
+  // (2) is used if this instruction is a debug instruction and the
+  // previous instruction is not.  (1) is used otherwise.
+  //
+  // next_nondebug_or_debug_insn points to the next instruction but also
+  // records whether that next instruction is a debug instruction or a
+  // nondebug instruction.
+  //
+  // Thus the list is chained as follows:
+  //
+  //         ---->        ---->     ---->     ---->     ---->
+  // NONDEBUG     NONDEBUG     DEBUG     DEBUG     DEBUG     NONDEBUG ...
+  //         <----    ^     +--     <----     <----  ^    +--
+  //                  |     |                        |    |
+  //                  |     +------------------------+    |
+  //                  |                                   |
+  //                  +-----------------------------------+
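+  //
+  // In both cases, pointer_mux stores a tag in the low bit of the
+  // pointer to record which alternative is in use, so no separate
+  // flag is needed.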
+  using prev_insn_or_last_debug_insn = pointer_mux<insn_info>;
+  using next_nondebug_or_debug_insn = pointer_mux<insn_info>;
+
+  insn_info (bb_info *bb, rtx_insn *rtl, int cost_or_uid);
+
+  static void print_uid (pretty_printer *, int);
+
+  void calculate_cost () const;
+  void set_properties (const rtx_properties &);
+  void set_accesses (access_info **, unsigned int, unsigned int);
+  void copy_accesses (access_array, access_array);
+  void set_cost (unsigned int cost) { m_cost_or_uid = cost; }
+  void set_bb (bb_info *bb) { m_bb = bb; }
+
+  void add_note (insn_note *note);
+
+  order_node *get_order_node () const;
+  order_node *get_known_order_node () const;
+  int slow_compare_with (const insn_info &) const;
+
+  insn_info *last_debug_insn () const;
+
+  unsigned int point () const { return m_point; }
+  void copy_prev_from (insn_info *);
+  void copy_next_from (insn_info *);
+  void set_prev_sametype_insn (insn_info *);
+  void set_last_debug_insn (insn_info *);
+  void set_next_any_insn (insn_info *);
+  void set_point (unsigned int point) { m_point = point; }
+  void clear_insn_links ();
+  bool has_insn_links ();
+
+  // The values returned by the accessors above.
+  prev_insn_or_last_debug_insn m_prev_insn_or_last_debug_insn;
+  next_nondebug_or_debug_insn m_next_nondebug_or_debug_insn;
+  bb_info *m_bb;
+  rtx_insn *m_rtl;
+
+  // The list of definitions followed by the list of uses.
+  access_info **m_accesses;
+
+  // The number of definitions and the number of uses.
+  // FIRST_PSEUDO_REGISTER + 1 is the maximum number of accesses to
+  // hard registers and memory, and MAX_RECOG_OPERANDS is the maximum
+  // number of pseudos that can be defined by an instruction, so the
+  // number of definitions should fit easily in 16 bits.
+  unsigned int m_num_uses;
+  unsigned int m_num_defs : 16;
+
+  // Flags returned by the accessors above.
+  unsigned int m_is_debug_insn : 1;
+  unsigned int m_can_be_optimized : 1;
+  unsigned int m_is_asm : 1;
+  unsigned int m_has_pre_post_modify : 1;
+  unsigned int m_has_volatile_refs : 1;
+
+  // For future expansion.
+  unsigned int m_spare : 11;
+
+  // The program point at which the instruction occurs.
+  //
+  // Note that the values of the program points are influenced by -g
+  // and so should not be used to make codegen decisions.
+  unsigned int m_point;
+
+  // Negative if the instruction is artificial, nonnegative if it is real.
+  //
+  // For real instructions: the cost of the instruction, or UNKNOWN_COST
+  // if we haven't measured it yet.
+  //
+  // For artificial instructions: the (negative) unique identifier of the
+  // instruction.
+  mutable int m_cost_or_uid;
+
+  // The list of notes that have been attached to the instruction.
+  insn_note *m_first_note;
+};
+
+// Iterators for unfiltered lists of instructions.
+using any_insn_iterator = list_iterator<insn_info, &insn_info::next_any_insn>;
+using reverse_any_insn_iterator
+  = list_iterator<insn_info, &insn_info::prev_any_insn>;
+
+// Iterators for nondebug instructions only.
+using nondebug_insn_iterator
+  = list_iterator<insn_info, &insn_info::next_nondebug_insn>;
+using reverse_nondebug_insn_iterator
+  = list_iterator<insn_info, &insn_info::prev_nondebug_insn>;
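+
+// For example, starting from a nondebug instruction INSN, the rest of
+// the nondebug instructions in the function can be visited with:
+//
+//   for (insn_info *i = insn; i; i = i->next_nondebug_insn ())
+//     ...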
+
+// A class that describes an inclusive range of instructions.
+class insn_range_info
+{
+public:
+  insn_range_info () = default;
+
+  // Create a range that contains a singleton instruction.
+  insn_range_info (insn_info *insn) : first (insn), last (insn) {}
+
+  // Create a range [FIRST, LAST], given that *FIRST <= *LAST.
+  insn_range_info (insn_info *first, insn_info *last);
+
+  // Return true if the range contains at least one instruction.
+  explicit operator bool () const { return *first <= *last; }
+
+  bool operator== (const insn_range_info &) const;
+  bool operator!= (const insn_range_info &) const;
+
+  // If the range contains a single instruction, return that instruction,
+  // otherwise return null.
+  insn_info *singleton () const;
+
+  // Return true if the range includes INSN.
+  bool includes (insn_info *insn) const;
+
+  // If INSN is inside the range, return INSN, otherwise return the
+  // nearest in-range instruction.
+  insn_info *clamp_insn_to_range (insn_info *insn) const;
+
+  // Return true if this range is a subrange of OTHER, i.e. if OTHER
+  // includes every instruction that this range does.
+  bool is_subrange_of (const insn_range_info &other) const;
+
+  // The lower and upper bounds of the range.
+  insn_info *first;
+  insn_info *last;
+};
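+
+// For example, insn_range_info (I) describes the singleton range [I, I]:
+// singleton () returns I and includes (I) returns true.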
+
+// A class that represents a closure of operator== for instructions.
+// This is used by insn_is; see there for details.
+class insn_is_closure
+{
+public:
+  insn_is_closure (const insn_info *insn) : m_insn (insn) {}
+  bool operator() (const insn_info *other) const { return m_insn == other; }
+
+private:
+  const insn_info *m_insn;
+};
+
+void pp_insn (pretty_printer *, const insn_info *);
+
+}
+
+void dump (FILE *, const rtl_ssa::insn_info *);
+
+void DEBUG_FUNCTION debug (const rtl_ssa::insn_info *);
diff --git a/gcc/rtl-ssa/internals.inl b/gcc/rtl-ssa/internals.inl
new file mode 100644
index 00000000000..00ad6422ee8
--- /dev/null
+++ b/gcc/rtl-ssa/internals.inl
@@ -0,0 +1,682 @@
+// Implementation of private inline member functions for RTL SSA    -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+namespace rtl_ssa {
+
+// Construct a new access with the given resource () and kind () values.
+inline access_info::access_info (resource_info resource, access_kind kind)
+  : m_regno (resource.regno),
+    m_kind (kind),
+    m_is_artificial (false),
+    m_is_set_with_nondebug_insn_uses (false),
+    m_is_pre_post_modify (false),
+    m_is_call_clobber (false),
+    m_is_live_out_use (false),
+    m_includes_address_uses (false),
+    m_includes_read_writes (false),
+    m_includes_subregs (false),
+    m_includes_multiregs (false),
+    m_only_occurs_in_notes (false),
+    m_is_last_nondebug_insn_use (false),
+    m_is_in_debug_insn_or_phi (false),
+    m_has_been_superceded (false),
+    m_is_temp (false),
+    m_spare (0),
+    m_mode (resource.mode)
+{
+}
+
+// Construct a use of RESOURCE in LOCATION.  The resource's value is provided
+// by DEF, or is completely undefined if DEF is null.
+inline use_info::use_info (insn_or_phi location, resource_info resource,
+			   set_info *definition)
+  : access_info (resource, access_kind::USE),
+    m_insn_or_phi (location),
+    m_last_use_or_prev_use (nullptr),
+    m_last_nondebug_insn_use_or_next_use (nullptr),
+    m_def (definition)
+{
+  if (m_insn_or_phi.is_second ())
+    {
+      m_is_in_debug_insn_or_phi = true;
+      m_is_artificial = true;
+    }
+  else
+    {
+      insn_info *insn = m_insn_or_phi.known_first ();
+      m_is_in_debug_insn_or_phi = insn->is_debug_insn ();
+      m_is_artificial = insn->is_artificial ();
+    }
+}
+
+// Return the correct (uncached) value of m_is_last_nondebug_insn_use.
+inline bool
+use_info::calculate_is_last_nondebug_insn_use () const
+{
+  use_info *next = next_use ();
+  return is_in_nondebug_insn () && (!next || next->is_in_debug_insn_or_phi ());
+}
+
+// Accumulate any properties about REF that are also stored in use_infos.
+// IS_FIRST is true if REF is the first access to resource () that we have
+// recorded in this way, false if we have already recorded previous
+// references.
+inline void
+use_info::record_reference (rtx_obj_reference ref, bool is_first)
+{
+  if (is_first)
+    {
+      m_includes_address_uses = ref.in_address ();
+      m_includes_read_writes = ref.is_write ();
+      m_includes_subregs = ref.in_subreg ();
+      m_includes_multiregs = ref.is_multireg ();
+      m_only_occurs_in_notes = ref.in_note ();
+    }
+  else
+    {
+      m_includes_address_uses |= ref.in_address ();
+      m_includes_read_writes |= ref.is_write ();
+      m_includes_subregs |= ref.in_subreg ();
+      m_includes_multiregs |= ref.is_multireg ();
+      m_only_occurs_in_notes &= ref.in_note ();
+    }
+}
+
+// Change the value of insn () to INSN.
+inline void
+use_info::set_insn (insn_info *insn)
+{
+  m_insn_or_phi = insn;
+  m_is_artificial = insn->is_artificial ();
+}
+
+// Copy the overloaded prev link from OTHER.
+inline void
+use_info::copy_prev_from (use_info *other)
+{
+  m_last_use_or_prev_use = other->m_last_use_or_prev_use;
+}
+
+// Copy the overloaded next link from OTHER.
+inline void
+use_info::copy_next_from (use_info *other)
+{
+  m_last_nondebug_insn_use_or_next_use
+    = other->m_last_nondebug_insn_use_or_next_use;
+  m_is_last_nondebug_insn_use = calculate_is_last_nondebug_insn_use ();
+}
+
+// Record that this use is the first in the list and that the last use is LAST.
+inline void
+use_info::set_last_use (use_info *last_use)
+{
+  m_last_use_or_prev_use.set_first (last_use);
+}
+
+// Record that this use is not the first in the list and that the previous
+// use is PREV.
+inline void
+use_info::set_prev_use (use_info *prev_use)
+{
+  m_last_use_or_prev_use.set_second (prev_use);
+}
+
+// Record that this use is the last use in the list.  If USE is nonnull,
+// record that USE is the last use in the list by a nondebug instruction,
+// otherwise record that there are no uses by nondebug instructions
+// in the list.
+inline void
+use_info::set_last_nondebug_insn_use (use_info *use)
+{
+  m_last_nondebug_insn_use_or_next_use.set_first (use);
+  m_is_last_nondebug_insn_use = (use == this);
+}
+
+// Record that this use is not the last in the list and that the next
+// use is NEXT_USE.
+inline void
+use_info::set_next_use (use_info *next_use)
+{
+  m_last_nondebug_insn_use_or_next_use.set_second (next_use);
+  m_is_last_nondebug_insn_use = calculate_is_last_nondebug_insn_use ();
+}
+
+// Clear any information relating to the position of the use in its
+// definition's list.
+inline void
+use_info::clear_use_links ()
+{
+  m_last_use_or_prev_use = nullptr;
+  m_last_nondebug_insn_use_or_next_use = nullptr;
+  m_is_last_nondebug_insn_use = false;
+}
+
+// Return true if the use has any links to other uses.  This is mostly
+// for assert checking.
+inline bool
+use_info::has_use_links ()
+{
+  return (m_last_use_or_prev_use
+	  || m_last_nondebug_insn_use_or_next_use
+	  || m_is_last_nondebug_insn_use);
+}
+
+// Construct a definition of RESOURCE in INSN, giving it kind KIND.
+inline def_info::def_info (insn_info *insn, resource_info resource,
+			   access_kind kind)
+  : access_info (resource, kind),
+    m_insn (insn),
+    m_last_def_or_prev_def (nullptr),
+    m_splay_root_or_next_def (nullptr)
+{
+  m_is_artificial = insn->is_artificial ();
+}
+
+// Record any properties about REF that are also stored in def_infos.
+// IS_FIRST is true if REF is the first access to resource () that we have
+// recorded in this way, false if we have already recorded previous
+// references.
+inline void
+def_info::record_reference (rtx_obj_reference ref, bool is_first)
+{
+  if (is_first)
+    {
+      m_is_pre_post_modify = ref.is_pre_post_modify ();
+      m_includes_read_writes = ref.is_read ();
+      m_includes_subregs = ref.in_subreg ();
+      m_includes_multiregs = ref.is_multireg ();
+    }
+  else
+    {
+      m_is_pre_post_modify |= ref.is_pre_post_modify ();
+      m_includes_read_writes |= ref.is_read ();
+      m_includes_subregs |= ref.in_subreg ();
+      m_includes_multiregs |= ref.is_multireg ();
+    }
+}
+
+// Return the last definition in the list.  Only valid when is_first ()
+// is true.
+inline def_info *
+def_info::last_def () const
+{
+  return m_last_def_or_prev_def.known_first ();
+}
+
+// Return the root of the splay tree of definitions of resource (),
+// or null if no splay tree has been created for this resource.
+// Only valid when is_last () is true.
+inline def_node *
+def_info::splay_root () const
+{
+  return m_splay_root_or_next_def.known_first ();
+}
+
+// Copy the overloaded prev link from OTHER.
+inline void
+def_info::copy_prev_from (def_info *other)
+{
+  m_last_def_or_prev_def
+    = other->m_last_def_or_prev_def;
+}
+
+// Copy the overloaded next link from OTHER.
+inline void
+def_info::copy_next_from (def_info *other)
+{
+  m_splay_root_or_next_def = other->m_splay_root_or_next_def;
+}
+
+// Record that this definition is the first in the list and that the last
+// definition is LAST.
+inline void
+def_info::set_last_def (def_info *last_def)
+{
+  m_last_def_or_prev_def.set_first (last_def);
+}
+
+// Record that this definition is not the first in the list and that the
+// previous definition is PREV.
+inline void
+def_info::set_prev_def (def_info *prev_def)
+{
+  m_last_def_or_prev_def.set_second (prev_def);
+}
+
+// Record that this definition is the last in the list and that the root
+// of the splay tree associated with resource () is ROOT.
+inline void
+def_info::set_splay_root (def_node *root)
+{
+  m_splay_root_or_next_def = root;
+}
+
+// Record that this definition is not the last in the list and that the
+// next definition is NEXT.
+inline void
+def_info::set_next_def (def_info *next_def)
+{
+  m_splay_root_or_next_def = next_def;
+}
+
+// Clear the prev and next links.
+inline void
+def_info::clear_def_links ()
+{
+  m_last_def_or_prev_def = nullptr;
+  m_splay_root_or_next_def = nullptr;
+}
+
+// Return true if the definition has any links to other definitions.
+// This is mostly for assert checking.
+inline bool
+def_info::has_def_links ()
+{
+  return m_last_def_or_prev_def || m_splay_root_or_next_def;
+}
+
+// Construct a clobber of register REGNO in insn INSN.
+inline clobber_info::clobber_info (insn_info *insn, unsigned int regno)
+  : def_info (insn, { E_BLKmode, regno }, access_kind::CLOBBER),
+    m_children (),
+    m_parent (nullptr),
+    m_group (nullptr)
+{
+}
+
+// Set the containing group to GROUP, if it isn't already.  The main
+// use of this function is to update the new root of GROUP's splay tree.
+inline void
+clobber_info::update_group (clobber_group *group)
+{
+  if (__builtin_expect (m_group != group, 0))
+    m_group = group;
+}
+
+// Construct a set_info for a store to RESOURCE in INSN, giving it
+// kind KIND.
+inline set_info::set_info (insn_info *insn, resource_info resource,
+			   access_kind kind)
+  : def_info (insn, resource, kind),
+    m_first_use (nullptr)
+{
+}
+
+// Construct a set_info for a store to RESOURCE in INSN.
+inline set_info::set_info (insn_info *insn, resource_info resource)
+  : set_info (insn, resource, access_kind::SET)
+{
+}
+
+// Record that USE is the first use of this definition.
+inline void
+set_info::set_first_use (use_info *first_use)
+{
+  m_first_use = first_use;
+  m_is_set_with_nondebug_insn_uses
+    = (first_use && first_use->is_in_nondebug_insn ());
+}
+
+// Construct a phi for RESOURCE in INSN, giving it identifier UID.
+inline phi_info::phi_info (insn_info *insn, resource_info resource,
+			   unsigned int uid)
+  : set_info (insn, resource, access_kind::PHI),
+    m_uid (uid),
+    m_num_inputs (0),
+    m_prev_phi (nullptr),
+    m_next_phi (nullptr)
+{
+}
+
+// Turn the phi into a degenerate phi, with INPUT representing the
+// value of the resource on all incoming edges.
+inline void
+phi_info::make_degenerate (use_info *input)
+{
+  m_num_inputs = 1;
+  m_single_input = input;
+}
+
+// Set the inputs of the phi to INPUTS.
+inline void
+phi_info::set_inputs (use_array inputs)
+{
+  m_num_inputs = inputs.size ();
+  if (inputs.size () == 1)
+    m_single_input = inputs[0];
+  else
+    m_inputs = access_array (inputs).begin ();
+}
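+
+// A single input is stored directly in the phi rather than in a
+// separately-allocated array; phi_info::inputs () and input_use ()
+// undo this encoding.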
+
+// Construct a definition splay tree node for FIRST_DEF, which is either
+// the first clobber_info in a group or a standalone set_info.
+inline def_node::def_node (clobber_or_set first_def)
+  : m_clobber_or_set (first_def),
+    m_children ()
+{
+}
+
+// Construct a new group of clobber_infos that initially contains just CLOBBER.
+inline clobber_group::clobber_group (clobber_info *clobber)
+  : def_node (clobber),
+    m_last_clobber (clobber),
+    m_clobber_tree (clobber)
+{
+  clobber->m_group = this;
+}
+
+// Construct a node for the instruction with uid UID.
+inline insn_info::order_node::order_node (int uid)
+  : insn_note (kind),
+    m_children (),
+    m_parent (nullptr)
+{
+  m_data32 = uid;
+}
+
+// Construct a note for instruction INSN, giving it abi_id () value ABI_ID.
+inline insn_call_clobbers_note::insn_call_clobbers_note (unsigned int abi_id,
+							 insn_info *insn)
+  : insn_note (kind),
+    m_children (),
+    m_insn (insn)
+{
+  m_data32 = abi_id;
+}
+
+// Construct an instruction with the given bb () and rtl () values.
+// If the instruction is real, COST_OR_UID is the value of cost (),
+// otherwise it is the value of uid ().
+inline insn_info::insn_info (bb_info *bb, rtx_insn *rtl, int cost_or_uid)
+  : m_prev_insn_or_last_debug_insn (nullptr),
+    m_next_nondebug_or_debug_insn (nullptr),
+    m_bb (bb),
+    m_rtl (rtl),
+    m_accesses (nullptr),
+    m_num_uses (0),
+    m_num_defs (0),
+    m_is_debug_insn (rtl && DEBUG_INSN_P (rtl)),
+    m_can_be_optimized (false),
+    m_is_asm (false),
+    m_has_pre_post_modify (false),
+    m_has_volatile_refs (false),
+    m_spare (0),
+    m_point (0),
+    m_cost_or_uid (cost_or_uid),
+    m_first_note (nullptr)
+{
+}
+
+// Copy any insn properties from PROPERTIES that are also stored in an
+// insn_info.
+inline void
+insn_info::set_properties (const rtx_properties &properties)
+{
+  m_is_asm = properties.has_asm;
+  m_has_pre_post_modify = properties.has_pre_post_modify;
+  m_has_volatile_refs = properties.has_volatile_refs;
+  // Not strictly related to the properties we've been given, but it's
+  // a convenient location to do this.
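+  // The use of & rather than && is presumably deliberate: each operand
+  // is a 0/1 value, so the expression can be evaluated without branches.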
+  m_can_be_optimized = (NONDEBUG_INSN_P (m_rtl)
+			& (GET_CODE (PATTERN (m_rtl)) != USE)
+			& (GET_CODE (PATTERN (m_rtl)) != CLOBBER));
+}
+
+// Change the list of instruction accesses to ACCESSES, which contains
+// NUM_DEFS definitions followed by NUM_USES uses.
+inline void
+insn_info::set_accesses (access_info **accesses,
+			 unsigned int num_defs, unsigned int num_uses)
+{
+  m_accesses = accesses;
+  m_num_defs = num_defs;
+  gcc_assert (num_defs == m_num_defs);
+  m_num_uses = num_uses;
+}
+
+// Change defs () and uses () to DEFS and USES respectively, given that
+// the existing m_accesses array has enough room for them.
+inline void
+insn_info::copy_accesses (access_array defs, access_array uses)
+{
+  gcc_assert (defs.size () + uses.size () <= m_num_defs + m_num_uses);
+  memcpy (m_accesses, defs.begin (), defs.size_bytes ());
+  memcpy (m_accesses + defs.size (), uses.begin (), uses.size_bytes ());
+  m_num_defs = defs.size ();
+  gcc_assert (m_num_defs == defs.size ());
+  m_num_uses = uses.size ();
+}
+
+// If the instruction has an insn_info::order_node, return the node,
+// otherwise return null.
+inline insn_info::order_node *
+insn_info::get_order_node () const
+{
+  // The order_node always comes first.
+  if (insn_note *note = first_note ())
+    return note->dyn_cast<insn_info::order_node *> ();
+  return nullptr;
+}
+
+// Like get_order_node (), but the node is known to exist.
+inline insn_info::order_node *
+insn_info::get_known_order_node () const
+{
+  // The order_node always comes first.
+  return first_note ()->as_a<insn_info::order_node *> ();
+}
+
+// Copy the overloaded prev link from OTHER.
+inline void
+insn_info::copy_prev_from (insn_info *other)
+{
+  m_prev_insn_or_last_debug_insn = other->m_prev_insn_or_last_debug_insn;
+}
+
+// Copy the overloaded next link from OTHER.
+inline void
+insn_info::copy_next_from (insn_info *other)
+{
+  m_next_nondebug_or_debug_insn = other->m_next_nondebug_or_debug_insn;
+}
+
+// If this is a nondebug instruction, record that the previous nondebug
+// instruction is PREV.  (There might be intervening debug instructions.)
+//
+// If this is a debug instruction, record that the previous instruction
+// is debug instruction PREV.
+inline void
+insn_info::set_prev_sametype_insn (insn_info *prev)
+{
+  m_prev_insn_or_last_debug_insn.set_first (prev);
+}
+
+// Only valid for debug instructions.  Record that this instruction starts
+// a subsequence of debug instructions that ends with LAST.
+inline void
+insn_info::set_last_debug_insn (insn_info *last)
+{
+  m_prev_insn_or_last_debug_insn.set_second (last);
+}
+
+// Record that the next instruction of any kind is NEXT.
+inline void
+insn_info::set_next_any_insn (insn_info *next)
+{
+  if (next && next->is_debug_insn ())
+    m_next_nondebug_or_debug_insn.set_second (next);
+  else
+    m_next_nondebug_or_debug_insn.set_first (next);
+}
+
+// Clear the list links and point number for this instruction.
+inline void
+insn_info::clear_insn_links ()
+{
+  m_prev_insn_or_last_debug_insn = nullptr;
+  m_next_nondebug_or_debug_insn = nullptr;
+  m_point = 0;
+}
+
+// Return true if the instruction contains any list information.
+// This is used by assert checking.
+inline bool
+insn_info::has_insn_links ()
+{
+  return (m_prev_insn_or_last_debug_insn
+	  || m_next_nondebug_or_debug_insn
+	  || m_point);
+}
+
+// Construct a representation of basic block CFG_BB.
+inline bb_info::bb_info (basic_block cfg_bb)
+  : m_prev_bb (nullptr),
+    m_next_bb (nullptr),
+    m_cfg_bb (cfg_bb),
+    m_ebb (nullptr),
+    m_head_insn (nullptr),
+    m_end_insn (nullptr)
+{
+}
+
+// Construct a tree of call clobbers for the given ABI.
+inline ebb_call_clobbers_info::
+ebb_call_clobbers_info (const predefined_function_abi *abi)
+  : m_next (nullptr),
+    m_abi (abi)
+{
+}
+
+// Construct an EBB whose first block is FIRST_BB and whose last block
+// is LAST_BB.
+inline ebb_info::ebb_info (bb_info *first_bb, bb_info *last_bb)
+  : m_first_phi (nullptr),
+    m_phi_insn (nullptr),
+    m_first_bb (first_bb),
+    m_last_bb (last_bb),
+    m_first_call_clobbers (nullptr)
+{
+}
+
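+// In the last_access array used below, index 0 is reserved for memory
+// and index REGNO + 1 corresponds to register REGNO.
+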
+// Set the contents of last_access for register REGNO to DEF.
+inline void
+function_info::build_info::record_reg_def (unsigned int regno, def_info *def)
+{
+  last_access[regno + 1] = def;
+}
+
+// Set the contents of last_access for memory to DEF.
+inline void
+function_info::build_info::record_mem_def (def_info *def)
+{
+  last_access[0] = def;
+}
+
+// Return the current value of live register REGNO, or null if the register's
+// value is completely undefined.
+inline set_info *
+function_info::build_info::current_reg_value (unsigned int regno) const
+{
+  return safe_dyn_cast<set_info *> (last_access[regno + 1]);
+}
+
+// Return the current value of memory.
+inline set_info *
+function_info::build_info::current_mem_value () const
+{
+  return as_a<set_info *> (last_access[0]);
+}
+
+// Allocate a T on the function's main obstack, passing ARGS
+// to its constructor.
+template<typename T, typename... Ts>
+inline T *
+function_info::allocate (Ts... args)
+{
+  static_assert (std::is_trivially_destructible<T>::value,
+		 "destructor won't be called");
+  static_assert (alignof (T) <= obstack_alignment,
+		 "too much alignment required");
+  void *addr = obstack_alloc (&m_obstack, sizeof (T));
+  return new (addr) T (std::forward<Ts> (args)...);
+}
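+
+// For example, a sketch of a typical allocation (with hypothetical
+// arguments):
+//
+//   set_info *set = allocate<set_info> (insn, resource);
+//
+// The static_asserts above make it safe to free the obstack without
+// running destructors.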
+
+// Allocate a T on the function's temporary obstack, passing ARGS
+// to its constructor.
+template<typename T, typename... Ts>
+inline T *
+function_info::allocate_temp (Ts... args)
+{
+  static_assert (std::is_trivially_destructible<T>::value,
+		 "destructor won't be called");
+  static_assert (alignof (T) <= obstack_alignment,
+		 "too much alignment required");
+  void *addr = obstack_alloc (&m_temp_obstack, sizeof (T));
+  return new (addr) T (std::forward<Ts> (args)...);
+}
+
+// Add INSN to the end of the function's list of instructions.
+inline void
+function_info::append_insn (insn_info *insn)
+{
+  gcc_checking_assert (!insn->has_insn_links ());
+  if (insn_info *after = m_last_insn)
+    add_insn_after (insn, after);
+  else
+    // The first instruction is for the entry block and is always a
+    // nondebug insn.
+    m_first_insn = m_last_insn = m_last_nondebug_insn = insn;
+}
+
+// Start building a new list of uses and definitions for an instruction.
+inline void
+function_info::start_insn_accesses ()
+{
+  gcc_checking_assert (m_temp_defs.is_empty ()
+		       && m_temp_uses.is_empty ());
+}
+
+// Return a mode that encapsulates two distinct references to a register,
+// one with mode MODE1 and one with mode MODE2.  Treat BLKmode as a
+// "don't know" wildcard.
+inline machine_mode
+combine_modes (machine_mode mode1, machine_mode mode2)
+{
+  if (mode1 == E_BLKmode)
+    return mode2;
+
+  if (mode2 == E_BLKmode)
+    return mode1;
+
+  return wider_subreg_mode (mode1, mode2);
+}
+
+// PRINTER (PP, ARGS...) prints ARGS... to a pretty_printer PP.  Use it
+// to print ARGS... to FILE.
+template<typename Printer, typename... Args>
+inline void
+dump_using (FILE *file, Printer printer, Args... args)
+{
+  pretty_printer pp;
+  printer (&pp, args...);
+  pp_newline (&pp);
+  fprintf (file, "%s", pp_formatted_text (&pp));
+}
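+
+// For example, dump_using (file, pp_insn, insn) would print INSN to
+// FILE, assuming a pp_insn-style printer that takes a pretty_printer
+// and an insn_info.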
+
+}
diff --git a/gcc/rtl-ssa/is-a.inl b/gcc/rtl-ssa/is-a.inl
new file mode 100644
index 00000000000..14e1316b6c0
--- /dev/null
+++ b/gcc/rtl-ssa/is-a.inl
@@ -0,0 +1,98 @@
+// is_a<> support for RTL SSA classes                               -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+template<>
+struct is_a_helper<rtl_ssa::def_info *>
+  : static_is_a_helper<rtl_ssa::def_info *>
+{
+  static inline bool
+  test (const rtl_ssa::access_info *ref)
+  {
+    return (ref->kind () == rtl_ssa::access_kind::SET
+	    || ref->kind () == rtl_ssa::access_kind::PHI
+	    || ref->kind () == rtl_ssa::access_kind::CLOBBER);
+  }
+};
+
+template<>
+struct is_a_helper<rtl_ssa::clobber_info *>
+  : static_is_a_helper<rtl_ssa::clobber_info *>
+{
+  static inline bool
+  test (const rtl_ssa::access_info *ref)
+  {
+    return ref->kind () == rtl_ssa::access_kind::CLOBBER;
+  }
+};
+
+template<>
+struct is_a_helper<rtl_ssa::use_info *>
+  : static_is_a_helper<rtl_ssa::use_info *>
+{
+  static inline bool
+  test (const rtl_ssa::access_info *ref)
+  {
+    return ref->kind () == rtl_ssa::access_kind::USE;
+  }
+};
+
+template<>
+struct is_a_helper<rtl_ssa::set_info *>
+  : static_is_a_helper<rtl_ssa::set_info *>
+{
+  static inline bool
+  test (const rtl_ssa::access_info *ref)
+  {
+    return (ref->kind () == rtl_ssa::access_kind::SET
+	    || ref->kind () == rtl_ssa::access_kind::PHI);
+  }
+};
+
+template<>
+struct is_a_helper<rtl_ssa::phi_info *>
+  : static_is_a_helper<rtl_ssa::phi_info *>
+{
+  static inline bool
+  test (const rtl_ssa::access_info *ref)
+  {
+    return ref->kind () == rtl_ssa::access_kind::PHI;
+  }
+};
+
+template<>
+struct is_a_helper<rtl_ssa::set_node *>
+  : static_is_a_helper<rtl_ssa::set_node *>
+{
+  static inline bool
+  test (const rtl_ssa::def_node *node)
+  {
+    return node->contains_set ();
+  }
+};
+
+template<>
+struct is_a_helper<rtl_ssa::clobber_group *>
+  : static_is_a_helper<rtl_ssa::clobber_group *>
+{
+  static inline bool
+  test (const rtl_ssa::def_node *node)
+  {
+    return node->contains_clobber ();
+  }
+};
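+
+// With these helpers in place, code can use the generic accessors on
+// RTL SSA objects, for example:
+//
+//   if (auto *set = dyn_cast<rtl_ssa::set_info *> (access))
+//     ...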
diff --git a/gcc/rtl-ssa/member-fns.inl b/gcc/rtl-ssa/member-fns.inl
new file mode 100644
index 00000000000..4f3bb2e1518
--- /dev/null
+++ b/gcc/rtl-ssa/member-fns.inl
@@ -0,0 +1,928 @@
+// Implementation of public inline member functions for RTL SSA     -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// This file contains inline implementations of public member functions that
+// are too large to be written in the class definition.  It also contains
+// some non-inline template definitions of public member functions.
+// See the comments above the function declarations for details.
+//
+// The file also contains the bare minimum of private and protected inline
+// member functions that are needed to make the public functions compile.
+namespace rtl_ssa {
+
+inline void
+access_array_builder::reserve (unsigned int num_accesses)
+{
+  obstack_make_room (m_obstack, num_accesses * sizeof (access_info *));
+}
+
+inline void
+access_array_builder::quick_push (access_info *access)
+{
+  obstack_ptr_grow_fast (m_obstack, access);
+}
+
+inline array_slice<access_info *>
+access_array_builder::finish ()
+{
+  auto num_accesses = obstack_object_size (m_obstack) / sizeof (access_info *);
+  if (num_accesses == 0)
+    return {};
+
+  auto **base = static_cast<access_info **> (obstack_finish (m_obstack));
+  keep ();
+  return { base, num_accesses };
+}
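+
+// A sketch of typical usage of the builder above, assuming it is
+// constructed over an obstack as the member functions imply:
+//
+//   access_array_builder builder (&obstack);
+//   builder.reserve (num_accesses);
+//   // ... one quick_push per access ...
+//   array_slice<access_info *> accesses = builder.finish ();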
+
+inline bool
+access_info::is_set_with_nondebug_insn_uses () const
+{
+  return m_is_set_with_nondebug_insn_uses;
+}
+
+inline bool
+use_info::is_in_debug_insn () const
+{
+  return m_insn_or_phi.is_first () && m_is_in_debug_insn_or_phi;
+}
+
+inline bb_info *
+use_info::bb () const
+{
+  if (m_insn_or_phi.is_first ())
+    return m_insn_or_phi.known_first ()->bb ();
+  return m_insn_or_phi.known_second ()->bb ();
+}
+
+inline ebb_info *
+use_info::ebb () const
+{
+  return bb ()->ebb ();
+}
+
+inline use_info *
+use_info::prev_use () const
+{
+  return m_last_use_or_prev_use.second_or_null ();
+}
+
+inline use_info *
+use_info::next_use () const
+{
+  return m_last_nondebug_insn_use_or_next_use.second_or_null ();
+}
+
+inline bool
+use_info::is_first_use () const
+{
+  return m_last_use_or_prev_use.is_first ();
+}
+
+inline bool
+use_info::is_last_use () const
+{
+  return m_last_nondebug_insn_use_or_next_use.is_first ();
+}
+
+inline use_info *
+use_info::next_nondebug_insn_use () const
+{
+  if (m_is_last_nondebug_insn_use)
+    return nullptr;
+  return m_last_nondebug_insn_use_or_next_use.known_second ();
+}
+
+inline use_info *
+use_info::next_any_insn_use () const
+{
+  // This is used less often than next_nondebug_insn_use, so it doesn't
+  // seem worth having an m_is_last_nondebug_insn_use-style end marker.
+  if (use_info *use = next_use ())
+    if (use->is_in_any_insn ())
+      return use;
+  return nullptr;
+}
+
+inline use_info *
+use_info::prev_phi_use () const
+{
+  // This is used less often than next_nondebug_insn_use, so it doesn't
+  // seem worth having an m_is_last_nondebug_insn_use-style end marker.
+  if (use_info *use = prev_use ())
+    if (use->is_in_phi ())
+      return use;
+  return nullptr;
+}
+
+// Return the last use of any kind in the list.  Only valid when
+// is_first_use () is true.
+inline use_info *
+use_info::last_use () const
+{
+  return m_last_use_or_prev_use.known_first ();
+}
+
+// Return the last nondebug insn use in the list, or null if none.  Only valid
+// when is_last_use () is true.
+inline use_info *
+use_info::last_nondebug_insn_use () const
+{
+  return m_last_nondebug_insn_use_or_next_use.known_first ();
+}
+
+inline def_info *
+def_info::prev_def () const
+{
+  return m_last_def_or_prev_def.second_or_null ();
+}
+
+inline def_info *
+def_info::next_def () const
+{
+  return m_splay_root_or_next_def.second_or_null ();
+}
+
+inline bool
+def_info::is_first_def () const
+{
+  return m_last_def_or_prev_def.is_first ();
+}
+
+inline bool
+def_info::is_last_def () const
+{
+  return m_splay_root_or_next_def.is_first ();
+}
+
+inline bb_info *
+def_info::bb () const
+{
+  return m_insn->bb ();
+}
+
+inline ebb_info *
+def_info::ebb () const
+{
+  return m_insn->ebb ();
+}
+
+inline clobber_group *
+clobber_info::group () const
+{
+  if (!m_group || !m_group->has_been_superceded ())
+    return m_group;
+  return const_cast<clobber_info *> (this)->recompute_group ();
+}
+
+inline use_info *
+set_info::last_use () const
+{
+  return m_first_use ? m_first_use->last_use () : nullptr;
+}
+
+inline use_info *
+set_info::first_nondebug_insn_use () const
+{
+  if (m_is_set_with_nondebug_insn_uses)
+    return m_first_use;
+  return nullptr;
+}
+
+inline use_info *
+set_info::last_nondebug_insn_use () const
+{
+  if (m_is_set_with_nondebug_insn_uses)
+    return m_first_use->last_use ()->last_nondebug_insn_use ();
+  return nullptr;
+}
+
+inline use_info *
+set_info::first_any_insn_use () const
+{
+  if (m_first_use && m_first_use->is_in_any_insn ())
+    return m_first_use;
+  return nullptr;
+}
+
+inline use_info *
+set_info::last_phi_use () const
+{
+  if (m_first_use)
+    {
+      use_info *last = m_first_use->last_use ();
+      if (last->is_in_phi ())
+	return last;
+    }
+  return nullptr;
+}
+
+inline bool
+set_info::has_nondebug_uses () const
+{
+  return has_nondebug_insn_uses () || has_phi_uses ();
+}
+
+inline bool
+set_info::has_nondebug_insn_uses () const
+{
+  return m_is_set_with_nondebug_insn_uses;
+}
+
+inline bool
+set_info::has_phi_uses () const
+{
+  return m_first_use && m_first_use->last_use ()->is_in_phi ();
+}
+
+inline bool
+set_info::is_local_to_ebb () const
+{
+  if (!m_first_use)
+    return true;
+
+  use_info *last = m_first_use->last_use ();
+  if (last->is_in_phi ())
+    return false;
+
+  last = last->last_nondebug_insn_use ();
+  return !last || last->ebb () == ebb ();
+}
+
+inline iterator_range<use_iterator>
+set_info::all_uses () const
+{
+  return { m_first_use, nullptr };
+}
+
+inline iterator_range<reverse_use_iterator>
+set_info::reverse_all_uses () const
+{
+  return { last_use (), nullptr };
+}
+
+inline iterator_range<nondebug_insn_use_iterator>
+set_info::nondebug_insn_uses () const
+{
+  return { first_nondebug_insn_use (), nullptr };
+}
+
+inline iterator_range<reverse_use_iterator>
+set_info::reverse_nondebug_insn_uses () const
+{
+  return { last_nondebug_insn_use (), nullptr };
+}
+
+inline iterator_range<any_insn_use_iterator>
+set_info::all_insn_uses () const
+{
+  return { first_any_insn_use (), nullptr };
+}
+
+inline iterator_range<phi_use_iterator>
+set_info::phi_uses () const
+{
+  return { last_phi_use (), nullptr };
+}
+
+inline use_array
+phi_info::inputs () const
+{
+  if (m_num_inputs == 1)
+    return use_array (&m_single_input, 1);
+  return use_array (m_inputs, m_num_inputs);
+}
+
+inline use_info *
+phi_info::input_use (unsigned int i) const
+{
+  if (m_num_inputs == 1)
+    return as_a<use_info *> (m_single_input);
+  return as_a<use_info *> (m_inputs[i]);
+}
+
+inline set_info *
+phi_info::input_value (unsigned int i) const
+{
+  return input_use (i)->def ();
+}
+
+inline def_info *
+def_node::first_def () const
+{
+  // This should get optimized into an AND with -2.
+  if (m_clobber_or_set.is_first ())
+    return m_clobber_or_set.known_first ();
+  return m_clobber_or_set.known_second ();
+}
+
+inline clobber_info *
+clobber_group::first_clobber () const
+{
+  return m_clobber_or_set.known_first ();
+}
+
+inline iterator_range<def_iterator>
+clobber_group::clobbers () const
+{
+  return { first_clobber (), m_last_clobber->next_def () };
+}
+
+inline def_info *
+def_mux::first_def () const
+{
+  if (is_first ())
+    return known_first ();
+  return known_second ()->first_def ();
+}
+
+inline def_info *
+def_mux::last_def () const
+{
+  if (is_first ())
+    return known_first ();
+
+  def_node *node = known_second ();
+  if (auto *clobber = ::dyn_cast<clobber_group *> (node))
+    return clobber->last_clobber ();
+
+  return node->first_def ();
+}
+
+inline set_info *
+def_mux::set () const
+{
+  if (is_first ())
+    return ::safe_dyn_cast<set_info *> (known_first ());
+  return ::dyn_cast<set_info *> (known_second ()->first_def ());
+}
+
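+// In the functions below, COMPARISON records how the lookup point
+// relates to the definitions in MUX: greater than zero if the lookup
+// point comes after them, less than zero if it comes before them, and
+// zero if MUX contains a definition at the lookup point itself.
+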
+inline def_info *
+def_lookup::prev_def () const
+{
+  if (!mux)
+    return nullptr;
+
+  if (comparison > 0)
+    return mux.last_def ();
+
+  return mux.first_def ()->prev_def ();
+}
+
+inline def_info *
+def_lookup::next_def () const
+{
+  if (!mux)
+    return nullptr;
+
+  if (comparison < 0)
+    return mux.first_def ();
+
+  return mux.last_def ()->next_def ();
+}
+
+inline set_info *
+def_lookup::matching_set () const
+{
+  if (comparison == 0)
+    return mux.set ();
+  return nullptr;
+}
+
+inline def_info *
+def_lookup::matching_or_prev_def () const
+{
+  if (set_info *set = matching_set ())
+    return set;
+  return prev_def ();
+}
+
+inline def_info *
+def_lookup::matching_or_next_def () const
+{
+  if (set_info *set = matching_set ())
+    return set;
+  return next_def ();
+}
+
+inline insn_note::insn_note (insn_note_kind kind)
+  : m_next_note (nullptr),
+    m_kind (kind),
+    m_data8 (0),
+    m_data16 (0),
+    m_data32 (0)
+{
+}
+
+template<typename T>
+inline T
+insn_note::as_a ()
+{
+  using deref_type = decltype (*std::declval<T> ());
+  using derived = typename std::remove_reference<deref_type>::type;
+  gcc_checking_assert (m_kind == derived::kind);
+  return static_cast<T> (this);
+}
+
+template<typename T>
+inline T
+insn_note::dyn_cast ()
+{
+  using deref_type = decltype (*std::declval<T> ());
+  using derived = typename std::remove_reference<deref_type>::type;
+  if (m_kind == derived::kind)
+    return static_cast<T> (this);
+  return nullptr;
+}
+
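+// Compare two instructions by their position in the function.
+// Instructions with different point numbers can be compared directly;
+// instructions that share a point number (for example because one was
+// inserted later without a renumbering) need the slower tie-breaking
+// comparison.
+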
+inline bool
+insn_info::operator< (const insn_info &other) const
+{
+  if (this == &other)
+    return false;
+
+  if (__builtin_expect (m_point != other.m_point, 1))
+    return m_point < other.m_point;
+
+  return slow_compare_with (other) < 0;
+}
+
+inline bool
+insn_info::operator> (const insn_info &other) const
+{
+  return other < *this;
+}
+
+inline bool
+insn_info::operator<= (const insn_info &other) const
+{
+  return !(other < *this);
+}
+
+inline bool
+insn_info::operator>= (const insn_info &other) const
+{
+  return !(*this < other);
+}
+
+inline int
+insn_info::compare_with (const insn_info *other) const
+{
+  if (this == other)
+    return 0;
+
+  if (__builtin_expect (m_point != other->m_point, 1))
+    // Assume that points remain in [0, INT_MAX].
+    return m_point - other->m_point;
+
+  return slow_compare_with (*other);
+}
+
+inline insn_info *
+insn_info::prev_nondebug_insn () const
+{
+  gcc_checking_assert (!is_debug_insn ());
+  return m_prev_insn_or_last_debug_insn.known_first ();
+}
+
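+// Return the next nondebug instruction, skipping over any subsequence
+// of debug instructions that immediately follows this one.
+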
+inline insn_info *
+insn_info::next_nondebug_insn () const
+{
+  gcc_checking_assert (!is_debug_insn ());
+  const insn_info *from = this;
+  if (insn_info *first_debug = m_next_nondebug_or_debug_insn.second_or_null ())
+    from = first_debug->last_debug_insn ();
+  return from->m_next_nondebug_or_debug_insn.known_first ();
+}
+
+inline insn_info *
+insn_info::prev_any_insn () const
+{
+  const insn_info *from = this;
+  if (insn_info *last_debug = m_prev_insn_or_last_debug_insn.second_or_null ())
+    // This instruction is the first in a subsequence of debug instructions.
+    // Move to the following nondebug instruction.
+    from = last_debug->m_next_nondebug_or_debug_insn.known_first ();
+  return from->m_prev_insn_or_last_debug_insn.known_first ();
+}
+
+inline insn_info *
+insn_info::next_any_insn () const
+{
+  // This should get optimized into an AND with -2.
+  if (m_next_nondebug_or_debug_insn.is_first ())
+    return m_next_nondebug_or_debug_insn.known_first ();
+  return m_next_nondebug_or_debug_insn.known_second ();
+}
+
+inline bool
+insn_info::is_phi () const
+{
+  return this == ebb ()->phi_insn ();
+}
+
+inline bool
+insn_info::is_bb_head () const
+{
+  return this == m_bb->head_insn ();
+}
+
+inline bool
+insn_info::is_bb_end () const
+{
+  return this == m_bb->end_insn ();
+}
+
+inline ebb_info *
+insn_info::ebb () const
+{
+  return m_bb->ebb ();
+}
+
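+// Artificial instructions store a negative uid directly; real
+// instructions use the uid of their rtl.
+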
+inline int
+insn_info::uid () const
+{
+  return m_cost_or_uid < 0 ? m_cost_or_uid : INSN_UID (m_rtl);
+}
+
+inline use_array
+insn_info::uses () const
+{
+  return use_array (m_accesses + m_num_defs, m_num_uses);
+}
+
+inline bool
+insn_info::has_call_clobbers () const
+{
+  return find_note<insn_call_clobbers_note> ();
+}
+
+inline def_array
+insn_info::defs () const
+{
+  return def_array (m_accesses, m_num_defs);
+}
+
+inline unsigned int
+insn_info::cost () const
+{
+  if (m_cost_or_uid < 0)
+    return 0;
+  if (m_cost_or_uid == UNKNOWN_COST)
+    calculate_cost ();
+  return m_cost_or_uid;
+}
+
+template<typename T>
+inline const T *
+insn_info::find_note () const
+{
+  // We could break if the note kind is > T::kind, but since the number
+  // of notes should be very small, the check is unlikely to pay for itself.
+  for (const insn_note *note = first_note (); note; note = note->next_note ())
+    if (note->kind () == T::kind)
+      return static_cast<const T *> (note);
+  return nullptr;
+}
+
+// Only valid for debug instructions that come after a nondebug instruction,
+// and so start a subsequence of debug instructions.  Return the last debug
+// instruction in the subsequence.
+inline insn_info *
+insn_info::last_debug_insn () const
+{
+  return m_prev_insn_or_last_debug_insn.known_second ();
+}
+
+inline insn_range_info::insn_range_info (insn_info *first, insn_info *last)
+  : first (first), last (last)
+{
+}
+
+inline bool
+insn_range_info::operator== (const insn_range_info &other) const
+{
+  return first == other.first && last == other.last;
+}
+
+inline bool
+insn_range_info::operator!= (const insn_range_info &other) const
+{
+  return first != other.first || last != other.last;
+}
+
+inline insn_info *
+insn_range_info::singleton () const
+{
+  return first == last ? last : nullptr;
+}
+
+inline bool
+insn_range_info::includes (insn_info *insn) const
+{
+  return *insn >= *first && *insn <= *last;
+}
+
+inline insn_info *
+insn_range_info::clamp_insn_to_range (insn_info *insn) const
+{
+  if (*first > *insn)
+    return first;
+  if (*last < *insn)
+    return last;
+  return insn;
+}
+
+inline bool
+insn_range_info::is_subrange_of (const insn_range_info &other) const
+{
+  return *first >= *other.first && *last <= *other.last;
+}
+
+inline iterator_range<any_insn_iterator>
+bb_info::all_insns () const
+{
+  return { m_head_insn, m_end_insn->next_any_insn () };
+}
+
+inline iterator_range<reverse_any_insn_iterator>
+bb_info::reverse_all_insns () const
+{
+  return { m_end_insn, m_head_insn->prev_any_insn () };
+}
+
+inline iterator_range<nondebug_insn_iterator>
+bb_info::nondebug_insns () const
+{
+  return { m_head_insn, m_end_insn->next_nondebug_insn () };
+}
+
+inline iterator_range<reverse_nondebug_insn_iterator>
+bb_info::reverse_nondebug_insns () const
+{
+  return { m_end_insn, m_head_insn->prev_nondebug_insn () };
+}
+
+inline iterator_range<any_insn_iterator>
+bb_info::real_insns () const
+{
+  return { m_head_insn->next_any_insn (), m_end_insn };
+}
+
+inline iterator_range<reverse_any_insn_iterator>
+bb_info::reverse_real_insns () const
+{
+  return { m_end_insn->prev_any_insn (), m_head_insn };
+}
+
+inline iterator_range<nondebug_insn_iterator>
+bb_info::real_nondebug_insns () const
+{
+  return { m_head_insn->next_nondebug_insn (), m_end_insn };
+}
+
+inline iterator_range<reverse_nondebug_insn_iterator>
+bb_info::reverse_real_nondebug_insns () const
+{
+  return { m_end_insn->prev_nondebug_insn (), m_head_insn };
+}
+
+inline bool
+ebb_call_clobbers_info::clobbers (resource_info resource) const
+{
+  // Only register clobbers are tracked this way.  Other clobbers are
+  // recorded explicitly.
+  return (resource.is_reg ()
+	  && m_abi->clobbers_reg_p (resource.mode, resource.regno));
+}
+
+inline ebb_info *
+ebb_info::prev_ebb () const
+{
+  if (bb_info *prev_bb = m_first_bb->prev_bb ())
+    return prev_bb->ebb ();
+  return nullptr;
+}
+
+inline ebb_info *
+ebb_info::next_ebb () const
+{
+  if (bb_info *next_bb = m_last_bb->next_bb ())
+    return next_bb->ebb ();
+  return nullptr;
+}
+
+inline iterator_range<phi_iterator>
+ebb_info::phis () const
+{
+  return { m_first_phi, nullptr };
+}
+
+inline iterator_range<bb_iterator>
+ebb_info::bbs () const
+{
+  return { m_first_bb, m_last_bb->next_bb () };
+}
+
+inline iterator_range<reverse_bb_iterator>
+ebb_info::reverse_bbs () const
+{
+  return { m_last_bb, m_first_bb->prev_bb () };
+}
+
+inline iterator_range<any_insn_iterator>
+ebb_info::all_insns () const
+{
+  return { m_phi_insn, m_last_bb->end_insn ()->next_any_insn () };
+}
+
+inline iterator_range<reverse_any_insn_iterator>
+ebb_info::reverse_all_insns () const
+{
+  return { m_last_bb->end_insn (), m_phi_insn->prev_any_insn () };
+}
+
+inline iterator_range<nondebug_insn_iterator>
+ebb_info::nondebug_insns () const
+{
+  return { m_phi_insn, m_last_bb->end_insn ()->next_nondebug_insn () };
+}
+
+inline iterator_range<reverse_nondebug_insn_iterator>
+ebb_info::reverse_nondebug_insns () const
+{
+  return { m_last_bb->end_insn (), m_phi_insn->prev_nondebug_insn () };
+}
+
+inline insn_range_info
+ebb_info::insn_range () const
+{
+  return { m_phi_insn, m_last_bb->end_insn () };
+}
+
+inline void
+ebb_info::set_first_call_clobbers (ebb_call_clobbers_info *call_clobbers)
+{
+  m_first_call_clobbers = call_clobbers;
+}
+
+inline ebb_call_clobbers_info *
+ebb_info::first_call_clobbers () const
+{
+  return m_first_call_clobbers;
+}
+
+inline iterator_range<ebb_call_clobbers_iterator>
+ebb_info::call_clobbers () const
+{
+  return { m_first_call_clobbers, nullptr };
+}
+
+inline insn_change::insn_change (insn_info *insn)
+  : m_insn (insn),
+    new_defs (insn->defs ()),
+    new_uses (insn->uses ()),
+    move_range (insn),
+    new_cost (UNKNOWN_COST),
+    m_is_deletion (false)
+{
+}
+
+inline insn_change::insn_change (insn_info *insn, delete_action)
+  : m_insn (insn),
+    new_defs (),
+    new_uses (),
+    move_range (insn),
+    new_cost (0),
+    m_is_deletion (true)
+{
+}
+
+inline insn_is_changing_closure::
+insn_is_changing_closure (array_slice<insn_change *const> changes)
+  : m_changes (changes)
+{
+}
+
+inline bool
+insn_is_changing_closure::operator() (const insn_info *insn) const
+{
+  for (const insn_change *change : m_changes)
+    if (change->insn () == insn)
+      return true;
+  return false;
+}
+
+inline iterator_range<bb_iterator>
+function_info::bbs () const
+{
+  return { m_first_bb, nullptr };
+}
+
+inline iterator_range<reverse_bb_iterator>
+function_info::reverse_bbs () const
+{
+  return { m_last_bb, nullptr };
+}
+
+inline iterator_range<ebb_iterator>
+function_info::ebbs () const
+{
+  return { m_first_bb->ebb (), nullptr };
+}
+
+inline iterator_range<reverse_ebb_iterator>
+function_info::reverse_ebbs () const
+{
+  return { m_last_bb->ebb (), nullptr };
+}
+
+inline iterator_range<any_insn_iterator>
+function_info::all_insns () const
+{
+  return { m_first_insn, nullptr };
+}
+
+inline iterator_range<reverse_any_insn_iterator>
+function_info::reverse_all_insns () const
+{
+  return { m_last_insn, nullptr };
+}
+
+inline iterator_range<nondebug_insn_iterator>
+function_info::nondebug_insns () const
+{
+  return { m_first_insn, nullptr };
+}
+
+inline iterator_range<reverse_nondebug_insn_iterator>
+function_info::reverse_nondebug_insns () const
+{
+  return { m_last_nondebug_insn, nullptr };
+}
+
+inline iterator_range<def_iterator>
+function_info::mem_defs () const
+{
+  return { m_defs[0], nullptr };
+}
+
+inline iterator_range<def_iterator>
+function_info::ref_defs (unsigned int regno) const
+{
+  return { m_defs[regno + 1], nullptr };
+}
+
+inline set_info *
+function_info::single_dominating_def (unsigned int regno) const
+{
+  if (set_info *set = safe_dyn_cast<set_info *> (m_defs[regno + 1]))
+    if (is_single_dominating_def (set))
+      return set;
+  return nullptr;
+}
+
+template<typename IgnorePredicate>
+bool
+function_info::add_regno_clobber (obstack_watermark &watermark,
+				  insn_change &change, unsigned int regno,
+				  IgnorePredicate ignore)
+{
+  // Check whether CHANGE already clobbers REGNO.
+  if (find_access (change.new_defs, regno))
+    return true;
+
+  // Get the closest position to INSN at which the new instruction
+  // could be placed.
+  insn_info *insn = change.move_range.clamp_insn_to_range (change.insn ());
+  def_array new_defs = insert_temp_clobber (watermark, insn, regno,
+					    change.new_defs);
+  if (!new_defs.is_valid ())
+    return false;
+
+  // Narrow the move range to positions at which REGNO can safely
+  // be clobbered.
+  insn_range_info move_range = change.move_range;
+  if (!restrict_movement_for_dead_range (move_range, regno, insn, ignore))
+    return false;
+
+  change.new_defs = new_defs;
+  change.move_range = move_range;
+  return true;
+}
+
+}
diff --git a/gcc/rtl-ssa/movement.h b/gcc/rtl-ssa/movement.h
new file mode 100644
index 00000000000..3b0cbf9d411
--- /dev/null
+++ b/gcc/rtl-ssa/movement.h
@@ -0,0 +1,335 @@
+// RTL SSA utilities relating to instruction movement               -*- C++ -*-
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+namespace rtl_ssa {
+
+// Restrict movement range RANGE so that the instruction is placed later
+// than INSN.  (The movement range is the range of instructions after which
+// an instruction can be placed.)
+inline insn_range_info
+move_later_than (insn_range_info range, insn_info *insn)
+{
+  return { later_insn (range.first, insn), range.last };
+}
+
+// Restrict movement range RANGE so that the instruction is placed no earlier
+// than INSN.  (The movement range is the range of instructions after which
+// an instruction can be placed.)
+inline insn_range_info
+move_no_earlier_than (insn_range_info range, insn_info *insn)
+{
+  insn_info *first = later_insn (range.first, insn->prev_nondebug_insn ());
+  return { first, range.last };
+}
+
+// Restrict movement range RANGE so that the instruction is placed no later
+// than INSN.  (The movement range is the range of instructions after which
+// an instruction can be placed.)
+inline insn_range_info
+move_no_later_than (insn_range_info range, insn_info *insn)
+{
+  return { range.first, earlier_insn (range.last, insn) };
+}
+
+// Restrict movement range RANGE so that the instruction is placed earlier
+// than INSN.  (The movement range is the range of instructions after which
+// an instruction can be placed.)
+inline insn_range_info
+move_earlier_than (insn_range_info range, insn_info *insn)
+{
+  insn_info *last = earlier_insn (range.last, insn->prev_nondebug_insn ());
+  return { range.first, last };
+}
+
+// Return true if it is possible to insert a new instruction after INSN.
+inline bool
+can_insert_after (insn_info *insn)
+{
+  return insn->is_bb_head () || (insn->is_real () && !insn->is_jump ());
+}
+
+// Try to restrict move range MOVE_RANGE so that it is possible to
+// insert INSN after both of the end points.  Return true on success,
+// otherwise leave MOVE_RANGE in an invalid state.
+inline bool
+canonicalize_move_range (insn_range_info &move_range, insn_info *insn)
+{
+  while (move_range.first != insn && !can_insert_after (move_range.first))
+    move_range.first = move_range.first->next_nondebug_insn ();
+  while (move_range.last != insn && !can_insert_after (move_range.last))
+    move_range.last = move_range.last->prev_nondebug_insn ();
+  return bool (move_range);
+}
+
+// Try to restrict movement range MOVE_RANGE of INSN so that it can set
+// or clobber REGNO.  Assume that if:
+//
+// - an instruction I2 contains another access A to REGNO; and
+// - IGNORE (I2) is true
+//
+// then either:
+//
+// - A will be removed; or
+// - something will ensure that the new definition of REGNO does not
+//   interfere with A, without this having to be enforced by I1's move range.
+//
+// Return true on success, otherwise leave MOVE_RANGE in an invalid state.
+//
+// This function only works correctly for instructions that remain within
+// the same extended basic block.
+template<typename IgnorePredicate>
+bool
+restrict_movement_for_dead_range (insn_range_info &move_range,
+				  unsigned int regno, insn_info *insn,
+				  IgnorePredicate ignore)
+{
+  // Find a definition at or neighboring INSN.
+  resource_info resource = full_register (regno);
+  def_lookup dl = crtl->ssa->find_def (resource, insn);
+
+  def_info *prev = dl.prev_def ();
+  ebb_info *ebb = insn->ebb ();
+  if (!prev || prev->ebb () != ebb)
+    {
+      // REGNO is not defined or used in EBB before INSN, but it
+      // might be live on entry.  To keep complexity under control,
+      // handle only these cases:
+      //
+      // - If the register is not live on entry to EBB, the register is
+      //   free from the start of EBB to the first definition in EBB.
+      //
+      // - Otherwise, if the register is live on entry to BB, refuse
+      //   to allocate the register.  We could in principle try to move
+      //   the instruction to later blocks in the EBB, but it's rarely
+      //   worth the effort, and could lead to linear complexity.
+      //
+      // - Otherwise, don't allow INSN to move earlier than its current
+      //   block.  Again, we could in principle look backwards to find where
+      //   REGNO dies, but it's rarely worth the effort.
+      bb_info *bb = insn->bb ();
+      insn_info *limit;
+      if (!bitmap_bit_p (DF_LR_IN (ebb->first_bb ()->cfg_bb ()), regno))
+	limit = ebb->phi_insn ();
+      else if (bitmap_bit_p (DF_LR_IN (bb->cfg_bb ()), regno))
+	return false;
+      else
+	limit = bb->head_insn ();
+      move_range = move_later_than (move_range, limit);
+    }
+  else
+    {
+      // Stop the instruction moving beyond the previous relevant access
+      // to REGNO.
+      access_info *prev_access
+	= last_access_ignoring (prev, ignore_clobbers::YES, ignore);
+      if (prev_access)
+	move_range = move_later_than (move_range, access_insn (prev_access));
+    }
+
+  // Stop the instruction moving beyond the next relevant definition of REGNO.
+  def_info *next = first_def_ignoring (dl.matching_or_next_def (),
+				       ignore_clobbers::YES, ignore);
+  if (next)
+    move_range = move_earlier_than (move_range, next->insn ());
+
+  return canonicalize_move_range (move_range, insn);
+}
+
+// Try to restrict movement range MOVE_RANGE so that it is possible for the
+// instruction being moved ("instruction I1") to perform all the definitions
+// in DEFS while still preserving dependencies between those definitions
+// and surrounding instructions.  Assume that if:
+//
+// - DEFS contains a definition D of resource R;
+// - an instruction I2 contains another access A to R; and
+// - IGNORE (I2) is true
+//
+// then either:
+//
+// - A will be removed; or
+// - something will ensure that D and A maintain their current order,
+//   without this having to be enforced by I1's move range.
+//
+// Return true on success, otherwise leave MOVE_RANGE in an invalid state.
+//
+// This function only works correctly for instructions that remain within
+// the same extended basic block.
+template<typename IgnorePredicate>
+bool
+restrict_movement_for_defs_ignoring (insn_range_info &move_range,
+				     def_array defs, IgnorePredicate ignore)
+{
+  for (def_info *def : defs)
+    {
+      // If the definition is a clobber, we can move it with respect
+      // to other clobbers.
+      //
+      // ??? We could also do this if a definition and all its uses
+      // are being moved at once.
+      bool is_clobber = is_a<clobber_info *> (def);
+
+      // Search back for the first unfiltered use or definition of the
+      // same resource.
+      access_info *access;
+      access = prev_access_ignoring (def, ignore_clobbers (is_clobber),
+				     ignore);
+      if (access)
+	move_range = move_later_than (move_range, access_insn (access));
+
+      // Search forward for the first unfiltered use of DEF,
+      // or the first unfiltered definition that follows DEF.
+      //
+      // We don't need to consider uses of following definitions, since
+      // if IGNORE (D->insn ()) is true for some definition D, the caller
+      // guarantees that either
+      //
+      // - D will be removed, and thus its uses will be removed; or
+      // - D will occur after DEF, and thus D's uses will also occur
+      //   after DEF.
+      //
+      // This is purely a simplification: we could also process D's uses,
+      // but we don't need to.
+      access = next_access_ignoring (def, ignore_clobbers (is_clobber),
+				     ignore);
+      if (access)
+	move_range = move_earlier_than (move_range, access_insn (access));
+
+      // If DEF sets a hard register, take any call clobbers
+      // into account.
+      unsigned int regno = def->regno ();
+      if (!HARD_REGISTER_NUM_P (regno) || is_clobber)
+	continue;
+
+      ebb_info *ebb = def->ebb ();
+      for (ebb_call_clobbers_info *call_group : ebb->call_clobbers ())
+	{
+	  if (!call_group->clobbers (def->resource ()))
+	    continue;
+
+	  // Exit now if we've already failed, so that the splay accesses
+	  // below aren't wasted work.
+	  if (!move_range)
+	    return false;
+
+	  insn_info *insn;
+	  insn = prev_call_clobbers_ignoring (*call_group, def->insn (),
+					      ignore);
+	  if (insn)
+	    move_range = move_later_than (move_range, insn);
+
+	  insn = next_call_clobbers_ignoring (*call_group, def->insn (),
+					      ignore);
+	  if (insn)
+	    move_range = move_earlier_than (move_range, insn);
+	}
+    }
+
+  // Make sure that we don't move stores between basic blocks, since we
+  // don't have enough information to tell whether it's safe.
+  if (def_info *def = memory_access (defs))
+    {
+      move_range = move_later_than (move_range, def->bb ()->head_insn ());
+      move_range = move_earlier_than (move_range, def->bb ()->end_insn ());
+    }
+
+  return bool (move_range);
+}
+
+// Like restrict_movement_for_defs_ignoring, but for the uses in USES.
+template<typename IgnorePredicate>
+bool
+restrict_movement_for_uses_ignoring (insn_range_info &move_range,
+				     use_array uses, IgnorePredicate ignore)
+{
+  for (const use_info *use : uses)
+    {
+      // Ignore uses of undefined values.
+      set_info *set = use->def ();
+      if (!set)
+	continue;
+
+      // Ignore uses by debug instructions.  Debug instructions are
+      // never supposed to move, and uses by debug instructions are
+      // never supposed to be transferred elsewhere, so we know that
+      // the caller must be changing the uses on the debug instruction
+      // and checking whether all new uses are available at the debug
+      // instruction's original location.
+      if (use->is_in_debug_insn ())
+	continue;
+
+      // If the used value is defined by an instruction I2 for which
+      // IGNORE (I2) is true, the caller guarantees that I2 will occur
+      // before the instruction being moved.  Otherwise, make sure that
+      // the use occurs after the definition.
+      insn_info *insn = set->insn ();
+      if (!ignore (insn))
+	move_range = move_later_than (move_range, insn);
+
+      // Search forward for the first unfiltered definition that follows SET.
+      //
+      // We don't need to consider the uses of these definitions, since
+      // if IGNORE (D->insn ()) is true for some definition D, the caller
+      // guarantees that either
+      //
+      // - D will be removed, and thus its uses will be removed; or
+      // - D will occur after USE, and thus D's uses will also occur
+      //   after USE.
+      //
+      // This is purely a simplification: we could also process D's uses,
+      // but we don't need to.
+      def_info *def;
+      def = first_def_ignoring (set->next_def (), ignore_clobbers::NO,
+				ignore);
+      if (def)
+	move_range = move_earlier_than (move_range, def->insn ());
+
+      // If USE uses a hard register, take any call clobbers into account too.
+      // SET will necessarily occur after any previous call clobber, so we
+      // only need to check for later clobbers.
+      unsigned int regno = use->regno ();
+      if (!HARD_REGISTER_NUM_P (regno))
+	continue;
+
+      ebb_info *ebb = use->ebb ();
+      for (ebb_call_clobbers_info *call_group : ebb->call_clobbers ())
+	{
+	  if (!call_group->clobbers (use->resource ()))
+	    continue;
+
+	  if (!move_range)
+	    return false;
+
+	  insn_info *insn = next_call_clobbers_ignoring (*call_group,
+							 use->insn (), ignore);
+	  if (insn)
+	    move_range = move_earlier_than (move_range, insn);
+	}
+    }
+
+  // Make sure that we don't move loads into an earlier basic block.
+  //
+  // ??? It would be good to relax this for loads that can be safely
+  // speculated.
+  if (use_info *use = memory_access (uses))
+    move_range = move_later_than (move_range, use->bb ()->head_insn ());
+
+  return bool (move_range);
+}
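+
+// A sketch of how a pass might combine the two helpers above when
+// validating an insn_change CHANGE against a predicate IGNORE:
+//
+//   if (!restrict_movement_for_defs_ignoring (change.move_range,
+//					       change.new_defs, ignore)
+//       || !restrict_movement_for_uses_ignoring (change.move_range,
+//						  change.new_uses, ignore))
+//     return false;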
+
+}
diff --git a/gcc/system.h b/gcc/system.h
index d4e29e53ee3..5470a21fb05 100644
--- a/gcc/system.h
+++ b/gcc/system.h
@@ -235,6 +235,9 @@ extern int errno;
 #ifdef INCLUDE_ARRAY
 # include <array>
 #endif
+#ifdef INCLUDE_FUNCTIONAL
+# include <functional>
+#endif
 # include <cstring>
 # include <new>
 # include <utility>
-- 
2.17.1


^ permalink raw reply	[flat|nested] 88+ messages in thread

* [PATCH 23/23] fwprop: Rewrite to use RTL SSA
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (21 preceding siblings ...)
  2020-11-13  8:23 ` [PATCH 22/23] Add rtl-ssa Richard Sandiford
@ 2020-11-13  8:24 ` Richard Sandiford
  2020-12-16  3:52   ` Jeff Law
  2020-11-25 19:58 ` [00/23] Make fwprop use an on-the-side RTL SSA representation Jeff Law
  23 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-13  8:24 UTC (permalink / raw)
  To: gcc-patches

This patch rewrites fwprop.c to use the RTL SSA framework.  It tries
as far as possible to mimic the old behaviour, even in cases where
that doesn't fit naturally with the new framework.  I've added ???
comments to mark those places, but I think “fixing” them should
be done separately to make bisection easier.

In particular:

* The old implementation iterated over uses, and after a successful
  substitution, the new insn's uses were added to the end of the list.
  The pass still processed those uses, but because it processed them at
  the end, it didn't fully optimise one instruction before propagating
  it into the next.

  The new version follows the same approach for comparison purposes,
  but I'd like to drop that as a follow-on patch.

* The old implementation operated on single use sites (DF_REF_LOCs).
  This doesn't work well for instructions with match_dups, where it's
  necessary to update both an operand and its dups at the same time.
  For example, attempting to substitute into a divmod instruction would
  fail because only the div or the mod side would be updated.

  The new version again follows this to some extent for comparison
  purposes (although not exactly).  Again I'd like to drop it as a
  follow-on patch.

  One difference is that if a register occurs in multiple MEM addresses
  in a set, the new version will try to update them all at once.  This is
  what causes the SVE ACLE st4* output to improve.
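
  To illustrate the divmod case above with a schematic pattern (an
  illustration only, not taken from any particular target):

    (define_insn "*divmodsi4"
      [(set (match_operand:SI 0 "register_operand" "=r")
	    (div:SI (match_operand:SI 1 "register_operand" "r")
		    (match_operand:SI 2 "register_operand" "r")))
       (set (match_operand:SI 3 "register_operand" "=r")
	    (mod:SI (match_dup 1) (match_dup 2)))]
      ""
      "...")

  Substituting into only the div half updates one copy of operand 1 but
  not the match_dup copy, so the modified pattern no longer matches and
  recog rejects it.  Updating an operand and all of its dups at once
  avoids this failure mode.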

Also, the old version didn't naturally guarantee termination (PR79405),
whereas the new one does.

gcc/
	* fwprop.c: Rewrite to use the RTL SSA framework.

gcc/testsuite/
	* gcc.dg/rtl/x86_64/test-return-const.c.before-fwprop.c: Don't
	expect insn updates to be deferred.
	* gcc.target/aarch64/sve/acle/asm/st4_s8.c: Expect the addition
	to be folded into the address.
	* gcc.target/aarch64/sve/acle/asm/st4_u8.c: Likewise.
---
 gcc/fwprop.c                                  | 1698 ++++++-----------
 .../test-return-const.c.before-fwprop.c       |    2 +-
 .../gcc.target/aarch64/sve/acle/asm/st4_s8.c  |    8 +-
 .../gcc.target/aarch64/sve/acle/asm/st4_u8.c  |    8 +-
 4 files changed, 561 insertions(+), 1155 deletions(-)

*** /tmp/9upGS6_fwprop.c	2020-11-13 08:23:52.837409271 +0000
--- gcc/fwprop.c	2020-11-13 08:05:06.490403698 +0000
***************
*** 18,49 ****
  along with GCC; see the file COPYING3.  If not see
  <http://www.gnu.org/licenses/>.  */
  
  #include "config.h"
  #include "system.h"
  #include "coretypes.h"
  #include "backend.h"
- #include "target.h"
  #include "rtl.h"
- #include "predict.h"
  #include "df.h"
! #include "memmodel.h"
! #include "tm_p.h"
! #include "insn-config.h"
! #include "emit-rtl.h"
! #include "recog.h"
  
  #include "sparseset.h"
  #include "cfgrtl.h"
  #include "cfgcleanup.h"
  #include "cfgloop.h"
  #include "tree-pass.h"
- #include "domwalk.h"
  #include "rtl-iter.h"
! 
  
  /* This pass does simple forward propagation and simplification when an
     operand of an insn can only come from a single def.  This pass uses
!    df.c, so it is global.  However, we only do limited analysis of
     available expressions.
  
     1) The pass tries to propagate the source of the def into the use,
--- 18,47 ----
  along with GCC; see the file COPYING3.  If not see
  <http://www.gnu.org/licenses/>.  */
  
+ #define ADD_NOTES 0
+ 
+ #define INCLUDE_ALGORITHM
+ #define INCLUDE_FUNCTIONAL
  #include "config.h"
  #include "system.h"
  #include "coretypes.h"
  #include "backend.h"
  #include "rtl.h"
  #include "df.h"
! #include "rtl-ssa.h"
  
  #include "sparseset.h"
+ #include "predict.h"
  #include "cfgrtl.h"
  #include "cfgcleanup.h"
  #include "cfgloop.h"
  #include "tree-pass.h"
  #include "rtl-iter.h"
! #include "target.h"
  
  /* This pass does simple forward propagation and simplification when an
     operand of an insn can only come from a single def.  This pass uses
!    RTL SSA, so it is global.  However, we only do limited analysis of
     available expressions.
  
     1) The pass tries to propagate the source of the def into the use,
***************
*** 60,68 ****
        (set (subreg:SI (reg:DI 120) 0) (const_int 0))
        (set (subreg:SI (reg:DI 120) 4) (const_int -1))
        (set (subreg:SI (reg:DI 122) 0)
!          (ior:SI (subreg:SI (reg:DI 119) 0) (subreg:SI (reg:DI 120) 0)))
        (set (subreg:SI (reg:DI 122) 4)
!          (ior:SI (subreg:SI (reg:DI 119) 4) (subreg:SI (reg:DI 120) 4)))
  
     can be simplified to the much simpler
  
--- 58,66 ----
        (set (subreg:SI (reg:DI 120) 0) (const_int 0))
        (set (subreg:SI (reg:DI 120) 4) (const_int -1))
        (set (subreg:SI (reg:DI 122) 0)
! 	 (ior:SI (subreg:SI (reg:DI 119) 0) (subreg:SI (reg:DI 120) 0)))
        (set (subreg:SI (reg:DI 122) 4)
! 	 (ior:SI (subreg:SI (reg:DI 119) 4) (subreg:SI (reg:DI 120) 4)))
  
     can be simplified to the much simpler
  
***************
*** 89,95 ****
       (set (reg:QI 120) (subreg:QI (reg:SI 118) 0))
       (set (reg:QI 121) (subreg:QI (reg:SI 119) 0))
       (set (reg:SI 122) (plus:SI (subreg:SI (reg:QI 120) 0)
!                                 (subreg:SI (reg:QI 121) 0)))
  
     are very common on machines that can only do word-sized operations.
     For each use of a paradoxical subreg (subreg:WIDER (reg:NARROW N) 0),
--- 87,93 ----
       (set (reg:QI 120) (subreg:QI (reg:SI 118) 0))
       (set (reg:QI 121) (subreg:QI (reg:SI 119) 0))
       (set (reg:SI 122) (plus:SI (subreg:SI (reg:QI 120) 0)
! 				(subreg:SI (reg:QI 121) 0)))
  
     are very common on machines that can only do word-sized operations.
     For each use of a paradoxical subreg (subreg:WIDER (reg:NARROW N) 0),
***************
*** 101,318 ****
       (set (reg:QI 121) (subreg:QI (reg:SI 119) 0))
       (set (reg:SI 122) (plus:SI (reg:SI 118) (reg:SI 119)))
  
!    where the first two insns are now dead.
! 
!    We used to use reaching definitions to find which uses have a
!    single reaching definition (sounds obvious...), but this is too
!    complex a problem in nasty testcases like PR33928.  Now we use the
!    multiple definitions problem in df-problems.c.  The similarity
!    between that problem and SSA form creation is taken further, in
!    that fwprop does a dominator walk to create its chains; however,
!    instead of creating a PHI function where multiple definitions meet
!    I just punt and record only singleton use-def chains, which is
!    all that is needed by fwprop.  */
  
  
  static int num_changes;
  
- static vec<df_ref> use_def_ref;
- static vec<df_ref> reg_defs;
- static vec<df_ref> reg_defs_stack;
- 
- /* The maximum number of propagations that are still allowed.  If we do
-    more propagations than originally we had uses, we must have ended up
-    in a propagation loop, as in PR79405.  Until the algorithm fwprop
-    uses can obviously not get into such loops we need a workaround like
-    this.  */
- static int propagations_left;
- 
- /* The MD bitmaps are trimmed to include only live registers to cut
-    memory usage on testcases like insn-recog.c.  Track live registers
-    in the basic block and do not perform forward propagation if the
-    destination is a dead pseudo occurring in a note.  */
- static bitmap local_md;
- static bitmap local_lr;
- 
- /* Return the only def in USE's use-def chain, or NULL if there is
-    more than one def in the chain.  */
- 
- static inline df_ref
- get_def_for_use (df_ref use)
- {
-   return use_def_ref[DF_REF_ID (use)];
- }
- 
- 
- /* Update the reg_defs vector with non-partial definitions in DEF_REC.
-    TOP_FLAG says which artificials uses should be used, when DEF_REC
-    is an artificial def vector.  LOCAL_MD is modified as after a
-    df_md_simulate_* function; we do more or less the same processing
-    done there, so we do not use those functions.  */
- 
- #define DF_MD_GEN_FLAGS \
- 	(DF_REF_PARTIAL | DF_REF_CONDITIONAL | DF_REF_MAY_CLOBBER)
- 
- static void
- process_defs (df_ref def, int top_flag)
- {
-   for (; def; def = DF_REF_NEXT_LOC (def))
-     {
-       df_ref curr_def = reg_defs[DF_REF_REGNO (def)];
-       unsigned int dregno;
- 
-       if ((DF_REF_FLAGS (def) & DF_REF_AT_TOP) != top_flag)
- 	continue;
- 
-       dregno = DF_REF_REGNO (def);
-       if (curr_def)
- 	reg_defs_stack.safe_push (curr_def);
-       else
- 	{
- 	  /* Do not store anything if "transitioning" from NULL to NULL.  But
-              otherwise, push a special entry on the stack to tell the
- 	     leave_block callback that the entry in reg_defs was NULL.  */
- 	  if (DF_REF_FLAGS (def) & DF_MD_GEN_FLAGS)
- 	    ;
- 	  else
- 	    reg_defs_stack.safe_push (def);
- 	}
- 
-       if (DF_REF_FLAGS (def) & DF_MD_GEN_FLAGS)
- 	{
- 	  bitmap_set_bit (local_md, dregno);
- 	  reg_defs[dregno] = NULL;
- 	}
-       else
- 	{
- 	  bitmap_clear_bit (local_md, dregno);
- 	  reg_defs[dregno] = def;
- 	}
-     }
- }
- 
- 
- /* Fill the use_def_ref vector with values for the uses in USE_REC,
-    taking reaching definitions info from LOCAL_MD and REG_DEFS.
-    TOP_FLAG says which artificials uses should be used, when USE_REC
-    is an artificial use vector.  */
- 
- static void
- process_uses (df_ref use, int top_flag)
- {
-   for (; use; use = DF_REF_NEXT_LOC (use))
-     if ((DF_REF_FLAGS (use) & DF_REF_AT_TOP) == top_flag)
-       {
-         unsigned int uregno = DF_REF_REGNO (use);
-         if (reg_defs[uregno]
- 	    && !bitmap_bit_p (local_md, uregno)
- 	    && bitmap_bit_p (local_lr, uregno))
- 	  use_def_ref[DF_REF_ID (use)] = reg_defs[uregno];
-       }
- }
- 
- class single_def_use_dom_walker : public dom_walker
- {
- public:
-   single_def_use_dom_walker (cdi_direction direction)
-     : dom_walker (direction) {}
-   virtual edge before_dom_children (basic_block);
-   virtual void after_dom_children (basic_block);
- };
- 
- edge
- single_def_use_dom_walker::before_dom_children (basic_block bb)
- {
-   int bb_index = bb->index;
-   class df_md_bb_info *md_bb_info = df_md_get_bb_info (bb_index);
-   class df_lr_bb_info *lr_bb_info = df_lr_get_bb_info (bb_index);
-   rtx_insn *insn;
- 
-   bitmap_copy (local_md, &md_bb_info->in);
-   bitmap_copy (local_lr, &lr_bb_info->in);
- 
-   /* Push a marker for the leave_block callback.  */
-   reg_defs_stack.safe_push (NULL);
- 
-   process_uses (df_get_artificial_uses (bb_index), DF_REF_AT_TOP);
-   process_defs (df_get_artificial_defs (bb_index), DF_REF_AT_TOP);
- 
-   /* We don't call df_simulate_initialize_forwards, as it may overestimate
-      the live registers if there are unused artificial defs.  We prefer
-      liveness to be underestimated.  */
- 
-   FOR_BB_INSNS (bb, insn)
-     if (INSN_P (insn))
-       {
-         unsigned int uid = INSN_UID (insn);
-         process_uses (DF_INSN_UID_USES (uid), 0);
-         process_uses (DF_INSN_UID_EQ_USES (uid), 0);
-         process_defs (DF_INSN_UID_DEFS (uid), 0);
- 	df_simulate_one_insn_forwards (bb, insn, local_lr);
-       }
- 
-   process_uses (df_get_artificial_uses (bb_index), 0);
-   process_defs (df_get_artificial_defs (bb_index), 0);
- 
-   return NULL;
- }
- 
- /* Pop the definitions created in this basic block when leaving its
-    dominated parts.  */
- 
- void
- single_def_use_dom_walker::after_dom_children (basic_block bb ATTRIBUTE_UNUSED)
- {
-   df_ref saved_def;
-   while ((saved_def = reg_defs_stack.pop ()) != NULL)
-     {
-       unsigned int dregno = DF_REF_REGNO (saved_def);
- 
-       /* See also process_defs.  */
-       if (saved_def == reg_defs[dregno])
- 	reg_defs[dregno] = NULL;
-       else
- 	reg_defs[dregno] = saved_def;
-     }
- }
- 
- 
- /* Build a vector holding the reaching definitions of uses reached by a
-    single dominating definition.  */
- 
- static void
- build_single_def_use_links (void)
- {
-   /* We use the multiple definitions problem to compute our restricted
-      use-def chains.  */
-   df_set_flags (DF_EQ_NOTES);
-   df_md_add_problem ();
-   df_note_add_problem ();
-   df_analyze ();
-   df_maybe_reorganize_use_refs (DF_REF_ORDER_BY_INSN_WITH_NOTES);
- 
-   use_def_ref.create (DF_USES_TABLE_SIZE ());
-   use_def_ref.safe_grow_cleared (DF_USES_TABLE_SIZE (), true);
- 
-   reg_defs.create (max_reg_num ());
-   reg_defs.safe_grow_cleared (max_reg_num (), true);
- 
-   reg_defs_stack.create (n_basic_blocks_for_fn (cfun) * 10);
-   local_md = BITMAP_ALLOC (NULL);
-   local_lr = BITMAP_ALLOC (NULL);
- 
-   /* Walk the dominator tree looking for single reaching definitions
-      dominating the uses.  This is similar to how SSA form is built.  */
-   single_def_use_dom_walker (CDI_DOMINATORS)
-     .walk (cfun->cfg->x_entry_block_ptr);
- 
-   BITMAP_FREE (local_lr);
-   BITMAP_FREE (local_md);
-   reg_defs.release ();
-   reg_defs_stack.release ();
- }
- 
- \f
  /* Do not try to replace constant addresses or addresses of local and
     argument slots.  These MEM expressions are made only once and inserted
     in many instructions, as well as being used to control symbol table
--- 99,110 ----
       (set (reg:QI 121) (subreg:QI (reg:SI 119) 0))
       (set (reg:SI 122) (plus:SI (reg:SI 118) (reg:SI 119)))
  
!    where the first two insns are now dead.  */
  
+ using namespace rtl_ssa;
  
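+ /* The number of changes made by the pass so far.  */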
  static int num_changes;
  
  /* Do not try to replace constant addresses or addresses of local and
     argument slots.  These MEM expressions are made only once and inserted
     in many instructions, as well as being used to control symbol table
***************
*** 342,1114 ****
  	      && REGNO (reg) != ARG_POINTER_REGNUM));
  }
  
! /* Return a canonical version of X for use as an address, in the sense
!    that all multiplications are represented as MULT rather than as an
!    ASHIFT by a power of 2.
! 
!    Every ASHIFT we find has been made by simplify_gen_binary and was not
!    there before, so it is not shared.  So we can do this in place.  */
! 
! static void
! canonicalize_address (rtx x)
! {
!   for (;;)
!     switch (GET_CODE (x))
!       {
!       case ASHIFT:
!         if (CONST_INT_P (XEXP (x, 1))
! 	    && INTVAL (XEXP (x, 1)) < GET_MODE_UNIT_BITSIZE (GET_MODE (x))
! 	    && INTVAL (XEXP (x, 1)) >= 0)
! 	  {
! 	    HOST_WIDE_INT shift = INTVAL (XEXP (x, 1));
! 	    PUT_CODE (x, MULT);
! 	    XEXP (x, 1) = gen_int_mode (HOST_WIDE_INT_1 << shift,
! 					GET_MODE (x));
! 	  }
! 
! 	x = XEXP (x, 0);
!         break;
! 
!       case PLUS:
!         if (GET_CODE (XEXP (x, 0)) == PLUS
! 	    || GET_CODE (XEXP (x, 0)) == ASHIFT
! 	    || GET_CODE (XEXP (x, 0)) == CONST)
! 	  canonicalize_address (XEXP (x, 0));
! 
! 	x = XEXP (x, 1);
!         break;
! 
!       case CONST:
! 	x = XEXP (x, 0);
!         break;
! 
!       default:
!         return;
!       }
! }
! 
! /* OLD is a memory address.  Return whether it is good to use NEW instead,
!    for a memory access in the given MODE.  */
  
  static bool
! should_replace_address (rtx old_rtx, rtx new_rtx, machine_mode mode,
! 			addr_space_t as, bool speed)
  {
    int gain;
  
-   if (rtx_equal_p (old_rtx, new_rtx)
-       || !memory_address_addr_space_p (mode, new_rtx, as))
-     return false;
- 
-   /* Copy propagation is always ok.  */
-   if (REG_P (old_rtx) && REG_P (new_rtx))
-     return true;
- 
    /* Prefer the new address if it is less expensive.  */
!   gain = (address_cost (old_rtx, mode, as, speed)
! 	  - address_cost (new_rtx, mode, as, speed));
  
    /* If the addresses have equivalent cost, prefer the new address
       if it has the highest `set_src_cost'.  That has the potential of
       eliminating the most insns without additional costs, and it
       is the same that cse.c used to do.  */
    if (gain == 0)
!     gain = (set_src_cost (new_rtx, VOIDmode, speed)
! 	    - set_src_cost (old_rtx, VOIDmode, speed));
  
    return (gain > 0);
  }
  
  
! /* Flags for the last parameter of propagate_rtx_1.  */
! 
! enum {
!   /* If PR_CAN_APPEAR is true, propagate_rtx_1 always returns true;
!      if it is false, propagate_rtx_1 returns false if, for at least
!      one occurrence OLD, it failed to collapse the result to a constant.
!      For example, (mult:M (reg:M A) (minus:M (reg:M B) (reg:M A))) may
!      collapse to zero if replacing (reg:M B) with (reg:M A).
! 
!      PR_CAN_APPEAR is disregarded inside MEMs: in that case,
!      propagate_rtx_1 just tries to make cheaper and valid memory
!      addresses.  */
!   PR_CAN_APPEAR = 1,
! 
!   /* If PR_HANDLE_MEM is not set, propagate_rtx_1 won't attempt any replacement
!      outside memory addresses.  This is needed because propagate_rtx_1 does
!      not do any analysis on memory; thus it is very conservative and in general
!      it will fail if non-read-only MEMs are found in the source expression.
! 
!      PR_HANDLE_MEM is set when the source of the propagation was not
!      another MEM.  Then, it is safe not to treat non-read-only MEMs as
!      ``opaque'' objects.  */
!   PR_HANDLE_MEM = 2,
! 
!   /* Set when costs should be optimized for speed.  */
!   PR_OPTIMIZE_FOR_SPEED = 4
! };
  
! /* Check that X has a single def.  */
  
! static bool
! reg_single_def_p (rtx x)
! {
!   if (!REG_P (x))
!     return false;
  
!   int regno = REGNO (x);
!   return (DF_REG_DEF_COUNT (regno) == 1
! 	  && !bitmap_bit_p (DF_LR_OUT (ENTRY_BLOCK_PTR_FOR_FN (cfun)), regno));
  }
  
! /* Replace all occurrences of OLD in *PX with NEW and try to simplify the
!    resulting expression.  Replace *PX with a new RTL expression if an
!    occurrence of OLD was found.
  
!    This is only a wrapper around simplify-rtx.c: do not add any pattern
!    matching code here.  (The sole exception is the handling of LO_SUM, but
!    that is because there is no simplify_gen_* function for LO_SUM).  */
  
! static bool
! propagate_rtx_1 (rtx *px, rtx old_rtx, rtx new_rtx, int flags)
  {
!   rtx x = *px, tem = NULL_RTX, op0, op1, op2;
!   enum rtx_code code = GET_CODE (x);
!   machine_mode mode = GET_MODE (x);
!   machine_mode op_mode;
!   bool can_appear = (flags & PR_CAN_APPEAR) != 0;
!   bool valid_ops = true;
! 
!   if (!(flags & PR_HANDLE_MEM) && MEM_P (x) && !MEM_READONLY_P (x))
!     {
!       /* If unsafe, change MEMs to CLOBBERs or SCRATCHes (to preserve whether
! 	 they have side effects or not).  */
!       *px = (side_effects_p (x)
! 	     ? gen_rtx_CLOBBER (GET_MODE (x), const0_rtx)
! 	     : gen_rtx_SCRATCH (GET_MODE (x)));
        return false;
      }
  
!   /* If X is OLD_RTX, return NEW_RTX.  But not if replacing only within an
!      address, and we are *not* inside one.  */
!   if (x == old_rtx)
!     {
!       *px = new_rtx;
!       return can_appear;
!     }
! 
!   /* If this is an expression, try recursive substitution.  */
!   switch (GET_RTX_CLASS (code))
!     {
!     case RTX_UNARY:
!       op0 = XEXP (x, 0);
!       op_mode = GET_MODE (op0);
!       valid_ops &= propagate_rtx_1 (&op0, old_rtx, new_rtx, flags);
!       if (op0 == XEXP (x, 0))
! 	return true;
!       tem = simplify_gen_unary (code, mode, op0, op_mode);
!       break;
! 
!     case RTX_BIN_ARITH:
!     case RTX_COMM_ARITH:
!       op0 = XEXP (x, 0);
!       op1 = XEXP (x, 1);
!       valid_ops &= propagate_rtx_1 (&op0, old_rtx, new_rtx, flags);
!       valid_ops &= propagate_rtx_1 (&op1, old_rtx, new_rtx, flags);
!       if (op0 == XEXP (x, 0) && op1 == XEXP (x, 1))
! 	return true;
!       tem = simplify_gen_binary (code, mode, op0, op1);
!       break;
! 
!     case RTX_COMPARE:
!     case RTX_COMM_COMPARE:
!       op0 = XEXP (x, 0);
!       op1 = XEXP (x, 1);
!       op_mode = GET_MODE (op0) != VOIDmode ? GET_MODE (op0) : GET_MODE (op1);
!       valid_ops &= propagate_rtx_1 (&op0, old_rtx, new_rtx, flags);
!       valid_ops &= propagate_rtx_1 (&op1, old_rtx, new_rtx, flags);
!       if (op0 == XEXP (x, 0) && op1 == XEXP (x, 1))
! 	return true;
!       tem = simplify_gen_relational (code, mode, op_mode, op0, op1);
!       break;
! 
!     case RTX_TERNARY:
!     case RTX_BITFIELD_OPS:
!       op0 = XEXP (x, 0);
!       op1 = XEXP (x, 1);
!       op2 = XEXP (x, 2);
!       op_mode = GET_MODE (op0);
!       valid_ops &= propagate_rtx_1 (&op0, old_rtx, new_rtx, flags);
!       valid_ops &= propagate_rtx_1 (&op1, old_rtx, new_rtx, flags);
!       valid_ops &= propagate_rtx_1 (&op2, old_rtx, new_rtx, flags);
!       if (op0 == XEXP (x, 0) && op1 == XEXP (x, 1) && op2 == XEXP (x, 2))
! 	return true;
!       if (op_mode == VOIDmode)
! 	op_mode = GET_MODE (op0);
!       tem = simplify_gen_ternary (code, mode, op_mode, op0, op1, op2);
!       break;
! 
!     case RTX_EXTRA:
!       /* The only case we try to handle is a SUBREG.  */
!       if (code == SUBREG)
! 	{
!           op0 = XEXP (x, 0);
! 	  valid_ops &= propagate_rtx_1 (&op0, old_rtx, new_rtx, flags);
!           if (op0 == XEXP (x, 0))
! 	    return true;
! 	  tem = simplify_gen_subreg (mode, op0, GET_MODE (SUBREG_REG (x)),
! 				     SUBREG_BYTE (x));
! 	}
! 
!       else
! 	{
! 	  rtvec vec;
! 	  rtvec newvec;
! 	  const char *fmt = GET_RTX_FORMAT (code);
! 	  rtx op;
! 
! 	  for (int i = 0; fmt[i]; i++)
! 	    switch (fmt[i])
! 	      {
! 	      case 'E':
! 		vec = XVEC (x, i);
! 		newvec = vec;
! 		for (int j = 0; j < GET_NUM_ELEM (vec); j++)
! 		  {
! 		    op = RTVEC_ELT (vec, j);
! 		    valid_ops &= propagate_rtx_1 (&op, old_rtx, new_rtx, flags);
! 		    if (op != RTVEC_ELT (vec, j))
! 		      {
! 			if (newvec == vec)
! 			  {
! 			    newvec = shallow_copy_rtvec (vec);
! 			    if (!tem)
! 			      tem = shallow_copy_rtx (x);
! 			    XVEC (tem, i) = newvec;
! 			  }
! 			RTVEC_ELT (newvec, j) = op;
! 		      }
! 		  }
! 	        break;
! 
! 	      case 'e':
! 		if (XEXP (x, i))
! 		  {
! 		    op = XEXP (x, i);
! 		    valid_ops &= propagate_rtx_1 (&op, old_rtx, new_rtx, flags);
! 		    if (op != XEXP (x, i))
! 		      {
! 			if (!tem)
! 			  tem = shallow_copy_rtx (x);
! 			XEXP (tem, i) = op;
! 		      }
! 		  }
! 	        break;
! 	      }
! 	}
! 
!       break;
! 
!     case RTX_OBJ:
!       if (code == MEM && x != new_rtx)
! 	{
! 	  rtx new_op0;
! 	  op0 = XEXP (x, 0);
! 
! 	  /* There are some addresses that we cannot work on.  */
! 	  if (!can_simplify_addr (op0))
! 	    return true;
! 
! 	  op0 = new_op0 = targetm.delegitimize_address (op0);
! 	  valid_ops &= propagate_rtx_1 (&new_op0, old_rtx, new_rtx,
! 					flags | PR_CAN_APPEAR);
! 
! 	  /* Dismiss transformation that we do not want to carry on.  */
! 	  if (!valid_ops
! 	      || new_op0 == op0
! 	      || !(GET_MODE (new_op0) == GET_MODE (op0)
! 		   || GET_MODE (new_op0) == VOIDmode))
! 	    return true;
! 
! 	  canonicalize_address (new_op0);
! 
! 	  /* Copy propagations are always ok.  Otherwise check the costs.  */
! 	  if (!(REG_P (old_rtx) && REG_P (new_rtx))
! 	      && !should_replace_address (op0, new_op0, GET_MODE (x),
! 					  MEM_ADDR_SPACE (x),
! 	      			 	  flags & PR_OPTIMIZE_FOR_SPEED))
! 	    return true;
! 
! 	  tem = replace_equiv_address_nv (x, new_op0);
! 	}
! 
!       else if (code == LO_SUM)
! 	{
!           op0 = XEXP (x, 0);
!           op1 = XEXP (x, 1);
  
! 	  /* The only simplification we do attempts to remove references to op0
! 	     or make it constant -- in both cases, op0's invalidity will not
! 	     make the result invalid.  */
! 	  propagate_rtx_1 (&op0, old_rtx, new_rtx, flags | PR_CAN_APPEAR);
! 	  valid_ops &= propagate_rtx_1 (&op1, old_rtx, new_rtx, flags);
!           if (op0 == XEXP (x, 0) && op1 == XEXP (x, 1))
! 	    return true;
  
! 	  /* (lo_sum (high x) x) -> x  */
! 	  if (GET_CODE (op0) == HIGH && rtx_equal_p (XEXP (op0, 0), op1))
! 	    tem = op1;
! 	  else
! 	    tem = gen_rtx_LO_SUM (mode, op0, op1);
! 
! 	  /* OP1 is likely not a legitimate address, otherwise there would have
! 	     been no LO_SUM.  We want it to disappear if it is invalid, return
! 	     false in that case.  */
! 	  return memory_address_p (mode, tem);
! 	}
  
!       else if (code == REG)
! 	{
! 	  if (rtx_equal_p (x, old_rtx))
! 	    {
!               *px = new_rtx;
!               return can_appear;
! 	    }
! 	}
!       break;
  
!     default:
!       break;
      }
  
-   /* No change, no trouble.  */
-   if (tem == NULL_RTX)
-     return true;
- 
-   *px = tem;
- 
    /* Allow replacements that simplify operations on a vector or complex
       value to a component.  The most prominent case is
       (subreg ([vec_]concat ...)).   */
!   if (REG_P (tem) && !HARD_REGISTER_P (tem)
!       && (VECTOR_MODE_P (GET_MODE (new_rtx))
! 	  || COMPLEX_MODE_P (GET_MODE (new_rtx)))
!       && GET_MODE (tem) == GET_MODE_INNER (GET_MODE (new_rtx)))
!     return true;
  
!   /* The replacement we made so far is valid, if all of the recursive
!      replacements were valid, or we could simplify everything to
!      a constant.  */
!   return valid_ops || can_appear || CONSTANT_P (tem);
  }
  
  
! /* Return true if X contains a non-constant MEM.  */
  
! static bool
! varying_mem_p (const_rtx x)
  {
!   subrtx_iterator::array_type array;
!   FOR_EACH_SUBRTX (iter, array, x, NONCONST)
!     if (MEM_P (*iter) && !MEM_READONLY_P (*iter))
!       return true;
!   return false;
! }
! 
  
! /* Replace all occurrences of OLD in X with NEW and try to simplify the
!    resulting expression (in mode MODE).  Return a new expression if it is
!    a constant, otherwise X.
! 
!    Simplifications where occurrences of NEW collapse to a constant are always
!    accepted.  All simplifications are accepted if NEW is a pseudo too.
!    Otherwise, we accept simplifications that have a lower or equal cost.  */
! 
! static rtx
! propagate_rtx (rtx x, machine_mode mode, rtx old_rtx, rtx new_rtx,
! 	       bool speed)
! {
!   rtx tem;
!   bool collapsed;
!   int flags;
  
!   if (REG_P (new_rtx) && REGNO (new_rtx) < FIRST_PSEUDO_REGISTER)
!     return NULL_RTX;
  
!   flags = 0;
!   if (REG_P (new_rtx)
!       || CONSTANT_P (new_rtx)
!       || (GET_CODE (new_rtx) == SUBREG
! 	  && REG_P (SUBREG_REG (new_rtx))
! 	  && !paradoxical_subreg_p (new_rtx)))
!     flags |= PR_CAN_APPEAR;
!   if (!varying_mem_p (new_rtx))
!     flags |= PR_HANDLE_MEM;
! 
!   if (speed)
!     flags |= PR_OPTIMIZE_FOR_SPEED;
! 
!   tem = x;
!   collapsed = propagate_rtx_1 (&tem, old_rtx, copy_rtx (new_rtx), flags);
!   if (tem == x || !collapsed)
!     return NULL_RTX;
! 
!   /* gen_lowpart_common will not be able to process VOIDmode entities other
!      than CONST_INTs.  */
!   if (GET_MODE (tem) == VOIDmode && !CONST_INT_P (tem))
!     return NULL_RTX;
  
!   if (GET_MODE (tem) == VOIDmode)
!     tem = rtl_hooks.gen_lowpart_no_emit (mode, tem);
!   else
!     gcc_assert (GET_MODE (tem) == mode);
  
!   return tem;
  }
  
  
! \f
  
! /* Return true if the register from reference REF is killed
!    between FROM and (but not including) TO.  */
  
  static bool
! local_ref_killed_between_p (df_ref ref, rtx_insn *from, rtx_insn *to)
  {
!   rtx_insn *insn;
! 
!   for (insn = from; insn != to; insn = NEXT_INSN (insn))
      {
!       df_ref def;
!       if (!INSN_P (insn))
! 	continue;
! 
!       FOR_EACH_INSN_DEF (def, insn)
! 	if (DF_REF_REGNO (ref) == DF_REF_REGNO (def))
! 	  return true;
      }
    return false;
  }
  
  
! /* Check if USE is killed between DEF_INSN and TARGET_INSN.  This would
!    require full computation of available expressions; we check only a few
!    restricted conditions:
!    - if the reg in USE has only one definition, go ahead;
!    - in the same basic block, we check for no definitions killing the use;
!    - if TARGET_INSN's basic block has DEF_INSN's basic block as its sole
!      predecessor, we check if the use is killed after DEF_INSN or before
!      TARGET_INSN insn, in their respective basic blocks.  */
  
! static bool
! use_killed_between (df_ref use, rtx_insn *def_insn, rtx_insn *target_insn)
  {
!   basic_block def_bb = BLOCK_FOR_INSN (def_insn);
!   basic_block target_bb = BLOCK_FOR_INSN (target_insn);
!   int regno;
!   df_ref def;
! 
!   /* We used to have a def reaching a use that is _before_ the def,
!      with the def not dominating the use even though the use and def
!      are in the same basic block, when a register may be used
!      uninitialized in a loop.  This should not happen anymore since
!      we do not use reaching definitions, but still we test for such
!      cases and assume that DEF is not available.  */
!   if (def_bb == target_bb
!       ? DF_INSN_LUID (def_insn) >= DF_INSN_LUID (target_insn)
!       : !dominated_by_p (CDI_DOMINATORS, target_bb, def_bb))
!     return true;
! 
!   /* Check if the reg in USE has only one definition.  We already
!      know that this definition reaches use, or we wouldn't be here.
!      However, this is invalid for hard registers because if they are
!      live at the beginning of the function it does not mean that we
!      have an uninitialized access.  And we have to check for the case
!      where a register may be used uninitialized in a loop as above.  */
!   regno = DF_REF_REGNO (use);
!   def = DF_REG_DEF_CHAIN (regno);
!   if (def
!       && DF_REF_NEXT_REG (def) == NULL
!       && regno >= FIRST_PSEUDO_REGISTER
!       && (BLOCK_FOR_INSN (DF_REF_INSN (def)) == def_bb
! 	  ? DF_INSN_LUID (DF_REF_INSN (def)) < DF_INSN_LUID (def_insn)
! 	  : dominated_by_p (CDI_DOMINATORS,
! 			    def_bb, BLOCK_FOR_INSN (DF_REF_INSN (def)))))
!     return false;
! 
!   /* Check locally if we are in the same basic block.  */
!   if (def_bb == target_bb)
!     return local_ref_killed_between_p (use, def_insn, target_insn);
! 
!   /* Finally, if DEF_BB is the sole predecessor of TARGET_BB.  */
!   if (single_pred_p (target_bb)
!       && single_pred (target_bb) == def_bb)
!     {
!       df_ref x;
! 
!       /* See if USE is killed between DEF_INSN and the last insn in the
! 	 basic block containing DEF_INSN.  */
!       x = df_bb_regno_last_def_find (def_bb, regno);
!       if (x && DF_INSN_LUID (DF_REF_INSN (x)) >= DF_INSN_LUID (def_insn))
! 	return true;
  
!       /* See if USE is killed between TARGET_INSN and the first insn in the
! 	 basic block containing TARGET_INSN.  */
!       x = df_bb_regno_first_def_find (target_bb, regno);
!       if (x && DF_INSN_LUID (DF_REF_INSN (x)) < DF_INSN_LUID (target_insn))
! 	return true;
! 
!       return false;
      }
  
!   /* Otherwise assume the worst case.  */
!   return true;
! }
  
! 
! /* Check if all uses in DEF_INSN can be used in TARGET_INSN.  This
!    would require full computation of available expressions;
!    we check only restricted conditions, see use_killed_between.  */
! static bool
! all_uses_available_at (rtx_insn *def_insn, rtx_insn *target_insn)
! {
!   df_ref use;
!   struct df_insn_info *insn_info = DF_INSN_INFO_GET (def_insn);
!   rtx def_set = single_set (def_insn);
!   rtx_insn *next;
! 
!   gcc_assert (def_set);
! 
!   /* If target_insn comes right after def_insn, which is very common
!      for addresses, we can use a quicker test.  Ignore debug insns
!      other than target insns for this.  */
!   next = NEXT_INSN (def_insn);
!   while (next && next != target_insn && DEBUG_INSN_P (next))
!     next = NEXT_INSN (next);
!   if (next == target_insn && REG_P (SET_DEST (def_set)))
!     {
!       rtx def_reg = SET_DEST (def_set);
! 
!       /* If the insn uses the reg that it defines, the substitution is
!          invalid.  */
!       FOR_EACH_INSN_INFO_USE (use, insn_info)
! 	if (rtx_equal_p (DF_REF_REG (use), def_reg))
! 	  return false;
!       FOR_EACH_INSN_INFO_EQ_USE (use, insn_info)
! 	if (rtx_equal_p (DF_REF_REG (use), def_reg))
! 	  return false;
!     }
!   else
      {
!       rtx def_reg = REG_P (SET_DEST (def_set)) ? SET_DEST (def_set) : NULL_RTX;
! 
!       /* Look at all the uses of DEF_INSN, and see if they are not
! 	 killed between DEF_INSN and TARGET_INSN.  */
!       FOR_EACH_INSN_INFO_USE (use, insn_info)
  	{
! 	  if (def_reg && rtx_equal_p (DF_REF_REG (use), def_reg))
! 	    return false;
! 	  if (use_killed_between (use, def_insn, target_insn))
! 	    return false;
  	}
!       FOR_EACH_INSN_INFO_EQ_USE (use, insn_info)
  	{
! 	  if (def_reg && rtx_equal_p (DF_REF_REG (use), def_reg))
! 	    return false;
! 	  if (use_killed_between (use, def_insn, target_insn))
! 	    return false;
  	}
      }
  
!   return true;
  }
  
! \f
! static df_ref *active_defs;
! static sparseset active_defs_check;
! 
! /* Fill the ACTIVE_DEFS array with the use->def link for the registers
!    mentioned in USE_REC.  Register the valid entries in ACTIVE_DEFS_CHECK
!    too, for checking purposes.  */
  
! static void
! register_active_defs (df_ref use)
  {
!   for (; use; use = DF_REF_NEXT_LOC (use))
!     {
!       df_ref def = get_def_for_use (use);
!       int regno = DF_REF_REGNO (use);
  
!       if (flag_checking)
! 	sparseset_set_bit (active_defs_check, regno);
!       active_defs[regno] = def;
      }
- }
- 
  
! /* Build the use->def links that we use to update the dataflow info
!    for new uses.  Note that building the links is very cheap and if
!    it were done earlier, they could be used to rule out invalid
!    propagations (in addition to what is done in all_uses_available_at).
!    I'm not doing this yet, though.  */
  
! static void
! update_df_init (rtx_insn *def_insn, rtx_insn *insn)
! {
!   if (flag_checking)
!     sparseset_clear (active_defs_check);
!   register_active_defs (DF_INSN_USES (def_insn));
!   register_active_defs (DF_INSN_USES (insn));
!   register_active_defs (DF_INSN_EQ_USES (insn));
! }
  
  
! /* Update the USE_DEF_REF array for the given use, using the active definitions
!    in the ACTIVE_DEFS array to match pseudos to their def. */
  
! static inline void
! update_uses (df_ref use)
! {
!   for (; use; use = DF_REF_NEXT_LOC (use))
      {
!       int regno = DF_REF_REGNO (use);
  
!       /* Set up the use-def chain.  */
!       if (DF_REF_ID (use) >= (int) use_def_ref.length ())
! 	use_def_ref.safe_grow_cleared (DF_REF_ID (use) + 1, true);
  
!       if (flag_checking)
! 	gcc_assert (sparseset_bit_p (active_defs_check, regno));
!       use_def_ref[DF_REF_ID (use)] = active_defs[regno];
!     }
! }
  
  
! /* Update the USE_DEF_REF array for the uses in INSN.  If NOTE is
!    nonnull, update just the uses in that note; otherwise rescan the
!    whole pattern and update its uses.  */
  
! static void
! update_df (rtx_insn *insn, rtx note)
! {
!   struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
  
!   if (note)
!     {
!       df_uses_create (&XEXP (note, 0), insn, DF_REF_IN_NOTE);
!       df_notes_rescan (insn);
      }
!   else
      {
!       df_uses_create (&PATTERN (insn), insn, 0);
!       df_insn_rescan (insn);
!       update_uses (DF_INSN_INFO_USES (insn_info));
      }
  
!   update_uses (DF_INSN_INFO_EQ_USES (insn_info));
  }
  
! 
! /* Try substituting NEW into LOC, which originated from forward propagation
!    of USE's value from DEF_INSN.  SET_REG_EQUAL says whether we are
!    substituting the whole SET_SRC, so we can set a REG_EQUAL note if the
!    new insn is not recognized.  Return whether the substitution was
!    performed.  */
  
  static bool
! try_fwprop_subst (df_ref use, rtx *loc, rtx new_rtx, rtx_insn *def_insn,
! 		  bool set_reg_equal)
! {
!   rtx_insn *insn = DF_REF_INSN (use);
!   rtx set = single_set (insn);
!   rtx note = NULL_RTX;
!   bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn));
!   int old_cost = 0;
!   bool ok;
  
!   update_df_init (def_insn, insn);
  
!   /* forward_propagate_subreg may be operating on an instruction with
!      multiple sets.  If so, assume the cost of the new instruction is
!      not greater than the old one.  */
!   if (set)
!     old_cost = set_src_cost (SET_SRC (set), GET_MODE (SET_DEST (set)), speed);
!   if (dump_file)
!     {
!       fprintf (dump_file, "\nIn insn %d, replacing\n ", INSN_UID (insn));
!       print_inline_rtx (dump_file, *loc, 2);
!       fprintf (dump_file, "\n with ");
!       print_inline_rtx (dump_file, new_rtx, 2);
!       fprintf (dump_file, "\n");
!     }
  
!   validate_unshare_change (insn, loc, new_rtx, true);
!   if (!verify_changes (0))
!     {
!       if (dump_file)
! 	fprintf (dump_file, "Changes to insn %d not recognized\n",
! 		 INSN_UID (insn));
!       ok = false;
!     }
! 
!   else if (DF_REF_TYPE (use) == DF_REF_REG_USE
! 	   && set
! 	   && (set_src_cost (SET_SRC (set), GET_MODE (SET_DEST (set)), speed)
! 	       > old_cost))
!     {
!       if (dump_file)
! 	fprintf (dump_file, "Changes to insn %d not profitable\n",
! 		 INSN_UID (insn));
!       ok = false;
!     }
  
!   else
!     {
!       if (dump_file)
! 	fprintf (dump_file, "Changed insn %d\n", INSN_UID (insn));
!       ok = true;
!     }
  
!   if (ok)
      {
!       confirm_change_group ();
!       num_changes++;
      }
-   else
-     {
-       cancel_changes (0);
  
!       /* We can also record a simplified value in a REG_EQUAL note,
! 	 making a new one if one does not already exist.  */
!       if (set_reg_equal)
! 	{
! 	  /* If there are any paradoxical SUBREGs, don't add REG_EQUAL note,
! 	     because the bits in there can be anything and so might not
! 	     match the REG_EQUAL note content.  See PR70574.  */
! 	  subrtx_var_iterator::array_type array;
! 	  FOR_EACH_SUBRTX_VAR (iter, array, *loc, NONCONST)
! 	    {
! 	      rtx x = *iter;
! 	      if (SUBREG_P (x) && paradoxical_subreg_p (x))
! 		{
! 		  set_reg_equal = false;
! 		  break;
! 		}
! 	    }
  
! 	  if (set_reg_equal)
! 	    {
! 	      if (dump_file)
! 		fprintf (dump_file, " Setting REG_EQUAL note\n");
  
! 	      note = set_unique_reg_note (insn, REG_EQUAL, copy_rtx (new_rtx));
! 	    }
! 	}
!     }
! 
!   if ((ok || note) && !CONSTANT_P (new_rtx))
!     update_df (insn, note);
  
!   return ok;
  }
  
  /* For the given single_set INSN, containing SRC known to be a
--- 134,610 ----
  	      && REGNO (reg) != ARG_POINTER_REGNUM));
  }
  
! /* MEM is the result of an address simplification, and temporarily
!    undoing changes OLD_NUM_CHANGES onwards restores the original address.
!    Return whether it is good to use the new address instead of the
!    old one.  INSN is the containing instruction.  */
  
  static bool
! should_replace_address (int old_num_changes, rtx mem, rtx_insn *insn)
  {
    int gain;
  
    /* Prefer the new address if it is less expensive.  */
!   bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn));
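!   /* Undoing the pending changes gives the cost of the original address;
!      redoing them gives the cost of the simplified address.  */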
!   temporarily_undo_changes (old_num_changes);
!   gain = address_cost (XEXP (mem, 0), GET_MODE (mem),
! 		       MEM_ADDR_SPACE (mem), speed);
!   redo_changes (old_num_changes);
!   gain -= address_cost (XEXP (mem, 0), GET_MODE (mem),
! 			MEM_ADDR_SPACE (mem), speed);
  
    /* If the addresses have equivalent cost, prefer the new address
       if it has the highest `set_src_cost'.  That has the potential of
       eliminating the most insns without additional costs, and it
       is the same that cse.c used to do.  */
    if (gain == 0)
!     {
!       gain = set_src_cost (XEXP (mem, 0), VOIDmode, speed);
!       temporarily_undo_changes (old_num_changes);
!       gain -= set_src_cost (XEXP (mem, 0), VOIDmode, speed);
!       redo_changes (old_num_changes);
!     }
  
    return (gain > 0);
  }
  
  
! namespace
! {
!   class fwprop_propagation : public insn_propagation
!   {
!   public:
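!     /* Extra bits for the base class's result_flags: CHANGED_MEM records
!        that a MEM address was changed, CONSTANT that a simplification
!        folded to a constant, and PROFITABLE that a simplification is
!        known to be an improvement.  */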
!     static const uint16_t CHANGED_MEM = FIRST_SPARE_RESULT;
!     static const uint16_t CONSTANT = FIRST_SPARE_RESULT << 1;
!     static const uint16_t PROFITABLE = FIRST_SPARE_RESULT << 2;
  
!     fwprop_propagation (rtx_insn *, rtx, rtx);
  
!     bool changed_mem_p () const { return result_flags & CHANGED_MEM; }
!     bool folded_to_constants_p () const;
!     bool profitable_p () const;
  
!     bool check_mem (int, rtx) final override;
!     void note_simplification (int, uint16_t, rtx, rtx) final override;
!     uint16_t classify_result (rtx, rtx);
!   };
  }
  
! /* Prepare to replace FROM with TO in INSN.  */
! 
! fwprop_propagation::fwprop_propagation (rtx_insn *insn, rtx from, rtx to)
!   : insn_propagation (insn, from, to)
! {
!   should_check_mems = true;
!   should_note_simplifications = true;
! }
  
! /* MEM is the result of an address simplification, and temporarily
!    undoing changes OLD_NUM_CHANGES onwards restores the original address.
!    Return true if the propagation should continue, false if it has failed.  */
  
! bool
! fwprop_propagation::check_mem (int old_num_changes, rtx mem)
  {
!   if (!memory_address_addr_space_p (GET_MODE (mem), XEXP (mem, 0),
! 				    MEM_ADDR_SPACE (mem)))
!     {
!       failure_reason = "would create an invalid MEM";
        return false;
      }
  
!   temporarily_undo_changes (old_num_changes);
!   bool can_simplify = can_simplify_addr (XEXP (mem, 0));
!   redo_changes (old_num_changes);
!   if (!can_simplify)
!     {
!       failure_reason = "would replace a frame address";
!       return false;
!     }
  
!   /* Copy propagations are always ok.  Otherwise check the costs.  */
!   if (!(REG_P (from) && REG_P (to))
!       && !should_replace_address (old_num_changes, mem, insn))
!     {
!       failure_reason = "would increase the cost of a MEM";
!       return false;
!     }
  
!   result_flags |= CHANGED_MEM;
!   return true;
! }
  
! /* OLD_RTX has been simplified to NEW_RTX.  Describe the change in
!    terms of result_flags.  */
  
! uint16_t
! fwprop_propagation::classify_result (rtx old_rtx, rtx new_rtx)
! {
!   if (CONSTANT_P (new_rtx))
!     {
!       /* If OLD_RTX is a LO_SUM, then it presumably exists for a reason,
! 	 and NEW_RTX is likely not a legitimate address.  We want it to
! 	 disappear if it is invalid.
! 
! 	 ??? Using the mode of the LO_SUM as the mode of the address
! 	 seems odd, but it was what the pre-SSA code did.  */
!       if (GET_CODE (old_rtx) == LO_SUM
! 	  && !memory_address_p (GET_MODE (old_rtx), new_rtx))
! 	return CONSTANT;
!       return CONSTANT | PROFITABLE;
      }
  
    /* Allow replacements that simplify operations on a vector or complex
       value to a component.  The most prominent case is
       (subreg ([vec_]concat ...)).   */
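+   /* For example (purely illustrative): substituting
+        (vec_concat:V2SI (reg:SI S) (reg:SI T))
+      for (reg:V2SI R) in (subreg:SI (reg:V2SI R) 0) lets the SUBREG
+      fold to one of the concatenated registers.  */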
!   if (REG_P (new_rtx)
!       && !HARD_REGISTER_P (new_rtx)
!       && (VECTOR_MODE_P (GET_MODE (from))
! 	  || COMPLEX_MODE_P (GET_MODE (from)))
!       && GET_MODE (new_rtx) == GET_MODE_INNER (GET_MODE (from)))
!     return PROFITABLE;
! 
!   return 0;
! }
  
! /* Record that OLD_RTX has been simplified to NEW_RTX.  OLD_NUM_CHANGES
!    is the number of unrelated changes that had been made before processing
!    OLD_RTX and its subrtxes.  OLD_RESULT_FLAGS is the value that result_flags
!    had at that point.  */
! 
! void
! fwprop_propagation::note_simplification (int old_num_changes,
! 					 uint16_t old_result_flags,
! 					 rtx old_rtx, rtx new_rtx)
! {
!   result_flags &= ~(CONSTANT | PROFITABLE);
!   uint16_t new_flags = classify_result (old_rtx, new_rtx);
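!   /* If earlier replacements have already been made, keep CONSTANT and
!      PROFITABLE only if those replacements had them too.  */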
!   if (old_num_changes)
!     new_flags &= old_result_flags;
!   result_flags |= new_flags;
! }
! 
! /* Return true if all substitutions eventually folded to constants.  */
! 
! bool
! fwprop_propagation::folded_to_constants_p () const
! {
!   /* If we're propagating a HIGH, require it to be folded with a
!      partnering LO_SUM.  For example, a REG_EQUAL note with a register
!      replaced by an unfolded HIGH is not useful.  */
!   if (CONSTANT_P (to) && GET_CODE (to) != HIGH)
!     return true;
!   return !(result_flags & UNSIMPLIFIED) && (result_flags & CONSTANT);
  }
  
  
! /* Return true if it is worth keeping the result of the propagation,
!    false if it would increase the complexity of the pattern too much.  */
  
! bool
! fwprop_propagation::profitable_p () const
  {
!   if (changed_mem_p ())
!     return true;
  
!   if (!(result_flags & UNSIMPLIFIED)
!       && (result_flags & PROFITABLE))
!     return true;
  
!   if (REG_P (to))
!     return true;
  
!   if (GET_CODE (to) == SUBREG
!       && REG_P (SUBREG_REG (to))
!       && !paradoxical_subreg_p (to))
!     return true;
  
!   if (CONSTANT_P (to))
!     return true;
  
!   return false;
  }
  
+ /* Return true if X is a register whose value has a single dominating
+    definition.  */
  
! static bool
! reg_single_def_p (rtx x)
! {
!   return REG_P (x) && crtl->ssa->single_dominating_def (REGNO (x));
! }
  
! /* Return true if X contains a paradoxical subreg.  */
  
  static bool
! contains_paradoxical_subreg_p (rtx x)
  {
!   subrtx_var_iterator::array_type array;
!   FOR_EACH_SUBRTX_VAR (iter, array, x, NONCONST)
      {
!       x = *iter;
!       if (SUBREG_P (x) && paradoxical_subreg_p (x))
! 	return true;
      }
    return false;
  }
  
+ /* Try to substitute (set DEST SRC) from DEF_INSN into note NOTE of USE_INSN.
+    Return the number of substitutions on success, otherwise return -1 and
+    leave USE_INSN unchanged.
  
!    If REQUIRE_CONSTANT is true, require all substituted occurrences of SRC
!    to fold to a constant, so that the note does not use any more registers
!    than it did previously.  If REQUIRE_CONSTANT is false, also allow the
!    substitution if it's something we'd normally allow for the main
!    instruction pattern.  */
  
! static int
! try_fwprop_subst_note (insn_info *use_insn, insn_info *def_insn,
! 		       rtx note, rtx dest, rtx src, bool require_constant)
  {
!   rtx_insn *use_rtl = use_insn->rtl ();
  
!   insn_change_watermark watermark;
!   fwprop_propagation prop (use_rtl, dest, src);
!   if (!prop.apply_to_rvalue (&XEXP (note, 0)))
!     {
!       if (dump_file && (dump_flags & TDF_DETAILS))
! 	fprintf (dump_file, "cannot propagate from insn %d into"
! 		 " notes of insn %d: %s\n", def_insn->uid (),
! 		 use_insn->uid (), prop.failure_reason);
!       return -1;
      }
  
!   if (prop.num_replacements == 0)
!     return 0;
  
!   if (require_constant)
      {
!       if (!prop.folded_to_constants_p ())
  	{
! 	  if (dump_file && (dump_flags & TDF_DETAILS))
! 	    fprintf (dump_file, "cannot propagate from insn %d into"
! 		     " notes of insn %d: %s\n", def_insn->uid (),
! 		     use_insn->uid (), "wouldn't fold to constants");
! 	  return -1;
  	}
!     }
!   else
!     {
!       if (!prop.folded_to_constants_p () && !prop.profitable_p ())
  	{
! 	  if (dump_file && (dump_flags & TDF_DETAILS))
! 	    fprintf (dump_file, "cannot propagate from insn %d into"
! 		     " notes of insn %d: %s\n", def_insn->uid (),
! 		     use_insn->uid (), "would increase complexity of note");
! 	  return -1;
  	}
      }
  
!   if (dump_file && (dump_flags & TDF_DETAILS))
!     {
!       fprintf (dump_file, "\nin notes of insn %d, replacing:\n  ",
! 	       INSN_UID (use_rtl));
!       temporarily_undo_changes (0);
!       print_inline_rtx (dump_file, note, 2);
!       redo_changes (0);
!       fprintf (dump_file, "\n with:\n  ");
!       print_inline_rtx (dump_file, note, 2);
!       fprintf (dump_file, "\n");
!     }
!   watermark.keep ();
!   return prop.num_replacements;
  }
  
! /* Try to substitute (set DEST SRC) from DEF_INSN into location LOC of
!    USE_INSN's pattern.  Return true on success, otherwise leave USE_INSN
!    unchanged.  */
  
! static bool
! try_fwprop_subst_pattern (obstack_watermark &attempt, insn_change &use_change,
! 			  insn_info *def_insn, rtx *loc, rtx dest, rtx src)
  {
!   insn_info *use_insn = use_change.insn ();
!   rtx_insn *use_rtl = use_insn->rtl ();
  
!   insn_change_watermark watermark;
!   fwprop_propagation prop (use_rtl, dest, src);
!   if (!prop.apply_to_pattern (loc))
!     {
!       if (dump_file && (dump_flags & TDF_DETAILS))
! 	fprintf (dump_file, "cannot propagate from insn %d into"
! 		 " insn %d: %s\n", def_insn->uid (), use_insn->uid (),
! 		 prop.failure_reason);
!       return false;
      }
  
!   if (prop.num_replacements == 0)
!     return false;
  
!   if (!prop.profitable_p ())
!     {
!       if (dump_file && (dump_flags & TDF_DETAILS))
! 	fprintf (dump_file, "cannot propagate from insn %d into"
! 		 " insn %d: %s\n", def_insn->uid (), use_insn->uid (),
! 		 "would increase complexity of pattern");
!       return false;
!     }
  
+   if (dump_file && (dump_flags & TDF_DETAILS))
+     {
+       fprintf (dump_file, "\npropagating insn %d into insn %d, replacing:\n",
+ 	       def_insn->uid (), use_insn->uid ());
+       temporarily_undo_changes (0);
+       print_rtl_single (dump_file, PATTERN (use_rtl));
+       redo_changes (0);
+     }
  
!   /* ??? In theory, it should be better to use insn costs rather than
!      set_src_costs here.  That would involve replacing this code with
!      change_is_worthwhile.  */
!   bool ok = recog (attempt, use_change);
!   if (ok && !prop.changed_mem_p () && !use_insn->is_asm ())
!     if (rtx use_set = single_set (use_rtl))
!       {
! 	bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (use_rtl));
! 	temporarily_undo_changes (0);
! 	auto old_cost = set_src_cost (SET_SRC (use_set),
! 				      GET_MODE (SET_DEST (use_set)), speed);
! 	redo_changes (0);
! 	auto new_cost = set_src_cost (SET_SRC (use_set),
! 				      GET_MODE (SET_DEST (use_set)), speed);
! 	if (new_cost > old_cost)
! 	  {
! 	    if (dump_file)
! 	      fprintf (dump_file, "change not profitable"
! 		       " (cost %d -> cost %d)\n", old_cost, new_cost);
! 	    ok = false;
! 	  }
!       }
  
!   if (!ok)
      {
!       /* The pattern didn't match, but if all uses of SRC folded to
! 	 constants, we can add a REG_EQUAL note for the result, if there
! 	 isn't one already.  */
!       if (!prop.folded_to_constants_p ())
! 	return false;
  
!       /* Test this first to avoid creating an unnecessary copy of SRC.  */
!       if (find_reg_note (use_rtl, REG_EQUAL, NULL_RTX))
! 	return false;
  
!       rtx set = set_for_reg_notes (use_rtl);
!       if (!set || !REG_P (SET_DEST (set)))
! 	return false;
  
+       rtx value = copy_rtx (SET_SRC (set));
+       cancel_changes (0);
  
!       /* If there are any paradoxical SUBREGs, drop the REG_EQUAL note,
! 	 because the bits in there can be anything and so might not
! 	 match the REG_EQUAL note content.  See PR70574.  */
!       if (contains_paradoxical_subreg_p (SET_SRC (set)))
! 	return false;
  
!       if (dump_file && (dump_flags & TDF_DETAILS))
! 	fprintf (dump_file, " Setting REG_EQUAL note\n");
  
!       return set_unique_reg_note (use_rtl, REG_EQUAL, value);
      }
! 
!   rtx *note_ptr = &REG_NOTES (use_rtl);
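!   /* The new pattern is acceptable; also propagate into any REG_EQUAL
!      or REG_EQUIV notes, deleting a note if the substitution into it
!      fails.  */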
!   while (rtx note = *note_ptr)
      {
!       if ((REG_NOTE_KIND (note) == REG_EQUAL
! 	   || REG_NOTE_KIND (note) == REG_EQUIV)
! 	  && try_fwprop_subst_note (use_insn, def_insn, note,
! 				    dest, src, false) < 0)
! 	{
! 	  *note_ptr = XEXP (note, 1);
! 	  free_EXPR_LIST_node (note);
! 	}
!       else
! 	note_ptr = &XEXP (note, 1);
      }
  
!   confirm_change_group ();
!   crtl->ssa->change_insn (use_change);
!   num_changes++;
!   return true;
  }
  
! /* Try to substitute (set DEST SRC) from DEF_INSN into USE_INSN's notes,
!    given that it was not possible to do this for USE_INSN's main pattern.
!    Return true on success, otherwise leave USE_INSN unchanged.  */
  
  static bool
! try_fwprop_subst_notes (insn_info *use_insn, insn_info *def_insn,
! 			rtx dest, rtx src)
! {
!   rtx_insn *use_rtl = use_insn->rtl ();
!   for (rtx note = REG_NOTES (use_rtl); note; note = XEXP (note, 1))
!     if ((REG_NOTE_KIND (note) == REG_EQUAL
! 	 || REG_NOTE_KIND (note) == REG_EQUIV)
! 	&& try_fwprop_subst_note (use_insn, def_insn, note,
! 				  dest, src, true) > 0)
!       {
! 	confirm_change_group ();
! 	return true;
!       }
! 
!   return false;
! }
  
! /* Check whether we could validly substitute (set DEST SRC) from DEF_INSN
!    into USE.  If so, first try performing the substitution in location LOC
!    of USE->insn ()'s pattern.  If that fails, try instead to substitute
!    into the notes.
  
!    Return true on success, otherwise leave USE_INSN unchanged.  */
  
! static bool
! try_fwprop_subst (use_info *use, insn_info *def_insn,
! 		  rtx *loc, rtx dest, rtx src)
! {
!   insn_info *use_insn = use->insn ();
  
!   auto attempt = crtl->ssa->new_change_attempt ();
!   use_array src_uses = remove_note_accesses (attempt, def_insn->uses ());
  
!   /* ??? Not really a meaningful test: it means we can propagate arithmetic
!      involving hard registers but not bare references to them.  A better
!      test would be to iterate over src_uses looking for hard registers
!      that are not fixed.  */
!   if (REG_P (src) && HARD_REGISTER_P (src))
!     return false;
! 
!   /* ??? It would be better to make this EBB-based instead.  That would
!      involve checking for equal EBBs rather than equal BBs and trying
!      to make the uses available at use_insn->ebb ()->first_bb ().  */
!   if (def_insn->bb () != use_insn->bb ())
      {
!       src_uses = crtl->ssa->make_uses_available (attempt, src_uses,
! 						 use_insn->bb ());
!       if (!src_uses.is_valid ())
! 	return false;
      }
  
!   insn_change use_change (use_insn);
!   use_change.new_uses = merge_access_arrays (attempt, use_change.new_uses,
! 					     src_uses);
!   if (!use_change.new_uses.is_valid ())
!     return false;
  
!   /* ??? We could allow movement within the EBB by adding:
  
!      use_change.move_range = use_insn->ebb ()->insn_range ();  */
!   if (!restrict_movement (use_change))
!     return false;
  
!   return (try_fwprop_subst_pattern (attempt, use_change, def_insn,
! 				    loc, dest, src)
! 	  || try_fwprop_subst_notes (use_insn, def_insn, dest, src));
  }
  
  /* For the given single_set INSN, containing SRC known to be a
***************
*** 1117,1149 ****
     load from memory.  */
  
  static bool
! free_load_extend (rtx src, rtx_insn *insn)
  {
!   rtx reg;
!   df_ref def, use;
! 
!   reg = XEXP (src, 0);
    if (load_extend_op (GET_MODE (reg)) != GET_CODE (src))
      return false;
  
!   FOR_EACH_INSN_USE (use, insn)
!     if (!DF_REF_IS_ARTIFICIAL (use)
! 	&& DF_REF_TYPE (use) == DF_REF_REG_USE
! 	&& DF_REF_REG (use) == reg)
!       break;
!   if (!use)
!     return false;
  
-   def = get_def_for_use (use);
    if (!def)
      return false;
  
!   if (DF_REF_IS_ARTIFICIAL (def))
      return false;
  
!   if (NONJUMP_INSN_P (DF_REF_INSN (def)))
      {
!       rtx patt = PATTERN (DF_REF_INSN (def));
  
        if (GET_CODE (patt) == SET
  	  && GET_CODE (SET_SRC (patt)) == MEM
--- 613,643 ----
     load from memory.  */
  
  static bool
! free_load_extend (rtx src, insn_info *insn)
  {
!   rtx reg = XEXP (src, 0);
    if (load_extend_op (GET_MODE (reg)) != GET_CODE (src))
      return false;
  
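+   /* Find the definition that feeds the register being extended.  */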
!   def_info *def = nullptr;
!   for (use_info *use : insn->uses ())
!     if (use->regno () == REGNO (reg))
!       {
! 	def = use->def ();
! 	break;
!       }
  
    if (!def)
      return false;
  
!   insn_info *def_insn = def->insn ();
!   if (def_insn->is_artificial ())
      return false;
  
!   rtx_insn *def_rtl = def_insn->rtl ();
!   if (NONJUMP_INSN_P (def_rtl))
      {
!       rtx patt = PATTERN (def_rtl);
  
        if (GET_CODE (patt) == SET
  	  && GET_CODE (SET_SRC (patt)) == MEM
***************
*** 1153,1174 ****
    return false;
  }
  
! /* If USE is a subreg, see if it can be replaced by a pseudo.  */
  
  static bool
! forward_propagate_subreg (df_ref use, rtx_insn *def_insn, rtx def_set)
  {
-   rtx use_reg = DF_REF_REG (use);
-   rtx_insn *use_insn;
-   rtx src;
    scalar_int_mode int_use_mode, src_mode;
  
    /* Only consider subregs... */
    machine_mode use_mode = GET_MODE (use_reg);
    if (GET_CODE (use_reg) != SUBREG
!       || !REG_P (SET_DEST (def_set)))
      return false;
  
    if (paradoxical_subreg_p (use_reg))
      {
        /* If this is a paradoxical SUBREG, we have no idea what value the
--- 647,670 ----
    return false;
  }
  
! /* Subroutine of forward_propagate_subreg that handles a use of DEST
!    in REF.  The other parameters are the same.  */
  
  static bool
! forward_propagate_subreg (use_info *use, insn_info *def_insn,
! 			  rtx dest, rtx src, df_ref ref)
  {
    scalar_int_mode int_use_mode, src_mode;
  
    /* Only consider subregs... */
+   rtx use_reg = DF_REF_REG (ref);
    machine_mode use_mode = GET_MODE (use_reg);
    if (GET_CODE (use_reg) != SUBREG
!       || GET_MODE (SUBREG_REG (use_reg)) != GET_MODE (dest))
      return false;
  
+   /* ??? Replacing throughout the pattern would help for match_dups.  */
+   rtx *loc = DF_REF_LOC (ref);
    if (paradoxical_subreg_p (use_reg))
      {
        /* If this is a paradoxical SUBREG, we have no idea what value the
***************
*** 1176,1191 ****
  	 a SUBREG whose operand is the same as our mode, and all the modes
  	 are within a word, we can just use the inner operand because
  	 these SUBREGs just say how to treat the register.  */
-       use_insn = DF_REF_INSN (use);
-       src = SET_SRC (def_set);
        if (GET_CODE (src) == SUBREG
  	  && REG_P (SUBREG_REG (src))
  	  && REGNO (SUBREG_REG (src)) >= FIRST_PSEUDO_REGISTER
  	  && GET_MODE (SUBREG_REG (src)) == use_mode
! 	  && subreg_lowpart_p (src)
! 	  && all_uses_available_at (def_insn, use_insn))
! 	return try_fwprop_subst (use, DF_REF_LOC (use), SUBREG_REG (src),
! 				 def_insn, false);
      }
  
    /* If this is a SUBREG of a ZERO_EXTEND or SIGN_EXTEND, and the SUBREG
--- 672,684 ----
  	 a SUBREG whose operand is the same as our mode, and all the modes
  	 are within a word, we can just use the inner operand because
  	 these SUBREGs just say how to treat the register.  */
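+       /* E.g. (purely illustrative): a use
+            (subreg:DI (reg:SI D) 0)
+          where D was set from (subreg:SI (reg:DI S) 0) can be replaced
+          by (reg:DI S) directly.  */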
        if (GET_CODE (src) == SUBREG
  	  && REG_P (SUBREG_REG (src))
  	  && REGNO (SUBREG_REG (src)) >= FIRST_PSEUDO_REGISTER
  	  && GET_MODE (SUBREG_REG (src)) == use_mode
! 	  && subreg_lowpart_p (src))
! 	return try_fwprop_subst (use, def_insn, loc,
! 				 use_reg, SUBREG_REG (src));
      }
  
    /* If this is a SUBREG of a ZERO_EXTEND or SIGN_EXTEND, and the SUBREG
***************
*** 1206,1213 ****
    else if (is_a <scalar_int_mode> (use_mode, &int_use_mode)
  	   && subreg_lowpart_p (use_reg))
      {
-       use_insn = DF_REF_INSN (use);
-       src = SET_SRC (def_set);
        if ((GET_CODE (src) == ZERO_EXTEND
  	   || GET_CODE (src) == SIGN_EXTEND)
  	  && is_a <scalar_int_mode> (GET_MODE (src), &src_mode)
--- 699,704 ----
***************
*** 1216,1354 ****
  	  && GET_MODE (XEXP (src, 0)) == use_mode
  	  && !free_load_extend (src, def_insn)
  	  && (targetm.mode_rep_extended (int_use_mode, src_mode)
! 	      != (int) GET_CODE (src))
! 	  && all_uses_available_at (def_insn, use_insn))
! 	return try_fwprop_subst (use, DF_REF_LOC (use), XEXP (src, 0),
! 				 def_insn, false);
      }
  
    return false;
  }
  
! /* Try to replace USE with SRC (defined in DEF_INSN) in __asm.  */
  
  static bool
! forward_propagate_asm (df_ref use, rtx_insn *def_insn, rtx def_set, rtx reg)
  {
!   rtx_insn *use_insn = DF_REF_INSN (use);
!   rtx src, use_pat, asm_operands, new_rtx, *loc;
!   int speed_p, i;
!   df_ref uses;
! 
!   gcc_assert ((DF_REF_FLAGS (use) & DF_REF_IN_NOTE) == 0);
! 
!   src = SET_SRC (def_set);
!   use_pat = PATTERN (use_insn);
  
!   /* In __asm don't replace if src might need more registers than
!      reg, as that could increase register pressure on the __asm.  */
!   uses = DF_INSN_USES (def_insn);
!   if (uses && DF_REF_NEXT_LOC (uses))
      return false;
  
!   update_df_init (def_insn, use_insn);
!   speed_p = optimize_bb_for_speed_p (BLOCK_FOR_INSN (use_insn));
!   asm_operands = NULL_RTX;
!   switch (GET_CODE (use_pat))
!     {
!     case ASM_OPERANDS:
!       asm_operands = use_pat;
!       break;
!     case SET:
!       if (MEM_P (SET_DEST (use_pat)))
! 	{
! 	  loc = &SET_DEST (use_pat);
! 	  new_rtx = propagate_rtx (*loc, GET_MODE (*loc), reg, src, speed_p);
! 	  if (new_rtx)
! 	    validate_unshare_change (use_insn, loc, new_rtx, true);
! 	}
!       asm_operands = SET_SRC (use_pat);
!       break;
!     case PARALLEL:
!       for (i = 0; i < XVECLEN (use_pat, 0); i++)
! 	if (GET_CODE (XVECEXP (use_pat, 0, i)) == SET)
! 	  {
! 	    if (MEM_P (SET_DEST (XVECEXP (use_pat, 0, i))))
! 	      {
! 		loc = &SET_DEST (XVECEXP (use_pat, 0, i));
! 		new_rtx = propagate_rtx (*loc, GET_MODE (*loc), reg,
! 					 src, speed_p);
! 		if (new_rtx)
! 		  validate_unshare_change (use_insn, loc, new_rtx, true);
! 	      }
! 	    asm_operands = SET_SRC (XVECEXP (use_pat, 0, i));
! 	  }
! 	else if (GET_CODE (XVECEXP (use_pat, 0, i)) == ASM_OPERANDS)
! 	  asm_operands = XVECEXP (use_pat, 0, i);
!       break;
!     default:
!       gcc_unreachable ();
!     }
  
!   gcc_assert (asm_operands && GET_CODE (asm_operands) == ASM_OPERANDS);
!   for (i = 0; i < ASM_OPERANDS_INPUT_LENGTH (asm_operands); i++)
!     {
!       loc = &ASM_OPERANDS_INPUT (asm_operands, i);
!       new_rtx = propagate_rtx (*loc, GET_MODE (*loc), reg, src, speed_p);
!       if (new_rtx)
! 	validate_unshare_change (use_insn, loc, new_rtx, true);
!     }
  
!   if (num_changes_pending () == 0 || !apply_change_group ())
!     return false;
  
!   update_df (use_insn, NULL);
!   num_changes++;
!   return true;
  }
  
! /* Try to replace USE with SRC (defined in DEF_INSN) and simplify the
!    result.  */
  
  static bool
! forward_propagate_and_simplify (df_ref use, rtx_insn *def_insn, rtx def_set)
  {
!   rtx_insn *use_insn = DF_REF_INSN (use);
!   rtx use_set = single_set (use_insn);
!   rtx src, reg, new_rtx, *loc;
!   bool set_reg_equal;
!   machine_mode mode;
!   int asm_use = -1;
! 
!   if (INSN_CODE (use_insn) < 0)
!     asm_use = asm_noperands (PATTERN (use_insn));
  
!   if (!use_set && asm_use < 0 && !DEBUG_INSN_P (use_insn))
      return false;
  
!   /* Do not propagate into PC, CC0, etc.  */
    if (use_set && GET_MODE (SET_DEST (use_set)) == VOIDmode)
      return false;
  
!   /* If def and use are subreg, check if they match.  */
!   reg = DF_REF_REG (use);
!   if (GET_CODE (reg) == SUBREG && GET_CODE (SET_DEST (def_set)) == SUBREG)
!     {
!       if (maybe_ne (SUBREG_BYTE (SET_DEST (def_set)), SUBREG_BYTE (reg)))
! 	return false;
!     }
!   /* Check if the def had a subreg, but the use has the whole reg.  */
!   else if (REG_P (reg) && GET_CODE (SET_DEST (def_set)) == SUBREG)
!     return false;
!   /* Check if the use has a subreg, but the def had the whole reg.  Unlike the
!      previous case, the optimization is possible and often useful indeed.  */
!   else if (GET_CODE (reg) == SUBREG && REG_P (SET_DEST (def_set)))
!     reg = SUBREG_REG (reg);
! 
!   /* Make sure that we can treat REG as having the same mode as the
!      source of DEF_SET.  */
!   if (GET_MODE (SET_DEST (def_set)) != GET_MODE (reg))
!     return false;
! 
!   /* Check if the substitution is valid (last, because it's the most
!      expensive check!).  */
!   src = SET_SRC (def_set);
!   if (!CONSTANT_P (src) && !all_uses_available_at (def_insn, use_insn))
      return false;
  
    /* Check if the def is loading something from the constant pool; in this
--- 707,779 ----
  	  && GET_MODE (XEXP (src, 0)) == use_mode
  	  && !free_load_extend (src, def_insn)
  	  && (targetm.mode_rep_extended (int_use_mode, src_mode)
! 	      != (int) GET_CODE (src)))
! 	return try_fwprop_subst (use, def_insn, loc, use_reg, XEXP (src, 0));
      }
  
    return false;
  }
  
! /* Try to substitute (set DEST SRC) from DEF_INSN into USE and simplify
!    the result, handling cases where DEST is used in a subreg and where
!    applying that subreg to SRC results in a useful simplification.  */
  
  static bool
! forward_propagate_subreg (use_info *use, insn_info *def_insn,
! 			  rtx dest, rtx src)
  {
!   if (!use->includes_subregs () || !REG_P (dest))
!     return false;
  
!   if (GET_CODE (src) != SUBREG
!       && GET_CODE (src) != ZERO_EXTEND
!       && GET_CODE (src) != SIGN_EXTEND)
      return false;
  
!   rtx_insn *use_rtl = use->insn ()->rtl ();
!   df_ref ref;
  
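!   /* Try the substitution on each reference to the register in the use
!      instruction, both in the pattern and in equivalence notes.  */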
!   FOR_EACH_INSN_USE (ref, use_rtl)
!     if (DF_REF_REGNO (ref) == use->regno ()
! 	&& forward_propagate_subreg (use, def_insn, dest, src, ref))
!       return true;
  
!   FOR_EACH_INSN_EQ_USE (ref, use_rtl)
!     if (DF_REF_REGNO (ref) == use->regno ()
! 	&& forward_propagate_subreg (use, def_insn, dest, src, ref))
!       return true;
  
!   return false;
  }
  
! /* Try to substitute (set DEST SRC) from DEF_INSN into USE and
!    simplify the result.  */
  
  static bool
! forward_propagate_and_simplify (use_info *use, insn_info *def_insn,
! 				rtx dest, rtx src)
  {
!   insn_info *use_insn = use->insn ();
!   rtx_insn *use_rtl = use_insn->rtl ();
  
!   /* ??? This check seems unnecessary.  We should be able to propagate
!      into any kind of instruction, regardless of whether it's a single set.
!      It seems odd to be more permissive with asms than normal instructions.  */
!   bool need_single_set = (!use_insn->is_asm () && !use_insn->is_debug_insn ());
!   rtx use_set = single_set (use_rtl);
!   if (need_single_set && !use_set)
      return false;
  
!   /* Do not propagate into PC, CC0, etc.
! 
!      ??? This too seems unnecessary.  The current code should work correctly
!      without it, including cases where jumps become unconditional.  */
    if (use_set && GET_MODE (SET_DEST (use_set)) == VOIDmode)
      return false;
  
!   /* Don't substitute into an __asm if SRC might need more registers
!      than DEST does, as that could increase register pressure on the
!      __asm.  */
!   if (use_insn->is_asm () && def_insn->uses ().size () > 1)
      return false;
  
    /* Check if the def is loading something from the constant pool; in this
***************
*** 1357,1505 ****
    if (MEM_P (src) && MEM_READONLY_P (src))
      {
        rtx x = avoid_constant_pool_reference (src);
!       if (x != src && use_set)
  	{
!           rtx note = find_reg_note (use_insn, REG_EQUAL, NULL_RTX);
! 	  rtx old_rtx = note ? XEXP (note, 0) : SET_SRC (use_set);
  	  rtx new_rtx = simplify_replace_rtx (old_rtx, src, x);
  	  if (old_rtx != new_rtx)
!             set_unique_reg_note (use_insn, REG_EQUAL, copy_rtx (new_rtx));
  	}
        return false;
      }
  
!   if (asm_use >= 0)
!     return forward_propagate_asm (use, def_insn, def_set, reg);
! 
!   /* Else try simplifying.  */
! 
!   if (DF_REF_TYPE (use) == DF_REF_REG_MEM_STORE)
!     {
!       loc = &SET_DEST (use_set);
!       set_reg_equal = false;
!     }
!   else if (!use_set)
!     {
!       loc = &INSN_VAR_LOCATION_LOC (use_insn);
!       set_reg_equal = false;
!     }
!   else
!     {
!       rtx note = find_reg_note (use_insn, REG_EQUAL, NULL_RTX);
!       if (DF_REF_FLAGS (use) & DF_REF_IN_NOTE)
! 	loc = &XEXP (note, 0);
!       else
! 	loc = &SET_SRC (use_set);
! 
!       /* Do not replace an existing REG_EQUAL note if the insn is not
! 	 recognized.  Either we're already replacing in the note, or we'll
! 	 separately try plugging the definition in the note and simplifying.
! 	 And only install a REG_EQUAL note when the destination is a REG
! 	 that isn't mentioned in USE_SET, as the note would be invalid
! 	 otherwise.  We also don't want to install a note if we are merely
! 	 propagating a pseudo since verifying that this pseudo isn't dead
! 	 is a pain; moreover such a note won't help anything.
! 	 If the use is a paradoxical subreg, make sure we don't add a
! 	 REG_EQUAL note for it, because it is not equivalent, it is one
! 	 possible value for it, but we can't rely on it holding that value.
! 	 See PR70574.  */
!       set_reg_equal = (note == NULL_RTX
! 		       && REG_P (SET_DEST (use_set))
! 		       && !REG_P (src)
! 		       && !(GET_CODE (src) == SUBREG
! 			    && REG_P (SUBREG_REG (src)))
! 		       && !reg_mentioned_p (SET_DEST (use_set),
! 					    SET_SRC (use_set))
! 		       && !paradoxical_subreg_p (DF_REF_REG (use)));
!     }
! 
!   if (GET_MODE (*loc) == VOIDmode)
!     mode = GET_MODE (SET_DEST (use_set));
!   else
!     mode = GET_MODE (*loc);
! 
!   new_rtx = propagate_rtx (*loc, mode, reg, src,
!   			   optimize_bb_for_speed_p (BLOCK_FOR_INSN (use_insn)));
! 
!   if (!new_rtx)
!     return false;
! 
!   return try_fwprop_subst (use, loc, new_rtx, def_insn, set_reg_equal);
  }
  
- 
  /* Given a use USE of an insn, if it has a single reaching
     definition, try to forward propagate it into that insn.
!    Return true if cfg cleanup will be needed.
     REG_PROP_ONLY is true if we should only propagate register copies.  */
  
  static bool
! forward_propagate_into (df_ref use, bool reg_prop_only = false)
  {
!   df_ref def;
!   rtx_insn *def_insn, *use_insn;
!   rtx def_set;
!   rtx parent;
! 
!   if (DF_REF_FLAGS (use) & DF_REF_READ_WRITE)
!     return false;
!   if (DF_REF_IS_ARTIFICIAL (use))
      return false;
  
!   /* Only consider uses that have a single definition.  */
!   def = get_def_for_use (use);
    if (!def)
      return false;
-   if (DF_REF_FLAGS (def) & DF_REF_READ_WRITE)
-     return false;
-   if (DF_REF_IS_ARTIFICIAL (def))
-     return false;
  
!   /* Check if the use is still present in the insn!  */
!   use_insn = DF_REF_INSN (use);
!   if (DF_REF_FLAGS (use) & DF_REF_IN_NOTE)
!     parent = find_reg_note (use_insn, REG_EQUAL, NULL_RTX);
!   else
!     parent = PATTERN (use_insn);
  
!   if (!reg_mentioned_p (DF_REF_REG (use), parent))
      return false;
  
!   def_insn = DF_REF_INSN (def);
!   if (multiple_sets (def_insn))
      return false;
!   def_set = single_set (def_insn);
    if (!def_set)
      return false;
  
!   if (reg_prop_only
!       && (!reg_single_def_p (SET_SRC (def_set))
! 	  || !reg_single_def_p (SET_DEST (def_set))))
!     return false;
  
    /* Allow propagations into a loop only for reg-to-reg copies, since
       replacing one register by another shouldn't increase the cost.  */
  
!   if (DF_REF_BB (def)->loop_father != DF_REF_BB (use)->loop_father
!       && (!reg_single_def_p (SET_SRC (def_set))
! 	  || !reg_single_def_p (SET_DEST (def_set))))
      return false;
  
!   /* Only try one kind of propagation.  If two are possible, we'll
!      do it on the following iterations.  */
!   if (forward_propagate_and_simplify (use, def_insn, def_set)
!       || forward_propagate_subreg (use, def_insn, def_set))
!     {
!       propagations_left--;
  
-       if (cfun->can_throw_non_call_exceptions
- 	  && find_reg_note (use_insn, REG_EH_REGION, NULL_RTX)
- 	  && purge_dead_edges (DF_REF_BB (use)))
- 	return true;
-     }
    return false;
  }
- 
  \f
  static void
  fwprop_init (void)
--- 782,876 ----
    if (MEM_P (src) && MEM_READONLY_P (src))
      {
        rtx x = avoid_constant_pool_reference (src);
!       rtx note_set;
!       if (x != src
! 	  && (note_set = set_for_reg_notes (use_rtl))
! 	  && REG_P (SET_DEST (note_set))
! 	  && !contains_paradoxical_subreg_p (SET_SRC (note_set)))
  	{
! 	  rtx note = find_reg_note (use_rtl, REG_EQUAL, NULL_RTX);
! 	  rtx old_rtx = note ? XEXP (note, 0) : SET_SRC (note_set);
  	  rtx new_rtx = simplify_replace_rtx (old_rtx, src, x);
  	  if (old_rtx != new_rtx)
! 	    set_unique_reg_note (use_rtl, REG_EQUAL, copy_rtx (new_rtx));
  	}
        return false;
      }
  
!   /* ??? Unconditionally propagating into PATTERN would work better
!      for instructions that have match_dups.  */
!   rtx *loc = need_single_set ? &use_set : &PATTERN (use_rtl);
!   return try_fwprop_subst (use, def_insn, loc, dest, src);
  }
  
  /* Given a use USE of an insn, if it has a single reaching
     definition, try to forward propagate it into that insn.
!    Return true if something changed.
! 
     REG_PROP_ONLY is true if we should only propagate register copies.  */
  
  static bool
! forward_propagate_into (use_info *use, bool reg_prop_only = false)
  {
!   if (use->includes_read_writes ())
      return false;
  
!   /* Disregard uninitialized uses.  */
!   def_info *def = use->def ();
    if (!def)
      return false;
  
!   /* Only consider single-register definitions.  This could be relaxed,
!      but it should rarely be needed before RA.  */
!   def = look_through_degenerate_phi (def);
!   if (def->includes_multiregs ())
!     return false;
  
!   /* Only consider uses whose definition comes from a real instruction.  */
!   insn_info *def_insn = def->insn ();
!   if (def_insn->is_artificial ())
      return false;
  
!   rtx_insn *def_rtl = def_insn->rtl ();
!   if (!NONJUMP_INSN_P (def_rtl))
      return false;
!   /* ??? This seems an unnecessary restriction.  We can easily tell
!      which set the definition comes from.  */
!   if (multiple_sets (def_rtl))
!     return false;
!   rtx def_set = simple_regno_set (PATTERN (def_rtl), def->regno ());
    if (!def_set)
      return false;
  
!   rtx dest = SET_DEST (def_set);
!   rtx src = SET_SRC (def_set);
  
    /* Allow propagations into a loop only for reg-to-reg copies, since
       replacing one register by another shouldn't increase the cost.  */
+   struct loop *def_loop = def_insn->bb ()->cfg_bb ()->loop_father;
+   struct loop *use_loop = use->bb ()->cfg_bb ()->loop_father;
+   if ((reg_prop_only || def_loop != use_loop)
+       && (!reg_single_def_p (dest) || !reg_single_def_p (src)))
+     return false;
  
!   /* Don't substitute into a non-local goto, this confuses CFG.  */
!   insn_info *use_insn = use->insn ();
!   rtx_insn *use_rtl = use_insn->rtl ();
!   if (JUMP_P (use_rtl)
!       && find_reg_note (use_rtl, REG_NON_LOCAL_GOTO, NULL_RTX))
      return false;
  
!   /* Don't replace register asms in asm statements; we mustn't
!      change the user's register allocation.  */
!   if (use_insn->is_asm () && register_asm_p (dest))
!     return false;
! 
!   if (forward_propagate_and_simplify (use, def_insn, dest, src)
!       || forward_propagate_subreg (use, def_insn, dest, src))
!     return true;
  
    return false;
  }
  \f
  static void
  fwprop_init (void)
***************
*** 1513,1526 ****
       build_single_def_use_links.  */
    loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
  
!   build_single_def_use_links ();
!   df_set_flags (DF_DEFER_INSN_RESCAN);
! 
!   active_defs = XNEWVEC (df_ref, max_reg_num ());
!   if (flag_checking)
!     active_defs_check = sparseset_alloc (max_reg_num ());
! 
!   propagations_left = DF_USES_TABLE_SIZE ();
  }
  
  static void
--- 884,897 ----
       build_single_def_use_links.  */
    loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
  
! #if ADD_NOTES
!   /* Not necessary with the SSA version, just makes comparing the dumps
!      easier.  */
!   df_set_flags (DF_EQ_NOTES);
!   df_note_add_problem ();
! #endif
!   df_analyze ();
!   crtl->ssa = new rtl_ssa::function_info (cfun);
  }
  
  static void
***************
*** 1528,1540 ****
  {
    loop_optimizer_finalize ();
  
!   use_def_ref.release ();
!   free (active_defs);
!   if (flag_checking)
!     sparseset_free (active_defs_check);
! 
    free_dominance_info (CDI_DOMINATORS);
    cleanup_cfg (0);
    delete_trivially_dead_insns (get_insns (), max_reg_num ());
  
    if (dump_file)
--- 899,911 ----
  {
    loop_optimizer_finalize ();
  
!   crtl->ssa->perform_pending_updates ();
    free_dominance_info (CDI_DOMINATORS);
    cleanup_cfg (0);
+ 
+   delete crtl->ssa;
+   crtl->ssa = nullptr;
+ 
    delete_trivially_dead_insns (get_insns (), max_reg_num ());
  
    if (dump_file)
***************
*** 1543,1548 ****
--- 914,954 ----
  	     num_changes);
  }
  
+ /* Try to optimize INSN, returning true if something changes.
+    FWPROP_ADDR_P is true if we are running fwprop_addr rather than
+    the full fwprop.  */
+ 
+ static bool
+ fwprop_insn (insn_info *insn, bool fwprop_addr_p)
+ {
+   for (use_info *use : insn->uses ())
+     {
+       if (use->is_mem ())
+ 	continue;
+       /* ??? The choices here follow those in the pre-SSA code.  */
+       if (!use->includes_address_uses ())
+ 	{
+ 	  if (forward_propagate_into (use, fwprop_addr_p))
+ 	    return true;
+ 	}
+       else
+ 	{
+ 	  struct loop *loop = insn->bb ()->cfg_bb ()->loop_father;
+ 	  /* The outermost loop is not really a loop.  */
+ 	  if (loop == NULL || loop_outer (loop) == NULL)
+ 	    {
+ 	      if (forward_propagate_into (use, fwprop_addr_p))
+ 		return true;
+ 	    }
+ 	  else if (fwprop_addr_p)
+ 	    {
+ 	      if (forward_propagate_into (use, false))
+ 		return true;
+ 	    }
+ 	}
+     }
+   return false;
+ }
  
  /* Main entry point.  */
  
***************
*** 1555,1587 ****
  static unsigned int
  fwprop (bool fwprop_addr_p)
  {
-   unsigned i;
- 
    fwprop_init ();
  
!   /* Go through all the uses.  df_uses_create will create new ones at the
!      end, and we'll go through them as well.
  
       Do not forward propagate addresses into loops until after unrolling.
       CSE did so because it was able to fix its own mess, but we are not.  */
  
!   for (i = 0; i < DF_USES_TABLE_SIZE (); i++)
!     {
!       if (!propagations_left)
! 	break;
! 
!       df_ref use = DF_USES_GET (i);
!       if (use)
! 	{
! 	  if (DF_REF_TYPE (use) == DF_REF_REG_USE
! 	      || DF_REF_BB (use)->loop_father == NULL
! 	      /* The outermost loop is not really a loop.  */
! 	      || loop_outer (DF_REF_BB (use)->loop_father) == NULL)
! 	    forward_propagate_into (use, fwprop_addr_p);
  
! 	  else if (fwprop_addr_p)
! 	    forward_propagate_into (use, false);
! 	}
      }
  
    fwprop_done ();
--- 961,993 ----
  static unsigned int
  fwprop (bool fwprop_addr_p)
  {
    fwprop_init ();
  
!   /* Go through all the instructions (including debug instructions) looking
!      for uses that we could propagate into.
  
       Do not forward propagate addresses into loops until after unrolling.
       CSE did so because it was able to fix its own mess, but we are not.  */
  
!   insn_info *next;
  
!   /* ??? This code uses a worklist in order to preserve the behavior
!      of the pre-SSA implementation.  It would be better to instead
!      iterate on each instruction until no more propagations are
!      possible, then move on to the next.  */
!   auto_vec<insn_info *> worklist;
!   for (insn_info *insn = crtl->ssa->first_insn (); insn; insn = next)
!     {
!       next = insn->next_any_insn ();
!       if (insn->can_be_optimized () || insn->is_debug_insn ())
! 	if (fwprop_insn (insn, fwprop_addr_p))
! 	  worklist.safe_push (insn);
!     }
!   for (unsigned int i = 0; i < worklist.length (); ++i)
!     {
!       insn_info *insn = worklist[i];
!       if (fwprop_insn (insn, fwprop_addr_p))
! 	worklist.safe_push (insn);
      }
  
    fwprop_done ();
*** /tmp/dfQipa_test-return-const.c.before-fwprop.c	2020-11-13 08:23:52.853409199 +0000
--- gcc/testsuite/gcc.dg/rtl/x86_64/test-return-const.c.before-fwprop.c	2020-11-13 08:05:06.490403698 +0000
***************
*** 31,37 ****
  }
  
  /* Verify that insn 5 is eliminated.  */
! /* { dg-final { scan-rtl-dump "deferring deletion of insn with uid = 5" "fwprop1" } } */
  /* { dg-final { scan-rtl-dump "Deleted 1 trivially dead insns" "fwprop1" } } */
  
  int main (void)
--- 31,37 ----
  }
  
  /* Verify that insn 5 is eliminated.  */
! /* { dg-final { scan-rtl-dump "deleting insn with uid = 5" "fwprop1" } } */
  /* { dg-final { scan-rtl-dump "Deleted 1 trivially dead insns" "fwprop1" } } */
  
  int main (void)
*** /tmp/XS4Rr9_st4_s8.c	2020-11-13 08:23:52.865409146 +0000
--- gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st4_s8.c	2020-11-13 08:05:06.490403698 +0000
***************
*** 74,80 ****
  /*
  ** st4_s8_32:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_s8_32, svint8x4_t, int8_t,
--- 74,80 ----
  /*
  ** st4_s8_32:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+, x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_s8_32, svint8x4_t, int8_t,
***************
*** 135,141 ****
  /*
  ** st4_s8_m36:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_s8_m36, svint8x4_t, int8_t,
--- 135,141 ----
  /*
  ** st4_s8_m36:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+, x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_s8_m36, svint8x4_t, int8_t,
***************
*** 205,211 ****
  /*
  ** st4_vnum_s8_32:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_vnum_s8_32, svint8x4_t, int8_t,
--- 205,211 ----
  /*
  ** st4_vnum_s8_32:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+, x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_vnum_s8_32, svint8x4_t, int8_t,
***************
*** 266,272 ****
  /*
  ** st4_vnum_s8_m36:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_vnum_s8_m36, svint8x4_t, int8_t,
--- 266,272 ----
  /*
  ** st4_vnum_s8_m36:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+, x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_vnum_s8_m36, svint8x4_t, int8_t,
*** /tmp/nepSbd_st4_u8.c	2020-11-13 08:23:52.881409075 +0000
--- gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st4_u8.c	2020-11-13 08:05:06.490403698 +0000
***************
*** 74,80 ****
  /*
  ** st4_u8_32:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_u8_32, svuint8x4_t, uint8_t,
--- 74,80 ----
  /*
  ** st4_u8_32:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+, x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_u8_32, svuint8x4_t, uint8_t,
***************
*** 135,141 ****
  /*
  ** st4_u8_m36:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_u8_m36, svuint8x4_t, uint8_t,
--- 135,141 ----
  /*
  ** st4_u8_m36:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+, x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_u8_m36, svuint8x4_t, uint8_t,
***************
*** 205,211 ****
  /*
  ** st4_vnum_u8_32:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_vnum_u8_32, svuint8x4_t, uint8_t,
--- 205,211 ----
  /*
  ** st4_vnum_u8_32:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+, x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_vnum_u8_32, svuint8x4_t, uint8_t,
***************
*** 266,272 ****
  /*
  ** st4_vnum_u8_m36:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_vnum_u8_m36, svuint8x4_t, uint8_t,
--- 266,272 ----
  /*
  ** st4_vnum_u8_m36:
  **	[^{]*
! **	st4b	{z0\.b - z3\.b}, p0, \[x[0-9]+, x[0-9]+\]
  **	ret
  */
  TEST_STORE (st4_vnum_u8_m36, svuint8x4_t, uint8_t,

* Re: [00/23] Make fwprop use an on-the-side RTL SSA representation
  2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
                   ` (22 preceding siblings ...)
  2020-11-13  8:24 ` [PATCH 23/23] fwprop: Rewrite to use RTL SSA Richard Sandiford
@ 2020-11-25 19:58 ` Jeff Law
  2020-11-26 16:03   ` Richard Sandiford
  23 siblings, 1 reply; 88+ messages in thread
From: Jeff Law @ 2020-11-25 19:58 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:10 AM, Richard Sandiford via Gcc-patches wrote:
> Just after GCC 10 stage 1 closed (oops), I posted a patch to add a new
> combine pass.  One of its main aims was to allow instructions to move
> around where necessary in order to make a combination possible.
> It also tried to parallelise instructions that use the same resource.
>
> That pass contained its own code for maintaining limited def-use chains.
> When I posted the patch, Segher asked why we wanted yet another piece
> of pass-specific code to do that.  Although I had specific reasons
> (which I explained at the time) I've gradually come round to agreeing
> that that was a flaw.
>
> This series of patches is the result of a Covid-time project to add
> a more general, pass-agnostic framework.  There are two parts:
> adding the framework itself, and using it to make fwprop.c faster.
>
> The framework part
> ------------------
>
> The framework provides an optional, on-the-side SSA view of existing
> RTL instructions.  Each instruction gets a list of definitions and a
> list of uses, with each use having a single definition.  Phi nodes
> handle cases in which there are multiple possible definitions of a
> register on entry to a basic block.  There are also routines for
> updating instructions while keeping the SSA representation intact.
>
> The aim is only to provide a different view of existing RTL instructions.
> Unlike gimple, and unlike (IIRC) the old RTL SSA project from way back,
> the new framework isn't a “native” SSA representation.  This means that
> all inputs to a phi node for a register R are also definitions of
> register R; no move operation is “hidden” in the phi node.
Hmm, I'm trying to parse what the last phrase means.  Does it mean that
the "hidden copy" problem for out-of-ssa is avoided?  And if so, how is
that maintained over time.  Things like copy-prop will tend to introduce
those issues even if they didn't originally exist.

>
> Like gimple, the framework treats memory as a single unified resource.
>
> A more in-depth summary is contained in the doc patch, but some
> other random notes:
>
> * At the moment, the SSA information is local to one pass, but it might
>   be good to maintain it between passes in future.
Right.  I think we can look at the passes near fwprop as good targets
for extending the lifetime over which we have an SSA framework.   I note
CSE is just before the first fwprop and CSE is a hell of a lot easier in
an SSA world :-)  It's unfortunate that there are no DCE passes abutting
fwprop, as DCE is really easy in an SSA world.

>
> * The SSA code groups blocks into extended basic blocks, with the
>   EBBs rather than individual blocks having phi nodes.  
So I haven't looked at the patch, but the usual place to put PHIs is at
the dominance frontier.  But extra PHIs just increase time/memory and
shouldn't affect correctness.

>
> * The framework also provides live range information for registers
>   within an extended basic block and allows instructions to move within
>   their EBB.  It might be useful to allow further movement in future;
>   I just don't have a use case for it yet.
Yup.   You could do something like Click's algorithm to schedule the
instructions in a block to maximize CSE opportunities on top of this.

>
> * One advantage of the new infrastructure is that it gives
>   recog_for_combine-like behaviour: if recog wants to add clobbers
>   of things like the flags register, the SSA code will make sure
>   that the flags register is free.
I look more at the intersection between combine and SSA as an
opportunity to combine on extended blocks, simplify the "does dataflow
allow this combination" logic, drop the need to build/maintain LOG_LINKS
and more generally simplify note distribution.

> * I've tried to optimise the code for both memory footprint and
>   compile time.  The first part involves quite a bit of overloading
>   of pointers and various other kinds of reuse, so most of the new data
>   structures use private member variables and public accessor functions.
>   I know that style isn't universally popular, but I think it's
>   justified here.  Things could easily go wrong if passes tried
>   to operate directly on the underlying data structures.
ACK.

>
> * Debug instructions get SSA information too, on a best-effort basis.
>   Providing complete information would be significantly more expensive.
>
> * I wasn't sure for new C++ code whether to stick to the old C /* … */
>   comments, or whether to switch to //.  In the end I went for //,
>   on the basis that:
>
>   - The ranger code already does this.
>
>   - // is certainly more idiomatic in C++.
>
>   - // is in the lisp tradition of per-line comments and it matches the
>     ;; used in .md files.  I feel sure that GCC would have been written
>     using // from the outset if that had been possible.
I think we're allowing both and realistically /* */ vs // shouldn't be
something we spend a lot of time arguing about :-)


>
>   The patches only do this for new files.  The aim is to ensure that
>   each file is at least self-consistent.
ACK.



Anyway, given that this was posted before the end of stage1, it deserves
consideration for gcc-11.  It's (by far) the largest set in my gcc-11 queue.

jeff


* Re: [01/23] vec: Silence clang warning
  2020-11-13  8:11 ` [01/23] vec: Silence clang warning Richard Sandiford
@ 2020-11-25 19:58   ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-11-25 19:58 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:11 AM, Richard Sandiford via Gcc-patches wrote:
> I noticed during compatibility testing that clang warns that this
> operator won't be implicitly const in C++14 onwards.
>
> gcc/
> 	* vec.h (vnull::operator vec<T, A, L>): Make const.
OK
jeff


* Re: [02/23] rtlanal: Remove noop_move_p REG_EQUAL condition
  2020-11-13  8:12 ` [02/23] rtlanal: Remove noop_move_p REG_EQUAL condition Richard Sandiford
@ 2020-11-25 20:00   ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-11-25 20:00 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:12 AM, Richard Sandiford via Gcc-patches wrote:
> noop_move_p currently keeps any instruction that has a REG_EQUAL
> note, on the basis that the equality might be useful in future.
> But this creates a perverse incentive not to add potentially-useful
> REG_EQUAL notes, in case they prevent an instruction from later being
> removed as dead.
>
> The condition originates from flow.c:life_analysis_1 and predates
> the changes tracked by the current repository (1992).  It probably
> made sense when most optimisations were done on RTL rather than FE
> trees, but it seems counterproductive now.
>
> gcc/
> 	* rtlanal.c (noop_move_p): Don't check for REG_EQUAL notes.
I would  guess this was primarily for the old libcall mechanism where
we'd have a self-copy at the end of the sequence with a REG_EQUAL note
for the expression's natural form.  All that's been broken for a long
time.  So I'm not going to lose any sleep if we want to remove this
little chunk of code.

OK

jeff


* Re: [03/23] reginfo: Add a global_reg_set
  2020-11-13  8:12 ` [03/23] reginfo: Add a global_reg_set Richard Sandiford
@ 2020-11-25 20:01   ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-11-25 20:01 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:12 AM, Richard Sandiford via Gcc-patches wrote:
> A later patch wants to use the set of global registers as a HARD_REG_SET
> rather than a bool/char array.  Most other arrays already have a
> HARD_REG_SET counterpart, but this one didn't.
>
> gcc/
> 	* hard-reg-set.h (global_reg_set): Declare.
> 	* reginfo.c (global_reg_set): New variable.
> 	(init_reg_sets_1, globalize_reg): Update it when globalizing
> 	registers.
OK
jeff


* Re: [04/23] Move iterator_range to a new iterator-utils.h file
  2020-11-13  8:13 ` [04/23] Move iterator_range to a new iterator-utils.h file Richard Sandiford
@ 2020-11-25 20:02   ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-11-25 20:02 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:13 AM, Richard Sandiford via Gcc-patches wrote:
> A later patch will add more iterator-related utilities.  Rather than
> putting them all directly in coretypes.h, it seemed better to add a
> new header file, here called "iterator-utils.h".  This preliminary
> patch moves the existing iterator_range class there too.
>
> I used the same copyright date range as coretypes.h “just to be sure”.
>
> gcc/
> 	* coretypes.h (iterator_range): Move to...
> 	* iterator-utils.h: ...this new file.
OK
jeff


* Re: [05/23] Add more iterator utilities
  2020-11-13  8:13 ` [05/23] Add more iterator utilities Richard Sandiford
@ 2020-11-25 20:12   ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-11-25 20:12 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:13 AM, Richard Sandiford via Gcc-patches wrote:
> This patch adds some more iterator helper classes.  They really fall
> into two groups, but there didn't seem much value in separating them:
>
> - A later patch has a class hierarchy of the form:
>
>      Base
>       +- Derived1
>       +- Derived2
>
>   A class wants to store an array A1 of Derived1 pointers and an
>   array A2 of Derived2 pointers.  However, for compactness reasons,
>   it was convenient to have a single array of Base pointers,
>   with A1 and A2 being slices of this array.  This reduces the
>   overhead from two pointers and two ints (3 LP64 words) to one
>   pointer and two ints (2 LP64 words).
>
>   But consumers of the class shouldn't be aware of this: they should
>   see A1 as containing Derived1 pointers rather than Base pointers
>   and A2 as containing Derived2 pointers rather than Base pointers.
>   This patch adds derived_iterator and const_derived_container
>   classes to support this use case.
>
> - A later patch also adds various linked lists.  This patch adds
>   wrapper_iterator and list_iterator classes to make it easier
>   to create iterators for these linked lists.  For example:
>
>     // Iterators for lists of definitions.
>     using def_iterator = list_iterator<def_info, &def_info::next_def>;
>     using reverse_def_iterator
>       = list_iterator<def_info, &def_info::prev_def>;
>
>   This in turn makes it possible to use range-based for loops
>   on the lists.
>
> The patch just adds the things that the later patches need; it doesn't
> try to make the classes as functionally complete as possible.  I think
> we should add extra functionality when needed rather than ahead of time.
>
> gcc/
> 	* iterator-utils.h (derived_iterator): New class.
> 	(const_derived_container, wrapper_iterator): Likewise.
> 	(list_iterator): Likewise.
OK
jeff


* Re: [06/23] Add an RAII class for managing obstacks
  2020-11-13  8:14 ` [06/23] Add an RAII class for managing obstacks Richard Sandiford
@ 2020-11-25 20:15   ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-11-25 20:15 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:14 AM, Richard Sandiford via Gcc-patches wrote:
> This patch adds an RAII class for managing the lifetimes of objects
> on an obstack.  See the comments in the patch for more details and
> example usage.
>
> gcc/
> 	* obstack-utils.h: New file.
RAII is goodness.  One could argue that most of our obstacks should
probably be converted.


jeff


* Re: [07/23] Add a class that multiplexes two pointer types
  2020-11-13  8:14 ` [07/23] Add a class that multiplexes two pointer types Richard Sandiford
@ 2020-11-25 20:23   ` Jeff Law
  2020-11-26 16:15     ` Richard Sandiford
  2020-11-25 23:33   ` Martin Sebor
  1 sibling, 1 reply; 88+ messages in thread
From: Jeff Law @ 2020-11-25 20:23 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:14 AM, Richard Sandiford via Gcc-patches wrote:
> This patch adds a pointer_mux<T1, T2> class that provides similar
> functionality to:
>
>     union { T1 *a; T2 *b; };
>     ...
>     bool is_b_rather_than_a;
>
> except that the is_b_rather_than_a tag is stored in the low bit
> of the pointer.  See the comments in the patch for a comparison
> between the two approaches and why this one can be more efficient.
>
> I've tried to microoptimise the class a fair bit, since a later
> patch uses it extensively in order to keep the sizes of data
> structures down.
>
> gcc/
> 	* mux-utils.h: New file.
Do we have any potentially bootstrappable targets where we can't
guarantee pointer alignment of at least 16 bits?  I see what look like
suitable asserts, and presumably if we trigger them, then we're going to
need to rethink this and fall back to a separate bit?

jeff


* Re: [12/23] Export print-rtl.c:print_insn_with_notes
  2020-11-13  8:17 ` [12/23] Export print-rtl.c:print_insn_with_notes Richard Sandiford
@ 2020-11-25 20:24   ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-11-25 20:24 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:17 AM, Richard Sandiford via Gcc-patches wrote:
> Later patches want to use print_insn_with_notes (printing to
> a pretty_printer).  This patch exports it from print-rtl.c.
>
> The non-notes version is already public.
>
> gcc/
> 	* print-rtl.h (print_insn_with_notes): Declare.
> 	* print-rtl.c (print_insn_with_notes): Make non-static
OK
jeff


* Re: [13/23] recog: Split out a register_asm_p function
  2020-11-13  8:18 ` [13/23] recog: Split out a register_asm_p function Richard Sandiford
@ 2020-11-25 20:24   ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-11-25 20:24 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:18 AM, Richard Sandiford via Gcc-patches wrote:
> verify_changes has a test for whether a particular hard register
> is a user-defined register asm.  A later patch needs to test the
> same thing, so this patch splits it out into a helper.
>
> gcc/
> 	* rtl.h (register_asm_p): Declare.
> 	* recog.c (verify_changes): Split out the test for whether
> 	a hard register is a register asm to...
> 	(register_asm_p): ...this new function.
OK
jeff


* Re: [16/23] recog: Add a way of temporarily undoing changes
  2020-11-13  8:19 ` [16/23] recog: Add a way of temporarily undoing changes Richard Sandiford
@ 2020-11-25 20:27   ` Jeff Law
  2020-12-17  0:22     ` Richard Sandiford
  0 siblings, 1 reply; 88+ messages in thread
From: Jeff Law @ 2020-11-25 20:27 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:19 AM, Richard Sandiford via Gcc-patches wrote:
> In some cases, it can be convenient to roll back the changes that
> have been made by validate_change to see how things looked before,
> then reroll the changes.  For example, this makes it possible
> to defer calculating the cost of an instruction until we know that
> the result is actually needed.  It can also make dumps easier to read.
>
> This patch adds a couple of helper functions for doing that.
>
> gcc/
> 	* recog.h (temporarily_undo_changes, redo_changes): Declare.
> 	* recog.c (swap_change, temporarily_undo_changes): New functions.
> 	(redo_changes): Likewise.
OK...  But...
> +
> +/* Temporarily undo all the changes numbered NUM and up, with a view
> +   to reapplying them later.  The next call to the changes machinery
> +   must be:
> +
> +      redo_changes (NUM)
> +
> +   otherwise things will end up in an invalid state.  */
It'd be nice if we had state validation in the other routines. Somebody
is likely to mess this up at some point...


jeff



* Re: [18/23] recog: Add an RAII class for undoing insn changes
  2020-11-13  8:20 ` [18/23] recog: Add an RAII class for undoing insn changes Richard Sandiford
@ 2020-11-25 20:27   ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-11-25 20:27 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:20 AM, Richard Sandiford via Gcc-patches wrote:
> When using validate_change to make a group of changes, you have
> to remember to cancel them if something goes wrong.  This patch
> adds an RAII class to make that easier.  See the comments in the
> patch for details and examples.
>
> gcc/
> 	* recog.h (insn_change_watermark): New class.
Ah, funny, I nearly suggested this with the temporary undo thingie.

OK
jeff


* Re: [20/23] rtlanal: Add simple_regno_set
  2020-11-13  8:21 ` [20/23] rtlanal: Add simple_regno_set Richard Sandiford
@ 2020-11-25 20:31   ` Jeff Law
  2020-12-17  0:47     ` Richard Sandiford
  0 siblings, 1 reply; 88+ messages in thread
From: Jeff Law @ 2020-11-25 20:31 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:21 AM, Richard Sandiford via Gcc-patches wrote:
> This patch adds a routine for finding a “simple” SET for a register
> definition.  See the comment in the patch for details.
>
> gcc/
> 	* rtl.h (simple_regno_set): Declare.
> 	* rtlanal.c (simple_regno_set): New function.
So I was a bit confused that this is supposed to reject read-write, but
what it's really rejecting is a narrow subset of read-write.  In
particular it rejects things that are potentially RMW via subregs. It
doesn't prevent the destination from appearing as a source operand.  You
might consider clarifying the comment.

OK

jeff


* Re: [07/23] Add a class that multiplexes two pointer types
  2020-11-13  8:14 ` [07/23] Add a class that multiplexes two pointer types Richard Sandiford
  2020-11-25 20:23   ` Jeff Law
@ 2020-11-25 23:33   ` Martin Sebor
  2020-11-26 17:06     ` Richard Sandiford
  1 sibling, 1 reply; 88+ messages in thread
From: Martin Sebor @ 2020-11-25 23:33 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 11/13/20 1:14 AM, Richard Sandiford via Gcc-patches wrote:
> This patch adds a pointer_mux<T1, T2> class that provides similar
> functionality to:
> 
>      union { T1 *a; T2 *b; };
>      ...
>      bool is_b_rather_than_a;
> 
> except that the is_b_rather_than_a tag is stored in the low bit
> of the pointer.  See the comments in the patch for a comparison
> between the two approaches and why this one can be more efficient.
> 
> I've tried to microoptimise the class a fair bit, since a later
> patch uses it extensively in order to keep the sizes of data
> structures down.

I've been reading these changes more out of interest than to
provide comments.
the language very well.  I also appreciate the extensive
commentary.  It makes understanding the code (and the changes)
much easier.  Thank you for doing that!  We should all aspire
to follow your example! :)

I do have one concern: the tendency to prioritize efficiency
over safety (this can be said about most GCC code). Specifically
in this class, the address bit twiddling makes me uneasy.  I don't
think the object model in either language (certainly not C, and
I don't have the impression C++ does either) makes it unequivocally
valid.  On the contrary, I'd say many of us interpret the current
rules as leaving it undefined.  There are efforts to sanction
this sort of thing under some conditions (e.g, the C object
model proposal) but they have not been adopted yet.  I think
we should try to avoid exploiting these dark corners in new
code.

I'm not too concerned that it will break with some compilers
(it might, but code like this is out there already and works).
What I worry about is that it will either prevent, or make much more
difficult, any access checking that might otherwise be possible.
I also worry that it will encourage people who look to GCC code
for examples to duplicate these tricks in their own code, making
it in turn harder for us to help them detect bugs in it.

Having said that, I looked for tests that verify this new utility
class (and the others in this series), partly to get a better
idea of how it's meant to be used.  I couldn't find any.  I'd
expect every nontrivial, general-purpose utility class to come
with tests.  (Having a library of these components might make
testing easier.)
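
For instance, I was imagining something as simple as this (a sketch
only, using the selftest framework):

  // Hypothetical selftest: round-trip both alternatives of a mux.
  static void
  test_pointer_mux ()
  {
    int i;
    float f;
    pointer_mux<int, float> m1 (&i), m2 (&f);
    ASSERT_TRUE (m1.is_first ());
    ASSERT_TRUE (m2.is_second ());
    ASSERT_EQ (m1.known_first (), &i);
    ASSERT_EQ (m2.known_second (), &f);
  }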

Martin

> 
> gcc/
> 	* mux-utils.h: New file.
> ---
>   gcc/mux-utils.h | 248 ++++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 248 insertions(+)
>   create mode 100644 gcc/mux-utils.h
> 
> diff --git a/gcc/mux-utils.h b/gcc/mux-utils.h
> new file mode 100644
> index 00000000000..17ced49cd22
> --- /dev/null
> +++ b/gcc/mux-utils.h
> @@ -0,0 +1,248 @@
> +// Multiplexer utilities
> +// Copyright (C) 2020 Free Software Foundation, Inc.
> +//
> +// This file is part of GCC.
> +//
> +// GCC is free software; you can redistribute it and/or modify it under
> +// the terms of the GNU General Public License as published by the Free
> +// Software Foundation; either version 3, or (at your option) any later
> +// version.
> +//
> +// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +// WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +// for more details.
> +//
> +// You should have received a copy of the GNU General Public License
> +// along with GCC; see the file COPYING3.  If not see
> +// <http://www.gnu.org/licenses/>.
> +
> +#ifndef GCC_MUX_UTILS_H
> +#define GCC_MUX_UTILS_H 1
> +
> +// A class that stores a choice "A or B", where A has type T1 * and B has
> +// type T2 *.  Both T1 and T2 must have an alignment greater than 1, since
> +// the low bit is used to identify B over A.  T1 and T2 can be the same.
> +//
> +// A can be a null pointer but B cannot.
> +//
> +// Barring the requirement that B must be nonnull, using the class is
> +// equivalent to using:
> +//
> +//     union { T1 *A; T2 *B; };
> +//
> +// and having a separate tag bit to indicate which alternative is active.
> +// However, using this class can have two advantages over a union:
> +//
> +// - It avoids the need to find somewhere to store the tag bit.
> +//
> +// - The compiler is aware that B cannot be null, which can make checks
> +//   of the form:
> +//
> +//       if (auto *B = mux.dyn_cast<T2 *> ())
> +//
> +//   more efficient.  With a union-based representation, the dyn_cast
> +//   check could fail either because MUX is an A or because MUX is a
> +//   null B, both of which require a run-time test.  With a pointer_mux,
> +//   only a check for MUX being A is needed.
> +template<typename T1, typename T2 = T1>
> +class pointer_mux
> +{
> +public:
> +  // Return an A pointer with the given value.
> +  static pointer_mux first (T1 *);
> +
> +  // Return a B pointer with the given (nonnull) value.
> +  static pointer_mux second (T2 *);
> +
> +  pointer_mux () = default;
> +
> +  // Create a null A pointer.
> +  pointer_mux (std::nullptr_t) : m_ptr (nullptr) {}
> +
> +  // Create an A or B pointer with the given value.  This is only valid
> +  // if T1 and T2 are distinct and if T can be resolved to exactly one
> +  // of them.
> +  template<typename T,
> +	   typename Enable = typename
> +	     std::enable_if<std::is_convertible<T *, T1 *>::value
> +			    != std::is_convertible<T *, T2 *>::value>::type>
> +  pointer_mux (T *ptr);
> +
> +  // Return true unless the pointer is a null A pointer.
> +  explicit operator bool () const { return m_ptr; }
> +
> +  // Assign A and B pointers respectively.
> +  void set_first (T1 *ptr) { *this = first (ptr); }
> +  void set_second (T2 *ptr) { *this = second (ptr); }
> +
> +  // Return true if the pointer is an A pointer.
> +  bool is_first () const { return !(uintptr_t (m_ptr) & 1); }
> +
> +  // Return true if the pointer is a B pointer.
> +  bool is_second () const { return uintptr_t (m_ptr) & 1; }
> +
> +  // Return the contents of the pointer, given that it is known to be
> +  // an A pointer.
> +  T1 *known_first () const { return reinterpret_cast<T1 *> (m_ptr); }
> +
> +  // Return the contents of the pointer, given that it is known to be
> +  // a B pointer.
> +  T2 *known_second () const { return reinterpret_cast<T2 *> (m_ptr - 1); }
> +
> +  // If the pointer is an A pointer, return its contents, otherwise
> +  // return null.  Thus a null return can mean that the pointer is
> +  // either a null A pointer or a B pointer.
> +  //
> +  // If all A pointers are nonnull, it is more efficient to use:
> +  //
> +  //    if (ptr.is_first ())
> +  //      ...use ptr.known_first ()...
> +  //
> +  // over:
> +  //
> +  //    if (T1 *a = ptr.first_or_null ())
> +  //      ...use a...
> +  T1 *first_or_null () const;
> +
> +  // If the pointer is a B pointer, return its contents, otherwise
> +  // return null.  Using:
> +  //
> +  //    if (T1 *b = ptr.second_or_null ())
> +  //      ...use b...
> +  //
> +  // should be at least as efficient as:
> +  //
> +  //    if (ptr.is_second ())
> +  //      ...use ptr.known_second ()...
> +  T2 *second_or_null () const;
> +
> +  // Return true if the pointer is a T.
> +  //
> +  // This is only valid if T1 and T2 are distinct and if T can be
> +  // resolved to exactly one of them.  The condition is checked using
> +  // a static assertion rather than SFINAE because it gives a clearer
> +  // error message.
> +  template<typename T>
> +  bool is_a () const;
> +
> +  // Assert that the pointer is a T and return it as such.  See is_a
> +  // for the restrictions on T.
> +  template<typename T>
> +  T as_a () const;
> +
> +  // If the pointer is a T, return it as such, otherwise return null.
> +  // See is_a for the restrictions on T.
> +  template<typename T>
> +  T dyn_cast () const;
> +
> +private:
> +  pointer_mux (char *ptr) : m_ptr (ptr) {}
> +
> +  // The pointer value for A pointers, or the pointer value + 1 for B pointers.
> +  // Using a pointer rather than a uintptr_t tells the compiler that second ()
> +  // can never return null, and that second_or_null () is only null if
> +  // is_first ().
> +  char *m_ptr;
> +};
> +
> +template<typename T1, typename T2>
> +inline pointer_mux<T1, T2>
> +pointer_mux<T1, T2>::first (T1 *ptr)
> +{
> +  gcc_checking_assert (!(uintptr_t (ptr) & 1));
> +  return reinterpret_cast<char *> (ptr);
> +}
> +
> +template<typename T1, typename T2>
> +inline pointer_mux<T1, T2>
> +pointer_mux<T1, T2>::second (T2 *ptr)
> +{
> +  gcc_checking_assert (!(uintptr_t (ptr) & 1));
> +  return reinterpret_cast<char *> (ptr) + 1;
> +}
> +
> +template<typename T1, typename T2>
> +template<typename T, typename Enable>
> +inline pointer_mux<T1, T2>::pointer_mux (T *ptr)
> +  : m_ptr (reinterpret_cast<char *> (ptr))
> +{
> +  if (std::is_convertible<T *, T2 *>::value)
> +    m_ptr += 1;
> +}
> +
> +template<typename T1, typename T2>
> +inline T1 *
> +pointer_mux<T1, T2>::first_or_null () const
> +{
> +  return is_first () ? known_first () : nullptr;
> +}
> +
> +template<typename T1, typename T2>
> +inline T2 *
> +pointer_mux<T1, T2>::second_or_null () const
> +{
> +  // Micro optimization that's effective as of GCC 11: compute the value
> +  // of the second pointer as an integer and test that, so that the integer
> +  // result can be reused as the pointer and so that all computation can
> +  // happen before a branch on null.  This reduces the number of branches
> +  // needed for loops.
> +  return uintptr_t (m_ptr - 1) & 1 ? nullptr : known_second ();
> +}
> +
> +template<typename T1, typename T2>
> +template<typename T>
> +inline bool
> +pointer_mux<T1, T2>::is_a () const
> +{
> +  static_assert (std::is_convertible<T1 *, T>::value
> +		 != std::is_convertible<T2 *, T>::value,
> +		 "Ambiguous pointer type");
> +  if (std::is_convertible<T2 *, T>::value)
> +    return is_second ();
> +  else
> +    return is_first ();
> +}
> +
> +template<typename T1, typename T2>
> +template<typename T>
> +inline T
> +pointer_mux<T1, T2>::as_a () const
> +{
> +  static_assert (std::is_convertible<T1 *, T>::value
> +		 != std::is_convertible<T2 *, T>::value,
> +		 "Ambiguous pointer type");
> +  if (std::is_convertible<T2 *, T>::value)
> +    {
> +      gcc_checking_assert (is_second ());
> +      return reinterpret_cast<T> (m_ptr - 1);
> +    }
> +  else
> +    {
> +      gcc_checking_assert (is_first ());
> +      return reinterpret_cast<T> (m_ptr);
> +    }
> +}
> +
> +template<typename T1, typename T2>
> +template<typename T>
> +inline T
> +pointer_mux<T1, T2>::dyn_cast () const
> +{
> +  static_assert (std::is_convertible<T1 *, T>::value
> +		 != std::is_convertible<T2 *, T>::value,
> +		 "Ambiguous pointer type");
> +  if (std::is_convertible<T2 *, T>::value)
> +    {
> +      if (is_second ())
> +	return reinterpret_cast<T> (m_ptr - 1);
> +    }
> +  else
> +    {
> +      if (is_first ())
> +	return reinterpret_cast<T> (m_ptr);
> +    }
> +  return nullptr;
> +}
> +
> +#endif
> 


* Re: [00/23] Make fwprop use an on-the-side RTL SSA representation
  2020-11-25 19:58 ` [00/23] Make fwprop use an on-the-side RTL SSA representation Jeff Law
@ 2020-11-26 16:03   ` Richard Sandiford
  2020-11-27 15:56     ` Michael Matz
  2020-11-30  6:45     ` Jeff Law
  0 siblings, 2 replies; 88+ messages in thread
From: Richard Sandiford @ 2020-11-26 16:03 UTC (permalink / raw)
  To: Jeff Law via Gcc-patches

Thanks for the reviews.

Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On 11/13/20 1:10 AM, Richard Sandiford via Gcc-patches wrote:
>> Just after GCC 10 stage 1 closed (oops), I posted a patch to add a new
>> combine pass.  One of its main aims was to allow instructions to move
>> around where necessary in order to make a combination possible.
>> It also tried to parallelise instructions that use the same resource.
>>
>> That pass contained its own code for maintaining limited def-use chains.
>> When I posted the patch, Segher asked why we wanted yet another piece
>> of pass-specific code to do that.  Although I had specific reasons
>> (which I explained at the time) I've gradually come round to agreeing
>> that that was a flaw.
>>
>> This series of patches is the result of a Covid-time project to add
>> a more general, pass-agnostic framework.  There are two parts:
>> adding the framework itself, and using it to make fwprop.c faster.
>>
>> The framework part
>> ------------------
>>
>> The framework provides an optional, on-the-side SSA view of existing
>> RTL instructions.  Each instruction gets a list of definitions and a
>> list of uses, with each use having a single definition.  Phi nodes
>> handle cases in which there are multiple possible definitions of a
>> register on entry to a basic block.  There are also routines for
>> updating instructions while keeping the SSA representation intact.
>>
>> The aim is only to provide a different view of existing RTL instructions.
>> Unlike gimple, and unlike (IIRC) the old RTL SSA project from way back,
>> the new framework isn't a “native” SSA representation.  This means that
>> all inputs to a phi node for a register R are also definitions of
>> register R; no move operation is “hidden” in the phi node.
> Hmm, I'm trying to parse what the last phrase means.  Does it mean that
> the "hidden copy" problem for out-of-ssa is avoided?  And if so, how is
> that maintained over time.  Things like copy-prop will tend to introduce
> those issues even if they didn't originally exist.

Yeah, the phi nodes simply say which definition of register R provides
the value of R on a particular incoming edge.  That definition will
itself be a phi node for R, an artificial definition of R created by DF
(e.g. for incoming function arguments or for EH data registers), or an
actual instruction that sets R.
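
Schematically, each phi input records no more than something like this
(a sketch of the idea, not the real data structures):

  // Each input to a phi for register R pairs an incoming edge with an
  // existing definition of R; no value or move is hidden in the phi,
  // so the SSA view can be dropped without inserting any copies.
  struct phi_input_sketch
  {
    basic_block pred;  // source block of the incoming edge
    def_info *def;     // phi, artificial def, or real insn that sets R
  };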

In other words, the SSA form is a purely on-the-side thing and the
underlying RTL instructions are maintained in the same way as normal.
The SSA form can be deleted at any time without performing a separate
out-of-ssa step.  In that respect it's different from cfglayout,
for example.

One of the goals was to allow the SSA form to be used even after RA,
where invisible copies would be more problematic.

>> Like gimple, the framework treats memory as a single unified resource.
>>
>> A more in-depth summary is contained in the doc patch, but some
>> other random notes:
>>
>> * At the moment, the SSA information is local to one pass, but it might
>>   be good to maintain it between passes in future.
> Right.  I think we can look at the passes near fwprop as good targets
> for extending the lifetime over which we have an SSA framework.   I note
> CSE is just before the first fwprop and CSE is a hell of a lot easier in
> an SSA world :-)

Yeah, agree this would be a good approach: start from fwprop and spread
out in both directions.

> It's unfortunate that there are no DCE passes abutting
> fwprop, as DCE is really easy in an SSA world.

fwprop.c calls delete_trivially_dead_insns, so it does some light DCE.
One thing I wanted to do (but ran out of time to do) was get the main
SSA insn-change routine (rtl_ssa::change_insns) to record when an
instruction becomes dead, and then perform DCE as part of the later
rtl_ssa::perform_pending_updates step.  This would be much cheaper
than doing another full scan of the instruction stream (which is what
delete_trivially_dead_insns needs to do).

Unfortunately, I suspect we're relying on this delete_trivially_dead_insns
call to delete instructions that became dead during earlier passes, not just
those that become dead during fwprop.c.  So I guess we would need a full
DCE at some point: making fwprop.c clean up its own mess might not be
enough.

>> * The SSA code groups blocks into extended basic blocks, with the
>>   EBBs rather than individual blocks having phi nodes.  
> So I haven't looked at the patch, but the usual place to put PHIs is at
> the dominance frontier.  But extra PHIs just increase time/memory and
> shouldn't affect correctness.

Yeah, the phis still are at dominance frontiers (except for certain
cases where we use degenerate phis to maintain a linear RPO view;
see the doc patch for more details about that).  It wasn't clear from
the description above, but I was really talking about a pure data
structure choice: once we have both BBs and EBBs, the phis naturally
attach to the EBB data structure rather than the BB data structure,
since second and subsequent BBs in an EBB have a single predecessor
and so never need phi nodes.
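
In data-structure terms it's no more than this (again a sketch rather
than the real layout):

  // Only the first block in an EBB can have multiple predecessors,
  // so the phi list hangs off the EBB rather than off each block.
  struct ebb_sketch
  {
    vec<bb_info *> blocks;  // the blocks of the EBB, in order
    vec<phi_info *> phis;   // phis for registers live into the EBB
  };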

>> * The framework also provides live range information for registers
>>   within an extended basic block and allows instructions to move within
>>   their EBB.  It might be useful to allow further movement in future;
>>   I just don't have a use case for it yet.
> Yup.   You could do something like Click's algorithm to schedule the
> instructions in a block to maximize CSE opportunities on top of this.

Yeah.

>> * One advantage of the new infrastructure is that it gives
>>   recog_for_combine-like behaviour: if recog wants to add clobbers
>>   of things like the flags register, the SSA code will make sure
>>   that the flags register is free.
> I look more at the intersection between combine and SSA as an
> opportunity to combine on extended blocks, simplify the "does dataflow
> allow this combination" logic, drop the need to build/maintain LOG_LINKS
> and more generally simplify note distribution.

Yeah, my ultimate goal (for GCC12, I hope this time for real) is still
to provide an SSA version of combine.  Initially it could sit alongside
the existing combine pass, and perhaps be run only after RA by default.
(Testing locally, that seems to give nice results, and reduces the
pressure to reimplement everything in combine.c in one go.)

But the point above was instead that, at the moment, combine is the
only pass that can add (say) new clobbers of the flags register as
part of a recog.  I think ideally *all* passes should be able to do that.
But passes would then need to track the live ranges of the flags
register in order to tell when the flags register is free.  One of the
side-benefits of the SSA stuff is that it can do this in amortised
sublinear complexity.  So RTL SSA provides its own interface to recog
that can do the same things as recog_for_combine does.

>> * Debug instructions get SSA information too, on a best-effort basis.
>>   Providing complete information would be significantly more expensive.
>>
>> * I wasn't sure for new C++ code whether to stick to the old C /* … */
>>   comments, or whether to switch to //.  In the end I went for //,
>>   on the basis that:
>>
>>   - The ranger code already does this.
>>
>>   - // is certainly more idiomatic in C++.
>>
>>   - // is in the lisp tradition of per-line comments and it matches the
>>     ;; used in .md files.  I feel sure that GCC would have been written
>>     using // from the outset if that had been possible.
> I think we're allowing both and realistically /* */ vs // shouldn't be
> something we spend a lot of time arguing about :-)

Agreed :-)

Richard

* Re: [07/23] Add a class that multiplexes two pointer types
  2020-11-25 20:23   ` Jeff Law
@ 2020-11-26 16:15     ` Richard Sandiford
  2020-11-30  1:28       ` Jeff Law
  0 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-26 16:15 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

Jeff Law <law@redhat.com> writes:
> On 11/13/20 1:14 AM, Richard Sandiford via Gcc-patches wrote:
>> This patch adds a pointer_mux<T1, T2> class that provides similar
>> functionality to:
>>
>>     union { T1 *a; T2 *b; };
>>     ...
>>     bool is_b_rather_than_a;
>>
>> except that the is_b_rather_than_a tag is stored in the low bit
>> of the pointer.  See the comments in the patch for a comparison
>> between the two approaches and why this one can be more efficient.
>>
>> I've tried to microoptimise the class a fair bit, since a later
>> patch uses it extensively in order to keep the sizes of data
>> structures down.
>>
>> gcc/
>> 	* mux-utils.h: New file.
> Do we have any potentially bootstrappable targets where we can't
> guarantee pointer alignment of at least 16 bits?  I see what look like
> suitable asserts, and presumably if we trigger them, then we're going to
> need to rethink this and fall back to a separate bit?

Like you say, the asserts should trap that if it happens.  I think the
fix would be to increase the alignment of the data types involved (which
we can now do in a standard way -- probably wouldn't have been possible
before the switch to C++11, at least not without using extensions).
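
For instance (illustrative only, with a made-up type):

  // Guarantee that the low bit of every node * is zero, so that it
  // remains available as the pointer_mux tag bit.
  struct alignas (2) node { char byte; };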

Thanks,
Richard

* Re: [07/23] Add a class that multiplexes two pointer types
  2020-11-25 23:33   ` Martin Sebor
@ 2020-11-26 17:06     ` Richard Sandiford
  2020-11-27 18:12       ` Richard Sandiford
  2020-11-28  0:17       ` Martin Sebor
  0 siblings, 2 replies; 88+ messages in thread
From: Richard Sandiford @ 2020-11-26 17:06 UTC (permalink / raw)
  To: Martin Sebor; +Cc: gcc-patches

Martin Sebor <msebor@gmail.com> writes:
> I do have one concern: the tendency to prioritize efficiency
> over safety (this can be said about most GCC code). Specifically
> in this class, the address bit twiddling makes me uneasy.  I don't
> think the object model in either language (certainly not C but
> I don't have the impression C++ either) makes it unequivocally
> valid.  On the contrary, I'd say many of us interpret the current
> rules as leaving it undefined.  There are efforts to sanction
> this sort of thing under some conditions (e.g, the C object
> model proposal) but they have not been adopted yet.  I think
> we should try to avoid exploiting these dark corners in new
> code.

I'd tried to stick to operations that I thought were well-defined.
The primitives being used are really:

(1) convert a T1* or T2* to char*
(2) increment an unincremented char*
(3) decrement an incremented char*
(4) convert a char* back to T1* or T2*
(5) convert a char* to an intptr_t in order to test its low bit

I thought (1) and (4) were allowed.  At least, [basic.compound] says
that void* must be able to hold any object pointer and that it must have
the same representation as char*, so I thought the conversion in (1) was
guaranteed to be representable.  And (4) only ever undoes (1): it only
converts the result of (1) back to the original pointer type.

For (2) and (3), the incremented pointer will still be within the
containing object, so I thought it would be well-defined.  Here too,
(3) only ever undoes (2): it only decrements a pointer that had
previously been incremented.
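
Boiled down, the tagged case uses the primitives like this (an
illustrative sketch only, not the actual pointer_mux code; assume
T2 has an alignment of at least 2):

  T2 *t2 = ...;
  char *p = reinterpret_cast<char *> (t2);  // (1)
  p += 1;                                   // (2): still within *t2
  bool tag = uintptr_t (p) & 1;             // (5): low bit is now set
  p -= 1;                                   // (3): undoes (2)
  T2 *back = reinterpret_cast<T2 *> (p);    // (4): undoes (1), back == t2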

One thing I'd deliberately tried to avoid was converting integers
“back” to pointers, because that seemed like a more dangerous thing.
That's why:

>> +template<typename T1, typename T2>
>> +inline T2 *
>> +pointer_mux<T1, T2>::second_or_null () const
>> +{
>> +  // Micro optimization that's effective as of GCC 11: compute the value
>> +  // of the second pointer as an integer and test that, so that the integer
>> +  // result can be reused as the pointer and so that all computation can
>> +  // happen before a branch on null.  This reduces the number of branches
>> +  // needed for loops.
>> +  return uintptr_t (m_ptr - 1) & 1 ? nullptr : known_second ();
>> +}

is written in a somewhat indirect way.

Are your concerns with the primitives above, or is the problem with
something else?

Thanks,
Richard

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [00/23] Make fwprop use an on-the-side RTL SSA representation
  2020-11-26 16:03   ` Richard Sandiford
@ 2020-11-27 15:56     ` Michael Matz
  2020-11-27 16:31       ` Richard Sandiford
  2020-11-30  6:45     ` Jeff Law
  1 sibling, 1 reply; 88+ messages in thread
From: Michael Matz @ 2020-11-27 15:56 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Jeff Law via Gcc-patches

Hello,

On Thu, 26 Nov 2020, Richard Sandiford via Gcc-patches wrote:

> >> The aim is only to provide a different view of existing RTL instructions.
> >> Unlike gimple, and unlike (IIRC) the old RTL SSA project from way back,
> >> the new framework isn't a “native” SSA representation.  This means that
> >> all inputs to a phi node for a register R are also definitions of
> >> register R; no move operation is “hidden” in the phi node.
> > Hmm, I'm trying to parse what the last phrase means.  Does it mean that
> > the "hidden copy" problem for out-of-ssa is avoided?  And if so, how is
> > that maintained over time.  Things like copy-prop will tend to introduce
> > those issues even if they didn't originally exist.
> 
> Yeah, the phi nodes simply say which definition of register R provides
> the value of R on a particular incoming edge.  That definition will
> itself be a phi node for R, an artificial definition of R created by DF
> (e.g. for incoming function arguments or for EH data registers), or an
> actual instruction that sets R.
> 
> In other words, the SSA form is a purely on-the-side thing and the
> underlying RTL instructions are maintained in the same way as normal.
> The SSA form can be deleted at any time without performing a separate
> out-of-ssa step.  In that respect it's different from cfglayout,
> for example.

Hmm, I don't see how that answers Jeff's question, if I got it correctly.  
If I didn't get it correctly let me ask my own version of the question :)

(I haven't studied your implementation in detail; if I had, maybe the answers 
to the below would become obvious, sorry if so :) )
 
So, you're saying that in your implementation the operands of PHIs can be 
PHIs and real defs.  Further you say nothing about any restriction in RTL 
instruction moving and/or propagation.  So, then let's start with one of 
the prime examples of SSA deconstruction problems, the lost swap, and how 
it comes to be: we start with a swap:

  x = ..., y = ...
  if (cond)
    tmp=x, x=y, y=tmp

(1) into SSA:

  x0 = ..., y0 = ...
  if (cond)
    tmp = x0, x1=y0, y1=tmp;
  x2 = PHI(x0,x1),  y2 = PHI(y0,y1)

(2) copy-prop:

  x0 = ..., y0 = ...
  if (cond)
    ;
  x2 = PHI(x0,y0),  y2 = PHI(y0,x0)

Now you're also saying that the SSA form can simply be deleted without any 
consideration of the parallel copy nature, i.e. no real out-of-ssa phase.  
In the above example that would lead to wrong code, so that can't be it.  
So what in your representation avoids either (1) or (2)?  Do these 
restrictions also work if the above crucial code is within a loop (and 
hence the inputs to PHIs are the PHIs themselves, which is the actual 
canonical variant of the lost-copy and swap problems)?


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [00/23] Make fwprop use an on-the-side RTL SSA representation
  2020-11-27 15:56     ` Michael Matz
@ 2020-11-27 16:31       ` Richard Sandiford
  2020-11-30 21:13         ` Jeff Law
  0 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-11-27 16:31 UTC (permalink / raw)
  To: Michael Matz; +Cc: Jeff Law via Gcc-patches

Michael Matz <matz@suse.de> writes:
> Hello,
>
> On Thu, 26 Nov 2020, Richard Sandiford via Gcc-patches wrote:
>
>> >> The aim is only to provide a different view of existing RTL instructions.
>> >> Unlike gimple, and unlike (IIRC) the old RTL SSA project from way back,
>> >> the new framework isn't a “native” SSA representation.  This means that
>> >> all inputs to a phi node for a register R are also definitions of
>> >> register R; no move operation is “hidden” in the phi node.
>> > Hmm, I'm trying to parse what the last phrase means.  Does it mean that
>> > the "hidden copy" problem for out-of-ssa is avoided?  And if so, how is
>> > that maintained over time.  Things like copy-prop will tend to introduce
>> > those issues even if they didn't originally exist.
>> 
>> Yeah, the phi nodes simply say which definition of register R provides
>> the value of R on a particular incoming edge.  That definition will
>> itself be a phi node for R, an artificial definition of R created by DF
>> (e.g. for incoming function arguments or for EH data registers), or an
>> actual instruction that sets R.
>> 
>> In other words, the SSA form is a purely on-the-side thing and the
>> underlying RTL instructions are maintained in the same way as normal.
>> The SSA form can be deleted at any time without performing a separate
>> out-of-ssa step.  In that respect it's different from cfglayout,
>> for example.
>
> Hmm, I don't see how that answers Jeff's question, if I got it correctly.  
> If I didn't get it correctly let me ask my own version of the question :)
>
> (I haven't studied your implementation in detail; if I had, maybe the answers 
> to the below would become obvious, sorry if so :) )
>  
> So, you're saying that in your implementation the operands of PHIs can be 
> PHIs and real defs.

Specifically real defs of the same register (or memory, for memory phis).

> Further you say nothing about any restriction in RTL 
> instruction moving and/or propagation.

The RTL SSA form doesn't add any extra restrictions beyond those that apply
to non-SSA RTL passes.  But it also doesn't take any restrictions away.
In other words, the changes that RTL SSA passes make to RTL instructions
are the same as those that non-SSA RTL passes would make.  The SSA form
is just there to make it easier to process use-def chains (and also
to process live ranges, to a limited extent).
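
(For example, the sort of walk I mean is, modulo the exact names in
the posted patches -- treat this as a sketch:

  rtl_ssa::set_info *def = use->def ();          // the use's single definition
  for (rtl_ssa::use_info *u : def->all_uses ())  // every use of that value
    ...

with no separate def-use machinery needed in the pass itself.)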

> So, then let's start with one of 
> the prime examples of SSA deconstruction problems, the lost swap, and how 
> it comes to be: we start with a swap:
>
>   x = ..., y = ...
>   if (cond)
>     tmp=x, x=y, y=tmp
>
> (1) into SSA:
>
>   x0 = ..., y0 = ...
>   if (cond)
>     tmp = x0, x1=y0, y1=tmp;
>   x2 = PHI(x0,x1),  y2 = PHI(y0,y1)
>
> (2) copy-prop:
>
>   x0 = ..., y0 = ...
>   if (cond)
>     ;
>   x2 = PHI(x0,y0),  y2 = PHI(y0,x0)

So the point is that this isn't what the RTL would look like even
when using RTL SSA.  Putting y0 in the x2 PHI and x0 in the y2 PHI is
representationally invalid.

Like I say, this isn't a “native” SSA form: it's just using SSA
constructs to represent dataflow in normal RTL.

> Now you're also saying that the SSA form can simply be deleted without any 
> consideration of the parallel copy nature, i.e. no real out-of-ssa phase.  
> In the above example that would lead to wrong code, so that can't be it.  
> So what in your representation avoids either (1) or (2)?  Do these 
> restrictions also work if the above crucial code is within a loop (and 
> hence the inputs to PHIs are the PHIs themselves, which is the actual 
> canonical variant of the lost-copy and swap problems)?

Hope the above answers this.  Using the notation above, every input
to an xn PHI always has the form xi.
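
In notation:

  x2 = PHI(x0, x1),  y2 = PHI(y0, y1)   // representable
  x2 = PHI(x0, y0),  y2 = PHI(y0, x0)   // not representable

so step (2) of your example has no RTL SSA encoding in the first place.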

I don't think it's worth having a native SSA form in RTL given that
we already have one in gimple.  It would just lose some low-level
details that are (IMO) important for RTL, and that distinguish RTL
from gimple, such as the need for a temporary register in your swap
example.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [07/23] Add a class that multiplexes two pointer types
  2020-11-26 17:06     ` Richard Sandiford
@ 2020-11-27 18:12       ` Richard Sandiford
  2020-11-28  0:17       ` Martin Sebor
  1 sibling, 0 replies; 88+ messages in thread
From: Richard Sandiford @ 2020-11-27 18:12 UTC (permalink / raw)
  To: Martin Sebor; +Cc: gcc-patches

Richard Sandiford via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> One thing I'd deliberately tried to avoid was converting integers
> “back” to pointers, because that seemed like a more dangerous thing.
> That's why:
>
>>> +template<typename T1, typename T2>
>>> +inline T2 *
>>> +pointer_mux<T1, T2>::second_or_null () const
>>> +{
>>> +  // Micro optimization that's effective as of GCC 11: compute the value
>>> +  // of the second pointer as an integer and test that, so that the integer
>>> +  // result can be reused as the pointer and so that all computation can
>>> +  // happen before a branch on null.  This reduces the number of branches
>>> +  // needed for loops.
>>> +  return uintptr_t (m_ptr - 1) & 1 ? nullptr : known_second ();
>>> +}
>
> is written in a somewhat indirect way.

Gah, I really should learn to read my own quotes.  In my head this was
still using integer operations, but I take your point that the final form
is probably invalid.  I'll try to achieve the same effect in some other way.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [07/23] Add a class that multiplexes two pointer types
  2020-11-26 17:06     ` Richard Sandiford
  2020-11-27 18:12       ` Richard Sandiford
@ 2020-11-28  0:17       ` Martin Sebor
  2020-12-17  0:17         ` Richard Sandiford
  1 sibling, 1 reply; 88+ messages in thread
From: Martin Sebor @ 2020-11-28  0:17 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 11/26/20 10:06 AM, Richard Sandiford wrote:
> Martin Sebor <msebor@gmail.com> writes:
>> I do have one concern: the tendency to prioritize efficiency
>> over safety (this can be said about most GCC code). Specifically
>> in this class, the address bit twiddling makes me uneasy.  I don't
>> think the object model in either language (certainly not C but
>> I don't have the impression C++ either) makes it unequivocally
>> valid.  On the contrary, I'd say many of us interpret the current
>> rules as leaving it undefined.  There are efforts to sanction
>> this sort of thing under some conditions (e.g, the C object
>> model proposal) but they have not been adopted yet.  I think
>> we should try to avoid exploiting these dark corners in new
>> code.
> 
> I'd tried to stick to operations that I thought were well-defined.
> The primitives being used are really:
> 
> (1) convert a T1* or T2* to char*
> (2) increment an unincremented char*
> (3) decrement an incremented char*
> (4) convert a char* back to T1* or T2*
> (5) convert a char* to an intptr_t in order to test its low bit

All those are valid as long as the pointer points into the same
object, both before and after.

> I thought (1) and (4) were allowed.  At least, [basic.compound] says
> that void* must be able to hold any object pointer and that it must have
> the same representation as char*, so I thought the conversion in (1) was
> guaranteed to be representable.  And (4) only ever undoes (1): it only
> converts the result of (1) back to the original pointer type.
> 
> For (2) and (3), the incremented pointer will still be within the
> containing object, so I thought it would be well-defined.  Here too,
> (3) only ever undoes (2): it only decrements a pointer that had
> previously been incremented.
> 
> One thing I'd deliberately tried to avoid was converting integers
> “back” to pointers, because that seemed like a more dangerous thing.
> That's why:
> 
>>> +template<typename T1, typename T2>
>>> +inline T2 *
>>> +pointer_mux<T1, T2>::second_or_null () const
>>> +{
>>> +  // Micro optimization that's effective as of GCC 11: compute the value
>>> +  // of the second pointer as an integer and test that, so that the integer
>>> +  // result can be reused as the pointer and so that all computation can
>>> +  // happen before a branch on null.  This reduces the number of branches
>>> +  // needed for loops.
>>> +  return uintptr_t (m_ptr - 1) & 1 ? nullptr : known_second ();

This is only valid if m_ptr points to the second byte of an object.
If it points to the first byte of A then it's invalid.  The following
would make the test valid, but the result is strictly unspecified
(though in practice I'd expect it to do what you expect):

   return (uintptr_t (m_ptr) - 1) & 1 ? nullptr : known_second ();

>>> +}
> 
> is written in a somewhat indirect way.
> 
> Are your concerns with the primitives above, or is the problem with
> something else?

My initial impression was that the code stored information in
the least significant bits of the pointer.  Now that I've looked
at it again I still think it does that, except not directly but
indirectly, by storing a pointer to either the first byte of one
object (A) or to the second byte of another (B).  Correct?  (If
so, I would recommend to expand the documentation to make this
clearer.)

It's clever (a little too clever for my taste) but other than
the m_ptr - 1 part above I can't think of anything undefined
about it.  My main concern also wasn't with the bit twiddling
as such but with hiding the identity of the objects by manipulating
the representation of the pointers via integer operations.  Since
(if) the code doesn't really do that, it may be less of a problem
than I thought.

Martin

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [07/23] Add a class that multiplexes two pointer types
  2020-11-26 16:15     ` Richard Sandiford
@ 2020-11-30  1:28       ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-11-30  1:28 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/26/20 9:15 AM, Richard Sandiford wrote:
> Jeff Law <law@redhat.com> writes:
>> On 11/13/20 1:14 AM, Richard Sandiford via Gcc-patches wrote:
>>> This patch adds a pointer_mux<T1, T2> class that provides similar
>>> functionality to:
>>>
>>>     union { T1 *a; T2 *b; };
>>>     ...
>>>     bool is_b_rather_than_a;
>>>
>>> except that the is_b_rather_than_a tag is stored in the low bit
>>> of the pointer.  See the comments in the patch for a comparison
>>> between the two approaches and why this one can be more efficient.
>>>
>>> I've tried to microoptimise the class a fair bit, since a later
>>> patch uses it extensively in order to keep the sizes of data
>>> structures down.
>>>
>>> gcc/
>>> 	* mux-utils.h: New file.
>> Do we have any potentially bootstrappable targets where we can't
>> guarantee pointer alignment of at least 16 bits?  I see what look like
>> suitable asserts, and presumably if we trigger them, then we're going to
>> need to rethink this and fall back to a separate bit?
> Like you say, the asserts should trap that if it happens.  I think the
> fix would be to increase the alignment of the data types involved (which
> we can now do in a standard way -- probably wouldn't have been possible
> before the switch to C++11, at least not without using extensions).
Yea, I keep forgetting about some of the facilities C++ can provide us now.

Seems like we've got a good backup plan.  So this is OK by me.
jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [11/23] Split update_cfg_for_uncondjump out of combine
  2020-11-13  8:16 ` [11/23] Split update_cfg_for_uncondjump out of combine Richard Sandiford
@ 2020-11-30  6:14   ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-11-30  6:14 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:16 AM, Richard Sandiford via Gcc-patches wrote:
> Later patches want to reuse combine's update_cfg_for_uncondjump,
> so this patch makes it a public cfgrtl.c function.
>
> gcc/
> 	* cfgrtl.h (update_cfg_for_uncondjump): Declare.
> 	* combine.c (update_cfg_for_uncondjump): Move to...
> 	* cfgrtl.c: ...here.
OK
jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [21/23] doc: Add documentation for rtl-ssa
  2020-11-13  8:22 ` [21/23] doc: Add documentation for rtl-ssa Richard Sandiford
@ 2020-11-30  6:26   ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-11-30  6:26 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:22 AM, Richard Sandiford via Gcc-patches wrote:
> This patch adds some documentation to rtl.texi about the SSA form.
> It only really describes the high-level structure -- I think for
> API-level stuff it's better to rely on function comments instead.
>
> gcc/
> 	* doc/rtl.texi (RTL SSA): New node.
I suspect we'll need to twiddle things as we have more consumers and get
feedback from folks looking to try and wire this into existing RTL
optimizers.  I like that the docs focus on how to add the SSA
on-the-side info to an existing pass.

> +
> +Since resource numbers so closely match register numbers, it is somtimes
NIT, typo "somtimes" -> "sometimes"

I'm still struggling to see how we avoid the lost copy problem as
changes are made to the SSA graph.  I suspect we're going to end up with
some additional documentation around that.  But I think we can have a
distinct patch for that.

Jeff

> +convenient to refer to them simply as register numbers, or ``regnos''
> +for short.  However, the RTL SSA form also provides an abstraction
> +of resources in the form of @code{rtl_ssa::resource_info}.
> +This is a lightweight class that records both the regno of a resource
> +and the @code{machine_mode} that the resource has (@pxref{Machine Modes}).
> +It has functions for testing whether a resource is a register or memory.
> +In principle it could be extended to other kinds of resource in future.
> +
> +@node RTL SSA Accesses
> +@subsection RTL SSA Register and Memory Accesses
> +
> +In the RTL SSA form, most reads or writes of a resource are
> +represented as a @code{rtl_ssa::access_info}@footnote{The exceptions
> +are call clobbers, which are generally represented separately.
> +See the comment above @code{rtl_ssa::insn_info} for details.}.
> +These @code{rtl_ssa::access_info}s are organized into the following
> +class hierarchy:
> +
> +@findex rtl_ssa::access_info
> +@findex rtl_ssa::use_info
> +@findex rtl_ssa::def_info
> +@findex rtl_ssa::clobber_info
> +@findex rtl_ssa::set_info
> +@findex rtl_ssa::phi_info
> +@smallexample
> +rtl_ssa::access_info
> +  |
> +  +-- rtl_ssa::use_info
> +  |
> +  +-- rtl_ssa::def_info
> +        |
> +        +-- rtl_ssa::clobber_info
> +        |
> +        +-- rtl_ssa::set_info
> +              |
> +              +-- rtl_ssa::phi_info
> +@end smallexample
> +
> +A @code{rtl_ssa::use_info} represents a read or use of a resource and
> +a @code{rtl_ssa::def_info} represents a write or definition of a resource.
> +As in the main RTL representation, there are two basic types of
> +definition: clobbers and sets.  The difference is that a clobber
> +leaves the register with an unspecified value that cannot be used
> +or relied on by later instructions, while a set leaves the register
> +with a known value that later instructions could use if they wanted to.
> +A @code{rtl_ssa::clobber_info} represents a clobber and
> +a @code{rtl_ssa::set_info} represents a set.
> +
> +Each @code{rtl_ssa::use_info} records which single @code{rtl_ssa::set_info}
> +provides the value of the resource; this is null if the resource is
> +completely undefined at the point of use.  Each @code{rtl_ssa::set_info}
> +in turn records all the @code{rtl_ssa::use_info}s that use its value.
> +
> +If a value of a resource can come from multiple sources,
> +a @code{rtl_ssa::phi_info} brings those multiple sources together
> +into a single definition (@pxref{RTL SSA Phi Nodes}).
> +
> +@node RTL SSA Phi Nodes
> +@subsection RTL SSA Phi Nodes
> +
> +@cindex phi nodes, RTL SSA
> +@findex rtl_ssa::phi_info
> +If a resource is live on entry to an extended basic block and if the
> +resource's value can come from multiple sources, the extended basic block
> +has a ``phi node'' that collects together these multiple sources.
> +The phi node conceptually has one input for each incoming edge of
> +the extended basic block, with the input specifying the value of
> +the resource on that edge.  For example, suppose a function contains
> +the following RTL:
> +
> +@smallexample
> +;; Basic block bb3
> +@dots{}
> +(set (reg:SI R1) (const_int 0))  ;; A
> +(set (pc) (label_ref bb5))
> +
> +;; Basic block bb4
> +@dots{}
> +(set (reg:SI R1) (const_int 1))  ;; B
> +;; Fall through
> +
> +;; Basic block bb5
> +;; preds: bb3, bb4
> +;; live in: R1 @dots{}
> +(code_label bb5)
> +@dots{}
> +(set (reg:SI @var{R2})
> +     (plus:SI (reg:SI R1) @dots{}))  ;; C
> +@end smallexample
> +
> +The value of R1 on entry to block 5 can come from either A or B@.
> +The extended basic block that contains block 5 would therefore have a
> +phi node with two inputs: the first input would have the value of
> +R1 defined by A and the second input would have the value of
> +R1 defined by B@.  This phi node would then provide the value of
> +R1 for C (assuming that R1 does not change again between
> +the start of block 5 and C).
> +
> +Since RTL is not a ``native'' SSA representation, these phi nodes
> +simply collect together definitions that already exist.  Each input
> +to a phi node for a resource @var{R} is itself a definition of
> +resource @var{R} (or is null if the resource is completely
> +undefined for a particular incoming edge).  This is in contrast
> +to a native SSA representation like GIMPLE, where the phi inputs
> +can be arbitrary expressions.  As a result, RTL SSA phi nodes
> +never involve ``hidden'' moves: all moves are instead explicit.
> +
> +Phi nodes are represented as a @code{rtl_ssa::phi_info}.
> +Each input to a phi node is represented as an @code{rtl_ssa::use_info}.
> +
> +@node RTL SSA Access Lists
> +@subsection RTL SSA Access Lists
> +
> +All the definitions of a resource are chained together in reverse postorder.
> +In general, this list can contain an arbitrary mix of both sets
> +(@code{rtl_ssa::set_info}) and clobbers (@code{rtl_ssa::clobber_info}).
> +However, it is often useful to skip over all intervening clobbers
> +of a resource in order to find the next set.  The list is constructed
> +in such a way that this can be done in amortized constant time.
> +
> +All uses (@code{rtl_ssa::use_info}) of a given set are also chained
> +together into a list.  This list of uses is divided into three parts:
> +
> +@enumerate
> +@item
> +uses by ``real'' nondebug instructions (@pxref{real RTL SSA insns})
> +
> +@item
> +uses by real debug instructions
> +
> +@item
> +uses by phi nodes (@pxref{RTL SSA Phi Nodes})
> +@end enumerate
> +
> +The first and second parts individually follow reverse postorder.
> +The third part has no particular order.
> +
> +@cindex degenerate phi node, RTL SSA
> +The last use by a real nondebug instruction always comes earlier in
> +the reverse postorder than the next definition of the resource (if any).
> +This means that the accesses follow a linear sequence of the form:
> +
> +@itemize @bullet
> +@item
> +first definition of resource R
> +
> +@itemize @bullet
> +@item
> +first use by a real nondebug instruction of the first definition of resource R
> +
> +@item
> +@dots{}
> +
> +@item
> +last use by a real nondebug instruction of the first definition of resource R
> +@end itemize
> +
> +@item
> +second definition of resource R
> +
> +@itemize @bullet
> +@item
> +first use by a real nondebug instruction of the second definition of resource R
> +
> +@item
> +@dots{}
> +
> +@item
> +last use by a real nondebug instruction of the second definition of resource R
> +@end itemize
> +
> +@item
> +@dots{}
> +
> +@item
> +last definition of resource R
> +
> +@itemize @bullet
> +@item
> +first use by a real nondebug instruction of the last definition of resource R
> +
> +@item
> +@dots{}
> +
> +@item
> +last use by a real nondebug instruction of the last definition of resource R
> +@end itemize
> +@end itemize
> +
> +(Note that clobbers never have uses; only sets do.)
> +
> +This linear view is easy to achieve when there is only a single definition
> +of a resource, which is commonly true for pseudo registers.  However,
> +things are more complex if code has a structure like the following:
> +
> +@smallexample
> +// ebb2, bb2
> +R = @var{va};        // A
> +if (@dots{})
> +  @{
> +    // ebb2, bb3
> +    use1 (R);  // B
> +    @dots{}
> +    R = @var{vc};    // C
> +  @}
> +else
> +  @{
> +    // ebb4, bb4
> +    use2 (R);  // D
> +  @}
> +@end smallexample
> +
> +The list of accesses would begin as follows:
> +
> +@itemize @bullet
> +@item
> +definition of R by A
> +
> +@itemize @bullet
> +@item
> +use of A's definition of R by B
> +@end itemize
> +
> +@item
> +definition of R by C
> +@end itemize
> +
> +The next access to R is in D, but the value of R that D uses comes from
> +A rather than C@.
> +
> +This is resolved by adding a phi node for @code{ebb4}.  All inputs to this
> +phi node have the same value, which in the example above is A's definition
> +of R@.  In other circumstances, it would not be necessary to create a phi
> +node when all inputs are equal, so these phi nodes are referred to as
> +``degenerate'' phi nodes.
> +
> +The full list of accesses to R is therefore:
> +
> +@itemize @bullet
> +@item
> +definition of R by A
> +
> +@itemize @bullet
> +@item
> +use of A's definition of R by B
> +@end itemize
> +
> +@item
> +definition of R by C
> +
> +@item
> +definition of R by ebb4's phi instruction, with the input coming from A
> +
> +@itemize @bullet
> +@item
> +use of ebb4's R phi definition of R by D
> +@end itemize
> +@end itemize
> +
> +Note that A's definition is also used by ebb4's phi node, but this
> +use belongs to the third part of the use list described above and
> +so does not form part of the linear sequence.
> +
> +It is possible to ``look through'' any degenerate phi to the ultimate
> +definition using the function @code{look_through_degenerate_phi}.
> +Note that the input to a degenerate phi is never itself provided
> +by a degenerate phi.
> +
> +At present, the SSA form takes this principle one step further
> +and guarantees that, for any given resource @var{res}, one of the
> +following is true:
> +
> +@itemize
> +@item
> +The resource has a single definition @var{def}, which is not a phi node.
> +Excluding uses of undefined registers, all uses of @var{res} by real
> +nondebug instructions use the value provided by @var{def}.
> +
> +@item
> +Excluding uses of undefined registers, all uses of @var{res} use
> +values provided by definitions that occur earlier in the same
> +extended basic block.  These definitions might come from phi nodes
> +or from real instructions.
> +@end itemize
> +
> +@node Changing RTL Instructions
> +@subsection Using the RTL SSA framework to change instructions
> +
> +@findex rtl_ssa::insn_change
> +There are various routines that help to change a single RTL instruction
> +or a group of RTL instructions while keeping the RTL SSA form up-to-date.
> +This section first describes the process for changing a single instruction,
> +then goes on to describe the differences when changing multiple instructions.
> +
> +@menu
> +* Changing One RTL SSA Instruction::
> +* Changing Multiple RTL SSA Instructions::
> +@end menu
> +
> +@node Changing One RTL SSA Instruction
> +@subsubsection Changing One RTL SSA Instruction
> +
> +Before making a change, passes should first use a statement like the
> +following:
> +
> +@smallexample
> +auto attempt = crtl->ssa->new_change_attempt ();
> +@end smallexample
> +
> +Here, @code{attempt} is an RAII object that should remain in scope
> +for the entire change attempt.  It automatically frees temporary
> +memory related to the changes when it goes out of scope.
> +
> +Next, the pass should create an @code{rtl_ssa::insn_change} object
> +for the instruction that it wants to change.  This object specifies
> +several things:
> +
> +@itemize @bullet
> +@item
> +what the instruction's new list of uses should be (@code{new_uses}).
> +By default this is the same as the instruction's current list of uses.
> +
> +@item
> +what the instruction's new list of definitions should be (@code{new_defs}).
> +By default this is the same as the instruction's current list of
> +definitions.
> +
> +@item
> +where the instruction should be located (@code{move_range}).
> +This is a range of instructions after which the instruction could
> +be placed, represented as an @code{rtl_ssa::insn_range}.
> +By default the instruction must remain at its current position.
> +@end itemize
> +
> +If a pass was attempting to change all these properties of an instruction
> +@code{insn}, it might do something like this:
> +
> +@smallexample
> +rtl_ssa::insn_change change (insn);
> +change.new_defs = @dots{};
> +change.new_uses = @dots{};
> +change.move_range = @dots{};
> +@end smallexample
> +
> +This @code{rtl_ssa::insn_change} only describes something that the
> +pass @emph{might} do; at this stage, nothing has actually changed.
> +
> +As noted above, the default @code{move_range} requires the instruction
> +to remain where it is.  At the other extreme, it is possible to allow
> +the instruction to move anywhere within its extended basic block,
> +provided that all the new uses and definitions can be performed
> +at the new location.  The way to do this is:
> +
> +@smallexample
> +change.move_range = insn->ebb ()->insn_range ();
> +@end smallexample
> +
> +In either case, the next step is to make sure that the move range is
> +consistent with the new uses and definitions.  The way to do this is:
> +
> +@smallexample
> +if (!rtl_ssa::restrict_movement (change))
> +  return false;
> +@end smallexample
> +
> +This function tries to limit @code{move_range} to a range of instructions
> +at which @code{new_uses} and @code{new_defs} can be correctly performed.
> +It returns true on success or false if no suitable location exists.
> +
> +The pass should also tentatively change the pattern of the instruction
> +to whatever form the pass wants the instruction to have.  This should use
> +the facilities provided by @file{recog.c}.  For example:
> +
> +@smallexample
> +rtl_insn *rtl = insn->rtl ();
> +insn_change_watermark watermark;
> +validate_change (rtl, &PATTERN (rtl), new_pat, 1);
> +@end smallexample
> +
> +will tentatively replace @code{insn}'s pattern with @code{new_pat}.
> +
> +These changes and the construction of the @code{rtl_ssa::insn_change}
> +can happen in either order or be interleaved.
> +
> +After the tentative changes to the instruction are complete,
> +the pass should check whether the new pattern matches a target
> +instruction or satisfies the requirements of an inline asm:
> +
> +@smallexample
> +if (!rtl_ssa::recog (change))
> +  return false;
> +@end smallexample
> +
> +This step might change the instruction pattern further in order to
> +make it match.  It might also add new definitions or restrict the range
> +of the move.  For example, if the new pattern did not match in its original
> +form, but could be made to match by adding a clobber of the flags
> +register, @code{rtl_ssa::recog} will check whether the flags register
> +is free at an appropriate point.  If so, it will add a clobber of the
> +flags register to @code{new_defs} and restrict @code{move_range} to
> +the locations at which the flags register can be safely clobbered.
> +
> +Even if the proposed new instruction is valid according to
> +@code{rtl_ssa::recog}, the change might not be worthwhile.
> +For example, when optimizing for speed, the new instruction might
> +turn out to be slower than the original one.  When optimizing for
> +size, the new instruction might turn out to be bigger than the
> +original one.
> +
> +Passes should check for this case using @code{change_is_worthwhile}.
> +For example:
> +
> +@smallexample
> +if (!rtl_ssa::change_is_worthwhile (change))
> +  return false;
> +@end smallexample
> +
> +If the change passes this test too then the pass can perform the change using:
> +
> +@smallexample
> +confirm_change_group ();
> +crtl->ssa->change_insn (change);
> +@end smallexample
> +
> +Putting all this together, the change has the following form:
> +
> +@smallexample
> +auto attempt = crtl->ssa->new_change_attempt ();
> +
> +rtl_ssa::insn_change change (insn);
> +change.new_defs = @dots{};
> +change.new_uses = @dots{};
> +change.move_range = @dots{};
> +
> +if (!rtl_ssa::restrict_movement (change))
> +  return false;
> +
> +insn_change_watermark watermark;
> +// Use validate_change etc. to change INSN's pattern.
> +@dots{}
> +if (!rtl_ssa::recog (change)
> +    || !rtl_ssa::change_is_worthwhile (change))
> +  return false;
> +
> +confirm_change_group ();
> +crtl->ssa->change_insn (change);
> +@end smallexample
> +
> +@node Changing Multiple RTL SSA Instructions
> +@subsubsection Changing Multiple RTL SSA Instructions
> +
> +The process for changing multiple instructions is similar
> +to the process for changing single instructions
> +(@pxref{Changing One RTL SSA Instruction}).  The pass should
> +again start the change attempt with:
> +
> +@smallexample
> +auto attempt = crtl->ssa->new_change_attempt ();
> +@end smallexample
> +
> +and keep @code{attempt} in scope for the duration of the change
> +attempt.  It should then construct an @code{rtl_ssa::insn_change}
> +for each change that it wants to make.
> +
> +After this, it should combine the changes into a sequence of
> +@code{rtl_ssa::insn_change} pointers.  This sequence must be in
> +reverse postorder; the instructions will remain strictly in the
> +order that the sequence specifies.
> +
> +For example, if a pass is changing exactly two instructions,
> +it might do:
> +
> +@smallexample
> +rtl_ssa::insn_change *changes[] = @{ &change1, &change2 @};
> +@end smallexample
> +
> +where @code{change1}'s instruction must come before @code{change2}'s.
> +Alternatively, if the pass is changing a variable number of
> +instructions, it might build up the sequence in a
> +@code{vec<rtl_ssa::insn_change *>}.
> +
> +By default, @code{rtl_ssa::restrict_movement} assumes that all
> +instructions other than the one passed to it will remain in their
> +current positions and will retain their current uses and definitions.
> +When changing multiple instructions, it is usually more effective
> +to ignore the other instructions that are changing.  The sequencing
> +described above ensures that the changing instructions remain
> +in the correct order with respect to each other.
> +The way to do this is:
> +
> +@smallexample
> +if (!rtl_ssa::restrict_movement (change, insn_is_changing (changes)))
> +  return false;
> +@end smallexample
> +
> +Similarly, when @code{rtl_ssa::recog} is detecting
> +whether a register can be clobbered, it by default assumes that
> +all other instructions will remain in their current positions and
> +retain their current form.  It is again more effective to ignore
> +changing instructions (which might, for example, no longer need
> +to clobber the flags register).  The way to do this is:
> +
> +@smallexample
> +if (!rtl_ssa::recog (change, insn_is_changing (changes)))
> +  return false;
> +@end smallexample
> +
> +When changing multiple instructions, the important question is usually
> +not whether each individual change is worthwhile, but whether the changes
> +as a whole are worthwhile.  The way to test this is:
> +
> +@smallexample
> +if (!rtl_ssa::changes_are_worthwhile (changes))
> +  return false;
> +@end smallexample
> +
> +The process for changing single instructions makes sure that one
> +@code{rtl_ssa::insn_change} in isolation is valid.  But when changing
> +multiple instructions, it is also necessary to test whether the
> +sequence as a whole is valid.  For example, it might be impossible
> +to satisfy all of the @code{move_range}s at once.
> +
> +Therefore, once the pass has a sequence of changes that are
> +individually correct, it should use:
> +
> +@smallexample
> +if (!crtl->ssa->verify_insn_changes (changes))
> +  return false;
> +@end smallexample
> +
> +to check whether the sequence as a whole is valid.  If all checks pass,
> +the final step is:
> +
> +@smallexample
> +confirm_change_group ();
> +crtl->ssa->change_insns (changes);
> +@end smallexample
> +
> +Putting all this together, the process for a two-instruction change is:
> +
> +@smallexample
> +auto attempt = crtl->ssa->new_change_attempt ();
> +
> +rtl_ssa::insn_change change1 (insn1);
> +change1.new_defs = @dots{};
> +change1.new_uses = @dots{};
> +change1.move_range = @dots{};
> +
> +rtl_ssa::insn_change change2 (insn2);
> +change2.new_defs = @dots{};
> +change2.new_uses = @dots{};
> +change2.move_range = @dots{};
> +
> +rtl_ssa::insn_change *changes[] = @{ &change1, &change2 @};
> +
> +auto is_changing = insn_is_changing (changes);
> +if (!rtl_ssa::restrict_movement (change1, is_changing)
> +    || !rtl_ssa::restrict_movement (change2, is_changing))
> +  return false;
> +
> +insn_change_watermark watermark;
> +// Use validate_change etc. to change INSN1's and INSN2's patterns.
> +@dots{}
> +if (!rtl_ssa::recog (change1, is_changing)
> +    || !rtl_ssa::recog (change2, is_changing)
> +    || !rtl_ssa::changes_are_worthwhile (changes)
> +    || !crtl->ssa->verify_insn_changes (changes))
> +  return false;
> +
> +confirm_change_group ();
> +crtl->ssa->change_insns (changes);
> +@end smallexample
> +
>  @node Sharing
>  @section Structure Sharing Assumptions
>  @cindex sharing of RTL components


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [00/23] Make fwprop use an on-the-side RTL SSA representation
  2020-11-26 16:03   ` Richard Sandiford
  2020-11-27 15:56     ` Michael Matz
@ 2020-11-30  6:45     ` Jeff Law
  2020-11-30 14:12       ` Richard Sandiford
  1 sibling, 1 reply; 88+ messages in thread
From: Jeff Law @ 2020-11-30  6:45 UTC (permalink / raw)
  To: Jeff Law via Gcc-patches, richard.sandiford



On 11/26/20 9:03 AM, Richard Sandiford wrote:
> Thanks for the reviews.
>
> Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> On 11/13/20 1:10 AM, Richard Sandiford via Gcc-patches wrote:
>>> Just after GCC 10 stage 1 closed (oops), I posted a patch to add a new
>>> combine pass.  One of its main aims was to allow instructions to move
>>> around where necessary in order to make a combination possible.
>>> It also tried to parallelise instructions that use the same resource.
>>>
>>> That pass contained its own code for maintaining limited def-use chains.
>>> When I posted the patch, Segher asked why we wanted yet another piece
>>> of pass-specific code to do that.  Although I had specific reasons
>>> (which I explained at the time) I've gradually come round to agreeing
>>> that that was a flaw.
>>>
>>> This series of patches is the result of a Covid-time project to add
>>> a more general, pass-agnostic framework.  There are two parts:
>>> adding the framework itself, and using it to make fwprop.c faster.
>>>
>>> The framework part
>>> ------------------
>>>
>>> The framework provides an optional, on-the-side SSA view of existing
>>> RTL instructions.  Each instruction gets a list of definitions and a
>>> list of uses, with each use having a single definition.  Phi nodes
>>> handle cases in which there are multiple possible definitions of a
>>> register on entry to a basic block.  There are also routines for
>>> updating instructions while keeping the SSA representation intact.
>>>
>>> The aim is only to provide a different view of existing RTL instructions.
>>> Unlike gimple, and unlike (IIRC) the old RTL SSA project from way back,
>>> the new framework isn't a “native” SSA representation.  This means that
>>> all inputs to a phi node for a register R are also definitions of
>>> register R; no move operation is “hidden” in the phi node.
>> Hmm, I'm trying to parse what the last phrase means.  Does it mean that
>> the "hidden copy" problem for out-of-ssa is avoided?  And if so, how is
>> that maintained over time.  Things like copy-prop will tend to introduce
>> those issues even if they didn't originally exist.
> Yeah, the phi nodes simply say which definition of register R provides
> the value of R on a particular incoming edge.  That definition will
> itself be a phi node for R, an artificial definition of R created by DF
> (e.g. for incoming function arguments or for EH data registers), or an
> actual instruction that sets R.
>
> In other words, the SSA form is a purely on-the-side thing and the
> underlying RTL instructions are maintained in the same way as normal.
> The SSA form can be deleted at any time without performing a separate
> out-of-ssa step.  In that respect it's different from cfglayout,
> for example.
>
> One of the goals was to allow the SSA form to be used even after RA,
> where invisible copies would be more problematic.
Right.  But what I'm struggling a bit with is whether or not we have to
put restrictions on what passes can do with that on-the-side data
structure.  While I think we can have that on-the-side data structure be
conservatively correct, I think we have to make sure that we don't allow
changes to the on-the-side data structure to occur that ultimately we
can't reflect into RTL.

I may need to go back and re-read the lost copy problem literature.  But
it's definitely an area that I'm concerned about.


>> It's unfortunate that there are no DCE passes abutting 
>> fwprop, as DCE is really easy in an SSA world.
> fwprop.c calls delete_trivially_dead_insns, so it does some light DCE.
> One thing I wanted to do (but ran out of time to do) was get the main
> SSA insn-change routine (rtl_ssa::change_insns) to record when an
> instruction becomes dead, and then perform DCE as part of the later
> rtl_ssa::perform_pending_updates step.  This would be much cheaper
> than doing another full scan of the instruction stream (which is what
> delete_trivially_dead_insns needs to do).
>
> Unfortunately, I suspect we're relying on this delete_trivially_dead_insns
> call to delete instructions that became dead during earlier passes, not just
> those that become dead during fwprop.c.  So I guess we would need a full
> DCE at some point: making fwprop.c clean up its own mess might not be
> enough.
Oh, yea, if it's using delete_trivially_dead_insns, then, yea, it's got
a mini-DCE and using the SSA algorithm would seem to be a step forward.

I don't necessarily see that incoming dead code is that big of a
> problem.  Ultimately it's still going to look like an SSA definition with
no uses, in the on-the-side data structure, right?  So an SSA based DCE
should be able to clean up the mess from fwprop as well as any incoming
dead code.
>
>>> * The SSA code groups blocks into extended basic blocks, with the
>>>   EBBs rather than individual blocks having phi nodes.  
>> So I haven't looked at the patch, but the usual place to put PHIs is at
>> the dominance frontier.  But extra PHIs just increase time/memory and
>> shouldn't affect correctness.
> Yeah, the phis still are at dominance frontiers (except for certain
> cases where we use degenerate phis to maintain a linear RPO view;
> see the doc patch for more details about that).  It wasn't clear from
> the description above, but I was really talking about a pure data
> structure choice: once we have both BBs and EBBs, the phis naturally
> attach to the EBB data structure rather than the BB data structure,
> since second and subsequent BBs in an EBB have a single predecessor
> and so never need phi nodes.
Certainly it's the case that the dominance frontier must be at the start
of an EBB.  So inserting PHIs at the start of EBBs should be correct.
But my recollection was that if you do it naively you end up with
unnecessary PHIs.  But I don't think we have to do a "no useless PHIs"
algorithm, we just have to do something sensible -- it's my suspicion
that all the work in the early days of SSA to minimize PHIs isn't as
important as it used to be.

>
>>> * The framework also provides live range information for registers
>>>   within an extended basic block and allows instructions to move within
>>>   their EBB.  It might be useful to allow further movement in future;
>>>   I just don't have a use case for it yet.
>> Yup.   You could do something like Click's algorithm to schedule the
>> instructions in a block to maximize CSE opportunities on top of this.
> Yeah.
I noticed that you've got a lot of the infrastructure to do this already
:-) 

>
>>> * One advantage of the new infrastructure is that it gives
>>>   recog_for_combine-like behaviour: if recog wants to add clobbers
>>>   of things like the flags register, the SSA code will make sure
>>>   that the flags register is free.
>> I look more at the intersection between combine and SSA as an
>> opportunity to combine on extended blocks, simplify the "does dataflow
>> allow this combination" logic, drop the need to build/maintain LOG_LINKS
>> and more generally simplify note distribution.
> Yeah, my ultimate goal (for GCC12, I hope this time for real) is still
> to provide an SSA version of combine.  Initially it could sit alongside
> the existing combine pass, and perhaps be run only after RA by default.
> (Testing locally, that seems to give nice results, and reduces the
> pressure to reimplement everything in combine.c in one go.)
>
> But the point above was instead that, at the moment, combine is the
> only pass that can add (say) new clobbers of the flags register as
> part of a recog.  I think ideally *all* passes should be able to do that.
> But passes would then need to track the live ranges of the flags
> register in order to tell when the flags register is free.  One of the
> side-benefits of the SSA stuff is that it can do this in amortised
> sublinear complexity.  So RTL SSA provides its own interface to recog
> that can do the same things as recog_for_combine does.
Sweet.


Jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [00/23] Make fwprop use an on-the-side RTL SSA representation
  2020-11-30  6:45     ` Jeff Law
@ 2020-11-30 14:12       ` Richard Sandiford
  0 siblings, 0 replies; 88+ messages in thread
From: Richard Sandiford @ 2020-11-30 14:12 UTC (permalink / raw)
  To: Jeff Law; +Cc: Jeff Law via Gcc-patches

Jeff Law <law@redhat.com> writes:
> On 11/26/20 9:03 AM, Richard Sandiford wrote:
>> Thanks for the reviews.
>>
>> Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>>> On 11/13/20 1:10 AM, Richard Sandiford via Gcc-patches wrote:
>>>> Just after GCC 10 stage 1 closed (oops), I posted a patch to add a new
>>>> combine pass.  One of its main aims was to allow instructions to move
>>>> around where necessary in order to make a combination possible.
>>>> It also tried to parallelise instructions that use the same resource.
>>>>
>>>> That pass contained its own code for maintaining limited def-use chains.
>>>> When I posted the patch, Segher asked why we wanted yet another piece
>>>> of pass-specific code to do that.  Although I had specific reasons
>>>> (which I explained at the time) I've gradually come round to agreeing
>>>> that that was a flaw.
>>>>
>>>> This series of patches is the result of a Covid-time project to add
>>>> a more general, pass-agnostic framework.  There are two parts:
>>>> adding the framework itself, and using it to make fwprop.c faster.
>>>>
>>>> The framework part
>>>> ------------------
>>>>
>>>> The framework provides an optional, on-the-side SSA view of existing
>>>> RTL instructions.  Each instruction gets a list of definitions and a
>>>> list of uses, with each use having a single definition.  Phi nodes
>>>> handle cases in which there are multiple possible definitions of a
>>>> register on entry to a basic block.  There are also routines for
>>>> updating instructions while keeping the SSA representation intact.
>>>>
>>>> The aim is only to provide a different view of existing RTL instructions.
>>>> Unlike gimple, and unlike (IIRC) the old RTL SSA project from way back,
>>>> the new framework isn't a “native” SSA representation.  This means that
>>>> all inputs to a phi node for a register R are also definitions of
>>>> register R; no move operation is “hidden” in the phi node.
>>> Hmm, I'm trying to parse what the last phrase means.  Does it mean that
>>> the "hidden copy" problem for out-of-ssa is avoided?  And if so, how is
>>> that maintained over time.  Things like copy-prop will tend to introduce
>>> those issues even if they didn't originally exist.
>> Yeah, the phi nodes simply say which definition of register R provides
>> the value of R on a particular incoming edge.  That definition will
>> itself be a phi node for R, an artificial definition of R created by DF
>> (e.g. for incoming function arguments or for EH data registers), or an
>> actual instruction that sets R.
>>
>> In other words, the SSA form is a purely on-the-side thing and the
>> underlying RTL instructions are maintained in the same way as normal.
>> The SSA form can be deleted at any time without performing a separate
>> out-of-ssa step.  In that respect it's different from cfglayout,
>> for example.
>>
>> One of the goals was to allow the SSA form to be used even after RA,
>> where invisible copies would be more problematic.
> Right.  But what I'm struggling a bit with is whether or not we have to
> put restrictions on what passes can do with that on-the-side data
> structure.  While I think we can have that on-the-side data structure be
> conservatively correct, I think we have to make sure that we don't allow
> changes to the on-the-side data structure to occur that ultimately we
> can't reflect into RTL.

Right.  Passes have no way of changing the phi nodes directly.  If we
allowed them to do that in future, it would need to be under controlled
circumstances.

I think the way to think about it is that these “phis” are really just
an alternative way of representing reaching definitions.  Both describe
something that is already (independently) true.  The same goes for
DF liveness information.  It doesn't really make sense to change the
DF live-in or live-out sets directly: you can't independently make
something live or dead just by twiddling bits in the bitmap.

In other words, liveness and reaching definitions are purely reactive:
they change in response to changes to the RTL instructions.  The same is
true for this on-the-side SSA form.

I think that's the trade-off with using this kind of representation
instead of a native SSA one: you can do SSA-style analysis at any stage
of compilation (even after RA), but the result of the transformation
always has to be represented directly as RTL instructions.

>>> It's unfortunate that there are no DCE passes abutting 
>>> fwprop, as DCE is really easy in an SSA world.
>> fwprop.c calls delete_trivially_dead_insns, so it does some light DCE.
>> One thing I wanted to do (but ran out of time to do) was get the main
>> SSA insn-change routine (rtl_ssa::change_insns) to record when an
>> instruction becomes dead, and then perform DCE as part of the later
>> rtl_ssa::perform_pending_updates step.  This would be much cheaper
>> than doing another full scan of the instruction stream (which is what
>> delete_trivially_dead_insns needs to do).
>>
>> Unfortunately, I suspect we're relying on this delete_trivially_dead_insns
>> call to delete instructions that became dead during earlier passes, not just
>> those that become dead during fwprop.c.  So I guess we would need a full
>> DCE at some point: making fwprop.c clean up its own mess might not be
>> enough.
> Oh, yea, if it's using delete_trivially_dead_insns, then, yea, it's got
> a mini-DCE and using the SSA algorithm would seem to be a step forward.
>
> I don't necessarily see that incoming dead code is that big of a
> problem.  Ultimately it's still going to look like an SSA definition with
> no uses, in the on-the-side data structure, right?  So an SSA based DCE
> should be able to clean up the mess from fwprop as well as any incoming
> dead code.

Yeah, it's easy to do.  But deleting dead code created by fwprop would
come “for free”: the rtl_ssa:: code that changes the instructions could
just record which instructions become dead as a result of the
transformations, and then use those to populate the worklist.

If we deal with incoming dead code then we need to look at every
instruction individually to see whether it's already dead.  Although
that's easy to do, I don't think it would be a significant saving over
delete_trivially_dead_insns.
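
For concreteness, the "for free" version would be something like the
following sketch, where the helpers are made up for illustration:

  // Seeded by change_insns () with insns whose last remaining use
  // it removed.
  auto_vec<rtl_ssa::insn_info *> worklist;
  while (!worklist.is_empty ())
    {
      rtl_ssa::insn_info *insn = worklist.pop ();
      // Hypothetical helpers: check that nothing uses INSN's results
      // and that INSN has no side effects, then delete INSN and queue
      // any definitions whose last use was INSN.
      if (insn_now_dead_p (insn))
        delete_insn_and_requeue (insn, worklist);
    }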

Thanks,
Richard

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [14/23] simplify-rtx: Put simplify routines into a class
  2020-11-13  8:18 ` [14/23] simplify-rtx: Put simplify routines into a class Richard Sandiford
@ 2020-11-30 19:54   ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-11-30 19:54 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:18 AM, Richard Sandiford via Gcc-patches wrote:
> One of the recurring warts of RTL is that multiplication by a power
> of 2 is represented as a MULT inside a MEM but as an ASHIFT outside
> a MEM.  It would obviously be better if we didn't have this kind of
> context sensitivity, but it would be difficult to remove.
>
> Currently the simplify-rtx.c routines are hard-coded for the
> ASHIFT form.  This means that some callers have to convert the
> ASHIFTs “back” into MULTs after calling the simplify-rtx.c
> routines; see fwprop.c:canonicalize_address for an example.
>
> I think we can relieve some of the pain by wrapping the simplify-rtx.c
> routines in a simple class that tracks whether the expression occurs
> in a MEM or not, so that no post-processing is needed.
>
> An obvious concern is whether passing the “this” pointer around
> will slow things down or bloat the code.  I can't measure any
> increase in compile time after applying the patch.  Sizewise,
> simplify-rtx.o text increases by 2.3% in default-checking builds
> and 4.1% in release-checking builds.
>
> I realise the MULT/ASHIFT thing isn't the most palatable
> reason for doing this, but I think it might be useful for
> other things in future, such as using local nonzero_bits
> hooks/virtual functions instead of the global hooks.
>
> The obvious alternative would be to add a static variable
> and hope that it is always updated correctly.
I think a static would be a step in the wrong direction, and I'm generally
in favor of using classes, particularly when there's contextual information
we want to query.  Shoving things into a class allows us to reason about
when/how that contextual information is used.
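
E.g. I'd expect uses roughly along these lines (a sketch pieced
together from the description and ChangeLog, so not necessarily the
exact interface; index_reg is just a placeholder):

  simplify_context ctx;
  ctx.mem_depth = 1;  // simplifying inside a MEM address
  // With a nonzero mem_depth, multiplication by a power of 2 can stay
  // in MULT form instead of being canonicalized to ASHIFT:
  rtx addr = ctx.simplify_gen_binary (MULT, Pmode, index_reg, GEN_INT (4));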



>
> Later patches make use of this.
>
> gcc/
> 	* rtl.h (simplify_context): New class.
> 	(simplify_unary_operation, simplify_binary_operation): Use it.
> 	(simplify_ternary_operation, simplify_relational_operation): Likewise.
> 	(simplify_subreg, simplify_gen_unary, simplify_gen_binary): Likewise.
> 	(simplify_gen_ternary, simplify_gen_relational): Likewise.
> 	(simplify_gen_subreg, lowpart_subreg): Likewise.
> 	* simplify-rtx.c (simplify_gen_binary): Turn into a member function
> 	of simplify_context.
> 	(simplify_gen_unary, simplify_gen_ternary, simplify_gen_relational)
> 	(simplify_truncation, simplify_unary_operation): Likewise.
> 	(simplify_unary_operation_1, simplify_byte_swapping_operation)
> 	(simplify_associative_operation, simplify_logical_relational_operation)
> 	(simplify_binary_operation, simplify_binary_operation_series)
> 	(simplify_distributive_operation, simplify_plus_minus): Likewise.
> 	(simplify_relational_operation, simplify_relational_operation_1)
> 	(simplify_cond_clz_ctz, simplify_merge_mask): Likewise.
> 	(simplify_ternary_operation, simplify_subreg, simplify_gen_subreg)
> 	(lowpart_subreg): Likewise.
> 	(simplify_binary_operation_1): Likewise.  Test mem_depth when
> 	deciding whether the ASHIFT or MULT form is canonical.
> 	(simplify_merge_mask): Use simplify_context.
I like it.  OK for the trunk.

jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [09/23] Add a cut-down version of std::span (array_slice)
  2020-11-13  8:15 ` [09/23] Add a cut-down version of std::span (array_slice) Richard Sandiford
@ 2020-11-30 19:56   ` Jeff Law
  2022-08-03 15:13   ` Martin Jambor
  2022-08-10 16:03   ` Martin Jambor
  2 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-11-30 19:56 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:15 AM, Richard Sandiford via Gcc-patches wrote:
> A later patch wants to be able to pass around subarray views of an
> existing array.  The standard class to do that is std::span, but it's
> a C++20 thing.  This patch just adds a cut-down version of it.
>
> The intention is just to provide what's currently needed.
>
> gcc/
> 	* vec.h (array_slice): New class.
OK.  Obviously we can add more capabilities as we need them.

jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [15/23] recog: Add a validate_change_xveclen function
  2020-11-13  8:19 ` [15/23] recog: Add a validate_change_xveclen function Richard Sandiford
@ 2020-11-30 20:03   ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-11-30 20:03 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:19 AM, Richard Sandiford via Gcc-patches wrote:
> A later patch wants to be able to use the validate_change machinery
> to reduce the XVECLEN of a PARALLEL.  This should be more efficient
> than allocating a separate PARALLEL at a possibly distant memory
> location, especially since the new PARALLEL would be garbage rtl if
> the new pattern turns out not to match.  Combine already pulls this
> trick with SUBST_INT.
>
> This patch adds a general helper for doing that.
>
> gcc/
> 	* recog.h (validate_change_xveclen): Declare.
> 	* recog.c (change_t::old_len): New field.
> 	(validate_change_1): Add a new_len parameter.  Conditionally
> 	replace the XVECLEN of an rtx, avoiding single-element PARALLELs.
> 	(validate_change_xveclen): New function.
> 	(cancel_changes): Undo changes made by validate_change_xveclen.
OK.

Presumably it's better to reduce the length of the existing vector,
possibly losing a bit of memory off the end rather than releasing the
existing vector and allocating a new one of the desired length (as you
say it'd likely have no good locality).

And note that when I say lose, I mean lose the ability to use those
slots after we shrink the vector (say if we later wanted to lengthen
it).  The memory doesn't actually leak as it'll be released if we ever
release the vector.
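
Roughly, as I understand it (illustrative sketch, with the argument
order from memory rather than from the patch):

   (parallel [(set ...) (set ...) (clobber ...)])  ;; XVECLEN == 3
   ;; after validate_change_xveclen (insn, &PATTERN (insn), 2, true):
   (parallel [(set ...) (set ...)])                ;; XVECLEN == 2

i.e. the trailing element is fenced off in place rather than the whole
vector being reallocated.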


Jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [00/23] Make fwprop use an on-the-side RTL SSA representation
  2020-11-27 16:31       ` Richard Sandiford
@ 2020-11-30 21:13         ` Jeff Law
  2020-12-01  0:03           ` Michael Matz
  0 siblings, 1 reply; 88+ messages in thread
From: Jeff Law @ 2020-11-30 21:13 UTC (permalink / raw)
  To: Michael Matz, Jeff Law via Gcc-patches, richard.sandiford



On 11/27/20 9:31 AM, Richard Sandiford via Gcc-patches wrote:
> Michael Matz <matz@suse.de> writes:
>> Hello,
>>
>> On Thu, 26 Nov 2020, Richard Sandiford via Gcc-patches wrote:
>>
>>>>> The aim is only to provide a different view of existing RTL instructions.
>>>>> Unlike gimple, and unlike (IIRC) the old RTL SSA project from way back,
>>>>> the new framework isn't a “native” SSA representation.  This means that
>>>>> all inputs to a phi node for a register R are also definitions of
>>>>> register R; no move operation is “hidden” in the phi node.
>>>> Hmm, I'm trying to parse what the last phrase means.  Does it mean that
>>>> the "hidden copy" problem for out-of-ssa is avoided?  And if so, how is
>>>> that maintained over time.  Things like copy-prop will tend to introduce
>>>> those issues even if they didn't originally exist.
>>> Yeah, the phi nodes simply say which definition of register R provides
>>> the value of R on a particular incoming edge.  That definition will
>>> itself be a phi node for R, an artificial definition of R created by DF
>>> (e.g. for incoming function arguments or for EH data registers), or an
>>> actual instruction that sets R.
>>>
>>> In other words, the SSA form is a purely on-the-side thing and the
>>> underlying RTL instructions are maintained in the same way as normal.
>>> The SSA form can be deleted at any time without performing a separate
>>> out-of-ssa step.  In that respect it's different from cfglayout,
>>> for example.
>> Hmm, I don't see how that answers Jeff's question, if I got it correctly.  
>> If I didn't get it correctly let me ask my own version of the question :)
>>
>> (I haven't studied your implementation in detail, if I had maybe answers 
>> to the below would become obvious, sorry if so :) )
>>  
>> So, you're saying that in your implementation the operands of PHIs can be 
>> PHIs and real defs.
> Specifically real defs of the same register (or memory, for memory phis).
>
>> Further you say nothing about any restriction in RTL 
>> instruction moving and/or propagation.
> The RTL SSA form doesn't add any extra restrictions beyond those that apply
> to non-SSA RTL passes.  But it also doesn't take any restrictions away.
> In other words, the changes that RTL SSA passes make to RTL instructions
> are the same as those that non-SSA RTL passes would make.  The SSA form
> is just there to make it easier to process use-def chains (and also
> to process live ranges, to a limited extent).
>
>> So, then let's start with one of 
>> the prime examples of SSA deconstruction problems, the lost swap, and how 
>> it comes to be: we start with a swap:
>>
>>   x = ..., y = ...
>>   if (cond)
>>     tmp=x, x=y, y=tmp
>>
>> (1) into SSA:
>>
>>   x0 = ..., y0 = ...
>>   if (cond)
>>     tmp = x0, x1=y0, y1=tmp;
>>   x2 = PHI(x0,x1),  y2 = PHI(y0,y1)
>>
>> (2) copy-prop:
>>
>>   x0 = ..., y0 = ...
>>   if (cond)
>>     ;
>>   x2 = PHI(x0,y0),  y2 = PHI(y0,x0)
> So the point is that this isn't what the RTL would look like even
> when using RTL SSA.  Putting y0 in x2 PHI and x0 in the y2 PHI is
> representationally invalid.
>
> Like I say, this isn't a “native” SSA form: it's just using SSA
> constructs to represent dataflow in normal RTL.
It appears that the PHI arguments have to be different instances of the
result.  So the case above can't happen, which helps, but I'm not sure
it's necessarily sufficient to avoid all the problems in this space.  
IIRC you can get into a similar scenario by transformations that result
in overlapping lifetimes for different instances of the same object. 
They didn't necessarily overlap when the SSA form was created, but may
after things like CSE or copy propagation.

The fact that passes don't directly manipulate the PHIs definitely helps
as well.  But I've still got some reading to do in this space to refresh
my memory of the issues.

jeff



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [00/23] Make fwprop use an on-the-side RTL SSA representation
  2020-11-30 21:13         ` Jeff Law
@ 2020-12-01  0:03           ` Michael Matz
  2020-12-01 10:15             ` Richard Sandiford
  2020-12-02  0:25             ` Jeff Law
  0 siblings, 2 replies; 88+ messages in thread
From: Michael Matz @ 2020-12-01  0:03 UTC (permalink / raw)
  To: Jeff Law; +Cc: Jeff Law via Gcc-patches, richard.sandiford

Hello,

On Mon, 30 Nov 2020, Jeff Law wrote:

> >> So, then let's start with one of 
> >> the prime examples of SSA deconstruction problems, the lost swap, and how 
> >> it comes to be: we start with a swap:
> >>
> >>   x = ..., y = ...
> >>   if (cond)
> >>     tmp=x, x=y, y=tmp
> >>
> >> (1) into SSA:
> >>
> >>   x0 = ..., y0 = ...
> >>   if (cond)
> >>     tmp = x0, x1=y0, y1=tmp;
> >>   x2 = PHI(x0,x1),  y2 = PHI(y0,y1)
> >>
> >> (2) copy-prop:
> >>
> >>   x0 = ..., y0 = ...
> >>   if (cond)
> >>     ;
> >>   x2 = PHI(x0,y0),  y2 = PHI(y0,x0)
> > So the point is that this isn't what the RTL would look like even
> > when using RTL SSA.  Putting y0 in x2 PHI and x0 in the y2 PHI is
> > representationally invalid.
> >
> > Like I say, this isn't a “native” SSA form: it's just using SSA
> > constructs to represent dataflow in normal RTL.
> It appears that the PHI arguments have to be different instances of the
> result.  So the case above can't happen, which helps, but I'm not sure
> it's necessarily sufficient to avoid all the problems in this space.
> IIRC you can get into a similar scenario by transformations that result
> in overlapping lifetimes for different instances of the same object. 
> They didn't necessarily overlap when the SSA form was created, but may
> after things like CSE or copy propagation.

I think the reasoning why this can't (or should not) happen is the 
following: if different instances of the same objects (say, one before, 
one after a modification) exist, they must necessarily be stored in 
different pseudos (otherwise the RTL transformation itself was already 
invalid), and that causes them to be invalid operands of the same PHI 
node.  Ala:

input:

   regA =  ....    /1
   use1(regA)      /2
   regA += ...     /3
   use2(regA)      /4

let's try creating different instances of regA (from point 2 and 4) that 
overlap, e.g. by swapping insns 2 and 3.  We _have_ to rename regA from 
insn 3 into a new pseudo, otherwise the uses of 2 and 4 can't be 
differentiated anymore, so:

   regA  =  ....    /1
   regA' = regA
   regA' += ....    /3'
   use1(regA)       /2
   use2(regA')      /4'

>> So if Richard's model constrains the pseudo PHI nodes such that regA and 
regA' can't be operands of one, that might solve the issue, as both the 
lost copy and the swap problem need overlaps of different values to occur.

> The fact that passes don't directly manipulate the PHIs definitely helps
> as well.  But I've still got some reading to do in this space to refresh
> my memory of the issues.

>> AFAIU Richard's approach is more comparable to factored def-use chains than 
to real SSA, which might indeed have no issues, though I then think the 
problem moves into keeping _those_ consistent with the real instruction 
stream as it changes.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [00/23] Make fwprop use an on-the-side RTL SSA representation
  2020-12-01  0:03           ` Michael Matz
@ 2020-12-01 10:15             ` Richard Sandiford
  2020-12-02  0:25             ` Jeff Law
  1 sibling, 0 replies; 88+ messages in thread
From: Richard Sandiford @ 2020-12-01 10:15 UTC (permalink / raw)
  To: Michael Matz; +Cc: Jeff Law, Jeff Law via Gcc-patches

Michael Matz <matz@suse.de> writes:
> Hello,
>
> On Mon, 30 Nov 2020, Jeff Law wrote:
>
>> >> So, then let's start with one of 
>> >> the prime examples of SSA deconstruction problems, the lost swap, and how 
>> >> it comes to be: we start with a swap:
>> >>
>> >>   x = ..., y = ...
>> >>   if (cond)
>> >>     tmp=x, x=y, y=tmp
>> >>
>> >> (1) into SSA:
>> >>
>> >>   x0 = ..., y0 = ...
>> >>   if (cond)
>> >>     tmp = x0, x1=y0, y1=tmp;
>> >>   x2 = PHI(x0,x1),  y2 = PHI(y0,y1)
>> >>
>> >> (2) copy-prop:
>> >>
>> >>   x0 = ..., y0 = ...
>> >>   if (cond)
>> >>     ;
>> >>   x2 = PHI(x0,y0),  y2 = PHI(y0,x0)
>> > So the point is that this isn't what the RTL would look like even
>> > when using RTL SSA.  Putting y0 in x2 PHI and x0 in the y2 PHI is
>> > representationally invalid.
>> >
>> > Like I say, this isn't a “native” SSA form: it's just using SSA
>> > constructs to represent dataflow in normal RTL.
>> It appears that the PHI arguments have to be different instances of the
>> result.  So the case above can't happen, which helps, but I'm not sure
>> it's necessarily sufficient to avoid all the problems in this space.
>> IIRC you can get into a similar scenario by transformations that result
>> in overlapping lifetimes for different instances of the same object. 
>> They didn't necessarily overlap when the SSA form was created, but may
>> after things like CSE or copy propagation.
>
> I think the reasoning why this can't (or should not) happen is the 
> following: if different instances of the same objects (say, one before, 
> one after a modification) exist, they must necessarily be stored in 
> different pseudos (otherwise the RTL transformation itself was already 
> invalid), and that causes them to be invalid operands of the same PHI 
> node.  Ala:
>
> input:
>
>    regA =  ....    /1
>    use1(regA)      /2
>    regA += ...     /3
>    use2(regA)      /4
>
> let's try creating different instances of regA (from point 2 and 4) that 
> overlap, e.g. by swapping insns 2 and 3.  We _have_ to rename regA from 
> insn 3 into a new pseudo, otherwise the uses of 2 and 4 can't be 
> differentiated anymore, so:
>
>    regA  =  ....    /1
>    regA' = regA
>    regA' += ....    /3'
>    use1(regA)       /2
>    use2(regA')      /4'
>
> So if Richard's model constrains the pseudo PHI nodes such that regA and 
> regA' can't be operands of one, that might solve the issue, as both the 
> lost copy and the swap problem need overlaps of different values to occur.

Right.  This isn't conceptually different from the way that virtual
operands work in gimple.  It's just that rather than having one vop
for memory (as in gimple), we have one vop for memory, one vop
for each individual hard register, and one vop for each individual
pseudo register.
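
So, in this notation, the only phis that the framework can represent
are of the form:

   r100_3 = PHI (r100_1, r100_2)   ;; both inputs are defs of r100

where r100_1 and r100_2 are simply two definitions of the same pseudo
r100.  Something like:

   r100_3 = PHI (r100_1, r101_1)   ;; input from a different register

is not representable; a real move instruction from r101 to r100 would
have to exist first, and that move would then provide the r100 input.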

FWIW, there's also a routine (rtl_ssa::restrict_movement) that takes
the changes that a pass wants to make to an instruction and finds a
range of locations that satisfy all the constraints.
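In case the shape of that helps: the interface is used very roughly
like this (simplified; see the rtl-ssa patch for the real details):

   insn_change change (insn);
   // ...fill in the proposed new pattern, uses and definitions...
   if (restrict_movement (change))
     // change.move_range now only contains locations at which all
     // the instruction's uses and definitions stay valid.

with the caller then free to pick any location in the restricted range.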

Thanks,
Richard

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [00/23] Make fwprop use an on-the-side RTL SSA representation
  2020-12-01  0:03           ` Michael Matz
  2020-12-01 10:15             ` Richard Sandiford
@ 2020-12-02  0:25             ` Jeff Law
  1 sibling, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-12-02  0:25 UTC (permalink / raw)
  To: Michael Matz; +Cc: Jeff Law via Gcc-patches, richard.sandiford



On 11/30/20 5:03 PM, Michael Matz wrote:
> Hello,
>
> On Mon, 30 Nov 2020, Jeff Law wrote:
>
>>>> So, then let's start with one of 
>>>> the prime examples of SSA deconstruction problems, the lost swap, and how 
>>>> it comes to be: we start with a swap:
>>>>
>>>>   x = ..., y = ...
>>>>   if (cond)
>>>>     tmp=x, x=y, y=tmp
>>>>
>>>> (1) into SSA:
>>>>
>>>>   x0 = ..., y0 = ...
>>>>   if (cond)
>>>>     tmp = x0, x1=y0, y1=tmp;
>>>>   x2 = PHI(x0,x1),  y2 = PHI(y0,y1)
>>>>
>>>> (2) copy-prop:
>>>>
>>>>   x0 = ..., y0 = ...
>>>>   if (cond)
>>>>     ;
>>>>   x2 = PHI(x0,y0),  y2 = PHI(y0,x0)
>>> So the point is that this isn't what the RTL would look like even
>>> when using RTL SSA.  Putting y0 in x2 PHI and x0 in the y2 PHI is
>>> representationally invalid.
>>>
>>> Like I say, this isn't a “native” SSA form: it's just using SSA
>>> constructs to represent dataflow in normal RTL.
>> It appears that the PHI arguments have to be different instances of the
>> result.  So the case above can't happen, which helps, but I'm not sure
>> it's necessarily sufficient to avoid all the problems in this space.
>> IIRC you can get into a similar scenario by transformations that result
>> in overlapping lifetimes for different instances of the same object. 
>> They didn't necessarily overlap when the SSA form was created, but may
>> after things like CSE or copy propagation.
> I think the reasoning why this can't (or should not) happen is the 
> following: if different instances of the same objects (say, one before, 
> one after a modification) exist, they must necessarily be stored in 
> different pseudos (otherwise the RTL transformation itself was already 
> invalid), and that causes them to be invalid operands of the same PHI 
> node.  Ala:
>
> input:
>
>    regA =  ....    /1
>    use1(regA)      /2
>    regA += ...     /3
>    use2(regA)      /4
>
> let's try creating different instances of regA (from point 2 and 4) that 
> overlap, e.g. by swapping insns 2 and 3.  We _have_ to rename regA from 
> insn 3 into a new pseudo, otherwise the uses of 2 and 4 can't be 
> differentiated anymore, so:
>
>    regA  =  ....    /1
>    regA' = regA
>    regA' += ....    /3'
>    use1(regA)       /2
>    use2(regA')      /4'
>
> So if Richard's model constrains the pseudo PHI nodes such that regA and 
> regA' can't be operands of one, that might solve the issue, as both the 
> lost copy and the swap problem need overlaps of different values to occur.
Right.  I was thinking about cases where something like CSE on this form
transforms the RHS of some operation into an instance of a pseudo.  That
insn is now a copy and we propagate the RHS into the uses of the LHS. 
That extends the lifetime of the pseudo's instance.  The question is
whether or not those actions either create a lost copy/swap problem or
not within the on-the-side SSA representation and whether or not there
could be implications if that happens.

>
>> The fact that passes don't directly manipulate the PHIs definitely helps
>> as well.  But I've still got some reading to do in this space to refresh
>> my memory of the issues.
> AFAIU Richard's approach is more comparable to factored def-use chains than 
> to real SSA, which might indeed have no issues, though I then think the 
> problem moves into keeping _those_ consistent with the real instruction 
> stream as it changes.
To some degree, the change in model moves where we have to tackle these
issues.  Instead of tackling them in the out-of-ssa phase, we instead
have to think more about them in the analysis/optimization phases.  We
already do that to some degree in the gimple SSA representation (for
things like SSA_NAMEs associated with abnormal edges).

jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [10/23] Tweak the way that is_a is implemented
  2020-11-13  8:16 ` [10/23] Tweak the way that is_a is implemented Richard Sandiford
@ 2020-12-02  5:15   ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-12-02  5:15 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:16 AM, Richard Sandiford via Gcc-patches wrote:
> At the moment, class hierarchies that use is_a are expected
> to define specialisations like:
>
>   template <>
>   template <>
>   inline bool
>   is_a_helper <cgraph_node *>::test (symtab_node *p)
>   {
>     return p->type == SYMTAB_FUNCTION;
>   }
>
> But this doesn't scale well to larger hierarchies, because it only
> defines ::test for an argument that is exactly “symtab_node *”
> (and not for example “const symtab_node *” or something that
> comes between cgraph_node and symtab_node in the hierarchy).
>
> For example:
>
>   struct A { int x; };
>   struct B : A {};
>   struct C : B {};
>
>   template <>
>   template <>
>   inline bool
>   is_a_helper <C *>::test (A *a)
>   {
>     return a->x == 1;
>   }
>
>   bool f(B *b) { return is_a<C *> (b); }
>
> gives:
>
>   warning: inline function ‘static bool is_a_helper<T>::test(U*) [with U = B; T = C*]’ used but never defined
>
> and:
>
>   bool f(const A *a) { return is_a<const C *> (a); }
>
> gives:
>
>   warning: inline function ‘static bool is_a_helper<T>::test(U*) [with U = const A; T = const C*]’ used but never defined
>
> This patch instead allows is_a to be implemented by specialising
> is_a_helper as a whole, for example:
>
>   template<>
>   struct is_a_helper<C *> : static_is_a_helper<C *>
>   {
>     static inline bool test (const A *a) { return a->x == 1; }
>   };
>
> It also adds a general specialisation of is_a_helper for const
> pointers.  Together, this makes both of the above examples work.
>
> gcc/
> 	* is-a.h (reinterpret_is_a_helper): New class.
> 	(static_is_a_helper): Likewise.
> 	(is_a_helper): Inherit from reinterpret_is_a_helper.
> 	(is_a_helper<const T *>): New specialization.
OK

Jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [08/23] Add an alternative splay tree implementation
  2020-11-13  8:15 ` [08/23] Add an alternative splay tree implementation Richard Sandiford
@ 2020-12-02 20:36   ` Jeff Law
  2020-12-17  0:29     ` Richard Sandiford
  2021-01-01  8:25   ` Andreas Schwab
  1 sibling, 1 reply; 88+ messages in thread
From: Jeff Law @ 2020-12-02 20:36 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:15 AM, Richard Sandiford via Gcc-patches wrote:
> We already have two splay tree implementations: the old C one in
> libiberty and a templated reimplementation of it in typed-splay-tree.h.
> However, they have some drawbacks:
>
> - They hard-code the assumption that nodes should have both a key and
>   a value, which isn't always true.
>
> - They use the two-phase method of lookup, and so nodes need to store
>   a temporary back pointer.  We can avoid that overhead by using the
>   top-down method (as e.g. the bitmap tree code already does).
>
> - The tree node has to own the key and the value.  For some use cases
>   it's more convenient to embed the tree links in the value instead.
>
> Also, a later patch wants to use splay trees to represent an
> adaptive total order: the splay tree itself records whether node N1
> is less than node N2, and (in the worst case) comparing nodes is
> a splay operation.
>
> This patch therefore adds an alternative implementation.  The main
> features are:
>
> - Nodes can optionally point back to their parents.
>
> - An Accessors class abstracts accessing child nodes and (where
>   applicable) parent nodes, so that the information can be embedded
>   in larger data structures.
>
> - There is no fixed comparison function at the class level.  Instead,
>   individual functions that do comparisons take a comparison function
>   argument.
>
> - There are two styles of comparison function, optimised for different
>   use cases.  (See the comments in the patch for details.)
>
> - It's possible to do some operations directly on a given node,
>   without knowing whether it's the root.  This includes the comparison
>   use case described above.
>
> This of course has its own set of drawbacks.  It's really providing
> splay utility functions rather than a true ADT, and so is more low-level
> than the existing routines.  It's mostly geared for cases in which the
> client code wants to participate in the splay operations to some extent.
>
> gcc/
> 	* Makefile.in (OBJS): Add splay-tree-utils.o.
> 	* system.h: Include <array> when INCLUDE_ARRAY is defined.
> 	* selftest.h (splay_tree_cc_tests): Declare.
> 	* selftest-run-tests.c (selftest::run_tests): Run splay_tree_cc_tests.
> 	* splay-tree-utils.h: New file.
> 	* splay-tree-utils.tcc: Likewise.
> 	* splay-tree-utils.cc: Likewise.
I must admit, I'm not a fan of adding another splay tree.  Though I
suspect the one in libiberty will be there forever since there could
well be clients outside our source base.

The typed_splay_tree implementation however is internal to GCC and only
has a couple users (the JIT and fixit hints).  Is there any chance we
could convert those two users to the new one?  Your cover hints that's
not the case, but I'm going to explicitly ask :-)

As for the implementation, I've got no real concerns there.  If there's
issues, I'm sure you'll deal with them.

Jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [17/23] recog: Add a class for propagating into insns
  2020-11-13  8:20 ` [17/23] recog: Add a class for propagating into insns Richard Sandiford
@ 2020-12-03 22:32   ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-12-03 22:32 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:20 AM, Richard Sandiford via Gcc-patches wrote:
> This patch adds yet another way of propagating into an instruction and
> simplifying the result.  (The net effect of the series is to keep the
> total number of propagation approaches the same though, since a later
> patch removes the fwprop.c routines.)
>
> One of the drawbacks of the validate_replace_* routines is that
> they only do simple simplifications, mostly canonicalisations:
>
>   /* Do changes needed to keep rtx consistent.  Don't do any other
>      simplifications, as it is not our job.  */
>   if (simplify)
>     simplify_while_replacing (loc, to, object, op0_mode);
>
> But substituting can often lead to real simplification opportunities.
> simplify-rtx.c:simplify_replace_rtx does fully simplify the result,
> but it only operates on specific rvalues rather than full instruction
> patterns.  It is also nondestructive, which means that it returns a
> new rtx whenever a substitution or simplification was possible.
> This can create quite a bit of garbage rtl in the context of a
> speculative recog, where changing the contents of a pointer is
> often enough.
>
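[ A concrete example of the difference (illustrative): if r100 is known
  to be zero, substituting it into

      (set (reg:SI 101) (plus:SI (reg:SI 100) (reg:SI 102)))

  gives (plus:SI (const_int 0) (reg:SI 102)); a fully simplifying
  substitution instead folds the SET_SRC straight to (reg:SI 102). ]
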
> The new routines are therefore supposed to provide simplify_replace_rtx-
> style substitution in recog.  They go to some effort to prevent garbage
> rtl from being created.
>
> At the moment, the new routines fail if the pattern would still refer
> to the old "from" value in some way.  That might be unnecessary in
> some contexts; if so, it could be put behind a configuration parameter.
>
> gcc/
> 	* recog.h (insn_propagation): New class.
> 	* recog.c (insn_propagation::apply_to_mem_1): New function.
> 	(insn_propagation::apply_to_rvalue_1): Likewise.
> 	(insn_propagation::apply_to_lvalue_1): Likewise.
> 	(insn_propagation::apply_to_pattern_1): Likewise.
> 	(insn_propagation::apply_to_pattern): Likewise.
> 	(insn_propagation::apply_to_rvalue): Likewise.
>

OK
jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [19/23] rtlanal: Add some new helper classes
  2020-11-13  8:20 ` [19/23] rtlanal: Add some new helper classes Richard Sandiford
@ 2020-12-13 17:30   ` Jeff Law
  2020-12-14 16:37     ` Richard Sandiford
  0 siblings, 1 reply; 88+ messages in thread
From: Jeff Law @ 2020-12-13 17:30 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:20 AM, Richard Sandiford via Gcc-patches wrote:
> This patch adds some classes for gathering the list of registers
> and memory that are read and written by an instruction, along
> with various properties about the accesses.  In some ways it's
> similar to the information that DF collects for registers,
> but extended to memory.  The main reason for using it instead
> of DF is that it can analyse tentative changes to instructions
> before they've been committed.
>
> The classes also collect general information about the instruction,
> since it's cheap to do and helps to avoid multiple walks of the same
> RTL pattern.
>
> I've tried to optimise the code quite a bit, since with later patches
> it becomes relatively performance-sensitive.  See the discussion in
> the comments for the trade-offs involved.
>
> I put the declarations in a new rtlanal.h header file since it
> seemed a bit excessive to put so much new inline stuff in rtl.h.
>
> gcc/
> 	* rtlanal.h: New file.
> 	(MEM_REGNO): New constant.
> 	(rtx_obj_flags): New namespace.
> 	(rtx_obj_reference, rtx_properties): New classes.
> 	(growing_rtx_properties, vec_rtx_properties_base): Likewise.
> 	(vec_rtx_properties): New alias.
> 	* rtlanal.c: Include it.
> 	(rtx_properties::try_to_add_reg): New function.
> 	(rtx_properties::try_to_add_dest): Likewise.
> 	(rtx_properties::try_to_add_src): Likewise.
> 	(rtx_properties::try_to_add_pattern): Likewise.
> 	(rtx_properties::try_to_add_insn): Likewise.
> 	(vec_rtx_properties_base::grow): Likewise.
One might argue at least some of these should become first class
properties of insns but then we have the joy of keeping them up-to-date
as transformations are made.  It also reminds me a bit of the old
var_ann stuff we had in the tree SSA implementation. 

OK. 

jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [19/23] rtlanal: Add some new helper classes
  2020-12-13 17:30   ` Jeff Law
@ 2020-12-14 16:37     ` Richard Sandiford
  2020-12-14 20:02       ` Jeff Law
  0 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-12-14 16:37 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

Jeff Law <law@redhat.com> writes:
> On 11/13/20 1:20 AM, Richard Sandiford via Gcc-patches wrote:
>> This patch adds some classes for gathering the list of registers
>> and memory that are read and written by an instruction, along
>> with various properties about the accesses.  In some ways it's
>> similar to the information that DF collects for registers,
>> but extended to memory.  The main reason for using it instead
>> of DF is that it can analyse tentative changes to instructions
>> before they've been committed.
>>
>> The classes also collect general information about the instruction,
>> since it's cheap to do and helps to avoid multiple walks of the same
>> RTL pattern.
>>
>> I've tried to optimise the code quite a bit, since with later patches
>> it becomes relatively performance-sensitive.  See the discussion in
>> the comments for the trade-offs involved.
>>
>> I put the declarations in a new rtlanal.h header file since it
>> seemed a bit excessive to put so much new inline stuff in rtl.h.
>>
>> gcc/
>> 	* rtlanal.h: New file.
>> 	(MEM_REGNO): New constant.
>> 	(rtx_obj_flags): New namespace.
>> 	(rtx_obj_reference, rtx_properties): New classes.
>> 	(growing_rtx_properties, vec_rtx_properties_base): Likewise.
>> 	(vec_rtx_properties): New alias.
>> 	* rtlanal.c: Include it.
>> 	(rtx_properties::try_to_add_reg): New function.
>> 	(rtx_properties::try_to_add_dest): Likewise.
>> 	(rtx_properties::try_to_add_src): Likewise.
>> 	(rtx_properties::try_to_add_pattern): Likewise.
>> 	(rtx_properties::try_to_add_insn): Likewise.
>> 	(vec_rtx_properties_base::grow): Likewise.
> One might argue at least some of these should become first class
> properties of insns but then we have the joy of keeping them up-to-date
> as transformations are made.  It also reminds me a bit of the old
> var_ann stuff we had in the tree SSA implementation. 

Yeah.  The RTL SSA insn info does store these properties, but that has
the advantage of being new code and so can require all updates to go
through new interfaces.  I agree that ideally we'd store the information
directly in RTL insns instead.

I guess one question is where we would store the new flags.  In some
ways it might be easier to do that after any future split of rtx_insn
and rtx, since we could then use a smaller code field and potentially
replace the mode field.  (Things like :TI markers for VLIW bundles
could use a separate flag instead.)

It might even be possible to do the update fairly cheaply, if we made
the generators record static flags for the fixed part of the pattern
and only computed the operand information dynamically.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [19/23] rtlanal: Add some new helper classes
  2020-12-14 16:37     ` Richard Sandiford
@ 2020-12-14 20:02       ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-12-14 20:02 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 12/14/20 9:37 AM, Richard Sandiford wrote:
> Jeff Law <law@redhat.com> writes:
>> On 11/13/20 1:20 AM, Richard Sandiford via Gcc-patches wrote:
>>> This patch adds some classes for gathering the list of registers
>>> and memory that are read and written by an instruction, along
>>> with various properties about the accesses.  In some ways it's
>>> similar to the information that DF collects for registers,
>>> but extended to memory.  The main reason for using it instead
>>> of DF is that it can analyse tentative changes to instructions
>>> before they've been committed.
>>>
>>> The classes also collect general information about the instruction,
>>> since it's cheap to do and helps to avoid multiple walks of the same
>>> RTL pattern.
>>>
>>> I've tried to optimise the code quite a bit, since with later patches
>>> it becomes relatively performance-sensitive.  See the discussion in
>>> the comments for the trade-offs involved.
>>>
>>> I put the declarations in a new rtlanal.h header file since it
>>> seemed a bit excessive to put so much new inline stuff in rtl.h.
>>>
>>> gcc/
>>> 	* rtlanal.h: New file.
>>> 	(MEM_REGNO): New constant.
>>> 	(rtx_obj_flags): New namespace.
>>> 	(rtx_obj_reference, rtx_properties): New classes.
>>> 	(growing_rtx_properties, vec_rtx_properties_base): Likewise.
>>> 	(vec_rtx_properties): New alias.
>>> 	* rtlanal.c: Include it.
>>> 	(rtx_properties::try_to_add_reg): New function.
>>> 	(rtx_properties::try_to_add_dest): Likewise.
>>> 	(rtx_properties::try_to_add_src): Likewise.
>>> 	(rtx_properties::try_to_add_pattern): Likewise.
>>> 	(rtx_properties::try_to_add_insn): Likewise.
>>> 	(vec_rtx_properties_base::grow): Likewise.
>> One might argue at least some of these should become first class
>> properties of insns but then we have the joy of keeping them up-to-date
>> as transformations are made.  It also reminds me a bit of the old
>> var_ann stuff we had in the tree SSA implementation. 
> Yeah.  The RTL SSA insn info does store these properties, but that has
> the advantage of being new code and so can require all updates to go
> through new interfaces.  I agree that ideally we'd store the information
> directly in RTL insns instead.
I guess it's less problematic since we're using real classes.  The old
annotation stuff was done before we had C++ & classes.

>
> I guess one question is where we would store the new flags.  In some
> ways it might be easier to do that after any future split of rtx_insn
> and rtx, since we could then use a smaller code field and potentially
> replace the mode field.  (Things like :TI markers for VLIW bundles
> could use a separate flag instead.)
Dunno offhand.  I'm OK with things as-is for now -- again, having
classes allows us to use real wrappers and enforce a degree of access
control, which greatly diminishes the pain we had with var_ann.

Splitting off rtx_insn from rtx is independent, but definitely something
I want to see move forward.

Jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH 22/23] Add rtl-ssa
  2020-11-13  8:23 ` [PATCH 22/23] Add rtl-ssa Richard Sandiford
@ 2020-12-16  3:31   ` Jeff Law
  2020-12-17  0:33     ` Richard Sandiford
  0 siblings, 1 reply; 88+ messages in thread
From: Jeff Law @ 2020-12-16  3:31 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:23 AM, Richard Sandiford via Gcc-patches wrote:
> This patch adds the RTL SSA infrastructure itself.  The following
> fwprop.c patch will make use of it.
>
> gcc/
> 	* configure.ac: Add rtl-ssa to the list of dependence directories.
> 	* configure: Regenerate.
> 	* Makefile.in (rtl-ssa-warn): New variable.
> 	(OBJS): Add the rtl-ssa object files.
> 	* emit-rtl.h (rtl_data::ssa): New field.
> 	* rtl-ssa.h: New file.
> 	* system.h: Include <functional> when INCLUDE_FUNCTIONAL is defined.
> 	* rtl-ssa: New directory.
So I know its late in stage3, but I think the risks here are relatively
minor as this would only be used in fwprop right now and it was posted
in advance of the deadline.

My high level questions/concerns have been addressed and while I
strongly suspect we'll have to iterate on infrastructure, I think we can
do that on demand as passes start to use the infrastructure and we have
a clearer picture of what tweaks we need to make for the clients.

Obviously we'll want to keep an eye on testresults, but again, the
potential for fallout here is relatively minor IMHO.  OK for the trunk.

Jeff



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH 23/23] fwprop: Rewrite to use RTL SSA
  2020-11-13  8:24 ` [PATCH 23/23] fwprop: Rewrite to use RTL SSA Richard Sandiford
@ 2020-12-16  3:52   ` Jeff Law
  2020-12-17  0:34     ` Richard Sandiford
  0 siblings, 1 reply; 88+ messages in thread
From: Jeff Law @ 2020-12-16  3:52 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 11/13/20 1:24 AM, Richard Sandiford via Gcc-patches wrote:
> This patch rewrites fwprop.c to use the RTL SSA framework.  It tries
> as far as possible to mimic the old behaviour, even in cases where
> that doesn't fit naturally with the new framework.  I've added ???
> comments to mark those places, but I think “fixing” them should
> be done separately to make bisection easier.
>
> In particular:
>
> * The old implementation iterated over uses, and after a successful
>   substitution, the new insn's uses were added to the end of the list.
>   The pass still processed those uses, but because it processed them at
>   the end, it didn't fully optimise one instruction before propagating
>   it into the next.
>
>   The new version follows the same approach for comparison purposes,
>   but I'd like to drop that as a follow-on patch.
>
> * The old implementation operated on single use sites (DF_REF_LOCs).
>   This doesn't work well for instructions with match_dups, where it's
>   necessary to update both an operand and its dups at the same time.
>   For example, attempting to substitute into a divmod instruction would
>   fail because only the div or the mod side would be updated.
>
>   The new version again follows this to some extent for comparison
>   purposes (although not exactly).  Again I'd like to drop it as a
>   follow-on patch.
>
>   One difference is that if a register occurs in multiple MEM addresses
>   in a set, the new version will try to update them all at once.  This is
>   what causes the SVE ACLE st4* output to improve.
>
> Also, the old version didn't naturally guarantee termination (PR79405),
> whereas the new one does.
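
[ To make the match_dup point above concrete (illustrative RTL): a
  divmod instruction computes both results in a single PARALLEL,

      (parallel
        [(set (reg:SI 0) (div:SI (reg:SI 2) (reg:SI 3)))
         (set (reg:SI 1) (mod:SI (reg:SI 2) (reg:SI 3)))])

  so a substitution for (reg:SI 2) has to update the DIV and MOD sides
  together; updating a single use site leaves a pattern that no longer
  matches the divmod define_insn. ]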
>
> gcc/
> 	* fwprop.c: Rewrite to use the RTL SSA framework.
>
> gcc/testsuite/
> 	* gcc.dg/rtl/x86_64/test-return-const.c.before-fwprop.c: Don't
> 	expect insn updates to be deferred.
> 	* gcc.target/aarch64/sve/acle/asm/st4_s8.c: Expect the addition
> 	to be folded into the address.
> 	* gcc.target/aarch64/sve/acle/asm/st4_s8.c: Likewise.
Consider killing the ADD_NOTES bits.

s/eqaul/equal/ to fix a typo.

Naturally I'm happy at how much by-hand RTL analysis code just
disappears with this change :-)

Ideally you'll drop this in tomorrow and we can get a fresh run of all
the targets in my tester before the weekend.  I won't be stressed if we
see some fallout, but I don't expect much.  I'll help track them down if
they occur.

Thanks for your patience.

Jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [07/23] Add a class that multiplexes two pointer types
  2020-11-28  0:17       ` Martin Sebor
@ 2020-12-17  0:17         ` Richard Sandiford
  2020-12-17 14:21           ` Tom Tromey
  0 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-12-17  0:17 UTC (permalink / raw)
  To: Martin Sebor; +Cc: gcc-patches

Martin Sebor <msebor@gmail.com> writes:
> On 11/26/20 10:06 AM, Richard Sandiford wrote:
>> Martin Sebor <msebor@gmail.com> writes:
>>> I do have one concern: the tendency to prioritize efficiency
>>> over safety (this can be said about most GCC code). Specifically
>>> in this class, the address bit twiddling makes me uneasy.  I don't
>>> think the object model in either language (certainly not C but
>>> I don't have the impression C++ either) makes it unequivocally
>>> valid.  On the contrary, I'd say many of us interpret the current
>>> rules as leaving it undefined.  There are efforts to sanction
>>> this sort of thing under some conditions (e.g, the C object
>>> model proposal) but they have not been adopted yet.  I think
>>> we should try to avoid exploiting these dark corners in new
>>> code.
>> 
>> I'd tried to stick to operations that I thought were well-defined.
>> The primitives being used are really:
>> 
>> (1) convert a T1* or T2* to char*
>> (2) increment an unincremented char*
>> (3) decrement an incremented char*
>> (4) convert a char* back to T1* or T2*
>> (5) convert a char* to an intptr_t in order to test its low bit
>
> All those are valid as long as the pointer points into the same
> object, both before and after.
>
>> I thought (1) and (4) were allowed.  At least, [basic.compound] says
>> that void* must be able to hold any object pointer and that it must have
>> the same representation as char*, so I thought the conversion in (1) was
>> guaranteed to be representable.  And (4) only ever undoes (1): it only
>> converts the result of (1) back to the original pointer type.
>> 
>> For (2) and (3), the incremented pointer will still be within the
>> containing object, so I thought it would be well-defined.  Here too,
>> (3) only ever undoes (2): it only decrements a pointer that had
>> previously been incremented.
>> 
>> One thing I'd deliberately tried to avoid was converting integers
>> “back” to pointers, because that seemed like a more dangerous thing.
>> That's why:
>> 
>>>> +template<typename T1, typename T2>
>>>> +inline T2 *
>>>> +pointer_mux<T1, T2>::second_or_null () const
>>>> +{
>>>> +  // Micro optimization that's effective as of GCC 11: compute the value
>>>> +  // of the second pointer as an integer and test that, so that the integer
>>>> +  // result can be reused as the pointer and so that all computation can
>>>> +  // happen before a branch on null.  This reduces the number of branches
>>>> +  // needed for loops.
>>>> +  return uintptr_t (m_ptr - 1) & 1 ? nullptr : known_second ();
>
> This is only valid if m_ptr points to the second byte of an object.
> If it points to the first byte of A then it's invalid.  This would
> make the test valid but the result strictly unspecified (though in
> practice I'd expect it to do what you expect):
>
>    return (uintptr_t (m_ptr) - 1) & 1 ? nullptr : known_second ();

Yeah, I think that's what I meant to write (and was what I thought
the code said when I quoted it, without looking properly).  It does
seem to preserve the optimisation.

Here's what I installed after retesting.

Thanks,
Richard


gcc/
	* mux-utils.h: New file.
---
 gcc/mux-utils.h | 251 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 251 insertions(+)
 create mode 100644 gcc/mux-utils.h

diff --git a/gcc/mux-utils.h b/gcc/mux-utils.h
new file mode 100644
index 00000000000..b026a9fa4c1
--- /dev/null
+++ b/gcc/mux-utils.h
@@ -0,0 +1,251 @@
+// Multiplexer utilities
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+//
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef GCC_MUX_UTILS_H
+#define GCC_MUX_UTILS_H 1
+
+// A class that stores a choice "A or B", where A has type T1 * and B has
+// type T2 *.  Both T1 and T2 must have an alignment greater than 1, since
+// the low bit is used to identify B over A.  T1 and T2 can be the same.
+//
+// A can be a null pointer but B cannot.
+//
+// Barring the requirement that B must be nonnull, using the class is
+// equivalent to using:
+//
+//     union { T1 *A; T2 *B; };
+//
+// and having a separate tag bit to indicate which alternative is active.
+// However, using this class can have two advantages over a union:
+//
+// - It avoids the need to find somewhere to store the tag bit.
+//
+// - The compiler is aware that B cannot be null, which can make checks
+//   of the form:
+//
+//       if (auto *B = mux.dyn_cast<T2 *> ())
+//
+//   more efficient.  With a union-based representation, the dyn_cast
+//   check could fail either because MUX is an A or because MUX is a
+//   null B, both of which require a run-time test.  With a pointer_mux,
+//   only a check for MUX being A is needed.
+template<typename T1, typename T2 = T1>
+class pointer_mux
+{
+public:
+  // Return an A pointer with the given value.
+  static pointer_mux first (T1 *);
+
+  // Return a B pointer with the given (nonnull) value.
+  static pointer_mux second (T2 *);
+
+  pointer_mux () = default;
+
+  // Create a null A pointer.
+  pointer_mux (std::nullptr_t) : m_ptr (nullptr) {}
+
+  // Create an A or B pointer with the given value.  This is only valid
+  // if T1 and T2 are distinct and if T can be resolved to exactly one
+  // of them.
+  template<typename T,
+	   typename Enable = typename
+	     std::enable_if<std::is_convertible<T *, T1 *>::value
+			    != std::is_convertible<T *, T2 *>::value>::type>
+  pointer_mux (T *ptr);
+
+  // Return true unless the pointer is a null A pointer.
+  explicit operator bool () const { return m_ptr; }
+
+  // Assign A and B pointers respectively.
+  void set_first (T1 *ptr) { *this = first (ptr); }
+  void set_second (T2 *ptr) { *this = second (ptr); }
+
+  // Return true if the pointer is an A pointer.
+  bool is_first () const { return !(uintptr_t (m_ptr) & 1); }
+
+  // Return true if the pointer is a B pointer.
+  bool is_second () const { return uintptr_t (m_ptr) & 1; }
+
+  // Return the contents of the pointer, given that it is known to be
+  // an A pointer.
+  T1 *known_first () const { return reinterpret_cast<T1 *> (m_ptr); }
+
+  // Return the contents of the pointer, given that it is known to be
+  // a B pointer.
+  T2 *known_second () const { return reinterpret_cast<T2 *> (m_ptr - 1); }
+
+  // If the pointer is an A pointer, return its contents, otherwise
+  // return null.  Thus a null return can mean that the pointer is
+  // either a null A pointer or a B pointer.
+  //
+  // If all A pointers are nonnull, it is more efficient to use:
+  //
+  //    if (ptr.is_first ())
+  //      ...use ptr.known_first ()...
+  //
+  // over:
+  //
+  //    if (T1 *a = ptr.first_or_null ())
+  //      ...use a...
+  T1 *first_or_null () const;
+
+  // If the pointer is a B pointer, return its contents, otherwise
+  // return null.  Using:
+  //
+  //    if (T1 *b = ptr.second_or_null ())
+  //      ...use b...
+  //
+  // should be at least as efficient as:
+  //
+  //    if (ptr.is_second ())
+  //      ...use ptr.known_second ()...
+  T2 *second_or_null () const;
+
+  // Return true if the pointer is a T.
+  //
+  // This is only valid if T1 and T2 are distinct and if T can be
+  // resolved to exactly one of them.  The condition is checked using
+  // a static assertion rather than SFINAE because it gives a clearer
+  // error message.
+  template<typename T>
+  bool is_a () const;
+
+  // Assert that the pointer is a T and return it as such.  See is_a
+  // for the restrictions on T.
+  template<typename T>
+  T as_a () const;
+
+  // If the pointer is a T, return it as such, otherwise return null.
+  // See is_a for the restrictions on T.
+  template<typename T>
+  T dyn_cast () const;
+
+private:
+  pointer_mux (char *ptr) : m_ptr (ptr) {}
+
+  // The pointer value for A pointers, or the pointer value + 1 for B pointers.
+  // Using a pointer rather than a uintptr_t tells the compiler that second ()
+  // can never return null, and that second_or_null () is only null if
+  // is_first ().
+  char *m_ptr;
+};
+
+template<typename T1, typename T2>
+inline pointer_mux<T1, T2>
+pointer_mux<T1, T2>::first (T1 *ptr)
+{
+  gcc_checking_assert (!(uintptr_t (ptr) & 1));
+  return reinterpret_cast<char *> (ptr);
+}
+
+template<typename T1, typename T2>
+inline pointer_mux<T1, T2>
+pointer_mux<T1, T2>::second (T2 *ptr)
+{
+  gcc_checking_assert (ptr && !(uintptr_t (ptr) & 1));
+  return reinterpret_cast<char *> (ptr) + 1;
+}
+
+template<typename T1, typename T2>
+template<typename T, typename Enable>
+inline pointer_mux<T1, T2>::pointer_mux (T *ptr)
+  : m_ptr (reinterpret_cast<char *> (ptr))
+{
+  if (std::is_convertible<T *, T2 *>::value)
+    {
+      gcc_checking_assert (m_ptr);
+      m_ptr += 1;
+    }
+}
+
+template<typename T1, typename T2>
+inline T1 *
+pointer_mux<T1, T2>::first_or_null () const
+{
+  return is_first () ? known_first () : nullptr;
+}
+
+template<typename T1, typename T2>
+inline T2 *
+pointer_mux<T1, T2>::second_or_null () const
+{
+  // Micro optimization that's effective as of GCC 11: compute the value
+  // of the second pointer as an integer and test that, so that the integer
+  // result can be reused as the pointer and so that all computation can
+  // happen before a branch on null.  This reduces the number of branches
+  // needed for loops.
+  return (uintptr_t (m_ptr) - 1) & 1 ? nullptr : known_second ();
+}
+
+template<typename T1, typename T2>
+template<typename T>
+inline bool
+pointer_mux<T1, T2>::is_a () const
+{
+  static_assert (std::is_convertible<T1 *, T>::value
+		 != std::is_convertible<T2 *, T>::value,
+		 "Ambiguous pointer type");
+  if (std::is_convertible<T2 *, T>::value)
+    return is_second ();
+  else
+    return is_first ();
+}
+
+template<typename T1, typename T2>
+template<typename T>
+inline T
+pointer_mux<T1, T2>::as_a () const
+{
+  static_assert (std::is_convertible<T1 *, T>::value
+		 != std::is_convertible<T2 *, T>::value,
+		 "Ambiguous pointer type");
+  if (std::is_convertible<T2 *, T>::value)
+    {
+      gcc_checking_assert (is_second ());
+      return reinterpret_cast<T> (m_ptr - 1);
+    }
+  else
+    {
+      gcc_checking_assert (is_first ());
+      return reinterpret_cast<T> (m_ptr);
+    }
+}
+
+template<typename T1, typename T2>
+template<typename T>
+inline T
+pointer_mux<T1, T2>::dyn_cast () const
+{
+  static_assert (std::is_convertible<T1 *, T>::value
+		 != std::is_convertible<T2 *, T>::value,
+		 "Ambiguous pointer type");
+  if (std::is_convertible<T2 *, T>::value)
+    {
+      if (is_second ())
+	return reinterpret_cast<T> (m_ptr - 1);
+    }
+  else
+    {
+      if (is_first ())
+	return reinterpret_cast<T> (m_ptr);
+    }
+  return nullptr;
+}
+
+#endif
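
[ A minimal usage sketch of the interface above; the types A and B and
  the use_b function are invented for illustration (any types with
  alignment greater than 1 would do):

      struct A { int value; };
      struct B { int value; };

      void
      example (A *a, B *b)
      {
        // B pointers must be nonnull.
        pointer_mux<A, B> mux = pointer_mux<A, B>::second (b);
        if (B *second = mux.dyn_cast<B *> ())
          use_b (second);   // use_b is a placeholder

        mux.set_first (a);  // switch the mux to an A pointer
        gcc_checking_assert (mux.is_first ());
      }
  ]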

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [16/23] recog: Add a way of temporarily undoing changes
  2020-11-25 20:27   ` Jeff Law
@ 2020-12-17  0:22     ` Richard Sandiford
  0 siblings, 0 replies; 88+ messages in thread
From: Richard Sandiford @ 2020-12-17  0:22 UTC (permalink / raw)
  To: Jeff Law via Gcc-patches

Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On 11/13/20 1:19 AM, Richard Sandiford via Gcc-patches wrote:
>> In some cases, it can be convenient to roll back the changes that
>> have been made by validate_change to see how things looked before,
>> then reroll the changes.  For example, this makes it possible
>> to defer calculating the cost of an instruction until we know that
>> the result is actually needed.  It can also make dumps easier to read.
>>
>> This patch adds a couple of helper functions for doing that.
>>
>> gcc/
>> 	* recog.h (temporarily_undo_changes, redo_changes): Declare.
>> 	* recog.c (swap_change, temporarily_undo_changes): New functions.
>> 	(redo_changes): Likewise.
> OK...  But...
> +
>> +/* Temporarily undo all the changes numbered NUM and up, with a view
>> +   to reapplying them later.  The next call to the changes machinery
>> +   must be:
>> +
>> +      redo_changes (NUM)
>> +
>> +   otherwise things will end up in an invalid state.  */
> It'd be nice if we had state validation in the other routines. Somebody
> is likely to mess this up at some point...

Yeah, good point, can definitely see myself doing that. :-)

Here's the version I committed, with temporarily_undone_changes tracking
what has been undone.

Thanks,
Richard


gcc/
	* recog.h (temporarily_undo_changes, redo_changes): Declare.
	* recog.c (temporarily_undone_changes): New variable.
	(validate_change_1, confirm_change_group): Check that it's zero.
	(cancel_changes): Likewise.
	(swap_change, temporarily_undo_changes): New functions.
	(redo_changes): Likewise.
---
 gcc/recog.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/recog.h |  2 ++
 2 files changed, 50 insertions(+)

diff --git a/gcc/recog.c b/gcc/recog.c
index 65125b8f0d1..cee481f4fa0 100644
--- a/gcc/recog.c
+++ b/gcc/recog.c
@@ -193,6 +193,7 @@ static change_t *changes;
 static int changes_allocated;
 
 static int num_changes = 0;
+static int temporarily_undone_changes = 0;
 
 /* Validate a proposed change to OBJECT.  LOC is the location in the rtl
    at which NEW_RTX will be placed.  If NEW_LEN is >= 0, XVECLEN (NEW_RTX, 0)
@@ -218,6 +219,7 @@ static bool
 validate_change_1 (rtx object, rtx *loc, rtx new_rtx, bool in_group,
 		   bool unshare, int new_len = -1)
 {
+  gcc_assert (temporarily_undone_changes == 0);
   rtx old = *loc;
 
   /* Single-element parallels aren't valid and won't match anything.
@@ -506,6 +508,7 @@ confirm_change_group (void)
   int i;
   rtx last_object = NULL;
 
+  gcc_assert (temporarily_undone_changes == 0);
   for (i = 0; i < num_changes; i++)
     {
       rtx object = changes[i].object;
@@ -561,6 +564,7 @@ num_validated_changes (void)
 void
 cancel_changes (int num)
 {
+  gcc_assert (temporarily_undone_changes == 0);
   int i;
 
   /* Back out all the changes.  Do this in the opposite order in which
@@ -577,6 +581,50 @@ cancel_changes (int num)
   num_changes = num;
 }
 
+/* Swap the status of change NUM from being applied to not being applied,
+   or vice versa.  */
+
+static void
+swap_change (int num)
+{
+  if (changes[num].old_len >= 0)
+    std::swap (XVECLEN (*changes[num].loc, 0), changes[num].old_len);
+  else
+    std::swap (*changes[num].loc, changes[num].old);
+  if (changes[num].object && !MEM_P (changes[num].object))
+    std::swap (INSN_CODE (changes[num].object), changes[num].old_code);
+}
+
+/* Temporarily undo all the changes numbered NUM and up, with a view
+   to reapplying them later.  The next call to the changes machinery
+   must be:
+
+      redo_changes (NUM)
+
+   otherwise things will end up in an invalid state.  */
+
+void
+temporarily_undo_changes (int num)
+{
+  gcc_assert (temporarily_undone_changes == 0 && num <= num_changes);
+  for (int i = num_changes - 1; i >= num; i--)
+    swap_change (i);
+  temporarily_undone_changes = num_changes - num;
+}
+
+/* Redo the changes that were temporarily undone by:
+
+      temporarily_undo_changes (NUM).  */
+
+void
+redo_changes (int num)
+{
+  gcc_assert (temporarily_undone_changes == num_changes - num);
+  for (int i = num; i < num_changes; ++i)
+    swap_change (i);
+  temporarily_undone_changes = 0;
+}
+
 /* Reduce conditional compilation elsewhere.  */
 /* A subroutine of validate_replace_rtx_1 that tries to simplify the resulting
    rtx.  */
diff --git a/gcc/recog.h b/gcc/recog.h
index e152e2bb591..facf36e7c08 100644
--- a/gcc/recog.h
+++ b/gcc/recog.h
@@ -96,6 +96,8 @@ extern void confirm_change_group (void);
 extern int apply_change_group (void);
 extern int num_validated_changes (void);
 extern void cancel_changes (int);
+extern void temporarily_undo_changes (int);
+extern void redo_changes (int);
 extern int constrain_operands (int, alternative_mask);
 extern int constrain_operands_cached (rtx_insn *, int);
 extern int memory_address_addr_space_p (machine_mode, rtx, addr_space_t);
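
[ The intended usage pattern, pieced together from the comments above
  (illustrative):

      int num = num_validated_changes ();
      validate_change (insn, &SET_SRC (PATTERN (insn)), new_src, true);
      /* ...possibly more validate_change calls...  */

      temporarily_undo_changes (num);
      /* Here INSN reads as it did before the changes, e.g. for
         dumping or for costing the original form.  */
      redo_changes (num);

      /* The changes are in effect again and can be confirmed or
         cancelled as normal.  */
  ]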

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [08/23] Add an alternative splay tree implementation
  2020-12-02 20:36   ` Jeff Law
@ 2020-12-17  0:29     ` Richard Sandiford
  2021-01-04 15:27       ` Jeff Law
  0 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-12-17  0:29 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

Jeff Law <law@redhat.com> writes:
> On 11/13/20 1:15 AM, Richard Sandiford via Gcc-patches wrote:
>> We already have two splay tree implementations: the old C one in
>> libiberty and a templated reimplementation of it in typed-splay-tree.h.
>> However, they have some drawbacks:
>>
>> - They hard-code the assumption that nodes should have both a key and
>>   a value, which isn't always true.
>>
>> - They use the two-phase method of lookup, and so nodes need to store
>>   a temporary back pointer.  We can avoid that overhead by using the
>>   top-down method (as e.g. the bitmap tree code already does).
>>
>> - The tree node has to own the key and the value.  For some use cases
>>   it's more convenient to embed the tree links in the value instead.
>>
>> Also, a later patch wants to use splay trees to represent an
>> adaptive total order: the splay tree itself records whether node N1
>> is less than node N2, and (in the worst case) comparing nodes is
>> a splay operation.
>>
>> This patch therefore adds an alternative implementation.  The main
>> features are:
>>
>> - Nodes can optionally point back to their parents.
>>
>> - An Accessors class abstracts accessing child nodes and (where
>>   applicable) parent nodes, so that the information can be embedded
>>   in larger data structures.
>>
>> - There is no fixed comparison function at the class level.  Instead,
>>   individual functions that do comparisons take a comparison function
>>   argument.
>>
>> - There are two styles of comparison function, optimised for different
>>   use cases.  (See the comments in the patch for details.)
>>
>> - It's possible to do some operations directly on a given node,
>>   without knowing whether it's the root.  This includes the comparison
>>   use case described above.
>>
>> This of course has its own set of drawbacks.  It's really providing
>> splay utility functions rather than a true ADT, and so is more low-level
>> than the existing routines.  It's mostly geared for cases in which the
>> client code wants to participate in the splay operations to some extent.
>>
>> gcc/
>> 	* Makefile.in (OBJS): Add splay-tree-utils.o.
>> 	* system.h: Include <array> when INCLUDE_ARRAY is defined.
>> 	* selftest.h (splay_tree_cc_tests): Declare.
>> 	* selftest-run-tests.c (selftest::run_tests): Run splay_tree_cc_tests.
>> 	* splay-tree-utils.h: New file.
>> 	* splay-tree-utils.tcc: Likewise.
>> 	* splay-tree-utils.cc: Likewise.
> I must admit, I'm not a fan of adding another splay tree.  Though I
> suspect the one in libiberty will be there forever since there could
> well be clients outside our source base.
>
> The typed_splay_tree implementation however is internal to GCC and only
> has a couple users (the JIT and fixit hints).  Is there any chance we
> could convert those two users to the new one?  Your cover hints that's
> not the case, but I'm going to explicitly ask :-)

Yeah, I agree it's not great to have three versions.  I had a look at
converting the uses of typed_splay_tree, and all of them seem to be a
natural fit for the new scheme.  In particular, although typed_splay_tree
maps keys to values, in practice the keys are already part of the values.

However, I think a natural conversion would need a couple of new helpers
for “get or insert” type operations.  Would it be OK to wait until GCC 12
stage 1 for that?
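To give a flavour of the Accessors idea, a client with embedded links
might look something like this (all names here are hypothetical,
purely for illustration; the real interface is the one defined in
splay-tree-utils.h):

  // Hypothetical client node with the tree links embedded in it.
  struct order_node
  {
    order_node *m_children[2];
    int m_data;
  };

  // Hypothetical accessors telling the splay code how to reach the
  // links inside order_node.
  struct order_node_accessors
  {
    using node_type = order_node *;
    static node_type child (node_type n, unsigned int i)
    {
      return n->m_children[i];
    }
    static void set_child (node_type n, unsigned int i, node_type c)
    {
      n->m_children[i] = c;
    }
  };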

Thanks,
Richard

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH 22/23] Add rtl-ssa
  2020-12-16  3:31   ` Jeff Law
@ 2020-12-17  0:33     ` Richard Sandiford
  2020-12-19 20:01       ` Jeff Law
  0 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-12-17  0:33 UTC (permalink / raw)
  To: Jeff Law via Gcc-patches

Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On 11/13/20 1:23 AM, Richard Sandiford via Gcc-patches wrote:
>> This patch adds the RTL SSA infrastructure itself.  The following
>> fwprop.c patch will make use of it.
>>
>> gcc/
>> 	* configure.ac: Add rtl-ssa to the list of dependence directories.
>> 	* configure: Regenerate.
>> 	* Makefile.in (rtl-ssa-warn): New variable.
>> 	(OBJS): Add the rtl-ssa object files.
>> 	* emit-rtl.h (rtl_data::ssa): New field.
>> 	* rtl-ssa.h: New file.
>> 	* system.h: Include <functional> when INCLUDE_FUNCTIONAL is defined.
>> 	* rtl-ssa: New directory.
> So I know it's late in stage3, but I think the risks here are relatively
> minor as this would only be used in fwprop right now and it was posted
> in advance of the deadline.
>
> My high level questions/concerns have been addressed and while I
> strongly suspect we'll have to iterate on infrastructure, I think we can
> do that on demand as passes start to use the infrastructure and we have
> a clearer picture of what tweaks we need to make for the clients.

Agreed.  And I agree we'll need to iterate.  What's there now certainly
isn't complete.

> Obviously we'll want to keep an eye on testresults, but again, the
> potential for fallout here is relatively minor IMHO.  OK for the trunk.

Thanks a lot for the reviews.  Now pushed after retesting on
aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
(as far as I could given the current state), x86_64-linux-gnu
and powerpc64le-linux-gnu.

Richard

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH 23/23] fwprop: Rewrite to use RTL SSA
  2020-12-16  3:52   ` Jeff Law
@ 2020-12-17  0:34     ` Richard Sandiford
  0 siblings, 0 replies; 88+ messages in thread
From: Richard Sandiford @ 2020-12-17  0:34 UTC (permalink / raw)
  To: Jeff Law via Gcc-patches

Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On 11/13/20 1:24 AM, Richard Sandiford via Gcc-patches wrote:
>> This patch rewrites fwprop.c to use the RTL SSA framework.  It tries
>> as far as possible to mimic the old behaviour, even in caes where
>> that doesn't fit naturally with the new framework.  I've added ???
>> comments to mark those places, but I think “fixing” them should
>> be done separately to make bisection easier.
>>
>> In particular:
>>
>> * The old implementation iterated over uses, and after a successful
>>   substitution, the new insn's uses were added to the end of the list.
>>   The pass still processed those uses, but because it processed them at
>>   the end, it didn't fully optimise one instruction before propagating
>>   it into the next.
>>
>>   The new version follows the same approach for comparison purposes,
>>   but I'd like to drop that as a follow-on patch.
>>
>> * The old implementation operated on single use sites (DF_REF_LOCs).
>>   This doesn't work well for instructions with match_dups, where it's
>>   necessary to update both an operand and its dups at the same time.
>>   For example, attempting to substitute into a divmod instruction would
>>   fail because only the div or the mod side would be updated.
>>
>>   The new version again follows this to some extent for comparison
>>   purposes (although not exactly).  Again I'd like to drop it as a
>>   follow-on patch.
>>
>>   One difference is that if a register occurs in multiple MEM addresses
>>   in a set, the new version will try to update them all at once.  This is
>>   what causes the SVE ACLE st4* output to improve.
>>
>> Also, the old version didn't naturally guarantee termination (PR79405),
>> whereas the new one does.
>>
>> gcc/
>> 	* fwprop.c: Rewrite to use the RTL SSA framework.
>>
>> gcc/testsuite/
>> 	* gcc.dg/rtl/x86_64/test-return-const.c.before-fwprop.c: Don't
>> 	expect insn updates to be deferred.
>> 	* gcc.target/aarch64/sve/acle/asm/st4_s8.c: Expect the addition
>> 	to be folded into the address.
>> 	* gcc.target/aarch64/sve/acle/asm/st4_s8.c: Likewise.
> Consider killing the ADD_NOTES bits.

Done (and glad to see it go).

> s/eqaul/equal/ to fix a typo.

Oops, fixed.

> Naturally I'm happy at how much by-hand RTL analysis code just
> disappears with this change :-)
>
> Ideally you'll drop this in tomorrow and we can get a fresh run of all
> the targets in my tester before the weekend.  I won't be stressed if we
> see some fallout, but I don't expect much.  I'll help track them down if
> they occur.

Thanks, now pushed with the above changes.

Richard

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [20/23] rtlanal: Add simple_regno_set
  2020-11-25 20:31   ` Jeff Law
@ 2020-12-17  0:47     ` Richard Sandiford
  2021-01-04 15:28       ` Jeff Law
  0 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-12-17  0:47 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

Jeff Law <law@redhat.com> writes:
> On 11/13/20 1:21 AM, Richard Sandiford via Gcc-patches wrote:
>> This patch adds a routine for finding a “simple” SET for a register
>> definition.  See the comment in the patch for details.
>>
>> gcc/
>> 	* rtl.h (simple_regno_set): Declare.
>> 	* rtlanal.c (simple_regno_set): New function.
> So I was a bit confused that this is supposed to reject read-write, but
> what it's really rejecting is a narrow subset of read-write.  In
> particular it rejects things that are potentially RMW via subregs. It
> doesn't prevent the destination from appearing as a source operand.  You
> might consider clarifying the comment.

Yeah, in hindsight it was a mistake to spell out the RMW point
separately when it was really just an extra condition on the subreg.

I'd tweaked this comment and the mux-utils.h one (in response
to Martin's feedback) while doing the cross-target testing,
but forgot to include the changes to the committed version.
(The tested versions were otherwise identical, honest.)

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
	* mux-utils.h (pointer_mux::m_ptr): Tweak description of contents.
	* rtlanal.c (simple_regno_set): Tweak description to clarify the
	RMW condition.
---
 gcc/mux-utils.h | 8 ++++----
 gcc/rtlanal.c   | 8 +++++---
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/gcc/mux-utils.h b/gcc/mux-utils.h
index b026a9fa4c1..6ec0669256d 100644
--- a/gcc/mux-utils.h
+++ b/gcc/mux-utils.h
@@ -139,10 +139,10 @@ public:
 private:
   pointer_mux (char *ptr) : m_ptr (ptr) {}
 
-  // The pointer value for A pointers, or the pointer value + 1 for B pointers.
-  // Using a pointer rather than a uintptr_t tells the compiler that second ()
-  // can never return null, and that second_or_null () is only null if
-  // is_first ().
+  // Points to the first byte of an object for A pointers or the second
+  // byte of an object for B pointers.  Using a pointer rather than a
+  // uintptr_t tells the compiler that second () can never return null,
+  // and that second_or_null () is only null if is_first ().
   char *m_ptr;
 };
 
diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 80e72d6049d..f0e66a7b26b 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -1458,11 +1458,13 @@ set_of (const_rtx pat, const_rtx insn)
 /* Check whether instruction pattern PAT contains a SET with the following
    properties:
 
-   - the SET is executed unconditionally;
-   - the destination of the SET is write-only rather than read-write; and
+   - the SET is executed unconditionally; and
    - either:
      - the destination of the SET is a REG that contains REGNO; or
-     - the destination of the SET is a SUBREG of such a REG.
+     - both:
+       - the destination of the SET is a SUBREG of such a REG; and
+       - writing to the subreg clobbers all of the SUBREG_REG
+	 (in other words, read_modify_subreg_p is false).
 
    If PAT does have a SET like that, return the set, otherwise return null.
 

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [07/23] Add a class that multiplexes two pointer types
  2020-12-17  0:17         ` Richard Sandiford
@ 2020-12-17 14:21           ` Tom Tromey
  2020-12-17 15:38             ` Richard Sandiford
  0 siblings, 1 reply; 88+ messages in thread
From: Tom Tromey @ 2020-12-17 14:21 UTC (permalink / raw)
  To: Martin Sebor; +Cc: gcc-patches, richard.sandiford

>>>>> "Richard" == Richard Sandiford via Gcc-patches <gcc-patches@gcc.gnu.org> writes:

Richard> +// A class that stores a choice "A or B", where A has type T1 * and B has
Richard> +// type T2 *.  Both T1 and T2 must have an alignment greater than 1, since
Richard> +// the low bit is used to identify B over A.  T1 and T2 can be the same.

It seems like the alignment requirement could be static_assert'd, which
would make using this class a bit safer.

Tom

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [07/23] Add a class that multiplexes two pointer types
  2020-12-17 14:21           ` Tom Tromey
@ 2020-12-17 15:38             ` Richard Sandiford
  2020-12-17 15:44               ` Nathan Sidwell
  0 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2020-12-17 15:38 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Martin Sebor, gcc-patches

Tom Tromey <tom@tromey.com> writes:
>>>>>> "Richard" == Richard Sandiford via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>
> Richard> +// A class that stores a choice "A or B", where A has type T1 * and B has
> Richard> +// type T2 *.  Both T1 and T2 must have an alignment greater than 1, since
> Richard> +// the low bit is used to identify B over A.  T1 and T2 can be the same.
>
> It seems like the alignment requirement could be static_assert'd, which
> would make using this class a bit safer.

Yeah, I wondered about doing that, but in principle there's nothing
to stop people using the class for something like a char*, provided that
the start is suitably aligned.  So having gcc_checking_assert is a
compromise: it only provides run-time checking rather than compile-time
checking, but it doesn't place any artificial limits on how the class
can be used.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [07/23] Add a class that multiplexes two pointer types
  2020-12-17 15:38             ` Richard Sandiford
@ 2020-12-17 15:44               ` Nathan Sidwell
  2021-01-04 15:32                 ` Jeff Law
  0 siblings, 1 reply; 88+ messages in thread
From: Nathan Sidwell @ 2020-12-17 15:44 UTC (permalink / raw)
  To: gcc-patches

On 12/17/20 10:38 AM, Richard Sandiford via Gcc-patches wrote:
> Tom Tromey <tom@tromey.com> writes:
>>>>>>> "Richard" == Richard Sandiford via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>>
>> Richard> +// A class that stores a choice "A or B", where A has type T1 * and B has
>> Richard> +// type T2 *.  Both T1 and T2 must have an alignment greater than 1, since
>> Richard> +// the low bit is used to identify B over A.  T1 and T2 can be the same.
>>
>> It seems like the alignment requirement could be static_assert'd, which
>> would make using this class a bit safer.
> 
> Yeah, I wondered about doing that, but in principle there's nothing
> to stop people using the class for something like a char*, provided that
> the start is suitably aligned.  So having gcc_checking_assert is a
> compromise: it only provides run-time checking rather than compile-time
> checking, but it doesn't place any artificial limits on how the class
> can be used.

I don't know whether our optimizers are smart enough (credulous enough?) 
to know that the low bits of any pointer to an aligned type must be 
zero, and therefore optimize the assert away in those cases?

If they're not, something like

  assert (align (T) > 1 || !(reinterpret_cast<intptr_t> (ptr) & 1))

?

-- 
Nathan Sidwell

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH 22/23] Add rtl-ssa
  2020-12-17  0:33     ` Richard Sandiford
@ 2020-12-19 20:01       ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2020-12-19 20:01 UTC (permalink / raw)
  To: Jeff Law via Gcc-patches, richard.sandiford



On 12/16/20 5:33 PM, Richard Sandiford wrote:
> Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> On 11/13/20 1:23 AM, Richard Sandiford via Gcc-patches wrote:
>>> This patch adds the RTL SSA infrastructure itself.  The following
>>> fwprop.c patch will make use of it.
>>>
>>> gcc/
>>> 	* configure.ac: Add rtl-ssa to the list of dependence directories.
>>> 	* configure: Regenerate.
>>> 	* Makefile.in (rtl-ssa-warn): New variable.
>>> 	(OBJS): Add the rtl-ssa object files.
>>> 	* emit-rtl.h (rtl_data::ssa): New field.
>>> 	* rtl-ssa.h: New file.
>>> 	* system.h: Include <functional> when INCLUDE_FUNCTIONAL is defined.
>>> 	* rtl-ssa: New directory.
>> So I know it's late in stage3, but I think the risks here are relatively
>> minor as this would only be used in fwprop right now and it was posted
>> in advance of the deadline.
>>
>> My high level questions/concerns have been addressed and while I
>> strongly suspect we'll have to iterate on infrastructure, I think we can
>> do that on demand as passes start to use the infrastructure and we have
>> a clearer picture of what tweaks we need to make for the clients.
> Agreed.  And I agree we'll need to iterate.  What's there now certainly
> isn't complete.
>
>> Obviously we'll want to keep an eye on testresults, but again, the
>> potential for fallout here is relatively minor IMHO.  OK for the trunk.
> Thanks a lot for the reviews.  Now pushed after retesting on
> aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
> (as far as I could given the current state), x86_64-linux-gnu
> and powerpc64le-linux-gnu.
And most of the older embedded targets look OK as well.  There are one or
two regressions that I haven't bisected yet, but nothing at this point
that screams out that things are broken badly.

jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [08/23] Add an alternative splay tree implementation
  2020-11-13  8:15 ` [08/23] Add an alternative splay tree implementation Richard Sandiford
  2020-12-02 20:36   ` Jeff Law
@ 2021-01-01  8:25   ` Andreas Schwab
  2021-01-04 14:53     ` Richard Sandiford
  1 sibling, 1 reply; 88+ messages in thread
From: Andreas Schwab @ 2021-01-01  8:25 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.sandiford

That doesn't build with gcc 4.8:

In file included from ../../gcc/splay-tree-utils.h:491:0,
                 from ../../gcc/rtl-ssa.h:45,
                 from ../../gcc/fwprop.c:29:
../../gcc/splay-tree-utils.tcc:24:1: error: prototype for 'typename base_splay_tree<Accessors>::node_type base_splay_tree<Accessors>::get_child(typename Accessors::node_type, unsigned int)' does not match any in class 'base_splay_tree<Accessors>'
 base_splay_tree<Accessors>::get_child (node_type node, unsigned int index)
 ^
In file included from ../../gcc/rtl-ssa.h:45:0,
                 from ../../gcc/fwprop.c:29:
../../gcc/splay-tree-utils.h:125:20: error: candidate is: static typename Accessors::node_type base_splay_tree<Accessors>::get_child(typename Accessors::node_type, unsigned int)
   static node_type get_child (node_type, unsigned int);
                    ^

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [08/23] Add an alternative splay tree implementation
  2021-01-01  8:25   ` Andreas Schwab
@ 2021-01-04 14:53     ` Richard Sandiford
  2021-01-04 15:02       ` Andreas Schwab
  0 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2021-01-04 14:53 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: gcc-patches

Andreas Schwab <schwab@linux-m68k.org> writes:
> That doesn't build with gcc 4.8:

Which subversion are you using?  It works for me with stock gcc 4.8.5,
which is what I'd used to test the series for C++ compatibility.

Richard

>
> In file included from ../../gcc/splay-tree-utils.h:491:0,
>                  from ../../gcc/rtl-ssa.h:45,
>                  from ../../gcc/fwprop.c:29:
> ../../gcc/splay-tree-utils.tcc:24:1: error: prototype for 'typename base_splay_tree<Accessors>::node_type base_splay_tree<Accessors>::get_child(typename Accessors::node_type, unsigned int)' does not match any in class 'base_splay_tree<Accessors>'
>  base_splay_tree<Accessors>::get_child (node_type node, unsigned int index)
>  ^
> In file included from ../../gcc/rtl-ssa.h:45:0,
>                  from ../../gcc/fwprop.c:29:
> ../../gcc/splay-tree-utils.h:125:20: error: candidate is: static typename Accessors::node_type base_splay_tree<Accessors>::get_child(typename Accessors::node_type, unsigned int)
>    static node_type get_child (node_type, unsigned int);
>                     ^
>
> Andreas.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [08/23] Add an alternative splay tree implementation
  2021-01-04 14:53     ` Richard Sandiford
@ 2021-01-04 15:02       ` Andreas Schwab
  2021-01-04 15:42         ` Richard Sandiford
  0 siblings, 1 reply; 88+ messages in thread
From: Andreas Schwab @ 2021-01-04 15:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.sandiford

On Jan 04 2021, Richard Sandiford wrote:

> Andreas Schwab <schwab@linux-m68k.org> writes:
>> That doesn't build with gcc 4.8:
>
> Which subversion are you using?

This is 4.8.1.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [08/23] Add an alternative splay tree implementation
  2020-12-17  0:29     ` Richard Sandiford
@ 2021-01-04 15:27       ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2021-01-04 15:27 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 12/16/20 5:29 PM, Richard Sandiford wrote:
> Jeff Law <law@redhat.com> writes:
>> On 11/13/20 1:15 AM, Richard Sandiford via Gcc-patches wrote:
>>> We already have two splay tree implementations: the old C one in
>>> libiberty and a templated reimplementation of it in typed-splay-tree.h.
>>> However, they have some drawbacks:
>>>
>>> - They hard-code the assumption that nodes should have both a key and
>>>   a value, which isn't always true.
>>>
>>> - They use the two-phase method of lookup, and so nodes need to store
>>>   a temporary back pointer.  We can avoid that overhead by using the
>>>   top-down method (as e.g. the bitmap tree code already does).
>>>
>>> - The tree node has to own the key and the value.  For some use cases
>>>   it's more convenient to embed the tree links in the value instead.
>>>
>>> Also, a later patch wants to use splay trees to represent an
>>> adaptive total order: the splay tree itself records whether node N1
>>> is less than node N2, and (in the worst case) comparing nodes is
>>> a splay operation.
>>>
>>> This patch therefore adds an alternative implementation.  The main
>>> features are:
>>>
>>> - Nodes can optionally point back to their parents.
>>>
>>> - An Accessors class abstracts accessing child nodes and (where
>>>   applicable) parent nodes, so that the information can be embedded
>>>   in larger data structures.
>>>
>>> - There is no fixed comparison function at the class level.  Instead,
>>>   individual functions that do comparisons take a comparison function
>>>   argument.
>>>
>>> - There are two styles of comparison function, optimised for different
>>>   use cases.  (See the comments in the patch for details.)
>>>
>>> - It's possible to do some operations directly on a given node,
>>>   without knowing whether it's the root.  This includes the comparison
>>>   use case described above.
>>>
>>> This of course has its own set of drawbacks.  It's really providing
>>> splay utility functions rather than a true ADT, and so is more low-level
>>> than the existing routines.  It's mostly geared for cases in which the
>>> client code wants to participate in the splay operations to some extent.
>>>
>>> gcc/
>>> 	* Makefile.in (OBJS): Add splay-tree-utils.o.
>>> 	* system.h: Include <array> when INCLUDE_ARRAY is defined.
>>> 	* selftest.h (splay_tree_cc_tests): Declare.
>>> 	* selftest-run-tests.c (selftest::run_tests): Run splay_tree_cc_tests.
>>> 	* splay-tree-utils.h: New file.
>>> 	* splay-tree-utils.tcc: Likewise.
>>> 	* splay-tree-utils.cc: Likewise.
>> I must admit, I'm not a fan of adding another splay tree.  Though I
>> suspect the one in libiberty will be there forever since there could
>> well be clients outside our source base.
>>
>> The typed_splay_tree implementation however is internal to GCC and only
>> has a couple users (the JIT and fixit hints).  Is there any chance we
>> could convert those two users to the new one?  Your cover hints that's
>> not the case, but I'm going to explicitly ask :-)
> Yeah, I agree it's not great to have three versions.  I had a look at
> converting the uses of typed_splay_tree, and all of them seem to be a
> natural fit for the new scheme.  In particular, although typed_splay_tree
> maps keys to values, in practice the keys are already part of the values.
>
> However, I think a natural conversion would need a couple of new helpers
> for “get or insert” type operations.  Would it be OK to wait until GCC 12
> stage 1 for that?
Yea, at this point deferring the conversion to gcc-12 seems to make the
most sense.
jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [20/23] rtlanal: Add simple_regno_set
  2020-12-17  0:47     ` Richard Sandiford
@ 2021-01-04 15:28       ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2021-01-04 15:28 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 12/16/20 5:47 PM, Richard Sandiford wrote:
> Jeff Law <law@redhat.com> writes:
>> On 11/13/20 1:21 AM, Richard Sandiford via Gcc-patches wrote:
>>> This patch adds a routine for finding a “simple” SET for a register
>>> definition.  See the comment in the patch for details.
>>>
>>> gcc/
>>> 	* rtl.h (simple_regno_set): Declare.
>>> 	* rtlanal.c (simple_regno_set): New function.
>> So I was a bit confused that this is supposed to reject read-write, but
>> what it's really rejecting is a narrow subset of read-write.  In
>> particular it rejects things that are potentially RMW via subregs. It
>> doesn't prevent the destination from appearing as a source operand.  You
>> might consider clarifying the comment.
> Yeah, in hindsight it was a mistake to spell out the RMW point
> separately when it was really just an extra condition on the subreg.
>
> I'd tweaked this comment and the mux-utils.h one (in response
> to Martin's feedback) while doing the cross-target testing,
> but forgot to include the changes to the committed version.
> (The tested versions were otherwise identical, honest.)
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>
> Thanks,
> Richard
>
>
> gcc/
> 	* mux-utils.h (pointer_mux::m_ptr): Tweak description of contents.
> 	* rtlanal.c (simple_regno_set): Tweak description to clarify the
> 	RMW condition.
OK
jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [07/23] Add a class that multiplexes two pointer types
  2020-12-17 15:44               ` Nathan Sidwell
@ 2021-01-04 15:32                 ` Jeff Law
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff Law @ 2021-01-04 15:32 UTC (permalink / raw)
  To: Nathan Sidwell, gcc-patches



On 12/17/20 8:44 AM, Nathan Sidwell wrote:
> On 12/17/20 10:38 AM, Richard Sandiford via Gcc-patches wrote:
>> Tom Tromey <tom@tromey.com> writes:
>>>>>>>> "Richard" == Richard Sandiford via Gcc-patches
>>>>>>>> <gcc-patches@gcc.gnu.org> writes:
>>>
>>> Richard> +// A class that stores a choice "A or B", where A has type
>>> T1 * and B has
>>> Richard> +// type T2 *.  Both T1 and T2 must have an alignment
>>> greater than 1, since
>>> Richard> +// the low bit is used to identify B over A.  T1 and T2
>>> can be the same.
>>>
>>> It seems like the alignment requirement could be static_assert'd, which
>>> would make using this class a bit safer.
>>
>> Yeah, I wondered about doing that, but in principle there's nothing
>> to stop people using the class for something like a char*, provided that
>> the start is suitably aligned.  So having gcc_checking_assert is a
>> compromise: it only provides run-time checking rather than compile-time
>> checking, but it doesn't place any artificial limits on how the class
>> can be used.
>
> I don't know whether our optimizers are smart enough (credulous
> enough?) to know that the low bits of any pointer to an aligned type
> must be zero, and therefore optimize the assert away in those cases?
>
> If they're not, something like
>
>  assert (align (T) > 1 || !(reinterpret_cast<intptr_t> (ptr) & 1))
I believe both gimple optimizers and the RTL optimizers are aware of the
restrictions on the low bits for STRICT_ALIGNMENT targets and should be
able to optimize the assert away.
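
For concreteness, the two variants under discussion look roughly like
this (a sketch only; pointer_mux's actual member names may differ):

  // Compile-time check: would reject char * and other types with
  // alignment 1, even when the actual pointer is suitably aligned.
  static_assert (alignof (T1) > 1 && alignof (T2) > 1,
		 "pointer_mux needs the low bit free for tagging");

  // Run-time compromise: only the dynamic value matters, so a
  // suitably aligned char * is still allowed.
  gcc_checking_assert ((reinterpret_cast<uintptr_t> (ptr) & 1) == 0);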

jeff


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [08/23] Add an alternative splay tree implementation
  2021-01-04 15:02       ` Andreas Schwab
@ 2021-01-04 15:42         ` Richard Sandiford
  2021-01-05 12:13           ` Richard Biener
  0 siblings, 1 reply; 88+ messages in thread
From: Richard Sandiford @ 2021-01-04 15:42 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: gcc-patches

Andreas Schwab <schwab@linux-m68k.org> writes:
> On Jan 04 2021, Richard Sandiford wrote:
>
>> Andreas Schwab <schwab@linux-m68k.org> writes:
>>> That doesn't build with gcc 4.8:
>>
>> Which subversion are you using?
>
> This is 4.8.1.

Hmm, OK.  I guess that raises the question whether “supporting GCC 4.8”
means supporting every patchlevel, or just the latest.

Richard

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [08/23] Add an alternative splay tree implementation
  2021-01-04 15:42         ` Richard Sandiford
@ 2021-01-05 12:13           ` Richard Biener
  0 siblings, 0 replies; 88+ messages in thread
From: Richard Biener @ 2021-01-05 12:13 UTC (permalink / raw)
  To: Richard Sandiford, Andreas Schwab, GCC Patches

On Mon, Jan 4, 2021 at 4:43 PM Richard Sandiford via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Andreas Schwab <schwab@linux-m68k.org> writes:
> > On Jan 04 2021, Richard Sandiford wrote:
> >
> >> Andreas Schwab <schwab@linux-m68k.org> writes:
> >>> That doesn't build with gcc 4.8:
> >>
> >> Which subversion are you using?
> >
> > This is 4.8.1.
>
> Hmm, OK.  I guess that raises the question whether “supporting GCC 4.8”
> means supporting every patchlevel, or just the latest.

We document

@item ISO C++11 compiler
Necessary to bootstrap GCC.
...

To build all languages in a cross-compiler or other configuration where
3-stage bootstrap is not performed, you need to start with an existing
GCC binary (version 4.8 or later) because source code for language
frontends other than C might use GCC extensions.

Note that to bootstrap GCC with versions of GCC earlier than 4.8, you
may need to use @option{--disable-stage1-checking}, though
bootstrapping the compiler with such earlier compilers is strongly
discouraged.

while the second paragraph suggests GCC 4.8 or later works
(which IMHO includes GCC 4.8.1), the general requirement
lists a C++11 compiler which apparently GCC 4.8.1 isn't ;)

So for simplicity I'd suggest being more precise and saying
4.8.2 or later (if 4.8.2 works).

Richard.

>
> Richard

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [09/23] Add a cut-down version of std::span (array_slice)
  2020-11-13  8:15 ` [09/23] Add a cut-down version of std::span (array_slice) Richard Sandiford
  2020-11-30 19:56   ` Jeff Law
@ 2022-08-03 15:13   ` Martin Jambor
  2022-08-03 15:31     ` Richard Sandiford
  2022-08-10 16:03   ` Martin Jambor
  2 siblings, 1 reply; 88+ messages in thread
From: Martin Jambor @ 2022-08-03 15:13 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches

Hi Richard,

On Fri, Nov 13 2020, Richard Sandiford via Gcc-patches wrote:
> A later patch wants to be able to pass around subarray views of an
> existing array.  The standard class to do that is std::span, but it's
> a C++20 thing.  This patch just adds a cut-down version of it.

thanks a lot for introducing it.  I hope to use it as a unified view
into something which might be a GC vec, a heap vec, or an auto_vec.

But I have one question:

>
> The intention is just to provide what's currently needed.
>
> gcc/
> 	* vec.h (array_slice): New class.
> ---
>  gcc/vec.h | 120 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 120 insertions(+)
>
> diff --git a/gcc/vec.h b/gcc/vec.h
> index f02beddc975..7768de9f518 100644
> --- a/gcc/vec.h
> +++ b/gcc/vec.h
> @@ -2128,6 +2128,126 @@ release_vec_vec (vec<vec<T> > &vec)
>    vec.release ();
>  }
>  
> +// Provide a subset of the std::span functionality.  (We can't use std::span
> +// itself because it's a C++20 feature.)
> +//
> +// In addition, provide an invalid value that is distinct from all valid
> +// sequences (including the empty sequence).  This can be used to return
> +// failure without having to use std::optional.
> +//
> +// There is no operator bool because it would be ambiguous whether it is
> +// testing for a valid value or an empty sequence.
> +template<typename T>
> +class array_slice
> +{
> +  template<typename OtherT> friend class array_slice;
> +
> +public:
> +  using value_type = T;
> +  using iterator = T *;
> +  using const_iterator = const T *;
> +
> +  array_slice () : m_base (nullptr), m_size (0) {}
> +
> +  template<typename OtherT>
> +  array_slice (array_slice<OtherT> other)
> +    : m_base (other.m_base), m_size (other.m_size) {}
> +
> +  array_slice (iterator base, unsigned int size)
> +    : m_base (base), m_size (size) {}
> +
> +  template<size_t N>
> +  array_slice (T (&array)[N]) : m_base (array), m_size (N) {}
> +
> +  template<typename OtherT>
> +  array_slice (const vec<OtherT> &v)
> +    : m_base (v.address ()), m_size (v.length ()) {}
> +

What is the reason for making the parameter const here?

The problem is that if you do for example:

  auto_vec<bool, 16> test_base;
  test_base.quick_grow_cleared (10);
  array_slice<bool> test(test_base);

the constructor will get a const reference to test_base and so will
invoke the const variant of v.address() which returns a const bool *
which cannot be assigned into non-const qualified base.  AFAICS, the
constructor only works if the array_slice is array_slice<const bool>.

Is that intentional?  I am not a C++ expert and may easily be
overlooking something.  I understand that users need to be careful not
to cause reallocation of the underlying vector while the array_slice
exists, but the const qualifier does not achieve that.  (A wild idea
would be to add an array_slice ref-counter to auto_vec, which seems
less space-efficiency-critical than other vecs, and assert on
reallocation when it is not zero, hehe.)

Removing the const qualifier in the constructor parameter makes the
error go away - as does adding another constructor without it, which
might be the correct thing to do.

On a related note, would the following constructor be a good addition to
the class (I can make it const too)?

  template<typename OtherT>
  array_slice (vec<OtherT, va_gc> *v)
    : m_base (v ? v->address () : nullptr), m_size (v ? v->length (): 0) {}


Thanks,

Martin



> +  iterator begin () { return m_base; }
> +  iterator end () { return m_base + m_size; }
> +
> +  const_iterator begin () const { return m_base; }
> +  const_iterator end () const { return m_base + m_size; }
> +
> +  value_type &front ();
> +  value_type &back ();
> +  value_type &operator[] (unsigned int i);
> +
> +  const value_type &front () const;
> +  const value_type &back () const;
> +  const value_type &operator[] (unsigned int i) const;
> +
> +  size_t size () const { return m_size; }
> +  size_t size_bytes () const { return m_size * sizeof (T); }
> +  bool empty () const { return m_size == 0; }
> +
> +  // An invalid array_slice that represents a failed operation.  This is
> +  // distinct from an empty slice, which is a valid result in some contexts.
> +  static array_slice invalid () { return { nullptr, ~0U }; }
> +
> +  // True if the array is valid, false if it is an array like INVALID.
> +  bool is_valid () const { return m_base || m_size == 0; }
> +
> +private:
> +  iterator m_base;
> +  unsigned int m_size;
> +};
> +
> +template<typename T>
> +inline typename array_slice<T>::value_type &
> +array_slice<T>::front ()
> +{
> +  gcc_checking_assert (m_size);
> +  return m_base[0];
> +}
> +
> +template<typename T>
> +inline const typename array_slice<T>::value_type &
> +array_slice<T>::front () const
> +{
> +  gcc_checking_assert (m_size);
> +  return m_base[0];
> +}
> +
> +template<typename T>
> +inline typename array_slice<T>::value_type &
> +array_slice<T>::back ()
> +{
> +  gcc_checking_assert (m_size);
> +  return m_base[m_size - 1];
> +}
> +
> +template<typename T>
> +inline const typename array_slice<T>::value_type &
> +array_slice<T>::back () const
> +{
> +  gcc_checking_assert (m_size);
> +  return m_base[m_size - 1];
> +}
> +
> +template<typename T>
> +inline typename array_slice<T>::value_type &
> +array_slice<T>::operator[] (unsigned int i)
> +{
> +  gcc_checking_assert (i < m_size);
> +  return m_base[i];
> +}
> +
> +template<typename T>
> +inline const typename array_slice<T>::value_type &
> +array_slice<T>::operator[] (unsigned int i) const
> +{
> +  gcc_checking_assert (i < m_size);
> +  return m_base[i];
> +}
> +
> +template<typename T>
> +array_slice<T>
> +make_array_slice (T *base, unsigned int size)
> +{
> +  return array_slice<T> (base, size);
> +}
> +
>  #if (GCC_VERSION >= 3000)
>  # pragma GCC poison m_vec m_vecpfx m_vecdata
>  #endif
> -- 
> 2.17.1

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [09/23] Add a cut-down version of std::span (array_slice)
  2022-08-03 15:13   ` Martin Jambor
@ 2022-08-03 15:31     ` Richard Sandiford
  0 siblings, 0 replies; 88+ messages in thread
From: Richard Sandiford @ 2022-08-03 15:31 UTC (permalink / raw)
  To: Martin Jambor; +Cc: gcc-patches

Martin Jambor <mjambor@suse.cz> writes:
> Hi Richard,
>
> On Fri, Nov 13 2020, Richard Sandiford via Gcc-patches wrote:
>> A later patch wants to be able to pass around subarray views of an
>> existing array.  The standard class to do that is std::span, but it's
>> a C++20 thing.  This patch just adds a cut-down version of it.
>
> thanks a lot for introducing it.  I hope to use it as a unified view
> into something which might be a GC vec, a heap vec, or an auto_vec.
>
> But I have one question:
>
>>
>> The intention is just to provide what's currently needed.
>>
>> gcc/
>> 	* vec.h (array_slice): New class.
>> ---
>>  gcc/vec.h | 120 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 120 insertions(+)
>>
>> diff --git a/gcc/vec.h b/gcc/vec.h
>> index f02beddc975..7768de9f518 100644
>> --- a/gcc/vec.h
>> +++ b/gcc/vec.h
>> @@ -2128,6 +2128,126 @@ release_vec_vec (vec<vec<T> > &vec)
>>    vec.release ();
>>  }
>>  
>> +// Provide a subset of the std::span functionality.  (We can't use std::span
>> +// itself because it's a C++20 feature.)
>> +//
>> +// In addition, provide an invalid value that is distinct from all valid
>> +// sequences (including the empty sequence).  This can be used to return
>> +// failure without having to use std::optional.
>> +//
>> +// There is no operator bool because it would be ambiguous whether it is
>> +// testing for a valid value or an empty sequence.
>> +template<typename T>
>> +class array_slice
>> +{
>> +  template<typename OtherT> friend class array_slice;
>> +
>> +public:
>> +  using value_type = T;
>> +  using iterator = T *;
>> +  using const_iterator = const T *;
>> +
>> +  array_slice () : m_base (nullptr), m_size (0) {}
>> +
>> +  template<typename OtherT>
>> +  array_slice (array_slice<OtherT> other)
>> +    : m_base (other.m_base), m_size (other.m_size) {}
>> +
>> +  array_slice (iterator base, unsigned int size)
>> +    : m_base (base), m_size (size) {}
>> +
>> +  template<size_t N>
>> +  array_slice (T (&array)[N]) : m_base (array), m_size (N) {}
>> +
>> +  template<typename OtherT>
>> +  array_slice (const vec<OtherT> &v)
>> +    : m_base (v.address ()), m_size (v.length ()) {}
>> +
>
> What is the reason for making the parameter const here?
>
> The problem is that if you do for example:
>
>   auto_vec<bool, 16> test_base;
>   test_base.quick_grow_cleared (10);
>   array_slice<bool> test(test_base);
>
> the constructor will get a const reference to test_base and so will
> invoke the const variant of v.address() which returns a const bool *
> which cannot be assigned into non-const qualified base.  AFAICS, the
> constructor only works if the array_slice is array_slice<const bool>.
>
> Is that intentional?  I am not a C++ expert and may easily be
> overlooking something.  I understand that users need to be careful not
> to cause reallocation of the underlying vector while the array_slice
> exists, but the const qualifier does not achieve that.  (A wild idea
> would be to add an array_slice ref-counter to auto_vec, which seems
> less space-efficiency-critical than other vecs, and assert on
> reallocation when it is not zero, hehe.)
>
> Removing the const qualifier in the constructor parameter makes the
> error go away - as does adding another constructor without it, which
> might be the correct thing to do.

Yeah, the latter sounds better to me.  (The existing uses of array_slice
are for const elements, which is why I didn't come across this.)
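
The extra overload could be as simple as this (a sketch, mirroring the
existing const constructor):

  template<typename OtherT>
  array_slice (vec<OtherT> &v)
    : m_base (v.address ()), m_size (v.length ()) {}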

> On a related note, would the following constructor be a good addition to
> the class (I can make it const too)?
>
>   template<typename OtherT>
>   array_slice (vec<OtherT, va_gc> *v)
>     : m_base (v ? v->address () : nullptr), m_size (v ? v->length (): 0) {}

LGTM.

Thanks,
Richard

> Thanks,
>
> Martin
>
>
>
>> +  iterator begin () { return m_base; }
>> +  iterator end () { return m_base + m_size; }
>> +
>> +  const_iterator begin () const { return m_base; }
>> +  const_iterator end () const { return m_base + m_size; }
>> +
>> +  value_type &front ();
>> +  value_type &back ();
>> +  value_type &operator[] (unsigned int i);
>> +
>> +  const value_type &front () const;
>> +  const value_type &back () const;
>> +  const value_type &operator[] (unsigned int i) const;
>> +
>> +  size_t size () const { return m_size; }
>> +  size_t size_bytes () const { return m_size * sizeof (T); }
>> +  bool empty () const { return m_size == 0; }
>> +
>> +  // An invalid array_slice that represents a failed operation.  This is
>> +  // distinct from an empty slice, which is a valid result in some contexts.
>> +  static array_slice invalid () { return { nullptr, ~0U }; }
>> +
>> +  // True if the array is valid, false if it is an array like INVALID.
>> +  bool is_valid () const { return m_base || m_size == 0; }
>> +
>> +private:
>> +  iterator m_base;
>> +  unsigned int m_size;
>> +};
>> +
>> +template<typename T>
>> +inline typename array_slice<T>::value_type &
>> +array_slice<T>::front ()
>> +{
>> +  gcc_checking_assert (m_size);
>> +  return m_base[0];
>> +}
>> +
>> +template<typename T>
>> +inline const typename array_slice<T>::value_type &
>> +array_slice<T>::front () const
>> +{
>> +  gcc_checking_assert (m_size);
>> +  return m_base[0];
>> +}
>> +
>> +template<typename T>
>> +inline typename array_slice<T>::value_type &
>> +array_slice<T>::back ()
>> +{
>> +  gcc_checking_assert (m_size);
>> +  return m_base[m_size - 1];
>> +}
>> +
>> +template<typename T>
>> +inline const typename array_slice<T>::value_type &
>> +array_slice<T>::back () const
>> +{
>> +  gcc_checking_assert (m_size);
>> +  return m_base[m_size - 1];
>> +}
>> +
>> +template<typename T>
>> +inline typename array_slice<T>::value_type &
>> +array_slice<T>::operator[] (unsigned int i)
>> +{
>> +  gcc_checking_assert (i < m_size);
>> +  return m_base[i];
>> +}
>> +
>> +template<typename T>
>> +inline const typename array_slice<T>::value_type &
>> +array_slice<T>::operator[] (unsigned int i) const
>> +{
>> +  gcc_checking_assert (i < m_size);
>> +  return m_base[i];
>> +}
>> +
>> +template<typename T>
>> +array_slice<T>
>> +make_array_slice (T *base, unsigned int size)
>> +{
>> +  return array_slice<T> (base, size);
>> +}
>> +
>>  #if (GCC_VERSION >= 3000)
>>  # pragma GCC poison m_vec m_vecpfx m_vecdata
>>  #endif
>> -- 
>> 2.17.1

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [09/23] Add a cut-down version of std::span (array_slice)
  2020-11-13  8:15 ` [09/23] Add a cut-down version of std::span (array_slice) Richard Sandiford
  2020-11-30 19:56   ` Jeff Law
  2022-08-03 15:13   ` Martin Jambor
@ 2022-08-10 16:03   ` Martin Jambor
  2022-08-11  6:58     ` Richard Biener
  2 siblings, 1 reply; 88+ messages in thread
From: Martin Jambor @ 2022-08-10 16:03 UTC (permalink / raw)
  To: Richard Sandiford, gcc-patches

Hello,

I have one more question/comment about array_slice.  Ever since I
started to use it...

On Fri, Nov 13 2020, Richard Sandiford via Gcc-patches wrote:
> A later patch wants to be able to pass around subarray views of an
> existing array.  The standard class to do that is std::span, but it's
> a C++20 thing.  This patch just adds a cut-down version of it.
>
> The intention is just to provide what's currently needed.
>
> gcc/
> 	* vec.h (array_slice): New class.
> ---
>  gcc/vec.h | 120 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 120 insertions(+)
>
> diff --git a/gcc/vec.h b/gcc/vec.h
> index f02beddc975..7768de9f518 100644
> --- a/gcc/vec.h
> +++ b/gcc/vec.h
> @@ -2128,6 +2128,126 @@ release_vec_vec (vec<vec<T> > &vec)
>    vec.release ();
>  }
>  
> +// Provide a subset of the std::span functionality.  (We can't use std::span
> +// itself because it's a C++20 feature.)
> +//
> +// In addition, provide an invalid value that is distinct from all valid
> +// sequences (including the empty sequence).  This can be used to return
> +// failure without having to use std::optional.
> +//
> +// There is no operator bool because it would be ambiguous whether it is
> +// testing for a valid value or an empty sequence.
> +template<typename T>
> +class array_slice
> +{
> +  template<typename OtherT> friend class array_slice;
> +
> +public:
> +  using value_type = T;
> +  using iterator = T *;
> +  using const_iterator = const T *;
> +
> +  array_slice () : m_base (nullptr), m_size (0) {}
> +
> +  template<typename OtherT>
> +  array_slice (array_slice<OtherT> other)
> +    : m_base (other.m_base), m_size (other.m_size) {}
> +
> +  array_slice (iterator base, unsigned int size)
> +    : m_base (base), m_size (size) {}
> +
> +  template<size_t N>
> +  array_slice (T (&array)[N]) : m_base (array), m_size (N) {}
> +
> +  template<typename OtherT>
> +  array_slice (const vec<OtherT> &v)
> +    : m_base (v.address ()), m_size (v.length ()) {}
> +
> +  iterator begin () { return m_base; }
> +  iterator end () { return m_base + m_size; }
> +
> +  const_iterator begin () const { return m_base; }
> +  const_iterator end () const { return m_base + m_size; }
> +
> +  value_type &front ();
> +  value_type &back ();
> +  value_type &operator[] (unsigned int i);
> +
> +  const value_type &front () const;
> +  const value_type &back () const;
> +  const value_type &operator[] (unsigned int i) const;
> +
> +  size_t size () const { return m_size; }

...this has been a constant source of compile errors, because vectors
have length () and this is size ().

I understand that the motivation was consistency with std::span, but do
we really want to add another inconsistency with ourselves?

Given that array_slice is not that much used yet, I believe we can still
change to be consistent with vectors.  I personally think we should but
at the very least, if we keep it as it is, I'd like us to do so
deliberately.

Thanks,

Martin


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [09/23] Add a cut-down version of std::span (array_slice)
  2022-08-10 16:03   ` Martin Jambor
@ 2022-08-11  6:58     ` Richard Biener
  2022-08-16  7:59       ` Richard Sandiford
  0 siblings, 1 reply; 88+ messages in thread
From: Richard Biener @ 2022-08-11  6:58 UTC (permalink / raw)
  To: Martin Jambor; +Cc: Richard Sandiford, GCC Patches

On Wed, Aug 10, 2022 at 6:04 PM Martin Jambor <mjambor@suse.cz> wrote:
>
> Hello,
>
> I have one more question/comment about array_slice.  Ever since I
> started to use it...
>
> On Fri, Nov 13 2020, Richard Sandiford via Gcc-patches wrote:
> > A later patch wants to be able to pass around subarray views of an
> > existing array.  The standard class to do that is std::span, but it's
> > a C++20 thing.  This patch just adds a cut-down version of it.
> >
> > The intention is just to provide what's currently needed.
> >
> > gcc/
> >       * vec.h (array_slice): New class.
> > ---
> >  gcc/vec.h | 120 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 120 insertions(+)
> >
> > diff --git a/gcc/vec.h b/gcc/vec.h
> > index f02beddc975..7768de9f518 100644
> > --- a/gcc/vec.h
> > +++ b/gcc/vec.h
> > @@ -2128,6 +2128,126 @@ release_vec_vec (vec<vec<T> > &vec)
> >    vec.release ();
> >  }
> >
> > +// Provide a subset of the std::span functionality.  (We can't use std::span
> > +// itself because it's a C++20 feature.)
> > +//
> > +// In addition, provide an invalid value that is distinct from all valid
> > +// sequences (including the empty sequence).  This can be used to return
> > +// failure without having to use std::optional.
> > +//
> > +// There is no operator bool because it would be ambiguous whether it is
> > +// testing for a valid value or an empty sequence.
> > +template<typename T>
> > +class array_slice
> > +{
> > +  template<typename OtherT> friend class array_slice;
> > +
> > +public:
> > +  using value_type = T;
> > +  using iterator = T *;
> > +  using const_iterator = const T *;
> > +
> > +  array_slice () : m_base (nullptr), m_size (0) {}
> > +
> > +  template<typename OtherT>
> > +  array_slice (array_slice<OtherT> other)
> > +    : m_base (other.m_base), m_size (other.m_size) {}
> > +
> > +  array_slice (iterator base, unsigned int size)
> > +    : m_base (base), m_size (size) {}
> > +
> > +  template<size_t N>
> > +  array_slice (T (&array)[N]) : m_base (array), m_size (N) {}
> > +
> > +  template<typename OtherT>
> > +  array_slice (const vec<OtherT> &v)
> > +    : m_base (v.address ()), m_size (v.length ()) {}
> > +
> > +  iterator begin () { return m_base; }
> > +  iterator end () { return m_base + m_size; }
> > +
> > +  const_iterator begin () const { return m_base; }
> > +  const_iterator end () const { return m_base + m_size; }
> > +
> > +  value_type &front ();
> > +  value_type &back ();
> > +  value_type &operator[] (unsigned int i);
> > +
> > +  const value_type &front () const;
> > +  const value_type &back () const;
> > +  const value_type &operator[] (unsigned int i) const;
> > +
> > +  size_t size () const { return m_size; }
>
> ...this has been a constant source of compile errors, because vectors
> have length () and this is size ().
>
> I understand that the motivation was consistency with std::span, but do
> we really want to add another inconsistency with ourselves?
>
> Given that array_slice is not that much used yet, I believe we can still
> change to be consistent with vectors.  I personally think we should but
> at the very least, if we keep it as it is, I'd like us to do so
> deliberately.

We could alternatively add length in addition to size (and maybe size to
vec<> if std::vector has size but not length) with a comment deprecating
the "non-standard" variant?

Richard.

>
> Thanks,
>
> Martin
>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [09/23] Add a cut-down version of std::span (array_slice)
  2022-08-11  6:58     ` Richard Biener
@ 2022-08-16  7:59       ` Richard Sandiford
  0 siblings, 0 replies; 88+ messages in thread
From: Richard Sandiford @ 2022-08-16  7:59 UTC (permalink / raw)
  To: Richard Biener; +Cc: Martin Jambor, GCC Patches

Richard Biener <richard.guenther@gmail.com> writes:
> On Wed, Aug 10, 2022 at 6:04 PM Martin Jambor <mjambor@suse.cz> wrote:
>>
>> Hello,
>>
>> I have one more question/comment about array_slice.  Ever since I
>> started to use it...
>>
>> On Fri, Nov 13 2020, Richard Sandiford via Gcc-patches wrote:
>> > A later patch wants to be able to pass around subarray views of an
>> > existing array.  The standard class to do that is std::span, but it's
>> > a C++20 thing.  This patch just adds a cut-down version of it.
>> >
>> > The intention is just to provide what's currently needed.
>> >
>> > gcc/
>> >       * vec.h (array_slice): New class.
>> > ---
>> >  gcc/vec.h | 120 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> >  1 file changed, 120 insertions(+)
>> >
>> > diff --git a/gcc/vec.h b/gcc/vec.h
>> > index f02beddc975..7768de9f518 100644
>> > --- a/gcc/vec.h
>> > +++ b/gcc/vec.h
>> > @@ -2128,6 +2128,126 @@ release_vec_vec (vec<vec<T> > &vec)
>> >    vec.release ();
>> >  }
>> >
>> > +// Provide a subset of the std::span functionality.  (We can't use std::span
>> > +// itself because it's a C++20 feature.)
>> > +//
>> > +// In addition, provide an invalid value that is distinct from all valid
>> > +// sequences (including the empty sequence).  This can be used to return
>> > +// failure without having to use std::optional.
>> > +//
>> > +// There is no operator bool because it would be ambiguous whether it is
>> > +// testing for a valid value or an empty sequence.
>> > +template<typename T>
>> > +class array_slice
>> > +{
>> > +  template<typename OtherT> friend class array_slice;
>> > +
>> > +public:
>> > +  using value_type = T;
>> > +  using iterator = T *;
>> > +  using const_iterator = const T *;
>> > +
>> > +  array_slice () : m_base (nullptr), m_size (0) {}
>> > +
>> > +  template<typename OtherT>
>> > +  array_slice (array_slice<OtherT> other)
>> > +    : m_base (other.m_base), m_size (other.m_size) {}
>> > +
>> > +  array_slice (iterator base, unsigned int size)
>> > +    : m_base (base), m_size (size) {}
>> > +
>> > +  template<size_t N>
>> > +  array_slice (T (&array)[N]) : m_base (array), m_size (N) {}
>> > +
>> > +  template<typename OtherT>
>> > +  array_slice (const vec<OtherT> &v)
>> > +    : m_base (v.address ()), m_size (v.length ()) {}
>> > +
>> > +  iterator begin () { return m_base; }
>> > +  iterator end () { return m_base + m_size; }
>> > +
>> > +  const_iterator begin () const { return m_base; }
>> > +  const_iterator end () const { return m_base + m_size; }
>> > +
>> > +  value_type &front ();
>> > +  value_type &back ();
>> > +  value_type &operator[] (unsigned int i);
>> > +
>> > +  const value_type &front () const;
>> > +  const value_type &back () const;
>> > +  const value_type &operator[] (unsigned int i) const;
>> > +
>> > +  size_t size () const { return m_size; }
>>
>> ...this has been a constant source of compile errors, because vectors
>> have length () and this is size ().
>>
>> I understand that the motivation was consistency with std::span, but do
>> we really want to add another inconsistency with ourselves?
>>
>> Given that array_slice is not that much used yet, I believe we can still
>> change to be consistent with vectors.  I personally think we should but
>> at the very least, if we keep it as it is, I'd like us to do so
>> deliberately.
>
> We could alternatively add length in addition to size (and maybe size to
> vec<> if std::vector has size but not length) with a comment deprecating
> the "non-standard" variant?

Yeah, I'd prefer to do the latter: add vec::size as a synonym of
vec::length, and deprecate length.  Doing anything else seems like
it's going to increase the inconsistency rather than decrease it.
E.g. we already have uses of (hopefully) uncontroversial standard
containers like std::array (my fault).

(FWIW, I keep tripping up in the opposite direction: expecting
size to be available in vec, like for standard containers.)
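
A sketch of what the synonym might look like in vec.h (assuming it
simply forwards to the existing accessor):

  /* Synonym of length (), for consistency with standard containers;
     length () would then be deprecated.  */
  unsigned size () const { return length (); }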

Thanks,
Richard

^ permalink raw reply	[flat|nested] 88+ messages in thread

end of thread, other threads:[~2022-08-16  7:59 UTC | newest]

Thread overview: 88+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-13  8:10 [00/23] Make fwprop use an on-the-side RTL SSA representation Richard Sandiford
2020-11-13  8:11 ` [01/23] vec: Silence clang warning Richard Sandiford
2020-11-25 19:58   ` Jeff Law
2020-11-13  8:12 ` [02/23] rtlanal: Remove noop_move_p REG_EQUAL condition Richard Sandiford
2020-11-25 20:00   ` Jeff Law
2020-11-13  8:12 ` [03/23] reginfo: Add a global_reg_set Richard Sandiford
2020-11-25 20:01   ` Jeff Law
2020-11-13  8:13 ` [04/23] Move iterator_range to a new iterator-utils.h file Richard Sandiford
2020-11-25 20:02   ` Jeff Law
2020-11-13  8:13 ` [05/23] Add more iterator utilities Richard Sandiford
2020-11-25 20:12   ` Jeff Law
2020-11-13  8:14 ` [06/23] Add an RAII class for managing obstacks Richard Sandiford
2020-11-25 20:15   ` Jeff Law
2020-11-13  8:14 ` [07/23] Add a class that multiplexes two pointer types Richard Sandiford
2020-11-25 20:23   ` Jeff Law
2020-11-26 16:15     ` Richard Sandiford
2020-11-30  1:28       ` Jeff Law
2020-11-25 23:33   ` Martin Sebor
2020-11-26 17:06     ` Richard Sandiford
2020-11-27 18:12       ` Richard Sandiford
2020-11-28  0:17       ` Martin Sebor
2020-12-17  0:17         ` Richard Sandiford
2020-12-17 14:21           ` Tom Tromey
2020-12-17 15:38             ` Richard Sandiford
2020-12-17 15:44               ` Nathan Sidwell
2021-01-04 15:32                 ` Jeff Law
2020-11-13  8:15 ` [08/23] Add an alternative splay tree implementation Richard Sandiford
2020-12-02 20:36   ` Jeff Law
2020-12-17  0:29     ` Richard Sandiford
2021-01-04 15:27       ` Jeff Law
2021-01-01  8:25   ` Andreas Schwab
2021-01-04 14:53     ` Richard Sandiford
2021-01-04 15:02       ` Andreas Schwab
2021-01-04 15:42         ` Richard Sandiford
2021-01-05 12:13           ` Richard Biener
2020-11-13  8:15 ` [09/23] Add a cut-down version of std::span (array_slice) Richard Sandiford
2020-11-30 19:56   ` Jeff Law
2022-08-03 15:13   ` Martin Jambor
2022-08-03 15:31     ` Richard Sandiford
2022-08-10 16:03   ` Martin Jambor
2022-08-11  6:58     ` Richard Biener
2022-08-16  7:59       ` Richard Sandiford
2020-11-13  8:16 ` [10/23] Tweak the way that is_a is implemented Richard Sandiford
2020-12-02  5:15   ` Jeff Law
2020-11-13  8:16 ` [11/23] Split update_cfg_for_uncondjump out of combine Richard Sandiford
2020-11-30  6:14   ` Jeff Law
2020-11-13  8:17 ` [12/23] Export print-rtl.c:print_insn_with_notes Richard Sandiford
2020-11-25 20:24   ` Jeff Law
2020-11-13  8:18 ` [13/23] recog: Split out a register_asm_p function Richard Sandiford
2020-11-25 20:24   ` Jeff Law
2020-11-13  8:18 ` [14/23] simplify-rtx: Put simplify routines into a class Richard Sandiford
2020-11-30 19:54   ` Jeff Law
2020-11-13  8:19 ` [15/23] recog: Add a validate_change_xveclen function Richard Sandiford
2020-11-30 20:03   ` Jeff Law
2020-11-13  8:19 ` [16/23] recog: Add a way of temporarily undoing changes Richard Sandiford
2020-11-25 20:27   ` Jeff Law
2020-12-17  0:22     ` Richard Sandiford
2020-11-13  8:20 ` [17/23] recog: Add a class for propagating into insns Richard Sandiford
2020-12-03 22:32   ` Jeff Law
2020-11-13  8:20 ` [18/23] recog: Add an RAII class for undoing insn changes Richard Sandiford
2020-11-25 20:27   ` Jeff Law
2020-11-13  8:20 ` [19/23] rtlanal: Add some new helper classes Richard Sandiford
2020-12-13 17:30   ` Jeff Law
2020-12-14 16:37     ` Richard Sandiford
2020-12-14 20:02       ` Jeff Law
2020-11-13  8:21 ` [20/23] rtlanal: Add simple_regno_set Richard Sandiford
2020-11-25 20:31   ` Jeff Law
2020-12-17  0:47     ` Richard Sandiford
2021-01-04 15:28       ` Jeff Law
2020-11-13  8:22 ` [21/23] doc: Add documentation for rtl-ssa Richard Sandiford
2020-11-30  6:26   ` Jeff Law
2020-11-13  8:23 ` [PATCH 22/23] Add rtl-ssa Richard Sandiford
2020-12-16  3:31   ` Jeff Law
2020-12-17  0:33     ` Richard Sandiford
2020-12-19 20:01       ` Jeff Law
2020-11-13  8:24 ` [PATCH 23/23] fwprop: Rewrite to use RTL SSA Richard Sandiford
2020-12-16  3:52   ` Jeff Law
2020-12-17  0:34     ` Richard Sandiford
2020-11-25 19:58 ` [00/23] Make fwprop use an on-the-side RTL SSA representation Jeff Law
2020-11-26 16:03   ` Richard Sandiford
2020-11-27 15:56     ` Michael Matz
2020-11-27 16:31       ` Richard Sandiford
2020-11-30 21:13         ` Jeff Law
2020-12-01  0:03           ` Michael Matz
2020-12-01 10:15             ` Richard Sandiford
2020-12-02  0:25             ` Jeff Law
2020-11-30  6:45     ` Jeff Law
2020-11-30 14:12       ` Richard Sandiford
