public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/6] rtl-ssa: Various fixes needed for the late-combine pass
@ 2023-10-24 10:50 Richard Sandiford
  2023-10-24 10:50 ` [PATCH 1/6] rtl-ssa: Ensure global registers are live on exit Richard Sandiford
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Richard Sandiford @ 2023-10-24 10:50 UTC (permalink / raw)
  To: jlaw, gcc-patches; +Cc: Richard Sandiford

Testing the late-combine pass showed a depressing number of
bugs in areas of RTL-SSA that hadn't been used much until now.
Most of them relate to doing things after RA.

Tested on aarch64-linux-gnu & x86_64-linux-gnu.  OK to install?

Richard

Richard Sandiford (6):
  rtl-ssa: Ensure global registers are live on exit
  rtl-ssa: Create REG_UNUSED notes after all pending changes
  rtl-ssa: Fix ICE when deleting memory clobbers
  rtl-ssa: Handle artificial uses of deleted defs
  rtl-ssa: Calculate dominance frontiers for the exit block
  rtl-ssa: Handle call clobbers in more places

 gcc/rtl-ssa/access-utils.h | 27 ++++++-----------
 gcc/rtl-ssa/accesses.cc    | 25 ++++++++++++++++
 gcc/rtl-ssa/blocks.cc      | 60 ++++++++++++++++++++++++++------------
 gcc/rtl-ssa/changes.cc     | 58 +++++++++++++++++++++++++++++++-----
 gcc/rtl-ssa/functions.cc   |  2 +-
 gcc/rtl-ssa/functions.h    | 15 ++++++++++
 gcc/rtl-ssa/insns.cc       |  2 ++
 gcc/rtl-ssa/internals.h    |  4 +++
 gcc/rtl-ssa/member-fns.inl |  9 ++++++
 9 files changed, 158 insertions(+), 44 deletions(-)

-- 
2.25.1



* [PATCH 1/6] rtl-ssa: Ensure global registers are live on exit
  2023-10-24 10:50 [PATCH 0/6] rtl-ssa: Various fixes needed for the late-combine pass Richard Sandiford
@ 2023-10-24 10:50 ` Richard Sandiford
  2023-10-24 17:21   ` Jeff Law
  2023-10-24 10:50 ` [PATCH 2/6] rtl-ssa: Create REG_UNUSED notes after all pending changes Richard Sandiford
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Richard Sandiford @ 2023-10-24 10:50 UTC (permalink / raw)
  To: jlaw, gcc-patches; +Cc: Richard Sandiford

RTL-SSA mostly relies on DF for block-level register liveness
information, including artificial uses and defs at the beginning
and end of blocks.  But one case was missing.  DF does not add
artificial uses of global registers to the beginning or end
of a block.  Instead it marks them as used within every block
when computing LR and LIVE problems.

For RTL-SSA, global registers behave like memory, which in
turn behaves like gimple vops.  We need to ensure that they
are live on exit so that final definitions do not appear
to be unused.

Also, the previous live-on-exit handling only considered the exit
block itself.  It needs to consider non-local gotos as well, since
they jump directly to some code in a parent function and so do
not have a path to the exit block.
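
As a standalone illustration of the rule (a toy model, not RTL-SSA code: ToyBlock, live_out_uses and the eight-register numbering are all made up for this sketch), the end-of-block uses could be collected roughly like this:

```cpp
#include <bitset>
#include <cassert>
#include <vector>

// Toy model only: 8 "hard registers".  Each block lists its successors
// and the artificial uses that a DF-like analysis already reported.
constexpr unsigned NUM_HARD_REGS = 8;

struct ToyBlock
{
  std::vector<unsigned> succs;            // indices of successor blocks
  std::vector<unsigned> artificial_uses;  // registers used at block end
};

// Return the registers that need a use at the end of BLOCK.  Global
// registers are added only if BLOCK has no successors and the register
// was not already added via the artificial-use list.
std::vector<unsigned>
live_out_uses (const ToyBlock &block,
               const std::bitset<NUM_HARD_REGS> &global_regs)
{
  std::bitset<NUM_HARD_REGS> added;
  std::vector<unsigned> uses;
  for (unsigned regno : block.artificial_uses)
    {
      assert (regno < NUM_HARD_REGS);
      added.set (regno);
      uses.push_back (regno);
    }

  // Any block without successors is treated like an exit block.
  if (block.succs.empty ())
    for (unsigned i = 0; i < NUM_HARD_REGS; ++i)
      if (global_regs[i] && !added[i])
        uses.push_back (i);

  return uses;
}
```

The "added" set plays the same role as added_regs in the patch: it stops a global register from getting a second use when it is already in the artificial-use list.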

gcc/
	* rtl-ssa/blocks.cc (function_info::add_artificial_accesses): Force
	global registers to be live on exit.  Handle any block with zero
	successors like an exit block.
---
 gcc/rtl-ssa/blocks.cc | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/gcc/rtl-ssa/blocks.cc b/gcc/rtl-ssa/blocks.cc
index ecce7a68c59..49c0d15b3cf 100644
--- a/gcc/rtl-ssa/blocks.cc
+++ b/gcc/rtl-ssa/blocks.cc
@@ -866,11 +866,14 @@ function_info::add_artificial_accesses (build_info &bi, df_ref_flags flags)
 
   start_insn_accesses ();
 
+  HARD_REG_SET added_regs = {};
   FOR_EACH_ARTIFICIAL_USE (ref, cfg_bb->index)
     if ((DF_REF_FLAGS (ref) & DF_REF_AT_TOP) == flags)
       {
 	unsigned int regno = DF_REF_REGNO (ref);
 	machine_mode mode = GET_MODE (DF_REF_REAL_REG (ref));
+	if (HARD_REGISTER_NUM_P (regno))
+	  SET_HARD_REG_BIT (added_regs, regno);
 
 	// A definition must be available.
 	gcc_checking_assert (bitmap_bit_p (&lr_info->in, regno)
@@ -879,10 +882,20 @@ function_info::add_artificial_accesses (build_info &bi, df_ref_flags flags)
 	m_temp_uses.safe_push (create_reg_use (bi, insn, { mode, regno }));
       }
 
-  // Track the return value of memory by adding an artificial use of
-  // memory at the end of the exit block.
-  if (flags == 0 && cfg_bb->index == EXIT_BLOCK)
+  // Ensure that global registers and memory are live at the end of any
+  // block that has no successors, such as the exit block and non-local gotos.
+  // Global registers have to be singled out because they are not part of
+  // the DF artificial use list (they are instead treated as used within
+  // every block).
+  if (flags == 0 && EDGE_COUNT (cfg_bb->succs) == 0)
     {
+      for (unsigned int i = 0; i < FIRST_PSEUDO_REGISTER; ++i)
+	if (global_regs[i] && !TEST_HARD_REG_BIT (added_regs, i))
+	  {
+	    auto mode = reg_raw_mode[i];
+	    m_temp_uses.safe_push (create_reg_use (bi, insn, { mode, i }));
+	  }
+
       auto *use = allocate<use_info> (insn, memory, bi.current_mem_value ());
       add_use (use);
       m_temp_uses.safe_push (use);
-- 
2.25.1



* [PATCH 2/6] rtl-ssa: Create REG_UNUSED notes after all pending changes
  2023-10-24 10:50 [PATCH 0/6] rtl-ssa: Various fixes needed for the late-combine pass Richard Sandiford
  2023-10-24 10:50 ` [PATCH 1/6] rtl-ssa: Ensure global registers are live on exit Richard Sandiford
@ 2023-10-24 10:50 ` Richard Sandiford
  2023-10-24 17:22   ` Jeff Law
  2023-10-24 10:50 ` [PATCH 3/6] rtl-ssa: Fix ICE when deleting memory clobbers Richard Sandiford
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Richard Sandiford @ 2023-10-24 10:50 UTC (permalink / raw)
  To: jlaw, gcc-patches; +Cc: Richard Sandiford

Unlike REG_DEAD notes, REG_UNUSED notes need to be kept free of
false positives by all passes.  function_info::change_insns
does this by removing all REG_UNUSED notes, and then using
add_reg_unused_notes to add notes back (or create new ones)
where appropriate.

The problem was that it called add_reg_unused_notes on the fly
while updating each instruction, which meant that the information
for later instructions in the change set wasn't up to date.
This patch does it in a separate loop instead.
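
To show why the ordering matters, here is a toy model (not RTL-SSA code: ToyInsn, ToyChange and the "nothing is live at the end of the sequence" rule are assumptions made for this sketch).  Notes are only computed once every change in the group has been applied:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

struct ToyInsn
{
  int def;             // register defined, or -1 for none
  std::set<int> uses;  // registers read
  bool unused_note;    // toy stand-in for a REG_UNUSED note
};

struct ToyChange
{
  std::size_t index;       // which instruction to update
  std::set<int> new_uses;  // its uses after the change
};

// Give INSNS[I] an "unused" note if no later instruction reads its
// definition before the register is redefined.  In this toy model
// nothing is live at the end of the sequence.
void
add_unused_note (std::vector<ToyInsn> &insns, std::size_t i)
{
  assert (i < insns.size ());
  int reg = insns[i].def;
  insns[i].unused_note = false;
  if (reg < 0)
    return;
  for (std::size_t j = i + 1; j < insns.size (); ++j)
    {
      if (insns[j].uses.count (reg))
        return;
      if (insns[j].def == reg)
        break;
    }
  insns[i].unused_note = true;
}

void
change_insns (std::vector<ToyInsn> &insns,
              const std::vector<ToyChange> &changes)
{
  // First bring every instruction in the group up to date...
  for (const ToyChange &change : changes)
    insns[change.index].uses = change.new_uses;

  // ...and only then decide which definitions are unused.
  for (const ToyChange &change : changes)
    add_unused_note (insns, change.index);
}
```

If the note for the first instruction were computed inside the first loop, it would still see the old uses of any later instruction in the same group.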

gcc/
	* rtl-ssa/changes.cc (function_info::apply_changes_to_insn): Remove
	call to add_reg_unused_notes and instead...
	(function_info::change_insns): ...use a separate loop here.
---
 gcc/rtl-ssa/changes.cc | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/gcc/rtl-ssa/changes.cc b/gcc/rtl-ssa/changes.cc
index de6222ae736..c73c23c86fb 100644
--- a/gcc/rtl-ssa/changes.cc
+++ b/gcc/rtl-ssa/changes.cc
@@ -586,8 +586,6 @@ function_info::apply_changes_to_insn (insn_change &change)
 
       insn->set_accesses (builder.finish ().begin (), num_defs, num_uses);
     }
-
-  add_reg_unused_notes (insn);
 }
 
 // Add a temporary placeholder instruction after AFTER.
@@ -733,9 +731,14 @@ function_info::change_insns (array_slice<insn_change *> changes)
 	}
     }
 
-  // Finally apply the changes to the underlying insn_infos.
+  // Apply the changes to the underlying insn_infos.
   for (insn_change *change : changes)
     apply_changes_to_insn (*change);
+
+  // Now that the insns and accesses are up to date, add any REG_UNUSED notes.
+  for (insn_change *change : changes)
+    if (!change->is_deletion ())
+      add_reg_unused_notes (change->insn ());
 }
 
 // See the comment above the declaration.
-- 
2.25.1



* [PATCH 3/6] rtl-ssa: Fix ICE when deleting memory clobbers
  2023-10-24 10:50 [PATCH 0/6] rtl-ssa: Various fixes needed for the late-combine pass Richard Sandiford
  2023-10-24 10:50 ` [PATCH 1/6] rtl-ssa: Ensure global registers are live on exit Richard Sandiford
  2023-10-24 10:50 ` [PATCH 2/6] rtl-ssa: Create REG_UNUSED notes after all pending changes Richard Sandiford
@ 2023-10-24 10:50 ` Richard Sandiford
  2023-10-24 17:24   ` Jeff Law
  2023-10-24 10:50 ` [PATCH 4/6] rtl-ssa: Handle artificial uses of deleted defs Richard Sandiford
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Richard Sandiford @ 2023-10-24 10:50 UTC (permalink / raw)
  To: jlaw, gcc-patches; +Cc: Richard Sandiford

Sometimes an optimisation can remove a clobber of scratch registers
or scratch memory.  We then need to update the DU chains to reflect
the removed clobber.

For registers this isn't a problem.  Clobbers of registers are just
momentary blips in the register's lifetime.  They act as a barrier for
moving uses later or defs earlier, but otherwise they have no effect on
the semantics of other instructions.  Removing a clobber is therefore a
cheap, local operation.

In contrast, clobbers of memory are modelled as full sets.
This is because (a) a clobber of memory does not invalidate
*all* memory and (b) it's a common idiom to use (clobber (mem ...))
in stack barriers.  But removing a set and redirecting all uses
to a different set is a linear operation.  Doing it for potentially
every optimisation could lead to quadratic behaviour.

This patch therefore refrains from removing sets of memory that appear
to be redundant.  There's an opportunity to clean this up in linear time
at the end of the pass, but as things stand, nothing would benefit from
that.

This is also a very rare event.  Usually we should try to optimise the
insn before the scratch memory has been allocated.

gcc/
	* rtl-ssa/changes.cc (function_info::finalize_new_accesses):
	If a change describes a set of memory, ensure that that set
	is kept, regardless of the insn pattern.
---
 gcc/rtl-ssa/changes.cc | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/gcc/rtl-ssa/changes.cc b/gcc/rtl-ssa/changes.cc
index c73c23c86fb..5800f9dba97 100644
--- a/gcc/rtl-ssa/changes.cc
+++ b/gcc/rtl-ssa/changes.cc
@@ -429,8 +429,18 @@ function_info::finalize_new_accesses (insn_change &change, insn_info *pos)
   // Also keep any explicitly-recorded call clobbers, which are deliberately
   // excluded from the vec_rtx_properties.  Calls shouldn't move, so we can
   // keep the definitions in their current position.
+  //
+  // If the change describes a set of memory, but the pattern doesn't
+  // reference memory, keep the set anyway.  This can happen if the
+  // old pattern was a parallel that contained a memory clobber, and if
+  // the new pattern was recognized without that clobber.  Keeping the
+  // set avoids a linear-complexity update to the set's users.
+  //
+  // ??? We could queue an update so that these bogus clobbers are
+  // removed later.
   for (def_info *def : change.new_defs)
-    if (def->m_has_been_superceded && def->is_call_clobber ())
+    if (def->m_has_been_superceded
+	&& (def->is_call_clobber () || def->is_mem ()))
       {
 	def->m_has_been_superceded = false;
 	def->set_insn (insn);
@@ -535,7 +545,7 @@ function_info::finalize_new_accesses (insn_change &change, insn_info *pos)
 	}
     }
 
-  // Install the new list of definitions in CHANGE.
+  // Install the new list of uses in CHANGE.
   sort_accesses (m_temp_uses);
   change.new_uses = use_array (temp_access_array (m_temp_uses));
   m_temp_uses.truncate (0);
-- 
2.25.1



* [PATCH 4/6] rtl-ssa: Handle artificial uses of deleted defs
  2023-10-24 10:50 [PATCH 0/6] rtl-ssa: Various fixes needed for the late-combine pass Richard Sandiford
                   ` (2 preceding siblings ...)
  2023-10-24 10:50 ` [PATCH 3/6] rtl-ssa: Fix ICE when deleting memory clobbers Richard Sandiford
@ 2023-10-24 10:50 ` Richard Sandiford
  2023-10-24 17:26   ` Jeff Law
  2023-10-24 10:50 ` [PATCH 5/6] rtl-ssa: Calculate dominance frontiers for the exit block Richard Sandiford
  2023-10-24 10:50 ` [PATCH 6/6] rtl-ssa: Handle call clobbers in more places Richard Sandiford
  5 siblings, 1 reply; 13+ messages in thread
From: Richard Sandiford @ 2023-10-24 10:50 UTC (permalink / raw)
  To: jlaw, gcc-patches; +Cc: Richard Sandiford

If an optimisation removes the last real use of a definition,
there can still be artificial uses left.  This patch removes
those uses too.

These artificial uses exist because RTL-SSA is only an SSA-like
view of the existing RTL IL, rather than a native SSA representation.
It effectively treats RTL registers like gimple vops, but with the
addition of an RPO view of the register's lifetime(s).  Things are
structured to allow most operations to update this RPO view in
amortised sublinear time.
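
A toy version of the clean-up (not RTL-SSA code: ToySet, ToyUse and the returned count are assumptions made for this sketch) might look like the following.  Each remaining use is either an input to a dead phi or a live-out use:

```cpp
#include <cassert>
#include <vector>

struct ToySet;

struct ToyUse
{
  bool is_live_out;  // true for an artificial use at the end of a block
  ToySet *phi;       // non-null if the use is an input to this phi
};

struct ToySet
{
  std::vector<ToyUse> uses;
  bool deleted = false;
};

// SET has been deleted.  Remove all of its remaining uses and return
// how many uses were removed in total, including uses of dead phis.
unsigned
process_uses_of_deleted_def (ToySet &set)
{
  unsigned removed = 0;
  for (const ToyUse &use : set.uses)
    {
      if (use.phi)
        {
          // A phi fed by a deleted set is dead.  Mark it before cleaning
          // up its own uses so that each phi is visited only once.
          if (!use.phi->deleted)
            {
              use.phi->deleted = true;
              removed += process_uses_of_deleted_def (*use.phi);
            }
        }
      else
        assert (use.is_live_out);
      ++removed;
    }
  set.uses.clear ();
  return removed;
}
```

The "deleted" guard is specific to the toy model.  The patch itself relies on the phi's remaining uses being live-out uses, which is why the nested call there does not recurse further.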

gcc/
	* rtl-ssa/functions.h (function_info::process_uses_of_deleted_def):
	New member function.
	* rtl-ssa/changes.cc (function_info::process_uses_of_deleted_def):
	Likewise.
	(function_info::change_insns): Use it.
---
 gcc/rtl-ssa/changes.cc  | 35 +++++++++++++++++++++++++++++++++--
 gcc/rtl-ssa/functions.h |  1 +
 2 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/gcc/rtl-ssa/changes.cc b/gcc/rtl-ssa/changes.cc
index 5800f9dba97..3e14069421c 100644
--- a/gcc/rtl-ssa/changes.cc
+++ b/gcc/rtl-ssa/changes.cc
@@ -209,6 +209,35 @@ rtl_ssa::changes_are_worthwhile (array_slice<insn_change *const> changes,
   return true;
 }
 
+// SET has been deleted.  Clean up all remaining uses.  Such uses are
+// either dead phis or now-redundant live-out uses.
+void
+function_info::process_uses_of_deleted_def (set_info *set)
+{
+  if (!set->has_any_uses ())
+    return;
+
+  auto *use = *set->all_uses ().begin ();
+  do
+    {
+      auto *next_use = use->next_use ();
+      if (use->is_in_phi ())
+	{
+	  // This call will not recurse.
+	  process_uses_of_deleted_def (use->phi ());
+	  delete_phi (use->phi ());
+	}
+      else
+	{
+	  gcc_assert (use->is_live_out_use ());
+	  remove_use (use);
+	}
+      use = next_use;
+    }
+  while (use);
+  gcc_assert (!set->has_any_uses ());
+}
+
 // Update the REG_NOTES of INSN, whose pattern has just been changed.
 static void
 update_notes (rtx_insn *insn)
@@ -695,7 +724,8 @@ function_info::change_insns (array_slice<insn_change *> changes)
     }
 
   // Remove all definitions that are no longer needed.  After the above,
-  // such definitions should no longer have any registered users.
+  // the only uses of such definitions should be dead phis and now-redundant
+  // live-out uses.
   //
   // In particular, this means that consumers must handle debug
   // instructions before removing a set.
@@ -704,7 +734,8 @@ function_info::change_insns (array_slice<insn_change *> changes)
       if (def->m_has_been_superceded)
 	{
 	  auto *set = dyn_cast<set_info *> (def);
-	  gcc_assert (!set || !set->has_any_uses ());
+	  if (set && set->has_any_uses ())
+	    process_uses_of_deleted_def (set);
 	  remove_def (def);
 	}
 
diff --git a/gcc/rtl-ssa/functions.h b/gcc/rtl-ssa/functions.h
index 73690a0e63b..cd90b6aa9df 100644
--- a/gcc/rtl-ssa/functions.h
+++ b/gcc/rtl-ssa/functions.h
@@ -263,6 +263,7 @@ private:
   bb_info *create_bb_info (basic_block);
   void append_bb (bb_info *);
 
+  void process_uses_of_deleted_def (set_info *);
   insn_info *add_placeholder_after (insn_info *);
   void possibly_queue_changes (insn_change &);
   void finalize_new_accesses (insn_change &, insn_info *);
-- 
2.25.1



* [PATCH 5/6] rtl-ssa: Calculate dominance frontiers for the exit block
  2023-10-24 10:50 [PATCH 0/6] rtl-ssa: Various fixes needed for the late-combine pass Richard Sandiford
                   ` (3 preceding siblings ...)
  2023-10-24 10:50 ` [PATCH 4/6] rtl-ssa: Handle artificial uses of deleted defs Richard Sandiford
@ 2023-10-24 10:50 ` Richard Sandiford
  2023-10-24 17:28   ` Jeff Law
  2023-10-24 10:50 ` [PATCH 6/6] rtl-ssa: Handle call clobbers in more places Richard Sandiford
  5 siblings, 1 reply; 13+ messages in thread
From: Richard Sandiford @ 2023-10-24 10:50 UTC (permalink / raw)
  To: jlaw, gcc-patches; +Cc: Richard Sandiford

The exit block can have multiple predecessors, for example if the
function calls __builtin_eh_return.  We might then need PHI nodes
for values that are live on exit.

RTL-SSA uses the normal dominance frontiers approach for calculating
where PHI nodes are needed.  However, dominance.cc only calculates
dominators for normal blocks, not the exit block.
calculate_dominance_frontiers likewise only calculates dominance
frontiers for normal blocks.

This patch fills in the “missing” frontiers manually.
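
The manual calculation can be sketched on a toy CFG (not GCC code: the idom vector, the block numbering and both function names are assumptions made for this example).  Block 0 is the entry block and is its own immediate dominator:

```cpp
#include <cassert>
#include <set>
#include <vector>

// IDOM[b] is the immediate dominator of block b; the entry block is its
// own immediate dominator.  Return the nearest common dominator of A and B.
int
nearest_common_dominator (const std::vector<int> &idom, int a, int b)
{
  assert (a >= 0 && a < static_cast<int> (idom.size ()));
  assert (b >= 0 && b < static_cast<int> (idom.size ()));

  // Collect A and all of its dominators...
  std::set<int> ancestors;
  for (int x = a;; x = idom[x])
    {
      ancestors.insert (x);
      if (idom[x] == x)
        break;
    }
  // ...then walk up from B until we reach one of them.
  while (!ancestors.count (b))
    b = idom[b];
  return b;
}

// Return the blocks whose dominance frontier should contain the exit
// block, given the exit block's predecessors.
std::set<int>
blocks_with_exit_in_frontier (const std::vector<int> &idom,
                              const std::vector<int> &exit_preds)
{
  std::set<int> result;
  if (exit_preds.empty ())
    return result;

  // The exit block's immediate dominator is the nearest common
  // dominator of all its predecessors.
  int exit_dom = exit_preds[0];
  for (int pred : exit_preds)
    exit_dom = nearest_common_dominator (idom, exit_dom, pred);

  // Every block on the dominator-tree path from a predecessor up to
  // (but not including) that dominator has the exit block in its frontier.
  for (int pred : exit_preds)
    for (int bb = pred; bb != exit_dom; bb = idom[bb])
      result.insert (bb);
  return result;
}
```

With a single predecessor, the exit block's dominator is that predecessor and no frontier entries are added, which matches the case where no PHI nodes are needed.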

gcc/
	* rtl-ssa/internals.h (build_info::exit_block_dominator): New
	member variable.
	* rtl-ssa/blocks.cc (build_info::build_info): Initialize it.
	(bb_walker::bb_walker): Use it, moving the computation of the
	dominator to...
	(function_info::process_all_blocks): ...here.
	(function_info::place_phis): Add dominance frontiers for the
	exit block.
---
 gcc/rtl-ssa/blocks.cc   | 41 ++++++++++++++++++++++++++---------------
 gcc/rtl-ssa/internals.h |  4 ++++
 2 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/gcc/rtl-ssa/blocks.cc b/gcc/rtl-ssa/blocks.cc
index 49c0d15b3cf..0ce798e21b7 100644
--- a/gcc/rtl-ssa/blocks.cc
+++ b/gcc/rtl-ssa/blocks.cc
@@ -47,7 +47,8 @@ function_info::build_info::build_info (unsigned int num_regs,
     potential_phi_regs (num_regs),
     bb_phis (num_bb_indices),
     bb_mem_live_out (num_bb_indices),
-    bb_to_rpo (num_bb_indices)
+    bb_to_rpo (num_bb_indices),
+    exit_block_dominator (nullptr)
 {
   last_access.safe_grow_cleared (num_regs + 1);
 
@@ -103,21 +104,8 @@ function_info::bb_walker::bb_walker (function_info *function, build_info &bi)
   : dom_walker (CDI_DOMINATORS, ALL_BLOCKS, bi.bb_to_rpo.address ()),
     m_function (function),
     m_bi (bi),
-    m_exit_block_dominator (nullptr)
+    m_exit_block_dominator (bi.exit_block_dominator)
 {
-  // ??? There is no dominance information associated with the exit block,
-  // so work out its immediate dominator using predecessor blocks.  We then
-  // walk the exit block just before popping its immediate dominator.
-  edge e;
-  edge_iterator ei;
-  FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (m_function->m_fn)->preds)
-    if (m_exit_block_dominator)
-      m_exit_block_dominator
-	= nearest_common_dominator (CDI_DOMINATORS,
-				    m_exit_block_dominator, e->src);
-    else
-      m_exit_block_dominator = e->src;
-
   // If the exit block is unreachable, process it last.
   if (!m_exit_block_dominator)
     m_exit_block_dominator = ENTRY_BLOCK_PTR_FOR_FN (m_function->m_fn);
@@ -624,6 +612,19 @@ function_info::place_phis (build_info &bi)
     bitmap_initialize (&frontiers[i], &bitmap_default_obstack);
   compute_dominance_frontiers (frontiers.address ());
 
+  // The normal dominance information doesn't calculate dominators for
+  // the exit block, so we don't get dominance frontiers for it either.
+  // Calculate them by hand.
+  for (edge e : EXIT_BLOCK_PTR_FOR_FN (m_fn)->preds)
+    {
+      basic_block bb = e->src;
+      while (bb != bi.exit_block_dominator)
+	{
+	  bitmap_set_bit (&frontiers[bb->index], EXIT_BLOCK);
+	  bb = get_immediate_dominator (CDI_DOMINATORS, bb);
+	}
+    }
+
   // In extreme cases, the number of live-in registers can be much
   // greater than the number of phi nodes needed in a block (see PR98863).
   // Try to reduce the number of operations involving live-in sets by using
@@ -1264,6 +1265,16 @@ function_info::process_all_blocks ()
 
   build_info bi (m_num_regs, num_bb_indices);
 
+  // ??? There is no dominance information associated with the exit block,
+  // so work out its immediate dominator using predecessor blocks.
+  for (edge e : EXIT_BLOCK_PTR_FOR_FN (m_fn)->preds)
+    if (bi.exit_block_dominator)
+      bi.exit_block_dominator
+	= nearest_common_dominator (CDI_DOMINATORS,
+				    bi.exit_block_dominator, e->src);
+    else
+      bi.exit_block_dominator = e->src;
+
   calculate_potential_phi_regs (bi);
   create_ebbs (bi);
   place_phis (bi);
diff --git a/gcc/rtl-ssa/internals.h b/gcc/rtl-ssa/internals.h
index 6ed957754e2..e65ba9fe038 100644
--- a/gcc/rtl-ssa/internals.h
+++ b/gcc/rtl-ssa/internals.h
@@ -135,6 +135,10 @@ public:
   // The top of this stack records the start of the current block's
   // section in DEF_STACK.
   auto_vec<unsigned int> old_def_stack_limit;
+
+  // The block that dominates the exit block, or null if the exit block
+  // is unreachable.
+  basic_block exit_block_dominator;
 };
 
 }
-- 
2.25.1



* [PATCH 6/6] rtl-ssa: Handle call clobbers in more places
  2023-10-24 10:50 [PATCH 0/6] rtl-ssa: Various fixes needed for the late-combine pass Richard Sandiford
                   ` (4 preceding siblings ...)
  2023-10-24 10:50 ` [PATCH 5/6] rtl-ssa: Calculate dominance frontiers for the exit block Richard Sandiford
@ 2023-10-24 10:50 ` Richard Sandiford
  2023-10-24 17:37   ` Jeff Law
  5 siblings, 1 reply; 13+ messages in thread
From: Richard Sandiford @ 2023-10-24 10:50 UTC (permalink / raw)
  To: jlaw, gcc-patches; +Cc: Richard Sandiford

In order to save (a lot of) memory, RTL-SSA avoids creating
individual clobber records for every call-clobbered register.
It instead maintains a list & splay tree of calls in an EBB,
grouped by ABI.

This patch takes these call clobbers into account in a couple
more routines.  I don't think this will have any effect on
existing users, since it's only necessary for hard registers.
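
As a toy model of the new check (not RTL-SSA code: ToyFunction, ToySet and the 16-register limit are assumptions made for this sketch), a function-wide bitset of call-clobbered hard registers is enough to make the single-definition test conservative:

```cpp
#include <bitset>
#include <cassert>

// Toy model only: registers below this number are "hard" registers.
constexpr unsigned FIRST_PSEUDO = 16;

struct ToySet
{
  unsigned regno;
  bool is_first_def;
  bool is_last_def;
};

struct ToyFunction
{
  // Hard registers clobbered by at least one call in the function.
  std::bitset<FIRST_PSEUDO> clobbered_by_calls;

  void
  mark_clobbered (unsigned regno)
  {
    assert (regno < FIRST_PSEUDO);
    clobbered_by_calls.set (regno);
  }

  // A set is the single dominating definition only if it is the sole
  // explicit definition and no call can clobber the register behind
  // its back.  Pseudo registers are never call-clobbered.
  bool
  is_single_dominating_def (const ToySet &set) const
  {
    return (set.is_first_def
            && set.is_last_def
            && (set.regno >= FIRST_PSEUDO
                || !clobbered_by_calls[set.regno]));
  }
};
```

The extra test only ever changes the answer for hard registers, which is why existing users that work on pseudos should be unaffected.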

gcc/
	* rtl-ssa/access-utils.h (next_call_clobbers): New function.
	(is_single_dominating_def, remains_available_on_exit): Replace with...
	* rtl-ssa/functions.h (function_info::is_single_dominating_def)
	(function_info::remains_available_on_exit): ...these new member
	functions.
	(function_info::m_clobbered_by_calls): New member variable.
	* rtl-ssa/functions.cc (function_info::function_info): Explicitly
	initialize m_clobbered_by_calls.
	* rtl-ssa/insns.cc (function_info::record_call_clobbers): Update
	m_clobbered_by_calls for each call-clobber note.
	* rtl-ssa/member-fns.inl (function_info::is_single_dominating_def):
	New function.  Check for call clobbers.
	* rtl-ssa/accesses.cc (function_info::remains_available_on_exit):
	Likewise.
---
 gcc/rtl-ssa/access-utils.h | 27 +++++++++------------------
 gcc/rtl-ssa/accesses.cc    | 25 +++++++++++++++++++++++++
 gcc/rtl-ssa/functions.cc   |  2 +-
 gcc/rtl-ssa/functions.h    | 14 ++++++++++++++
 gcc/rtl-ssa/insns.cc       |  2 ++
 gcc/rtl-ssa/member-fns.inl |  9 +++++++++
 6 files changed, 60 insertions(+), 19 deletions(-)

diff --git a/gcc/rtl-ssa/access-utils.h b/gcc/rtl-ssa/access-utils.h
index 84d386b7d8b..0d7a57f843c 100644
--- a/gcc/rtl-ssa/access-utils.h
+++ b/gcc/rtl-ssa/access-utils.h
@@ -127,24 +127,6 @@ set_with_nondebug_insn_uses (access_info *access)
   return nullptr;
 }
 
-// Return true if SET is the only set of SET->resource () and if it
-// dominates all uses (excluding uses of SET->resource () at points
-// where SET->resource () is always undefined).
-inline bool
-is_single_dominating_def (const set_info *set)
-{
-  return set->is_first_def () && set->is_last_def ();
-}
-
-// SET is known to be available on entry to BB.  Return true if it is
-// also available on exit from BB.  (The value might or might not be live.)
-inline bool
-remains_available_on_exit (const set_info *set, bb_info *bb)
-{
-  return (set->is_last_def ()
-	  || *set->next_def ()->insn () > *bb->end_insn ());
-}
-
 // ACCESS is known to be associated with an instruction rather than
 // a phi node.  Return which instruction that is.
 inline insn_info *
@@ -313,6 +295,15 @@ next_call_clobbers_ignoring (insn_call_clobbers_tree &tree, insn_info *insn,
   return tree->insn ();
 }
 
+// Search forwards from immediately after INSN for the first instruction
+// recorded in TREE.  Return null if no such instruction exists.
+inline insn_info *
+next_call_clobbers (insn_call_clobbers_tree &tree, insn_info *insn)
+{
+  auto ignore = [](const insn_info *) { return false; };
+  return next_call_clobbers_ignoring (tree, insn, ignore);
+}
+
 // If ACCESS is a set, return the first use of ACCESS by a nondebug insn I
 // for which IGNORE (I) is false.  Return null if ACCESS is not a set or if
 // no such use exists.
diff --git a/gcc/rtl-ssa/accesses.cc b/gcc/rtl-ssa/accesses.cc
index 774ab9d99ee..c35c7efb73d 100644
--- a/gcc/rtl-ssa/accesses.cc
+++ b/gcc/rtl-ssa/accesses.cc
@@ -1303,6 +1303,31 @@ function_info::insert_temp_clobber (obstack_watermark &watermark,
   return insert_access (watermark, clobber, old_defs);
 }
 
+// See the comment above the declaration.
+bool
+function_info::remains_available_on_exit (const set_info *set, bb_info *bb)
+{
+  if (HARD_REGISTER_NUM_P (set->regno ())
+      && TEST_HARD_REG_BIT (m_clobbered_by_calls, set->regno ()))
+    {
+      insn_info *search_insn = (set->bb () == bb
+				? set->insn ()
+				: bb->head_insn ());
+      for (ebb_call_clobbers_info *call_group : bb->ebb ()->call_clobbers ())
+	{
+	  if (!call_group->clobbers (set->resource ()))
+	    continue;
+
+	  insn_info *insn = next_call_clobbers (*call_group, search_insn);
+	  if (insn && insn->bb () == bb)
+	    return false;
+	}
+    }
+
+  return (set->is_last_def ()
+	  || *set->next_def ()->insn () > *bb->end_insn ());
+}
+
 // A subroutine of make_uses_available.  Try to make USE's definition
 // available at the head of BB.  WILL_BE_DEBUG_USE is true if the
 // definition will be used only in debug instructions.
diff --git a/gcc/rtl-ssa/functions.cc b/gcc/rtl-ssa/functions.cc
index c35d25dbf8f..8a8108baae8 100644
--- a/gcc/rtl-ssa/functions.cc
+++ b/gcc/rtl-ssa/functions.cc
@@ -32,7 +32,7 @@
 using namespace rtl_ssa;
 
 function_info::function_info (function *fn)
-  : m_fn (fn)
+  : m_fn (fn), m_clobbered_by_calls ()
 {
   // Force the alignment to be obstack_alignment.  Everything else is normal.
   obstack_specify_allocation (&m_obstack, OBSTACK_CHUNK_SIZE,
diff --git a/gcc/rtl-ssa/functions.h b/gcc/rtl-ssa/functions.h
index cd90b6aa9df..ab253e750cb 100644
--- a/gcc/rtl-ssa/functions.h
+++ b/gcc/rtl-ssa/functions.h
@@ -102,6 +102,11 @@ public:
   // definitions by things like phi nodes.
   iterator_range<def_iterator> reg_defs (unsigned int regno) const;
 
+  // Return true if SET is the only set of SET->resource () and if it
+  // dominates all uses (excluding uses of SET->resource () at points
+  // where SET->resource () is always undefined).
+  bool is_single_dominating_def (const set_info *set) const;
+
   // Check if all uses of register REGNO are either unconditionally undefined
   // or use the same single dominating definition.  Return the definition
   // if so, otherwise return null.
@@ -116,6 +121,11 @@ public:
   // scope until the change has been aborted or successfully completed.
   obstack_watermark new_change_attempt () { return &m_temp_obstack; }
 
+  // SET either occurs in BB or is known to be available on entry to BB.
+  // Return true if it is also available on exit from BB.  (The value
+  // might or might not be live.)
+  bool remains_available_on_exit (const set_info *set, bb_info *bb);
+
   // Make a best attempt to check whether the values used by USES are
   // available on entry to BB, without solving a full dataflow problem.
   // If all the values are already live on entry to BB or can be made
@@ -357,6 +367,10 @@ private:
   // on it.  As with M_QUEUED_INSN_UPDATES, these updates are queued until
   // a convenient point.
   auto_bitmap m_need_to_purge_dead_edges;
+
+  // The set of hard registers that are fully or partially clobbered
+  // by at least one insn_call_clobbers_note.
+  HARD_REG_SET m_clobbered_by_calls;
 };
 
 void pp_function (pretty_printer *, const function_info *);
diff --git a/gcc/rtl-ssa/insns.cc b/gcc/rtl-ssa/insns.cc
index f970375d906..5fde3f2bb4b 100644
--- a/gcc/rtl-ssa/insns.cc
+++ b/gcc/rtl-ssa/insns.cc
@@ -568,6 +568,8 @@ function_info::record_call_clobbers (build_info &bi, insn_info *insn,
       insn->add_note (insn_clobbers);
 
       ecc->insert_max_node (insn_clobbers);
+
+      m_clobbered_by_calls |= abi.full_and_partial_reg_clobbers ();
     }
   else
     for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
diff --git a/gcc/rtl-ssa/member-fns.inl b/gcc/rtl-ssa/member-fns.inl
index 3fdca14e0ef..ce2db045b78 100644
--- a/gcc/rtl-ssa/member-fns.inl
+++ b/gcc/rtl-ssa/member-fns.inl
@@ -916,6 +916,15 @@ function_info::reg_defs (unsigned int regno) const
   return { m_defs[regno + 1], nullptr };
 }
 
+inline bool
+function_info::is_single_dominating_def (const set_info *set) const
+{
+  return (set->is_first_def ()
+	  && set->is_last_def ()
+	  && (!HARD_REGISTER_NUM_P (set->regno ())
+	      || !TEST_HARD_REG_BIT (m_clobbered_by_calls, set->regno ())));
+}
+
 inline set_info *
 function_info::single_dominating_def (unsigned int regno) const
 {
-- 
2.25.1



* Re: [PATCH 1/6] rtl-ssa: Ensure global registers are live on exit
  2023-10-24 10:50 ` [PATCH 1/6] rtl-ssa: Ensure global registers are live on exit Richard Sandiford
@ 2023-10-24 17:21   ` Jeff Law
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff Law @ 2023-10-24 17:21 UTC (permalink / raw)
  To: Richard Sandiford, jlaw, gcc-patches



On 10/24/23 04:50, Richard Sandiford wrote:
> RTL-SSA mostly relies on DF for block-level register liveness
> information, including artificial uses and defs at the beginning
> and end of blocks.  But one case was missing.  DF does not add
> artificial uses of global registers to the beginning or end
> of a block.  Instead it marks them as used within every block
> when computing LR and LIVE problems.
> 
> For RTL-SSA, global registers behave like memory, which in
> turn behaves like gimple vops.  We need to ensure that they
> are live on exit so that final definitions do not appear
> to be unused.
> 
> Also, the previous live-on-exit handling only considered the exit
> block itself.  It needs to consider non-local gotos as well, since
> they jump directly to some code in a parent function and so do
> not have a path to the exit block.
> 
> gcc/
> 	* rtl-ssa/blocks.cc (function_info::add_artificial_accesses): Force
> 	global registers to be live on exit.  Handle any block with zero
> 	successors like an exit block.
OK
jeff


* Re: [PATCH 2/6] rtl-ssa: Create REG_UNUSED notes after all pending changes
  2023-10-24 10:50 ` [PATCH 2/6] rtl-ssa: Create REG_UNUSED notes after all pending changes Richard Sandiford
@ 2023-10-24 17:22   ` Jeff Law
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff Law @ 2023-10-24 17:22 UTC (permalink / raw)
  To: Richard Sandiford, jlaw, gcc-patches



On 10/24/23 04:50, Richard Sandiford wrote:
> Unlike REG_DEAD notes, REG_UNUSED notes need to be kept free of
> false positives by all passes.  function_info::change_insns
> does this by removing all REG_UNUSED notes, and then using
> add_reg_unused_notes to add notes back (or create new ones)
> where appropriate.
> 
> The problem was that it called add_reg_unused_notes on the fly
> while updating each instruction, which meant that the information
> for later instructions in the change set wasn't up to date.
> This patch does it in a separate loop instead.
> 
> gcc/
> 	* rtl-ssa/changes.cc (function_info::apply_changes_to_insn): Remove
> 	call to add_reg_unused_notes and instead...
> 	(function_info::change_insns): ...use a separate loop here.
OK
jeff


* Re: [PATCH 3/6] rtl-ssa: Fix ICE when deleting memory clobbers
  2023-10-24 10:50 ` [PATCH 3/6] rtl-ssa: Fix ICE when deleting memory clobbers Richard Sandiford
@ 2023-10-24 17:24   ` Jeff Law
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff Law @ 2023-10-24 17:24 UTC (permalink / raw)
  To: Richard Sandiford, jlaw, gcc-patches



On 10/24/23 04:50, Richard Sandiford wrote:
> Sometimes an optimisation can remove a clobber of scratch registers
> or scratch memory.  We then need to update the DU chains to reflect
> the removed clobber.
> 
> For registers this isn't a problem.  Clobbers of registers are just
> momentary blips in the register's lifetime.  They act as a barrier for
> moving uses later or defs earlier, but otherwise they have no effect on
> the semantics of other instructions.  Removing a clobber is therefore a
> cheap, local operation.
> 
> In contrast, clobbers of memory are modelled as full sets.
> This is because (a) a clobber of memory does not invalidate
> *all* memory and (b) it's a common idiom to use (clobber (mem ...))
> in stack barriers.  But removing a set and redirecting all uses
> to a different set is a linear operation.  Doing it for potentially
> every optimisation could lead to quadratic behaviour.
> 
> This patch therefore refrains from removing sets of memory that appear
> to be redundant.  There's an opportunity to clean this up in linear time
> at the end of the pass, but as things stand, nothing would benefit from
> that.
> 
> This is also a very rare event.  Usually we should try to optimise the
> insn before the scratch memory has been allocated.
> 
> gcc/
> 	* rtl-ssa/changes.cc (function_info::finalize_new_accesses):
> 	If a change describes a set of memory, ensure that that set
> 	is kept, regardless of the insn pattern.
OK
jeff
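
The cost argument above (removing a set means redirecting all of its uses,
and doing that repeatedly can go quadratic) can be illustrated with a toy
def-use chain.  This is a sketch under stated assumptions, not RTL-SSA code:
toy_def, remove_set and total_redirect_work are hypothetical names, and uses
are plain integer ids.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy memory definition: records the ids of the uses that read it.
struct toy_def {
  std::vector<int> uses;
};

// Remove DEFS[INDEX] and redirect its uses to the previous definition.
// Return the number of uses that had to be touched (linear in that count).
static std::size_t remove_set(std::vector<toy_def> &defs, std::size_t index) {
  assert(index > 0 && index < defs.size());
  std::size_t moved = defs[index].uses.size();
  toy_def &prev = defs[index - 1];
  prev.uses.insert(prev.uses.end(), defs[index].uses.begin(),
                   defs[index].uses.end());
  defs.erase(defs.begin() + static_cast<std::ptrdiff_t>(index));
  return moved;
}

// Build a chain of N definitions with one use each, then remove them one at
// a time from the end.  Each removal inherits the uses of the previous ones,
// so the total work is 1 + 2 + ... + (N - 1).
static std::size_t total_redirect_work(std::size_t n) {
  std::vector<toy_def> defs(n);
  for (std::size_t i = 0; i < n; ++i)
    defs[i].uses.push_back(static_cast<int>(i));
  std::size_t work = 0;
  while (defs.size() > 1) work += remove_set(defs, defs.size() - 1);
  return work;
}
```

In this model total_redirect_work(n) grows as n * (n - 1) / 2, which is the
kind of behaviour that keeping the apparently redundant set avoids.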


* Re: [PATCH 4/6] rtl-ssa: Handle artifical uses of deleted defs
  2023-10-24 10:50 ` [PATCH 4/6] rtl-ssa: Handle artifical uses of deleted defs Richard Sandiford
@ 2023-10-24 17:26   ` Jeff Law
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff Law @ 2023-10-24 17:26 UTC (permalink / raw)
  To: Richard Sandiford, jlaw, gcc-patches



On 10/24/23 04:50, Richard Sandiford wrote:
> If an optimisation removes the last real use of a definition,
> there can still be artificial uses left.  This patch removes
> those uses too.
> 
> These artificial uses exist because RTL-SSA is only an SSA-like
> view of the existing RTL IL, rather than a native SSA representation.
> It effectively treats RTL registers like gimple vops, but with the
> addition of an RPO view of the register's lifetime(s).  Things are
> structured to allow most operations to update this RPO view in
> amortised sublinear time.
> 
> gcc/
> 	* rtl-ssa/functions.h (function_info::process_uses_of_deleted_def):
> 	New member function.
> 	* rtl-ssa/functions.cc (function_info::process_uses_of_deleted_def):
> 	Likewise.
> 	(function_info::change_insns): Use it.
OK
jeff
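
As a rough illustration of the situation described above, here is a toy
model of a definition that is left with only artificial uses.  It is a
sketch, not the behaviour of process_uses_of_deleted_def itself: toy_use,
toy_def, has_real_use and try_delete_def are hypothetical names, and the
only thing modelled is "drop the leftover artificial uses when the def goes
away".

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy use: either a real use in an instruction or an artificial one
// (for example a use that only records liveness at a block boundary).
struct toy_use {
  int insn_id;
  bool is_artificial;
};

// Toy definition with its list of uses.
struct toy_def {
  int regno;
  std::vector<toy_use> uses;
  bool deleted;
};

// Return true if DEF still has at least one non-artificial use.
static bool has_real_use(const toy_def &def) {
  return std::any_of(def.uses.begin(), def.uses.end(),
                     [](const toy_use &u) { return !u.is_artificial; });
}

// Delete DEF if no real use is left, removing any remaining artificial uses
// at the same time so that nothing refers to the deleted definition.
// Return true if the definition was deleted.
static bool try_delete_def(toy_def &def) {
  if (has_real_use(def)) return false;
  def.uses.clear();
  def.deleted = true;
  return true;
}
```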


* Re: [PATCH 5/6] rtl-ssa: Calculate dominance frontiers for the exit block
  2023-10-24 10:50 ` [PATCH 5/6] rtl-ssa: Calculate dominance frontiers for the exit block Richard Sandiford
@ 2023-10-24 17:28   ` Jeff Law
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff Law @ 2023-10-24 17:28 UTC (permalink / raw)
  To: Richard Sandiford, jlaw, gcc-patches



On 10/24/23 04:50, Richard Sandiford wrote:
> The exit block can have multiple predecessors, for example if the
> function calls __builtin_eh_return.  We might then need PHI nodes
> for values that are live on exit.
> 
> RTL-SSA uses the normal dominance frontiers approach for calculating
> where PHI nodes are needed.  However, dominance.cc only calculates
> dominators for normal blocks, not the exit block.
> calculate_dominance_frontiers likewise only calculates dominance
> frontiers for normal blocks.
> 
> This patch fills in the “missing” frontiers manually.
> 
> gcc/
> 	* rtl-ssa/internals.h (build_info::exit_block_dominator): New
> 	member variable.
> 	* rtl-ssa/blocks.cc (build_info::build_info): Initialize it.
> 	(bb_walker::bb_walker): Use it, moving the computation of the
> 	dominator to...
> 	(function_info::process_all_blocks): ...here.
> 	(function_info::place_phis): Add dominance frontiers for the
> 	exit block.
OK
jeff
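
One way to fill in frontiers for an exit block by hand is the usual "walk
up the dominator tree from each predecessor" step of the dominance frontier
algorithm.  The sketch below assumes blocks are small integers, that idom[]
and depth[] describe the dominator tree of the normal blocks, and that the
entry block is its own immediate dominator.  The function names are
hypothetical and this is not the code in blocks.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Return the nearest common dominator of blocks A and B, given each block's
// immediate dominator and its depth in the dominator tree.
static int nearest_common_dominator(const std::vector<int> &idom,
                                    const std::vector<int> &depth,
                                    int a, int b) {
  while (a != b) {
    if (depth[a] < depth[b])
      b = idom[b];
    else
      a = idom[a];
  }
  return a;
}

// Treat the nearest common dominator of EXIT_PREDS as the dominator of
// EXIT_BLOCK, and add EXIT_BLOCK to the frontier of every block on the path
// from each predecessor up to (but not including) that dominator.
// Return the dominator that was used.
static int add_exit_frontiers(const std::vector<int> &idom,
                              const std::vector<int> &depth,
                              const std::vector<int> &exit_preds,
                              int exit_block,
                              std::vector<std::set<int>> &frontiers) {
  assert(!exit_preds.empty());
  int dom = exit_preds[0];
  for (int pred : exit_preds)
    dom = nearest_common_dominator(idom, depth, dom, pred);
  for (int pred : exit_preds)
    for (int runner = pred; runner != dom; runner = idom[runner])
      frontiers[static_cast<std::size_t>(runner)].insert(exit_block);
  return dom;
}
```

With a single predecessor the dominator is that predecessor and no frontier
entries are added, which matches the usual rule that only join points need
PHI nodes.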


* Re: [PATCH 6/6] rtl-ssa: Handle call clobbers in more places
  2023-10-24 10:50 ` [PATCH 6/6] rtl-ssa: Handle call clobbers in more places Richard Sandiford
@ 2023-10-24 17:37   ` Jeff Law
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff Law @ 2023-10-24 17:37 UTC (permalink / raw)
  To: Richard Sandiford, jlaw, gcc-patches



On 10/24/23 04:50, Richard Sandiford wrote:
> In order to save (a lot of) memory, RTL-SSA avoids creating
> individual clobber records for every call-clobbered register.
> It instead maintains a list & splay tree of calls in an EBB,
> grouped by ABI.
> 
> This patch takes these call clobbers into account in a couple
> more routines.  I don't think this will have any effect on
> existing users, since it's only necessary for hard registers.
> 
> gcc/
> 	* rtl-ssa/access-utils.h (next_call_clobbers): New function.
> 	(is_single_dominating_def, remains_available_on_exit): Replace with...
> 	* rtl-ssa/functions.h (function_info::is_single_dominating_def)
> 	(function_info::remains_available_on_exit): ...these new member
> 	functions.
> 	(function_info::m_clobbered_by_calls): New member variable.
> 	* rtl-ssa/functions.cc (function_info::function_info): Explicitly
> 	initialize m_clobbered_by_calls.
> 	* rtl-ssa/insns.cc (function_info::record_call_clobbers): Update
> 	m_clobbered_by_calls for each call-clobber note.
> 	* rtl-ssa/member-fns.inl (function_info::is_single_dominating_def):
> 	New function.  Check for call clobbers.
> 	* rtl-ssa/accesses.cc (function_info::remains_available_on_exit):
> 	Likewise.
OK
jeff
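
To make the "calls grouped by ABI" idea concrete, here is a small sketch of
how a query such as "does this hard register value survive to the end of the
EBB?" could consult per-ABI call records instead of per-register clobbers.
Everything here is an assumption for illustration: toy_abi_calls and
remains_available are hypothetical names, program points are plain integers,
a std::set stands in for the list and splay tree, and registers are limited
to 64 so that a single mask word is enough.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Calls in one EBB that share an ABI: the hard registers that the ABI
// clobbers, plus the ordered program points at which the calls occur.
struct toy_abi_calls {
  std::uint64_t clobbered_regs;  // bit R set if register R is clobbered
  std::set<int> call_points;     // ordered, stands in for a splay tree
};

// Return true if a value set in hard register REGNO at DEF_POINT is not
// clobbered by any later call in the EBB.
static bool remains_available(const std::vector<toy_abi_calls> &abis,
                              unsigned regno, int def_point) {
  assert(regno < 64);
  for (const toy_abi_calls &abi : abis) {
    // Skip ABIs that do not clobber this register at all.
    if (!(abi.clobbered_regs & (std::uint64_t{1} << regno))) continue;
    // Any call strictly after the definition clobbers the value.
    if (abi.call_points.upper_bound(def_point) != abi.call_points.end())
      return false;
  }
  return true;
}
```

The memory saving in this model comes from storing one mask per ABI and one
entry per call, rather than one record per call per clobbered register.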


