public inbox for gcc-patches@gcc.gnu.org
* [00/32] Support multiple ABIs in the same translation unit
@ 2019-09-11 19:02 Richard Sandiford
  2019-09-11 19:03 ` [02/32] Add a target hook for getting an ABI from a function type Richard Sandiford
                   ` (34 more replies)
  0 siblings, 35 replies; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:02 UTC (permalink / raw)
  To: gcc-patches

This series of patches introduces some classes and helpers for handling
multiple ABIs in the same translation unit.  At the moment "ABI" means
specifically the choice of call-clobbered registers, but I'm hoping the
structures could be used for other ABI properties in future.

The main point of the series is to use these ABI structures instead of
global information like regs_invalidated_by_call, call_used_or_fixed_regs
and targetm.hard_regno_call_part_clobbered.  This has the side effect
of making all passes take -fipa-ra into account (except sel-sched.c,
see its patch for details).

The series also makes -fipa-ra work for partially-clobbered registers too.
Previously, if the ABI said that only the upper bits of a register are
call-clobbered, we'd enforce that rule separately from the -fipa-ra
information and apply it even when -fipa-ra can prove that the registers
aren't modified.  It turns out that fixing this interacts badly with
vzeroupper on x86, so the series has a patch to fix that.

Another general knock-on change is that we now always use the equivalent
of regs_invalidated_by_call rather than call_used_reg_set when deciding
whether a register is clobbered.  Among other things, this means that
cselib no longer invalidates expressions involving the stack pointer
when processing a call, since calls are guaranteed to return with the
same stack pointer.

The main motivating case for the series is the AArch64 vector PCS
and the SVE PCS, which are variants of the base AArch64 ABI but are
interoperable with it.  (Specifically, vector PCS calls preserve the
low 128 bits of 16 vector registers rather than the usual low 64 bits
of 8 registers.  SVE PCS calls instead preserve the whole of those 16
vector registers.)  However, I realised later that we could also use
this for the tlsdesc ABI on SVE targets, which would remove the need
for CLOBBER_HIGH.  I have follow-on patches to do that.

I also think the new structures would be useful for targets that
implement interrupt-handler attributes.  At the moment, we compile
interrupt handlers pretty much like ordinary functions, using the
same optimisation heuristics as for ordinary functions, and then
account for the extra call-saved registers in the prologue and
epilogue code.  Hooks like TARGET_HARD_REGNO_SCRATCH_OK then
prevent later optimisers from introducing new uses of unprotected
call-saved registers.  If the interrupt handler ABI was described
directly, the middle-end code would work with it in the same way
as for ordinary functions, including taking it into account when
making optimisation decisions.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  There were
some extra guality.exp failures due to the extra optimisation,
but they look like acceptable regressions.

Also tested by compiling at least one target per CPU directory and
checking for no new warnings.  It's quite hard to test for accidental
code differences given the general optimisation changes above, so I
resorted to comparing the gcc.c-torture, gcc.dg and g++.dg code at
-O0 only.  This came back clean except on PRU.

The reason for the PRU differences is that the port defines
targetm.hard_regno_call_part_clobbered, but uses it to test whether
a multi-register value contains a mixture of fully-clobbered and
fully-preserved registers.  AFAICT the port doesn't actually have
individual registers that are partly clobbered, so it doesn't need
to define the hook.  (I can see how the documentation gave a misleading
impression though.  I've tried to improve it in one of the patches.)
The series moves away from testing hard_regno_call_part_clobbered
directly to testing cached information instead, and the way that the
cached information is calculated means that defining the hook the way
the PRU port does has no effect.  In other words, after the series we
treat it (rightly IMO) as having a "normal" ABI whereas before we didn't.

Sorry for the long write-up.

Richard


* [02/32] Add a target hook for getting an ABI from a function type
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
@ 2019-09-11 19:03 ` Richard Sandiford
  2019-09-29 20:52   ` Jeff Law
  2019-09-11 19:03 ` [01/32] Add function_abi.{h,cc} Richard Sandiford
                   ` (33 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:03 UTC (permalink / raw)
  To: gcc-patches

This patch adds a target hook that allows targets to return
the ABI associated with a particular function type.  Generally,
when multiple ABIs are in use, it must be possible to tell from
a function type and its attributes which ABI it is using.


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* target.def (fntype_abi): New target hook.
	* doc/tm.texi.in (TARGET_FNTYPE_ABI): Likewise.
	* doc/tm.texi: Regenerate.
	* target.h (predefined_function_abi): Declare.
	* function-abi.cc (fntype_abi): Call targetm.calls.fntype_abi,
	if defined.
	* config/aarch64/aarch64.h (ARM_PCS_SIMD): New arm_pcs value.
	* config/aarch64/aarch64.c: Include function-abi.h.
	(aarch64_simd_abi, aarch64_fntype_abi): New functions.
	(TARGET_FNTYPE_ABI): Define.

Index: gcc/target.def
===================================================================
--- gcc/target.def	2019-09-09 17:51:55.848574716 +0100
+++ gcc/target.def	2019-09-11 19:47:20.406290945 +0100
@@ -4892,6 +4892,15 @@ If this hook is not defined, then FUNCTI
  bool, (const unsigned int regno),
  default_function_value_regno_p)
 
+DEFHOOK
+(fntype_abi,
+ "Return the ABI used by a function with type @var{type}; see the\n\
+definition of @code{predefined_function_abi} for details of the ABI\n\
+descriptor.  Targets only need to define this hook if they support\n\
+interoperability between several ABIs in the same translation unit.",
+ const predefined_function_abi &, (const_tree type),
+ NULL)
+
 /* ??? Documenting this hook requires a GFDL license grant.  */
 DEFHOOK_UNDOC
 (internal_arg_pointer,
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in	2019-09-10 19:57:04.713041281 +0100
+++ gcc/doc/tm.texi.in	2019-09-11 19:47:20.402290974 +0100
@@ -1709,6 +1709,11 @@ must be defined.  Modern ports should de
 @cindex call-used register
 @cindex call-clobbered register
 @cindex call-saved register
+@hook TARGET_FNTYPE_ABI
+
+@cindex call-used register
+@cindex call-clobbered register
+@cindex call-saved register
 @hook TARGET_HARD_REGNO_CALL_PART_CLOBBERED
 
 @hook TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	2019-09-10 19:57:04.713041281 +0100
+++ gcc/doc/tm.texi	2019-09-11 19:47:20.402290974 +0100
@@ -1898,6 +1898,16 @@ must be defined.  Modern ports should de
 @cindex call-used register
 @cindex call-clobbered register
 @cindex call-saved register
+@deftypefn {Target Hook} {const predefined_function_abi &} TARGET_FNTYPE_ABI (const_tree @var{type})
+Return the ABI used by a function with type @var{type}; see the
+definition of @code{predefined_function_abi} for details of the ABI
+descriptor.  Targets only need to define this hook if they support
+interoperability between several ABIs in the same translation unit.
+@end deftypefn
+
+@cindex call-used register
+@cindex call-clobbered register
+@cindex call-saved register
 @deftypefn {Target Hook} bool TARGET_HARD_REGNO_CALL_PART_CLOBBERED (rtx_insn *@var{insn}, unsigned int @var{regno}, machine_mode @var{mode})
 This hook should return true if @var{regno} is partly call-saved and
 partly call-clobbered, and if a value of mode @var{mode} would be partly
Index: gcc/target.h
===================================================================
--- gcc/target.h	2019-08-20 09:52:11.022820825 +0100
+++ gcc/target.h	2019-09-11 19:47:20.406290945 +0100
@@ -152,6 +152,9 @@ struct noce_if_info;
 /* This is defined in calls.h.  */
 class function_arg_info;
 
+/* This is defined in function-abi.h.  */
+class predefined_function_abi;
+
 /* These are defined in tree-vect-stmts.c.  */
 extern tree stmt_vectype (class _stmt_vec_info *);
 extern bool stmt_in_inner_loop_p (class _stmt_vec_info *);
Index: gcc/function-abi.cc
===================================================================
--- gcc/function-abi.cc	2019-09-11 19:47:07.490381964 +0100
+++ gcc/function-abi.cc	2019-09-11 19:47:20.402290974 +0100
@@ -132,6 +132,8 @@ const predefined_function_abi &
 fntype_abi (const_tree type)
 {
   gcc_assert (FUNC_OR_METHOD_TYPE_P (type));
+  if (targetm.calls.fntype_abi)
+    return targetm.calls.fntype_abi (type);
   return default_function_abi;
 }
 
Index: gcc/config/aarch64/aarch64.h
===================================================================
--- gcc/config/aarch64/aarch64.h	2019-09-05 08:49:31.193737018 +0100
+++ gcc/config/aarch64/aarch64.h	2019-09-11 19:47:20.398291002 +0100
@@ -783,6 +783,7 @@ #define TARGET_ILP32	(aarch64_abi & AARC
 enum arm_pcs
 {
   ARM_PCS_AAPCS64,		/* Base standard AAPCS for 64 bit.  */
+  ARM_PCS_SIMD,			/* For aarch64_vector_pcs functions.  */
   ARM_PCS_UNKNOWN
 };
 
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	2019-09-11 19:47:00.322432478 +0100
+++ gcc/config/aarch64/aarch64.c	2019-09-11 19:47:20.398291002 +0100
@@ -74,6 +74,7 @@ #define INCLUDE_STRING
 #include "rtx-vector-builder.h"
 #include "intl.h"
 #include "expmed.h"
+#include "function-abi.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -1365,6 +1366,24 @@ #define CASE(UPPER, LOWER, VALUE) case A
   gcc_unreachable ();
 }
 
+/* Return the descriptor of the SIMD ABI.  */
+
+static const predefined_function_abi &
+aarch64_simd_abi (void)
+{
+  predefined_function_abi &simd_abi = function_abis[ARM_PCS_SIMD];
+  if (!simd_abi.initialized_p ())
+    {
+      HARD_REG_SET full_reg_clobbers
+	= default_function_abi.full_reg_clobbers ();
+      for (int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
+	if (FP_SIMD_SAVED_REGNUM_P (regno))
+	  CLEAR_HARD_REG_BIT (full_reg_clobbers, regno);
+      simd_abi.initialize (ARM_PCS_SIMD, full_reg_clobbers);
+    }
+  return simd_abi;
+}
+
 /* Generate code to enable conditional branches in functions over 1 MiB.  */
 const char *
 aarch64_gen_far_branch (rtx * operands, int pos_label, const char * dest,
@@ -1810,6 +1829,16 @@ aarch64_hard_regno_mode_ok (unsigned reg
   return false;
 }
 
+/* Implement TARGET_FNTYPE_ABI.  */
+
+static const predefined_function_abi &
+aarch64_fntype_abi (const_tree fntype)
+{
+  if (lookup_attribute ("aarch64_vector_pcs", TYPE_ATTRIBUTES (fntype)))
+    return aarch64_simd_abi ();
+  return default_function_abi;
+}
+
 /* Return true if this is a definition of a vectorized simd function.  */
 
 static bool
@@ -20826,6 +20855,9 @@ #define TARGET_COMP_TYPE_ATTRIBUTES aarc
 #undef TARGET_GET_MULTILIB_ABI_NAME
 #define TARGET_GET_MULTILIB_ABI_NAME aarch64_get_multilib_abi_name
 
+#undef TARGET_FNTYPE_ABI
+#define TARGET_FNTYPE_ABI aarch64_fntype_abi
+
 #if CHECKING_P
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::aarch64_run_selftests


* [01/32] Add function_abi.{h,cc}
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
  2019-09-11 19:03 ` [02/32] Add a target hook for getting an ABI from a function type Richard Sandiford
@ 2019-09-11 19:03 ` Richard Sandiford
  2019-09-29 20:51   ` Jeff Law
  2019-09-11 19:04 ` [03/32] Add a function for getting the ABI of a call insn target Richard Sandiford
                   ` (32 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:03 UTC (permalink / raw)
  To: gcc-patches

This patch adds new structures and functions for handling
multiple ABIs in a translation unit.  The structures are:

- predefined_function_abi: describes a static, predefined ABI
- function_abi: describes either a predefined ABI or a local
  variant of one (e.g. taking -fipa-ra into account)

The patch adds functions for getting the ABI from a given type
or decl; a later patch will also add a function for getting the
ABI of the target of a call insn.

Although ABIs are about much more than call-clobber/saved choices,
I wanted to keep the name general in case we add more ABI-related
information in future.


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* Makefile.in (OBJS): Add function-abi.o.
	(GTFILES): Add function-abi.h.
	* function-abi.cc: New file.
	* function-abi.h: Likewise.
	* emit-rtl.h (rtl_data::abi): New field.
	* function.c: Include function-abi.h.
	(prepare_function_start): Initialize crtl->abi.
	* read-rtl-function.c: Include regs.h and function-abi.h.
	(read_rtl_function_body): Initialize crtl->abi.
	(read_rtl_function_body_from_file_range): Likewise.
	* reginfo.c: Include function-abi.h.
	(init_reg_sets_1): Initialize default_function_abi.
	(globalize_reg): Call add_full_reg_clobber for each predefined ABI
	when making a register global.
	* target-globals.h (this_target_function_abi_info): Declare.
	(target_globals::function_abi_info): New field.
	(restore_target_globals): Copy it.
	* target-globals.c: Include function-abi.h.
	(default_target_globals): Initialize the function_abi_info field.
	(target_globals): Allocate it.
	(save_target_globals): Free it.

Index: gcc/Makefile.in
===================================================================
--- gcc/Makefile.in	2019-09-09 17:51:55.832574829 +0100
+++ gcc/Makefile.in	2019-09-11 19:47:07.486381992 +0100
@@ -1306,6 +1306,7 @@ OBJS = \
 	fold-const.o \
 	fold-const-call.o \
 	function.o \
+	function-abi.o \
 	function-tests.o \
 	fwprop.o \
 	gcc-rich-location.o \
@@ -2522,6 +2523,7 @@ GTFILES = $(CPPLIB_H) $(srcdir)/input.h
   $(srcdir)/libfuncs.h $(SYMTAB_H) \
   $(srcdir)/real.h $(srcdir)/function.h $(srcdir)/insn-addr.h $(srcdir)/hwint.h \
   $(srcdir)/fixed-value.h \
+  $(srcdir)/function-abi.h \
   $(srcdir)/output.h $(srcdir)/cfgloop.h $(srcdir)/cfg.h $(srcdir)/profile-count.h \
   $(srcdir)/cselib.h $(srcdir)/basic-block.h  $(srcdir)/ipa-ref.h $(srcdir)/cgraph.h \
   $(srcdir)/reload.h $(srcdir)/caller-save.c $(srcdir)/symtab.c \
Index: gcc/function-abi.cc
===================================================================
--- /dev/null	2019-07-30 08:53:31.317691683 +0100
+++ gcc/function-abi.cc	2019-09-11 19:47:07.490381964 +0100
@@ -0,0 +1,145 @@
+/* Information about function binary interfaces.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "target.h"
+#include "rtl.h"
+#include "tree.h"
+#include "regs.h"
+#include "function-abi.h"
+#include "varasm.h"
+#include "cgraph.h"
+
+target_function_abi_info default_target_function_abi_info;
+#if SWITCHABLE_TARGET
+target_function_abi_info *this_target_function_abi_info
+  = &default_target_function_abi_info;
+#endif
+
+/* Initialize a predefined function ABI with the given values of
+   ID and FULL_REG_CLOBBERS.  */
+
+void
+predefined_function_abi::initialize (unsigned int id,
+				     const_hard_reg_set full_reg_clobbers)
+{
+  m_id = id;
+  m_initialized = true;
+  m_full_reg_clobbers = full_reg_clobbers;
+
+  /* Set up the value of m_full_and_partial_reg_clobbers.
+
+     If the ABI specifies that part of a hard register R is call-clobbered,
+     we should be able to find a single-register mode M for which
+     targetm.hard_regno_call_part_clobbered (NULL, R, M) is true.
+     In other words, it shouldn't be the case that R can hold all
+     single-register modes across a call, but can't hold part of
+     a multi-register mode.
+
+     If that assumption doesn't hold for a future target, we would need
+     to change the interface of TARGET_HARD_REGNO_CALL_PART_CLOBBERED so
+     that it tells us which registers in a multi-register value are
+     actually clobbered.  */
+  m_full_and_partial_reg_clobbers = full_reg_clobbers;
+  for (unsigned int i = 0; i < NUM_MACHINE_MODES; ++i)
+    {
+      machine_mode mode = (machine_mode) i;
+      for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
+	if (targetm.hard_regno_mode_ok (regno, mode)
+	    && hard_regno_nregs (regno, mode) == 1
+	    && targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
+	  SET_HARD_REG_BIT (m_full_and_partial_reg_clobbers, regno);
+    }
+
+  /* For each mode MODE, work out which registers are unable to hold
+     any part of a MODE value across a call, i.e. those for which no
+     overlapping call-preserved (reg:MODE REGNO) exists.
+
+     We assume that this can be flipped around to say that a call
+     preserves (reg:MODE REGNO) unless the register overlaps this set.
+     The usual reason for this being true is that if (reg:MODE REGNO)
+     contains a part-clobbered register, that register would be
+     part-clobbered regardless of which part of MODE it holds.
+     For example, if (reg:M 2) occupies two registers and if the
+     register 3 portion of it is part-clobbered, (reg:M 3) is usually
+     either invalid or also part-clobbered.  */
+  for (unsigned int i = 0; i < NUM_MACHINE_MODES; ++i)
+    {
+      machine_mode mode = (machine_mode) i;
+      m_mode_clobbers[i] = m_full_and_partial_reg_clobbers;
+      for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
+	if (targetm.hard_regno_mode_ok (regno, mode)
+	    && !overlaps_hard_reg_set_p (m_full_reg_clobbers, mode, regno)
+	    && !targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
+	  remove_from_hard_reg_set (&m_mode_clobbers[i], mode, regno);
+    }
+
+  /* Check that the assumptions above actually hold, i.e. that testing
+     for single-register modes makes sense, and that overlap tests for
+     mode_clobbers work as expected.  */
+  if (flag_checking)
+    for (unsigned int i = 0; i < NUM_MACHINE_MODES; ++i)
+      {
+	machine_mode mode = (machine_mode) i;
+	const_hard_reg_set all_clobbers = m_full_and_partial_reg_clobbers;
+	for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
+	  if (targetm.hard_regno_mode_ok (regno, mode)
+	      && !overlaps_hard_reg_set_p (m_full_reg_clobbers, mode, regno)
+	      && targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
+	    gcc_assert (overlaps_hard_reg_set_p (all_clobbers, mode, regno)
+			&& overlaps_hard_reg_set_p (m_mode_clobbers[i],
+						    mode, regno));
+      }
+}
+
+/* If the ABI has been initialized, add REGNO to the set of registers
+   that can be completely altered by a call.  */
+
+void
+predefined_function_abi::add_full_reg_clobber (unsigned int regno)
+{
+  if (!m_initialized)
+    return;
+
+  SET_HARD_REG_BIT (m_full_reg_clobbers, regno);
+  SET_HARD_REG_BIT (m_full_and_partial_reg_clobbers, regno);
+  for (unsigned int i = 0; i < NUM_MACHINE_MODES; ++i)
+    SET_HARD_REG_BIT (m_mode_clobbers[i], regno);
+}
+
+/* Return the predefined ABI used by functions with type TYPE.  */
+
+const predefined_function_abi &
+fntype_abi (const_tree type)
+{
+  gcc_assert (FUNC_OR_METHOD_TYPE_P (type));
+  return default_function_abi;
+}
+
+/* Return the ABI of function decl FNDECL.  */
+
+function_abi
+fndecl_abi (const_tree fndecl)
+{
+  gcc_assert (TREE_CODE (fndecl) == FUNCTION_DECL);
+  return fntype_abi (TREE_TYPE (fndecl));
+}
Index: gcc/function-abi.h
===================================================================
--- /dev/null	2019-07-30 08:53:31.317691683 +0100
+++ gcc/function-abi.h	2019-09-11 19:47:07.490381964 +0100
@@ -0,0 +1,269 @@
+/* Information about function binary interfaces.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_FUNCTION_ABI_H
+#define GCC_FUNCTION_ABI_H
+
+/* Most targets use the same ABI for all functions in a translation
+   unit, but some targets support interoperability between several ABIs.
+   Each such ABI has a unique 0-based identifier, with 0 always being
+   the default choice of ABI.
+
+   NUM_ABI_IDS is the maximum number of such ABIs that GCC can handle at once.
+   A bitfield with this number of bits can represent any combination of the
+   supported ABIs.  */
+const size_t NUM_ABI_IDS = 8;
+
+/* Information about one of the target's predefined ABIs.  */
+class predefined_function_abi
+{
+public:
+  /* A target-specific identifier for this ABI.  The value must be in
+     the range [0, NUM_ABI_IDS - 1].  */
+  unsigned int id () const { return m_id; }
+
+  /* True if this ABI has been initialized.  */
+  bool initialized_p () const { return m_initialized; }
+
+  /* Return true if a function call is allowed to alter every bit of
+     register REGNO, so that the register contains an arbitrary value
+     on return.  If so, the register cannot hold any part of a value
+     that is live across a call.  */
+  bool
+  clobbers_full_reg_p (unsigned int regno) const
+  {
+    return TEST_HARD_REG_BIT (m_full_reg_clobbers, regno);
+  }
+
+  /* Return true if a function call is allowed to alter some or all bits
+     of register REGNO.
+
+     This is true whenever clobbers_full_reg_p (REGNO) is true.  It is
+     also true if, for example, the ABI says that a call must preserve the
+     low 32 or 64 bits of REGNO, but can clobber the upper bits of REGNO.
+     In the latter case, it is possible for REGNO to hold values that
+     are live across a call, provided that the value occupies only the
+     call-preserved part of the register.  */
+  bool
+  clobbers_at_least_part_of_reg_p (unsigned int regno) const
+  {
+    return TEST_HARD_REG_BIT (m_full_and_partial_reg_clobbers, regno);
+  }
+
+  /* Return true if a function call is allowed to clobber at least part
+     of (reg:MODE REGNO).  If so, it is not possible for the register
+     as a whole to be live across a call.  */
+  bool
+  clobbers_reg_p (machine_mode mode, unsigned int regno) const
+  {
+    return overlaps_hard_reg_set_p (m_mode_clobbers[mode], mode, regno);
+  }
+
+  /* Return the set of registers that a function call is allowed to
+     alter completely, so that the registers contain arbitrary values
+     on return.  This doesn't include registers that a call can only
+     partly clobber (as per TARGET_HARD_REGNO_CALL_PART_CLOBBERED).
+
+     These registers cannot hold any part of a value that is live across
+     a call.  */
+  HARD_REG_SET full_reg_clobbers () const { return m_full_reg_clobbers; }
+
+  /* Return the set of registers that a function call is allowed to alter
+     to some degree.  For example, if an ABI says that a call must preserve
+     the low 32 or 64 bits of a register R, but can clobber the upper bits
+     of R, R would be in this set but not in full_reg_clobbers ().
+
+     This set is a superset of full_reg_clobbers ().  It is possible for a
+     register in full_and_partial_reg_clobbers () & ~full_reg_clobbers ()
+     to contain values that are live across a call, provided that the live
+     value only occupies the call-preserved part of the register.  */
+  HARD_REG_SET
+  full_and_partial_reg_clobbers () const
+  {
+    return m_full_and_partial_reg_clobbers;
+  }
+
+  /* Return the set of registers that cannot be used to hold a value of
+     mode MODE across a function call.  That is:
+
+       (reg:MODE REGNO)
+
+     might be clobbered by a call whenever:
+
+       overlaps_hard_reg_set_p (mode_clobbers (MODE), MODE, REGNO)
+
+     In allocation terms, the registers in the returned set conflict
+     with any value of mode MODE that is live across a call.  */
+  HARD_REG_SET
+  mode_clobbers (machine_mode mode) const
+  {
+    return m_mode_clobbers[mode];
+  }
+
+  void initialize (unsigned int, const_hard_reg_set);
+  void add_full_reg_clobber (unsigned int);
+
+private:
+  unsigned int m_id : NUM_ABI_IDS;
+  unsigned int m_initialized : 1;
+  HARD_REG_SET m_full_reg_clobbers;
+  HARD_REG_SET m_full_and_partial_reg_clobbers;
+  HARD_REG_SET m_mode_clobbers[NUM_MACHINE_MODES];
+};
+
+/* Describes either a predefined ABI or the ABI of a particular function.
+   In the latter case, the ABI might make use of extra function-specific
+   information, such as for -fipa-ra.  */
+class function_abi
+{
+public:
+  /* Initialize the structure for a general function with the given ABI.  */
+  function_abi (const predefined_function_abi &base_abi)
+    : m_base_abi (&base_abi),
+      m_mask (base_abi.full_and_partial_reg_clobbers ()) {}
+
+  /* Initialize the structure for a function that has the given ABI and
+     that is known not to clobber registers outside MASK.  */
+  function_abi (const predefined_function_abi &base_abi,
+		const_hard_reg_set mask)
+    : m_base_abi (&base_abi), m_mask (mask) {}
+
+  /* The predefined ABI from which this ABI is derived.  */
+  const predefined_function_abi &base_abi () const { return *m_base_abi; }
+
+  /* The target-specific identifier of the predefined ABI.  */
+  unsigned int id () const { return m_base_abi->id (); }
+
+  /* See the corresponding predefined_function_abi functions for
+     details about the following functions.  */
+
+  HARD_REG_SET
+  full_reg_clobbers () const
+  {
+    return m_mask & m_base_abi->full_reg_clobbers ();
+  }
+
+  HARD_REG_SET
+  full_and_partial_reg_clobbers () const
+  {
+    return m_mask & m_base_abi->full_and_partial_reg_clobbers ();
+  }
+
+  HARD_REG_SET
+  mode_clobbers (machine_mode mode) const
+  {
+    return m_mask & m_base_abi->mode_clobbers (mode);
+  }
+
+  bool
+  clobbers_full_reg_p (unsigned int regno) const
+  {
+    return (TEST_HARD_REG_BIT (m_mask, regno)
+	    && m_base_abi->clobbers_full_reg_p (regno));
+  }
+
+  bool
+  clobbers_at_least_part_of_reg_p (unsigned int regno) const
+  {
+    return (TEST_HARD_REG_BIT (m_mask, regno)
+	    && m_base_abi->clobbers_at_least_part_of_reg_p (regno));
+  }
+
+  bool
+  clobbers_reg_p (machine_mode mode, unsigned int regno) const
+  {
+    return overlaps_hard_reg_set_p (mode_clobbers (mode), mode, regno);
+  }
+
+  bool
+  operator== (const function_abi &other) const
+  {
+    return m_base_abi == other.m_base_abi && m_mask == other.m_mask;
+  }
+
+  bool
+  operator!= (const function_abi &other) const
+  {
+    return !operator== (other);
+  }
+
+protected:
+  const predefined_function_abi *m_base_abi;
+  HARD_REG_SET m_mask;
+};
+
+struct target_function_abi_info
+{
+  /* An array of all the target ABIs that are available in this
+     translation unit.  Not all entries are used for all targets,
+     but the structures are relatively small, and using a fixed-size
+     array avoids extra indirection.
+
+     There are various ways of getting an ABI descriptor:
+
+     * fndecl_abi (FNDECL) is the ABI of function FNDECL.
+
+     * fntype_abi (FNTYPE) is the ABI of a function with type FNTYPE.
+
+     * crtl->abi is the ABI of the function that we are currently
+       compiling to rtl.
+
+     * eh_edge_abi is the "ABI" used when taking an EH edge from an
+       exception-throwing statement to an exception handler.  Catching
+       exceptions from calls can be treated as an abnormal return from
+       those calls, and this ABI therefore describes the ABI of functions
+       on such an abnormal return.  Statements that throw non-call
+       exceptions can be treated as being implicitly wrapped in a call
+       that has such an abnormal return.
+
+       At present, no target needs to support more than one EH ABI.
+
+     * function_abis[N] is the ABI with identifier N.  This can be useful
+       when referring back to ABIs that have been collected by number in
+       a bitmask, such as after walking function calls in a particular
+       region of code.
+
+     * default_function_abi refers specifically to the target's default
+       choice of ABI, regardless of which (if any) functions actually
+       use it.  This ABI and data derived from it do *not* provide
+       globally conservatively-correct information, so it is only
+       useful in very specific circumstances.  */
+  predefined_function_abi x_function_abis[NUM_ABI_IDS];
+};
+
+extern target_function_abi_info default_target_function_abi_info;
+#if SWITCHABLE_TARGET
+extern target_function_abi_info *this_target_function_abi_info;
+#else
+#define this_target_function_abi_info (&default_target_function_abi_info)
+#endif
+
+/* See the comment above x_function_abis for when these macros should be used.
+   At present, eh_edge_abi is always the default ABI, but that could change
+   in future if a target needs it to.  */
+#define function_abis \
+  (this_target_function_abi_info->x_function_abis)
+#define default_function_abi \
+  (this_target_function_abi_info->x_function_abis[0])
+#define eh_edge_abi default_function_abi
+
+extern const predefined_function_abi &fntype_abi (const_tree);
+extern function_abi fndecl_abi (const_tree);
+
+#endif
Index: gcc/emit-rtl.h
===================================================================
--- gcc/emit-rtl.h	2019-07-10 19:41:26.395898027 +0100
+++ gcc/emit-rtl.h	2019-09-11 19:47:07.490381964 +0100
@@ -22,6 +22,7 @@ #define GCC_EMIT_RTL_H
 
 class temp_slot;
 typedef class temp_slot *temp_slot_p;
+class predefined_function_abi;
 
 /* Information mainlined about RTL representation of incoming arguments.  */
 struct GTY(()) incoming_args {
@@ -64,6 +65,14 @@ struct GTY(()) rtl_data {
   struct function_subsections subsections;
   struct rtl_eh eh;
 
+  /* The ABI of the function, i.e. the interface it presents to its callers.
+     This is the ABI that should be queried to see which registers the
+     function needs to save before it uses them.
+
+     Other functions (including those called by this function) might use
+     different ABIs.  */
+  const predefined_function_abi *GTY((skip)) abi;
+
   /* For function.c  */
 
   /* # of bytes of outgoing arguments.  If ACCUMULATE_OUTGOING_ARGS is
Index: gcc/function.c
===================================================================
--- gcc/function.c	2019-09-10 19:56:45.353177919 +0100
+++ gcc/function.c	2019-09-11 19:47:07.490381964 +0100
@@ -79,6 +79,7 @@ Software Foundation; either version 3, o
 #include "attribs.h"
 #include "gimple.h"
 #include "options.h"
+#include "function-abi.h"
 
 /* So we can assign to cfun in this file.  */
 #undef cfun
@@ -4827,6 +4828,12 @@ push_struct_function (tree fndecl)
 prepare_function_start (void)
 {
   gcc_assert (!get_last_insn ());
+
+  if (in_dummy_function)
+    crtl->abi = &default_function_abi;
+  else
+    crtl->abi = &fndecl_abi (cfun->decl).base_abi ();
+
   init_temp_slots ();
   init_emit ();
   init_varasm_status ();
Index: gcc/read-rtl-function.c
===================================================================
--- gcc/read-rtl-function.c	2019-07-10 19:41:21.619936277 +0100
+++ gcc/read-rtl-function.c	2019-09-11 19:47:07.490381964 +0100
@@ -41,6 +41,8 @@ Software Foundation; either version 3, o
 #include "read-rtl-function.h"
 #include "selftest.h"
 #include "selftest-rtl.h"
+#include "regs.h"
+#include "function-abi.h"
 
 /* Forward decls.  */
 class function_reader;
@@ -1611,6 +1613,7 @@ function_reader::get_insn_by_uid (int ui
 read_rtl_function_body (const char *path)
 {
   initialize_rtl ();
+  crtl->abi = &default_function_abi;
   init_emit ();
   init_varasm_status ();
 
@@ -1644,6 +1647,7 @@ read_rtl_function_body_from_file_range (
     }
 
   initialize_rtl ();
+  crtl->abi = &fndecl_abi (cfun->decl).base_abi ();
   init_emit ();
   init_varasm_status ();
 
Index: gcc/reginfo.c
===================================================================
--- gcc/reginfo.c	2019-09-10 19:57:04.713041281 +0100
+++ gcc/reginfo.c	2019-09-11 19:47:07.490381964 +0100
@@ -43,6 +43,7 @@ Software Foundation; either version 3, o
 #include "reload.h"
 #include "output.h"
 #include "tree-pass.h"
+#include "function-abi.h"
 
 /* Maximum register number used in this function, plus one.  */
 
@@ -419,6 +420,8 @@ init_reg_sets_1 (void)
 	       }
 	  }
      }
+
+  default_function_abi.initialize (0, regs_invalidated_by_call);
 }
 
 /* Compute the table of register modes.
@@ -728,7 +731,11 @@ globalize_reg (tree decl, int i)
      appropriate regs_invalidated_by_call bit, even if it's already
      set in fixed_regs.  */
   if (i != STACK_POINTER_REGNUM)
-    SET_HARD_REG_BIT (regs_invalidated_by_call, i);
+    {
+      SET_HARD_REG_BIT (regs_invalidated_by_call, i);
+      for (unsigned int j = 0; j < NUM_ABI_IDS; ++j)
+	function_abis[j].add_full_reg_clobber (i);
+    }
 
   /* If already fixed, nothing else to do.  */
   if (fixed_regs[i])
Index: gcc/target-globals.h
===================================================================
--- gcc/target-globals.h	2019-07-10 19:41:21.635936149 +0100
+++ gcc/target-globals.h	2019-09-11 19:47:07.490381964 +0100
@@ -26,6 +26,7 @@ #define TARGET_GLOBALS_H 1
 extern struct target_rtl *this_target_rtl;
 extern struct target_recog *this_target_recog;
 extern struct target_hard_regs *this_target_hard_regs;
+extern struct target_function_abi_info *this_target_function_abi_info;
 extern struct target_reload *this_target_reload;
 extern struct target_expmed *this_target_expmed;
 extern struct target_optabs *this_target_optabs;
@@ -48,6 +49,7 @@ class GTY(()) target_globals {
   struct target_rtl *rtl;
   struct target_recog *GTY((skip)) recog;
   struct target_hard_regs *GTY((skip)) hard_regs;
+  struct target_function_abi_info *GTY((skip)) function_abi_info;
   struct target_reload *GTY((skip)) reload;
   struct target_expmed *GTY((skip)) expmed;
   struct target_optabs *GTY((skip)) optabs;
@@ -75,6 +77,7 @@ restore_target_globals (class target_glo
   this_target_rtl = g->rtl;
   this_target_recog = g->recog;
   this_target_hard_regs = g->hard_regs;
+  this_target_function_abi_info = g->function_abi_info;
   this_target_reload = g->reload;
   this_target_expmed = g->expmed;
   this_target_optabs = g->optabs;
Index: gcc/target-globals.c
===================================================================
--- gcc/target-globals.c	2019-07-10 19:41:21.635936149 +0100
+++ gcc/target-globals.c	2019-09-11 19:47:07.490381964 +0100
@@ -40,6 +40,7 @@ Software Foundation; either version 3, o
 #include "gcse.h"
 #include "bb-reorder.h"
 #include "lower-subreg.h"
+#include "function-abi.h"
 
 #if SWITCHABLE_TARGET
 class target_globals default_target_globals = {
@@ -48,6 +49,7 @@ class target_globals default_target_glob
   &default_target_rtl,
   &default_target_recog,
   &default_target_hard_regs,
+  &default_target_function_abi_info,
   &default_target_reload,
   &default_target_expmed,
   &default_target_optabs,
@@ -70,6 +72,7 @@ save_target_globals (void)
   g->rtl = ggc_cleared_alloc<target_rtl> ();
   g->recog = XCNEW (struct target_recog);
   g->hard_regs = XCNEW (struct target_hard_regs);
+  g->function_abi_info = XCNEW (struct target_function_abi_info);
   g->reload = XCNEW (struct target_reload);
   g->expmed = XCNEW (struct target_expmed);
   g->optabs = XCNEW (struct target_optabs);
@@ -127,6 +130,7 @@ target_globals::~target_globals ()
       XDELETE (regs);
       XDELETE (recog);
       XDELETE (hard_regs);
+      XDELETE (function_abi_info);
       XDELETE (reload);
       XDELETE (expmed);
       XDELETE (optabs);


* [03/32] Add a function for getting the ABI of a call insn target
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
  2019-09-11 19:03 ` [02/32] Add a target hook for getting an ABI from a function type Richard Sandiford
  2019-09-11 19:03 ` [01/32] Add function_abi.{h,cc} Richard Sandiford
@ 2019-09-11 19:04 ` Richard Sandiford
  2019-09-25 15:38   ` Richard Sandiford
  2019-09-11 19:05 ` [05/32] Pass an ABI identifier to hard_regno_call_part_clobbered Richard Sandiford
                   ` (31 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:04 UTC (permalink / raw)
  To: gcc-patches

This patch replaces get_call_reg_set_usage with call_insn_abi,
which returns the ABI of the target of a call insn.  The ABI's
full_reg_clobbers corresponds to regs_invalidated_by_call,
whereas many callers instead passed call_used_or_fixed_regs, i.e.:

  (regs_invalidated_by_call | fixed_reg_set)

The patch slavishly preserves the "| fixed_reg_set" for these callers;
later patches will clean this up.


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* target.def (call_insn_abi): New hook.
	(remove_extra_call_preserved_regs): Delete.
	* doc/tm.texi.in (TARGET_CALL_INSN_ABI): New macro.
	(TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete.
	* doc/tm.texi: Regenerate.
	* targhooks.h (default_remove_extra_call_preserved_regs): Delete.
	* targhooks.c (default_remove_extra_call_preserved_regs): Delete.
	* config/aarch64/aarch64.c (aarch64_simd_call_p): Constify the
	insn argument.
	(aarch64_remove_extra_call_preserved_regs): Delete.
	(aarch64_call_insn_abi): New function.
	(TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete.
	(TARGET_CALL_INSN_ABI): New macro.
	* rtl.h (get_call_fndecl): Declare.
	(cgraph_rtl_info): Fix formatting.  Tweak comment for
	function_used_regs.  Remove function_used_regs_valid.
	* rtlanal.c (get_call_fndecl): Moved from final.c.
	* function-abi.h (call_insn_abi): Declare.
	(target_function_abi_info): Mention call_insn_abi.
	* function-abi.cc (fndecl_abi): Handle flag_ipa_ra in a similar
	way to get_call_reg_set_usage did.
	(call_insn_abi): New function.
	* regs.h (get_call_reg_set_usage): Delete.
	* final.c: Include function-abi.h.
	(collect_fn_hard_reg_usage): Add fixed and stack registers to
	function_used_regs before the main loop rather than afterwards.
	Use call_insn_abi instead of get_call_reg_set_usage.  Exit early
	if function_used_regs ends up not being useful.
	(get_call_fndecl): Moved to rtlanal.c.
	(get_call_cgraph_rtl_info, get_call_reg_set_usage): Delete.
	* caller-save.c: Include function-abi.h.
	(setup_save_areas, save_call_clobbered_regs): Use call_insn_abi
	instead of get_call_reg_set_usage.
	* cfgcleanup.c: Include function-abi.h.
	(old_insns_match_p): Use call_insn_abi instead of
	get_call_reg_set_usage.
	* cgraph.h (cgraph_node::rtl_info): Take a const_tree instead of
	a tree.
	* cgraph.c (cgraph_node::rtl_info): Likewise.  Initialize
	function_used_regs.
	* df-scan.c: Include function-abi.h.
	(df_get_call_refs): Use call_insn_abi instead of
	get_call_reg_set_usage.
	* ira-lives.c: Include function-abi.h.
	(process_bb_node_lives): Use call_insn_abi instead of
	get_call_reg_set_usage.
	* lra-lives.c: Include function-abi.h.
	(process_bb_lives): Use call_insn_abi instead of
	get_call_reg_set_usage.
	* postreload.c: Include function-abi.h.
	(reload_combine): Use call_insn_abi instead of get_call_reg_set_usage.
	* regcprop.c: Include function-abi.h.
	(copyprop_hardreg_forward_1): Use call_insn_abi instead of
	get_call_reg_set_usage.
	* resource.c: Include function-abi.h.
	(mark_set_resources, mark_target_live_regs): Use call_insn_abi
	instead of get_call_reg_set_usage.
	* var-tracking.c: Include function-abi.h.
	(dataflow_set_clear_at_call): Use call_insn_abi
	instead of get_call_reg_set_usage.

Index: gcc/target.def
===================================================================
--- gcc/target.def	2019-09-11 19:47:20.406290945 +0100
+++ gcc/target.def	2019-09-11 19:47:24.422262645 +0100
@@ -4901,6 +4901,19 @@ interoperability between several ABIs in
  const predefined_function_abi &, (const_tree type),
  NULL)
 
+DEFHOOK
+(call_insn_abi,
+ "This hook returns a description of the ABI used by the target of\n\
+call instruction @var{insn}; see the definition of\n\
+@code{predefined_function_abi} for details of the ABI descriptor.\n\
+Only the global function @code{call_insn_abi} should call this hook\n\
+directly.\n\
+\n\
+Targets only need to define this hook if they support\n\
+interoperability between several ABIs in the same translation unit.",
+ const predefined_function_abi &, (const rtx_insn *insn),
+ NULL)
+
 /* ??? Documenting this hook requires a GFDL license grant.  */
 DEFHOOK_UNDOC
 (internal_arg_pointer,
@@ -5783,20 +5796,6 @@ DEFHOOK
  const char *, (void),
  hook_constcharptr_void_null)
 
-DEFHOOK
-(remove_extra_call_preserved_regs,
- "This hook removes registers from the set of call-clobbered registers\n\
- in @var{used_regs} if, contrary to the default rules, something guarantees\n\
- that @samp{insn} preserves those registers.  For example, some targets\n\
- support variant ABIs in which functions preserve more registers than\n\
- normal functions would.  Removing those extra registers from @var{used_regs}\n\
- can lead to better register allocation.\n\
- \n\
- The default implementation does nothing, which is always safe.\n\
- Defining the hook is purely an optimization.",
- void, (rtx_insn *insn, HARD_REG_SET *used_regs),
- default_remove_extra_call_preserved_regs)
-
 /* Return the smallest number of different values for which it is best to
    use a jump-table instead of a tree of conditional branches.  */
 DEFHOOK
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in	2019-09-11 19:47:20.402290974 +0100
+++ gcc/doc/tm.texi.in	2019-09-11 19:47:24.414262702 +0100
@@ -1711,13 +1711,13 @@ must be defined.  Modern ports should de
 @cindex call-saved register
 @hook TARGET_FNTYPE_ABI
 
+@hook TARGET_CALL_INSN_ABI
+
 @cindex call-used register
 @cindex call-clobbered register
 @cindex call-saved register
 @hook TARGET_HARD_REGNO_CALL_PART_CLOBBERED
 
-@hook TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS
-
 @hook TARGET_RETURN_CALL_WITH_MAX_CLOBBERS
 
 @hook TARGET_GET_MULTILIB_ABI_NAME
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	2019-09-11 19:47:20.402290974 +0100
+++ gcc/doc/tm.texi	2019-09-11 19:47:24.414262702 +0100
@@ -1905,6 +1905,17 @@ descriptor.  Targets only need to define
 interoperability between several ABIs in the same translation unit.
 @end deftypefn
 
+@deftypefn {Target Hook} {const predefined_function_abi &} TARGET_CALL_INSN_ABI (const rtx_insn *@var{insn})
+This hook returns a description of the ABI used by the target of
+call instruction @var{insn}; see the definition of
+@code{predefined_function_abi} for details of the ABI descriptor.
+Only the global function @code{call_insn_abi} should call this hook
+directly.
+
+Targets only need to define this hook if they support
+interoperability between several ABIs in the same translation unit.
+@end deftypefn
+
 @cindex call-used register
 @cindex call-clobbered register
 @cindex call-saved register
@@ -1921,18 +1932,6 @@ The default implementation returns false
 for targets that don't have partly call-clobbered registers.
 @end deftypefn
 
-@deftypefn {Target Hook} void TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS (rtx_insn *@var{insn}, HARD_REG_SET *@var{used_regs})
-This hook removes registers from the set of call-clobbered registers
- in @var{used_regs} if, contrary to the default rules, something guarantees
- that @samp{insn} preserves those registers.  For example, some targets
- support variant ABIs in which functions preserve more registers than
- normal functions would.  Removing those extra registers from @var{used_regs}
- can lead to better register allocation.
- 
- The default implementation does nothing, which is always safe.
- Defining the hook is purely an optimization.
-@end deftypefn
-
 @deftypefn {Target Hook} {rtx_insn *} TARGET_RETURN_CALL_WITH_MAX_CLOBBERS (rtx_insn *@var{call_1}, rtx_insn *@var{call_2})
 This hook returns a pointer to the call that partially clobbers the
 most registers.  If a platform supports multiple ABIs where the registers
Index: gcc/targhooks.h
===================================================================
--- gcc/targhooks.h	2019-09-09 17:51:55.848574716 +0100
+++ gcc/targhooks.h	2019-09-11 19:47:24.422262645 +0100
@@ -281,7 +281,5 @@ extern tree default_preferred_else_value
 extern bool default_have_speculation_safe_value (bool);
 extern bool speculation_safe_value_not_needed (bool);
 extern rtx default_speculation_safe_value (machine_mode, rtx, rtx, rtx);
-extern void default_remove_extra_call_preserved_regs (rtx_insn *,
-						      HARD_REG_SET *);
 
 #endif /* GCC_TARGHOOKS_H */
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c	2019-09-09 17:51:55.848574716 +0100
+++ gcc/targhooks.c	2019-09-11 19:47:24.422262645 +0100
@@ -2363,9 +2363,4 @@ default_speculation_safe_value (machine_
   return result;
 }
 
-void
-default_remove_extra_call_preserved_regs (rtx_insn *, HARD_REG_SET *)
-{
-}
-
 #include "gt-targhooks.h"
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	2019-09-11 19:47:20.398291002 +0100
+++ gcc/config/aarch64/aarch64.c	2019-09-11 19:47:24.410262730 +0100
@@ -1877,7 +1877,7 @@ aarch64_reg_save_mode (tree fndecl, unsi
    the function.  */
 
 static bool
-aarch64_simd_call_p (rtx_insn *insn)
+aarch64_simd_call_p (const rtx_insn *insn)
 {
   rtx symbol;
   rtx call;
@@ -1895,20 +1895,14 @@ aarch64_simd_call_p (rtx_insn *insn)
   return aarch64_simd_decl_p (fndecl);
 }
 
-/* Implement TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS.  If INSN calls
-   a function that uses the SIMD ABI, take advantage of the extra
-   call-preserved registers that the ABI provides.  */
+/* Implement TARGET_CALL_INSN_ABI.  */
 
-void
-aarch64_remove_extra_call_preserved_regs (rtx_insn *insn,
-					  HARD_REG_SET *return_set)
+const predefined_function_abi &
+aarch64_call_insn_abi (const rtx_insn *insn)
 {
   if (aarch64_simd_call_p (insn))
-    {
-      for (int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-	if (FP_SIMD_SAVED_REGNUM_P (regno))
-	  CLEAR_HARD_REG_BIT (*return_set, regno);
-    }
+    return aarch64_simd_abi ();
+  return default_function_abi;
 }
 
 /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  The callee only saves
@@ -20806,9 +20800,8 @@ #define TARGET_MODES_TIEABLE_P aarch64_m
 #define TARGET_HARD_REGNO_CALL_PART_CLOBBERED \
   aarch64_hard_regno_call_part_clobbered
 
-#undef TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS
-#define TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS \
-  aarch64_remove_extra_call_preserved_regs
+#undef TARGET_CALL_INSN_ABI
+#define TARGET_CALL_INSN_ABI aarch64_call_insn_abi
 
 #undef TARGET_RETURN_CALL_WITH_MAX_CLOBBERS
 #define TARGET_RETURN_CALL_WITH_MAX_CLOBBERS \
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h	2019-09-10 17:18:59.955982709 +0100
+++ gcc/rtl.h	2019-09-11 19:47:24.418262673 +0100
@@ -3420,6 +3420,7 @@ extern int rtx_unstable_p (const_rtx);
 extern bool rtx_varies_p (const_rtx, bool);
 extern bool rtx_addr_varies_p (const_rtx, bool);
 extern rtx get_call_rtx_from (const rtx_insn *);
+extern tree get_call_fndecl (const rtx_insn *);
 extern HOST_WIDE_INT get_integer_term (const_rtx);
 extern rtx get_related_value (const_rtx);
 extern bool offset_within_block_p (const_rtx, HOST_WIDE_INT);
@@ -4371,14 +4372,11 @@ extern tree GTY(()) global_regs_decl[FIR
    Available only for functions that has been already assembled.  */
 
 struct GTY(()) cgraph_rtl_info {
-   unsigned int preferred_incoming_stack_boundary;
+  unsigned int preferred_incoming_stack_boundary;
 
-  /* Call unsaved hard registers really used by the corresponding
-     function (including ones used by functions called by the
-     function).  */
+  /* Which registers the function clobbers, either directly or by
+     calling another function.  */
   HARD_REG_SET function_used_regs;
-  /* Set if function_used_regs is valid.  */
-  unsigned function_used_regs_valid: 1;
 };
 
 /* If loads from memories of mode MODE always sign or zero extend,
Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c	2019-09-10 19:56:32.577268091 +0100
+++ gcc/rtlanal.c	2019-09-11 19:47:24.418262673 +0100
@@ -822,6 +822,24 @@ get_call_rtx_from (const rtx_insn *insn)
     return x;
   return NULL_RTX;
 }
+
+/* Get the declaration of the function called by INSN.  */
+
+tree
+get_call_fndecl (const rtx_insn *insn)
+{
+  rtx note, datum;
+
+  note = find_reg_note (insn, REG_CALL_DECL, NULL_RTX);
+  if (note == NULL_RTX)
+    return NULL_TREE;
+
+  datum = XEXP (note, 0);
+  if (datum != NULL_RTX)
+    return SYMBOL_REF_DECL (datum);
+
+  return NULL_TREE;
+}
 \f
 /* Return the value of the integer term in X, if one is apparent;
    otherwise return 0.
Index: gcc/function-abi.h
===================================================================
--- gcc/function-abi.h	2019-09-11 19:47:07.490381964 +0100
+++ gcc/function-abi.h	2019-09-11 19:47:24.418262673 +0100
@@ -224,6 +224,8 @@ struct target_function_abi_info
      * crtl->abi is the ABI of the function that we are currently
        compiling to rtl.
 
+     * call_insn_abi (INSN) is the ABI used by the target of call insn INSN.
+
      * eh_edge_abi is the "ABI" used when taking an EH edge from an
        exception-throwing statement to an exception handler.  Catching
        exceptions from calls can be treated as an abnormal return from
@@ -265,5 +267,6 @@ #define eh_edge_abi default_function_abi
 
 extern const predefined_function_abi &fntype_abi (const_tree);
 extern function_abi fndecl_abi (const_tree);
+extern function_abi call_insn_abi (const rtx_insn *);
 
 #endif
Index: gcc/function-abi.cc
===================================================================
--- gcc/function-abi.cc	2019-09-11 19:47:20.402290974 +0100
+++ gcc/function-abi.cc	2019-09-11 19:47:24.414262702 +0100
@@ -143,5 +143,28 @@ function_abi
 fndecl_abi (const_tree fndecl)
 {
   gcc_assert (TREE_CODE (fndecl) == FUNCTION_DECL);
-  return fntype_abi (TREE_TYPE (fndecl));
+  const predefined_function_abi &base_abi = fntype_abi (TREE_TYPE (fndecl));
+
+  if (flag_ipa_ra && decl_binds_to_current_def_p (fndecl))
+    if (cgraph_rtl_info *info = cgraph_node::rtl_info (fndecl))
+      return function_abi (base_abi, info->function_used_regs);
+
+  return base_abi;
+}
+
+/* Return the ABI of the function called by INSN.  */
+
+function_abi
+call_insn_abi (const rtx_insn *insn)
+{
+  gcc_assert (insn && CALL_P (insn));
+
+  if (flag_ipa_ra)
+    if (tree fndecl = get_call_fndecl (insn))
+      return fndecl_abi (fndecl);
+
+  if (targetm.calls.call_insn_abi)
+    return targetm.calls.call_insn_abi (insn);
+
+  return default_function_abi;
 }
Index: gcc/regs.h
===================================================================
--- gcc/regs.h	2019-09-09 18:58:28.864430335 +0100
+++ gcc/regs.h	2019-09-11 19:47:24.418262673 +0100
@@ -383,8 +383,4 @@ range_in_hard_reg_set_p (const_hard_reg_
   return true;
 }
 
-/* Get registers used by given function call instruction.  */
-extern bool get_call_reg_set_usage (rtx_insn *insn, HARD_REG_SET *reg_set,
-				    HARD_REG_SET default_set);
-
 #endif /* GCC_REGS_H */
Index: gcc/final.c
===================================================================
--- gcc/final.c	2019-09-10 19:56:32.569268148 +0100
+++ gcc/final.c	2019-09-11 19:47:24.414262702 +0100
@@ -81,6 +81,7 @@ #define INCLUDE_ALGORITHM /* reverse */
 #include "asan.h"
 #include "rtl-iter.h"
 #include "print-rtl.h"
+#include "function-abi.h"
 
 #ifdef XCOFF_DEBUGGING_INFO
 #include "xcoffout.h"		/* Needed for external data declarations.  */
@@ -230,7 +231,6 @@ static int alter_cond (rtx);
 #endif
 static int align_fuzz (rtx, rtx, int, unsigned);
 static void collect_fn_hard_reg_usage (void);
-static tree get_call_fndecl (rtx_insn *);
 \f
 /* Initialize data in final at the beginning of a compilation.  */
 
@@ -4994,7 +4994,16 @@ collect_fn_hard_reg_usage (void)
   if (!targetm.call_fusage_contains_non_callee_clobbers)
     return;
 
-  CLEAR_HARD_REG_SET (function_used_regs);
+  /* Be conservative - mark fixed and global registers as used.  */
+  function_used_regs = fixed_reg_set;
+
+#ifdef STACK_REGS
+  /* Handle STACK_REGS conservatively, since the df-framework does not
+     provide accurate information for them.  */
+
+  for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
+    SET_HARD_REG_BIT (function_used_regs, i);
+#endif
 
   for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
     {
@@ -5005,96 +5014,23 @@ collect_fn_hard_reg_usage (void)
 
       if (CALL_P (insn)
 	  && !self_recursive_call_p (insn))
-	{
-	  if (!get_call_reg_set_usage (insn, &insn_used_regs,
-				       call_used_or_fixed_regs))
-	    return;
-
-	  function_used_regs |= insn_used_regs;
-	}
+	function_used_regs
+	  |= call_insn_abi (insn).full_and_partial_reg_clobbers ();
 
       find_all_hard_reg_sets (insn, &insn_used_regs, false);
       function_used_regs |= insn_used_regs;
-    }
 
-  /* Be conservative - mark fixed and global registers as used.  */
-  function_used_regs |= fixed_reg_set;
-
-#ifdef STACK_REGS
-  /* Handle STACK_REGS conservatively, since the df-framework does not
-     provide accurate information for them.  */
-
-  for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
-    SET_HARD_REG_BIT (function_used_regs, i);
-#endif
+      if (hard_reg_set_subset_p (crtl->abi->full_and_partial_reg_clobbers (),
+				 function_used_regs))
+	return;
+    }
 
-  /* The information we have gathered is only interesting if it exposes a
-     register from the call_used_regs that is not used in this function.  */
-  if (hard_reg_set_subset_p (call_used_or_fixed_regs, function_used_regs))
-    return;
+  /* Mask out fully-saved registers, so that they don't affect equality
+     comparisons between function_abis.  */
+  function_used_regs &= crtl->abi->full_and_partial_reg_clobbers ();
 
   node = cgraph_node::rtl_info (current_function_decl);
   gcc_assert (node != NULL);
 
   node->function_used_regs = function_used_regs;
-  node->function_used_regs_valid = 1;
-}
-
-/* Get the declaration of the function called by INSN.  */
-
-static tree
-get_call_fndecl (rtx_insn *insn)
-{
-  rtx note, datum;
-
-  note = find_reg_note (insn, REG_CALL_DECL, NULL_RTX);
-  if (note == NULL_RTX)
-    return NULL_TREE;
-
-  datum = XEXP (note, 0);
-  if (datum != NULL_RTX)
-    return SYMBOL_REF_DECL (datum);
-
-  return NULL_TREE;
-}
-
-/* Return the cgraph_rtl_info of the function called by INSN.  Returns NULL for
-   call targets that can be overwritten.  */
-
-static struct cgraph_rtl_info *
-get_call_cgraph_rtl_info (rtx_insn *insn)
-{
-  tree fndecl;
-
-  if (insn == NULL_RTX)
-    return NULL;
-
-  fndecl = get_call_fndecl (insn);
-  if (fndecl == NULL_TREE
-      || !decl_binds_to_current_def_p (fndecl))
-    return NULL;
-
-  return cgraph_node::rtl_info (fndecl);
-}
-
-/* Find hard registers used by function call instruction INSN, and return them
-   in REG_SET.  Return DEFAULT_SET in REG_SET if not found.  */
-
-bool
-get_call_reg_set_usage (rtx_insn *insn, HARD_REG_SET *reg_set,
-			HARD_REG_SET default_set)
-{
-  if (flag_ipa_ra)
-    {
-      struct cgraph_rtl_info *node = get_call_cgraph_rtl_info (insn);
-      if (node != NULL
-	  && node->function_used_regs_valid)
-	{
-	  *reg_set = node->function_used_regs & default_set;
-	  return true;
-	}
-    }
-  *reg_set = default_set;
-  targetm.remove_extra_call_preserved_regs (insn, reg_set);
-  return false;
 }
Index: gcc/caller-save.c
===================================================================
--- gcc/caller-save.c	2019-09-10 19:56:32.557268232 +0100
+++ gcc/caller-save.c	2019-09-11 19:47:24.402262786 +0100
@@ -37,6 +37,7 @@ Software Foundation; either version 3, o
 #include "dumpfile.h"
 #include "rtl-iter.h"
 #include "target.h"
+#include "function-abi.h"
 
 #define MOVE_MAX_WORDS (MOVE_MAX / UNITS_PER_WORD)
 
@@ -426,7 +427,9 @@ setup_save_areas (void)
       freq = REG_FREQ_FROM_BB (BLOCK_FOR_INSN (insn));
       REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 			       &chain->live_throughout);
-      get_call_reg_set_usage (insn, &used_regs, call_used_or_fixed_regs);
+      used_regs = call_insn_abi (insn).full_reg_clobbers ();
+      /* ??? This preserves traditional behavior; it might not be needed.  */
+      used_regs |= fixed_reg_set;
 
       /* Record all registers set in this call insn.  These don't
 	 need to be saved.  N.B. the call insn might set a subreg
@@ -509,7 +512,10 @@ setup_save_areas (void)
 
 	  REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 				   &chain->live_throughout);
-	  get_call_reg_set_usage (insn, &used_regs, call_used_or_fixed_regs);
+	  used_regs = call_insn_abi (insn).full_reg_clobbers ();
+	  /* ??? This preserves traditional behavior; it might not
+	     be needed.  */
+	  used_regs |= fixed_reg_set;
 
 	  /* Record all registers set in this call insn.  These don't
 	     need to be saved.  N.B. the call insn might set a subreg
@@ -838,8 +844,10 @@ save_call_clobbered_regs (void)
 				     | this_insn_sets
 				     | hard_regs_saved);
 	      hard_regs_to_save &= savable_regs;
-	      get_call_reg_set_usage (insn, &call_def_reg_set,
-				      call_used_or_fixed_regs);
+	      call_def_reg_set = call_insn_abi (insn).full_reg_clobbers ();
+	      /* ??? This preserves traditional behavior; it might not
+		 be needed.  */
+	      call_def_reg_set |= fixed_reg_set;
 	      hard_regs_to_save &= call_def_reg_set;
 
 	      for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
Index: gcc/cfgcleanup.c
===================================================================
--- gcc/cfgcleanup.c	2019-09-10 19:56:32.561268204 +0100
+++ gcc/cfgcleanup.c	2019-09-11 19:47:24.402262786 +0100
@@ -54,6 +54,7 @@ Software Foundation; either version 3, o
 #include "dbgcnt.h"
 #include "rtl-iter.h"
 #include "regs.h"
+#include "function-abi.h"
 
 #define FORWARDER_BLOCK_P(BB) ((BB)->flags & BB_FORWARDER_BLOCK)
 
@@ -1226,10 +1227,11 @@ old_insns_match_p (int mode ATTRIBUTE_UN
 	    }
 	}
 
-      HARD_REG_SET i1_used, i2_used;
-
-      get_call_reg_set_usage (i1, &i1_used, call_used_or_fixed_regs);
-      get_call_reg_set_usage (i2, &i2_used, call_used_or_fixed_regs);
+      HARD_REG_SET i1_used = call_insn_abi (i1).full_reg_clobbers ();
+      HARD_REG_SET i2_used = call_insn_abi (i2).full_reg_clobbers ();
+      /* ??? This preserves traditional behavior; it might not be needed.  */
+      i1_used |= fixed_reg_set;
+      i2_used |= fixed_reg_set;
 
       if (i1_used != i2_used)
         return dir_none;
Index: gcc/cgraph.h
===================================================================
--- gcc/cgraph.h	2019-08-05 17:46:19.305733957 +0100
+++ gcc/cgraph.h	2019-09-11 19:47:24.406262758 +0100
@@ -1368,7 +1368,7 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cg
   static cgraph_local_info *local_info (tree decl);
 
   /* Return local info for the compiled function.  */
-  static struct cgraph_rtl_info *rtl_info (tree);
+  static struct cgraph_rtl_info *rtl_info (const_tree);
 
   /* Return the cgraph node that has ASMNAME for its DECL_ASSEMBLER_NAME.
      Return NULL if there's no such node.  */
Index: gcc/cgraph.c
===================================================================
--- gcc/cgraph.c	2019-09-05 08:49:26.265772215 +0100
+++ gcc/cgraph.c	2019-09-11 19:47:24.406262758 +0100
@@ -1908,7 +1908,7 @@ cgraph_node::local_info (tree decl)
 /* Return local info for the compiled function.  */
 
 cgraph_rtl_info *
-cgraph_node::rtl_info (tree decl)
+cgraph_node::rtl_info (const_tree decl)
 {
   gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
   cgraph_node *node = get (decl);
@@ -1923,7 +1923,10 @@ cgraph_node::rtl_info (tree decl)
     return NULL;
   /* Allocate if it doesn't exist.  */
   if (node->rtl == NULL)
-    node->rtl = ggc_cleared_alloc<cgraph_rtl_info> ();
+    {
+      node->rtl = ggc_cleared_alloc<cgraph_rtl_info> ();
+      node->rtl->function_used_regs = reg_class_contents[ALL_REGS];
+    }
   return node->rtl;
 }
 
Index: gcc/df-scan.c
===================================================================
--- gcc/df-scan.c	2019-09-10 19:56:45.353177919 +0100
+++ gcc/df-scan.c	2019-09-11 19:47:24.410262730 +0100
@@ -35,7 +35,7 @@ Software Foundation; either version 3, o
 #include "emit-rtl.h"  /* FIXME: Can go away once crtl is moved to rtl.h.  */
 #include "dumpfile.h"
 #include "calls.h"
-
+#include "function-abi.h"
 
 /* The set of hard registers in eliminables[i].from. */
 
@@ -3088,13 +3088,11 @@ df_get_call_refs (class df_collection_re
   bool is_sibling_call;
   unsigned int i;
   HARD_REG_SET defs_generated;
-  HARD_REG_SET fn_reg_set_usage;
 
   CLEAR_HARD_REG_SET (defs_generated);
   df_find_hard_reg_defs (PATTERN (insn_info->insn), &defs_generated);
   is_sibling_call = SIBLING_CALL_P (insn_info->insn);
-  get_call_reg_set_usage (insn_info->insn, &fn_reg_set_usage,
-			  regs_invalidated_by_call);
+  function_abi abi = call_insn_abi (insn_info->insn);
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     {
@@ -3118,7 +3116,7 @@ df_get_call_refs (class df_collection_re
 			       NULL, bb, insn_info, DF_REF_REG_DEF, flags);
 	    }
 	}
-      else if (TEST_HARD_REG_BIT (fn_reg_set_usage, i)
+      else if (abi.clobbers_full_reg_p (i)
 	       /* no clobbers for regs that are the result of the call */
 	       && !TEST_HARD_REG_BIT (defs_generated, i)
 	       && (!is_sibling_call
Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c	2019-09-10 19:56:45.357177891 +0100
+++ gcc/ira-lives.c	2019-09-11 19:47:24.418262673 +0100
@@ -33,6 +33,7 @@ Software Foundation; either version 3, o
 #include "ira.h"
 #include "ira-int.h"
 #include "sparseset.h"
+#include "function-abi.h"
 
 /* The code in this file is similar to one in global but the code
    works on the allocno basis and creates live ranges instead of
@@ -1254,10 +1255,11 @@ process_bb_node_lives (ira_loop_tree_nod
 		  ira_object_t obj = ira_object_id_map[i];
 		  a = OBJECT_ALLOCNO (obj);
 		  int num = ALLOCNO_NUM (a);
-		  HARD_REG_SET this_call_used_reg_set;
-
-		  get_call_reg_set_usage (insn, &this_call_used_reg_set,
-					  call_used_or_fixed_regs);
+		  HARD_REG_SET this_call_used_reg_set
+		    = call_insn_abi (insn).full_reg_clobbers ();
+		  /* ??? This preserves traditional behavior; it might not be
+		     needed.  */
+		  this_call_used_reg_set |= fixed_reg_set;
 
 		  /* Don't allocate allocnos that cross setjmps or any
 		     call, if this function receives a nonlocal
Index: gcc/lra-lives.c
===================================================================
--- gcc/lra-lives.c	2019-09-10 19:56:45.357177891 +0100
+++ gcc/lra-lives.c	2019-09-11 19:47:24.418262673 +0100
@@ -43,6 +43,7 @@ Software Foundation; either version 3, o
 #include "sparseset.h"
 #include "lra-int.h"
 #include "target.h"
+#include "function-abi.h"
 
 /* Program points are enumerated by numbers from range
    0..LRA_LIVE_MAX_POINT-1.  There are approximately two times more
@@ -931,9 +932,11 @@ process_bb_lives (basic_block bb, int &c
 	    last_call_used_reg_set = call_used_or_fixed_regs;
 	  else
 	    {
-	      HARD_REG_SET this_call_used_reg_set;
-	      get_call_reg_set_usage (curr_insn, &this_call_used_reg_set,
-				      call_used_or_fixed_regs);
+	      HARD_REG_SET this_call_used_reg_set
+		= call_insn_abi (curr_insn).full_reg_clobbers ();
+	      /* ??? This preserves traditional behavior; it might not
+		 be needed.  */
+	      this_call_used_reg_set |= fixed_reg_set;
 
 	      bool flush = (! hard_reg_set_empty_p (last_call_used_reg_set)
 			    && (last_call_used_reg_set
Index: gcc/postreload.c
===================================================================
--- gcc/postreload.c	2019-09-10 19:56:45.357177891 +0100
+++ gcc/postreload.c	2019-09-11 19:47:24.418262673 +0100
@@ -40,6 +40,7 @@ Software Foundation; either version 3, o
 #include "cselib.h"
 #include "tree-pass.h"
 #include "dbgcnt.h"
+#include "function-abi.h"
 
 static int reload_cse_noop_set_p (rtx);
 static bool reload_cse_simplify (rtx_insn *, rtx);
@@ -1330,9 +1331,10 @@ reload_combine (void)
       if (CALL_P (insn))
 	{
 	  rtx link;
-	  HARD_REG_SET used_regs;
-
-	  get_call_reg_set_usage (insn, &used_regs, call_used_or_fixed_regs);
+	  HARD_REG_SET used_regs = call_insn_abi (insn).full_reg_clobbers ();
+	  /* ??? This preserves traditional behavior; it might not be
+	     needed.  */
+	  used_regs |= fixed_reg_set;
 
 	  for (r = 0; r < FIRST_PSEUDO_REGISTER; r++)
 	    if (TEST_HARD_REG_BIT (used_regs, r))
Index: gcc/regcprop.c
===================================================================
--- gcc/regcprop.c	2019-09-09 18:58:51.472270712 +0100
+++ gcc/regcprop.c	2019-09-11 19:47:24.418262673 +0100
@@ -35,6 +35,7 @@
 #include "rtl-iter.h"
 #include "cfgrtl.h"
 #include "target.h"
+#include "function-abi.h"
 
 /* The following code does forward propagation of hard register copies.
    The object is to eliminate as many dependencies as possible, so that
@@ -1035,7 +1036,6 @@ copyprop_hardreg_forward_1 (basic_block
 	  unsigned int set_nregs = 0;
 	  unsigned int regno;
 	  rtx exp;
-	  HARD_REG_SET regs_invalidated_by_this_call;
 
 	  for (exp = CALL_INSN_FUNCTION_USAGE (insn); exp; exp = XEXP (exp, 1))
 	    {
@@ -1053,11 +1053,9 @@ copyprop_hardreg_forward_1 (basic_block
 		}
 	    }
 
-	  get_call_reg_set_usage (insn,
-				  &regs_invalidated_by_this_call,
-				  regs_invalidated_by_call);
+	  function_abi abi = call_insn_abi (insn);
 	  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-	    if ((TEST_HARD_REG_BIT (regs_invalidated_by_this_call, regno)
+	    if ((abi.clobbers_full_reg_p (regno)
 		 || (targetm.hard_regno_call_part_clobbered
 		     (insn, regno, vd->e[regno].mode)))
 		&& (regno < set_regno || regno >= set_regno + set_nregs))
Index: gcc/resource.c
===================================================================
--- gcc/resource.c	2019-09-09 18:59:20.864063196 +0100
+++ gcc/resource.c	2019-09-11 19:47:24.418262673 +0100
@@ -30,6 +30,7 @@ Software Foundation; either version 3, o
 #include "resource.h"
 #include "insn-attr.h"
 #include "params.h"
+#include "function-abi.h"
 
 /* This structure is used to record liveness information at the targets or
    fallthrough insns of branches.  We will most likely need the information
@@ -662,12 +663,10 @@ mark_set_resources (rtx x, struct resour
 	{
 	  rtx_call_insn *call_insn = as_a <rtx_call_insn *> (x);
 	  rtx link;
-	  HARD_REG_SET regs;
 
 	  res->cc = res->memory = 1;
 
-	  get_call_reg_set_usage (call_insn, &regs, regs_invalidated_by_call);
-	  res->regs |= regs;
+	  res->regs |= call_insn_abi (call_insn).full_reg_clobbers ();
 
 	  for (link = CALL_INSN_FUNCTION_USAGE (call_insn);
 	       link; link = XEXP (link, 1))
@@ -1038,10 +1037,8 @@ mark_target_live_regs (rtx_insn *insns,
 		 predicated instruction, or if the CALL is NORETURN.  */
 	      if (GET_CODE (PATTERN (real_insn)) != COND_EXEC)
 		{
-		  HARD_REG_SET regs_invalidated_by_this_call;
-		  get_call_reg_set_usage (real_insn,
-					  &regs_invalidated_by_this_call,
-					  regs_invalidated_by_call);
+		  HARD_REG_SET regs_invalidated_by_this_call
+		    = call_insn_abi (real_insn).full_reg_clobbers ();
 		  /* CALL clobbers all call-used regs that aren't fixed except
 		     sp, ap, and fp.  Do this before setting the result of the
 		     call live.  */
Index: gcc/var-tracking.c
===================================================================
--- gcc/var-tracking.c	2019-09-09 18:58:51.476270683 +0100
+++ gcc/var-tracking.c	2019-09-11 19:47:24.422262645 +0100
@@ -116,6 +116,7 @@
 #include "rtl-iter.h"
 #include "fibonacci_heap.h"
 #include "print-rtl.h"
+#include "function-abi.h"
 
 typedef fibonacci_heap <long, basic_block_def> bb_heap_t;
 typedef fibonacci_node <long, basic_block_def> bb_heap_node_t;
@@ -4900,12 +4901,10 @@ dataflow_set_clear_at_call (dataflow_set
 {
   unsigned int r;
   hard_reg_set_iterator hrsi;
-  HARD_REG_SET invalidated_regs;
 
-  get_call_reg_set_usage (call_insn, &invalidated_regs,
-			  regs_invalidated_by_call);
+  function_abi abi = call_insn_abi (call_insn);
 
-  EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
+  EXECUTE_IF_SET_IN_HARD_REG_SET (abi.full_reg_clobbers (), 0, r, hrsi)
     var_regno_delete (set, r);
 
   if (MAY_HAVE_DEBUG_BIND_INSNS)

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [04/32] [x86] Robustify vzeroupper handling across calls
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (3 preceding siblings ...)
  2019-09-11 19:05 ` [05/32] Pass an ABI identifier to hard_regno_call_part_clobbered Richard Sandiford
@ 2019-09-11 19:05 ` Richard Sandiford
  2019-09-25 15:48   ` Richard Sandiford
  2019-09-11 19:06 ` [06/32] Pass an ABI to choose_hard_reg_mode Richard Sandiford
                   ` (29 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:05 UTC (permalink / raw)
  To: gcc-patches

One of the effects of the function_abi series is to make -fipa-ra
work for partially call-clobbered registers.  E.g. if a call preserves
only the low 32 bits of a register R, we previously handled the partial
clobber separately from -fipa-ra, and so treated the upper bits of R as
clobbered even when we knew that the target function doesn't touch R.

"Fixing" this caused problems for the vzeroupper handling on x86.
The pass that inserts the vzerouppers assumes that no 256-bit or 512-bit
values are live across a call unless the call takes a 256-bit or 512-bit
argument:

      /* Needed mode is set to AVX_U128_CLEAN if there are
	 no 256bit or 512bit modes used in function arguments. */

This implicitly relies on:

/* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  The only ABI that
   saves SSE registers across calls is Win64 (thus no need to check the
   current ABI here), and with AVX enabled Win64 only guarantees that
   the low 16 bytes are saved.  */

static bool
ix86_hard_regno_call_part_clobbered (rtx_insn *insn ATTRIBUTE_UNUSED,
				     unsigned int regno, machine_mode mode)
{
  return SSE_REGNO_P (regno) && GET_MODE_SIZE (mode) > 16;
}

The comment suggests that this code is only needed for Win64 and that
not testing for Win64 is just a simplification.  But in practice it was
needed for correctness on GNU/Linux and other targets too, since without
it the RA would be able to keep 256-bit and 512-bit values in SSE
registers across calls that are known not to clobber them.

This patch conservatively treats calls as AVX_U128_ANY if the RA can see
that some SSE registers are not touched by a call.  There are then no
regressions if the ix86_hard_regno_call_part_clobbered check is disabled
for GNU/Linux (not something we should do; it was just for testing).

If in fact we want -fipa-ra to pretend that all functions clobber
SSE registers above 128 bits, it'd certainly be possible to arrange
that.  But IMO that would be an optimisation decision, whereas what
the patch is fixing is a correctness issue.  So I think we should
have this check even so.
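The shape of the new check can be sketched in isolation.  The following
is a minimal, self-contained model, not GCC code: the set width, the
register range standing in for the SSE registers, and the function names
are all illustrative stand-ins for HARD_REG_SET, ALL_SSE_REGS and
ix86_avx_u128_mode_needed.

```cpp
#include <bitset>
#include <cassert>

/* Toy stand-in for HARD_REG_SET; 32 registers is illustrative only.  */
constexpr unsigned NUM_REGS = 32;
using hard_reg_set = std::bitset<NUM_REGS>;

/* Registers 16..31 stand in for the SSE registers here.  */
hard_reg_set all_sse_regs ()
{
  hard_reg_set s;
  for (unsigned r = 16; r < NUM_REGS; ++r)
    s.set (r);
  return s;
}

/* True iff every register in A is also in B, like hard_reg_set_subset_p.  */
bool subset_p (const hard_reg_set &a, const hard_reg_set &b)
{
  return (a & ~b).none ();
}

enum avx_state { AVX_U128_ANY, AVX_U128_CLEAN };

/* MODE_CLOBBERS models abi.mode_clobbers (V4DImode) for one call.
   If the call's ABI does not clobber *all* SSE registers for 256-bit
   modes, some 256-bit value may legitimately live across the call, so
   treating the call as AVX_U128_CLEAN (and issuing a vzeroupper) would
   be unsafe; conservatively report AVX_U128_ANY instead.  */
avx_state mode_needed_for_call (const hard_reg_set &mode_clobbers)
{
  if (!subset_p (all_sse_regs (), mode_clobbers))
    return AVX_U128_ANY;
  return AVX_U128_CLEAN;
}
```

With a full clobber set the call remains AVX_U128_CLEAN; knock out a
single SSE register from the clobber set (as -fipa-ra can) and the call
becomes AVX_U128_ANY.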


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/i386/i386.c: Include function-abi.h.
	(ix86_avx_u128_mode_needed): Treat function calls as AVX_U128_ANY
	if they preserve some 256-bit or 512-bit SSE registers.

Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	2019-09-10 19:56:55.601105594 +0100
+++ gcc/config/i386/i386.c	2019-09-11 19:47:28.506233865 +0100
@@ -95,6 +95,7 @@ #define IN_TARGET_CODE 1
 #include "i386-builtins.h"
 #include "i386-expand.h"
 #include "i386-features.h"
+#include "function-abi.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -13511,6 +13512,15 @@ ix86_avx_u128_mode_needed (rtx_insn *ins
 	    }
 	}
 
+      /* If the function is known to preserve some SSE registers,
+	 RA and previous passes can legitimately rely on that for
+	 modes wider than 256 bits.  It's only safe to issue a
+	 vzeroupper if all SSE registers are clobbered.  */
+      const function_abi &abi = call_insn_abi (insn);
+      if (!hard_reg_set_subset_p (reg_class_contents[ALL_SSE_REGS],
+				  abi.mode_clobbers (V4DImode)))
+	return AVX_U128_ANY;
+
       return AVX_U128_CLEAN;
     }
 


* [05/32] Pass an ABI identifier to hard_regno_call_part_clobbered
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (2 preceding siblings ...)
  2019-09-11 19:04 ` [03/32] Add a function for getting the ABI of a call insn target Richard Sandiford
@ 2019-09-11 19:05 ` Richard Sandiford
  2019-09-29 20:58   ` Jeff Law
  2019-09-11 19:05 ` [04/32] [x86] Robustify vzeroupper handling across calls Richard Sandiford
                   ` (30 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:05 UTC (permalink / raw)
  To: gcc-patches

This patch replaces the rtx_insn argument to
targetm.hard_regno_call_part_clobbered with an ABI identifier, since
call insns are now just one possible way of getting an ABI handle.
This in turn allows predefined_function_abi::initialize to do the
right thing for non-default ABIs.

The horrible ?: in need_for_call_save_p goes away in a later patch,
with the series as a whole removing most direct calls to the hook in
favour of function_abi operations.
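The effect of the signature change can be modelled standalone.  The
sketch below is illustrative only: the register range, mode enum and ABI
identifiers are invented stand-ins (loosely patterned on the aarch64
hook in this patch), not GCC's actual types or values.

```cpp
#include <cassert>

/* Toy mode enum: a 64-bit scalar mode and a 128-bit vector mode.  */
enum mode_model { DI_64BIT, V2DI_128BIT };

/* Illustrative ABI identifiers; in GCC these come from
   predefined_function_abi::id ().  */
constexpr unsigned ABI_DEFAULT = 0;
constexpr unsigned ABI_VECTOR_PCS = 1;

/* Modelled loosely on aarch64_hard_regno_call_part_clobbered after this
   patch: the hook keys off a small ABI identifier instead of inspecting
   a call insn.  Registers 32..63 stand in for the FP registers.  */
bool hard_regno_call_part_clobbered (unsigned abi_id, unsigned regno,
				     mode_model mode)
{
  bool fp_reg_p = (regno >= 32 && regno < 64);
  if (!fp_reg_p)
    return false;
  if (abi_id == ABI_VECTOR_PCS)
    /* The vector PCS preserves the full 128-bit register.  */
    return false;
  /* The default ABI preserves only the low 64 bits of FP registers,
     so any wider value is partly clobbered.  */
  return mode == V2DI_128BIT;
}
```

Because the hook no longer needs an insn, predefined_function_abi can
query it at initialization time with each ABI's own identifier, which is
what lets initialize do the right thing for non-default ABIs.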


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* target.def (hard_regno_call_part_clobbered): Take an ABI
	identifier instead of an rtx_insn.
	* doc/tm.texi: Regenerate.
	* hooks.h (hook_bool_insn_uint_mode_false): Delete.
	(hook_bool_uint_uint_mode_false): New function.
	* hooks.c (hook_bool_insn_uint_mode_false): Delete.
	(hook_bool_uint_uint_mode_false): New function.
	* config/aarch64/aarch64.c (aarch64_hard_regno_call_part_clobbered):
	Take an ABI identifier instead of an rtx_insn.
	* config/avr/avr.c (avr_hard_regno_call_part_clobbered): Likewise.
	* config/i386/i386.c (ix86_hard_regno_call_part_clobbered): Likewise.
	* config/mips/mips.c (mips_hard_regno_call_part_clobbered): Likewise.
	* config/pru/pru.c (pru_hard_regno_call_part_clobbered): Likewise.
	* config/rs6000/rs6000.c (rs6000_hard_regno_call_part_clobbered):
	Likewise.
	* config/s390/s390.c (s390_hard_regno_call_part_clobbered): Likewise.
	* cselib.c: Include function-abi.h.
	(cselib_process_insn): Update call to
	targetm.hard_regno_call_part_clobbered, using call_insn_abi
	to get the appropriate ABI identifier.
	* function-abi.cc (predefined_function_abi::initialize): Update call
	to targetm.hard_regno_call_part_clobbered.
	* ira-conflicts.c (ira_build_conflicts): Likewise.
	* ira-costs.c (ira_tune_allocno_costs): Likewise.
	* lra-constraints.c: Include function-abi.h.
	(need_for_call_save_p): Update call to
	targetm.hard_regno_call_part_clobbered, using call_insn_abi
	to get the appropriate ABI identifier.
	* lra-lives.c (check_pseudos_live_through_calls): Likewise.
	* regcprop.c (copyprop_hardreg_forward_1): Update call
	to targetm.hard_regno_call_part_clobbered.
	* reginfo.c (choose_hard_reg_mode): Likewise.
	* regrename.c (check_new_reg_p): Likewise.
	* reload.c (find_equiv_reg): Likewise.
	* reload1.c (emit_reload_insns): Likewise.
	* sched-deps.c: Include function-abi.h.
	(deps_analyze_insn): Update call to
	targetm.hard_regno_call_part_clobbered, using call_insn_abi
	to get the appropriate ABI identifier.
	* sel-sched.c (init_regs_for_mode, mark_unavailable_hard_regs): Update
	call to targetm.hard_regno_call_part_clobbered.
	* targhooks.c (default_dwarf_frame_reg_mode): Likewise.

Index: gcc/target.def
===================================================================
--- gcc/target.def	2019-09-11 19:47:24.422262645 +0100
+++ gcc/target.def	2019-09-11 19:47:32.906202859 +0100
@@ -5763,18 +5763,27 @@ The default version of this hook always
 
 DEFHOOK
 (hard_regno_call_part_clobbered,
- "This hook should return true if @var{regno} is partly call-saved and\n\
-partly call-clobbered, and if a value of mode @var{mode} would be partly\n\
-clobbered by call instruction @var{insn}.  If @var{insn} is NULL then it\n\
-should return true if any call could partly clobber the register.\n\
-For example, if the low 32 bits of @var{regno} are preserved across a call\n\
-but higher bits are clobbered, this hook should return true for a 64-bit\n\
-mode but false for a 32-bit mode.\n\
+ "ABIs usually specify that calls must preserve the full contents\n\
+of a particular register, or that calls can alter any part of a\n\
+particular register.  This information is captured by the target macro\n\
+@code{CALL_REALLY_USED_REGISTERS}.  However, some ABIs specify that calls\n\
+must preserve certain bits of a particular register but can alter others.\n\
+This hook should return true if this applies to at least one of the\n\
+registers in @samp{(reg:@var{mode} @var{regno})}, and if as a result the\n\
+call would alter part of the @var{mode} value.  For example, if a call\n\
+preserves the low 32 bits of a 64-bit hard register @var{regno} but can\n\
+clobber the upper 32 bits, this hook should return true for a 64-bit mode\n\
+but false for a 32-bit mode.\n\
+\n\
+The value of @var{abi_id} comes from the @code{predefined_function_abi}\n\
+structure that describes the ABI of the call; see the definition of the\n\
+structure for more details.  If (as is usual) the target uses the same ABI\n\
+for all functions in a translation unit, @var{abi_id} is always 0.\n\
 \n\
 The default implementation returns false, which is correct\n\
 for targets that don't have partly call-clobbered registers.",
- bool, (rtx_insn *insn, unsigned int regno, machine_mode mode),
- hook_bool_insn_uint_mode_false)
+ bool, (unsigned int abi_id, unsigned int regno, machine_mode mode),
+ hook_bool_uint_uint_mode_false)
 
 DEFHOOK
 (return_call_with_max_clobbers,
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	2019-09-11 19:47:24.414262702 +0100
+++ gcc/doc/tm.texi	2019-09-11 19:47:32.898202916 +0100
@@ -1919,14 +1919,23 @@ interoperability between several ABIs in
 @cindex call-used register
 @cindex call-clobbered register
 @cindex call-saved register
-@deftypefn {Target Hook} bool TARGET_HARD_REGNO_CALL_PART_CLOBBERED (rtx_insn *@var{insn}, unsigned int @var{regno}, machine_mode @var{mode})
-This hook should return true if @var{regno} is partly call-saved and
-partly call-clobbered, and if a value of mode @var{mode} would be partly
-clobbered by call instruction @var{insn}.  If @var{insn} is NULL then it
-should return true if any call could partly clobber the register.
-For example, if the low 32 bits of @var{regno} are preserved across a call
-but higher bits are clobbered, this hook should return true for a 64-bit
-mode but false for a 32-bit mode.
+@deftypefn {Target Hook} bool TARGET_HARD_REGNO_CALL_PART_CLOBBERED (unsigned int @var{abi_id}, unsigned int @var{regno}, machine_mode @var{mode})
+ABIs usually specify that calls must preserve the full contents
+of a particular register, or that calls can alter any part of a
+particular register.  This information is captured by the target macro
+@code{CALL_REALLY_USED_REGISTERS}.  However, some ABIs specify that calls
+must preserve certain bits of a particular register but can alter others.
+This hook should return true if this applies to at least one of the
+registers in @samp{(reg:@var{mode} @var{regno})}, and if as a result the
+call would alter part of the @var{mode} value.  For example, if a call
+preserves the low 32 bits of a 64-bit hard register @var{regno} but can
+clobber the upper 32 bits, this hook should return true for a 64-bit mode
+but false for a 32-bit mode.
+
+The value of @var{abi_id} comes from the @code{predefined_function_abi}
+structure that describes the ABI of the call; see the definition of the
+structure for more details.  If (as is usual) the target uses the same ABI
+for all functions in a translation unit, @var{abi_id} is always 0.
 
 The default implementation returns false, which is correct
 for targets that don't have partly call-clobbered registers.
Index: gcc/hooks.h
===================================================================
--- gcc/hooks.h	2019-09-05 08:49:25.869775043 +0100
+++ gcc/hooks.h	2019-09-11 19:47:32.898202916 +0100
@@ -40,7 +40,7 @@ extern bool hook_bool_const_rtx_insn_con
 extern bool hook_bool_mode_uhwi_false (machine_mode,
 				       unsigned HOST_WIDE_INT);
 extern bool hook_bool_puint64_puint64_true (poly_uint64, poly_uint64);
-extern bool hook_bool_insn_uint_mode_false (rtx_insn *, unsigned int,
+extern bool hook_bool_uint_uint_mode_false (unsigned int, unsigned int,
 					    machine_mode);
 extern bool hook_bool_uint_mode_true (unsigned int, machine_mode);
 extern bool hook_bool_tree_false (tree);
Index: gcc/hooks.c
===================================================================
--- gcc/hooks.c	2019-09-05 08:49:25.869775043 +0100
+++ gcc/hooks.c	2019-09-11 19:47:32.898202916 +0100
@@ -140,9 +140,8 @@ hook_bool_puint64_puint64_true (poly_uin
   return true;
 }
 
-/* Generic hook that takes (unsigned int, machine_mode) and returns false.  */
 bool
-hook_bool_insn_uint_mode_false (rtx_insn *, unsigned int, machine_mode)
+hook_bool_uint_uint_mode_false (unsigned int, unsigned int, machine_mode)
 {
   return false;
 }
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	2019-09-11 19:47:24.410262730 +0100
+++ gcc/config/aarch64/aarch64.c	2019-09-11 19:47:32.858203198 +0100
@@ -1910,12 +1910,13 @@ aarch64_call_insn_abi (const rtx_insn *i
    clobbers the top 64 bits when restoring the bottom 64 bits.  */
 
 static bool
-aarch64_hard_regno_call_part_clobbered (rtx_insn *insn, unsigned int regno,
+aarch64_hard_regno_call_part_clobbered (unsigned int abi_id,
+					unsigned int regno,
 					machine_mode mode)
 {
   if (FP_REGNUM_P (regno))
     {
-      bool simd_p = insn && CALL_P (insn) && aarch64_simd_call_p (insn);
+      bool simd_p = (abi_id == ARM_PCS_SIMD);
       poly_int64 per_register_size = GET_MODE_SIZE (mode);
       unsigned int nregs = hard_regno_nregs (regno, mode);
       if (nregs > 1)
Index: gcc/config/avr/avr.c
===================================================================
--- gcc/config/avr/avr.c	2019-09-10 19:56:45.329178088 +0100
+++ gcc/config/avr/avr.c	2019-09-11 19:47:32.862203170 +0100
@@ -12164,8 +12164,8 @@ avr_hard_regno_mode_ok (unsigned int reg
 /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  */
 
 static bool
-avr_hard_regno_call_part_clobbered (rtx_insn *insn ATTRIBUTE_UNUSED,
-				    unsigned regno, machine_mode mode)
+avr_hard_regno_call_part_clobbered (unsigned, unsigned regno,
+				    machine_mode mode)
 {
   /* FIXME: This hook gets called with MODE:REGNO combinations that don't
         represent valid hard registers like, e.g. HI:29.  Returning TRUE
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	2019-09-11 19:47:28.506233865 +0100
+++ gcc/config/i386/i386.c	2019-09-11 19:47:32.866203141 +0100
@@ -18794,8 +18794,8 @@ ix86_hard_regno_mode_ok (unsigned int re
    the low 16 bytes are saved.  */
 
 static bool
-ix86_hard_regno_call_part_clobbered (rtx_insn *insn ATTRIBUTE_UNUSED,
-				     unsigned int regno, machine_mode mode)
+ix86_hard_regno_call_part_clobbered (unsigned int, unsigned int regno,
+				     machine_mode mode)
 {
   return SSE_REGNO_P (regno) && GET_MODE_SIZE (mode) > 16;
 }
Index: gcc/config/mips/mips.c
===================================================================
--- gcc/config/mips/mips.c	2019-09-10 19:57:04.697041394 +0100
+++ gcc/config/mips/mips.c	2019-09-11 19:47:32.874203085 +0100
@@ -12928,8 +12928,8 @@ mips_hard_regno_scratch_ok (unsigned int
    registers with MODE > 64 bits are part clobbered too.  */
 
 static bool
-mips_hard_regno_call_part_clobbered (rtx_insn *insn ATTRIBUTE_UNUSED,
-				     unsigned int regno, machine_mode mode)
+mips_hard_regno_call_part_clobbered (unsigned int, unsigned int regno,
+				     machine_mode mode)
 {
   if (TARGET_FLOATXX
       && hard_regno_nregs (regno, mode) == 1
Index: gcc/config/pru/pru.c
===================================================================
--- gcc/config/pru/pru.c	2019-09-10 19:56:45.345177975 +0100
+++ gcc/config/pru/pru.c	2019-09-11 19:47:32.886203000 +0100
@@ -559,8 +559,8 @@ pru_hard_regno_scratch_ok (unsigned int
 /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  */
 
 static bool
-pru_hard_regno_call_part_clobbered (rtx_insn *insn ATTRIBUTE_UNUSED,
-				    unsigned regno, machine_mode mode)
+pru_hard_regno_call_part_clobbered (unsigned, unsigned regno,
+				    machine_mode mode)
 {
   HARD_REG_SET caller_saved_set;
   HARD_REG_SET callee_saved_set;
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	2019-09-10 19:57:04.701041365 +0100
+++ gcc/config/rs6000/rs6000.c	2019-09-11 19:47:32.890202972 +0100
@@ -1936,8 +1936,8 @@ rs6000_modes_tieable_p (machine_mode mod
 /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  */
 
 static bool
-rs6000_hard_regno_call_part_clobbered (rtx_insn *insn ATTRIBUTE_UNUSED,
-				       unsigned int regno, machine_mode mode)
+rs6000_hard_regno_call_part_clobbered (unsigned int, unsigned int regno,
+				       machine_mode mode)
 {
   if (TARGET_32BIT
       && TARGET_POWERPC64
Index: gcc/config/s390/s390.c
===================================================================
--- gcc/config/s390/s390.c	2019-09-10 19:57:04.705041337 +0100
+++ gcc/config/s390/s390.c	2019-09-11 19:47:32.894202944 +0100
@@ -10297,8 +10297,8 @@ s390_hard_regno_scratch_ok (unsigned int
    bytes are saved across calls, however.  */
 
 static bool
-s390_hard_regno_call_part_clobbered (rtx_insn *insn ATTRIBUTE_UNUSED,
-				     unsigned int regno, machine_mode mode)
+s390_hard_regno_call_part_clobbered (unsigned int, unsigned int regno,
+				     machine_mode mode)
 {
   if (!TARGET_64BIT
       && TARGET_ZARCH
Index: gcc/cselib.c
===================================================================
--- gcc/cselib.c	2019-09-10 19:56:45.353177919 +0100
+++ gcc/cselib.c	2019-09-11 19:47:32.894202944 +0100
@@ -32,6 +32,7 @@ Software Foundation; either version 3, o
 #include "dumpfile.h"
 #include "cselib.h"
 #include "params.h"
+#include "function-abi.h"
 
 /* A list of cselib_val structures.  */
 struct elt_list
@@ -2765,11 +2766,12 @@ cselib_process_insn (rtx_insn *insn)
      memory.  */
   if (CALL_P (insn))
     {
+      function_abi abi = call_insn_abi (insn);
       for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
 	if (call_used_or_fixed_reg_p (i)
 	    || (REG_VALUES (i) && REG_VALUES (i)->elt
 		&& (targetm.hard_regno_call_part_clobbered
-		    (insn, i, GET_MODE (REG_VALUES (i)->elt->val_rtx)))))
+		    (abi.id (), i, GET_MODE (REG_VALUES (i)->elt->val_rtx)))))
 	  cselib_invalidate_regno (i, reg_raw_mode[i]);
 
       /* Since it is not clear how cselib is going to be used, be
Index: gcc/function-abi.cc
===================================================================
--- gcc/function-abi.cc	2019-09-11 19:47:24.414262702 +0100
+++ gcc/function-abi.cc	2019-09-11 19:47:32.898202916 +0100
@@ -50,7 +50,7 @@ predefined_function_abi::initialize (uns
 
      If the ABI specifies that part of a hard register R is call-clobbered,
      we should be able to find a single-register mode M for which
-     targetm.hard_regno_call_part_clobbered (NULL, R, M) is true.
+     targetm.hard_regno_call_part_clobbered (m_id, R, M) is true.
      In other words, it shouldn't be the case that R can hold all
      single-register modes across a call, but can't hold part of
      a multi-register mode.
@@ -66,7 +66,7 @@ predefined_function_abi::initialize (uns
       for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
 	if (targetm.hard_regno_mode_ok (regno, mode)
 	    && hard_regno_nregs (regno, mode) == 1
-	    && targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
+	    && targetm.hard_regno_call_part_clobbered (m_id, regno, mode))
 	  SET_HARD_REG_BIT (m_full_and_partial_reg_clobbers, regno);
     }
 
@@ -89,7 +89,7 @@ predefined_function_abi::initialize (uns
       for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
 	if (targetm.hard_regno_mode_ok (regno, mode)
 	    && !overlaps_hard_reg_set_p (m_full_reg_clobbers, mode, regno)
-	    && !targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
+	    && !targetm.hard_regno_call_part_clobbered (m_id, regno, mode))
 	  remove_from_hard_reg_set (&m_mode_clobbers[i], mode, regno);
     }
 
@@ -104,7 +104,7 @@ predefined_function_abi::initialize (uns
 	for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
 	  if (targetm.hard_regno_mode_ok (regno, mode)
 	      && !overlaps_hard_reg_set_p (m_full_reg_clobbers, mode, regno)
-	      && targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
+	      && targetm.hard_regno_call_part_clobbered (m_id, regno, mode))
 	    gcc_assert (overlaps_hard_reg_set_p (all_clobbers, mode, regno)
 			&& overlaps_hard_reg_set_p (m_mode_clobbers[i],
 						    mode, regno));
Index: gcc/ira-conflicts.c
===================================================================
--- gcc/ira-conflicts.c	2019-09-10 19:56:32.569268148 +0100
+++ gcc/ira-conflicts.c	2019-09-11 19:47:32.898202916 +0100
@@ -806,7 +806,7 @@ ira_build_conflicts (void)
 		 regs must conflict with them.  */
 	      for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
 		if (!TEST_HARD_REG_BIT (call_used_or_fixed_regs, regno)
-		    && targetm.hard_regno_call_part_clobbered (NULL, regno,
+		    && targetm.hard_regno_call_part_clobbered (0, regno,
 							       obj_mode))
 		  {
 		    SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno);
Index: gcc/ira-costs.c
===================================================================
--- gcc/ira-costs.c	2019-09-10 19:56:32.569268148 +0100
+++ gcc/ira-costs.c	2019-09-11 19:47:32.898202916 +0100
@@ -2381,7 +2381,7 @@ ira_tune_allocno_costs (void)
 						   *crossed_calls_clobber_regs)
 		  && (ira_hard_reg_set_intersection_p (regno, mode,
 						       call_used_or_fixed_regs)
-		      || targetm.hard_regno_call_part_clobbered (NULL, regno,
+		      || targetm.hard_regno_call_part_clobbered (0, regno,
 								 mode)))
 		cost += (ALLOCNO_CALL_FREQ (a)
 			 * (ira_memory_move_cost[mode][rclass][0]
Index: gcc/lra-constraints.c
===================================================================
--- gcc/lra-constraints.c	2019-09-10 19:56:32.573268120 +0100
+++ gcc/lra-constraints.c	2019-09-11 19:47:32.898202916 +0100
@@ -131,6 +131,7 @@
 #include "lra.h"
 #include "lra-int.h"
 #include "print-rtl.h"
+#include "function-abi.h"
 
 /* Value of LRA_CURR_RELOAD_NUM at the beginning of BB of the current
    insn.  Remember that LRA_CURR_RELOAD_NUM is the number of emitted
@@ -5442,7 +5443,8 @@ need_for_call_save_p (int regno)
 	       : call_used_or_fixed_regs,
 	       PSEUDO_REGNO_MODE (regno), reg_renumber[regno])
 	      || (targetm.hard_regno_call_part_clobbered
-		  (lra_reg_info[regno].call_insn,
+		  (lra_reg_info[regno].call_insn
+		   ? call_insn_abi (lra_reg_info[regno].call_insn).id () : 0,
 		   reg_renumber[regno], PSEUDO_REGNO_MODE (regno)))));
 }
 
Index: gcc/lra-lives.c
===================================================================
--- gcc/lra-lives.c	2019-09-11 19:47:24.418262673 +0100
+++ gcc/lra-lives.c	2019-09-11 19:47:32.898202916 +0100
@@ -594,7 +594,7 @@ check_pseudos_live_through_calls (int re
   if (! sparseset_bit_p (pseudos_live_through_calls, regno))
     return;
 
-  gcc_assert (call_insn && CALL_P (call_insn));
+  function_abi abi = call_insn_abi (call_insn);
   old_call_insn = lra_reg_info[regno].call_insn;
   if (!old_call_insn
       || (targetm.return_call_with_max_clobbers
@@ -606,7 +606,7 @@ check_pseudos_live_through_calls (int re
   lra_reg_info[regno].conflict_hard_regs |= last_call_used_reg_set;
 
   for (hr = 0; HARD_REGISTER_NUM_P (hr); hr++)
-    if (targetm.hard_regno_call_part_clobbered (call_insn, hr,
+    if (targetm.hard_regno_call_part_clobbered (abi.id (), hr,
 						PSEUDO_REGNO_MODE (regno)))
       add_to_hard_reg_set (&lra_reg_info[regno].conflict_hard_regs,
 			   PSEUDO_REGNO_MODE (regno), hr);
Index: gcc/regcprop.c
===================================================================
--- gcc/regcprop.c	2019-09-11 19:47:24.418262673 +0100
+++ gcc/regcprop.c	2019-09-11 19:47:32.898202916 +0100
@@ -1057,7 +1057,7 @@ copyprop_hardreg_forward_1 (basic_block
 	  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
 	    if ((abi.clobbers_full_reg_p (regno)
 		 || (targetm.hard_regno_call_part_clobbered
-		     (insn, regno, vd->e[regno].mode)))
+		     (abi.id (), regno, vd->e[regno].mode)))
 		&& (regno < set_regno || regno >= set_regno + set_nregs))
 	      kill_value_regno (regno, 1, vd);
 
Index: gcc/reginfo.c
===================================================================
--- gcc/reginfo.c	2019-09-11 19:47:07.490381964 +0100
+++ gcc/reginfo.c	2019-09-11 19:47:32.898202916 +0100
@@ -568,7 +568,7 @@ choose_hard_reg_mode (unsigned int regno
     if (hard_regno_nregs (regno, mode) == nregs
 	&& targetm.hard_regno_mode_ok (regno, mode)
 	&& (!call_saved
-	    || !targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
+	    || !targetm.hard_regno_call_part_clobbered (0, regno, mode))
 	&& maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode)))
       found_mode = mode;
 
@@ -576,7 +576,7 @@ choose_hard_reg_mode (unsigned int regno
     if (hard_regno_nregs (regno, mode) == nregs
 	&& targetm.hard_regno_mode_ok (regno, mode)
 	&& (!call_saved
-	    || !targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
+	    || !targetm.hard_regno_call_part_clobbered (0, regno, mode))
 	&& maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode)))
       found_mode = mode;
 
@@ -584,7 +584,7 @@ choose_hard_reg_mode (unsigned int regno
     if (hard_regno_nregs (regno, mode) == nregs
 	&& targetm.hard_regno_mode_ok (regno, mode)
 	&& (!call_saved
-	    || !targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
+	    || !targetm.hard_regno_call_part_clobbered (0, regno, mode))
 	&& maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode)))
       found_mode = mode;
 
@@ -592,7 +592,7 @@ choose_hard_reg_mode (unsigned int regno
     if (hard_regno_nregs (regno, mode) == nregs
 	&& targetm.hard_regno_mode_ok (regno, mode)
 	&& (!call_saved
-	    || !targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
+	    || !targetm.hard_regno_call_part_clobbered (0, regno, mode))
 	&& maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode)))
       found_mode = mode;
 
@@ -606,7 +606,7 @@ choose_hard_reg_mode (unsigned int regno
       if (hard_regno_nregs (regno, mode) == nregs
 	  && targetm.hard_regno_mode_ok (regno, mode)
 	  && (!call_saved
-	      || !targetm.hard_regno_call_part_clobbered (NULL, regno, mode)))
+	      || !targetm.hard_regno_call_part_clobbered (0, regno, mode)))
 	return mode;
     }
 
Index: gcc/regrename.c
===================================================================
--- gcc/regrename.c	2019-09-10 19:56:45.357177891 +0100
+++ gcc/regrename.c	2019-09-11 19:47:32.898202916 +0100
@@ -339,9 +339,9 @@ check_new_reg_p (int reg ATTRIBUTE_UNUSE
 	 && ! DEBUG_INSN_P (tmp->insn))
 	|| (this_head->need_caller_save_reg
 	    && ! (targetm.hard_regno_call_part_clobbered
-		  (NULL, reg, GET_MODE (*tmp->loc)))
+		  (0, reg, GET_MODE (*tmp->loc)))
 	    && (targetm.hard_regno_call_part_clobbered
-		(NULL, new_reg, GET_MODE (*tmp->loc)))))
+		(0, new_reg, GET_MODE (*tmp->loc)))))
       return false;
 
   return true;
Index: gcc/reload.c
===================================================================
--- gcc/reload.c	2019-09-10 19:56:45.357177891 +0100
+++ gcc/reload.c	2019-09-11 19:47:32.902202887 +0100
@@ -6912,14 +6912,14 @@ find_equiv_reg (rtx goal, rtx_insn *insn
 	  if (regno >= 0 && regno < FIRST_PSEUDO_REGISTER)
 	    for (i = 0; i < nregs; ++i)
 	      if (call_used_or_fixed_reg_p (regno + i)
-		  || targetm.hard_regno_call_part_clobbered (NULL, regno + i,
+		  || targetm.hard_regno_call_part_clobbered (0, regno + i,
 							     mode))
 		return 0;
 
 	  if (valueno >= 0 && valueno < FIRST_PSEUDO_REGISTER)
 	    for (i = 0; i < valuenregs; ++i)
 	      if (call_used_or_fixed_reg_p (valueno + i)
-		  || targetm.hard_regno_call_part_clobbered (NULL, valueno + i,
+		  || targetm.hard_regno_call_part_clobbered (0, valueno + i,
 							     mode))
 		return 0;
 	}
Index: gcc/reload1.c
===================================================================
--- gcc/reload1.c	2019-09-10 19:56:45.357177891 +0100
+++ gcc/reload1.c	2019-09-11 19:47:32.902202887 +0100
@@ -8193,8 +8193,7 @@ emit_reload_insns (class insn_chain *cha
 			   : out_regno + k);
 		      reg_reloaded_insn[regno + k] = insn;
 		      SET_HARD_REG_BIT (reg_reloaded_valid, regno + k);
-		      if (targetm.hard_regno_call_part_clobbered (NULL,
-								  regno + k,
+		      if (targetm.hard_regno_call_part_clobbered (0, regno + k,
 								  mode))
 			SET_HARD_REG_BIT (reg_reloaded_call_part_clobbered,
 					  regno + k);
@@ -8274,8 +8273,7 @@ emit_reload_insns (class insn_chain *cha
 			   : in_regno + k);
 		      reg_reloaded_insn[regno + k] = insn;
 		      SET_HARD_REG_BIT (reg_reloaded_valid, regno + k);
-		      if (targetm.hard_regno_call_part_clobbered (NULL,
-								  regno + k,
+		      if (targetm.hard_regno_call_part_clobbered (0, regno + k,
 								  mode))
 			SET_HARD_REG_BIT (reg_reloaded_call_part_clobbered,
 					  regno + k);
@@ -8391,7 +8389,7 @@ emit_reload_insns (class insn_chain *cha
 		      CLEAR_HARD_REG_BIT (reg_reloaded_dead, src_regno + k);
 		      SET_HARD_REG_BIT (reg_reloaded_valid, src_regno + k);
 		      if (targetm.hard_regno_call_part_clobbered
-			  (NULL, src_regno + k, mode))
+			  (0, src_regno + k, mode))
 			SET_HARD_REG_BIT (reg_reloaded_call_part_clobbered,
 					  src_regno + k);
 		      else
Index: gcc/sched-deps.c
===================================================================
--- gcc/sched-deps.c	2019-09-09 19:01:40.371078272 +0100
+++ gcc/sched-deps.c	2019-09-11 19:47:32.902202887 +0100
@@ -38,6 +38,7 @@ Software Foundation; either version 3, o
 #include "sched-int.h"
 #include "params.h"
 #include "cselib.h"
+#include "function-abi.h"
 
 #ifdef INSN_SCHEDULING
 
@@ -3723,6 +3724,7 @@ deps_analyze_insn (class deps_desc *deps
         }
       else
         {
+	  function_abi abi = call_insn_abi (insn);
           for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
             /* A call may read and modify global register variables.  */
             if (global_regs[i])
@@ -3734,7 +3736,7 @@ deps_analyze_insn (class deps_desc *deps
              Since we only have a choice between 'might be clobbered'
              and 'definitely not clobbered', we must include all
              partly call-clobbered registers here.  */
-	    else if (targetm.hard_regno_call_part_clobbered (insn, i,
+	    else if (targetm.hard_regno_call_part_clobbered (abi.id (), i,
 							     reg_raw_mode[i])
                      || TEST_HARD_REG_BIT (regs_invalidated_by_call, i))
               SET_REGNO_REG_SET (reg_pending_clobbers, i);
Index: gcc/sel-sched.c
===================================================================
--- gcc/sel-sched.c	2019-09-10 19:56:45.357177891 +0100
+++ gcc/sel-sched.c	2019-09-11 19:47:32.902202887 +0100
@@ -1102,7 +1102,7 @@ init_regs_for_mode (machine_mode mode)
       if (i >= 0)
         continue;
 
-      if (targetm.hard_regno_call_part_clobbered (NULL, cur_reg, mode))
+      if (targetm.hard_regno_call_part_clobbered (0, cur_reg, mode))
         SET_HARD_REG_BIT (sel_hrd.regs_for_call_clobbered[mode],
                           cur_reg);
 
@@ -1247,7 +1247,7 @@ mark_unavailable_hard_regs (def_t def, s
 
   /* Exclude registers that are partially call clobbered.  */
   if (def->crosses_call
-      && !targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
+      && !targetm.hard_regno_call_part_clobbered (0, regno, mode))
     reg_rename_p->available_for_renaming
       &= ~sel_hrd.regs_for_call_clobbered[mode];
 
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c	2019-09-11 19:47:24.422262645 +0100
+++ gcc/targhooks.c	2019-09-11 19:47:32.906202859 +0100
@@ -1928,7 +1928,7 @@ default_dwarf_frame_reg_mode (int regno)
 {
   machine_mode save_mode = reg_raw_mode[regno];
 
-  if (targetm.hard_regno_call_part_clobbered (NULL, regno, save_mode))
+  if (targetm.hard_regno_call_part_clobbered (0, regno, save_mode))
     save_mode = choose_hard_reg_mode (regno, 1, true);
   return save_mode;
 }

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [06/32] Pass an ABI to choose_hard_reg_mode
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (4 preceding siblings ...)
  2019-09-11 19:05 ` [04/32] [x86] Robustify vzeroupper handling across calls Richard Sandiford
@ 2019-09-11 19:06 ` Richard Sandiford
  2019-09-29 21:00   ` Jeff Law
  2019-09-11 19:07 ` [08/32] Remove global call sets: cfgcleanup.c Richard Sandiford
                   ` (28 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:06 UTC (permalink / raw)
  To: gcc-patches

choose_hard_reg_mode previously took a boolean saying whether the
mode needed to be call-preserved.  This patch replaces the boolean
with an optional ABI pointer, so that the function can test whether
a value is call-saved under that specific ABI.

default_dwarf_frame_reg_mode uses eh_edge_abi because that's the
ABI that matters for unwinding.  Targets need to override the hook
if they want something different.
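As a rough self-contained sketch of the new interface (with mock types
standing in for GCC's machine_mode and predefined_function_abi, which
the real patch defines in function-abi.h):

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Mock stand-ins for GCC's machine_mode and predefined_function_abi;
// the real class is far richer.
enum machine_mode { VOIDmode, SImode, DImode };

struct predefined_function_abi
{
  // Bit N set means calls using this ABI clobber the DImode-sized
  // value in register N (a crude model of partial clobbers).
  uint64_t clobbers;

  bool clobbers_reg_p (machine_mode mode, unsigned int regno) const
  {
    return mode == DImode && ((clobbers >> regno) & 1);
  }
};

// New-style interface: a null ABI pointer means there is no
// call-preservation requirement, replacing the old boolean CALL_SAVED.
machine_mode
choose_hard_reg_mode (unsigned int regno, const predefined_function_abi *abi)
{
  // Prefer the widest mode that the ABI (if any) preserves across calls.
  for (machine_mode mode : { DImode, SImode })
    if (!abi || !abi->clobbers_reg_p (mode, regno))
      return mode;
  return VOIDmode;
}
```

Callers that previously passed "false" now pass NULL, and callers that
previously passed "true" pass the ABI they care about (eh_edge_abi for
the unwinder).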


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* rtl.h (predefined_function_abi): Declare.
	(choose_hard_reg_mode): Take a pointer to a predefined_function_abi
	instead of a boolean call_save flag.
	* config/gcn/gcn.c (gcn_hard_regno_caller_save_mode): Update call
	accordingly.
	* config/i386/i386.h (HARD_REGNO_CALLER_SAVE_MODE): Likewise.
	* config/ia64/ia64.h (HARD_REGNO_CALLER_SAVE_MODE): Likewise.
	* config/mips/mips.c (mips_hard_regno_caller_save_mode): Likewise.
	* config/msp430/msp430.h (HARD_REGNO_CALLER_SAVE_MODE): Likewise.
	* config/rs6000/rs6000.h (HARD_REGNO_CALLER_SAVE_MODE): Likewise.
	* config/sh/sh.c (sh_hard_regno_caller_save_mode): Likewise.
	* reginfo.c (init_reg_modes_target): Likewise.
	(choose_hard_reg_mode): Take a pointer to a predefined_function_abi
	instead of a boolean call_save flag.
	* targhooks.c: Include function-abi.h.
	(default_dwarf_frame_reg_mode): Update call to choose_hard_reg_mode,
	using eh_edge_abi to choose the mode.

Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h	2019-09-11 19:47:24.418262673 +0100
+++ gcc/rtl.h	2019-09-11 19:47:39.478156547 +0100
@@ -36,6 +36,8 @@ #define GCC_RTL_H
 
 #include "hard-reg-set.h"
 
+class predefined_function_abi;
+
 /* Value used by some passes to "recognize" noop moves as valid
  instructions.  */
 #define NOOP_MOVE_INSN_CODE	INT_MAX
@@ -3383,7 +3385,8 @@ extern bool val_signbit_known_clear_p (m
 				       unsigned HOST_WIDE_INT);
 
 /* In reginfo.c  */
-extern machine_mode choose_hard_reg_mode (unsigned int, unsigned int, bool);
+extern machine_mode choose_hard_reg_mode (unsigned int, unsigned int,
+					  const predefined_function_abi *);
 extern const HARD_REG_SET &simplifiable_subregs (const subreg_shape &);
 
 /* In emit-rtl.c  */
Index: gcc/config/gcn/gcn.c
===================================================================
--- gcc/config/gcn/gcn.c	2019-09-10 19:56:45.333178060 +0100
+++ gcc/config/gcn/gcn.c	2019-09-11 19:47:39.466156632 +0100
@@ -3017,7 +3017,7 @@ gcn_hard_regno_rename_ok (unsigned int f
 gcn_hard_regno_caller_save_mode (unsigned int regno, unsigned int nregs,
 				 machine_mode regmode)
 {
-  machine_mode result = choose_hard_reg_mode (regno, nregs, false);
+  machine_mode result = choose_hard_reg_mode (regno, nregs, NULL);
 
   if (VECTOR_MODE_P (result) && !VECTOR_MODE_P (regmode))
     result = (nregs == 1 ? SImode : DImode);
Index: gcc/config/i386/i386.h
===================================================================
--- gcc/config/i386/i386.h	2019-08-27 07:24:49.455527415 +0100
+++ gcc/config/i386/i386.h	2019-09-11 19:47:39.466156632 +0100
@@ -1256,7 +1256,7 @@ #define AVOID_CCMODE_COPIES
 #define HARD_REGNO_CALLER_SAVE_MODE(REGNO, NREGS, MODE)			\
   (CC_REGNO_P (REGNO) ? VOIDmode					\
    : (MODE) == VOIDmode && (NREGS) != 1 ? VOIDmode			\
-   : (MODE) == VOIDmode ? choose_hard_reg_mode ((REGNO), (NREGS), false) \
+   : (MODE) == VOIDmode ? choose_hard_reg_mode ((REGNO), (NREGS), NULL)	\
    : (MODE) == HImode && !((GENERAL_REGNO_P (REGNO)			\
 			    && TARGET_PARTIAL_REG_STALL)		\
 			   || MASK_REGNO_P (REGNO)) ? SImode		\
Index: gcc/config/ia64/ia64.h
===================================================================
--- gcc/config/ia64/ia64.h	2019-09-10 19:57:04.693041422 +0100
+++ gcc/config/ia64/ia64.h	2019-09-11 19:47:39.466156632 +0100
@@ -562,7 +562,7 @@ #define REG_ALLOC_ORDER							   \
 
 #define HARD_REGNO_CALLER_SAVE_MODE(REGNO, NREGS, MODE) \
   ((FR_REGNO_P (REGNO) && (NREGS) == 1) ? RFmode        \
-   : choose_hard_reg_mode ((REGNO), (NREGS), false))
+   : choose_hard_reg_mode ((REGNO), (NREGS), NULL))
 \f
 /* Handling Leaf Functions */
 
Index: gcc/config/mips/mips.c
===================================================================
--- gcc/config/mips/mips.c	2019-09-11 19:47:32.874203085 +0100
+++ gcc/config/mips/mips.c	2019-09-11 19:47:39.470156604 +0100
@@ -22174,7 +22174,7 @@ mips_hard_regno_caller_save_mode (unsign
   /* For performance, avoid saving/restoring upper parts of a register
      by returning MODE as save mode when the mode is known.  */
   if (mode == VOIDmode)
-    return choose_hard_reg_mode (regno, nregs, false);
+    return choose_hard_reg_mode (regno, nregs, NULL);
   else
     return mode;
 }
Index: gcc/config/msp430/msp430.h
===================================================================
--- gcc/config/msp430/msp430.h	2019-08-15 15:26:04.800237365 +0100
+++ gcc/config/msp430/msp430.h	2019-09-11 19:47:39.474156575 +0100
@@ -467,7 +467,7 @@ #define DWARF2_ASM_LINE_DEBUG_INFO		1
    when spilling hard registers when they may contain PSImode values.  */
 #define HARD_REGNO_CALLER_SAVE_MODE(REGNO,NREGS,MODE) \
   ((TARGET_LARGE && ((NREGS) <= 2)) ? PSImode \
-   : choose_hard_reg_mode ((REGNO), (NREGS), false))
+   : choose_hard_reg_mode ((REGNO), (NREGS), NULL))
 
 #define ACCUMULATE_OUTGOING_ARGS 1
 
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	2019-09-10 19:57:04.701041365 +0100
+++ gcc/config/rs6000/rs6000.h	2019-09-11 19:47:39.474156575 +0100
@@ -1038,7 +1038,7 @@ #define HARD_REGNO_CALLER_SAVE_MODE(REGN
    ? DFmode								\
    : (MODE) == TDmode && FP_REGNO_P (REGNO)				\
    ? DImode								\
-   : choose_hard_reg_mode ((REGNO), (NREGS), false))
+   : choose_hard_reg_mode ((REGNO), (NREGS), NULL))
 
 #define VSX_VECTOR_MODE(MODE)		\
 	 ((MODE) == V4SFmode		\
Index: gcc/config/sh/sh.c
===================================================================
--- gcc/config/sh/sh.c	2019-09-10 19:57:04.705041337 +0100
+++ gcc/config/sh/sh.c	2019-09-11 19:47:39.474156575 +0100
@@ -10634,7 +10634,7 @@ sh_hard_regno_caller_save_mode (unsigned
 	      && ((regno - FIRST_FP_REG) & 1) == 0)))
     return mode;
 
-  return choose_hard_reg_mode (regno, nregs, false);
+  return choose_hard_reg_mode (regno, nregs, NULL);
 }
 
 /* Implement TARGET_CAN_CHANGE_MODE_CLASS.  */
Index: gcc/reginfo.c
===================================================================
--- gcc/reginfo.c	2019-09-11 19:47:32.898202916 +0100
+++ gcc/reginfo.c	2019-09-11 19:47:39.474156575 +0100
@@ -442,7 +442,7 @@ init_reg_modes_target (void)
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     {
-      reg_raw_mode[i] = choose_hard_reg_mode (i, 1, false);
+      reg_raw_mode[i] = choose_hard_reg_mode (i, 1, NULL);
 
       /* If we couldn't find a valid mode, just use the previous mode
 	 if it is suitable, otherwise fall back on word_mode.  */
@@ -550,10 +550,11 @@ memory_move_secondary_cost (machine_mode
 
 /* Return a machine mode that is legitimate for hard reg REGNO and large
    enough to save nregs.  If we can't find one, return VOIDmode.
-   If CALL_SAVED is true, only consider modes that are call saved.  */
+   If ABI is nonnull, only consider modes that are preserved across
+   calls that use ABI.  */
 machine_mode
 choose_hard_reg_mode (unsigned int regno ATTRIBUTE_UNUSED,
-		      unsigned int nregs, bool call_saved)
+		      unsigned int nregs, const predefined_function_abi *abi)
 {
   unsigned int /* machine_mode */ m;
   machine_mode found_mode = VOIDmode, mode;
@@ -567,32 +568,28 @@ choose_hard_reg_mode (unsigned int regno
   FOR_EACH_MODE_IN_CLASS (mode, MODE_INT)
     if (hard_regno_nregs (regno, mode) == nregs
 	&& targetm.hard_regno_mode_ok (regno, mode)
-	&& (!call_saved
-	    || !targetm.hard_regno_call_part_clobbered (0, regno, mode))
+	&& (!abi || !abi->clobbers_reg_p (mode, regno))
 	&& maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode)))
       found_mode = mode;
 
   FOR_EACH_MODE_IN_CLASS (mode, MODE_FLOAT)
     if (hard_regno_nregs (regno, mode) == nregs
 	&& targetm.hard_regno_mode_ok (regno, mode)
-	&& (!call_saved
-	    || !targetm.hard_regno_call_part_clobbered (0, regno, mode))
+	&& (!abi || !abi->clobbers_reg_p (mode, regno))
 	&& maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode)))
       found_mode = mode;
 
   FOR_EACH_MODE_IN_CLASS (mode, MODE_VECTOR_FLOAT)
     if (hard_regno_nregs (regno, mode) == nregs
 	&& targetm.hard_regno_mode_ok (regno, mode)
-	&& (!call_saved
-	    || !targetm.hard_regno_call_part_clobbered (0, regno, mode))
+	&& (!abi || !abi->clobbers_reg_p (mode, regno))
 	&& maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode)))
       found_mode = mode;
 
   FOR_EACH_MODE_IN_CLASS (mode, MODE_VECTOR_INT)
     if (hard_regno_nregs (regno, mode) == nregs
 	&& targetm.hard_regno_mode_ok (regno, mode)
-	&& (!call_saved
-	    || !targetm.hard_regno_call_part_clobbered (0, regno, mode))
+	&& (!abi || !abi->clobbers_reg_p (mode, regno))
 	&& maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode)))
       found_mode = mode;
 
@@ -605,8 +602,7 @@ choose_hard_reg_mode (unsigned int regno
       mode = (machine_mode) m;
       if (hard_regno_nregs (regno, mode) == nregs
 	  && targetm.hard_regno_mode_ok (regno, mode)
-	  && (!call_saved
-	      || !targetm.hard_regno_call_part_clobbered (0, regno, mode)))
+	  && (!abi || !abi->clobbers_reg_p (mode, regno)))
 	return mode;
     }
 
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c	2019-09-11 19:47:32.906202859 +0100
+++ gcc/targhooks.c	2019-09-11 19:47:39.478156547 +0100
@@ -83,6 +83,7 @@ Software Foundation; either version 3, o
 #include "real.h"
 #include "langhooks.h"
 #include "sbitmap.h"
+#include "function-abi.h"
 
 bool
 default_legitimate_address_p (machine_mode mode ATTRIBUTE_UNUSED,
@@ -1928,8 +1929,9 @@ default_dwarf_frame_reg_mode (int regno)
 {
   machine_mode save_mode = reg_raw_mode[regno];
 
-  if (targetm.hard_regno_call_part_clobbered (0, regno, save_mode))
-    save_mode = choose_hard_reg_mode (regno, 1, true);
+  if (targetm.hard_regno_call_part_clobbered (eh_edge_abi.id (),
+					      regno, save_mode))
+    save_mode = choose_hard_reg_mode (regno, 1, &eh_edge_abi);
   return save_mode;
 }
 


* [08/32] Remove global call sets: cfgcleanup.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (5 preceding siblings ...)
  2019-09-11 19:06 ` [06/32] Pass an ABI to choose_hard_reg_mode Richard Sandiford
@ 2019-09-11 19:07 ` Richard Sandiford
  2019-09-29 21:02   ` Jeff Law
  2019-09-11 19:07 ` [07/32] Remove global call sets: caller-save.c Richard Sandiford
                   ` (27 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:07 UTC (permalink / raw)
  To: gcc-patches

old_insns_match_p just tests whether two instructions are
similar enough to merge.  With call_insn_abi it makes more
sense to compare the ABIs directly.
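A hypothetical sketch of the simplification (mock function_abi type;
the real class provides its own equality semantics):

```cpp
#include <cassert>
#include <cstdint>

// Mock function_abi: two calls are interchangeable for cross-jumping
// purposes only if their ABI identity (and hence clobber sets) match.
struct function_abi
{
  unsigned int id;
  uint64_t full_reg_clobbers;

  bool operator== (const function_abi &other) const
  { return id == other.id; }
  bool operator!= (const function_abi &other) const
  { return !operator== (other); }
};

// Sketch of the new check in old_insns_match_p: instead of deriving
// per-call clobber sets and comparing those, compare the ABIs directly.
bool
call_insns_abi_match_p (const function_abi &abi1, const function_abi &abi2)
{
  return abi1 == abi2;
}
```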


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* cfgcleanup.c (old_insns_match_p): Compare the ABIs of calls
	instead of the call-clobbered sets.

Index: gcc/cfgcleanup.c
===================================================================
--- gcc/cfgcleanup.c	2019-09-11 19:47:24.402262786 +0100
+++ gcc/cfgcleanup.c	2019-09-11 19:47:50.610078102 +0100
@@ -1227,13 +1227,7 @@ old_insns_match_p (int mode ATTRIBUTE_UN
 	    }
 	}
 
-      HARD_REG_SET i1_used = call_insn_abi (i1).full_reg_clobbers ();
-      HARD_REG_SET i2_used = call_insn_abi (i2).full_reg_clobbers ();
-      /* ??? This preserves traditional behavior; it might not be needed.  */
-      i1_used |= fixed_reg_set;
-      i2_used |= fixed_reg_set;
-
-      if (i1_used != i2_used)
+      if (call_insn_abi (i1) != call_insn_abi (i2))
         return dir_none;
     }
 


* [07/32] Remove global call sets: caller-save.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (6 preceding siblings ...)
  2019-09-11 19:07 ` [08/32] Remove global call sets: cfgcleanup.c Richard Sandiford
@ 2019-09-11 19:07 ` Richard Sandiford
  2019-09-29 21:01   ` Jeff Law
  2019-09-11 19:08 ` [10/32] Remove global call sets: combine.c Richard Sandiford
                   ` (26 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:07 UTC (permalink / raw)
  To: gcc-patches

All caller-save.c uses of "|= fixed_reg_set" added in a previous patch
were redundant, since the sets are later ANDed with ~fixed_reg_set.
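The redundancy can be seen with plain bitmasks standing in for
HARD_REG_SET:

```cpp
#include <cassert>
#include <cstdint>

// ORing in FIXED before ANDing with ~FIXED cannot change the result,
// which is why the "|= fixed_reg_set" statements were redundant.
uint64_t
mask_with_redundant_or (uint64_t used, uint64_t fixed)
{
  used |= fixed;        /* the removed "used_regs |= fixed_reg_set;" */
  return used & ~fixed; /* the later "&= ~fixed_reg_set" masking */
}

uint64_t
mask_without_or (uint64_t used, uint64_t fixed)
{
  return used & ~fixed;
}
```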


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* caller-save.c (setup_save_areas): Remove redundant |s of
	fixed_reg_set.
	(save_call_clobbered_regs): Likewise.  Use the call ABI rather
	than call_used_or_fixed_regs to decide whether a REG_RETURNED
	value is useful.

Index: gcc/caller-save.c
===================================================================
--- gcc/caller-save.c	2019-09-11 19:47:24.402262786 +0100
+++ gcc/caller-save.c	2019-09-11 19:47:45.710112631 +0100
@@ -428,8 +428,6 @@ setup_save_areas (void)
       REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 			       &chain->live_throughout);
       used_regs = call_insn_abi (insn).full_reg_clobbers ();
-      /* ??? This preserves traditional behavior; it might not be needed.  */
-      used_regs |= fixed_reg_set;
 
       /* Record all registers set in this call insn.  These don't
 	 need to be saved.  N.B. the call insn might set a subreg
@@ -513,9 +511,6 @@ setup_save_areas (void)
 	  REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 				   &chain->live_throughout);
 	  used_regs = call_insn_abi (insn).full_reg_clobbers ();
-	  /* ??? This preserves traditional behavior; it might not
-	     be needed.  */
-	  used_regs |= fixed_reg_set;
 
 	  /* Record all registers set in this call insn.  These don't
 	     need to be saved.  N.B. the call insn might set a subreg
@@ -793,7 +788,6 @@ save_call_clobbered_regs (void)
 	    {
 	      unsigned regno;
 	      HARD_REG_SET hard_regs_to_save;
-	      HARD_REG_SET call_def_reg_set;
 	      reg_set_iterator rsi;
 	      rtx cheap;
 
@@ -840,15 +834,12 @@ save_call_clobbered_regs (void)
 	      note_stores (insn, mark_set_regs, &this_insn_sets);
 
 	      /* Compute which hard regs must be saved before this call.  */
+	      function_abi abi = call_insn_abi (insn);
 	      hard_regs_to_save &= ~(fixed_reg_set
 				     | this_insn_sets
 				     | hard_regs_saved);
 	      hard_regs_to_save &= savable_regs;
-	      call_def_reg_set = call_insn_abi (insn).full_reg_clobbers ();
-	      /* ??? This preserves traditional behavior; it might not
-		 be needed.  */
-	      call_def_reg_set |= fixed_reg_set;
-	      hard_regs_to_save &= call_def_reg_set;
+	      hard_regs_to_save &= abi.full_reg_clobbers ();
 
 	      for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
 		if (TEST_HARD_REG_BIT (hard_regs_to_save, regno))
@@ -863,8 +854,7 @@ save_call_clobbered_regs (void)
 	      
 	      if (cheap
 		  && HARD_REGISTER_P (cheap)
-		  && TEST_HARD_REG_BIT (call_used_or_fixed_regs,
-					REGNO (cheap)))
+		  && abi.clobbers_reg_p (GET_MODE (cheap), REGNO (cheap)))
 		{
 		  rtx dest, newpat;
 		  rtx pat = PATTERN (insn);


* [10/32] Remove global call sets: combine.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (7 preceding siblings ...)
  2019-09-11 19:07 ` [07/32] Remove global call sets: caller-save.c Richard Sandiford
@ 2019-09-11 19:08 ` Richard Sandiford
  2019-09-12  2:18   ` Segher Boessenkool
  2019-09-11 19:08 ` [09/32] Remove global call sets: cfgloopanal.c Richard Sandiford
                   ` (25 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:08 UTC (permalink / raw)
  To: gcc-patches

There shouldn't be many cases in which a useful hard register is
live across a call before RA, so we might as well keep things simple
and invalidate partially-clobbered registers here, in case the values
they hold leak into the call-clobbered part.  In principle this is
a bug fix for TARGET_HARD_REGNO_CALL_PART_CLOBBERED targets,
but in practice it probably doesn't make a difference.


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* combine.c: Include function-abi.h.
	(record_dead_and_set_regs): Use call_insn_abi to get the ABI
	of call insns.  Invalidate partially-clobbered registers as
	well as fully-clobbered ones.

Index: gcc/combine.c
===================================================================
--- gcc/combine.c	2019-09-09 18:58:51.448270881 +0100
+++ gcc/combine.c	2019-09-11 19:47:57.062032638 +0100
@@ -105,6 +105,7 @@ Software Foundation; either version 3, o
 #include "valtrack.h"
 #include "rtl-iter.h"
 #include "print-rtl.h"
+#include "function-abi.h"
 
 /* Number of attempts to combine instructions in this function.  */
 
@@ -13464,11 +13465,21 @@ record_dead_and_set_regs (rtx_insn *insn
 
   if (CALL_P (insn))
     {
+      function_abi abi = call_insn_abi (insn);
       hard_reg_set_iterator hrsi;
-      EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, i, hrsi)
+      EXECUTE_IF_SET_IN_HARD_REG_SET (abi.full_and_partial_reg_clobbers (),
+				      0, i, hrsi)
 	{
 	  reg_stat_type *rsp;
 
+	  /* ??? We could try to preserve some information from the last
+	     set of register I if the call doesn't actually clobber
+	     (reg:last_set_mode I), which might be true for ABIs with
+	     partial clobbers.  However, it would be difficult to
+	     update last_set_nonzero_bits and last_sign_bit_copies
+	     to account for the part of I that actually was clobbered.
+	     It wouldn't help much anyway, since we rarely see this
+	     situation before RA.  */
 	  rsp = &reg_stat[i];
 	  rsp->last_set_invalid = 1;
 	  rsp->last_set = insn;


* [09/32] Remove global call sets: cfgloopanal.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (8 preceding siblings ...)
  2019-09-11 19:08 ` [10/32] Remove global call sets: combine.c Richard Sandiford
@ 2019-09-11 19:08 ` Richard Sandiford
  2019-09-29 21:02   ` Jeff Law
  2019-09-11 19:09 ` [12/32] Remove global call sets: cselib.c Richard Sandiford
                   ` (24 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:08 UTC (permalink / raw)
  To: gcc-patches

...or rather, make the use of the default ABI explicit.  That seems
OK if not ideal for this heuristic.

In practical terms, the code patched here is counting GENERAL_REGS,
which are treated in the same way by all concurrent ABI variants
on AArch64.  It might give bad results if used for interrupt
handlers though.


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* cfgloopanal.c: Include regs.h and function-abi.h.
	(init_set_costs): Use default_function_abi to test whether
	a general register is call-clobbered.

Index: gcc/cfgloopanal.c
===================================================================
--- gcc/cfgloopanal.c	2019-09-10 19:56:45.313178201 +0100
+++ gcc/cfgloopanal.c	2019-09-11 19:47:53.946054595 +0100
@@ -32,6 +32,8 @@ Software Foundation; either version 3, o
 #include "graphds.h"
 #include "params.h"
 #include "sreal.h"
+#include "regs.h"
+#include "function-abi.h"
 
 struct target_cfgloop default_target_cfgloop;
 #if SWITCHABLE_TARGET
@@ -353,7 +355,10 @@ init_set_costs (void)
 	&& !fixed_regs[i])
       {
 	target_avail_regs++;
-	if (call_used_or_fixed_reg_p (i))
+	/* ??? This is only a rough heuristic.  It doesn't cope well
+	   with alternative ABIs, but that's an optimization rather than
+	   correctness issue.  */
+	if (default_function_abi.clobbers_full_reg_p (i))
 	  target_clobbered_regs++;
       }
 


* [12/32] Remove global call sets: cselib.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (9 preceding siblings ...)
  2019-09-11 19:08 ` [09/32] Remove global call sets: cfgloopanal.c Richard Sandiford
@ 2019-09-11 19:09 ` Richard Sandiford
  2019-09-29 21:05   ` Jeff Law
  2019-09-11 19:09 ` [11/32] Remove global call sets: cse.c Richard Sandiford
                   ` (23 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:09 UTC (permalink / raw)
  To: gcc-patches

cselib_invalidate_regno is a no-op if REG_VALUES (i) is null,
so we can check that first.  Then, if we know what mode the register
currently has, we can check whether it's clobbered in that mode.

Using GET_MODE (values->elt->val_rtx) to get the mode of the last
set is taken from cselib_reg_set_mode.
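A minimal sketch of the decision, with a mock ABI loosely modelled on
the AArch64 vector PCS (calls preserve the low 64 bits of an FP
register but clobber the upper bits, so a DImode value survives while
a full-vector value does not):

```cpp
#include <cassert>

enum machine_mode { VOIDmode, DImode, V4SImode };

// Mock ABI: only full-vector values are clobbered; every register is
// at least partly clobbered, so unknown modes must be conservative.
struct function_abi
{
  bool clobbers_reg_p (machine_mode mode, unsigned int) const
  { return mode == V4SImode; }

  bool clobbers_at_least_part_of_reg_p (unsigned int) const
  { return true; }
};

// SET_MODE is the mode of the register's last set, or VOIDmode if
// unknown; unknown modes fall back to the worst-case check.
bool
must_invalidate_p (const function_abi &abi, unsigned int regno,
		   machine_mode set_mode)
{
  if (set_mode != VOIDmode)
    return abi.clobbers_reg_p (set_mode, regno);
  return abi.clobbers_at_least_part_of_reg_p (regno);
}
```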


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* cselib.c (cselib_process_insn): If we know what mode a
	register was set in, check whether it is clobbered in that
	mode by a call.  Only fall back to reg_raw_mode if that fails.

Index: gcc/cselib.c
===================================================================
--- gcc/cselib.c	2019-09-11 19:47:32.894202944 +0100
+++ gcc/cselib.c	2019-09-11 19:48:04.229982128 +0100
@@ -2768,11 +2768,23 @@ cselib_process_insn (rtx_insn *insn)
     {
       function_abi abi = call_insn_abi (insn);
       for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-	if (call_used_or_fixed_reg_p (i)
-	    || (REG_VALUES (i) && REG_VALUES (i)->elt
-		&& (targetm.hard_regno_call_part_clobbered
-		    (abi.id (), i, GET_MODE (REG_VALUES (i)->elt->val_rtx)))))
-	  cselib_invalidate_regno (i, reg_raw_mode[i]);
+	if (elt_list *values = REG_VALUES (i))
+	  {
+	    /* If we know what mode the value was set in, check whether
+	       it is still available after the call in that mode.  If we
+	       don't know the mode, we have to check for the worst-case
+	       scenario instead.  */
+	    if (values->elt)
+	      {
+		if (abi.clobbers_reg_p (GET_MODE (values->elt->val_rtx), i))
+		  cselib_invalidate_regno (i, GET_MODE (values->elt->val_rtx));
+	      }
+	    else
+	      {
+		if (abi.clobbers_at_least_part_of_reg_p (i))
+		  cselib_invalidate_regno (i, reg_raw_mode[i]);
+	      }
+	  }
 
       /* Since it is not clear how cselib is going to be used, be
 	 conservative here and treat looping pure or const functions


* [11/32] Remove global call sets: cse.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (10 preceding siblings ...)
  2019-09-11 19:09 ` [12/32] Remove global call sets: cselib.c Richard Sandiford
@ 2019-09-11 19:09 ` Richard Sandiford
  2019-09-25 15:57   ` Richard Sandiford
  2019-09-11 19:10 ` [13/32] Remove global call sets: DF (EH edges) Richard Sandiford
                   ` (22 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:09 UTC (permalink / raw)
  To: gcc-patches

Like with the combine.c patch, this one keeps things simple by
invalidating values in partially-clobbered registers, rather than
trying to tell whether the value in a partially-clobbered register
is actually clobbered or not.  Again, this is in principle a bug fix,
but probably never matters in practice.


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* cse.c: Include regs.h and function-abi.h.
	(invalidate_for_call): Take the call insn as an argument.
	Use call_insn_abi to get the ABI of the call and invalidate
	partially clobbered registers as well as fully clobbered ones.
	(cse_insn): Update call accordingly.

Index: gcc/cse.c
===================================================================
--- gcc/cse.c	2019-09-09 18:58:51.468270740 +0100
+++ gcc/cse.c	2019-09-11 19:48:00.966005128 +0100
@@ -42,6 +42,8 @@ Software Foundation; either version 3, o
 #include "tree-pass.h"
 #include "dbgcnt.h"
 #include "rtl-iter.h"
+#include "regs.h"
+#include "function-abi.h"
 
 /* The basic idea of common subexpression elimination is to go
    through the code, keeping a record of expressions that would
@@ -566,7 +568,6 @@ static void remove_invalid_subreg_refs (
 					machine_mode);
 static void rehash_using_reg (rtx);
 static void invalidate_memory (void);
-static void invalidate_for_call (void);
 static rtx use_related_value (rtx, struct table_elt *);
 
 static inline unsigned canon_hash (rtx, machine_mode);
@@ -2091,23 +2092,29 @@ rehash_using_reg (rtx x)
 }
 \f
 /* Remove from the hash table any expression that is a call-clobbered
-   register.  Also update their TICK values.  */
+   register in INSN.  Also update their TICK values.  */
 
 static void
-invalidate_for_call (void)
+invalidate_for_call (rtx_insn *insn)
 {
-  unsigned int regno, endregno;
-  unsigned int i;
+  unsigned int regno;
   unsigned hash;
   struct table_elt *p, *next;
   int in_table = 0;
   hard_reg_set_iterator hrsi;
 
-  /* Go through all the hard registers.  For each that is clobbered in
-     a CALL_INSN, remove the register from quantity chains and update
+  /* Go through all the hard registers.  For each that might be clobbered
+     in call insn INSN, remove the register from quantity chains and update
      reg_tick if defined.  Also see if any of these registers is currently
-     in the table.  */
-  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, regno, hrsi)
+     in the table.
+
+     ??? We could be more precise for partially-clobbered registers,
+     and only invalidate values that actually occupy the clobbered part
+     of the registers.  It doesn't seem worth the effort though, since
+     we shouldn't see this situation much before RA.  */
+  function_abi abi = call_insn_abi (insn);
+  EXECUTE_IF_SET_IN_HARD_REG_SET (abi.full_and_partial_reg_clobbers (),
+				  0, regno, hrsi)
     {
       delete_reg_equiv (regno);
       if (REG_TICK (regno) >= 0)
@@ -2132,15 +2139,11 @@ invalidate_for_call (void)
 	      || REGNO (p->exp) >= FIRST_PSEUDO_REGISTER)
 	    continue;
 
-	  regno = REGNO (p->exp);
-	  endregno = END_REGNO (p->exp);
-
-	  for (i = regno; i < endregno; i++)
-	    if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i))
-	      {
-		remove_from_table (p, hash);
-		break;
-	      }
+	  /* This must use the same test as above rather than the
+	     more accurate clobbers_reg_p.  */
+	  if (overlaps_hard_reg_set_p (abi.full_and_partial_reg_clobbers (),
+				       GET_MODE (p->exp), REGNO (p->exp)))
+	    remove_from_table (p, hash);
 	}
 }
 \f
@@ -5834,7 +5837,7 @@ cse_insn (rtx_insn *insn)
 	  if (GET_CODE (XEXP (tem, 0)) == USE
 	      && MEM_P (XEXP (XEXP (tem, 0), 0)))
 	    invalidate (XEXP (XEXP (tem, 0), 0), VOIDmode);
-      invalidate_for_call ();
+      invalidate_for_call (insn);
     }
 
   /* Now invalidate everything set by this instruction.


* [13/32] Remove global call sets: DF (EH edges)
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (11 preceding siblings ...)
  2019-09-11 19:09 ` [11/32] Remove global call sets: cse.c Richard Sandiford
@ 2019-09-11 19:10 ` Richard Sandiford
  2019-09-29 21:07   ` Jeff Law
  2019-09-11 19:10 ` [14/32] Remove global call sets: DF (entry/exit defs) Richard Sandiford
                   ` (21 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:10 UTC (permalink / raw)
  To: gcc-patches

The DF dense_invalidated_by_call and sparse_invalidated_by_call
sets are actually only used on EH edges, and so are more the set
of registers that are invalidated by a taken EH edge.  Under the
new order, that means that they describe eh_edge_abi.
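The sparse/dense split that these knockout sets implement can be
sketched as follows.  This is a toy model, not DF's real bookkeeping:
the threshold value and the map-based representation are invented
(GCC's real threshold is DF_SPARSE_THRESHOLD and the data lives in DF's
own structures).

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// For a regno with many defs it is cheaper to record just the regno
// (sparse) than to enumerate every def ID (dense).  The threshold here
// is invented; GCC uses DF_SPARSE_THRESHOLD.
const unsigned SPARSE_THRESHOLD = 4;

struct knockout_sets
{
  std::set<unsigned> sparse_regnos; // all defs of these regnos are killed
  std::set<unsigned> dense_defs;    // these individual def IDs are killed
};

// DEFS maps each regno fully clobbered on the EH edge to its def IDs.
knockout_sets
build_eh_knockouts (const std::map<unsigned, std::vector<unsigned> > &defs)
{
  knockout_sets k;
  for (std::map<unsigned, std::vector<unsigned> >::const_iterator it
	 = defs.begin (); it != defs.end (); ++it)
    if (it->second.size () > SPARSE_THRESHOLD)
      k.sparse_regnos.insert (it->first);
    else
      k.dense_defs.insert (it->second.begin (), it->second.end ());
  return k;
}
```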


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* df-problems.c: Include regs.h and function-abi.h.
	(df_rd_problem_data): Rename sparse_invalidated_by_call to
	sparse_invalidated_by_eh and dense_invalidated_by_call to
	dense_invalidated_by_eh.
	(df_print_bb_index): Update accordingly.
	(df_rd_alloc, df_rd_start_dump, df_rd_confluence_n): Likewise.
	(df_lr_confluence_n): Use eh_edge_abi to get the set of registers
	that are clobbered by an EH edge.  Clobber partially-clobbered
	registers as well as fully-clobbered ones.
	(df_md_confluence_n): Likewise.
	(df_rd_local_compute): Likewise.  Update for changes to
	df_rd_problem_data.
	* df-scan.c (df_scan_start_dump): Use eh_edge_abi to get the set
	of registers that are clobbered by an EH edge.  Include partially-
	clobbered registers as well as fully-clobbered ones.

Index: gcc/df-problems.c
===================================================================
--- gcc/df-problems.c	2019-09-09 19:01:48.423021426 +0100
+++ gcc/df-problems.c	2019-09-11 19:48:07.405959747 +0100
@@ -36,6 +36,8 @@ Software Foundation; either version 3, o
 #include "valtrack.h"
 #include "dumpfile.h"
 #include "rtl-iter.h"
+#include "regs.h"
+#include "function-abi.h"
 
 /* Note that turning REG_DEAD_DEBUGGING on will cause
    gcc.c-torture/unsorted/dump-noaddr.c to fail because it prints
@@ -139,18 +141,17 @@ df_print_bb_index (basic_block bb, FILE
    these along with the bitmap_clear_range call to remove ranges of
    bits without actually generating a knockout vector.
 
-   The kill and sparse_kill and the dense_invalidated_by_call and
-   sparse_invalidated_by_call both play this game.  */
+   The kill and sparse_kill and the dense_invalidated_by_eh and
+   sparse_invalidated_by_eh both play this game.  */
 
 /* Private data used to compute the solution for this problem.  These
    data structures are not accessible outside of this module.  */
 class df_rd_problem_data
 {
 public:
-  /* The set of defs to regs invalidated by call.  */
-  bitmap_head sparse_invalidated_by_call;
-  /* The set of defs to regs invalidate by call for rd.  */
-  bitmap_head dense_invalidated_by_call;
+  /* The set of defs to regs invalidated by EH edges.  */
+  bitmap_head sparse_invalidated_by_eh;
+  bitmap_head dense_invalidated_by_eh;
   /* An obstack for the bitmaps we need for this problem.  */
   bitmap_obstack rd_bitmaps;
 };
@@ -187,8 +188,8 @@ df_rd_alloc (bitmap all_blocks)
   if (df_rd->problem_data)
     {
       problem_data = (class df_rd_problem_data *) df_rd->problem_data;
-      bitmap_clear (&problem_data->sparse_invalidated_by_call);
-      bitmap_clear (&problem_data->dense_invalidated_by_call);
+      bitmap_clear (&problem_data->sparse_invalidated_by_eh);
+      bitmap_clear (&problem_data->dense_invalidated_by_eh);
     }
   else
     {
@@ -196,9 +197,9 @@ df_rd_alloc (bitmap all_blocks)
       df_rd->problem_data = problem_data;
 
       bitmap_obstack_initialize (&problem_data->rd_bitmaps);
-      bitmap_initialize (&problem_data->sparse_invalidated_by_call,
+      bitmap_initialize (&problem_data->sparse_invalidated_by_eh,
 			 &problem_data->rd_bitmaps);
-      bitmap_initialize (&problem_data->dense_invalidated_by_call,
+      bitmap_initialize (&problem_data->dense_invalidated_by_eh,
 			 &problem_data->rd_bitmaps);
     }
 
@@ -391,8 +392,8 @@ df_rd_local_compute (bitmap all_blocks)
   bitmap_iterator bi;
   class df_rd_problem_data *problem_data
     = (class df_rd_problem_data *) df_rd->problem_data;
-  bitmap sparse_invalidated = &problem_data->sparse_invalidated_by_call;
-  bitmap dense_invalidated = &problem_data->dense_invalidated_by_call;
+  bitmap sparse_invalidated = &problem_data->sparse_invalidated_by_eh;
+  bitmap dense_invalidated = &problem_data->dense_invalidated_by_eh;
 
   bitmap_initialize (&seen_in_block, &df_bitmap_obstack);
   bitmap_initialize (&seen_in_insn, &df_bitmap_obstack);
@@ -404,10 +405,13 @@ df_rd_local_compute (bitmap all_blocks)
       df_rd_bb_local_compute (bb_index);
     }
 
-  /* Set up the knockout bit vectors to be applied across EH_EDGES.  */
+  /* Set up the knockout bit vectors to be applied across EH_EDGES.
+     Conservatively treat partially-clobbered registers as surviving
+     across the EH edge, i.e. assume that definitions before the edge
+     is taken *might* reach uses after it has been taken.  */
   if (!(df->changeable_flags & DF_NO_HARD_REGS))
     for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
-      if (TEST_HARD_REG_BIT (regs_invalidated_by_call, regno))
+      if (eh_edge_abi.clobbers_full_reg_p (regno))
 	{
 	  if (DF_DEFS_COUNT (regno) > DF_SPARSE_THRESHOLD)
 	    bitmap_set_bit (sparse_invalidated, regno);
@@ -455,8 +459,8 @@ df_rd_confluence_n (edge e)
     {
       class df_rd_problem_data *problem_data
 	= (class df_rd_problem_data *) df_rd->problem_data;
-      bitmap sparse_invalidated = &problem_data->sparse_invalidated_by_call;
-      bitmap dense_invalidated = &problem_data->dense_invalidated_by_call;
+      bitmap sparse_invalidated = &problem_data->sparse_invalidated_by_eh;
+      bitmap dense_invalidated = &problem_data->dense_invalidated_by_eh;
       bitmap_iterator bi;
       unsigned int regno;
 
@@ -579,9 +583,9 @@ df_rd_start_dump (FILE *file)
   fprintf (file, ";; Reaching defs:\n");
 
   fprintf (file, ";;  sparse invalidated \t");
-  dump_bitmap (file, &problem_data->sparse_invalidated_by_call);
+  dump_bitmap (file, &problem_data->sparse_invalidated_by_eh);
   fprintf (file, ";;  dense invalidated \t");
-  dump_bitmap (file, &problem_data->dense_invalidated_by_call);
+  dump_bitmap (file, &problem_data->dense_invalidated_by_eh);
 
   fprintf (file, ";;  reg->defs[] map:\t");
   for (regno = 0; regno < m; regno++)
@@ -976,12 +980,15 @@ df_lr_confluence_n (edge e)
   bitmap op2 = &df_lr_get_bb_info (e->dest->index)->in;
   bool changed = false;
 
-  /* Call-clobbered registers die across exception and call edges.  */
+  /* Call-clobbered registers die across exception and call edges.
+     Conservatively treat partially-clobbered registers as surviving
+     across the edges; they might or might not, depending on what
+     mode they have.  */
   /* ??? Abnormal call edges ignored for the moment, as this gets
      confused by sibling call edges, which crashes reg-stack.  */
   if (e->flags & EDGE_EH)
     {
-      bitmap_view<HARD_REG_SET> eh_kills (regs_invalidated_by_call);
+      bitmap_view<HARD_REG_SET> eh_kills (eh_edge_abi.full_reg_clobbers ());
       changed = bitmap_ior_and_compl_into (op1, op2, eh_kills);
     }
   else
@@ -4636,7 +4643,10 @@ df_md_confluence_n (edge e)
 
   if (e->flags & EDGE_EH)
     {
-      bitmap_view<HARD_REG_SET> eh_kills (regs_invalidated_by_call);
+      /* Conservatively treat partially-clobbered registers as surviving
+	 across the edge; they might or might not, depending on what mode
+	 they have.  */
+      bitmap_view<HARD_REG_SET> eh_kills (eh_edge_abi.full_reg_clobbers ());
       return bitmap_ior_and_compl_into (op1, op2, eh_kills);
     }
   else
Index: gcc/df-scan.c
===================================================================
--- gcc/df-scan.c	2019-09-11 19:47:24.410262730 +0100
+++ gcc/df-scan.c	2019-09-11 19:48:07.405959747 +0100
@@ -312,8 +312,9 @@ df_scan_start_dump (FILE *file ATTRIBUTE
   basic_block bb;
   rtx_insn *insn;
 
-  fprintf (file, ";;  invalidated by call \t");
-  df_print_regset (file, bitmap_view<HARD_REG_SET> (regs_invalidated_by_call));
+  fprintf (file, ";;  fully invalidated by EH \t");
+  df_print_regset
+    (file, bitmap_view<HARD_REG_SET> (eh_edge_abi.full_reg_clobbers ()));
   fprintf (file, ";;  hardware regs used \t");
   df_print_regset (file, &df->hardware_regs_used);
   fprintf (file, ";;  regular block artificial uses \t");


* [14/32] Remove global call sets: DF (entry/exit defs)
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (12 preceding siblings ...)
  2019-09-11 19:10 ` [13/32] Remove global call sets: DF (EH edges) Richard Sandiford
@ 2019-09-11 19:10 ` Richard Sandiford
  2019-09-29 21:07   ` Jeff Law
  2019-09-11 19:11 ` [17/32] Remove global call sets: gcse.c Richard Sandiford
                   ` (20 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:10 UTC (permalink / raw)
  To: gcc-patches

The code patched here is checking whether the current function
needs to save at least part of a register before using it.


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* df-scan.c (df_get_entry_block_def_set): Use crtl->abi to test
	whether the current function needs to save at least part of a
	register before using it.
	(df_get_exit_block_use_set): Likewise for epilogue restores.

Index: gcc/df-scan.c
===================================================================
--- gcc/df-scan.c	2019-09-11 19:48:07.405959747 +0100
+++ gcc/df-scan.c	2019-09-11 19:48:11.009934354 +0100
@@ -3499,7 +3499,9 @@ df_get_entry_block_def_set (bitmap entry
       /* Defs for the callee saved registers are inserted so that the
 	 pushes have some defining location.  */
       for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-	if ((call_used_or_fixed_reg_p (i) == 0) && (df_regs_ever_live_p (i)))
+	if (!crtl->abi->clobbers_full_reg_p (i)
+	    && !fixed_regs[i]
+	    && df_regs_ever_live_p (i))
 	  bitmap_set_bit (entry_block_defs, i);
     }
 
@@ -3672,8 +3674,9 @@ df_get_exit_block_use_set (bitmap exit_b
     {
       /* Mark all call-saved registers that we actually used.  */
       for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-	if (df_regs_ever_live_p (i) && !LOCAL_REGNO (i)
-	    && !TEST_HARD_REG_BIT (regs_invalidated_by_call, i))
+	if (df_regs_ever_live_p (i)
+	    && !LOCAL_REGNO (i)
+	    && !crtl->abi->clobbers_full_reg_p (i))
 	  bitmap_set_bit (exit_block_uses, i);
     }
 


* [16/32] Remove global call sets: function.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (15 preceding siblings ...)
  2019-09-11 19:11 ` [15/32] Remove global call sets: early-remat.c Richard Sandiford
@ 2019-09-11 19:11 ` Richard Sandiford
  2019-09-29 21:10   ` Jeff Law
  2019-09-11 19:12 ` [18/32] Remove global call sets: haifa-sched.c Richard Sandiford
                   ` (17 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:11 UTC (permalink / raw)
  To: gcc-patches

Whatever the rights and wrongs of the way aggregate_value_p
handles call-preserved registers, it's a de facto part of the ABI,
so we shouldn't change it.  The patch simply extends the current
approach to whatever call-preserved set the function happens to
be using.
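The shape of the patched aggregate_value_p test can be sketched with a
toy model.  It is not GCC code: the ABI lookup and register sets below
are invented stand-ins for fntype_abi, default_function_abi, fixed_regs
and clobbers_full_reg_p.

```cpp
#include <bitset>
#include <cassert>
#include <string>

const int NUM_REGS = 8;
typedef std::bitset<NUM_REGS> reg_set;

struct toy_abi
{
  reg_set full_clobbers;
  bool clobbers_full_reg_p (unsigned regno) const
  { return full_clobbers.test (regno); }
};

// Return true if a value occupying [REGNO, REGNO + NREGS) must be
// returned in memory: i.e. if any of the return registers is at least
// partly preserved across a call under ABI (given the FIXED registers).
bool
returns_in_memory_p (const toy_abi &abi, const reg_set &fixed,
		     unsigned regno, unsigned nregs)
{
  for (unsigned i = 0; i < nregs; ++i)
    if (!fixed.test (regno + i) && !abi.clobbers_full_reg_p (regno + i))
      return true;
  return false;
}
```

As in the patch, the decision depends on which ABI the callee happens to
use, falling back to the default ABI when the function type is unknown.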


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* function.c (aggregate_value_p): Work out which ABI the
	function is using before testing which registers are at least
	partly preserved by a call.

Index: gcc/function.c
===================================================================
--- gcc/function.c	2019-09-11 19:47:07.490381964 +0100
+++ gcc/function.c	2019-09-11 19:48:18.357882573 +0100
@@ -2120,10 +2120,17 @@ aggregate_value_p (const_tree exp, const
   if (!REG_P (reg))
     return 0;
 
+  /* Use the default ABI if the type of the function isn't known.
+     The scheme for handling interoperability between different ABIs
+     requires us to be able to tell when we're calling a function with
+     a nondefault ABI.  */
+  const predefined_function_abi &abi = (fntype
+					? fntype_abi (fntype)
+					: default_function_abi);
   regno = REGNO (reg);
   nregs = hard_regno_nregs (regno, TYPE_MODE (type));
   for (i = 0; i < nregs; i++)
-    if (! call_used_or_fixed_reg_p (regno + i))
+    if (!fixed_regs[regno + i] && !abi.clobbers_full_reg_p (regno + i))
       return 1;
 
   return 0;


* [15/32] Remove global call sets: early-remat.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (14 preceding siblings ...)
  2019-09-11 19:11 ` [17/32] Remove global call sets: gcse.c Richard Sandiford
@ 2019-09-11 19:11 ` Richard Sandiford
  2019-09-29 21:09   ` Jeff Law
  2019-09-11 19:11 ` [16/32] Remove global call sets: function.c Richard Sandiford
                   ` (18 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:11 UTC (permalink / raw)
  To: gcc-patches

This pass previously excluded rematerialisation candidates if they
clobbered a call-preserved register, on the basis that it then
wouldn't be safe to add new instances of the candidate instruction
after a call.  This patch instead makes the decision on a call-by-call
basis.

The second emit_remat_insns_for_block hunk probably isn't needed,
but it seems safer and more consistent to have it, so that every call
to emit_remat_insns is preceded by a check for invalid clobbers.
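The per-call check can be sketched like this.  Again a toy model rather
than the real pass: the candidate representation is invented, but the
logic mirrors restrict_remat_for_call, including the conservative
assumption that everything outside the call's full clobber set might be
live across the call.

```cpp
#include <bitset>
#include <cassert>
#include <set>
#include <string>

const int NUM_REGS = 8;
typedef std::bitset<NUM_REGS> reg_set;

struct toy_candidate
{
  unsigned id;
  reg_set clobbers; // hard registers the candidate instruction clobbers
};

// Drop from CANDIDATES any candidate whose clobbers overlap a register
// that the call preserves (and so might be live across it).
void
restrict_remat_for_call (std::set<unsigned> &candidates,
			 const toy_candidate *cands, unsigned n,
			 const reg_set &full_reg_clobbers)
{
  reg_set preserved = ~full_reg_clobbers;
  for (unsigned i = 0; i < n; ++i)
    if ((cands[i].clobbers & preserved).any ())
      candidates.erase (cands[i].id);
}
```

A candidate that only clobbers registers the call clobbers anyway stays
valid; one that would overwrite a call-preserved register is removed
before emit_remat_insns runs.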


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* early-remat.c: Include regs.h and function-abi.h.
	(early_remat::maybe_add_candidate): Don't check for call-clobbered
	registers here.
	(early_remat::restrict_remat_for_unavail_regs): New function.
	(early_remat::restrict_remat_for_call): Likewise.
	(early_remat::process_block): Before calling emit_remat_insns
	for a previous call in the block, invalidate any candidates
	that would clobber call-preserved registers.
	(early_remat::emit_remat_insns_for_block): Likewise for the
	final call in a block.  Do the same thing for live-in registers
	when calling emit_remat_insns at the head of a block.

Index: gcc/early-remat.c
===================================================================
--- gcc/early-remat.c	2019-07-01 09:37:04.720545420 +0100
+++ gcc/early-remat.c	2019-09-11 19:48:14.825907465 +0100
@@ -36,6 +36,8 @@ Software Foundation; either version 3, o
 #include "rtlhash.h"
 #include "print-rtl.h"
 #include "rtl-iter.h"
+#include "regs.h"
+#include "function-abi.h"
 
 /* This pass runs before register allocation and implements an aggressive
    form of rematerialization.  It looks for pseudo registers R of mode M
@@ -435,6 +437,8 @@ struct remat_candidate_hasher : nofree_p
   void compute_clobbers (unsigned int);
   void assign_value_number (unsigned int);
   void decide_candidate_validity (void);
+  void restrict_remat_for_unavail_regs (bitmap, const_bitmap);
+  void restrict_remat_for_call (bitmap, rtx_insn *);
   bool stable_use_p (unsigned int);
   void emit_copy_before (unsigned int, rtx, rtx);
   void stabilize_pattern (unsigned int);
@@ -889,8 +893,8 @@ #define FAILURE_ARGS regno, INSN_UID (in
       else
 	{
 	  /* The instruction can set additional registers, provided that
-	     they're call-clobbered hard registers.  This is useful for
-	     instructions that alter the condition codes.  */
+	     they're hard registers.  This is useful for instructions
+	     that alter the condition codes.  */
 	  if (!HARD_REGISTER_NUM_P (def_regno))
 	    {
 	      if (dump_file)
@@ -898,20 +902,6 @@ #define FAILURE_ARGS regno, INSN_UID (in
 			 " pseudo reg %d\n", FAILURE_ARGS, def_regno);
 	      return false;
 	    }
-	  if (global_regs[def_regno])
-	    {
-	      if (dump_file)
-		fprintf (dump_file, FAILURE_FORMAT "insn also sets"
-			 " global reg %d\n", FAILURE_ARGS, def_regno);
-	      return false;
-	    }
-	  if (!TEST_HARD_REG_BIT (regs_invalidated_by_call, def_regno))
-	    {
-	      if (dump_file)
-		fprintf (dump_file, FAILURE_FORMAT "insn also sets"
-			 " call-preserved reg %d\n", FAILURE_ARGS, def_regno);
-	      return false;
-	    }
 	}
     }
 
@@ -1532,6 +1522,39 @@ early_remat::decide_candidate_validity (
       }
 }
 
+/* Remove any candidates in CANDIDATES that would clobber a register in
+   UNAVAIL_REGS.  */
+
+void
+early_remat::restrict_remat_for_unavail_regs (bitmap candidates,
+					      const_bitmap unavail_regs)
+{
+  bitmap_clear (&m_tmp_bitmap);
+  unsigned int cand_index;
+  bitmap_iterator bi;
+  EXECUTE_IF_SET_IN_BITMAP (candidates, 0, cand_index, bi)
+    {
+      remat_candidate *cand = &m_candidates[cand_index];
+      if (cand->clobbers
+	  && bitmap_intersect_p (cand->clobbers, unavail_regs))
+	bitmap_set_bit (&m_tmp_bitmap, cand_index);
+    }
+  bitmap_and_compl_into (candidates, &m_tmp_bitmap);
+}
+
+/* Remove any candidates in CANDIDATES that would clobber a register
+   that is potentially live across CALL.  */
+
+void
+early_remat::restrict_remat_for_call (bitmap candidates, rtx_insn *call)
+{
+  function_abi abi = call_insn_abi (call);
+  /* We don't know whether partially-clobbered registers are live
+     across the call or not, so assume that they are.  */
+  bitmap_view<HARD_REG_SET> call_preserved_regs (~abi.full_reg_clobbers ());
+  restrict_remat_for_unavail_regs (candidates, call_preserved_regs);
+}
+
 /* Assuming that every path reaching a point P contains a copy of a
    use U of REGNO, return true if another copy of U at P would have
    access to the same value of REGNO.  */
@@ -1984,10 +2007,13 @@ early_remat::process_block (basic_block
 	      init_temp_bitmap (&m_required);
 	    }
 	  else
-	    /* The fully-local case: candidates that need to be
-	       rematerialized after a previous call in the block.  */
-	    emit_remat_insns (m_required, NULL, info->rd_after_call,
-			      last_call);
+	    {
+	      /* The fully-local case: candidates that need to be
+		 rematerialized after a previous call in the block.  */
+	      restrict_remat_for_call (m_required, last_call);
+	      emit_remat_insns (m_required, NULL, info->rd_after_call,
+				last_call);
+	    }
 	  last_call = insn;
 	  bitmap_clear (m_available);
 	  gcc_checking_assert (empty_p (m_required));
@@ -2480,8 +2506,11 @@ early_remat::emit_remat_insns_for_block
   remat_block_info *info = &m_block_info[bb->index];
 
   if (info->last_call && !empty_p (info->required_after_call))
-    emit_remat_insns (info->required_after_call, NULL,
-		      info->rd_after_call, info->last_call);
+    {
+      restrict_remat_for_call (info->required_after_call, info->last_call);
+      emit_remat_insns (info->required_after_call, NULL,
+			info->rd_after_call, info->last_call);
+    }
 
   if (!empty_p (info->required_in))
     {
@@ -2489,6 +2518,7 @@ early_remat::emit_remat_insns_for_block
       while (insn != BB_END (bb)
 	     && !INSN_P (NEXT_INSN (insn)))
 	insn = NEXT_INSN (insn);
+      restrict_remat_for_unavail_regs (info->required_in, DF_LR_IN (bb));
       emit_remat_insns (info->required_in, info->available_in,
 			info->rd_in, insn);
     }


* [17/32] Remove global call sets: gcse.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (13 preceding siblings ...)
  2019-09-11 19:10 ` [14/32] Remove global call sets: DF (entry/exit defs) Richard Sandiford
@ 2019-09-11 19:11 ` Richard Sandiford
  2019-09-25 16:04   ` Richard Sandiford
  2019-09-29 21:10   ` Jeff Law
  2019-09-11 19:11 ` [15/32] Remove global call sets: early-remat.c Richard Sandiford
                   ` (19 subsequent siblings)
  34 siblings, 2 replies; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:11 UTC (permalink / raw)
  To: gcc-patches

This is another case in which we can conservatively treat partial
kills as full kills.  Again this is in principle a bug fix for
TARGET_HARD_REGNO_CALL_PART_CLOBBERED targets, but in practice
it probably doesn't make a difference.


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* gcse.c: Include function-abi.h.
	(compute_hash_table_work): Use call_insn_abi to get the ABI of
	the call insn target.  Invalidate partially call-clobbered
	registers as well as fully call-clobbered ones.

Index: gcc/gcse.c
===================================================================
--- gcc/gcse.c	2019-09-09 18:58:51.468270740 +0100
+++ gcc/gcse.c	2019-09-11 19:48:23.453846664 +0100
@@ -160,6 +160,7 @@ Software Foundation; either version 3, o
 #include "dbgcnt.h"
 #include "gcse.h"
 #include "gcse-common.h"
+#include "function-abi.h"
 
 /* We support GCSE via Partial Redundancy Elimination.  PRE optimizations
    are a superset of those done by classic GCSE.
@@ -1527,9 +1528,14 @@ compute_hash_table_work (struct gcse_has
 
 	  if (CALL_P (insn))
 	    {
+	      function_abi abi = call_insn_abi (insn);
 	      hard_reg_set_iterator hrsi;
-	      EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call,
-					      0, regno, hrsi)
+
+	      /* We don't track modes of hard registers, so we need
+		 to be conservative and assume that partial kills
+		 are full kills.  */
+	      const_hard_reg_set clob = abi.full_and_partial_reg_clobbers ();
+	      EXECUTE_IF_SET_IN_HARD_REG_SET (clob, 0, regno, hrsi)
 		record_last_reg_set_info (insn, regno);
 
 	      if (! RTL_CONST_OR_PURE_CALL_P (insn)


* [18/32] Remove global call sets: haifa-sched.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (16 preceding siblings ...)
  2019-09-11 19:11 ` [16/32] Remove global call sets: function.c Richard Sandiford
@ 2019-09-11 19:12 ` Richard Sandiford
  2019-09-29 21:11   ` Jeff Law
  2019-09-11 19:12 ` [19/32] Remove global call sets: IRA Richard Sandiford
                   ` (16 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:12 UTC (permalink / raw)
  To: gcc-patches

The code patched here is counting how many registers the current
function would need to save in the prologue before it uses them.
The code is called per function, so using crtl is OK.
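The counting in the patched loop can be sketched with a toy model; the
names below are invented stand-ins for GCC's arrays.  Note that the
rewritten loop tests fixed_regs first, so a register that is both fixed
and call-clobbered counts as fixed rather than call-saved.

```cpp
#include <bitset>
#include <cassert>
#include <string>

const int NUM_REGS = 8;
typedef std::bitset<NUM_REGS> reg_set;

struct reg_counts { int fixed; int call_saved; };

// Count, for the registers in CLASS_REGS, how many are fixed and how
// many the function would need to save before using them (i.e. are
// neither fixed nor fully clobbered by the function's own ABI).
reg_counts
count_class_regs (const reg_set &class_regs, const reg_set &fixed,
		  const reg_set &abi_full_clobbers)
{
  reg_counts c = { 0, 0 };
  for (unsigned regno = 0; regno < NUM_REGS; ++regno)
    if (class_regs.test (regno))
      {
	if (fixed.test (regno))
	  ++c.fixed;
	else if (!abi_full_clobbers.test (regno))
	  ++c.call_saved; // prologue must save it before first use
      }
  return c;
}
```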


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* haifa-sched.c: Include function-abi.h.
	(alloc_global_sched_pressure_data): Use crtl->abi to check whether
	the function would need to save a register before using it.

Index: gcc/haifa-sched.c
===================================================================
--- gcc/haifa-sched.c	2019-09-10 19:56:45.353177919 +0100
+++ gcc/haifa-sched.c	2019-09-11 19:48:27.281819692 +0100
@@ -146,6 +146,7 @@ Software Foundation; either version 3, o
 #include "cfgloop.h"
 #include "dumpfile.h"
 #include "print-rtl.h"
+#include "function-abi.h"
 
 #ifdef INSN_SCHEDULING
 
@@ -939,7 +940,8 @@ enum reg_class *sched_regno_pressure_cla
 /* Effective number of available registers of a given class (see comment
    in sched_pressure_start_bb).  */
 static int sched_class_regs_num[N_REG_CLASSES];
-/* Number of call_saved_regs and fixed_regs.  Helpers for calculating of
+/* The number of registers that the function would need to save before it
+   uses them, and the number of fixed_regs.  Helpers for calculating of
    sched_class_regs_num.  */
 static int call_saved_regs_num[N_REG_CLASSES];
 static int fixed_regs_num[N_REG_CLASSES];
@@ -7207,10 +7209,13 @@ alloc_global_sched_pressure_data (void)
 	  fixed_regs_num[cl] = 0;
 
 	  for (int i = 0; i < ira_class_hard_regs_num[cl]; ++i)
-	    if (!call_used_or_fixed_reg_p (ira_class_hard_regs[cl][i]))
-	      ++call_saved_regs_num[cl];
-	    else if (fixed_regs[ira_class_hard_regs[cl][i]])
-	      ++fixed_regs_num[cl];
+	    {
+	      unsigned int regno = ira_class_hard_regs[cl][i];
+	      if (fixed_regs[regno])
+		++fixed_regs_num[cl];
+	      else if (!crtl->abi->clobbers_full_reg_p (regno))
+		++call_saved_regs_num[cl];
+	    }
 	}
     }
 }


* [19/32] Remove global call sets: IRA
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (17 preceding siblings ...)
  2019-09-11 19:12 ` [18/32] Remove global call sets: haifa-sched.c Richard Sandiford
@ 2019-09-11 19:12 ` Richard Sandiford
  2019-09-30 15:16   ` Jeff Law
  2019-09-11 19:13 ` [20/32] Remove global call sets: loop-iv.c Richard Sandiford
                   ` (15 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:12 UTC (permalink / raw)
  To: gcc-patches

For -fipa-ra, IRA already keeps track of which specific registers
are call-clobbered in a region, rather than using global information.
The patch generalises this so that it tracks which ABIs are used
by calls in the region.

We can then use the new ABI descriptors to handle partially-clobbered
registers in the same way as fully-clobbered registers, without having
special code for targetm.hard_regno_call_part_clobbered.  This in turn
makes -fipa-ra work for partially-clobbered registers too.

A side-effect of allowing multiple ABIs is that we no longer have
an obvious set of conflicting registers for the self-described
"fragile hack" in ira-conflicts.c.  This code kicks in for
user-defined registers that aren't live across a call at -O0,
and it tries to avoid allocating a call-clobbered register to them.
Here I've used the set of call-clobbered registers in the current
function's ABI, applying on top of any registers that are clobbered by
called functions.  This is enough to keep gcc.dg/debug/dwarf2/pr5948.c
happy.

The handling of GENERIC_STACK_CHECK in do_reload seemed to have
a reversed condition:

      for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
	if (df_regs_ever_live_p (i)
	    && !fixed_regs[i]
	    && call_used_or_fixed_reg_p (i))
	  size += UNITS_PER_WORD;

The final part of the condition counts registers that don't need to be
saved in the prologue, but I think the opposite was intended.


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* function-abi.h (call_clobbers_in_region): Declare.
	(call_clobbered_in_region_p): New function.
	* function-abi.cc (call_clobbers_in_region): Likewise.
	* ira-int.h: Include function-abi.h.
	(ira_allocno::crossed_calls_abis): New field.
	(ALLOCNO_CROSSED_CALLS_ABIS): New macro.
	(ira_need_caller_save_regs): New function.
	(ira_need_caller_save_p): Likewise.
	* ira.c (setup_reg_renumber): Use ira_need_caller_save_p instead
	of call_used_or_fixed_regs.
	(do_reload): Use crtl->abi to test whether the current function
	needs to save a register in the prologue.  Count registers that
	need to be saved rather than registers that don't.
	* ira-build.c (create_cap_allocno): Copy ALLOCNO_CROSSED_CALLS_ABIS.
	Remove unnecessary | from ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	(propagate_allocno_info): Merge ALLOCNO_CROSSED_CALLS_ABIS too.
	(propagate_some_info_from_allocno): Likewise.
	(copy_info_to_removed_store_destinations): Likewise.
	(ira_flattening): Say that ALLOCNO_CROSSED_CALLS_ABIS and
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS are handled conservatively.
	(ira_build): Use ira_need_caller_save_regs instead of
	call_used_or_fixed_regs.
	* ira-color.c (calculate_saved_nregs): Use crtl->abi to test
	whether the current function would need to save a register
	before using it.
	(calculate_spill_cost): Likewise.
	(allocno_reload_assign): Use ira_need_caller_save_regs and
	ira_need_caller_save_p instead of call_used_or_fixed_regs.
	* ira-conflicts.c (ira_build_conflicts): Use
	ira_need_caller_save_regs rather than call_used_or_fixed_regs
	as the set of call-clobbered registers.  Remove the
	call_used_or_fixed_regs mask from the calculation of
	temp_hard_reg_set and mask its use instead.  Remove special
	handling of partially-clobbered registers.
	* ira-costs.c (ira_tune_allocno_costs): Use ira_need_caller_save_p.
	* ira-lives.c (process_bb_node_lives): Use mode_clobbers to
	calculate the set of conflicting registers for calls that
	can throw.  Record the ABIs of calls in ALLOCNO_CROSSED_CALLS_ABIS.
	Use full_and_partial_reg_clobbers rather than full_reg_clobbers
	for the calculation of ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	Use eh_edge_abi to calculate the set of registers that could
	be clobbered by an EH edge.  Include partially-clobbered as
	well as fully-clobbered registers.

Index: gcc/function-abi.h
===================================================================
--- gcc/function-abi.h	2019-09-11 19:47:24.418262673 +0100
+++ gcc/function-abi.h	2019-09-11 19:48:31.709788491 +0100
@@ -265,6 +265,32 @@ #define default_function_abi \
   (this_target_function_abi_info->x_function_abis[0])
 #define eh_edge_abi default_function_abi
 
+extern HARD_REG_SET call_clobbers_in_region (unsigned int, const_hard_reg_set,
+					     machine_mode mode);
+
+/* Return true if (reg:MODE REGNO) might be clobbered by one of the
+   calls in a region described by ABIS and MASK, where:
+
+   * Bit ID of ABIS is set if the region contains a call with
+     function_abi identifier ID.
+
+   * MASK contains all the registers that are fully or partially
+     clobbered by calls in the region.
+
+   This is not quite as accurate as testing each individual call,
+   but it's a close and conservatively-correct approximation.
+   It's much better for some targets than:
+
+     overlaps_hard_reg_set_p (MASK, MODE, REGNO).  */
+
+inline bool
+call_clobbered_in_region_p (unsigned int abis, const_hard_reg_set mask,
+			    machine_mode mode, unsigned int regno)
+{
+  HARD_REG_SET clobbers = call_clobbers_in_region (abis, mask, mode);
+  return overlaps_hard_reg_set_p (clobbers, mode, regno);
+}
+
 extern const predefined_function_abi &fntype_abi (const_tree);
 extern function_abi fndecl_abi (const_tree);
 extern function_abi call_insn_abi (const rtx_insn *);
Index: gcc/function-abi.cc
===================================================================
--- gcc/function-abi.cc	2019-09-11 19:47:32.898202916 +0100
+++ gcc/function-abi.cc	2019-09-11 19:48:31.709788491 +0100
@@ -126,6 +126,31 @@ predefined_function_abi::add_full_reg_cl
     SET_HARD_REG_BIT (m_mode_clobbers[i], regno);
 }
 
+/* Return the set of registers that cannot be used to hold a value of
+   mode MODE across the calls in a region described by ABIS and MASK, where:
+
+   * Bit ID of ABIS is set if the region contains a call with
+     function_abi identifier ID.
+
+   * MASK contains all the registers that are fully or partially
+     clobbered by calls in the region.
+
+   This is not quite as accurate as testing each individual call,
+   but it's a close and conservatively-correct approximation.
+   It's much better for some targets than just using MASK.  */
+
+HARD_REG_SET
+call_clobbers_in_region (unsigned int abis, const_hard_reg_set mask,
+			 machine_mode mode)
+{
+  HARD_REG_SET result;
+  CLEAR_HARD_REG_SET (result);
+  for (unsigned int id = 0; abis; abis >>= 1, ++id)
+    if (abis & 1)
+      result |= function_abis[id].mode_clobbers (mode);
+  return result & mask;
+}
+
 /* Return the predefined ABI used by functions with type TYPE.  */
 
 const predefined_function_abi &
Index: gcc/ira-int.h
===================================================================
--- gcc/ira-int.h	2019-09-09 18:59:51.239848733 +0100
+++ gcc/ira-int.h	2019-09-11 19:48:31.713788462 +0100
@@ -22,6 +22,7 @@ Software Foundation; either version 3, o
 #define GCC_IRA_INT_H
 
 #include "recog.h"
+#include "function-abi.h"
 
 /* To provide consistency in naming, all IRA external variables,
    functions, common typedefs start with prefix ira_.  */
@@ -287,6 +288,9 @@ struct ira_allocno
   /* Register class which should be used for allocation for given
      allocno.  NO_REGS means that we should use memory.  */
   ENUM_BITFIELD (reg_class) aclass : 16;
+  /* A bitmask of the ABIs used by calls that occur while the allocno
+     is live.  */
+  unsigned int crossed_calls_abis : NUM_ABI_IDS;
   /* During the reload, value TRUE means that we should not reassign a
      hard register to the allocno got memory earlier.  It is set up
      when we removed memory-memory move insn before each iteration of
@@ -423,6 +427,7 @@ #define ALLOCNO_HARD_REGNO(A) ((A)->hard
 #define ALLOCNO_CALL_FREQ(A) ((A)->call_freq)
 #define ALLOCNO_CALLS_CROSSED_NUM(A) ((A)->calls_crossed_num)
 #define ALLOCNO_CHEAP_CALLS_CROSSED_NUM(A) ((A)->cheap_calls_crossed_num)
+#define ALLOCNO_CROSSED_CALLS_ABIS(A) ((A)->crossed_calls_abis)
 #define ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS(A) \
   ((A)->crossed_calls_clobbered_regs)
 #define ALLOCNO_MEM_OPTIMIZED_DEST(A) ((A)->mem_optimized_dest)
@@ -1510,4 +1515,28 @@ ira_allocate_and_set_or_copy_costs (int
 extern rtx ira_create_new_reg (rtx);
 extern int first_moveable_pseudo, last_moveable_pseudo;
 
+/* Return the set of registers that would need a caller save if allocno A
+   overlapped them.  */
+
+inline HARD_REG_SET
+ira_need_caller_save_regs (ira_allocno_t a)
+{
+  return call_clobbers_in_region (ALLOCNO_CROSSED_CALLS_ABIS (a),
+				  ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a),
+				  ALLOCNO_MODE (a));
+}
+
+/* Return true if we would need to save allocno A around a call if we
+   assigned hard register REGNO.  */
+
+inline bool
+ira_need_caller_save_p (ira_allocno_t a, unsigned int regno)
+{
+  if (ALLOCNO_CALLS_CROSSED_NUM (a) == 0)
+    return false;
+  return call_clobbered_in_region_p (ALLOCNO_CROSSED_CALLS_ABIS (a),
+				     ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a),
+				     ALLOCNO_MODE (a), regno);
+}
+
 #endif /* GCC_IRA_INT_H */
Index: gcc/ira.c
===================================================================
--- gcc/ira.c	2019-09-10 19:56:45.357177891 +0100
+++ gcc/ira.c	2019-09-11 19:48:31.713788462 +0100
@@ -2368,9 +2368,7 @@ setup_reg_renumber (void)
 	      OBJECT_TOTAL_CONFLICT_HARD_REGS (obj)
 		|= ~reg_class_contents[pclass];
 	    }
-	  if (ALLOCNO_CALLS_CROSSED_NUM (a) != 0
-	      && ira_hard_reg_set_intersection_p (hard_regno, ALLOCNO_MODE (a),
-						  call_used_or_fixed_regs))
+	  if (ira_need_caller_save_p (a, hard_regno))
 	    {
 	      ira_assert (!optimize || flag_caller_saves
 			  || (ALLOCNO_CALLS_CROSSED_NUM (a)
@@ -5591,7 +5589,7 @@ do_reload (void)
       for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
 	if (df_regs_ever_live_p (i)
 	    && !fixed_regs[i]
-	    && call_used_or_fixed_reg_p (i))
+	    && !crtl->abi->clobbers_full_reg_p (i))
 	  size += UNITS_PER_WORD;
 
       if (constant_lower_bound (size) > STACK_CHECK_MAX_FRAME_SIZE)
Index: gcc/ira-build.c
===================================================================
--- gcc/ira-build.c	2019-09-10 19:56:32.569268148 +0100
+++ gcc/ira-build.c	2019-09-11 19:48:31.709788491 +0100
@@ -903,8 +903,9 @@ create_cap_allocno (ira_allocno_t a)
 
   ALLOCNO_CALLS_CROSSED_NUM (cap) = ALLOCNO_CALLS_CROSSED_NUM (a);
   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (cap) = ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+  ALLOCNO_CROSSED_CALLS_ABIS (cap) = ALLOCNO_CROSSED_CALLS_ABIS (a);
   ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (cap)
-    |= ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a);
+    = ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a);
   if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL)
     {
       fprintf (ira_dump_file, "    Creating cap ");
@@ -2032,6 +2033,8 @@ propagate_allocno_info (void)
 	    += ALLOCNO_CALLS_CROSSED_NUM (a);
 	  ALLOCNO_CHEAP_CALLS_CROSSED_NUM (parent_a)
 	    += ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+	  ALLOCNO_CROSSED_CALLS_ABIS (parent_a)
+	    |= ALLOCNO_CROSSED_CALLS_ABIS (a);
 	  ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (parent_a)
 	    |= ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a);
 	  ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (parent_a)
@@ -2415,6 +2418,7 @@ propagate_some_info_from_allocno (ira_al
   ALLOCNO_CALLS_CROSSED_NUM (a) += ALLOCNO_CALLS_CROSSED_NUM (from_a);
   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a)
     += ALLOCNO_CHEAP_CALLS_CROSSED_NUM (from_a);
+  ALLOCNO_CROSSED_CALLS_ABIS (a) |= ALLOCNO_CROSSED_CALLS_ABIS (from_a);
   ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a)
     |= ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (from_a);
 
@@ -3056,6 +3060,8 @@ copy_info_to_removed_store_destinations
 	+= ALLOCNO_CALLS_CROSSED_NUM (a);
       ALLOCNO_CHEAP_CALLS_CROSSED_NUM (parent_a)
 	+= ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+      ALLOCNO_CROSSED_CALLS_ABIS (parent_a)
+	|= ALLOCNO_CROSSED_CALLS_ABIS (a);
       ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (parent_a)
 	|= ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a);
       ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (parent_a)
@@ -3155,6 +3161,9 @@ ira_flattening (int max_regno_before_emi
 		-= ALLOCNO_CALLS_CROSSED_NUM (a);
 	      ALLOCNO_CHEAP_CALLS_CROSSED_NUM (parent_a)
 		-= ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+	      /* Assume that ALLOCNO_CROSSED_CALLS_ABIS and
+		 ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS stay the same.
+		 We'd need to rebuild the IR to do better.  */
 	      ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (parent_a)
 		-= ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a);
 	      ira_assert (ALLOCNO_CALLS_CROSSED_NUM (parent_a) >= 0
@@ -3462,7 +3471,7 @@ ira_build (void)
 	 allocno crossing calls.  */
       FOR_EACH_ALLOCNO (a, ai)
 	if (ALLOCNO_CALLS_CROSSED_NUM (a) != 0)
-	  ior_hard_reg_conflicts (a, call_used_or_fixed_regs);
+	  ior_hard_reg_conflicts (a, ira_need_caller_save_regs (a));
     }
   if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL)
     print_copies (ira_dump_file);
Index: gcc/ira-color.c
===================================================================
--- gcc/ira-color.c	2019-09-10 19:56:32.569268148 +0100
+++ gcc/ira-color.c	2019-09-11 19:48:31.709788491 +0100
@@ -1650,7 +1650,7 @@ calculate_saved_nregs (int hard_regno, m
   ira_assert (hard_regno >= 0);
   for (i = hard_regno_nregs (hard_regno, mode) - 1; i >= 0; i--)
     if (!allocated_hardreg_p[hard_regno + i]
-	&& !TEST_HARD_REG_BIT (call_used_or_fixed_regs, hard_regno + i)
+	&& !crtl->abi->clobbers_full_reg_p (hard_regno + i)
 	&& !LOCAL_REGNO (hard_regno + i))
       nregs++;
   return nregs;
@@ -4379,7 +4379,7 @@ allocno_reload_assign (ira_allocno_t a,
       saved[i] = OBJECT_TOTAL_CONFLICT_HARD_REGS (obj);
       OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= forbidden_regs;
       if (! flag_caller_saves && ALLOCNO_CALLS_CROSSED_NUM (a) != 0)
-	OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= call_used_or_fixed_regs;
+	OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= ira_need_caller_save_regs (a);
     }
   ALLOCNO_ASSIGNED_P (a) = false;
   aclass = ALLOCNO_CLASS (a);
@@ -4398,9 +4398,7 @@ allocno_reload_assign (ira_allocno_t a,
 	       ? ALLOCNO_CLASS_COST (a)
 	       : ALLOCNO_HARD_REG_COSTS (a)[ira_class_hard_reg_index
 					    [aclass][hard_regno]]));
-      if (ALLOCNO_CALLS_CROSSED_NUM (a) != 0
-	  && ira_hard_reg_set_intersection_p (hard_regno, ALLOCNO_MODE (a),
-					      call_used_or_fixed_regs))
+      if (ira_need_caller_save_p (a, regno))
 	{
 	  ira_assert (flag_caller_saves);
 	  caller_save_needed = 1;
@@ -4687,16 +4685,16 @@ ira_mark_new_stack_slot (rtx x, int regn
    given IN and OUT for INSN.  Return also number points (through
    EXCESS_PRESSURE_LIVE_LENGTH) where the pseudo-register lives and
    the register pressure is high, number of references of the
-   pseudo-registers (through NREFS), number of callee-clobbered
-   hard-registers occupied by the pseudo-registers (through
-   CALL_USED_COUNT), and the first hard regno occupied by the
+   pseudo-registers (through NREFS), the number of psuedo registers
+   pseudo-registers (through NREFS), the number of pseudo registers
+   whose allocated register wouldn't need saving in the prologue
+   (through CALL_USED_COUNT), and the first hard regno occupied by the
    pseudo-registers (through FIRST_HARD_REGNO).  */
 static int
 calculate_spill_cost (int *regnos, rtx in, rtx out, rtx_insn *insn,
 		      int *excess_pressure_live_length,
 		      int *nrefs, int *call_used_count, int *first_hard_regno)
 {
-  int i, cost, regno, hard_regno, j, count, saved_cost, nregs;
+  int i, cost, regno, hard_regno, count, saved_cost;
   bool in_p, out_p;
   int length;
   ira_allocno_t a;
@@ -4713,11 +4711,8 @@ calculate_spill_cost (int *regnos, rtx i
       a = ira_regno_allocno_map[regno];
       length += ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a) / ALLOCNO_NUM_OBJECTS (a);
       cost += ALLOCNO_MEMORY_COST (a) - ALLOCNO_CLASS_COST (a);
-      nregs = hard_regno_nregs (hard_regno, ALLOCNO_MODE (a));
-      for (j = 0; j < nregs; j++)
-	if (! TEST_HARD_REG_BIT (call_used_or_fixed_regs, hard_regno + j))
-	  break;
-      if (j == nregs)
+      if (in_hard_reg_set_p (crtl->abi->full_reg_clobbers (),
+			     ALLOCNO_MODE (a), hard_regno))
 	count++;
       in_p = in && REG_P (in) && (int) REGNO (in) == hard_regno;
       out_p = out && REG_P (out) && (int) REGNO (out) == hard_regno;
Index: gcc/ira-conflicts.c
===================================================================
--- gcc/ira-conflicts.c	2019-09-11 19:47:32.898202916 +0100
+++ gcc/ira-conflicts.c	2019-09-11 19:48:31.709788491 +0100
@@ -738,9 +738,7 @@ ira_build_conflicts (void)
   if (! targetm.class_likely_spilled_p (base))
     CLEAR_HARD_REG_SET (temp_hard_reg_set);
   else
-    temp_hard_reg_set = (reg_class_contents[base]
-			 & ~ira_no_alloc_regs
-			 & call_used_or_fixed_regs);
+    temp_hard_reg_set = reg_class_contents[base] & ~ira_no_alloc_regs;
   FOR_EACH_ALLOCNO (a, ai)
     {
       int i, n = ALLOCNO_NUM_OBJECTS (a);
@@ -748,29 +746,28 @@ ira_build_conflicts (void)
       for (i = 0; i < n; i++)
 	{
 	  ira_object_t obj = ALLOCNO_OBJECT (a, i);
-	  machine_mode obj_mode = obj->allocno->mode;
 	  rtx allocno_reg = regno_reg_rtx [ALLOCNO_REGNO (a)];
 
-	  if ((! flag_caller_saves && ALLOCNO_CALLS_CROSSED_NUM (a) != 0)
-	      /* For debugging purposes don't put user defined variables in
-		 callee-clobbered registers.  However, do allow parameters
-		 in callee-clobbered registers to improve debugging.  This
-		 is a bit of a fragile hack.  */
-	      || (optimize == 0
-		  && REG_USERVAR_P (allocno_reg)
-		  && ! reg_is_parm_p (allocno_reg)))
+	  /* For debugging purposes don't put user defined variables in
+	     callee-clobbered registers.  However, do allow parameters
+	     in callee-clobbered registers to improve debugging.  This
+	     is a bit of a fragile hack.  */
+	  if (optimize == 0
+	      && REG_USERVAR_P (allocno_reg)
+	      && ! reg_is_parm_p (allocno_reg))
 	    {
-	      OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= call_used_or_fixed_regs;
-	      OBJECT_CONFLICT_HARD_REGS (obj) |= call_used_or_fixed_regs;
+	      HARD_REG_SET new_conflict_regs = crtl->abi->full_reg_clobbers ();
+	      OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= new_conflict_regs;
+	      OBJECT_CONFLICT_HARD_REGS (obj) |= new_conflict_regs;
 	    }
-	  else if (ALLOCNO_CALLS_CROSSED_NUM (a) != 0)
+
+	  if (ALLOCNO_CALLS_CROSSED_NUM (a) != 0)
 	    {
-	      HARD_REG_SET no_caller_save_reg_set
-		= (call_used_or_fixed_regs & ~savable_regs);
-	      OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= no_caller_save_reg_set;
-	      OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= temp_hard_reg_set;
-	      OBJECT_CONFLICT_HARD_REGS (obj) |= no_caller_save_reg_set;
-	      OBJECT_CONFLICT_HARD_REGS (obj) |= temp_hard_reg_set;
+	      HARD_REG_SET new_conflict_regs = ira_need_caller_save_regs (a);
+	      if (flag_caller_saves)
+		new_conflict_regs &= (~savable_regs | temp_hard_reg_set);
+	      OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= new_conflict_regs;
+	      OBJECT_CONFLICT_HARD_REGS (obj) |= new_conflict_regs;
 	    }
 
 	  /* Now we deal with paradoxical subreg cases where certain registers
@@ -797,23 +794,6 @@ ira_build_conflicts (void)
 		     }
 		}
 	    }
-
-	  if (ALLOCNO_CALLS_CROSSED_NUM (a) != 0)
-	    {
-	      int regno;
-
-	      /* Allocnos bigger than the saved part of call saved
-		 regs must conflict with them.  */
-	      for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-		if (!TEST_HARD_REG_BIT (call_used_or_fixed_regs, regno)
-		    && targetm.hard_regno_call_part_clobbered (0, regno,
-							       obj_mode))
-		  {
-		    SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno);
-		    SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj),
-				      regno);
-		  }
-	    }
 	}
     }
   if (optimize && ira_conflicts_p
Index: gcc/ira-costs.c
===================================================================
--- gcc/ira-costs.c	2019-09-11 19:47:32.898202916 +0100
+++ gcc/ira-costs.c	2019-09-11 19:48:31.713788462 +0100
@@ -2340,7 +2340,6 @@ ira_tune_allocno_costs (void)
   ira_allocno_object_iterator oi;
   ira_object_t obj;
   bool skip_p;
-  HARD_REG_SET *crossed_calls_clobber_regs;
 
   FOR_EACH_ALLOCNO (a, ai)
     {
@@ -2375,14 +2374,7 @@ ira_tune_allocno_costs (void)
 		continue;
 	      rclass = REGNO_REG_CLASS (regno);
 	      cost = 0;
-	      crossed_calls_clobber_regs
-		= &(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
-	      if (ira_hard_reg_set_intersection_p (regno, mode,
-						   *crossed_calls_clobber_regs)
-		  && (ira_hard_reg_set_intersection_p (regno, mode,
-						       call_used_or_fixed_regs)
-		      || targetm.hard_regno_call_part_clobbered (0, regno,
-								 mode)))
+	      if (ira_need_caller_save_p (a, regno))
 		cost += (ALLOCNO_CALL_FREQ (a)
 			 * (ira_memory_move_cost[mode][rclass][0]
 			    + ira_memory_move_cost[mode][rclass][1]));
Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c	2019-09-11 19:47:24.418262673 +0100
+++ gcc/ira-lives.c	2019-09-11 19:48:31.713788462 +0100
@@ -1255,11 +1255,7 @@ process_bb_node_lives (ira_loop_tree_nod
 		  ira_object_t obj = ira_object_id_map[i];
 		  a = OBJECT_ALLOCNO (obj);
 		  int num = ALLOCNO_NUM (a);
-		  HARD_REG_SET this_call_used_reg_set
-		    = call_insn_abi (insn).full_reg_clobbers ();
-		  /* ??? This preserves traditional behavior; it might not be
-		     needed.  */
-		  this_call_used_reg_set |= fixed_reg_set;
+		  function_abi abi = call_insn_abi (insn);
 
 		  /* Don't allocate allocnos that cross setjmps or any
 		     call, if this function receives a nonlocal
@@ -1275,9 +1271,9 @@ process_bb_node_lives (ira_loop_tree_nod
 		  if (can_throw_internal (insn))
 		    {
 		      OBJECT_CONFLICT_HARD_REGS (obj)
-			|= this_call_used_reg_set;
+			|= abi.mode_clobbers (ALLOCNO_MODE (a));
 		      OBJECT_TOTAL_CONFLICT_HARD_REGS (obj)
-			|= this_call_used_reg_set;
+			|= abi.mode_clobbers (ALLOCNO_MODE (a));
 		    }
 
 		  if (sparseset_bit_p (allocnos_processed, num))
@@ -1294,8 +1290,9 @@ process_bb_node_lives (ira_loop_tree_nod
 		  /* Mark it as saved at the next call.  */
 		  allocno_saved_at_call[num] = last_call_num + 1;
 		  ALLOCNO_CALLS_CROSSED_NUM (a)++;
+		  ALLOCNO_CROSSED_CALLS_ABIS (a) |= 1 << abi.id ();
 		  ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a)
-		    |= this_call_used_reg_set;
+		    |= abi.full_and_partial_reg_clobbers ();
 		  if (cheap_reg != NULL_RTX
 		      && ALLOCNO_REGNO (a) == (int) REGNO (cheap_reg))
 		    ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a)++;
@@ -1359,10 +1356,11 @@ process_bb_node_lives (ira_loop_tree_nod
 	  }
 
       /* Allocnos can't go in stack regs at the start of a basic block
-	 that is reached by an abnormal edge. Likewise for call
-	 clobbered regs, because caller-save, fixup_abnormal_edges and
-	 possibly the table driven EH machinery are not quite ready to
-	 handle such allocnos live across such edges.  */
+	 that is reached by an abnormal edge. Likewise for registers
+	 that are at least partly call clobbered, because caller-save,
+	 fixup_abnormal_edges and possibly the table driven EH machinery
+	 are not quite ready to handle such allocnos live across such
+	 edges.  */
       if (bb_has_abnormal_pred (bb))
 	{
 #ifdef STACK_REGS
@@ -1382,7 +1380,7 @@ process_bb_node_lives (ira_loop_tree_nod
 	  if (!cfun->has_nonlocal_label
 	      && has_abnormal_call_or_eh_pred_edge_p (bb))
 	    for (px = 0; px < FIRST_PSEUDO_REGISTER; px++)
-	      if (call_used_or_fixed_reg_p (px)
+	      if (eh_edge_abi.clobbers_at_least_part_of_reg_p (px)
 #ifdef REAL_PIC_OFFSET_TABLE_REGNUM
 		  /* We should create a conflict of PIC pseudo with
 		     PIC hard reg as PIC hard reg can have a wrong


* [20/32] Remove global call sets: loop-iv.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (18 preceding siblings ...)
  2019-09-11 19:12 ` [19/32] Remove global call sets: IRA Richard Sandiford
@ 2019-09-11 19:13 ` Richard Sandiford
  2019-09-29 21:20   ` Jeff Law
  2019-09-11 19:14 ` [21/32] Remove global call sets: LRA Richard Sandiford
                   ` (14 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:13 UTC (permalink / raw)
  To: gcc-patches

Similar idea to the combine.c and gcse.c patches.


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* loop-iv.c: Include regs.h and function-abi.h.
	(simplify_using_initial_values): Use call_insn_abi to get the
	ABI of the call insn target.  Conservatively assume that
	partially-clobbered registers are altered.

Index: gcc/loop-iv.c
===================================================================
--- gcc/loop-iv.c	2019-09-09 19:01:40.371078272 +0100
+++ gcc/loop-iv.c	2019-09-11 19:48:35.161764168 +0100
@@ -62,6 +62,8 @@ Free Software Foundation; either version
 #include "dumpfile.h"
 #include "rtl-iter.h"
 #include "tree-ssa-loop-niter.h"
+#include "regs.h"
+#include "function-abi.h"
 
 /* Possible return values of iv_get_reaching_def.  */
 
@@ -1972,8 +1974,14 @@ simplify_using_initial_values (class loo
 	  CLEAR_REG_SET (this_altered);
 	  note_stores (insn, mark_altered, this_altered);
 	  if (CALL_P (insn))
-	    /* Kill all call clobbered registers.  */
-	    IOR_REG_SET_HRS (this_altered, regs_invalidated_by_call);
+	    {
+	      /* Kill all registers that might be clobbered by the call.
+		 We don't track modes of hard registers, so we need to be
+		 conservative and assume that partial kills are full kills.  */
+	      function_abi abi = call_insn_abi (insn);
+	      IOR_REG_SET_HRS (this_altered,
+			       abi.full_and_partial_reg_clobbers ());
+	    }
 
 	  if (suitable_set_for_replacement (insn, &dest, &src))
 	    {


* [21/32] Remove global call sets: LRA
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (19 preceding siblings ...)
  2019-09-11 19:13 ` [20/32] Remove global call sets: loop-iv.c Richard Sandiford
@ 2019-09-11 19:14 ` Richard Sandiford
  2019-09-30 15:29   ` Jeff Law
  2019-10-04 18:03   ` H.J. Lu
  2019-09-11 19:14 ` [23/32] Remove global call sets: postreload-gcse.c Richard Sandiford
                   ` (13 subsequent siblings)
  34 siblings, 2 replies; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:14 UTC (permalink / raw)
  To: gcc-patches

lra_reg has an actual_call_used_reg_set field that is only used during
inheritance.  This in turn required a special lra_create_live_ranges
pass for flag_ipa_ra to set up this field.  This patch instead makes
the inheritance code do its own live register tracking, using the
same ABI-mask-and-clobber-set pair as for IRA.

Tracking ABIs simplifies (and cheapens) the logic in lra-lives.c and
means we no longer need a separate path for -fipa-ra.  It also means
we can remove TARGET_RETURN_CALL_WITH_MAX_CLOBBERS.

The patch also strengthens the sanity check in lra_assigns so that
we check that reg_renumber is consistent with the whole conflict set,
not just the call-clobbered registers.


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* target.def (return_call_with_max_clobbers): Delete.
	* doc/tm.texi.in (TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): Delete.
	* doc/tm.texi: Regenerate.
	* config/aarch64/aarch64.c (aarch64_return_call_with_max_clobbers)
	(TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): Delete.
	* lra-int.h (lra_reg::actual_call_used_reg_set): Delete.
	(lra_reg::call_insn): Delete.
	* lra.c: Include function-abi.h.
	(initialize_lra_reg_info_element): Don't initialize the fields above.
	(lra): Use crtl->abi to test whether the current function needs to
	save a register in the prologue.  Remove special pre-inheritance
	lra_create_live_ranges pass for flag_ipa_ra.
	* lra-assigns.c: Include function-abi.h
	(find_hard_regno_for_1): Use crtl->abi to test whether the current
	function needs to save a register in the prologue.
	(lra_assign): Assert that registers aren't allocated to a
	conflicting register, rather than checking only for overlaps
	with call_used_or_fixed_regs.  Do this even for flag_ipa_ra,
	and for registers that are not live across a call.
	* lra-constraints.c (last_call_for_abi): New variable.
	(full_and_partial_call_clobbers): Likewise.
	(setup_next_usage_insn): Remove the register from
	full_and_partial_call_clobbers.
	(need_for_call_save_p): Use call_clobbered_in_region_p to test
	whether the register needs a caller save.
	(need_for_split_p): Use full_and_partial_reg_clobbers instead
	of call_used_or_fixed_regs.
	(inherit_in_ebb): Initialize and maintain last_call_for_abi and
	full_and_partial_call_clobbers.
	* lra-lives.c (check_pseudos_live_through_calls): Replace
	last_call_used_reg_set and call_insn arguments with an abi argument.
	Remove handling of lra_reg::call_insn.  Use function_abi::mode_clobbers
	as the set of conflicting registers.
	(calls_have_same_clobbers_p): Delete.
	(process_bb_lives): Track the ABI of the last call instead of an
	insn/HARD_REG_SET pair.  Update calls to
	check_pseudos_live_through_calls.  Use eh_edge_abi to calculate
	the set of registers that could be clobbered by an EH edge.
	Include partially-clobbered as well as fully-clobbered registers.
	(lra_create_live_ranges_1): Don't initialize lra_reg::call_insn.
	* lra-remat.c: Include function-abi.h.
	(call_used_regs_arr_len, call_used_regs_arr): Delete.
	(set_bb_regs): Use call_insn_abi to get the set of call-clobbered
	registers and bitmap_view to combine them into dead_regs.
	(call_used_input_regno_present_p): Take a function_abi argument
	and use it to test whether a register is call-clobbered.
	(calculate_gen_cands): Use call_insn_abi to get the ABI of the
	call insn target.  Update the call to call_used_input_regno_present_p.
	(do_remat): Likewise.
	(lra_remat): Remove the initialization of call_used_regs_arr_len
	and call_used_regs_arr.

Index: gcc/target.def
===================================================================
--- gcc/target.def	2019-09-11 19:47:32.906202859 +0100
+++ gcc/target.def	2019-09-11 19:48:38.549740292 +0100
@@ -5786,20 +5786,6 @@ for targets that don't have partly call-
  hook_bool_uint_uint_mode_false)
 
 DEFHOOK
-(return_call_with_max_clobbers,
- "This hook returns a pointer to the call that partially clobbers the\n\
-most registers.  If a platform supports multiple ABIs where the registers\n\
-that are partially clobbered may vary, this function compares two\n\
-calls and returns a pointer to the one that clobbers the most registers.\n\
-If both calls clobber the same registers, @var{call_1} must be returned.\n\
-\n\
-The registers clobbered in different ABIs must be a proper subset or\n\
-superset of all other ABIs.  @var{call_1} must always be a call insn,\n\
-call_2 may be NULL or a call insn.",
- rtx_insn *, (rtx_insn *call_1, rtx_insn *call_2),
- NULL)
-
-DEFHOOK
 (get_multilib_abi_name,
  "This hook returns name of multilib ABI name.",
  const char *, (void),
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in	2019-09-11 19:47:24.414262702 +0100
+++ gcc/doc/tm.texi.in	2019-09-11 19:48:38.545740321 +0100
@@ -1718,8 +1718,6 @@ must be defined.  Modern ports should de
 @cindex call-saved register
 @hook TARGET_HARD_REGNO_CALL_PART_CLOBBERED
 
-@hook TARGET_RETURN_CALL_WITH_MAX_CLOBBERS
-
 @hook TARGET_GET_MULTILIB_ABI_NAME
 
 @findex fixed_regs
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	2019-09-11 19:47:32.898202916 +0100
+++ gcc/doc/tm.texi	2019-09-11 19:48:38.545740321 +0100
@@ -1941,18 +1941,6 @@ The default implementation returns false
 for targets that don't have partly call-clobbered registers.
 @end deftypefn
 
-@deftypefn {Target Hook} {rtx_insn *} TARGET_RETURN_CALL_WITH_MAX_CLOBBERS (rtx_insn *@var{call_1}, rtx_insn *@var{call_2})
-This hook returns a pointer to the call that partially clobbers the
-most registers.  If a platform supports multiple ABIs where the registers
-that are partially clobbered may vary, this function compares two
-calls and returns a pointer to the one that clobbers the most registers.
-If both calls clobber the same registers, @var{call_1} must be returned.
-
-The registers clobbered in different ABIs must be a proper subset or
-superset of all other ABIs.  @var{call_1} must always be a call insn,
-call_2 may be NULL or a call insn.
-@end deftypefn
-
 @deftypefn {Target Hook} {const char *} TARGET_GET_MULTILIB_ABI_NAME (void)
 This hook returns name of multilib ABI name.
 @end deftypefn
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	2019-09-11 19:47:32.858203198 +0100
+++ gcc/config/aarch64/aarch64.c	2019-09-11 19:48:38.541740349 +0100
@@ -1926,19 +1926,6 @@ aarch64_hard_regno_call_part_clobbered (
   return false;
 }
 
-/* Implement TARGET_RETURN_CALL_WITH_MAX_CLOBBERS.  */
-
-rtx_insn *
-aarch64_return_call_with_max_clobbers (rtx_insn *call_1, rtx_insn *call_2)
-{
-  gcc_assert (CALL_P (call_1) && CALL_P (call_2));
-
-  if (!aarch64_simd_call_p (call_1) || aarch64_simd_call_p (call_2))
-    return call_1;
-  else
-    return call_2;
-}
-
 /* Implement REGMODE_NATURAL_SIZE.  */
 poly_uint64
 aarch64_regmode_natural_size (machine_mode mode)
@@ -20804,10 +20791,6 @@ #define TARGET_HARD_REGNO_CALL_PART_CLOB
 #undef TARGET_CALL_INSN_ABI
 #define TARGET_CALL_INSN_ABI aarch64_call_insn_abi
 
-#undef TARGET_RETURN_CALL_WITH_MAX_CLOBBERS
-#define TARGET_RETURN_CALL_WITH_MAX_CLOBBERS \
-  aarch64_return_call_with_max_clobbers
-
 #undef TARGET_CONSTANT_ALIGNMENT
 #define TARGET_CONSTANT_ALIGNMENT aarch64_constant_alignment
 
Index: gcc/lra-int.h
===================================================================
--- gcc/lra-int.h	2019-08-19 15:57:56.818306311 +0100
+++ gcc/lra-int.h	2019-09-11 19:48:38.545740321 +0100
@@ -73,10 +73,6 @@ struct lra_copy
   /* The following fields are defined only for pseudos.	 */
   /* Hard registers with which the pseudo conflicts.  */
   HARD_REG_SET conflict_hard_regs;
-  /* Call used registers with which the pseudo conflicts, taking into account
-     the registers used by functions called from calls which cross the
-     pseudo.  */
-  HARD_REG_SET actual_call_used_reg_set;
   /* We assign hard registers to reload pseudos which can occur in few
      places.  So two hard register preferences are enough for them.
      The following fields define the preferred hard registers.	If
@@ -104,8 +100,6 @@ struct lra_copy
   int val;
   /* Offset from relative eliminate register to pesudo reg.  */
   poly_int64 offset;
-  /* Call instruction, if any, that may affect this psuedo reg.  */
-  rtx_insn *call_insn;
   /* These members are set up in lra-lives.c and updated in
      lra-coalesce.c.  */
   /* The biggest size mode in which each pseudo reg is referred in
Index: gcc/lra.c
===================================================================
--- gcc/lra.c	2019-09-10 19:56:45.357177891 +0100
+++ gcc/lra.c	2019-09-11 19:48:38.549740292 +0100
@@ -121,6 +121,7 @@ Software Foundation; either version 3, o
 #include "lra.h"
 #include "lra-int.h"
 #include "print-rtl.h"
+#include "function-abi.h"
 
 /* Dump bitmap SET with TITLE and BB INDEX.  */
 void
@@ -1323,7 +1324,6 @@ initialize_lra_reg_info_element (int i)
   lra_reg_info[i].no_stack_p = false;
 #endif
   CLEAR_HARD_REG_SET (lra_reg_info[i].conflict_hard_regs);
-  CLEAR_HARD_REG_SET (lra_reg_info[i].actual_call_used_reg_set);
   lra_reg_info[i].preferred_hard_regno1 = -1;
   lra_reg_info[i].preferred_hard_regno2 = -1;
   lra_reg_info[i].preferred_hard_regno_profit1 = 0;
@@ -1336,7 +1336,6 @@ initialize_lra_reg_info_element (int i)
   lra_reg_info[i].val = get_new_reg_value ();
   lra_reg_info[i].offset = 0;
   lra_reg_info[i].copies = NULL;
-  lra_reg_info[i].call_insn = NULL;
 }
 
 /* Initialize common reg info and copies.  */
@@ -2420,7 +2419,9 @@ lra (FILE *f)
 
   if (crtl->saves_all_registers)
     for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-      if (!call_used_or_fixed_reg_p (i) && !fixed_regs[i] && !LOCAL_REGNO (i))
+      if (!crtl->abi->clobbers_full_reg_p (i)
+	  && !fixed_regs[i]
+	  && !LOCAL_REGNO (i))
 	df_set_regs_ever_live (i, true);
 
   /* We don't DF from now and avoid its using because it is to
@@ -2478,19 +2479,7 @@ lra (FILE *f)
 	    }
 	  /* Do inheritance only for regular algorithms.  */
 	  if (! lra_simple_p)
-	    {
-	      if (flag_ipa_ra)
-		{
-		  if (live_p)
-		    lra_clear_live_ranges ();
-		  /* As a side-effect of lra_create_live_ranges, we calculate
-		     actual_call_used_reg_set,  which is needed during
-		     lra_inheritance.  */
-		  lra_create_live_ranges (true, true);
-		  live_p = true;
-		}
-	      lra_inheritance ();
-	    }
+	    lra_inheritance ();
 	  if (live_p)
 	    lra_clear_live_ranges ();
 	  bool fails_p;
Index: gcc/lra-assigns.c
===================================================================
--- gcc/lra-assigns.c	2019-09-10 19:56:32.573268120 +0100
+++ gcc/lra-assigns.c	2019-09-11 19:48:38.545740321 +0100
@@ -94,6 +94,7 @@ Software Foundation; either version 3, o
 #include "params.h"
 #include "lra.h"
 #include "lra-int.h"
+#include "function-abi.h"
 
 /* Current iteration number of the pass and current iteration number
    of the pass after the latest spill pass when any former reload
@@ -654,7 +655,7 @@ find_hard_regno_for_1 (int regno, int *c
 	  for (j = 0;
 	       j < hard_regno_nregs (hard_regno, PSEUDO_REGNO_MODE (regno));
 	       j++)
-	    if (! TEST_HARD_REG_BIT (call_used_or_fixed_regs, hard_regno + j)
+	    if (! crtl->abi->clobbers_full_reg_p (hard_regno + j)
 		&& ! df_regs_ever_live_p (hard_regno + j))
 	      /* It needs save restore.	 */
 	      hard_regno_costs[hard_regno]
@@ -1634,14 +1635,14 @@ lra_assign (bool &fails_p)
   bitmap_initialize (&all_spilled_pseudos, &reg_obstack);
   create_live_range_start_chains ();
   setup_live_pseudos_and_spill_after_risky_transforms (&all_spilled_pseudos);
-  if (! lra_asm_error_p && flag_checking && !flag_ipa_ra)
+  if (! lra_asm_error_p && flag_checking)
     /* Check correctness of allocation for call-crossed pseudos but
        only when there are no asm errors as in the case of errors the
        asm is removed and it can result in incorrect allocation.  */
     for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
-      if (lra_reg_info[i].nrefs != 0 && reg_renumber[i] >= 0
-	  && lra_reg_info[i].call_insn
-	  && overlaps_hard_reg_set_p (call_used_or_fixed_regs,
+      if (lra_reg_info[i].nrefs != 0
+	  && reg_renumber[i] >= 0
+	  && overlaps_hard_reg_set_p (lra_reg_info[i].conflict_hard_regs,
 				      PSEUDO_REGNO_MODE (i), reg_renumber[i]))
 	gcc_unreachable ();
   /* Setup insns to process on the next constraint pass.  */
Index: gcc/lra-constraints.c
===================================================================
--- gcc/lra-constraints.c	2019-09-11 19:47:32.898202916 +0100
+++ gcc/lra-constraints.c	2019-09-11 19:48:38.545740321 +0100
@@ -5147,6 +5147,14 @@ clear_invariants (void)
 /* Number of calls passed so far in current EBB.  */
 static int calls_num;
 
+/* Index ID is the CALLS_NUM associated with the last call we saw
+   with ABI identifier ID.  */
+static int last_call_for_abi[NUM_ABI_IDS];
+
+/* Which registers have been fully or partially clobbered by a call
+   since they were last used.  */
+static HARD_REG_SET full_and_partial_call_clobbers;
+
 /* Current reload pseudo check for validity of elements in
    USAGE_INSNS.	 */
 static int curr_usage_insns_check;
@@ -5190,6 +5198,10 @@ setup_next_usage_insn (int regno, rtx in
   usage_insns[regno].reloads_num = reloads_num;
   usage_insns[regno].calls_num = calls_num;
   usage_insns[regno].after_p = after_p;
+  if (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0)
+    remove_from_hard_reg_set (&full_and_partial_call_clobbers,
+			      PSEUDO_REGNO_MODE (regno),
+			      reg_renumber[regno]);
 }
 
 /* The function is used to form list REGNO usages which consists of
@@ -5435,17 +5447,19 @@ inherit_reload_reg (bool def_p, int orig
 need_for_call_save_p (int regno)
 {
   lra_assert (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0);
-  return (usage_insns[regno].calls_num < calls_num
-	  && (overlaps_hard_reg_set_p
-	      ((flag_ipa_ra &&
-		! hard_reg_set_empty_p (lra_reg_info[regno].actual_call_used_reg_set))
-	       ? lra_reg_info[regno].actual_call_used_reg_set
-	       : call_used_or_fixed_regs,
-	       PSEUDO_REGNO_MODE (regno), reg_renumber[regno])
-	      || (targetm.hard_regno_call_part_clobbered
-		  (lra_reg_info[regno].call_insn
-		   ? call_insn_abi (lra_reg_info[regno].call_insn).id () : 0,
-		   reg_renumber[regno], PSEUDO_REGNO_MODE (regno)))));
+  if (usage_insns[regno].calls_num < calls_num)
+    {
+      unsigned int abis = 0;
+      for (unsigned int i = 0; i < NUM_ABI_IDS; ++i)
+	if (last_call_for_abi[i] > usage_insns[regno].calls_num)
+	  abis |= 1 << i;
+      gcc_assert (abis);
+      if (call_clobbered_in_region_p (abis, full_and_partial_call_clobbers,
+				      PSEUDO_REGNO_MODE (regno),
+				      reg_renumber[regno]))
+	return true;
+    }
+  return false;
 }
 
 /* Global registers occurring in the current EBB.  */
@@ -5485,8 +5499,7 @@ need_for_split_p (HARD_REG_SET potential
 	      true) the assign pass assumes that all pseudos living
 	      through calls are assigned to call saved hard regs.  */
 	   && (regno >= FIRST_PSEUDO_REGISTER
-	       || ! TEST_HARD_REG_BIT (call_used_or_fixed_regs, regno)
-	       || usage_insns[regno].calls_num == calls_num)
+	       || !TEST_HARD_REG_BIT (full_and_partial_call_clobbers, regno))
 	   /* We need at least 2 reloads to make pseudo splitting
 	      profitable.  We should provide hard regno splitting in
 	      any case to solve 1st insn scheduling problem when
@@ -6238,6 +6251,9 @@ inherit_in_ebb (rtx_insn *head, rtx_insn
   curr_usage_insns_check++;
   clear_invariants ();
   reloads_num = calls_num = 0;
+  for (unsigned int i = 0; i < NUM_ABI_IDS; ++i)
+    last_call_for_abi[i] = 0;
+  CLEAR_HARD_REG_SET (full_and_partial_call_clobbers);
   bitmap_clear (&check_only_regs);
   bitmap_clear (&invalid_invariant_regs);
   last_processed_bb = NULL;
@@ -6451,6 +6467,10 @@ inherit_in_ebb (rtx_insn *head, rtx_insn
 	      int regno, hard_regno;
 
 	      calls_num++;
+	      function_abi abi = call_insn_abi (curr_insn);
+	      last_call_for_abi[abi.id ()] = calls_num;
+	      full_and_partial_call_clobbers
+		|= abi.full_and_partial_reg_clobbers ();
 	      if ((cheap = find_reg_note (curr_insn,
 					  REG_RETURNED, NULL_RTX)) != NULL_RTX
 		  && ((cheap = XEXP (cheap, 0)), true)
@@ -6460,7 +6480,7 @@ inherit_in_ebb (rtx_insn *head, rtx_insn
 		  /* If there are pending saves/restores, the
 		     optimization is not worth.	 */
 		  && usage_insns[regno].calls_num == calls_num - 1
-		  && TEST_HARD_REG_BIT (call_used_or_fixed_regs, hard_regno))
+		  && abi.clobbers_reg_p (GET_MODE (cheap), hard_regno))
 		{
 		  /* Restore the pseudo from the call result as
 		     REG_RETURNED note says that the pseudo value is
@@ -6483,6 +6503,9 @@ inherit_in_ebb (rtx_insn *head, rtx_insn
 		      /* We don't need to save/restore of the pseudo from
 			 this call.	 */
 		      usage_insns[regno].calls_num = calls_num;
+		      remove_from_hard_reg_set
+			(&full_and_partial_call_clobbers,
+			 GET_MODE (cheap), hard_regno);
 		      bitmap_set_bit (&check_only_regs, regno);
 		    }
 		}
Index: gcc/lra-lives.c
===================================================================
--- gcc/lra-lives.c	2019-09-11 19:47:32.898202916 +0100
+++ gcc/lra-lives.c	2019-09-11 19:48:38.549740292 +0100
@@ -576,40 +576,21 @@ lra_setup_reload_pseudo_preferenced_hard
     }
 }
 
-/* Check that REGNO living through calls and setjumps, set up conflict
-   regs using LAST_CALL_USED_REG_SET, and clear corresponding bits in
-   PSEUDOS_LIVE_THROUGH_CALLS and PSEUDOS_LIVE_THROUGH_SETJUMPS.
-   CALL_INSN is a call that is representative of all calls in the region
-   described by the PSEUDOS_LIVE_THROUGH_* sets, in terms of the registers
-   that it preserves and clobbers.  */
+/* Check whether REGNO lives through calls and setjmps and clear
+   the corresponding bits in PSEUDOS_LIVE_THROUGH_CALLS and
+   PSEUDOS_LIVE_THROUGH_SETJUMPS.  All calls in the region described
+   by PSEUDOS_LIVE_THROUGH_CALLS have the given ABI.  */
 
 static inline void
-check_pseudos_live_through_calls (int regno,
-				  HARD_REG_SET last_call_used_reg_set,
-				  rtx_insn *call_insn)
+check_pseudos_live_through_calls (int regno, const function_abi &abi)
 {
-  int hr;
-  rtx_insn *old_call_insn;
-
   if (! sparseset_bit_p (pseudos_live_through_calls, regno))
     return;
 
-  function_abi abi = call_insn_abi (call_insn);
-  old_call_insn = lra_reg_info[regno].call_insn;
-  if (!old_call_insn
-      || (targetm.return_call_with_max_clobbers
-	  && targetm.return_call_with_max_clobbers (old_call_insn, call_insn)
-	     == call_insn))
-    lra_reg_info[regno].call_insn = call_insn;
+  machine_mode mode = PSEUDO_REGNO_MODE (regno);
 
   sparseset_clear_bit (pseudos_live_through_calls, regno);
-  lra_reg_info[regno].conflict_hard_regs |= last_call_used_reg_set;
-
-  for (hr = 0; HARD_REGISTER_NUM_P (hr); hr++)
-    if (targetm.hard_regno_call_part_clobbered (abi.id (), hr,
-						PSEUDO_REGNO_MODE (regno)))
-      add_to_hard_reg_set (&lra_reg_info[regno].conflict_hard_regs,
-			   PSEUDO_REGNO_MODE (regno), hr);
+  lra_reg_info[regno].conflict_hard_regs |= abi.mode_clobbers (mode);
   if (! sparseset_bit_p (pseudos_live_through_setjumps, regno))
     return;
   sparseset_clear_bit (pseudos_live_through_setjumps, regno);
@@ -630,19 +611,6 @@ reg_early_clobber_p (const struct lra_in
 	     && TEST_BIT (reg->early_clobber_alts, n_alt)));
 }
 
-/* Return true if call instructions CALL1 and CALL2 use ABIs that
-   preserve the same set of registers.  */
-
-static bool
-calls_have_same_clobbers_p (rtx_insn *call1, rtx_insn *call2)
-{
-  if (!targetm.return_call_with_max_clobbers)
-    return false;
-
-  return (targetm.return_call_with_max_clobbers (call1, call2) == call1
-          && targetm.return_call_with_max_clobbers (call2, call1) == call2);
-}
-
 /* Process insns of the basic block BB to update pseudo live ranges,
    pseudo hard register conflicts, and insn notes.  We do it on
    backward scan of BB insns.  CURR_POINT is the program point where
@@ -662,15 +630,13 @@ process_bb_lives (basic_block bb, int &c
   rtx_insn *next;
   rtx link, *link_loc;
   bool need_curr_point_incr;
-  HARD_REG_SET last_call_used_reg_set;
-  rtx_insn *call_insn = NULL;
-  rtx_insn *last_call_insn = NULL;
+  /* Only has a meaningful value once we've seen a call.  */
+  function_abi last_call_abi = default_function_abi;
 
   reg_live_out = df_get_live_out (bb);
   sparseset_clear (pseudos_live);
   sparseset_clear (pseudos_live_through_calls);
   sparseset_clear (pseudos_live_through_setjumps);
-  CLEAR_HARD_REG_SET (last_call_used_reg_set);
   REG_SET_TO_HARD_REG_SET (hard_regs_live, reg_live_out);
   hard_regs_live &= ~eliminable_regset;
   EXECUTE_IF_SET_IN_BITMAP (reg_live_out, FIRST_PSEUDO_REGISTER, j, bi)
@@ -876,9 +842,8 @@ process_bb_lives (basic_block bb, int &c
 	    {
 	      update_pseudo_point (reg->regno, curr_point, USE_POINT);
 	      mark_regno_live (reg->regno, reg->biggest_mode);
-	      check_pseudos_live_through_calls (reg->regno,
-						last_call_used_reg_set,
-						call_insn);
+	      /* ??? Should be a no-op for unused registers.  */
+	      check_pseudos_live_through_calls (reg->regno, last_call_abi);
 	    }
 
 	  if (!HARD_REGISTER_NUM_P (reg->regno))
@@ -927,37 +892,13 @@ process_bb_lives (basic_block bb, int &c
 
       if (call_p)
 	{
-	  call_insn = curr_insn;
-	  if (! flag_ipa_ra && ! targetm.return_call_with_max_clobbers)
-	    last_call_used_reg_set = call_used_or_fixed_regs;
-	  else
-	    {
-	      HARD_REG_SET this_call_used_reg_set
-		= call_insn_abi (curr_insn).full_reg_clobbers ();
-	      /* ??? This preserves traditional behavior; it might not
-		 be needed.  */
-	      this_call_used_reg_set |= fixed_reg_set;
-
-	      bool flush = (! hard_reg_set_empty_p (last_call_used_reg_set)
-			    && (last_call_used_reg_set
-				!= this_call_used_reg_set))
-			   || (last_call_insn && ! calls_have_same_clobbers_p
-						     (call_insn,
-						      last_call_insn));
+	  function_abi call_abi = call_insn_abi (curr_insn);
 
-	      EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, j)
-		{
-		  lra_reg_info[j].actual_call_used_reg_set
-		    |= this_call_used_reg_set;
+	  if (last_call_abi != call_abi)
+	    EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, j)
+	      check_pseudos_live_through_calls (j, last_call_abi);
 
-		  if (flush)
-		    check_pseudos_live_through_calls (j,
-						      last_call_used_reg_set,
-						      last_call_insn);
-		}
-	      last_call_used_reg_set = this_call_used_reg_set;
-	      last_call_insn = call_insn;
-	    }
+	  last_call_abi = call_abi;
 
 	  sparseset_ior (pseudos_live_through_calls,
 			 pseudos_live_through_calls, pseudos_live);
@@ -995,9 +936,7 @@ process_bb_lives (basic_block bb, int &c
 	    if (reg->type == OP_IN)
 	      update_pseudo_point (reg->regno, curr_point, USE_POINT);
 	    mark_regno_live (reg->regno, reg->biggest_mode);
-	    check_pseudos_live_through_calls (reg->regno,
-					      last_call_used_reg_set,
-					      call_insn);
+	    check_pseudos_live_through_calls (reg->regno, last_call_abi);
 	  }
 
       for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next)
@@ -1091,10 +1030,10 @@ process_bb_lives (basic_block bb, int &c
       }
 
   /* Pseudos can't go in stack regs at the start of a basic block that
-     is reached by an abnormal edge. Likewise for call clobbered regs,
-     because caller-save, fixup_abnormal_edges and possibly the table
-     driven EH machinery are not quite ready to handle such pseudos
-     live across such edges.  */
+     is reached by an abnormal edge.  Likewise for registers that are at
+     least partly call clobbered, because caller-save, fixup_abnormal_edges
+     and possibly the table driven EH machinery are not quite ready to
+     handle such pseudos live across such edges.  */
   if (bb_has_abnormal_pred (bb))
     {
 #ifdef STACK_REGS
@@ -1109,7 +1048,7 @@ process_bb_lives (basic_block bb, int &c
       if (!cfun->has_nonlocal_label
 	  && has_abnormal_call_or_eh_pred_edge_p (bb))
 	for (px = 0; HARD_REGISTER_NUM_P (px); px++)
-	  if (call_used_or_fixed_reg_p (px)
+	  if (eh_edge_abi.clobbers_at_least_part_of_reg_p (px)
 #ifdef REAL_PIC_OFFSET_TABLE_REGNUM
 	      /* We should create a conflict of PIC pseudo with PIC
 		 hard reg as PIC hard reg can have a wrong value after
@@ -1166,7 +1105,7 @@ process_bb_lives (basic_block bb, int &c
       if (sparseset_cardinality (pseudos_live_through_calls) == 0)
 	break;
       if (sparseset_bit_p (pseudos_live_through_calls, j))
-	check_pseudos_live_through_calls (j, last_call_used_reg_set, call_insn);
+	check_pseudos_live_through_calls (j, last_call_abi);
     }
 
   for (i = 0; HARD_REGISTER_NUM_P (i); ++i)
@@ -1400,7 +1339,6 @@ lra_create_live_ranges_1 (bool all_p, bo
 	lra_reg_info[i].biggest_mode = GET_MODE (regno_reg_rtx[i]);
       else
 	lra_reg_info[i].biggest_mode = VOIDmode;
-      lra_reg_info[i].call_insn = NULL;
       if (!HARD_REGISTER_NUM_P (i)
 	  && lra_reg_info[i].nrefs != 0)
 	{
Index: gcc/lra-remat.c
===================================================================
--- gcc/lra-remat.c	2019-09-10 19:56:45.357177891 +0100
+++ gcc/lra-remat.c	2019-09-11 19:48:38.549740292 +0100
@@ -65,16 +65,11 @@ Software Foundation; either version 3, o
 #include "recog.h"
 #include "lra.h"
 #include "lra-int.h"
+#include "function-abi.h"
 
 /* Number of candidates for rematerialization.  */
 static unsigned int cands_num;
 
-/* The following is used for representation of call_used_or_fixed_regs in
-   form array whose elements are hard register numbers with nonzero bit
-   in CALL_USED_OR_FIXED_REGS. */
-static int call_used_regs_arr_len;
-static int call_used_regs_arr[FIRST_PSEUDO_REGISTER];
-
 /* Bitmap used for different calculations.  */
 static bitmap_head temp_bitmap;
 
@@ -633,9 +628,12 @@ set_bb_regs (basic_block bb, rtx_insn *i
 	bitmap_set_bit (&subreg_regs, regno);
     }
   if (CALL_P (insn))
-    for (int i = 0; i < call_used_regs_arr_len; i++)
-      bitmap_set_bit (&get_remat_bb_data (bb)->dead_regs,
-		      call_used_regs_arr[i]);
+    {
+      function_abi abi = call_insn_abi (insn);
+      /* Partially-clobbered registers might still be live.  */
+      bitmap_ior_into (&get_remat_bb_data (bb)->dead_regs,
+		       bitmap_view<HARD_REG_SET> (abi.full_reg_clobbers ()));
+    }
 }
 
 /* Calculate changed_regs and dead_regs for each BB.  */
@@ -698,7 +696,7 @@ reg_overlap_for_remat_p (lra_insn_reg *r
 
 /* Return true if a call used register is an input operand of INSN.  */
 static bool
-call_used_input_regno_present_p (rtx_insn *insn)
+call_used_input_regno_present_p (const function_abi &abi, rtx_insn *insn)
 {
   int iter;
   lra_insn_recog_data_t id = lra_get_insn_recog_data (insn);
@@ -709,8 +707,9 @@ call_used_input_regno_present_p (rtx_ins
     for (reg = (iter == 0 ? id->regs : static_id->hard_regs);
 	 reg != NULL;
 	 reg = reg->next)
-      if (reg->type == OP_IN && reg->regno < FIRST_PSEUDO_REGISTER
-	  && TEST_HARD_REG_BIT (call_used_or_fixed_regs, reg->regno))
+      if (reg->type == OP_IN
+	  && reg->regno < FIRST_PSEUDO_REGISTER
+	  && abi.clobbers_reg_p (reg->biggest_mode, reg->regno))
 	return true;
   return false;
 }
@@ -799,18 +798,21 @@ calculate_gen_cands (void)
 		    }
 	    
 	    if (CALL_P (insn))
-	      EXECUTE_IF_SET_IN_BITMAP (gen_insns, 0, uid, bi)
-		{
-		  rtx_insn *insn2 = lra_insn_recog_data[uid]->insn;
+	      {
+		function_abi abi = call_insn_abi (insn);
+		EXECUTE_IF_SET_IN_BITMAP (gen_insns, 0, uid, bi)
+		  {
+		    rtx_insn *insn2 = lra_insn_recog_data[uid]->insn;
 		  
-		  cand = insn_to_cand[INSN_UID (insn2)];
-		  gcc_assert (cand != NULL);
-		  if (call_used_input_regno_present_p (insn2))
-		    {
-		      bitmap_clear_bit (gen_cands, cand->index);
-		      bitmap_set_bit (&temp_bitmap, uid);
-		    }
-		}
+		    cand = insn_to_cand[INSN_UID (insn2)];
+		    gcc_assert (cand != NULL);
+		    if (call_used_input_regno_present_p (abi, insn2))
+		      {
+			bitmap_clear_bit (gen_cands, cand->index);
+			bitmap_set_bit (&temp_bitmap, uid);
+		      }
+		  }
+	      }
 	    bitmap_and_compl_into (gen_insns, &temp_bitmap);
 
 	    cand = insn_to_cand[INSN_UID (insn)];
@@ -1205,13 +1207,16 @@ do_remat (void)
 		  }
 
 	  if (CALL_P (insn))
-	    EXECUTE_IF_SET_IN_BITMAP (avail_cands, 0, cid, bi)
-	      {
-		cand = all_cands[cid];
+	    {
+	      function_abi abi = call_insn_abi (insn);
+	      EXECUTE_IF_SET_IN_BITMAP (avail_cands, 0, cid, bi)
+		{
+		  cand = all_cands[cid];
 		
-		if (call_used_input_regno_present_p (cand->insn))
-		  bitmap_set_bit (&temp_bitmap, cand->index);
-	      }
+		  if (call_used_input_regno_present_p (abi, cand->insn))
+		    bitmap_set_bit (&temp_bitmap, cand->index);
+		}
+	    }
 
 	  bitmap_and_compl_into (avail_cands, &temp_bitmap);
 
@@ -1307,10 +1312,6 @@ lra_remat (void)
   insn_to_cand_activation = XCNEWVEC (cand_t, get_max_uid ());
   regno_cands = XCNEWVEC (cand_t, max_regno);
   all_cands.create (8000);
-  call_used_regs_arr_len = 0;
-  for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-    if (call_used_or_fixed_reg_p (i))
-      call_used_regs_arr[call_used_regs_arr_len++] = i;
   initiate_cand_table ();
   create_remat_bb_data ();
   bitmap_initialize (&temp_bitmap, &reg_obstack);
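The last_call_for_abi scheme added in lra-constraints.c above can be sketched in isolation as follows.  This is toy standalone code, not GCC's implementation: only the names calls_num, last_call_for_abi and the comparison in abis_since come from the patch; NUM_ABI_IDS and everything else are illustrative.

```cpp
#include <cassert>

const unsigned NUM_ABI_IDS = 4;   // toy value; the real limit is
                                  // target-defined

static int calls_num;             // calls seen so far in the EBB
static int last_call_for_abi[NUM_ABI_IDS];

// Process one call with the given ABI identifier.
static void note_call (unsigned abi_id)
{
  ++calls_num;
  last_call_for_abi[abi_id] = calls_num;
}

// Bitmask of ABIs used by calls seen after the point at which
// CALLS_SEEN calls had been processed (cf. need_for_call_save_p).
static unsigned abis_since (int calls_seen)
{
  unsigned abis = 0;
  for (unsigned i = 0; i < NUM_ABI_IDS; ++i)
    if (last_call_for_abi[i] > calls_seen)
      abis |= 1u << i;
  return abis;
}
```

Recording only the ordinal of the last call per ABI is enough, because a pseudo's usage_insns entry stores the calls_num at its last use; any ABI whose last call is later than that ordinal must have a call in the crossed region.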

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [22/32] Remove global call sets: postreload.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (21 preceding siblings ...)
  2019-09-11 19:14 ` [23/32] Remove global call sets: postreload-gcse.c Richard Sandiford
@ 2019-09-11 19:14 ` Richard Sandiford
  2019-09-29 21:33   ` Jeff Law
  2019-09-11 19:15 ` [24/32] Remove global call sets: recog.c Richard Sandiford
                   ` (11 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:14 UTC (permalink / raw)
  To: gcc-patches

The "|= fixed_reg_set" in reload_combine isn't necessary, since the
set is only used to determine which values have changed (rather than,
for example, which registers are available for use).

In reload_cse_move2add we can be accurate about which registers
are still available.  A BLKmode entry indicates a continuation of
the value in the previous register, and since clobbers_reg_p handles
multi-register values, it's enough to skip over BLKmode entries and
test just the start register.
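As a toy illustration of that representation (a hypothetical simplified model, not the actual postreload.c data structures: the modes, register count and clobber predicate are all stand-ins):

```cpp
#include <cassert>

// Toy model of reload_cse_move2add's per-register tracking.
enum machine_mode { VOIDmode, BLKmode, SImode, DImode, TImode };

const int NUM_REGS = 8;
static machine_mode reg_mode[NUM_REGS];

// Number of hard registers a value of MODE occupies (toy: 4-byte regs).
static int mode_nregs (machine_mode mode)
{
  switch (mode)
    {
    case SImode: return 1;
    case DImode: return 2;
    case TImode: return 4;
    default: return 0;
    }
}

// Record that registers [REGNO, REGNO + nregs) hold one value of MODE:
// the first entry carries the mode, the rest are BLKmode continuations.
static void record_value (int regno, machine_mode mode)
{
  reg_mode[regno] = mode;
  for (int i = 1; i < mode_nregs (mode); ++i)
    reg_mode[regno + i] = BLKmode;
}

// Invalidate values clobbered by a call.  BLKmode entries are skipped;
// only the start register is tested, on the assumption that the clobber
// predicate (like clobbers_reg_p) accounts for the whole multi-register
// value.
static void invalidate_after_call (bool (*clobbers_reg_p) (machine_mode, int))
{
  for (int i = NUM_REGS - 1; i >= 0; --i)
    if (reg_mode[i] != VOIDmode
	&& reg_mode[i] != BLKmode
	&& clobbers_reg_p (reg_mode[i], i))
      {
	int n = mode_nregs (reg_mode[i]);
	for (int j = 0; j < n; ++j)
	  reg_mode[i + j] = VOIDmode;
      }
}
```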


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* postreload.c (reload_combine_recognize_pattern): Use crtl->abi
	when deciding whether a register is free for use after RA.
	(reload_combine): Remove unnecessary use of fixed_reg_set.
	(reload_cse_move2add): Use call_insn_abi to get the ABI of the
	call insn target.  Use reg_mode when testing whether a register
	is no longer available.

Index: gcc/postreload.c
===================================================================
--- gcc/postreload.c	2019-09-11 19:47:24.418262673 +0100
+++ gcc/postreload.c	2019-09-11 19:48:41.905716645 +0100
@@ -1136,7 +1136,8 @@ reload_combine_recognize_pattern (rtx_in
 	      if (TEST_HARD_REG_BIT (reg_class_contents[INDEX_REG_CLASS], i)
 		  && reg_state[i].use_index == RELOAD_COMBINE_MAX_USES
 		  && reg_state[i].store_ruid <= reg_state[regno].use_ruid
-		  && (call_used_or_fixed_reg_p (i) || df_regs_ever_live_p (i))
+		  && (crtl->abi->clobbers_full_reg_p (i)
+		      || df_regs_ever_live_p (i))
 		  && (!frame_pointer_needed || i != HARD_FRAME_POINTER_REGNUM)
 		  && !fixed_regs[i] && !global_regs[i]
 		  && hard_regno_nregs (i, GET_MODE (reg)) == 1
@@ -1332,9 +1333,6 @@ reload_combine (void)
 	{
 	  rtx link;
 	  HARD_REG_SET used_regs = call_insn_abi (insn).full_reg_clobbers ();
-	  /* ??? This preserves traditional behavior; it might not be
-	     needed.  */
-	  used_regs |= fixed_reg_set;
 
 	  for (r = 0; r < FIRST_PSEUDO_REGISTER; r++)
 	    if (TEST_HARD_REG_BIT (used_regs, r))
@@ -2126,12 +2124,13 @@ reload_cse_move2add (rtx_insn *first)
 	 unknown values.  */
       if (CALL_P (insn))
 	{
+	  function_abi abi = call_insn_abi (insn);
 	  for (i = FIRST_PSEUDO_REGISTER - 1; i >= 0; i--)
-	    {
-	      if (call_used_or_fixed_reg_p (i))
-		/* Reset the information about this register.  */
-		reg_mode[i] = VOIDmode;
-	    }
+	    if (reg_mode[i] != VOIDmode
+		&& reg_mode[i] != BLKmode
+		&& abi.clobbers_reg_p (reg_mode[i], i))
+	      /* Reset the information about this register.  */
+	      reg_mode[i] = VOIDmode;
 	}
     }
   return changed;

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [23/32] Remove global call sets: postreload-gcse.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (20 preceding siblings ...)
  2019-09-11 19:14 ` [21/32] Remove global call sets: LRA Richard Sandiford
@ 2019-09-11 19:14 ` Richard Sandiford
  2019-09-25 16:08   ` Richard Sandiford
  2019-09-11 19:14 ` [22/32] Remove global call sets: postreload.c Richard Sandiford
                   ` (12 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:14 UTC (permalink / raw)
  To: gcc-patches

This is another case in which we should conservatively treat
partial kills as full kills.
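A minimal sketch of why the conservative union is the right query here (toy bitmask model; GCC's real interface is function_abi::full_and_partial_reg_clobbers over HARD_REG_SETs):

```cpp
#include <cassert>
#include <cstdint>

// Toy ABI clobber sets (bit N = hard register N; the masks are invented).
struct toy_abi
{
  uint32_t full_clobbers;	// clobbered in every mode
  uint32_t partial_clobbers;	// only part of the register clobbered

  // What a mode-blind pass must assume a call kills: a pass that does
  // not track register modes cannot prove a live value fits in the
  // preserved part, so partial kills become full kills.
  uint32_t full_and_partial_reg_clobbers () const
  {
    return full_clobbers | partial_clobbers;
  }
};
```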


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* postreload-gcse.c: Include regs.h and function-abi.h.
	(record_opr_changes): Use call_insn_abi to get the ABI of the
	call insn target.  Conservatively assume that partially-clobbered
	registers are altered.

Index: gcc/postreload-gcse.c
===================================================================
--- gcc/postreload-gcse.c	2019-09-09 18:58:51.472270712 +0100
+++ gcc/postreload-gcse.c	2019-09-11 19:48:45.585690715 +0100
@@ -41,6 +41,8 @@ Software Foundation; either version 3, o
 #include "intl.h"
 #include "gcse-common.h"
 #include "gcse.h"
+#include "regs.h"
+#include "function-abi.h"
 
 /* The following code implements gcse after reload, the purpose of this
    pass is to cleanup redundant loads generated by reload and other
@@ -770,9 +772,13 @@ record_opr_changes (rtx_insn *insn)
   /* Finally, if this is a call, record all call clobbers.  */
   if (CALL_P (insn))
     {
+      function_abi abi = call_insn_abi (insn);
       unsigned int regno;
       hard_reg_set_iterator hrsi;
-      EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, regno, hrsi)
+      /* We don't track modes of hard registers, so we need to be
+	 conservative and assume that partial kills are full kills.  */
+      EXECUTE_IF_SET_IN_HARD_REG_SET (abi.full_and_partial_reg_clobbers (),
+				      0, regno, hrsi)
 	record_last_reg_set_info_regno (insn, regno);
 
       if (! RTL_CONST_OR_PURE_CALL_P (insn))

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [25/32] Remove global call sets: regcprop.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (23 preceding siblings ...)
  2019-09-11 19:15 ` [24/32] Remove global call sets: recog.c Richard Sandiford
@ 2019-09-11 19:15 ` Richard Sandiford
  2019-09-29 21:34   ` Jeff Law
  2019-09-11 19:16 ` [26/32] Remove global call sets: regrename.c Richard Sandiford
                   ` (9 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:15 UTC (permalink / raw)
  To: gcc-patches

This is a direct replacement of an existing test for fully and
partially clobbered registers.
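The combined test can be pictured with a toy mode-sensitive clobber predicate (illustrative only: the sizes and masks are invented, and the real query is function_abi::clobbers_reg_p, which also handles multi-register values):

```cpp
#include <cassert>
#include <cstdint>

// How many low bytes of a partially-clobbered register the ABI
// preserves (8 matches the AArch64 base-ABI treatment of V8-V15,
// but here it is just a toy constant).
const unsigned PRESERVED_BYTES = 8;

struct toy_abi
{
  uint32_t full_clobbers;	// bit N = register N fully clobbered
  uint32_t partial_clobbers;	// bit N = only the high part clobbered

  // Would a call clobber (reg:SIZE regno)?  SIZE is the value's size
  // in bytes; a partially-clobbered register survives narrow modes.
  bool clobbers_reg_p (unsigned size, unsigned regno) const
  {
    if ((full_clobbers >> regno) & 1)
      return true;
    return ((partial_clobbers >> regno) & 1) && size > PRESERVED_BYTES;
  }
};
```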


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* regcprop.c (copyprop_hardreg_forward_1): Use the recorded
	mode of the register when deciding whether it is no longer
	available after a call.

Index: gcc/regcprop.c
===================================================================
--- gcc/regcprop.c	2019-09-11 19:47:32.898202916 +0100
+++ gcc/regcprop.c	2019-09-11 19:48:51.961645788 +0100
@@ -1055,16 +1055,15 @@ copyprop_hardreg_forward_1 (basic_block
 
 	  function_abi abi = call_insn_abi (insn);
 	  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-	    if ((abi.clobbers_full_reg_p (regno)
-		 || (targetm.hard_regno_call_part_clobbered
-		     (abi.id (), regno, vd->e[regno].mode)))
+	    if (vd->e[regno].mode != VOIDmode
+		&& abi.clobbers_reg_p (vd->e[regno].mode, regno)
 		&& (regno < set_regno || regno >= set_regno + set_nregs))
 	      kill_value_regno (regno, 1, vd);
 
 	  /* If SET was seen in CALL_INSN_FUNCTION_USAGE, and SET_SRC
-	     of the SET isn't in regs_invalidated_by_call hard reg set,
-	     but instead among CLOBBERs on the CALL_INSN, we could wrongly
-	     assume the value in it is still live.  */
+	     of the SET isn't clobbered by ABI, but instead among
+	     CLOBBERs on the CALL_INSN, we could wrongly assume the
+	     value in it is still live.  */
 	  if (ksvd.ignore_set_reg)
 	    kill_clobbered_values (insn, vd);
 	}

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [24/32] Remove global call sets: recog.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (22 preceding siblings ...)
  2019-09-11 19:14 ` [22/32] Remove global call sets: postreload.c Richard Sandiford
@ 2019-09-11 19:15 ` Richard Sandiford
  2019-09-29 21:33   ` Jeff Law
  2019-09-11 19:15 ` [25/32] Remove global call sets: regcprop.c Richard Sandiford
                   ` (10 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:15 UTC (permalink / raw)
  To: gcc-patches

2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* recog.c: Include function-abi.h.
	(peep2_find_free_register): Use crtl->abi when deciding whether
	a register is free for use after RA.

Index: gcc/recog.c
===================================================================
--- gcc/recog.c	2019-09-10 19:56:45.357177891 +0100
+++ gcc/recog.c	2019-09-11 19:48:48.689668843 +0100
@@ -40,6 +40,7 @@ Software Foundation; either version 3, o
 #include "cfgcleanup.h"
 #include "reload.h"
 #include "tree-pass.h"
+#include "function-abi.h"
 
 #ifndef STACK_POP_CODE
 #if STACK_GROWS_DOWNWARD
@@ -3227,7 +3228,7 @@ peep2_find_free_register (int from, int
 	      break;
 	    }
 	  /* And that we don't create an extra save/restore.  */
-	  if (! call_used_or_fixed_reg_p (regno + j)
+	  if (! crtl->abi->clobbers_full_reg_p (regno + j)
 	      && ! df_regs_ever_live_p (regno + j))
 	    {
 	      success = 0;

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [26/32] Remove global call sets: regrename.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (24 preceding siblings ...)
  2019-09-11 19:15 ` [25/32] Remove global call sets: regcprop.c Richard Sandiford
@ 2019-09-11 19:16 ` Richard Sandiford
  2019-09-29 22:25   ` Jeff Law
  2019-09-11 19:16 ` [27/32] Remove global call sets: reload.c Richard Sandiford
                   ` (8 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:16 UTC (permalink / raw)
  To: gcc-patches

This patch makes regrename use a similar mask-and-clobber-set
pair to IRA when tracking whether registers are clobbered by
calls in a region.  Testing for a nonzero ABI mask is equivalent
to testing for a register that crosses a call.
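A standalone sketch of the mask-and-clobber-set pair (toy types and names apart from call_abis and call_clobber_mask; GCC's real code uses HARD_REG_SET and queries call_clobbered_in_region_p):

```cpp
#include <cassert>
#include <cstdint>

// Toy version of the du_head fields added by this patch.
struct chain_info
{
  unsigned call_abis = 0;	   // bitmask of ABI ids of crossed calls
  uint32_t call_clobber_mask = 0;  // union of those calls' clobber sets
};

// Record that the chain crosses a call with ABI id ABI_ID whose full
// and partial clobbers are CLOBBERS (bit N = hard register N).
static void note_call (chain_info &c, unsigned abi_id, uint32_t clobbers)
{
  c.call_abis |= 1u << abi_id;
  c.call_clobber_mask |= clobbers;
}

// A nonzero ABI mask is equivalent to the chain crossing a call.
static bool crosses_call_p (const chain_info &c)
{
  return c.call_abis != 0;
}

// REGNO is unusable for renaming if some crossed call clobbers it.
static bool call_clobbered_p (const chain_info &c, unsigned regno)
{
  return (c.call_clobber_mask >> regno) & 1;
}
```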

Since AArch64 and c6x use regrename.h, they need to be updated
to include function-abi.h first.  AIUI this is preferred over
including function-abi.h in regrename.h.


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* regrename.h (du_head::call_clobber_mask): New field.
	(du_head::need_caller_save_reg): Replace with...
	(du_head::call_abis): ...this new field.
	* regrename.c: Include function-abi.h.
	(call_clobbered_in_chain_p): New function.
	(check_new_reg_p): Use crtl->abi when deciding whether a register
	is free for use after RA.  Use call_clobbered_in_chain_p to test
	whether a candidate register would be clobbered by a call.
	(find_rename_reg): Don't add call-clobber conflicts here.
	(rename_chains): Check call_abis instead of need_caller_save_reg.
	(merge_chains): Update for changes to du_head.
	(build_def_use): Use call_insn_abi to get the ABI of the call insn
	target.  Record the ABI identifier in call_abis and the set of
	fully or partially clobbered registers in call_clobber_mask.
	Add fully-clobbered registers to hard_conflicts here rather
	than in find_rename_reg.
	* config/aarch64/cortex-a57-fma-steering.c: Include function-abi.h.
	(rename_single_chain): Check call_abis instead of need_caller_save_reg.
	* config/aarch64/falkor-tag-collision-avoidance.c: Include
	function-abi.h.
	* config/c6x/c6x.c: Likewise.

Index: gcc/regrename.h
===================================================================
--- gcc/regrename.h	2019-07-10 19:41:20.115948322 +0100
+++ gcc/regrename.h	2019-09-11 19:48:56.069616842 +0100
@@ -41,9 +41,12 @@ #define GCC_REGRENAME_H
   bitmap_head conflicts;
   /* Conflicts with untracked hard registers.  */
   HARD_REG_SET hard_conflicts;
+  /* Which registers are fully or partially clobbered by the calls that
+     the chain crosses.  */
+  HARD_REG_SET call_clobber_mask;
 
-  /* Nonzero if the chain crosses a call.  */
-  unsigned int need_caller_save_reg:1;
+  /* A bitmask of ABIs used by the calls that the chain crosses.  */
+  unsigned int call_abis : NUM_ABI_IDS;
   /* Nonzero if the register is used in a way that prevents renaming,
      such as the SET_DEST of a CALL_INSN or an asm operand that used
      to be a hard register.  */
Index: gcc/regrename.c
===================================================================
--- gcc/regrename.c	2019-09-11 19:47:32.898202916 +0100
+++ gcc/regrename.c	2019-09-11 19:48:56.069616842 +0100
@@ -33,6 +33,7 @@
 #include "addresses.h"
 #include "cfganal.h"
 #include "tree-pass.h"
+#include "function-abi.h"
 #include "regrename.h"
 
 /* This file implements the RTL register renaming pass of the compiler.  It is
@@ -303,6 +304,18 @@ merge_overlapping_regs (HARD_REG_SET *ps
     }
 }
 
+/* Return true if (reg:MODE REGNO) would be clobbered by a call covered
+   by THIS_HEAD.  */
+
+static bool
+call_clobbered_in_chain_p (du_head *this_head, machine_mode mode,
+			   unsigned int regno)
+{
+  return call_clobbered_in_region_p (this_head->call_abis,
+				     this_head->call_clobber_mask,
+				     mode, regno);
+}
+
 /* Check if NEW_REG can be the candidate register to rename for
    REG in THIS_HEAD chain.  THIS_UNAVAILABLE is a set of unavailable hard
    registers.  */
@@ -322,7 +335,7 @@ check_new_reg_p (int reg ATTRIBUTE_UNUSE
 	|| global_regs[new_reg + i]
 	/* Can't use regs which aren't saved by the prologue.  */
 	|| (! df_regs_ever_live_p (new_reg + i)
-	    && ! call_used_or_fixed_reg_p (new_reg + i))
+	    && ! crtl->abi->clobbers_full_reg_p (new_reg + i))
 #ifdef LEAF_REGISTERS
 	/* We can't use a non-leaf register if we're in a
 	   leaf function.  */
@@ -337,11 +350,8 @@ check_new_reg_p (int reg ATTRIBUTE_UNUSE
   for (tmp = this_head->first; tmp; tmp = tmp->next_use)
     if ((!targetm.hard_regno_mode_ok (new_reg, GET_MODE (*tmp->loc))
 	 && ! DEBUG_INSN_P (tmp->insn))
-	|| (this_head->need_caller_save_reg
-	    && ! (targetm.hard_regno_call_part_clobbered
-		  (0, reg, GET_MODE (*tmp->loc)))
-	    && (targetm.hard_regno_call_part_clobbered
-		(0, new_reg, GET_MODE (*tmp->loc)))))
+	|| call_clobbered_in_chain_p (this_head, GET_MODE (*tmp->loc),
+				      new_reg))
       return false;
 
   return true;
@@ -363,12 +373,6 @@ find_rename_reg (du_head_p this_head, en
   int pass;
   int best_new_reg = old_reg;
 
-  /* Further narrow the set of registers we can use for renaming.
-     If the chain needs a call-saved register, mark the call-used
-     registers as unavailable.  */
-  if (this_head->need_caller_save_reg)
-    *unavailable |= call_used_or_fixed_regs;
-
   /* Mark registers that overlap this chain's lifetime as unavailable.  */
   merge_overlapping_regs (unavailable, this_head);
 
@@ -499,7 +503,7 @@ rename_chains (void)
 	{
 	  fprintf (dump_file, "Register %s in insn %d",
 		   reg_names[reg], INSN_UID (this_head->first->insn));
-	  if (this_head->need_caller_save_reg)
+	  if (this_head->call_abis)
 	    fprintf (dump_file, " crosses a call");
 	}
 
@@ -680,7 +684,8 @@ merge_chains (du_head_p c1, du_head_p c2
   c1->hard_conflicts |= c2->hard_conflicts;
   bitmap_ior_into (&c1->conflicts, &c2->conflicts);
 
-  c1->need_caller_save_reg |= c2->need_caller_save_reg;
+  c1->call_clobber_mask |= c2->call_clobber_mask;
+  c1->call_abis |= c2->call_abis;
   c1->cannot_rename |= c2->cannot_rename;
 }
 
@@ -1834,9 +1839,14 @@ build_def_use (basic_block bb)
 	     requires a caller-saved reg.  */
 	  if (CALL_P (insn))
 	    {
+	      function_abi abi = call_insn_abi (insn);
 	      class du_head *p;
 	      for (p = open_chains; p; p = p->next_chain)
-		p->need_caller_save_reg = 1;
+		{
+		  p->call_abis |= (1 << abi.id ());
+		  p->call_clobber_mask |= abi.full_and_partial_reg_clobbers ();
+		  p->hard_conflicts |= abi.full_reg_clobbers ();
+		}
 	    }
 
 	  /* Step 5: Close open chains that overlap writes.  Similar to
Index: gcc/config/aarch64/cortex-a57-fma-steering.c
===================================================================
--- gcc/config/aarch64/cortex-a57-fma-steering.c	2019-09-09 18:59:26.352024447 +0100
+++ gcc/config/aarch64/cortex-a57-fma-steering.c	2019-09-11 19:48:56.065616870 +0100
@@ -37,6 +37,7 @@ #define INCLUDE_LIST
 #include "insn-attr.h"
 #include "context.h"
 #include "tree-pass.h"
+#include "function-abi.h"
 #include "regrename.h"
 #include "aarch64-protos.h"
 
@@ -281,7 +282,7 @@ rename_single_chain (du_head_p head, HAR
     {
       fprintf (dump_file, "Register %s in insn %d", reg_names[reg],
 	       INSN_UID (head->first->insn));
-      if (head->need_caller_save_reg)
+      if (head->call_abis)
 	fprintf (dump_file, " crosses a call");
     }
 
Index: gcc/config/aarch64/falkor-tag-collision-avoidance.c
===================================================================
--- gcc/config/aarch64/falkor-tag-collision-avoidance.c	2019-09-09 18:59:26.352024447 +0100
+++ gcc/config/aarch64/falkor-tag-collision-avoidance.c	2019-09-11 19:48:56.065616870 +0100
@@ -38,6 +38,7 @@ #define INCLUDE_LIST
 #include "optabs.h"
 #include "regs.h"
 #include "recog.h"
+#include "function-abi.h"
 #include "regrename.h"
 #include "print-rtl.h"
 
Index: gcc/config/c6x/c6x.c
===================================================================
--- gcc/config/c6x/c6x.c	2019-09-10 19:56:55.597105623 +0100
+++ gcc/config/c6x/c6x.c	2019-09-11 19:48:56.069616842 +0100
@@ -55,6 +55,7 @@ #define IN_TARGET_CODE 1
 #include "sel-sched.h"
 #include "debug.h"
 #include "hw-doloop.h"
+#include "function-abi.h"
 #include "regrename.h"
 #include "dumpfile.h"
 #include "builtins.h"

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [27/32] Remove global call sets: reload.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (25 preceding siblings ...)
  2019-09-11 19:16 ` [26/32] Remove global call sets: regrename.c Richard Sandiford
@ 2019-09-11 19:16 ` Richard Sandiford
  2019-09-29 22:26   ` Jeff Law
  2019-09-11 19:17 ` [00/32] Remove global call sets: rtlanal.c Richard Sandiford
                   ` (7 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:16 UTC (permalink / raw)
  To: gcc-patches

The inheritance code in find_equiv_reg can use clobbers_reg_p
to test whether a call clobbers either of the equivalent registers.

reload and find_reg use crtl->abi to test whether a register needs
to be saved in the prologue before use.

reload_as_needed can use full_and_partial_reg_clobbers and thus
avoid needing to keep its own record of which registers are part
call-clobbered.


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* reload.c: Include function-abi.h.
	(find_equiv_reg): Use clobbers_reg_p to test whether either
	of the equivalent registers is clobbered by a call.
	* reload1.c: Include function-abi.h.
	(reg_reloaded_call_part_clobbered): Delete.
	(reload): Use crtl->abi to test which registers would need
	saving in the prologue before use.
	(find_reg): Likewise.
	(emit_reload_insns): Remove code for reg_reloaded_call_part_clobbered.
	(reload_as_needed): Likewise.  Use full_and_partial_reg_clobbers
	instead of call_used_or_fixed_regs | reg_reloaded_call_part_clobbered.

Index: gcc/reload.c
===================================================================
--- gcc/reload.c	2019-09-11 19:47:32.902202887 +0100
+++ gcc/reload.c	2019-09-11 19:49:00.269587248 +0100
@@ -106,6 +106,7 @@ #define REG_OK_STRICT
 #include "reload.h"
 #include "addresses.h"
 #include "params.h"
+#include "function-abi.h"
 
 /* True if X is a constant that can be forced into the constant pool.
    MODE is the mode of the operand, or VOIDmode if not known.  */
@@ -6904,24 +6905,19 @@ find_equiv_reg (rtx goal, rtx_insn *insn
 	 if either of the two is in a call-clobbered register, or memory.  */
       if (CALL_P (p))
 	{
-	  int i;
-
 	  if (goal_mem || need_stable_sp)
 	    return 0;
 
-	  if (regno >= 0 && regno < FIRST_PSEUDO_REGISTER)
-	    for (i = 0; i < nregs; ++i)
-	      if (call_used_or_fixed_reg_p (regno + i)
-		  || targetm.hard_regno_call_part_clobbered (0, regno + i,
-							     mode))
-		return 0;
+	  function_abi abi = call_insn_abi (p);
+	  if (regno >= 0
+	      && regno < FIRST_PSEUDO_REGISTER
+	      && abi.clobbers_reg_p (mode, regno))
+	    return 0;
 
-	  if (valueno >= 0 && valueno < FIRST_PSEUDO_REGISTER)
-	    for (i = 0; i < valuenregs; ++i)
-	      if (call_used_or_fixed_reg_p (valueno + i)
-		  || targetm.hard_regno_call_part_clobbered (0, valueno + i,
-							     mode))
-		return 0;
+	  if (valueno >= 0
+	      && valueno < FIRST_PSEUDO_REGISTER
+	      && abi.clobbers_reg_p (mode, valueno))
+	    return 0;
 	}
 
       if (INSN_P (p))
Index: gcc/reload1.c
===================================================================
--- gcc/reload1.c	2019-09-11 19:47:32.902202887 +0100
+++ gcc/reload1.c	2019-09-11 19:49:00.273587220 +0100
@@ -42,6 +42,7 @@ Software Foundation; either version 3, o
 #include "except.h"
 #include "dumpfile.h"
 #include "rtl-iter.h"
+#include "function-abi.h"
 
 /* This file contains the reload pass of the compiler, which is
    run after register allocation has been done.  It checks that
@@ -120,11 +121,6 @@ #define spill_indirect_levels			\
    This is only valid if reg_reloaded_contents is set and valid.  */
 static HARD_REG_SET reg_reloaded_dead;
 
-/* Indicate whether the register's current value is one that is not
-   safe to retain across a call, even for registers that are normally
-   call-saved.  This is only meaningful for members of reg_reloaded_valid.  */
-static HARD_REG_SET reg_reloaded_call_part_clobbered;
-
 /* Number of spill-regs so far; number of valid elements of spill_regs.  */
 static int n_spills;
 
@@ -795,7 +791,7 @@ reload (rtx_insn *first, int global)
 
   if (crtl->saves_all_registers)
     for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-      if (! call_used_or_fixed_reg_p (i)
+      if (! crtl->abi->clobbers_full_reg_p (i)
 	  && ! fixed_regs[i]
 	  && ! LOCAL_REGNO (i))
 	df_set_regs_ever_live (i, true);
@@ -1908,8 +1904,8 @@ find_reg (class insn_chain *chain, int o
 		  && (inv_reg_alloc_order[regno]
 		      < inv_reg_alloc_order[best_reg])
 #else
-		  && call_used_or_fixed_reg_p (regno)
-		  && ! call_used_or_fixed_reg_p (best_reg)
+		  && crtl->abi->clobbers_full_reg_p (regno)
+		  && !crtl->abi->clobbers_full_reg_p (best_reg)
 #endif
 		  ))
 	    {
@@ -4464,7 +4460,6 @@ reload_as_needed (int live_known)
   reg_last_reload_reg = XCNEWVEC (rtx, max_regno);
   INIT_REG_SET (&reg_has_output_reload);
   CLEAR_HARD_REG_SET (reg_reloaded_valid);
-  CLEAR_HARD_REG_SET (reg_reloaded_call_part_clobbered);
 
   set_initial_elim_offsets ();
 
@@ -4786,8 +4781,8 @@ reload_as_needed (int live_known)
          be partially clobbered by the call.  */
       else if (CALL_P (insn))
 	{
-	  reg_reloaded_valid &= ~(call_used_or_fixed_regs
-				  | reg_reloaded_call_part_clobbered);
+	  reg_reloaded_valid
+	    &= ~call_insn_abi (insn).full_and_partial_reg_clobbers ();
 
 	  /* If this is a call to a setjmp-type function, we must not
 	     reuse any reload reg contents across the call; that will
@@ -8193,13 +8188,6 @@ emit_reload_insns (class insn_chain *cha
 			   : out_regno + k);
 		      reg_reloaded_insn[regno + k] = insn;
 		      SET_HARD_REG_BIT (reg_reloaded_valid, regno + k);
-		      if (targetm.hard_regno_call_part_clobbered (0, regno + k,
-								  mode))
-			SET_HARD_REG_BIT (reg_reloaded_call_part_clobbered,
-					  regno + k);
-		      else
-			CLEAR_HARD_REG_BIT (reg_reloaded_call_part_clobbered,
-					    regno + k);
 		    }
 		}
 	    }
@@ -8273,13 +8261,6 @@ emit_reload_insns (class insn_chain *cha
 			   : in_regno + k);
 		      reg_reloaded_insn[regno + k] = insn;
 		      SET_HARD_REG_BIT (reg_reloaded_valid, regno + k);
-		      if (targetm.hard_regno_call_part_clobbered (0, regno + k,
-								  mode))
-			SET_HARD_REG_BIT (reg_reloaded_call_part_clobbered,
-					  regno + k);
-		      else
-			CLEAR_HARD_REG_BIT (reg_reloaded_call_part_clobbered,
-					    regno + k);
 		    }
 		}
 	    }
@@ -8388,13 +8369,6 @@ emit_reload_insns (class insn_chain *cha
 		      reg_reloaded_insn[src_regno + k] = store_insn;
 		      CLEAR_HARD_REG_BIT (reg_reloaded_dead, src_regno + k);
 		      SET_HARD_REG_BIT (reg_reloaded_valid, src_regno + k);
-		      if (targetm.hard_regno_call_part_clobbered
-			  (0, src_regno + k, mode))
-			SET_HARD_REG_BIT (reg_reloaded_call_part_clobbered,
-					  src_regno + k);
-		      else
-			CLEAR_HARD_REG_BIT (reg_reloaded_call_part_clobbered,
-					    src_regno + k);
 		      SET_HARD_REG_BIT (reg_is_output_reload, src_regno + k);
 		      if (note)
 			SET_HARD_REG_BIT (reg_reloaded_died, src_regno);

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [29/32] Remove global call sets: sched-deps.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (27 preceding siblings ...)
  2019-09-11 19:17 ` [00/32] Remove global call sets: rtlanal.c Richard Sandiford
@ 2019-09-11 19:17 ` Richard Sandiford
  2019-09-29 22:20   ` Jeff Law
  2019-09-11 19:18 ` [31/32] Remove global call sets: shrink-wrap.c Richard Sandiford
                   ` (5 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:17 UTC (permalink / raw)
  To: gcc-patches

This is a straight replacement of an existing "full or partial"
call-clobber check.


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* sched-deps.c (deps_analyze_insn): Use the ABI of the target
	function to test whether a register is fully or partly clobbered.

Index: gcc/sched-deps.c
===================================================================
--- gcc/sched-deps.c	2019-09-11 19:47:32.902202887 +0100
+++ gcc/sched-deps.c	2019-09-11 19:49:08.517529131 +0100
@@ -3736,9 +3736,7 @@ deps_analyze_insn (class deps_desc *deps
              Since we only have a choice between 'might be clobbered'
              and 'definitely not clobbered', we must include all
              partly call-clobbered registers here.  */
-	    else if (targetm.hard_regno_call_part_clobbered (abi.id (), i,
-							     reg_raw_mode[i])
-                     || TEST_HARD_REG_BIT (regs_invalidated_by_call, i))
+	    else if (abi.clobbers_at_least_part_of_reg_p (i))
               SET_REGNO_REG_SET (reg_pending_clobbers, i);
           /* We don't know what set of fixed registers might be used
              by the function, but it is certain that the stack pointer

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [00/32] Remove global call sets: rtlanal.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (26 preceding siblings ...)
  2019-09-11 19:16 ` [27/32] Remove global call sets: reload.c Richard Sandiford
@ 2019-09-11 19:17 ` Richard Sandiford
  2019-09-29 22:21   ` Jeff Law
  2019-09-11 19:17 ` [29/32] Remove global call sets: sched-deps.c Richard Sandiford
                   ` (6 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:17 UTC (permalink / raw)
  To: gcc-patches

The reg_set_p part is simple, since the caller is asking about
a specific REG rtx, with a known register number and mode.

The find_all_hard_reg_sets part emphasises that the "implicit"
behaviour was always a bit suspect, since it includes fully-clobbered
registers but not partially-clobbered registers.  The only current
user of this path is the c6x-specific scheduler predication code,
and c6x doesn't have partly call-clobbered registers, so in practice
it's fine.  I've added a comment to try to dissuade future users.
(The !implicit path is OK and useful though.)


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* rtlanal.c: Include function-abi.h.
	(reg_set_p): Use call_insn_abi to get the ABI of the called
	function and clobbers_reg_p to test whether the register
	is call-clobbered.
	(find_all_hard_reg_sets): When implicit is true, use call_insn_abi
	to get the ABI of the called function and full_reg_clobbers to
	get the set of fully call-clobbered registers.  Warn about the
	pitfalls of using this mode.

Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c	2019-09-11 19:47:24.418262673 +0100
+++ gcc/rtlanal.c	2019-09-11 19:49:04.417558020 +0100
@@ -36,6 +36,7 @@ Software Foundation; either version 3, o
 #include "addresses.h"
 #include "rtl-iter.h"
 #include "hard-reg-set.h"
+#include "function-abi.h"
 
 /* Forward declarations */
 static void set_of_1 (rtx, const_rtx, void *);
@@ -1270,8 +1271,8 @@ reg_set_p (const_rtx reg, const_rtx insn
 	  || (CALL_P (insn)
 	      && ((REG_P (reg)
 		   && REGNO (reg) < FIRST_PSEUDO_REGISTER
-		   && overlaps_hard_reg_set_p (regs_invalidated_by_call,
-					       GET_MODE (reg), REGNO (reg)))
+		   && (call_insn_abi (as_a<const rtx_insn *> (insn))
+		       .clobbers_reg_p (GET_MODE (reg), REGNO (reg))))
 		  || MEM_P (reg)
 		  || find_reg_fusage (insn, CLOBBER, reg)))))
     return true;
@@ -1486,7 +1487,11 @@ record_hard_reg_sets (rtx x, const_rtx p
 }
 
 /* Examine INSN, and compute the set of hard registers written by it.
-   Store it in *PSET.  Should only be called after reload.  */
+   Store it in *PSET.  Should only be called after reload.
+
+   IMPLICIT is true if we should include registers that are fully-clobbered
+   by calls.  This should be used with caution, since it doesn't include
+   partially-clobbered registers.  */
 void
 find_all_hard_reg_sets (const rtx_insn *insn, HARD_REG_SET *pset, bool implicit)
 {
@@ -1495,7 +1500,7 @@ find_all_hard_reg_sets (const rtx_insn *
   CLEAR_HARD_REG_SET (*pset);
   note_stores (insn, record_hard_reg_sets, pset);
   if (CALL_P (insn) && implicit)
-    *pset |= call_used_or_fixed_regs;
+    *pset |= call_insn_abi (insn).full_reg_clobbers ();
   for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
     if (REG_NOTE_KIND (link) == REG_INC)
       record_hard_reg_sets (XEXP (link, 0), NULL, pset);

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [31/32] Remove global call sets: shrink-wrap.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (28 preceding siblings ...)
  2019-09-11 19:17 ` [29/32] Remove global call sets: sched-deps.c Richard Sandiford
@ 2019-09-11 19:18 ` Richard Sandiford
  2019-09-29 22:21   ` Jeff Law
  2019-09-11 19:18 ` [30/32] Remove global call sets: sel-sched.c Richard Sandiford
                   ` (4 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:18 UTC (permalink / raw)
  To: gcc-patches

This is a straight replacement of "registers we can clobber without
saving them first".


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* shrink-wrap.c: Include function-abi.h.
	(requires_stack_frame_p): Use crtl->abi to test whether the
	current function can use a register without saving it first.

Index: gcc/shrink-wrap.c
===================================================================
--- gcc/shrink-wrap.c	2019-09-10 19:56:32.577268091 +0100
+++ gcc/shrink-wrap.c	2019-09-11 19:49:15.133482515 +0100
@@ -43,7 +43,7 @@ Software Foundation; either version 3, o
 #include "regcprop.h"
 #include "rtl-iter.h"
 #include "valtrack.h"
-
+#include "function-abi.h"
 
 /* Return true if INSN requires the stack frame to be set up.
    PROLOGUE_USED contains the hard registers used in the function
@@ -76,7 +76,7 @@ requires_stack_frame_p (rtx_insn *insn,
     }
   if (hard_reg_set_intersect_p (hardregs, prologue_used))
     return true;
-  hardregs &= ~call_used_or_fixed_regs;
+  hardregs &= ~crtl->abi->full_reg_clobbers ();
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
     if (TEST_HARD_REG_BIT (hardregs, regno)
 	&& df_regs_ever_live_p (regno))

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [30/32] Remove global call sets: sel-sched.c
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (29 preceding siblings ...)
  2019-09-11 19:18 ` [31/32] Remove global call sets: shrink-wrap.c Richard Sandiford
@ 2019-09-11 19:18 ` Richard Sandiford
  2019-09-30 15:08   ` Jeff Law
  2019-09-11 19:19 ` [32/32] Hide regs_invalidated_by_call etc Richard Sandiford
                   ` (3 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:18 UTC (permalink / raw)
  To: gcc-patches

The main change here is to replace a crosses_call boolean with
a bitmask of the ABIs used by the crossed calls.  For space reasons,
I didn't also add a HARD_REG_SET that tracks the set of registers
that are actually clobbered, which means that this is the one part
of the series that doesn't benefit from -fipa-ra.  The existing
FIXME suggests that the current structures aren't the preferred
way of representing this anyhow, and the pass already makes
conservative assumptions about call-crossing registers.


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* sel-sched-ir.h (_def::crosses_call): Replace with...
	(_def::crossed_call_abis): ..this new field.
	(def_list_add): Take a mask of ABIs instead of a crosses_call
	boolean.
	* sel-sched-ir.c (def_list_add): Likewise.  Update initialization
	of _def accordingly.
	* sel-sched.c: Include function-abi.h.
	(hard_regs_data::regs_for_call_clobbered): Delete.
	(reg_rename::crosses_call): Replace with...
	(reg_rename::crossed_call_abis): ...this new field.
	(fur_static_params::crosses_call): Replace with...
	(fur_static_params::crossed_call_abis): ...this new field.
	(init_regs_for_mode): Don't initialize sel_hrd.regs_for_call_clobbered.
	(init_hard_regs_data): Use crtl->abi to test which registers the
	current function would need to save before it uses them.
	(mark_unavailable_hard_regs): Update handling of call-clobbered
	registers, using call_clobbers_in_region to find out which registers
	might be call-clobbered (but without taking -fipa-ra into account
	for now).  Remove separate handling of partially call-clobbered
	registers.
	(verify_target_availability): Use crossed_call_abis instead of
	crosses_call.
	(get_spec_check_type_for_insn, find_used_regs): Likewise.
	(fur_orig_expr_found, fur_on_enter, fur_orig_expr_not_found): Likewise.

Index: gcc/sel-sched-ir.h
===================================================================
--- gcc/sel-sched-ir.h	2019-07-10 19:41:26.359898316 +0100
+++ gcc/sel-sched-ir.h	2019-09-11 19:49:11.673506894 +0100
@@ -188,12 +188,12 @@ struct _def
 {
   insn_t orig_insn;
 
-  /* FIXME: Get rid of CROSSES_CALL in each def, since if we're moving up
+  /* FIXME: Get rid of CROSSED_CALL_ABIS in each def, since if we're moving up
      rhs from two different places, but only one of the code motion paths
      crosses a call, we can't use any of the call_used_regs, no matter which
-     path or whether all paths crosses a call.  Thus we should move CROSSES_CALL
-     to static params.  */
-  bool crosses_call;
+     path or whether all paths crosses a call.  Thus we should move
+     CROSSED_CALL_ABIS to static params.  */
+  unsigned int crossed_call_abis;
 };
 typedef struct _def *def_t;
 
@@ -1510,7 +1510,7 @@ extern void flist_tail_init (flist_tail_
 
 extern fence_t flist_lookup (flist_t, insn_t);
 extern void flist_clear (flist_t *);
-extern void def_list_add (def_list_t *, insn_t, bool);
+extern void def_list_add (def_list_t *, insn_t, unsigned int);
 
 /* Target context functions.  */
 extern tc_t create_target_context (bool);
Index: gcc/sel-sched-ir.c
===================================================================
--- gcc/sel-sched-ir.c	2019-09-09 19:01:40.375078244 +0100
+++ gcc/sel-sched-ir.c	2019-09-11 19:49:11.673506894 +0100
@@ -311,9 +311,10 @@ flist_clear (flist_t *lp)
     flist_remove (lp);
 }
 
-/* Add ORIGINAL_INSN the def list DL honoring CROSSES_CALL.  */
+/* Add ORIGINAL_INSN the def list DL honoring CROSSED_CALL_ABIS.  */
 void
-def_list_add (def_list_t *dl, insn_t original_insn, bool crosses_call)
+def_list_add (def_list_t *dl, insn_t original_insn,
+	      unsigned int crossed_call_abis)
 {
   def_t d;
 
@@ -321,7 +322,7 @@ def_list_add (def_list_t *dl, insn_t ori
   d = DEF_LIST_DEF (*dl);
 
   d->orig_insn = original_insn;
-  d->crosses_call = crosses_call;
+  d->crossed_call_abis = crossed_call_abis;
 }
 \f
 
Index: gcc/sel-sched.c
===================================================================
--- gcc/sel-sched.c	2019-09-11 19:47:32.902202887 +0100
+++ gcc/sel-sched.c	2019-09-11 19:49:11.673506894 +0100
@@ -46,6 +46,7 @@ Software Foundation; either version 3, o
 #include "sel-sched-dump.h"
 #include "sel-sched.h"
 #include "dbgcnt.h"
+#include "function-abi.h"
 
 /* Implementation of selective scheduling approach.
    The below implementation follows the original approach with the following
@@ -302,10 +303,6 @@ struct hard_regs_data
      that the whole set is not computed yet.  */
   HARD_REG_SET regs_for_rename[FIRST_PSEUDO_REGISTER];
 
-  /* For every mode, this stores registers not available due to
-     call clobbering.  */
-  HARD_REG_SET regs_for_call_clobbered[NUM_MACHINE_MODES];
-
   /* All registers that are used or call used.  */
   HARD_REG_SET regs_ever_used;
 
@@ -325,8 +322,8 @@ struct reg_rename
   /* These are *available* for renaming.  */
   HARD_REG_SET available_for_renaming;
 
-  /* Whether this code motion path crosses a call.  */
-  bool crosses_call;
+  /* The set of ABIs used by calls that the code motion path crosses.  */
+  unsigned int crossed_call_abis : NUM_ABI_IDS;
 };
 
 /* A global structure that contains the needed information about harg
@@ -390,8 +387,8 @@ struct fur_static_params
   /* Pointer to the list of original insns definitions.  */
   def_list_t *original_insns;
 
-  /* True if a code motion path contains a CALL insn.  */
-  bool crosses_call;
+  /* The set of ABIs used by calls that the code motion path crosses.  */
+  unsigned int crossed_call_abis : NUM_ABI_IDS;
 };
 
 typedef struct fur_static_params *fur_static_params_p;
@@ -1067,7 +1064,6 @@ init_regs_for_mode (machine_mode mode)
   int cur_reg;
 
   CLEAR_HARD_REG_SET (sel_hrd.regs_for_mode[mode]);
-  CLEAR_HARD_REG_SET (sel_hrd.regs_for_call_clobbered[mode]);
 
   for (cur_reg = 0; cur_reg < FIRST_PSEUDO_REGISTER; cur_reg++)
     {
@@ -1102,10 +1098,6 @@ init_regs_for_mode (machine_mode mode)
       if (i >= 0)
         continue;
 
-      if (targetm.hard_regno_call_part_clobbered (0, cur_reg, mode))
-        SET_HARD_REG_BIT (sel_hrd.regs_for_call_clobbered[mode],
-                          cur_reg);
-
       /* If the CUR_REG passed all the checks above,
          then it's ok.  */
       SET_HARD_REG_BIT (sel_hrd.regs_for_mode[mode], cur_reg);
@@ -1123,7 +1115,8 @@ init_hard_regs_data (void)
 
   CLEAR_HARD_REG_SET (sel_hrd.regs_ever_used);
   for (cur_reg = 0; cur_reg < FIRST_PSEUDO_REGISTER; cur_reg++)
-    if (df_regs_ever_live_p (cur_reg) || call_used_or_fixed_reg_p (cur_reg))
+    if (df_regs_ever_live_p (cur_reg)
+	|| crtl->abi->clobbers_full_reg_p (cur_reg))
       SET_HARD_REG_BIT (sel_hrd.regs_ever_used, cur_reg);
 
   /* Initialize registers that are valid based on mode when this is
@@ -1193,7 +1186,7 @@ mark_unavailable_hard_regs (def_t def, s
       SET_HARD_REG_SET (reg_rename_p->unavailable_hard_regs);
 
       /* Give a chance for original register, if it isn't in used_regs.  */
-      if (!def->crosses_call)
+      if (!def->crossed_call_abis)
         CLEAR_HARD_REG_BIT (reg_rename_p->unavailable_hard_regs, regno);
 
       return;
@@ -1224,13 +1217,20 @@ mark_unavailable_hard_regs (def_t def, s
     reg_rename_p->unavailable_hard_regs |= sel_hrd.stack_regs;
 #endif
 
-  /* If there's a call on this path, make regs from call_used_or_fixed_regs
-     unavailable.  */
-  if (def->crosses_call)
-    reg_rename_p->unavailable_hard_regs |= call_used_or_fixed_regs;
+  mode = GET_MODE (orig_dest);
+
+  /* If there's a call on this path, make regs from full_reg_clobbers
+     unavailable.
 
-  /* Stop here before reload: we need FRAME_REGS, STACK_REGS, and crosses_call,
-     but not register classes.  */
+     ??? It would be better to track the set of clobbered registers
+     directly, but that would be quite expensive in a def_t.  */
+  if (def->crossed_call_abis)
+    reg_rename_p->unavailable_hard_regs
+      |= call_clobbers_in_region (def->crossed_call_abis,
+				  reg_class_contents[ALL_REGS], mode);
+
+  /* Stop here before reload: we need FRAME_REGS, STACK_REGS, and
+     crossed_call_abis, but not register classes.  */
   if (!reload_completed)
     return;
 
@@ -1238,19 +1238,11 @@ mark_unavailable_hard_regs (def_t def, s
      register class.  */
   reg_rename_p->available_for_renaming = reg_class_contents[cl];
 
-  mode = GET_MODE (orig_dest);
-
   /* Leave only registers available for this mode.  */
   if (!sel_hrd.regs_for_mode_ok[mode])
     init_regs_for_mode (mode);
   reg_rename_p->available_for_renaming &= sel_hrd.regs_for_mode[mode];
 
-  /* Exclude registers that are partially call clobbered.  */
-  if (def->crosses_call
-      && !targetm.hard_regno_call_part_clobbered (0, regno, mode))
-    reg_rename_p->available_for_renaming
-      &= ~sel_hrd.regs_for_call_clobbered[mode];
-
   /* Leave only those that are ok to rename.  */
   EXECUTE_IF_SET_IN_HARD_REG_SET (reg_rename_p->available_for_renaming,
                                   0, cur_reg, hrsi)
@@ -1481,7 +1473,7 @@ choose_best_pseudo_reg (regset used_regs
 	      /* Don't let register cross a call if it doesn't already
 		 cross one.  This condition is written in accordance with
 		 that in sched-deps.c sched_analyze_reg().  */
-	      if (!reg_rename_p->crosses_call
+	      if (!reg_rename_p->crossed_call_abis
 		  || REG_N_CALLS_CROSSED (orig_regno) > 0)
 		return gen_rtx_REG (mode, orig_regno);
 	    }
@@ -1508,7 +1500,8 @@ choose_best_pseudo_reg (regset used_regs
 
     max_regno = max_reg_num ();
     maybe_extend_reg_info_p ();
-    REG_N_CALLS_CROSSED (REGNO (new_reg)) = reg_rename_p->crosses_call ? 1 : 0;
+    REG_N_CALLS_CROSSED (REGNO (new_reg))
+      = reg_rename_p->crossed_call_abis ? 1 : 0;
 
     return new_reg;
   }
@@ -1560,7 +1553,8 @@ verify_target_availability (expr_t expr,
        as well.  */
     gcc_assert (scheduled_something_on_previous_fence || !live_available
 		|| !hard_available
-		|| (!reload_completed && reg_rename_p->crosses_call
+		|| (!reload_completed
+		    && reg_rename_p->crossed_call_abis
 		    && REG_N_CALLS_CROSSED (regno) == 0));
 }
 
@@ -3248,7 +3242,7 @@ get_spec_check_type_for_insn (insn_t ins
    All the original operations found during the traversal are saved in the
    ORIGINAL_INSNS list.
 
-   REG_RENAME_P->CROSSES_CALL is true, if there is a call insn on the path
+   REG_RENAME_P->CROSSED_CALL_ABIS is true, if there is a call insn on the path
    from INSN to original insn. In this case CALL_USED_REG_SET will be added
    to unavailable hard regs at the point original operation is found.  */
 
@@ -3269,7 +3263,7 @@ find_used_regs (insn_t insn, av_set_t or
   bitmap_clear (code_motion_visited_blocks);
 
   /* Init parameters for code_motion_path_driver.  */
-  sparams.crosses_call = false;
+  sparams.crossed_call_abis = 0;
   sparams.original_insns = original_insns;
   sparams.used_regs = used_regs;
 
@@ -3278,7 +3272,7 @@ find_used_regs (insn_t insn, av_set_t or
 
   res = code_motion_path_driver (insn, orig_ops, NULL, &lparams, &sparams);
 
-  reg_rename_p->crosses_call |= sparams.crosses_call;
+  reg_rename_p->crossed_call_abis |= sparams.crossed_call_abis;
 
   gcc_assert (res == 1);
   gcc_assert (original_insns && *original_insns);
@@ -6006,7 +6000,7 @@ move_op_orig_expr_found (insn_t insn, ex
 
 /* The function is called when original expr is found.
    INSN - current insn traversed, EXPR - the corresponding expr found,
-   crosses_call and original_insns in STATIC_PARAMS are updated.  */
+   crossed_call_abis and original_insns in STATIC_PARAMS are updated.  */
 static void
 fur_orig_expr_found (insn_t insn, expr_t expr ATTRIBUTE_UNUSED,
                      cmpd_local_params_p lparams ATTRIBUTE_UNUSED,
@@ -6016,9 +6010,9 @@ fur_orig_expr_found (insn_t insn, expr_t
   regset tmp;
 
   if (CALL_P (insn))
-    params->crosses_call = true;
+    params->crossed_call_abis |= 1 << call_insn_abi (insn).id ();
 
-  def_list_add (params->original_insns, insn, params->crosses_call);
+  def_list_add (params->original_insns, insn, params->crossed_call_abis);
 
   /* Mark the registers that do not meet the following condition:
     (2) not among the live registers of the point
@@ -6176,10 +6170,10 @@ fur_on_enter (insn_t insn ATTRIBUTE_UNUS
 	 least one insn in ORIGINAL_INSNS.  */
       gcc_assert (*sparams->original_insns);
 
-      /* Adjust CROSSES_CALL, since we may have come to this block along
+      /* Adjust CROSSED_CALL_ABIS, since we may have come to this block along
 	 different path.  */
-      DEF_LIST_DEF (*sparams->original_insns)->crosses_call
-	  |= sparams->crosses_call;
+      DEF_LIST_DEF (*sparams->original_insns)->crossed_call_abis
+	|= sparams->crossed_call_abis;
     }
   else
     local_params->old_original_insns = *sparams->original_insns;
@@ -6233,7 +6227,7 @@ fur_orig_expr_not_found (insn_t insn, av
   fur_static_params_p sparams = (fur_static_params_p) static_params;
 
   if (CALL_P (insn))
-    sparams->crosses_call = true;
+    sparams->crossed_call_abis |= 1 << call_insn_abi (insn).id ();
   else if (DEBUG_INSN_P (insn))
     return true;
 

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [32/32] Hide regs_invalidated_by_call etc.
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (30 preceding siblings ...)
  2019-09-11 19:18 ` [30/32] Remove global call sets: sel-sched.c Richard Sandiford
@ 2019-09-11 19:19 ` Richard Sandiford
  2019-09-29 22:22   ` Jeff Law
  2019-09-12 20:42 ` [00/32] Support multiple ABIs in the same translation unit Steven Bosscher
                   ` (2 subsequent siblings)
  34 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-11 19:19 UTC (permalink / raw)
  To: gcc-patches

The previous patches removed all target-independent uses of
regs_invalidated_by_call, call_used_or_fixed_regs and
call_used_or_fixed_reg_p.  This patch therefore restricts
them to target-specific code (and reginfo.c, which sets them up).


2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* hard-reg-set.h (regs_invalidated_by_call): Only define if
	IN_TARGET_CODE.
	(call_used_or_fixed_regs): Likewise.
	(call_used_or_fixed_reg_p): Likewise.
	* reginfo.c (regs_invalidated_by_call): New macro.

Index: gcc/hard-reg-set.h
===================================================================
--- gcc/hard-reg-set.h	2019-09-10 19:57:04.713041281 +0100
+++ gcc/hard-reg-set.h	2019-09-11 19:49:18.149461264 +0100
@@ -477,10 +477,12 @@ #define call_used_regs \
 #endif
 #define savable_regs \
   (this_target_hard_regs->x_savable_regs)
+#ifdef IN_TARGET_CODE
 #define regs_invalidated_by_call \
   (this_target_hard_regs->x_regs_invalidated_by_call)
 #define call_used_or_fixed_regs \
   (regs_invalidated_by_call | fixed_reg_set)
+#endif
 #define reg_alloc_order \
   (this_target_hard_regs->x_reg_alloc_order)
 #define inv_reg_alloc_order \
@@ -509,6 +511,7 @@ #define reg_names \
 #define REG_CAN_CHANGE_MODE_P(REGN, FROM, TO)                          \
   (targetm.can_change_mode_class (FROM, TO, REGNO_REG_CLASS (REGN)))
 
+#ifdef IN_TARGET_CODE
 /* Return true if register REGNO is either fixed or call-used
    (aka call-clobbered).  */
 
@@ -517,5 +520,6 @@ call_used_or_fixed_reg_p (unsigned int r
 {
   return fixed_regs[regno] || this_target_hard_regs->x_call_used_regs[regno];
 }
+#endif
 
 #endif /* ! GCC_HARD_REG_SET_H */
Index: gcc/reginfo.c
===================================================================
--- gcc/reginfo.c	2019-09-11 19:47:39.474156575 +0100
+++ gcc/reginfo.c	2019-09-11 19:49:18.149461264 +0100
@@ -69,6 +69,8 @@ struct target_regs *this_target_regs = &
 
 #define call_used_regs \
   (this_target_hard_regs->x_call_used_regs)
+#define regs_invalidated_by_call \
+  (this_target_hard_regs->x_regs_invalidated_by_call)
 
 /* Data for initializing fixed_regs.  */
 static const char initial_fixed_regs[] = FIXED_REGISTERS;


* Re: [10/32] Remove global call sets: combine.c
  2019-09-11 19:08 ` [10/32] Remove global call sets: combine.c Richard Sandiford
@ 2019-09-12  2:18   ` Segher Boessenkool
  2019-09-12  7:52     ` Richard Sandiford
  0 siblings, 1 reply; 100+ messages in thread
From: Segher Boessenkool @ 2019-09-12  2:18 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On Wed, Sep 11, 2019 at 08:08:38PM +0100, Richard Sandiford wrote:
>        hard_reg_set_iterator hrsi;
> -      EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, i, hrsi)
> +      EXECUTE_IF_SET_IN_HARD_REG_SET (abi.full_and_partial_reg_clobbers (),
> +				      0, i, hrsi)

So "abi" in that means calls?  It is not such a great name like that.
Since its children are very_long_names, it doesn't need to be only three
chars itself, either?


Segher

* Re: [10/32] Remove global call sets: combine.c
  2019-09-12  2:18   ` Segher Boessenkool
@ 2019-09-12  7:52     ` Richard Sandiford
  2019-09-20  0:43       ` Segher Boessenkool
  0 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-12  7:52 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches

Segher Boessenkool <segher@kernel.crashing.org> writes:
> On Wed, Sep 11, 2019 at 08:08:38PM +0100, Richard Sandiford wrote:
>>        hard_reg_set_iterator hrsi;
>> -      EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, i, hrsi)
>> +      EXECUTE_IF_SET_IN_HARD_REG_SET (abi.full_and_partial_reg_clobbers (),
>> +				      0, i, hrsi)
>
> So "abi" in that means calls?

"abi" is the interface of the callee function, taking things like
function attributes and -fipa-ra into account.

The register sets are describing what the callee does rather than
what calls to it do.  E.g. on targets that allow linker stubs to be
inserted between calls, the scratch registers reserved for linker stubs
are still call-clobbered, even if the target of the call doesn't use
them.  (Those call clobbers are represented separately, at least when
TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS is true.  When it's
false we don't use -fipa-ra information at all.)

> It is not such a great name like that.  Since its children are
> very_long_names, it doesn't need to be only three chars itself,
> either?

OK, what name would you prefer?

Richard

* Re: [00/32] Support multiple ABIs in the same translation unit
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (31 preceding siblings ...)
  2019-09-11 19:19 ` [32/32] Hide regs_invalidated_by_call etc Richard Sandiford
@ 2019-09-12 20:42 ` Steven Bosscher
  2019-09-26 19:24 ` Dimitar Dimitrov
  2019-10-01  2:09 ` build-failure for cris-elf with "[00/32] Support multiple ABIs in the same translation unit" Hans-Peter Nilsson
  34 siblings, 0 replies; 100+ messages in thread
From: Steven Bosscher @ 2019-09-12 20:42 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On Wednesday, September 11, 2019, Richard Sandiford <richard.sandiford@arm.com> wrote:
>
>
> Sorry for the long write-up.
>
> Richard
>

*thanks* for the long write-up!

Ciao!
Steven

* Re: [10/32] Remove global call sets: combine.c
  2019-09-12  7:52     ` Richard Sandiford
@ 2019-09-20  0:43       ` Segher Boessenkool
  2019-09-25 15:52         ` Richard Sandiford
  0 siblings, 1 reply; 100+ messages in thread
From: Segher Boessenkool @ 2019-09-20  0:43 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

Hi Richard,

Sorry it took me so long to get back to this.

On Thu, Sep 12, 2019 at 08:51:59AM +0100, Richard Sandiford wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
> > On Wed, Sep 11, 2019 at 08:08:38PM +0100, Richard Sandiford wrote:
> >>        hard_reg_set_iterator hrsi;
> >> -      EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, i, hrsi)
> >> +      EXECUTE_IF_SET_IN_HARD_REG_SET (abi.full_and_partial_reg_clobbers (),
> >> +				      0, i, hrsi)
> >
> > So "abi" in that means calls?
> 
> "abi" is the interface of the callee function, taking things like
> function attributes and -fipa-ra into account.
> 
> The register sets are describing what the callee does rather than
> what calls to it do.  E.g. on targets that allow linker stubs to be
> inserted between calls, the scratch registers reserved for linker stubs
> are still call-clobbered, even if the target of the call doesn't use
> them.  (Those call clobbers are represented separately, at least when
> TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS is true.  When it's
> false we don't use -fipa-ra information at all.)
> 
> > It is not such a great name like that.  Since its children are
> > very_long_names, it doesn't need to be only three chars itself,
> > either?
> 
> OK, what name would you prefer?

Maybe call_abi is a good name?  It's difficult to capture the subtleties
in a short enough name.  As always :-)


Segher

* Re: [03/32] Add a function for getting the ABI of a call insn target
  2019-09-11 19:04 ` [03/32] Add a function for getting the ABI of a call insn target Richard Sandiford
@ 2019-09-25 15:38   ` Richard Sandiford
  2019-09-30 15:52     ` Jeff Law
  0 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-25 15:38 UTC (permalink / raw)
  To: gcc-patches

Richard Sandiford <richard.sandiford@arm.com> writes:
> This patch replaces get_call_reg_set_usage with call_insn_abi,
> which returns the ABI of the target of a call insn.  The ABI's
> full_reg_clobbers corresponds to regs_invalidated_by_call,
> whereas many callers instead passed call_used_or_fixed_regs, i.e.:
>
>   (regs_invalidated_by_call | fixed_reg_set)
>
> The patch slavishly preserves the "| fixed_reg_set" for these callers;
> later patches will clean this up.

On reflection, I think insn_callee_abi would be a better name for the
function than call_insn_abi, since it should make it clearer that the
function returns the ABI of the target function.  In future we could
have expr_callee_abi for CALL_EXPRs.

Also, after Segher's comments for 10/32, I've used "callee_abi" as
the name of temporary variables, instead of just "abi".

I've made the same change for later patches (except where I've posted
new versions instead), but it didn't seem worth spamming the lists
with that.

Tested as before.

Richard

PS. Ping for the series :-)


2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* target.def (insn_callee_abi): New hook.
	(remove_extra_call_preserved_regs): Delete.
	* doc/tm.texi.in (TARGET_INSN_CALLEE_ABI): New macro.
	(TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete.
	* doc/tm.texi: Regenerate.
	* targhooks.h (default_remove_extra_call_preserved_regs): Delete.
	* targhooks.c (default_remove_extra_call_preserved_regs): Delete.
	* config/aarch64/aarch64.c (aarch64_simd_call_p): Constify the
	insn argument.
	(aarch64_remove_extra_call_preserved_regs): Delete.
	(aarch64_insn_callee_abi): New function.
	(TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete.
	(TARGET_INSN_CALLEE_ABI): New macro.
	* rtl.h (get_call_fndecl): Declare.
	(cgraph_rtl_info): Fix formatting.  Tweak comment for
	function_used_regs.  Remove function_used_regs_valid.
	* rtlanal.c (get_call_fndecl): Moved from final.c
	* function-abi.h (insn_callee_abi): Declare.
	(target_function_abi_info): Mention insn_callee_abi.
	* function-abi.cc (fndecl_abi): Handle flag_ipa_ra in a similar
	way to get_call_reg_set_usage did.
	(insn_callee_abi): New function.
	* regs.h (get_call_reg_set_usage): Delete.
	* final.c: Include function-abi.h.
	(collect_fn_hard_reg_usage): Add fixed and stack registers to
	function_used_regs before the main loop rather than afterwards.
	Use insn_callee_abi instead of get_call_reg_set_usage.  Exit early
	if function_used_regs ends up not being useful.
	(get_call_fndecl): Move to rtlanal.c
	(get_call_cgraph_rtl_info, get_call_reg_set_usage): Delete.
	* caller-save.c: Include function-abi.h.
	(setup_save_areas, save_call_clobbered_regs): Use insn_callee_abi
	instead of get_call_reg_set_usage.
	* cfgcleanup.c: Include function-abi.h.
	(old_insns_match_p): Use insn_callee_abi instead of
	get_call_reg_set_usage.
	* cgraph.h (cgraph_node::rtl_info): Take a const_tree instead of
	a tree.
	* cgraph.c (cgraph_node::rtl_info): Likewise.  Initialize
	function_used_regs.
	* df-scan.c: Include function-abi.h.
	(df_get_call_refs): Use insn_callee_abi instead of
	get_call_reg_set_usage.
	* ira-lives.c: Include function-abi.h.
	(process_bb_node_lives): Use insn_callee_abi instead of
	get_call_reg_set_usage.
	* lra-lives.c: Include function-abi.h.
	(process_bb_lives): Use insn_callee_abi instead of
	get_call_reg_set_usage.
	* postreload.c: Include function-abi.h.
	(reload_combine): Use insn_callee_abi instead of
	get_call_reg_set_usage.
	* regcprop.c: Include function-abi.h.
	(copyprop_hardreg_forward_1): Use insn_callee_abi instead of
	get_call_reg_set_usage.
	* resource.c: Include function-abi.h.
	(mark_set_resources, mark_target_live_regs): Use insn_callee_abi
	instead of get_call_reg_set_usage.
	* var-tracking.c: Include function-abi.h.
	(dataflow_set_clear_at_call): Use insn_callee_abi instead of
	get_call_reg_set_usage.

Index: gcc/target.def
===================================================================
--- gcc/target.def	2019-09-25 16:23:04.000000000 +0100
+++ gcc/target.def	2019-09-25 16:23:05.092580444 +0100
@@ -4952,6 +4952,19 @@ interoperability between several ABIs in
  const predefined_function_abi &, (const_tree type),
  NULL)
 
+DEFHOOK
+(insn_callee_abi,
+ "This hook returns a description of the ABI used by the target of\n\
+call instruction @var{insn}; see the definition of\n\
+@code{predefined_function_abi} for details of the ABI descriptor.\n\
+Only the global function @code{insn_callee_abi} should call this hook\n\
+directly.\n\
+\n\
+Targets only need to define this hook if they support\n\
+interoperability between several ABIs in the same translation unit.",
+ const predefined_function_abi &, (const rtx_insn *insn),
+ NULL)
+
 /* ??? Documenting this hook requires a GFDL license grant.  */
 DEFHOOK_UNDOC
 (internal_arg_pointer,
@@ -5834,20 +5847,6 @@ DEFHOOK
  const char *, (void),
  hook_constcharptr_void_null)
 
-DEFHOOK
-(remove_extra_call_preserved_regs,
- "This hook removes registers from the set of call-clobbered registers\n\
- in @var{used_regs} if, contrary to the default rules, something guarantees\n\
- that @samp{insn} preserves those registers.  For example, some targets\n\
- support variant ABIs in which functions preserve more registers than\n\
- normal functions would.  Removing those extra registers from @var{used_regs}\n\
- can lead to better register allocation.\n\
- \n\
- The default implementation does nothing, which is always safe.\n\
- Defining the hook is purely an optimization.",
- void, (rtx_insn *insn, HARD_REG_SET *used_regs),
- default_remove_extra_call_preserved_regs)
-
 /* Return the smallest number of different values for which it is best to
    use a jump-table instead of a tree of conditional branches.  */
 DEFHOOK
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in	2019-09-25 16:23:04.000000000 +0100
+++ gcc/doc/tm.texi.in	2019-09-25 16:23:05.088580476 +0100
@@ -1711,13 +1711,13 @@ must be defined.  Modern ports should de
 @cindex call-saved register
 @hook TARGET_FNTYPE_ABI
 
+@hook TARGET_INSN_CALLEE_ABI
+
 @cindex call-used register
 @cindex call-clobbered register
 @cindex call-saved register
 @hook TARGET_HARD_REGNO_CALL_PART_CLOBBERED
 
-@hook TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS
-
 @hook TARGET_RETURN_CALL_WITH_MAX_CLOBBERS
 
 @hook TARGET_GET_MULTILIB_ABI_NAME
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	2019-09-25 16:23:04.000000000 +0100
+++ gcc/doc/tm.texi	2019-09-25 16:23:05.084580502 +0100
@@ -1905,6 +1905,17 @@ descriptor.  Targets only need to define
 interoperability between several ABIs in the same translation unit.
 @end deftypefn
 
+@deftypefn {Target Hook} {const predefined_function_abi &} TARGET_INSN_CALLEE_ABI (const rtx_insn *@var{insn})
+This hook returns a description of the ABI used by the target of
+call instruction @var{insn}; see the definition of
+@code{predefined_function_abi} for details of the ABI descriptor.
+Only the global function @code{insn_callee_abi} should call this hook
+directly.
+
+Targets only need to define this hook if they support
+interoperability between several ABIs in the same translation unit.
+@end deftypefn
+
 @cindex call-used register
 @cindex call-clobbered register
 @cindex call-saved register
@@ -1921,18 +1932,6 @@ The default implementation returns false
 for targets that don't have partly call-clobbered registers.
 @end deftypefn
 
-@deftypefn {Target Hook} void TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS (rtx_insn *@var{insn}, HARD_REG_SET *@var{used_regs})
-This hook removes registers from the set of call-clobbered registers
- in @var{used_regs} if, contrary to the default rules, something guarantees
- that @samp{insn} preserves those registers.  For example, some targets
- support variant ABIs in which functions preserve more registers than
- normal functions would.  Removing those extra registers from @var{used_regs}
- can lead to better register allocation.
- 
- The default implementation does nothing, which is always safe.
- Defining the hook is purely an optimization.
-@end deftypefn
-
 @deftypefn {Target Hook} {rtx_insn *} TARGET_RETURN_CALL_WITH_MAX_CLOBBERS (rtx_insn *@var{call_1}, rtx_insn *@var{call_2})
 This hook returns a pointer to the call that partially clobbers the
 most registers.  If a platform supports multiple ABIs where the registers
Index: gcc/targhooks.h
===================================================================
--- gcc/targhooks.h	2019-09-25 16:23:04.000000000 +0100
+++ gcc/targhooks.h	2019-09-25 16:23:05.092580444 +0100
@@ -281,7 +281,5 @@ extern tree default_preferred_else_value
 extern bool default_have_speculation_safe_value (bool);
 extern bool speculation_safe_value_not_needed (bool);
 extern rtx default_speculation_safe_value (machine_mode, rtx, rtx, rtx);
-extern void default_remove_extra_call_preserved_regs (rtx_insn *,
-						      HARD_REG_SET *);
 
 #endif /* GCC_TARGHOOKS_H */
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c	2019-09-25 16:23:04.000000000 +0100
+++ gcc/targhooks.c	2019-09-25 16:23:05.092580444 +0100
@@ -2363,9 +2363,4 @@ default_speculation_safe_value (machine_
   return result;
 }
 
-void
-default_remove_extra_call_preserved_regs (rtx_insn *, HARD_REG_SET *)
-{
-}
-
 #include "gt-targhooks.h"
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	2019-09-25 16:23:04.000000000 +0100
+++ gcc/config/aarch64/aarch64.c	2019-09-25 16:23:05.080580530 +0100
@@ -1877,7 +1877,7 @@ aarch64_reg_save_mode (tree fndecl, unsi
    the function.  */
 
 static bool
-aarch64_simd_call_p (rtx_insn *insn)
+aarch64_simd_call_p (const rtx_insn *insn)
 {
   rtx symbol;
   rtx call;
@@ -1895,20 +1895,14 @@ aarch64_simd_call_p (rtx_insn *insn)
   return aarch64_simd_decl_p (fndecl);
 }
 
-/* Implement TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS.  If INSN calls
-   a function that uses the SIMD ABI, take advantage of the extra
-   call-preserved registers that the ABI provides.  */
+/* Implement TARGET_INSN_CALLEE_ABI.  */
 
-void
-aarch64_remove_extra_call_preserved_regs (rtx_insn *insn,
-					  HARD_REG_SET *return_set)
+const predefined_function_abi &
+aarch64_insn_callee_abi (const rtx_insn *insn)
 {
   if (aarch64_simd_call_p (insn))
-    {
-      for (int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-	if (FP_SIMD_SAVED_REGNUM_P (regno))
-	  CLEAR_HARD_REG_BIT (*return_set, regno);
-    }
+    return aarch64_simd_abi ();
+  return default_function_abi;
 }
 
 /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  The callee only saves
@@ -20924,9 +20918,8 @@ #define TARGET_MODES_TIEABLE_P aarch64_m
 #define TARGET_HARD_REGNO_CALL_PART_CLOBBERED \
   aarch64_hard_regno_call_part_clobbered
 
-#undef TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS
-#define TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS \
-  aarch64_remove_extra_call_preserved_regs
+#undef TARGET_INSN_CALLEE_ABI
+#define TARGET_INSN_CALLEE_ABI aarch64_insn_callee_abi
 
 #undef TARGET_RETURN_CALL_WITH_MAX_CLOBBERS
 #define TARGET_RETURN_CALL_WITH_MAX_CLOBBERS \
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h	2019-09-25 16:23:04.000000000 +0100
+++ gcc/rtl.h	2019-09-25 16:23:05.092580444 +0100
@@ -3438,6 +3438,7 @@ extern int rtx_unstable_p (const_rtx);
 extern bool rtx_varies_p (const_rtx, bool);
 extern bool rtx_addr_varies_p (const_rtx, bool);
 extern rtx get_call_rtx_from (const rtx_insn *);
+extern tree get_call_fndecl (const rtx_insn *);
 extern HOST_WIDE_INT get_integer_term (const_rtx);
 extern rtx get_related_value (const_rtx);
 extern bool offset_within_block_p (const_rtx, HOST_WIDE_INT);
@@ -4389,14 +4390,11 @@ extern tree GTY(()) global_regs_decl[FIR
    Available only for functions that has been already assembled.  */
 
 struct GTY(()) cgraph_rtl_info {
-   unsigned int preferred_incoming_stack_boundary;
+  unsigned int preferred_incoming_stack_boundary;
 
-  /* Call unsaved hard registers really used by the corresponding
-     function (including ones used by functions called by the
-     function).  */
+  /* Which registers the function clobbers, either directly or by
+     calling another function.  */
   HARD_REG_SET function_used_regs;
-  /* Set if function_used_regs is valid.  */
-  unsigned function_used_regs_valid: 1;
 };
 
 /* If loads from memories of mode MODE always sign or zero extend,
Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c	2019-09-25 16:23:04.000000000 +0100
+++ gcc/rtlanal.c	2019-09-25 16:23:05.092580444 +0100
@@ -822,6 +822,24 @@ get_call_rtx_from (const rtx_insn *insn)
     return x;
   return NULL_RTX;
 }
+
+/* Get the declaration of the function called by INSN.  */
+
+tree
+get_call_fndecl (const rtx_insn *insn)
+{
+  rtx note, datum;
+
+  note = find_reg_note (insn, REG_CALL_DECL, NULL_RTX);
+  if (note == NULL_RTX)
+    return NULL_TREE;
+
+  datum = XEXP (note, 0);
+  if (datum != NULL_RTX)
+    return SYMBOL_REF_DECL (datum);
+
+  return NULL_TREE;
+}
 \f
 /* Return the value of the integer term in X, if one is apparent;
    otherwise return 0.
Index: gcc/function-abi.h
===================================================================
--- gcc/function-abi.h	2019-09-25 16:23:04.000000000 +0100
+++ gcc/function-abi.h	2019-09-25 16:23:05.088580476 +0100
@@ -224,6 +224,8 @@ struct target_function_abi_info
      * crtl->abi is the ABI of the function that we are currently
        compiling to rtl.
 
+     * insn_callee_abi (INSN) is the ABI used by the target of call insn INSN.
+
      * eh_edge_abi is the "ABI" used when taking an EH edge from an
        exception-throwing statement to an exception handler.  Catching
        exceptions from calls can be treated as an abnormal return from
@@ -265,5 +267,6 @@ #define eh_edge_abi default_function_abi
 
 extern const predefined_function_abi &fntype_abi (const_tree);
 extern function_abi fndecl_abi (const_tree);
+extern function_abi insn_callee_abi (const rtx_insn *);
 
 #endif
Index: gcc/function-abi.cc
===================================================================
--- gcc/function-abi.cc	2019-09-25 16:23:04.000000000 +0100
+++ gcc/function-abi.cc	2019-09-25 16:23:05.088580476 +0100
@@ -143,5 +143,28 @@ function_abi
 fndecl_abi (const_tree fndecl)
 {
   gcc_assert (TREE_CODE (fndecl) == FUNCTION_DECL);
-  return fntype_abi (TREE_TYPE (fndecl));
+  const predefined_function_abi &base_abi = fntype_abi (TREE_TYPE (fndecl));
+
+  if (flag_ipa_ra && decl_binds_to_current_def_p (fndecl))
+    if (cgraph_rtl_info *info = cgraph_node::rtl_info (fndecl))
+      return function_abi (base_abi, info->function_used_regs);
+
+  return base_abi;
+}
+
+/* Return the ABI of the function called by INSN.  */
+
+function_abi
+insn_callee_abi (const rtx_insn *insn)
+{
+  gcc_assert (insn && CALL_P (insn));
+
+  if (flag_ipa_ra)
+    if (tree fndecl = get_call_fndecl (insn))
+      return fndecl_abi (fndecl);
+
+  if (targetm.calls.insn_callee_abi)
+    return targetm.calls.insn_callee_abi (insn);
+
+  return default_function_abi;
 }
Index: gcc/regs.h
===================================================================
--- gcc/regs.h	2019-09-25 16:23:04.000000000 +0100
+++ gcc/regs.h	2019-09-25 16:23:05.088580476 +0100
@@ -383,8 +383,4 @@ range_in_hard_reg_set_p (const_hard_reg_
   return true;
 }
 
-/* Get registers used by given function call instruction.  */
-extern bool get_call_reg_set_usage (rtx_insn *insn, HARD_REG_SET *reg_set,
-				    HARD_REG_SET default_set);
-
 #endif /* GCC_REGS_H */
Index: gcc/final.c
===================================================================
--- gcc/final.c	2019-09-25 16:23:04.000000000 +0100
+++ gcc/final.c	2019-09-25 16:23:05.088580476 +0100
@@ -81,6 +81,7 @@ #define INCLUDE_ALGORITHM /* reverse */
 #include "asan.h"
 #include "rtl-iter.h"
 #include "print-rtl.h"
+#include "function-abi.h"
 
 #ifdef XCOFF_DEBUGGING_INFO
 #include "xcoffout.h"		/* Needed for external data declarations.  */
@@ -230,7 +231,6 @@ static int alter_cond (rtx);
 #endif
 static int align_fuzz (rtx, rtx, int, unsigned);
 static void collect_fn_hard_reg_usage (void);
-static tree get_call_fndecl (rtx_insn *);
 \f
 /* Initialize data in final at the beginning of a compilation.  */
 
@@ -4994,7 +4994,16 @@ collect_fn_hard_reg_usage (void)
   if (!targetm.call_fusage_contains_non_callee_clobbers)
     return;
 
-  CLEAR_HARD_REG_SET (function_used_regs);
+  /* Be conservative - mark fixed and global registers as used.  */
+  function_used_regs = fixed_reg_set;
+
+#ifdef STACK_REGS
+  /* Handle STACK_REGS conservatively, since the df-framework does not
+     provide accurate information for them.  */
+
+  for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
+    SET_HARD_REG_BIT (function_used_regs, i);
+#endif
 
   for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
     {
@@ -5005,96 +5014,23 @@ collect_fn_hard_reg_usage (void)
 
       if (CALL_P (insn)
 	  && !self_recursive_call_p (insn))
-	{
-	  if (!get_call_reg_set_usage (insn, &insn_used_regs,
-				       call_used_or_fixed_regs))
-	    return;
-
-	  function_used_regs |= insn_used_regs;
-	}
+	function_used_regs
+	  |= insn_callee_abi (insn).full_and_partial_reg_clobbers ();
 
       find_all_hard_reg_sets (insn, &insn_used_regs, false);
       function_used_regs |= insn_used_regs;
-    }
 
-  /* Be conservative - mark fixed and global registers as used.  */
-  function_used_regs |= fixed_reg_set;
-
-#ifdef STACK_REGS
-  /* Handle STACK_REGS conservatively, since the df-framework does not
-     provide accurate information for them.  */
-
-  for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
-    SET_HARD_REG_BIT (function_used_regs, i);
-#endif
+      if (hard_reg_set_subset_p (crtl->abi->full_and_partial_reg_clobbers (),
+				 function_used_regs))
+	return;
+    }
 
-  /* The information we have gathered is only interesting if it exposes a
-     register from the call_used_regs that is not used in this function.  */
-  if (hard_reg_set_subset_p (call_used_or_fixed_regs, function_used_regs))
-    return;
+  /* Mask out fully-saved registers, so that they don't affect equality
+     comparisons between function_abis.  */
+  function_used_regs &= crtl->abi->full_and_partial_reg_clobbers ();
 
   node = cgraph_node::rtl_info (current_function_decl);
   gcc_assert (node != NULL);
 
   node->function_used_regs = function_used_regs;
-  node->function_used_regs_valid = 1;
-}
-
-/* Get the declaration of the function called by INSN.  */
-
-static tree
-get_call_fndecl (rtx_insn *insn)
-{
-  rtx note, datum;
-
-  note = find_reg_note (insn, REG_CALL_DECL, NULL_RTX);
-  if (note == NULL_RTX)
-    return NULL_TREE;
-
-  datum = XEXP (note, 0);
-  if (datum != NULL_RTX)
-    return SYMBOL_REF_DECL (datum);
-
-  return NULL_TREE;
-}
-
-/* Return the cgraph_rtl_info of the function called by INSN.  Returns NULL for
-   call targets that can be overwritten.  */
-
-static struct cgraph_rtl_info *
-get_call_cgraph_rtl_info (rtx_insn *insn)
-{
-  tree fndecl;
-
-  if (insn == NULL_RTX)
-    return NULL;
-
-  fndecl = get_call_fndecl (insn);
-  if (fndecl == NULL_TREE
-      || !decl_binds_to_current_def_p (fndecl))
-    return NULL;
-
-  return cgraph_node::rtl_info (fndecl);
-}
-
-/* Find hard registers used by function call instruction INSN, and return them
-   in REG_SET.  Return DEFAULT_SET in REG_SET if not found.  */
-
-bool
-get_call_reg_set_usage (rtx_insn *insn, HARD_REG_SET *reg_set,
-			HARD_REG_SET default_set)
-{
-  if (flag_ipa_ra)
-    {
-      struct cgraph_rtl_info *node = get_call_cgraph_rtl_info (insn);
-      if (node != NULL
-	  && node->function_used_regs_valid)
-	{
-	  *reg_set = node->function_used_regs & default_set;
-	  return true;
-	}
-    }
-  *reg_set = default_set;
-  targetm.remove_extra_call_preserved_regs (insn, reg_set);
-  return false;
 }
Index: gcc/caller-save.c
===================================================================
--- gcc/caller-save.c	2019-09-25 16:23:04.000000000 +0100
+++ gcc/caller-save.c	2019-09-25 16:23:05.072580590 +0100
@@ -37,6 +37,7 @@ Software Foundation; either version 3, o
 #include "dumpfile.h"
 #include "rtl-iter.h"
 #include "target.h"
+#include "function-abi.h"
 
 #define MOVE_MAX_WORDS (MOVE_MAX / UNITS_PER_WORD)
 
@@ -426,7 +427,9 @@ setup_save_areas (void)
       freq = REG_FREQ_FROM_BB (BLOCK_FOR_INSN (insn));
       REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 			       &chain->live_throughout);
-      get_call_reg_set_usage (insn, &used_regs, call_used_or_fixed_regs);
+      used_regs = insn_callee_abi (insn).full_reg_clobbers ();
+      /* ??? This preserves traditional behavior; it might not be needed.  */
+      used_regs |= fixed_reg_set;
 
       /* Record all registers set in this call insn.  These don't
 	 need to be saved.  N.B. the call insn might set a subreg
@@ -509,7 +512,10 @@ setup_save_areas (void)
 
 	  REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 				   &chain->live_throughout);
-	  get_call_reg_set_usage (insn, &used_regs, call_used_or_fixed_regs);
+	  used_regs = insn_callee_abi (insn).full_reg_clobbers ();
+	  /* ??? This preserves traditional behavior; it might not
+	     be needed.  */
+	  used_regs |= fixed_reg_set;
 
 	  /* Record all registers set in this call insn.  These don't
 	     need to be saved.  N.B. the call insn might set a subreg
@@ -838,8 +844,10 @@ save_call_clobbered_regs (void)
 				     | this_insn_sets
 				     | hard_regs_saved);
 	      hard_regs_to_save &= savable_regs;
-	      get_call_reg_set_usage (insn, &call_def_reg_set,
-				      call_used_or_fixed_regs);
+	      call_def_reg_set = insn_callee_abi (insn).full_reg_clobbers ();
+	      /* ??? This preserves traditional behavior; it might not
+		 be needed.  */
+	      call_def_reg_set |= fixed_reg_set;
 	      hard_regs_to_save &= call_def_reg_set;
 
 	      for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
Index: gcc/cfgcleanup.c
===================================================================
--- gcc/cfgcleanup.c	2019-09-25 16:23:04.000000000 +0100
+++ gcc/cfgcleanup.c	2019-09-25 16:23:05.076580558 +0100
@@ -54,6 +54,7 @@ Software Foundation; either version 3, o
 #include "dbgcnt.h"
 #include "rtl-iter.h"
 #include "regs.h"
+#include "function-abi.h"
 
 #define FORWARDER_BLOCK_P(BB) ((BB)->flags & BB_FORWARDER_BLOCK)
 
@@ -1226,10 +1227,11 @@ old_insns_match_p (int mode ATTRIBUTE_UN
 	    }
 	}
 
-      HARD_REG_SET i1_used, i2_used;
-
-      get_call_reg_set_usage (i1, &i1_used, call_used_or_fixed_regs);
-      get_call_reg_set_usage (i2, &i2_used, call_used_or_fixed_regs);
+      HARD_REG_SET i1_used = insn_callee_abi (i1).full_reg_clobbers ();
+      HARD_REG_SET i2_used = insn_callee_abi (i2).full_reg_clobbers ();
+      /* ??? This preserves traditional behavior; it might not be needed.  */
+      i1_used |= fixed_reg_set;
+      i2_used |= fixed_reg_set;
 
       if (i1_used != i2_used)
         return dir_none;
Index: gcc/cgraph.h
===================================================================
--- gcc/cgraph.h	2019-09-25 16:23:04.000000000 +0100
+++ gcc/cgraph.h	2019-09-25 16:23:05.076580558 +0100
@@ -1379,7 +1379,7 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cg
   static cgraph_local_info *local_info (tree decl);
 
   /* Return local info for the compiled function.  */
-  static struct cgraph_rtl_info *rtl_info (tree);
+  static struct cgraph_rtl_info *rtl_info (const_tree);
 
   /* Return the cgraph node that has ASMNAME for its DECL_ASSEMBLER_NAME.
      Return NULL if there's no such node.  */
Index: gcc/cgraph.c
===================================================================
--- gcc/cgraph.c	2019-09-25 16:23:04.000000000 +0100
+++ gcc/cgraph.c	2019-09-25 16:23:05.076580558 +0100
@@ -1839,7 +1839,7 @@ cgraph_node::local_info (tree decl)
 /* Return local info for the compiled function.  */
 
 cgraph_rtl_info *
-cgraph_node::rtl_info (tree decl)
+cgraph_node::rtl_info (const_tree decl)
 {
   gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
   cgraph_node *node = get (decl);
@@ -1854,7 +1854,10 @@ cgraph_node::rtl_info (tree decl)
     return NULL;
   /* Allocate if it doesn't exist.  */
   if (node->rtl == NULL)
-    node->rtl = ggc_cleared_alloc<cgraph_rtl_info> ();
+    {
+      node->rtl = ggc_cleared_alloc<cgraph_rtl_info> ();
+      node->rtl->function_used_regs = reg_class_contents[ALL_REGS];
+    }
   return node->rtl;
 }
 
Index: gcc/df-scan.c
===================================================================
--- gcc/df-scan.c	2019-09-25 16:23:04.000000000 +0100
+++ gcc/df-scan.c	2019-09-25 16:23:05.080580530 +0100
@@ -35,7 +35,7 @@ Software Foundation; either version 3, o
 #include "emit-rtl.h"  /* FIXME: Can go away once crtl is moved to rtl.h.  */
 #include "dumpfile.h"
 #include "calls.h"
-
+#include "function-abi.h"
 
 /* The set of hard registers in eliminables[i].from. */
 
@@ -3088,13 +3088,11 @@ df_get_call_refs (class df_collection_re
   bool is_sibling_call;
   unsigned int i;
   HARD_REG_SET defs_generated;
-  HARD_REG_SET fn_reg_set_usage;
 
   CLEAR_HARD_REG_SET (defs_generated);
   df_find_hard_reg_defs (PATTERN (insn_info->insn), &defs_generated);
   is_sibling_call = SIBLING_CALL_P (insn_info->insn);
-  get_call_reg_set_usage (insn_info->insn, &fn_reg_set_usage,
-			  regs_invalidated_by_call);
+  function_abi callee_abi = insn_callee_abi (insn_info->insn);
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     {
@@ -3118,7 +3116,7 @@ df_get_call_refs (class df_collection_re
 			       NULL, bb, insn_info, DF_REF_REG_DEF, flags);
 	    }
 	}
-      else if (TEST_HARD_REG_BIT (fn_reg_set_usage, i)
+      else if (callee_abi.clobbers_full_reg_p (i)
 	       /* no clobbers for regs that are the result of the call */
 	       && !TEST_HARD_REG_BIT (defs_generated, i)
 	       && (!is_sibling_call
Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c	2019-09-25 16:23:04.000000000 +0100
+++ gcc/ira-lives.c	2019-09-25 16:23:05.088580476 +0100
@@ -33,6 +33,7 @@ Software Foundation; either version 3, o
 #include "ira.h"
 #include "ira-int.h"
 #include "sparseset.h"
+#include "function-abi.h"
 
 /* The code in this file is similar to one in global but the code
    works on the allocno basis and creates live ranges instead of
@@ -1254,10 +1255,11 @@ process_bb_node_lives (ira_loop_tree_nod
 		  ira_object_t obj = ira_object_id_map[i];
 		  a = OBJECT_ALLOCNO (obj);
 		  int num = ALLOCNO_NUM (a);
-		  HARD_REG_SET this_call_used_reg_set;
-
-		  get_call_reg_set_usage (insn, &this_call_used_reg_set,
-					  call_used_or_fixed_regs);
+		  HARD_REG_SET this_call_used_reg_set
+		    = insn_callee_abi (insn).full_reg_clobbers ();
+		  /* ??? This preserves traditional behavior; it might not be
+		     needed.  */
+		  this_call_used_reg_set |= fixed_reg_set;
 
 		  /* Don't allocate allocnos that cross setjmps or any
 		     call, if this function receives a nonlocal
Index: gcc/lra-lives.c
===================================================================
--- gcc/lra-lives.c	2019-09-25 16:23:04.000000000 +0100
+++ gcc/lra-lives.c	2019-09-25 16:23:05.088580476 +0100
@@ -43,6 +43,7 @@ Software Foundation; either version 3, o
 #include "sparseset.h"
 #include "lra-int.h"
 #include "target.h"
+#include "function-abi.h"
 
 /* Program points are enumerated by numbers from range
    0..LRA_LIVE_MAX_POINT-1.  There are approximately two times more
@@ -931,9 +932,11 @@ process_bb_lives (basic_block bb, int &c
 	    last_call_used_reg_set = call_used_or_fixed_regs;
 	  else
 	    {
-	      HARD_REG_SET this_call_used_reg_set;
-	      get_call_reg_set_usage (curr_insn, &this_call_used_reg_set,
-				      call_used_or_fixed_regs);
+	      HARD_REG_SET this_call_used_reg_set
+		= insn_callee_abi (curr_insn).full_reg_clobbers ();
+	      /* ??? This preserves traditional behavior; it might not
+		 be needed.  */
+	      this_call_used_reg_set |= fixed_reg_set;
 
 	      bool flush = (! hard_reg_set_empty_p (last_call_used_reg_set)
 			    && (last_call_used_reg_set
Index: gcc/postreload.c
===================================================================
--- gcc/postreload.c	2019-09-25 16:23:04.000000000 +0100
+++ gcc/postreload.c	2019-09-25 16:23:05.088580476 +0100
@@ -40,6 +40,7 @@ Software Foundation; either version 3, o
 #include "cselib.h"
 #include "tree-pass.h"
 #include "dbgcnt.h"
+#include "function-abi.h"
 
 static int reload_cse_noop_set_p (rtx);
 static bool reload_cse_simplify (rtx_insn *, rtx);
@@ -1330,9 +1331,10 @@ reload_combine (void)
       if (CALL_P (insn))
 	{
 	  rtx link;
-	  HARD_REG_SET used_regs;
-
-	  get_call_reg_set_usage (insn, &used_regs, call_used_or_fixed_regs);
+	  HARD_REG_SET used_regs = insn_callee_abi (insn).full_reg_clobbers ();
+	  /* ??? This preserves traditional behavior; it might not be
+	     needed.  */
+	  used_regs |= fixed_reg_set;
 
 	  for (r = 0; r < FIRST_PSEUDO_REGISTER; r++)
 	    if (TEST_HARD_REG_BIT (used_regs, r))
Index: gcc/regcprop.c
===================================================================
--- gcc/regcprop.c	2019-09-25 16:23:04.000000000 +0100
+++ gcc/regcprop.c	2019-09-25 16:23:05.088580476 +0100
@@ -35,6 +35,7 @@
 #include "rtl-iter.h"
 #include "cfgrtl.h"
 #include "target.h"
+#include "function-abi.h"
 
 /* The following code does forward propagation of hard register copies.
    The object is to eliminate as many dependencies as possible, so that
@@ -1035,7 +1036,6 @@ copyprop_hardreg_forward_1 (basic_block
 	  unsigned int set_nregs = 0;
 	  unsigned int regno;
 	  rtx exp;
-	  HARD_REG_SET regs_invalidated_by_this_call;
 
 	  for (exp = CALL_INSN_FUNCTION_USAGE (insn); exp; exp = XEXP (exp, 1))
 	    {
@@ -1053,11 +1053,9 @@ copyprop_hardreg_forward_1 (basic_block
 		}
 	    }
 
-	  get_call_reg_set_usage (insn,
-				  &regs_invalidated_by_this_call,
-				  regs_invalidated_by_call);
+	  function_abi callee_abi = insn_callee_abi (insn);
 	  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-	    if ((TEST_HARD_REG_BIT (regs_invalidated_by_this_call, regno)
+	    if ((callee_abi.clobbers_full_reg_p (regno)
 		 || (targetm.hard_regno_call_part_clobbered
 		     (insn, regno, vd->e[regno].mode)))
 		&& (regno < set_regno || regno >= set_regno + set_nregs))
Index: gcc/resource.c
===================================================================
--- gcc/resource.c	2019-09-25 16:23:04.000000000 +0100
+++ gcc/resource.c	2019-09-25 16:23:05.088580476 +0100
@@ -30,6 +30,7 @@ Software Foundation; either version 3, o
 #include "resource.h"
 #include "insn-attr.h"
 #include "params.h"
+#include "function-abi.h"
 
 /* This structure is used to record liveness information at the targets or
    fallthrough insns of branches.  We will most likely need the information
@@ -662,12 +663,10 @@ mark_set_resources (rtx x, struct resour
 	{
 	  rtx_call_insn *call_insn = as_a <rtx_call_insn *> (x);
 	  rtx link;
-	  HARD_REG_SET regs;
 
 	  res->cc = res->memory = 1;
 
-	  get_call_reg_set_usage (call_insn, &regs, regs_invalidated_by_call);
-	  res->regs |= regs;
+	  res->regs |= insn_callee_abi (call_insn).full_reg_clobbers ();
 
 	  for (link = CALL_INSN_FUNCTION_USAGE (call_insn);
 	       link; link = XEXP (link, 1))
@@ -1038,10 +1037,8 @@ mark_target_live_regs (rtx_insn *insns,
 		 predicated instruction, or if the CALL is NORETURN.  */
 	      if (GET_CODE (PATTERN (real_insn)) != COND_EXEC)
 		{
-		  HARD_REG_SET regs_invalidated_by_this_call;
-		  get_call_reg_set_usage (real_insn,
-					  &regs_invalidated_by_this_call,
-					  regs_invalidated_by_call);
+		  HARD_REG_SET regs_invalidated_by_this_call
+		    = insn_callee_abi (real_insn).full_reg_clobbers ();
 		  /* CALL clobbers all call-used regs that aren't fixed except
 		     sp, ap, and fp.  Do this before setting the result of the
 		     call live.  */
Index: gcc/var-tracking.c
===================================================================
--- gcc/var-tracking.c	2019-09-25 16:23:04.000000000 +0100
+++ gcc/var-tracking.c	2019-09-25 16:23:05.092580444 +0100
@@ -116,6 +116,7 @@
 #include "rtl-iter.h"
 #include "fibonacci_heap.h"
 #include "print-rtl.h"
+#include "function-abi.h"
 
 typedef fibonacci_heap <long, basic_block_def> bb_heap_t;
 typedef fibonacci_node <long, basic_block_def> bb_heap_node_t;
@@ -4900,12 +4901,10 @@ dataflow_set_clear_at_call (dataflow_set
 {
   unsigned int r;
   hard_reg_set_iterator hrsi;
-  HARD_REG_SET invalidated_regs;
 
-  get_call_reg_set_usage (call_insn, &invalidated_regs,
-			  regs_invalidated_by_call);
+  function_abi callee_abi = insn_callee_abi (call_insn);
 
-  EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
+  EXECUTE_IF_SET_IN_HARD_REG_SET (callee_abi.full_reg_clobbers (), 0, r, hrsi)
     var_regno_delete (set, r);
 
   if (MAY_HAVE_DEBUG_BIND_INSNS)

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [04/32] [x86] Robustify vzeroupper handling across calls
  2019-09-11 19:05 ` [04/32] [x86] Robustify vzeroupper handling across calls Richard Sandiford
@ 2019-09-25 15:48   ` Richard Sandiford
  2019-09-25 18:11     ` Uros Bizjak
  2019-10-01 10:14     ` Uros Bizjak
  0 siblings, 2 replies; 100+ messages in thread
From: Richard Sandiford @ 2019-09-25 15:48 UTC (permalink / raw)
  To: gcc-patches; +Cc: hubicka, ubizjak

Ping

Richard Sandiford <richard.sandiford@arm.com> writes:
> One of the effects of the function_abi series is to make -fipa-ra
> work for partially call-clobbered registers.  E.g. if a call preserves
> only the low 32 bits of a register R, we handled the partial clobber
> separately from -fipa-ra, and so treated the upper bits of R as
> clobbered even if we knew that the target function doesn't touch R.
>
> "Fixing" this caused problems for the vzeroupper handling on x86.
> The pass that inserts the vzerouppers assumes that no 256-bit or 512-bit
> values are live across a call unless the call takes a 256-bit or 512-bit
> argument:
>
>       /* Needed mode is set to AVX_U128_CLEAN if there are
> 	 no 256bit or 512bit modes used in function arguments. */
>
> This implicitly relies on:
>
> /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  The only ABI that
>    saves SSE registers across calls is Win64 (thus no need to check the
>    current ABI here), and with AVX enabled Win64 only guarantees that
>    the low 16 bytes are saved.  */
>
> static bool
> ix86_hard_regno_call_part_clobbered (rtx_insn *insn ATTRIBUTE_UNUSED,
> 				     unsigned int regno, machine_mode mode)
> {
>   return SSE_REGNO_P (regno) && GET_MODE_SIZE (mode) > 16;
> }
>
> The comment suggests that this code is only needed for Win64 and that
> not testing for Win64 is just a simplification.  But in practice it was
> needed for correctness on GNU/Linux and other targets too, since without
> it the RA would be able to keep 256-bit and 512-bit values in SSE
> registers across calls that are known not to clobber them.
>
> This patch conservatively treats calls as AVX_U128_ANY if the RA can see
> that some SSE registers are not touched by a call.  There are then no
> regressions if the ix86_hard_regno_call_part_clobbered check is disabled
> for GNU/Linux (not something we should do, was just for testing).
>
> If in fact we want -fipa-ra to pretend that all functions clobber
> SSE registers above 128 bits, it'd certainly be possible to arrange
> that.  But IMO that would be an optimisation decision, whereas what
> the patch is fixing is a correctness decision.  So I think we should
> have this check even so.

2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/i386/i386.c: Include function-abi.h.
	(ix86_avx_u128_mode_needed): Treat function calls as AVX_U128_ANY
	if they preserve some 256-bit or 512-bit SSE registers.

Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	2019-09-25 16:47:48.000000000 +0100
+++ gcc/config/i386/i386.c	2019-09-25 16:47:49.089962608 +0100
@@ -95,6 +95,7 @@ #define IN_TARGET_CODE 1
 #include "i386-builtins.h"
 #include "i386-expand.h"
 #include "i386-features.h"
+#include "function-abi.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -13511,6 +13512,15 @@ ix86_avx_u128_mode_needed (rtx_insn *ins
 	    }
 	}
 
+      /* If the function is known to preserve some SSE registers,
+	 RA and previous passes can legitimately rely on that for
+	 modes wider than 256 bits.  It's only safe to issue a
+	 vzeroupper if all SSE registers are clobbered.  */
+      const function_abi &abi = insn_callee_abi (insn);
+      if (!hard_reg_set_subset_p (reg_class_contents[ALL_SSE_REGS],
+				  abi.mode_clobbers (V4DImode)))
+	return AVX_U128_ANY;
+
       return AVX_U128_CLEAN;
     }
 


* Re: [10/32] Remove global call sets: combine.c
  2019-09-20  0:43       ` Segher Boessenkool
@ 2019-09-25 15:52         ` Richard Sandiford
  2019-09-25 16:30           ` Segher Boessenkool
  2019-09-29 22:32           ` Jeff Law
  0 siblings, 2 replies; 100+ messages in thread
From: Richard Sandiford @ 2019-09-25 15:52 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches

Segher Boessenkool <segher@kernel.crashing.org> writes:
> Hi Richard,
>
> Sorry this too me so long to get back to.
[Typo in the quoted line above: "too me" should read "took me".]
>
> On Thu, Sep 12, 2019 at 08:51:59AM +0100, Richard Sandiford wrote:
>> Segher Boessenkool <segher@kernel.crashing.org> writes:
>> > On Wed, Sep 11, 2019 at 08:08:38PM +0100, Richard Sandiford wrote:
>> >>        hard_reg_set_iterator hrsi;
>> >> -      EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, i, hrsi)
>> >> +      EXECUTE_IF_SET_IN_HARD_REG_SET (abi.full_and_partial_reg_clobbers (),
>> >> +				      0, i, hrsi)
>> >
>> > So "abi" in that means calls?
>> 
>> "abi" is the interface of the callee function, taking things like
>> function attributes and -fipa-ra into account.
>> 
>> The register sets are describing what the callee does rather than
>> what calls to it do.  E.g. on targets that allow linker stubs to be
>> inserted between calls, the scratch registers reserved for linker stubs
>> are still call-clobbered, even if the target of the call doesn't use
>> them.  (Those call clobbers are represented separately, at least when
>> TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS is true.  When it's
>> false we don't use -fipa-ra information at all.)
>> 
>> > It is not such a great name like that.  Since its children are
>> > very_long_names, it doesn't need to be only three chars itself,
>> > either?
>> 
>> OK, what name would you prefer?
>
> Maybe call_abi is a good name?  It's difficult to capture the subtleties
> in a short enough name.  As always :-)

The formatting ended up being a bit weird with a longer name,
so how about the attached instead?

Richard


2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* combine.c: Include function-abi.h.
	(record_dead_and_set_regs): Use insn_callee_abi to get the ABI
	of the target of call insns.  Invalidate partially-clobbered
	registers as well as fully-clobbered ones.

Index: gcc/combine.c
===================================================================
--- gcc/combine.c	2019-09-12 10:52:53.000000000 +0100
+++ gcc/combine.c	2019-09-25 16:50:21.772865265 +0100
@@ -105,6 +105,7 @@ Software Foundation; either version 3, o
 #include "valtrack.h"
 #include "rtl-iter.h"
 #include "print-rtl.h"
+#include "function-abi.h"
 
 /* Number of attempts to combine instructions in this function.  */
 
@@ -13464,11 +13465,21 @@ record_dead_and_set_regs (rtx_insn *insn
 
   if (CALL_P (insn))
     {
+      HARD_REG_SET callee_clobbers
+	= insn_callee_abi (insn).full_and_partial_reg_clobbers ();
       hard_reg_set_iterator hrsi;
-      EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, i, hrsi)
+      EXECUTE_IF_SET_IN_HARD_REG_SET (callee_clobbers, 0, i, hrsi)
 	{
 	  reg_stat_type *rsp;
 
+	  /* ??? We could try to preserve some information from the last
+	     set of register I if the call doesn't actually clobber
+	     (reg:last_set_mode I), which might be true for ABIs with
+	     partial clobbers.  However, it would be difficult to
+	     update last_set_nonzero_bits and last_sign_bit_copies
+	     to account for the part of I that actually was clobbered.
+	     It wouldn't help much anyway, since we rarely see this
+	     situation before RA.  */
 	  rsp = &reg_stat[i];
 	  rsp->last_set_invalid = 1;
 	  rsp->last_set = insn;


* Re: [11/32] Remove global call sets: cse.c
  2019-09-11 19:09 ` [11/32] Remove global call sets: cse.c Richard Sandiford
@ 2019-09-25 15:57   ` Richard Sandiford
  2019-09-29 21:04     ` Jeff Law
  0 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-25 15:57 UTC (permalink / raw)
  To: gcc-patches

Richard Sandiford <richard.sandiford@arm.com> writes:
> Like with the combine.c patch, this one keeps things simple by
> invalidating values in partially-clobbered registers, rather than
> trying to tell whether the value in a partially-clobbered register
> is actually clobbered or not.  Again, this is in principle a bug fix,
> but probably never matters in practice.

Similarly to the combine patch, I've updated this to avoid the
short "abi" name and use a temporary HARD_REG_SET instead.

Richard


2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* cse.c: Include regs.h and function-abi.h.
	(invalidate_for_call): Take the call insn as an argument.
	Use insn_callee_abi to get the ABI of the call and invalidate
	partially clobbered registers as well as fully clobbered ones.
	(cse_insn): Update call accordingly.

Index: gcc/cse.c
===================================================================
--- gcc/cse.c	2019-09-17 15:27:11.338066929 +0100
+++ gcc/cse.c	2019-09-25 16:55:31.202641509 +0100
@@ -42,6 +42,8 @@ Software Foundation; either version 3, o
 #include "tree-pass.h"
 #include "dbgcnt.h"
 #include "rtl-iter.h"
+#include "regs.h"
+#include "function-abi.h"
 
 /* The basic idea of common subexpression elimination is to go
    through the code, keeping a record of expressions that would
@@ -566,7 +568,6 @@ static void remove_invalid_subreg_refs (
 					machine_mode);
 static void rehash_using_reg (rtx);
 static void invalidate_memory (void);
-static void invalidate_for_call (void);
 static rtx use_related_value (rtx, struct table_elt *);
 
 static inline unsigned canon_hash (rtx, machine_mode);
@@ -2091,23 +2092,29 @@ rehash_using_reg (rtx x)
 }
 \f
 /* Remove from the hash table any expression that is a call-clobbered
-   register.  Also update their TICK values.  */
+   register in INSN.  Also update their TICK values.  */
 
 static void
-invalidate_for_call (void)
+invalidate_for_call (rtx_insn *insn)
 {
-  unsigned int regno, endregno;
-  unsigned int i;
+  unsigned int regno;
   unsigned hash;
   struct table_elt *p, *next;
   int in_table = 0;
   hard_reg_set_iterator hrsi;
 
-  /* Go through all the hard registers.  For each that is clobbered in
-     a CALL_INSN, remove the register from quantity chains and update
+  /* Go through all the hard registers.  For each that might be clobbered
+     in call insn INSN, remove the register from quantity chains and update
      reg_tick if defined.  Also see if any of these registers is currently
-     in the table.  */
-  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, regno, hrsi)
+     in the table.
+
+     ??? We could be more precise for partially-clobbered registers,
+     and only invalidate values that actually occupy the clobbered part
+     of the registers.  It doesn't seem worth the effort though, since
+     we shouldn't see this situation much before RA.  */
+  HARD_REG_SET callee_clobbers
+    = insn_callee_abi (insn).full_and_partial_reg_clobbers ();
+  EXECUTE_IF_SET_IN_HARD_REG_SET (callee_clobbers, 0, regno, hrsi)
     {
       delete_reg_equiv (regno);
       if (REG_TICK (regno) >= 0)
@@ -2132,15 +2139,11 @@ invalidate_for_call (void)
 	      || REGNO (p->exp) >= FIRST_PSEUDO_REGISTER)
 	    continue;
 
-	  regno = REGNO (p->exp);
-	  endregno = END_REGNO (p->exp);
-
-	  for (i = regno; i < endregno; i++)
-	    if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i))
-	      {
-		remove_from_table (p, hash);
-		break;
-	      }
+	  /* This must use the same test as above rather than the
+	     more accurate clobbers_reg_p.  */
+	  if (overlaps_hard_reg_set_p (callee_clobbers, GET_MODE (p->exp),
+				       REGNO (p->exp)))
+	    remove_from_table (p, hash);
 	}
 }
 \f
@@ -5823,7 +5826,7 @@ cse_insn (rtx_insn *insn)
 	  if (GET_CODE (XEXP (tem, 0)) == USE
 	      && MEM_P (XEXP (XEXP (tem, 0), 0)))
 	    invalidate (XEXP (XEXP (tem, 0), 0), VOIDmode);
-      invalidate_for_call ();
+      invalidate_for_call (insn);
     }
 
   /* Now invalidate everything set by this instruction.


* Re: [17/32] Remove global call sets: gcse.c
  2019-09-11 19:11 ` [17/32] Remove global call sets: gcse.c Richard Sandiford
@ 2019-09-25 16:04   ` Richard Sandiford
  2019-09-29 21:10   ` Jeff Law
  1 sibling, 0 replies; 100+ messages in thread
From: Richard Sandiford @ 2019-09-25 16:04 UTC (permalink / raw)
  To: gcc-patches

Richard Sandiford <richard.sandiford@arm.com> writes:
> This is another case in which we can conservatively treat partial
> kills as full kills.  Again this is in principle a bug fix for
> TARGET_HARD_REGNO_CALL_PART_CLOBBERED targets, but in practice
> it probably doesn't make a difference.

Similarly to the combine patch, I've updated this to avoid the
short "abi" name and use a temporary HARD_REG_SET instead.

Richard


2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* gcse.c: Include function-abi.h.
	(compute_hash_table_work): Use insn_callee_abi to get the ABI of
	the call insn target.  Invalidate partially call-clobbered
	registers as well as fully call-clobbered ones.

Index: gcc/gcse.c
===================================================================
--- gcc/gcse.c	2019-09-25 17:03:07.000000000 +0100
+++ gcc/gcse.c	2019-09-25 17:03:07.427363103 +0100
@@ -160,6 +160,7 @@ Software Foundation; either version 3, o
 #include "dbgcnt.h"
 #include "gcse.h"
 #include "gcse-common.h"
+#include "function-abi.h"
 
 /* We support GCSE via Partial Redundancy Elimination.  PRE optimizations
    are a superset of those done by classic GCSE.
@@ -1528,8 +1529,13 @@ compute_hash_table_work (struct gcse_has
 	  if (CALL_P (insn))
 	    {
 	      hard_reg_set_iterator hrsi;
-	      EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call,
-					      0, regno, hrsi)
+
+	      /* We don't track modes of hard registers, so we need
+		 to be conservative and assume that partial kills
+		 are full kills.  */
+	      HARD_REG_SET callee_clobbers
+		= insn_callee_abi (insn).full_and_partial_reg_clobbers ();
+	      EXECUTE_IF_SET_IN_HARD_REG_SET (callee_clobbers, 0, regno, hrsi)
 		record_last_reg_set_info (insn, regno);
 
 	      if (! RTL_CONST_OR_PURE_CALL_P (insn)


* Re: [23/32] Remove global call sets: postreload-gcse.c
  2019-09-11 19:14 ` [23/32] Remove global call sets: postreload-gcse.c Richard Sandiford
@ 2019-09-25 16:08   ` Richard Sandiford
  2019-09-29 22:22     ` Jeff Law
  0 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-25 16:08 UTC (permalink / raw)
  To: gcc-patches

Richard Sandiford <richard.sandiford@arm.com> writes:
> This is another case in which we should conservatively treat
> partial kills as full kills.

Similarly to the combine patch, I've updated this to avoid the
short "abi" name and use a temporary HARD_REG_SET instead.

Richard


2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* postreload-gcse.c: Include regs.h and function-abi.h.
	(record_opr_changes): Use insn_callee_abi to get the ABI of the
	call insn target.  Conservatively assume that partially-clobbered
	registers are altered.

Index: gcc/postreload-gcse.c
===================================================================
--- gcc/postreload-gcse.c	2019-09-12 10:52:50.000000000 +0100
+++ gcc/postreload-gcse.c	2019-09-25 17:06:55.213726369 +0100
@@ -41,6 +41,8 @@ Software Foundation; either version 3, o
 #include "intl.h"
 #include "gcse-common.h"
 #include "gcse.h"
+#include "regs.h"
+#include "function-abi.h"
 
 /* The following code implements gcse after reload, the purpose of this
    pass is to cleanup redundant loads generated by reload and other
@@ -772,7 +774,11 @@ record_opr_changes (rtx_insn *insn)
     {
       unsigned int regno;
       hard_reg_set_iterator hrsi;
-      EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, regno, hrsi)
+      /* We don't track modes of hard registers, so we need to be
+	 conservative and assume that partial kills are full kills.  */
+      HARD_REG_SET callee_clobbers
+	= insn_callee_abi (insn).full_and_partial_reg_clobbers ();
+      EXECUTE_IF_SET_IN_HARD_REG_SET (callee_clobbers, 0, regno, hrsi)
 	record_last_reg_set_info_regno (insn, regno);
 
       if (! RTL_CONST_OR_PURE_CALL_P (insn))


* Re: [10/32] Remove global call sets: combine.c
  2019-09-25 15:52         ` Richard Sandiford
@ 2019-09-25 16:30           ` Segher Boessenkool
  2019-09-29 22:32           ` Jeff Law
  1 sibling, 0 replies; 100+ messages in thread
From: Segher Boessenkool @ 2019-09-25 16:30 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On Wed, Sep 25, 2019 at 04:52:14PM +0100, Richard Sandiford wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
> > On Thu, Sep 12, 2019 at 08:51:59AM +0100, Richard Sandiford wrote:
> >> Segher Boessenkool <segher@kernel.crashing.org> writes:
> >> > It is not such a great name like that.  Since its children are
> >> > very_long_names, it doesn't need to be only three chars itself,
> >> > either?
> >> 
> >> OK, what name would you prefer?
> >
> > Maybe call_abi is a good name?  It's difficult to capture the subtleties
> > in a short enough name.  As always :-)
> 
> The formatting ended up being a bit weird with a longer name,
> so how about the attached instead?

That looks great, thanks!

> +	  /* ??? We could try to preserve some information from the last
> +	     set of register I if the call doesn't actually clobber
> +	     (reg:last_set_mode I), which might be true for ABIs with
> +	     partial clobbers.  However, it would be difficult to
> +	     update last_set_nonzero_bits and last_sign_bit_copies
> +	     to account for the part of I that actually was clobbered.
> +	     It wouldn't help much anyway, since we rarely see this
> +	     situation before RA.  */

I would like to completely get rid of reg_stat, and have known bits
dealt with by some DF thing instead...  It would work much better and
be much easier to use at the same time.  Also, other passes could use
it as well.

If I ever will find time to do this, I don't know :-/


Segher


* Re: [04/32] [x86] Robustify vzeroupper handling across calls
  2019-09-25 15:48   ` Richard Sandiford
@ 2019-09-25 18:11     ` Uros Bizjak
  2019-10-01 10:14     ` Uros Bizjak
  1 sibling, 0 replies; 100+ messages in thread
From: Uros Bizjak @ 2019-09-25 18:11 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, Jan Hubicka

On Wed, Sep 25, 2019 at 5:48 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Ping
>
> Richard Sandiford <richard.sandiford@arm.com> writes:
> > One of the effects of the function_abi series is to make -fipa-ra
> > work for partially call-clobbered registers.  E.g. if a call preserves
> > only the low 32 bits of a register R, we handled the partial clobber
> > separately from -fipa-ra, and so treated the upper bits of R as
> > clobbered even if we knew that the target function doesn't touch R.
> >
> > "Fixing" this caused problems for the vzeroupper handling on x86.
> > The pass that inserts the vzerouppers assumes that no 256-bit or 512-bit
> > values are live across a call unless the call takes a 256-bit or 512-bit
> > argument:
> >
> >       /* Needed mode is set to AVX_U128_CLEAN if there are
> >        no 256bit or 512bit modes used in function arguments. */
> >
> > This implicitly relies on:
> >
> > /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  The only ABI that
> >    saves SSE registers across calls is Win64 (thus no need to check the
> >    current ABI here), and with AVX enabled Win64 only guarantees that
> >    the low 16 bytes are saved.  */
> >
> > static bool
> > ix86_hard_regno_call_part_clobbered (rtx_insn *insn ATTRIBUTE_UNUSED,
> >                                    unsigned int regno, machine_mode mode)
> > {
> >   return SSE_REGNO_P (regno) && GET_MODE_SIZE (mode) > 16;
> > }
> >
> > The comment suggests that this code is only needed for Win64 and that
> > not testing for Win64 is just a simplification.  But in practice it was
> > needed for correctness on GNU/Linux and other targets too, since without
> > it the RA would be able to keep 256-bit and 512-bit values in SSE
> > registers across calls that are known not to clobber them.
> >
> > This patch conservatively treats calls as AVX_U128_ANY if the RA can see
> > that some SSE registers are not touched by a call.  There are then no
> > regressions if the ix86_hard_regno_call_part_clobbered check is disabled
> > for GNU/Linux (not something we should do, was just for testing).
> >
> > If in fact we want -fipa-ra to pretend that all functions clobber
> > SSE registers above 128 bits, it'd certainly be possible to arrange
> > that.  But IMO that would be an optimisation decision, whereas what
> > the patch is fixing is a correctness decision.  So I think we should
> > have this check even so.
>
> 2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * config/i386/i386.c: Include function-abi.h.
>         (ix86_avx_u128_mode_needed): Treat function calls as AVX_U128_ANY
>         if they preserve some 256-bit or 512-bit SSE registers.

OK.

Thanks,
Uros.

>
> Index: gcc/config/i386/i386.c
> ===================================================================
> --- gcc/config/i386/i386.c      2019-09-25 16:47:48.000000000 +0100
> +++ gcc/config/i386/i386.c      2019-09-25 16:47:49.089962608 +0100
> @@ -95,6 +95,7 @@ #define IN_TARGET_CODE 1
>  #include "i386-builtins.h"
>  #include "i386-expand.h"
>  #include "i386-features.h"
> +#include "function-abi.h"
>
>  /* This file should be included last.  */
>  #include "target-def.h"
> @@ -13511,6 +13512,15 @@ ix86_avx_u128_mode_needed (rtx_insn *ins
>             }
>         }
>
> +      /* If the function is known to preserve some SSE registers,
> +        RA and previous passes can legitimately rely on that for
> +        modes wider than 256 bits.  It's only safe to issue a
> +        vzeroupper if all SSE registers are clobbered.  */
> +      const function_abi &abi = insn_callee_abi (insn);
> +      if (!hard_reg_set_subset_p (reg_class_contents[ALL_SSE_REGS],
> +                                 abi.mode_clobbers (V4DImode)))
> +       return AVX_U128_ANY;
> +
>        return AVX_U128_CLEAN;
>      }
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [00/32] Support multiple ABIs in the same translation unit
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (32 preceding siblings ...)
  2019-09-12 20:42 ` [00/32] Support multiple ABIs in the same translation unit Steven Bosscher
@ 2019-09-26 19:24 ` Dimitar Dimitrov
  2019-09-27  8:58   ` Richard Sandiford
  2019-10-01  2:09 ` build-failure for cris-elf with "[00/32] Support multiple ABIs in the same translation unit" Hans-Peter Nilsson
  34 siblings, 1 reply; 100+ messages in thread
From: Dimitar Dimitrov @ 2019-09-26 19:24 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On Wed, 11 Sep 2019, 22:02:26 EEST Richard Sandiford wrote:
> The reason for the PRU differences is that the port defines
> targetm.hard_regno_call_part_clobbered, but uses it to test whether
> a multi-register value contains a mixture of fully-clobbered and
> fully-preserved registers.  AFAICT the port doesn't actually have
> individual registers that are partly clobbered, so it doesn't need
> to define the hook.  (I can see how the documentation gave a misleading
> impression though.  I've tried to improve it in one of the patches.)
> The series moves away from testing hard_regno_call_part_clobbered
> directly to testing cached information instead, and the way that the
> cached information is calculated means that defining the hook the way
> the PRU port does has no effect.  In other words, after the series we
> treat it (rightly IMO) as having a "normal" ABI whereas before we didn't.
You are correct. The port does not have partially clobbered HW registers;
as you guessed, I was only worried about multi-register values.

PRU testsuite showed no regression from trunk with your patch set.

With your patch set, I tried to compare PRU assembly with and without defining 
the targetm.hard_regno_call_part_clobbered hook. There was much noise in 
compare-all-tests due to lto compiler ID strings, but after some filtering I 
think the output assembly was the same.

Thanks,
Dimitar




* Re: [00/32] Support multiple ABIs in the same translation unit
  2019-09-26 19:24 ` Dimitar Dimitrov
@ 2019-09-27  8:58   ` Richard Sandiford
  0 siblings, 0 replies; 100+ messages in thread
From: Richard Sandiford @ 2019-09-27  8:58 UTC (permalink / raw)
  To: Dimitar Dimitrov; +Cc: gcc-patches

Dimitar Dimitrov <dimitar@dinux.eu> writes:
> On Wed, 11 Sep 2019, 22:02:26 EEST Richard Sandiford wrote:
>> The reason for the PRU differences is that the port defines
>> targetm.hard_regno_call_part_clobbered, but uses it to test whether
>> a multi-register value contains a mixture of fully-clobbered and
>> fully-preserved registers.  AFAICT the port doesn't actually have
>> individual registers that are partly clobbered, so it doesn't need
>> to define the hook.  (I can see how the documentation gave a misleading
>> impression though.  I've tried to improve it in one of the patches.)
>> The series moves away from testing hard_regno_call_part_clobbered
>> directly to testing cached information instead, and the way that the
>> cached information is calculated means that defining the hook the way
>> the PRU port does has no effect.  In other words, after the series we
>> treat it (rightly IMO) as having a "normal" ABI whereas before we didn't.
> You are correct. Port does not have partially clobbered HW registers. And 
> indeed I was worried about multi-register values.
>
> PRU testsuite showed no regression from trunk with your patch set.
>
> With your patch set, I tried to compare PRU assembly with and without defining 
> the targetm.hard_regno_call_part_clobbered hook. There was much noise in 
> compare-all-tests due to lto compiler ID strings, but after some filtering I 
> think the output assembly was the same.

OK, great!  Thanks for the testing.

Richard


* Re: [01/32] Add function_abi.{h,cc}
  2019-09-11 19:03 ` [01/32] Add function_abi.{h,cc} Richard Sandiford
@ 2019-09-29 20:51   ` Jeff Law
  2019-09-30  9:19     ` Richard Sandiford
  0 siblings, 1 reply; 100+ messages in thread
From: Jeff Law @ 2019-09-29 20:51 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:03 PM, Richard Sandiford wrote:
> This patch adds new structures and functions for handling
> multiple ABIs in a translation unit.  The structures are:
> 
> - predefined_function_abi: describes a static, predefined ABI
> - function_abi: describes either a predefined ABI or a local
>   variant of one (e.g. taking -fipa-ra into account)
> 
> The patch adds functions for getting the ABI from a given type
> or decl; a later patch will also add a function for getting the
> ABI of the target of a call insn.
> 
> Although ABIs are about much more than call-clobber/saved choices,
> I wanted to keep the name general in case we add more ABI-related
> information in future.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* Makefile.in (OBJS): Add function-abi.o.
> 	(GTFILES): Add function-abi.h.
> 	* function-abi.cc: New file.
> 	* function-abi.h: Likewise.
> 	* emit-rtl.h (rtl_data::abi): New field.
> 	* function.c: Include function-abi.h.
> 	(prepare_function_start): Initialize crtl->abi.
> 	* read-rtl-function.c: Include regs.h and function-abi.h.
> 	(read_rtl_function_body): Initialize crtl->abi.
> 	(read_rtl_function_body_from_file_range): Likewise.
> 	* reginfo.c: Include function-abi.h.
> 	(init_reg_sets_1): Initialize default_function_abi.
> 	(globalize_reg): Call add_full_reg_clobber for each predefined ABI
> 	when making a register global.
> 	* target-globals.h (this_target_function_abi_info): Declare.
> 	(target_globals::function_abi_info): New field.
> 	(restore_target_globals): Copy it.
> 	* target-globals.c: Include function-abi.h.
> 	(default_target_globals): Initialize the function_abi_info field.
> 	(target_globals): Allocate it.
> 	(save_target_globals): Free it.
So no problem with this as-is.  Questions though:

1. Do we need to stream this information for LTO?

2. Do we need to support it for the RTL front-end, even if primarily for
testing purposes?

These may show up in later patches, if so safely assume that I'll get to
them soon enough :-)

jeff


* Re: [02/32] Add a target hook for getting an ABI from a function type
  2019-09-11 19:03 ` [02/32] Add a target hook for getting an ABI from a function type Richard Sandiford
@ 2019-09-29 20:52   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 20:52 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:03 PM, Richard Sandiford wrote:
> This patch adds a target hook that allows targets to return
> the ABI associated with a particular function type.  Generally,
> when multiple ABIs are in use, it must be possible to tell from
> a function type and its attributes which ABI it is using.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* target.def (fntype_abi): New target hook.
> 	* doc/tm.texi.in (TARGET_FNTYPE_ABI): Likewise.
> 	* doc/tm.texi: Regenerate.
> 	* target.h (predefined_function_abi): Declare.
> 	* function-abi.cc (fntype_abi): Call targetm.calls.fntype_abi,
> 	if defined.
> 	* config/aarch64/aarch64.h (ARM_PCS_SIMD): New arm_pcs value.
> 	* config/aarch64/aarch64.c: Include function-abi.h.
> 	(aarch64_simd_abi, aarch64_fntype_abi): New functions.
> 	(TARGET_FNTYPE_ABI): Define.
> 
OK
jeff


* Re: [05/32] Pass an ABI identifier to hard_regno_call_part_clobbered
  2019-09-11 19:05 ` [05/32] Pass an ABI identifier to hard_regno_call_part_clobbered Richard Sandiford
@ 2019-09-29 20:58   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 20:58 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:05 PM, Richard Sandiford wrote:
> This patch replaces the rtx_insn argument to
> targetm.hard_regno_call_part_clobbered with an ABI identifier, since
> call insns are now just one possible way of getting an ABI handle.
> This in turn allows predefined_function_abi::initialize to do the
> right thing for non-default ABIs.
> 
> The horrible ?: in need_for_call_save_p goes away in a later patch,
> with the series as a whole removing most direct calls to the hook in
> favour of function_abi operations.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* target.def (hard_regno_call_part_clobbered): Take an ABI
> 	identifier instead of an rtx_insn.
> 	* doc/tm.texi: Regenerate.
> 	* hooks.h (hook_bool_insn_uint_mode_false): Delete.
> 	(hook_bool_uint_uint_mode_false): New function.
> 	* hooks.c (hook_bool_insn_uint_mode_false): Delete.
> 	(hook_bool_uint_uint_mode_false): New function.
> 	* config/aarch64/aarch64.c (aarch64_hard_regno_call_part_clobbered):
> 	Take an ABI identifier instead of an rtx_insn.
> 	* config/avr/avr.c (avr_hard_regno_call_part_clobbered): Likewise.
> 	* config/i386/i386.c (ix86_hard_regno_call_part_clobbered): Likewise.
> 	* config/mips/mips.c (mips_hard_regno_call_part_clobbered): Likewise.
> 	* config/pru/pru.c (pru_hard_regno_call_part_clobbered): Likewise.
> 	* config/rs6000/rs6000.c (rs6000_hard_regno_call_part_clobbered):
> 	Likewise.
> 	* config/s390/s390.c (s390_hard_regno_call_part_clobbered): Likewise.
> 	* cselib.c: Include function-abi.h.
> 	(cselib_process_insn): Update call to
> 	targetm.hard_regno_call_part_clobbered, using call_insn_abi
> 	to get the appropriate ABI identifier.
> 	* function-abi.cc (predefined_function_abi::initialize): Update call
> 	to targetm.hard_regno_call_part_clobbered.
> 	* ira-conflicts.c (ira_build_conflicts): Likewise.
> 	* ira-costs.c (ira_tune_allocno_costs): Likewise.
> 	* lra-constraints.c: Include function-abi.h.
> 	(need_for_call_save_p): Update call to
> 	targetm.hard_regno_call_part_clobbered, using call_insn_abi
> 	to get the appropriate ABI identifier.
> 	* lra-lives.c (check_pseudos_live_through_calls): Likewise.
> 	* regcprop.c (copyprop_hardreg_forward_1): Update call
> 	to targetm.hard_regno_call_part_clobbered.
> 	* reginfo.c (choose_hard_reg_mode): Likewise.
> 	* regrename.c (check_new_reg_p): Likewise.
> 	* reload.c (find_equiv_reg): Likewise.
> 	* reload1.c (emit_reload_insns): Likewise.
> 	* sched-deps.c: Include function-abi.h.
> 	(deps_analyze_insn): Update call to
> 	targetm.hard_regno_call_part_clobbered, using call_insn_abi
> 	to get the appropriate ABI identifier.
> 	* sel-sched.c (init_regs_for_mode, mark_unavailable_hard_regs): Update
> 	call to targetm.hard_regno_call_part_clobbered.
> 	* targhooks.c (default_dwarf_frame_reg_mode): Likewise.
> 
OK
jeff


* Re: [06/32] Pass an ABI to choose_hard_reg_mode
  2019-09-11 19:06 ` [06/32] Pass an ABI to choose_hard_reg_mode Richard Sandiford
@ 2019-09-29 21:00   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:00 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:06 PM, Richard Sandiford wrote:
> choose_hard_reg_mode previously took a boolean saying whether the
> mode needed to be call-preserved.  This patch replaces it with an
> optional ABI pointer instead, so that the function can use that
> to test whether a value is call-saved.
> 
> default_dwarf_frame_reg_mode uses eh_edge_abi because that's the
> ABI that matters for unwinding.  Targets need to override the hook
> if they want something different.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* rtl.h (predefined_function_abi): Declare.
> 	(choose_hard_reg_mode): Take a pointer to a predefined_function_abi
> 	instead of a boolean call_save flag.
> 	* config/gcn/gcn.c (gcn_hard_regno_caller_save_mode): Update call
> 	accordingly.
> 	* config/i386/i386.h (HARD_REGNO_CALLER_SAVE_MODE): Likewise.
> 	* config/ia64/ia64.h (HARD_REGNO_CALLER_SAVE_MODE): Likewise.
> 	* config/mips/mips.c (mips_hard_regno_caller_save_mode): Likewise.
> 	* config/msp430/msp430.h (HARD_REGNO_CALLER_SAVE_MODE): Likewise.
> 	* config/rs6000/rs6000.h (HARD_REGNO_CALLER_SAVE_MODE): Likewise.
> 	* config/sh/sh.c (sh_hard_regno_caller_save_mode): Likewise.
> 	* reginfo.c (init_reg_modes_target): Likewise.
> 	(choose_hard_reg_mode): Take a pointer to a predefined_function_abi
> 	instead of a boolean call_save flag.
> 	* targhooks.c: Include function-abi.h.
> 	(default_dwarf_frame_reg_mode): Update call to choose_hard_reg_mode,
> 	using eh_edge_abi to choose the mode.
> 
OK
jeff


* Re: [07/32] Remove global call sets: caller-save.c
  2019-09-11 19:07 ` [07/32] Remove global call sets: caller-save.c Richard Sandiford
@ 2019-09-29 21:01   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:01 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:07 PM, Richard Sandiford wrote:
> All caller-save.c uses of "|= fixed_reg_set" added in a previous patch
> were redundant, since the sets are later ANDed with ~fixed_reg_set.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* caller-save.c (setup_save_areas): Remove redundant |s of
> 	fixed_reg_set.
> 	(save_call_clobbered_regs): Likewise.  Use the call ABI rather
> 	than call_used_or_fixed_regs to decide whether a REG_RETURNED
> 	value is useful.
> 
OK
jeff


* Re: [09/32] Remove global call sets: cfgloopanal.c
  2019-09-11 19:08 ` [09/32] Remove global call sets: cfgloopanal.c Richard Sandiford
@ 2019-09-29 21:02   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:02 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:08 PM, Richard Sandiford wrote:
> ...or rather, make the use of the default ABI explicit.  That seems
> OK if not ideal for this heuristic.
> 
> In practical terms, the code patched here is counting GENERAL_REGS,
> which are treated in the same way by all concurrent ABI variants
> on AArch64.  It might give bad results if used for interrupt
> handlers though.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* cfgloopanal.c: Include regs.h and function-abi.h.
> 	(init_set_costs): Use default_function_abi to test whether
> 	a general register is call-clobbered.
> 
OK
jeff


* Re: [08/32] Remove global call sets: cfgcleanup.c
  2019-09-11 19:07 ` [08/32] Remove global call sets: cfgcleanup.c Richard Sandiford
@ 2019-09-29 21:02   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:02 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:07 PM, Richard Sandiford wrote:
> old_insns_match_p just tests whether two instructions are
> similar enough to merge.  With call_insn_abi it makes more
> sense to compare the ABIs directly.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* cfgcleanup.c (old_insns_match_p): Compare the ABIs of calls
> 	instead of the call-clobbered sets.
OK
jeff


* Re: [11/32] Remove global call sets: cse.c
  2019-09-25 15:57   ` Richard Sandiford
@ 2019-09-29 21:04     ` Jeff Law
  2019-09-30 16:23       ` Richard Sandiford
  0 siblings, 1 reply; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:04 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/25/19 9:57 AM, Richard Sandiford wrote:
> Richard Sandiford <richard.sandiford@arm.com> writes:
>> Like with the combine.c patch, this one keeps things simple by
>> invalidating values in partially-clobbered registers, rather than
>> trying to tell whether the value in a partially-clobbered register
>> is actually clobbered or not.  Again, this is in principle a bug fix,
>> but probably never matters in practice.
> 
> Similary to the combine patch, I've updated this to avoid the
> short "abi" name and use a temporary HARD_REG_SET instead.
> 
> Richard
> 
> 
> 2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* cse.c: Include regs.h and function-abi.h.
> 	(invalidate_for_call): Take the call insn as an argument.
> 	Use insn_callee_abi to get the ABI of the call and invalidate
> 	partially clobbered registers as well as fully clobbered ones.
> 	(cse_insn): Update call accordingly.
> 
> Index: gcc/cse.c
> ===================================================================
> --- gcc/cse.c	2019-09-17 15:27:11.338066929 +0100
> +++ gcc/cse.c	2019-09-25 16:55:31.202641509 +0100
> @@ -42,6 +42,8 @@ Software Foundation; either version 3, o
>  #include "tree-pass.h"
>  #include "dbgcnt.h"
>  #include "rtl-iter.h"
> +#include "regs.h"
> +#include "function-abi.h"
>  
>  /* The basic idea of common subexpression elimination is to go
>     through the code, keeping a record of expressions that would
> @@ -566,7 +568,6 @@ static void remove_invalid_subreg_refs (
>  					machine_mode);
>  static void rehash_using_reg (rtx);
>  static void invalidate_memory (void);
> -static void invalidate_for_call (void);
>  static rtx use_related_value (rtx, struct table_elt *);
>  
>  static inline unsigned canon_hash (rtx, machine_mode);
> @@ -2091,23 +2092,29 @@ rehash_using_reg (rtx x)
>  }
>  \f
>  /* Remove from the hash table any expression that is a call-clobbered
> -   register.  Also update their TICK values.  */
> +   register in INSN.  Also update their TICK values.  */
>  
>  static void
> -invalidate_for_call (void)
> +invalidate_for_call (rtx_insn *insn)
>  {
> -  unsigned int regno, endregno;
> -  unsigned int i;
> +  unsigned int regno;
>    unsigned hash;
>    struct table_elt *p, *next;
>    int in_table = 0;
>    hard_reg_set_iterator hrsi;
>  
> -  /* Go through all the hard registers.  For each that is clobbered in
> -     a CALL_INSN, remove the register from quantity chains and update
> +  /* Go through all the hard registers.  For each that might be clobbered
> +     in call insn INSN, remove the register from quantity chains and update
>       reg_tick if defined.  Also see if any of these registers is currently
> -     in the table.  */
> -  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, regno, hrsi)
> +     in the table.
> +
> +     ??? We could be more precise for partially-clobbered registers,
> +     and only invalidate values that actually occupy the clobbered part
> +     of the registers.  It doesn't seem worth the effort though, since
> +     we shouldn't see this situation much before RA.  */
> +  HARD_REG_SET callee_clobbers
> +    = insn_callee_abi (insn).full_and_partial_reg_clobbers ();
> +  EXECUTE_IF_SET_IN_HARD_REG_SET (callee_clobbers, 0, regno, hrsi)
>      {
>        delete_reg_equiv (regno);
>        if (REG_TICK (regno) >= 0)
> @@ -2132,15 +2139,11 @@ invalidate_for_call (void)
>  	      || REGNO (p->exp) >= FIRST_PSEUDO_REGISTER)
>  	    continue;
>  
> -	  regno = REGNO (p->exp);
> -	  endregno = END_REGNO (p->exp);
> -
> -	  for (i = regno; i < endregno; i++)
> -	    if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i))
> -	      {
> -		remove_from_table (p, hash);
> -		break;
> -	      }
> +	  /* This must use the same test as above rather than the
> +	     more accurate clobbers_reg_p.  */
> +	  if (overlaps_hard_reg_set_p (callee_clobbers, GET_MODE (p->exp),
> +				       REGNO (p->exp)))
Is it worth adding a comment at the earlier test pointing forward to the
later one, to help ensure they're kept in sync?

OK with or without that comment update.


* Re: [12/32] Remove global call sets: cselib.c
  2019-09-11 19:09 ` [12/32] Remove global call sets: cselib.c Richard Sandiford
@ 2019-09-29 21:05   ` Jeff Law
  2019-10-29  9:20     ` Martin Liška
  0 siblings, 1 reply; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:05 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:09 PM, Richard Sandiford wrote:
> cselib_invalidate_regno is a no-op if REG_VALUES (i) is null,
> so we can check that first.  Then, if we know what mode the register
> currently has, we can check whether it's clobbered in that mode.
> 
> Using GET_MODE (values->elt->val_rtx) to get the mode of the last
> set is taken from cselib_reg_set_mode.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* cselib.c (cselib_process_insn): If we know what mode a
> 	register was set in, check whether it is clobbered in that
> 	mode by a call.  Only fall back to reg_raw_mode if that fails.
OK
jeff


* Re: [14/32] Remove global call sets: DF (entry/exit defs)
  2019-09-11 19:10 ` [14/32] Remove global call sets: DF (entry/exit defs) Richard Sandiford
@ 2019-09-29 21:07   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:07 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:10 PM, Richard Sandiford wrote:
> The code patched here is seeing whether the current function
> needs to save at least part of a register before using it.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* df-scan.c (df_get_entry_block_def_set): Use crtl->abi to test
> 	whether the current function needs to save at least part of a
> 	register before using it.
> 	(df_get_exit_block_use_set): Likewise for epilogue restores.
> 
OK
jeff


* Re: [13/32] Remove global call sets: DF (EH edges)
  2019-09-11 19:10 ` [13/32] Remove global call sets: DF (EH edges) Richard Sandiford
@ 2019-09-29 21:07   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:07 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:10 PM, Richard Sandiford wrote:
> The DF dense_invalidated_by_call and sparse_invalidated_by_call
> sets are actually only used on EH edges, and so are more the set
> of registers that are invalidated by a taken EH edge.  Under the
> new order, that means that they describe eh_edge_abi.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* df-problems.c: Include regs.h and function-abi.h.
> 	(df_rd_problem_data): Rename sparse_invalidated_by_call to
> 	sparse_invalidated_by_eh and dense_invalidated_by_call to
> 	dense_invalidated_by_eh.
> 	(df_print_bb_index): Update accordingly.
> 	(df_rd_alloc, df_rd_start_dump, df_rd_confluence_n): Likewise.
> 	(df_lr_confluence_n): Use eh_edge_abi to get the set of registers
> 	that are clobbered by an EH edge.  Clobber partially-clobbered
> 	registers as well as fully-clobbered ones.
> 	(df_md_confluence_n): Likewise.
> 	(df_rd_local_compute): Likewise.  Update for changes to
> 	df_rd_problem_data.
> 	* df-scan.c (df_scan_start_dump): Use eh_edge_abi to get the set
> 	of registers that are clobbered by an EH edge.  Include partially-
> 	clobbered registers as well as fully-clobbered ones.
> 
OK
jeff


* Re: [15/32] Remove global call sets: early-remat.c
  2019-09-11 19:11 ` [15/32] Remove global call sets: early-remat.c Richard Sandiford
@ 2019-09-29 21:09   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:09 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:10 PM, Richard Sandiford wrote:
> This pass previously excluded rematerialisation candidates if they
> clobbered a call-preserved register, on the basis that it then
> wouldn't be safe to add new instances of the candidate instruction
> after a call.  This patch instead makes the decision on a call-by-call
> basis.
> 
> The second emit_remat_insns_for_block hunk probably isn't needed,
> but it seems safer and more consistent to have it, so that every call
> to emit_remat_insns is preceded by a check for invalid clobbers.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* early-remat.c: Include regs.h and function-abi.h.
> 	(early_remat::maybe_add_candidate): Don't check for call-clobbered
> 	registers here.
> 	(early_remat::restrict_remat_for_unavail_regs): New function.
> 	(early_remat::restrict_remat_for_call): Likewise.
> 	(early_remat::process_block): Before calling emit_remat_insns
> 	for a previous call in the block, invalidate any candidates
> 	that would clobber call-preserved registers.
> 	(early_remat::emit_remat_insns_for_block): Likewise for the
> 	final call in a block.  Do the same thing for live-in registers
> 	when calling emit_remat_insns at the head of a block.
> 
OK
jeff


* Re: [17/32] Remove global call sets: gcse.c
  2019-09-11 19:11 ` [17/32] Remove global call sets: gcse.c Richard Sandiford
  2019-09-25 16:04   ` Richard Sandiford
@ 2019-09-29 21:10   ` Jeff Law
  1 sibling, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:10 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:11 PM, Richard Sandiford wrote:
> This is another case in which we can conservatively treat partial
> kills as full kills.  Again this is in principle a bug fix for
> TARGET_HARD_REGNO_CALL_PART_CLOBBERED targets, but in practice
> it probably doesn't make a difference.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* gcse.c: Include function-abi.h.
> 	(compute_hash_table_work): Use call_insn_abi to get the ABI of
> 	the call insn target.  Invalidate partially call-clobbered
> 	registers as well as fully call-clobbered ones.
> 
OK
jeff


* Re: [16/32] Remove global call sets: function.c
  2019-09-11 19:11 ` [16/32] Remove global call sets: function.c Richard Sandiford
@ 2019-09-29 21:10   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:10 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:11 PM, Richard Sandiford wrote:
> Whatever the rights and wrongs of the way aggregate_value_p
> handles call-preserved registers, it's a de facto part of the ABI,
> so we shouldn't change it.  The patch simply extends the current
> approach to whatever call-preserved set the function happens to
> be using.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* function.c (aggregate_value_p): Work out which ABI the
> 	function is using before testing which registers are at least
> 	partly preserved by a call.
> 
OK
jeff


* Re: [18/32] Remove global call sets: haifa-sched.c
  2019-09-11 19:12 ` [18/32] Remove global call sets: haifa-sched.c Richard Sandiford
@ 2019-09-29 21:11   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:11 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:12 PM, Richard Sandiford wrote:
> The code patched here is counting how many registers the current
> function would need to save in the prologue before it uses them.
> The code is called per function, so using crtl is OK.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* haifa-sched.c: Include function-abi.h.
> 	(alloc_global_sched_pressure_data): Use crtl->abi to check whether
> 	the function would need to save a register before using it.
> 
OK
jeff


* Re: [20/32] Remove global call sets: loop-iv.c
  2019-09-11 19:13 ` [20/32] Remove global call sets: loop-iv.c Richard Sandiford
@ 2019-09-29 21:20   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:20 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:13 PM, Richard Sandiford wrote:
> Similar idea to the combine.c and gcse.c patches.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* loop-iv.c: Include regs.h and function-abi.h.
> 	(simplify_using_initial_values): Use call_insn_abi to get the
> 	ABI of the call insn target.  Conservatively assume that
> 	partially-clobbered registers are altered.
OK
jeff


* Re: [24/32] Remove global call sets: recog.c
  2019-09-11 19:15 ` [24/32] Remove global call sets: recog.c Richard Sandiford
@ 2019-09-29 21:33   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:33 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:15 PM, Richard Sandiford wrote:
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* recog.c: Include function-abi.h.
> 	(peep2_find_free_register): Use crtl->abi when deciding whether
> 	a register is free for use after RA.
OK
jeff


* Re: [22/32] Remove global call sets: postreload.c
  2019-09-11 19:14 ` [22/32] Remove global call sets: postreload.c Richard Sandiford
@ 2019-09-29 21:33   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:33 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:14 PM, Richard Sandiford wrote:
> The "|= fixed_regs" in reload_combine isn't necessary, since the
> set is only used to determine which values have changed (rather than,
> for example, which registers are available for use).
> 
> In reload_cse_move2add we can be accurate about which registers
> are still available.  BLKmode indicates a continuation of the
> previous register, and since clobbers_reg_p handles multi-register
> values, it's enough to skip over BLKmode entries and just test the
> start register.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* postreload.c (reload_combine_recognize_pattern): Use crtl->abi
> 	when deciding whether a register is free for use after RA.
> 	(reload_combine): Remove unnecessary use of fixed_reg_set.
> 	(reload_cse_move2add): Use call_insn_abi to get the ABI of the
> 	call insn target.  Use reg_mode when testing whether a register
> 	is no longer available.
OK
jeff


* Re: [25/32] Remove global call sets: regcprop.c
  2019-09-11 19:15 ` [25/32] Remove global call sets: regcprop.c Richard Sandiford
@ 2019-09-29 21:34   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 21:34 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:15 PM, Richard Sandiford wrote:
> This is a direct replacement of an existing test for fully and
> partially clobbered registers.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* regcprop.c (copyprop_hardreg_forward_1): Use the recorded
> 	mode of the register when deciding whether it is no longer
> 	available after a call.
> 
OK
jeff


* Re: [29/32] Remove global call sets: sched-deps.c
  2019-09-11 19:17 ` [29/32] Remove global call sets: sched-deps.c Richard Sandiford
@ 2019-09-29 22:20   ` Jeff Law
  2019-10-04 14:32     ` Christophe Lyon
  0 siblings, 1 reply; 100+ messages in thread
From: Jeff Law @ 2019-09-29 22:20 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:17 PM, Richard Sandiford wrote:
> This is a straight replacement of an existing "full or partial"
> call-clobber check.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* sched-deps.c (deps_analyze_insn): Use the ABI of the target
> 	function to test whether a register is fully or partly clobbered.
OK
jeff


* Re: [31/32] Remove global call sets: shrink-wrap.c
  2019-09-11 19:18 ` [31/32] Remove global call sets: shrink-wrap.c Richard Sandiford
@ 2019-09-29 22:21   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 22:21 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:18 PM, Richard Sandiford wrote:
> This is a straight replacement of "calls we can clobber without saving
> them first".
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* shrink-wrap.c: Include function-abi.h.
> 	(requires_stack_frame_p): Use crtl->abi to test whether the
> 	current function can use a register without saving it first.
OK
jeff


* Re: [00/32] Remove global call sets: rtlanal.c
  2019-09-11 19:17 ` [00/32] Remove global call sets: rtlanal.c Richard Sandiford
@ 2019-09-29 22:21   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 22:21 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:17 PM, Richard Sandiford wrote:
> The reg_set_p part is simple, since the caller is asking about
> a specific REG rtx, with a known register number and mode.
> 
> The find_all_hard_reg_sets part emphasises that the "implicit"
> behaviour was always a bit suspect, since it includes fully-clobbered
> registers but not partially-clobbered registers.  The only current
> user of this path is the c6x-specific scheduler predication code,
> and c6x doesn't have partly call-clobbered registers, so in practice
> it's fine.  I've added a comment to try to dissuade future users.
> (The !implicit path is OK and useful though.)
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* rtlanal.c: Include function-abi.h.
> 	(reg_set_p): Use call_insn_abi to get the ABI of the called
> 	function and clobbers_reg_p to test whether the register
> 	is call-clobbered.
> 	(find_all_hard_reg_sets): When implicit is true, use call_insn_abi
> 	to get the ABI of the called function and full_reg_clobbers to
> 	get the set of fully call-clobbered registers.  Warn about the
> 	pitfalls of using this mode.
OK
jeff


* Re: [23/32] Remove global call sets: postreload-gcse.c
  2019-09-25 16:08   ` Richard Sandiford
@ 2019-09-29 22:22     ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 22:22 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/25/19 10:08 AM, Richard Sandiford wrote:
> Richard Sandiford <richard.sandiford@arm.com> writes:
>> This is another case in which we should conservatively treat
>> partial kills as full kills.
> 
> Similarly to the combine patch, I've updated this to avoid the
> short "abi" name and use a temporary HARD_REG_SET instead.
> 
> Richard
> 
> 
> 2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* postreload-gcse.c: Include regs.h and function-abi.h.
> 	(record_opr_changes): Use insn_callee_abi to get the ABI of the
> 	call insn target.  Conservatively assume that partially-clobbered
> 	registers are altered.
OK
jeff


* Re: [32/32] Hide regs_invalidated_by_call etc.
  2019-09-11 19:19 ` [32/32] Hide regs_invalidated_by_call etc Richard Sandiford
@ 2019-09-29 22:22   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 22:22 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:19 PM, Richard Sandiford wrote:
> The previous patches removed all target-independent uses of
> regs_invalidated_by_call, call_used_or_fixed_regs and
> call_used_or_fixed_reg_p.  This patch therefore restricts
> them to target-specific code (and reginfo.c, which sets them up).
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* hard-reg-set.h (regs_invalidated_by_call): Only define if
> 	IN_TARGET_CODE.
> 	(call_used_or_fixed_regs): Likewise.
> 	(call_used_or_fixed_reg_p): Likewise.
> 	* reginfo.c (regs_invalidated_by_call): New macro.
OK.
jeff


* Re: [26/32] Remove global call sets: regrename.c
  2019-09-11 19:16 ` [26/32] Remove global call sets: regrename.c Richard Sandiford
@ 2019-09-29 22:25   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 22:25 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:16 PM, Richard Sandiford wrote:
> This patch makes regrename use a similar mask-and-clobber-set
> pair to IRA when tracking whether registers are clobbered by
> calls in a region.  Testing for a nonzero ABI mask is equivalent
> to testing for a register that crosses a call.
> 
> Since AArch64 and c6x use regrename.h, they need to be updated
> to include function-abi.h first.  AIUI this is preferred over
> including function-abi.h in regrename.h.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* regrename.h (du_head::call_clobber_mask): New field.
> 	(du_head::need_caller_save_reg): Replace with...
> 	(du_head::call_abis): ...this new field.
> 	* regrename.c: Include function-abi.h.
> 	(call_clobbered_in_chain_p): New function.
> 	(check_new_reg_p): Use crtl->abi when deciding whether a register
> 	is free for use after RA.  Use call_clobbered_in_chain_p to test
> 	whether a candidate register would be clobbered by a call.
> 	(find_rename_reg): Don't add call-clobber conflicts here.
> 	(rename_chains): Check call_abis instead of need_caller_save_reg.
> 	(merge_chains): Update for changes to du_head.
> 	(build_def_use): Use call_insn_abi to get the ABI of the call insn
> 	target.  Record the ABI identifier in call_abis and the set of
> 	fully or partially clobbered registers in call_clobber_mask.
> 	Add fully-clobbered registers to hard_conflicts here rather
> 	than in find_rename_reg.
> 	* config/aarch64/cortex-a57-fma-steering.c: Include function-abi.h.
> 	(rename_single_chain): Check call_abis instead of need_caller_save_reg.
> 	* config/aarch64/falkor-tag-collision-avoidance.c: Include
> 	function-abi.h.
> 	* config/c6x/c6x.c: Likewise.
OK
jeff


* Re: [27/32] Remove global call sets: reload.c
  2019-09-11 19:16 ` [27/32] Remove global call sets: reload.c Richard Sandiford
@ 2019-09-29 22:26   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-29 22:26 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:16 PM, Richard Sandiford wrote:
> The inheritance code in find_equiv_reg can use clobbers_reg_p
> to test whether a call clobbers either of the equivalent registers.
> 
> reload and find_reg use crtl->abi to test whether a register needs
> to be saved in the prologue before use.
> 
> reload_as_needed can use full_and_partial_reg_clobbers and thus
> avoid needing to keep its own record of which registers are part
> call-clobbered.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* reload.c: Include function-abi.h.
> 	(find_equiv_reg): Use clobbers_reg_p to test whether either
> 	of the equivalent registers is clobbered by a call.
> 	* reload1.c: Include function-abi.h.
> 	(reg_reloaded_call_part_clobbered): Delete.
> 	(reload): Use crtl->abi to test which registers would need
> 	saving in the prologue before use.
> 	(find_reg): Likewise.
> 	(emit_reload_insns): Remove code for reg_reloaded_call_part_clobbered.
> 	(reload_as_needed): Likewise.  Use full_and_partial_reg_clobbers
> 	instead of call_used_or_fixed_regs | reg_reloaded_call_part_clobbered.
> 
OK
jeff


* Re: [10/32] Remove global call sets: combine.c
  2019-09-25 15:52         ` Richard Sandiford
  2019-09-25 16:30           ` Segher Boessenkool
@ 2019-09-29 22:32           ` Jeff Law
  2019-09-29 22:43             ` Segher Boessenkool
  1 sibling, 1 reply; 100+ messages in thread
From: Jeff Law @ 2019-09-29 22:32 UTC (permalink / raw)
  To: Segher Boessenkool, gcc-patches, richard.sandiford

On 9/25/19 9:52 AM, Richard Sandiford wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
>> Hi Richard,
>>
>> Sorry this took me so long to get back to.
>>
>> On Thu, Sep 12, 2019 at 08:51:59AM +0100, Richard Sandiford wrote:
>>> Segher Boessenkool <segher@kernel.crashing.org> writes:
>>>> On Wed, Sep 11, 2019 at 08:08:38PM +0100, Richard Sandiford wrote:
>>>>>        hard_reg_set_iterator hrsi;
>>>>> -      EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, i, hrsi)
>>>>> +      EXECUTE_IF_SET_IN_HARD_REG_SET (abi.full_and_partial_reg_clobbers (),
>>>>> +				      0, i, hrsi)
>>>>
>>>> So "abi" in that means calls?
>>>
>>> "abi" is the interface of the callee function, taking things like
>>> function attributes and -fipa-ra into account.
>>>
>>> The register sets are describing what the callee does rather than
>>> what calls to it do.  E.g. on targets that allow linker stubs to be
>>> inserted between calls, the scratch registers reserved for linker stubs
>>> are still call-clobbered, even if the target of the call doesn't use
>>> them.  (Those call clobbers are represented separately, at least when
>>> TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS is true.  When it's
>>> false we don't use -fipa-ra information at all.)
>>>
>>>> It is not such a great name like that.  Since its children are
>>>> very_long_names, it doesn't need to be only three chars itself,
>>>> either?
>>>
>>> OK, what name would you prefer?
>>
>> Maybe call_abi is a good name?  It's difficult to capture the subtleties
>> in a short enough name.  As always :-)
> 
> The formatting ended up being a bit weird with a longer name,
> so how about the attached instead?
> 
> Richard
> 
> 
> 2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* combine.c: Include function-abi.h.
> 	(record_dead_and_set_regs): Use insn_callee_abi to get the ABI
> 	of the target of call insns.  Invalidate partially-clobbered
> 	registers as well as fully-clobbered ones.
OK if Segher doesn't object.

jeff
> 


* Re: [10/32] Remove global call sets: combine.c
  2019-09-29 22:32           ` Jeff Law
@ 2019-09-29 22:43             ` Segher Boessenkool
  0 siblings, 0 replies; 100+ messages in thread
From: Segher Boessenkool @ 2019-09-29 22:43 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches, richard.sandiford

On Sun, Sep 29, 2019 at 04:32:13PM -0600, Jeff Law wrote:
> On 9/25/19 9:52 AM, Richard Sandiford wrote:
> > gcc/
> > 	* combine.c: Include function-abi.h.
> > 	(record_dead_and_set_regs): Use insn_callee_abi to get the ABI
> > 	of the target of call insns.  Invalidate partially-clobbered
> > 	registers as well as fully-clobbered ones.
> OK if Segher doesn't object.

https://gcc.gnu.org/ml/gcc-patches/2019-09/msg01462.html

"That looks great, thanks!"

:-)


Segher


* Re: [01/32] Add function_abi.{h,cc}
  2019-09-29 20:51   ` Jeff Law
@ 2019-09-30  9:19     ` Richard Sandiford
  2019-09-30 21:16       ` Jeff Law
  0 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-30  9:19 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

Hi Jeff,

Thanks as always for the reviews :-)

Jeff Law <law@redhat.com> writes:
> On 9/11/19 1:03 PM, Richard Sandiford wrote:
>> This patch adds new structures and functions for handling
>> multiple ABIs in a translation unit.  The structures are:
>> 
>> - predefined_function_abi: describes a static, predefined ABI
>> - function_abi: describes either a predefined ABI or a local
>>   variant of one (e.g. taking -fipa-ra into account)
>> 
>> The patch adds functions for getting the ABI from a given type
>> or decl; a later patch will also add a function for getting the
>> ABI of the target of a call insn.
>> 
>> Although ABIs are about much more than call-clobber/saved choices,
>> I wanted to keep the name general in case we add more ABI-related
>> information in future.
>> 
>> 
>> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
>> 
>> gcc/
>> 	* Makefile.in (OBJS): Add function-abi.o.
>> 	(GTFILES): Add function-abi.h.
>> 	* function-abi.cc: New file.
>> 	* function-abi.h: Likewise.
>> 	* emit-rtl.h (rtl_data::abi): New field.
>> 	* function.c: Include function-abi.h.
>> 	(prepare_function_start): Initialize crtl->abi.
>> 	* read-rtl-function.c: Include regs.h and function-abi.h.
>> 	(read_rtl_function_body): Initialize crtl->abi.
>> 	(read_rtl_function_body_from_file_range): Likewise.
>> 	* reginfo.c: Include function-abi.h.
>> 	(init_reg_sets_1): Initialize default_function_abi.
>> 	(globalize_reg): Call add_full_reg_clobber for each predefined ABI
>> 	when making a register global.
>> 	* target-globals.h (this_target_function_abi_info): Declare.
>> 	(target_globals::function_abi_info): New field.
>> 	(restore_target_globals): Copy it.
>> 	* target-globals.c: Include function-abi.h.
>> 	(default_target_globals): Initialize the function_abi_info field.
>> 	(target_globals): Allocate it.
>> 	(save_target_globals): Free it.
> So no problem with this as-is.  Questions though:
>
> 1. Do we need to stream this information for LTO?

At the moment this is all derived information rather than something we need
to stream directly.  E.g. the set of available predefined_function_abis
really only depends on command-line flags.  The mapping from functions
to predefined_function_abis currently depends only on the function type,
so streaming the type is enough to recover the ABI too.  function_abi
additionally depends on RTL stuff that doesn't affect LTO.

> 2. Do we need to support it for the RTL front-end, even if primarily for
> testing purposes?

Yeah, I guess it could be useful to be able to pretend that a function
is defined locally with a certain -fipa-ra clobber set, but I think in
most cases it'd be possible to do this using:

  void __attribute__ ((noinline, noclone))
  callee (void)
  {
    asm ("" ::: ...regs...);
  }

(That's the kind of test I've used in the support for the SVE PCS FWIW,
not posted yet.)

Richard


* Re: [30/32] Remove global call sets: sel-sched.c
  2019-09-11 19:18 ` [30/32] Remove global call sets: sel-sched.c Richard Sandiford
@ 2019-09-30 15:08   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-30 15:08 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:18 PM, Richard Sandiford wrote:
> The main change here is to replace a crosses_call boolean with
> a bitmask of the ABIs used by the crossed calls.  For space reasons,
> I didn't also add a HARD_REG_SET that tracks the set of registers
> that are actually clobbered, which means that this is the one part
> of the series that doesn't benefit from -fipa-ra.  The existing
> FIXME suggests that the current structures aren't the preferred
> way of representing this anyhow, and the pass already makes
> conservative assumptions about call-crossing registers.
> 
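The boolean-to-bitmask change can be illustrated with a self-contained sketch.  The types and the abi_id numbering here are illustrative stand-ins, not GCC's actual sel-sched structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Each ABI variant gets a small identifier; a def that crosses calls
   records the union of the identifiers of those calls' ABIs as a
   bitmask instead of a single crosses_call boolean.  */
typedef uint32_t abi_mask_t;

struct def_info
{
  abi_mask_t crossed_call_abis;   /* Replaces "bool crosses_call".  */
};

/* Record that DEF crosses a call whose target uses ABI_ID.  */
static void
note_crossed_call (struct def_info *def, unsigned abi_id)
{
  def->crossed_call_abis |= (abi_mask_t) 1 << abi_id;
}

/* The old boolean is recoverable as "mask != 0", which is why testing
   for a nonzero ABI mask is equivalent to testing for a call crossing.  */
static bool
crosses_call_p (const struct def_info *def)
{
  return def->crossed_call_abis != 0;
}
```

The mask costs no more space than the boolean it replaces, while letting later consumers ask which ABIs were crossed rather than merely whether any call was.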
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* sel-sched-ir.h (_def::crosses_call): Replace with...
> 	(_def::crossed_call_abis): ..this new field.
> 	(def_list_add): Take a mask of ABIs instead of a crosses_call
> 	boolean.
> 	* sel-sched-ir.c (def_list_add): Likewise.  Update initialization
> 	of _def accordingly.
> 	* sel-sched.c: Include function-abi.h.
> 	(hard_regs_data::regs_for_call_clobbered): Delete.
> 	(reg_rename::crosses_call): Replace with...
> 	(reg_rename::crossed_call_abis): ...this new field.
> 	(fur_static_params::crosses_call): Replace with...
> 	(fur_static_params::crossed_call_abis): ...this new field.
> 	(init_regs_for_mode): Don't initialize sel_hrd.regs_for_call_clobbered.
> 	(init_hard_regs_data): Use crtl->abi to test which registers the
> 	current function would need to save before it uses them.
> 	(mark_unavailable_hard_regs): Update handling of call-clobbered
> 	registers, using call_clobbers_in_region to find out which registers
> 	might be call-clobbered (but without taking -fipa-ra into account
> 	for now).  Remove separate handling of partially call-clobbered
> 	registers.
> 	(verify_target_availability): Use crossed_call_abis instead of
> 	crosses_call.
> 	(get_spec_check_type_for_insn, find_used_regs): Likewise.
> 	(fur_orig_expr_found, fur_on_enter, fur_orig_expr_not_found): Likewise.
> 
OK
jeff


* Re: [19/32] Remove global call sets: IRA
  2019-09-11 19:12 ` [19/32] Remove global call sets: IRA Richard Sandiford
@ 2019-09-30 15:16   ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-30 15:16 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:12 PM, Richard Sandiford wrote:
> For -fipa-ra, IRA already keeps track of which specific registers
> are call-clobbered in a region, rather than using global information.
> The patch generalises this so that it tracks which ABIs are used
> by calls in the region.
> 
> We can then use the new ABI descriptors to handle partially-clobbered
> registers in the same way as fully-clobbered registers, without having
> special code for targetm.hard_regno_call_part_clobbered.  This in turn
> makes -fipa-ra work for partially-clobbered registers too.
> 
> A side-effect of allowing multiple ABIs is that we no longer have
> an obvious set of conflicting registers for the self-described
> "fragile hack" in ira-constraints.c.  This code kicks in for
> user-defined registers that aren't live across a call at -O0,
> and it tries to avoid allocating a call-clobbered register to them.
> Here I've used the set of call-clobbered registers in the current
> function's ABI, applying on top of any registers that are clobbered by
> called functions.  This is enough to keep gcc.dg/debug/dwarf2/pr5948.c
> happy.
> 
> The handling of GENERIC_STACK_CHECK in do_reload seemed to have
> a reversed condition:
> 
>       for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
> 	if (df_regs_ever_live_p (i)
> 	    && !fixed_regs[i]
> 	    && call_used_or_fixed_reg_p (i))
> 	  size += UNITS_PER_WORD;
> 
> The final part of the condition counts registers that don't need to be
> saved in the prologue, but I think the opposite was intended.
Agreed.  Given it's just used to emit a diagnostic and that in reality
it's only used for Ada, I'm confident this code isn't getting exercised
in any significant way right now.
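For reference, the intended (flipped) test counts the registers that do need a prologue save.  This is a self-contained sketch with stand-in arrays; the committed fix additionally consults crtl->abi rather than a plain call-used array:

```c
#include <assert.h>
#include <stdbool.h>

#define FIRST_PSEUDO_REGISTER 8
#define UNITS_PER_WORD 8

static bool df_regs_ever_live[FIRST_PSEUDO_REGISTER];
static bool fixed_regs[FIRST_PSEUDO_REGISTER];
static bool call_used[FIRST_PSEUDO_REGISTER];

/* A register contributes to the static stack-size estimate when it is
   used, is not fixed, and is *not* call-clobbered: such a register must
   be saved in the prologue.  The original code tested
   call_used_or_fixed_reg_p (i) without the negation, counting exactly
   the registers that need no save.  */
static int
prologue_save_size (void)
{
  int size = 0;
  for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
    if (df_regs_ever_live[i] && !fixed_regs[i] && !call_used[i])
      size += UNITS_PER_WORD;
  return size;
}
```

With registers 1-3 live, register 3 fixed, and register 1 call-clobbered, only register 2 needs a prologue save, so one word is counted.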

> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* function-abi.h (call_clobbers_in_region): Declare.
> 	(call_clobbered_in_region_p): New function.
> 	* function-abi.cc (call_clobbers_in_region): Likewise.
> 	* ira-int.h: Include function-abi.h.
> 	(ira_allocno::crossed_calls_abis): New field.
> 	(ALLOCNO_CROSSED_CALLS_ABIS): New macro.
> 	(ira_need_caller_save_regs): New function.
> 	(ira_need_caller_save_p): Likewise.
> 	* ira.c (setup_reg_renumber): Use ira_need_caller_save_p instead
> 	of call_used_or_fixed_regs.
> 	(do_reload): Use crtl->abi to test whether the current function
> 	needs to save a register in the prologue.  Count registers that
> 	need to be saved rather than registers that don't.
> 	* ira-build.c (create_cap_allocno): Copy ALLOCNO_CROSSED_CALLS_ABIS.
> 	Remove unnecessary | from ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
> 	(propagate_allocno_info): Merge ALLOCNO_CROSSED_CALLS_ABIS too.
> 	(propagate_some_info_from_allocno): Likewise.
> 	(copy_info_to_removed_store_destinations): Likewise.
> 	(ira_flattening): Say that ALLOCNO_CROSSED_CALLS_ABIS and
> 	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS are handled conservatively.
> 	(ira_build): Use ira_need_caller_save_regs instead of
> 	call_used_or_fixed_regs.
> 	* ira-color.c (calculate_saved_nregs): Use crtl->abi to test
> 	whether the current function would need to save a register
> 	before using it.
> 	(calculate_spill_cost): Likewise.
> 	(allocno_reload_assign): Use ira_need_caller_save_regs and
> 	ira_need_caller_save_p instead of call_used_or_fixed_regs.
> 	* ira-conflicts.c (ira_build_conflicts): Use
> 	ira_need_caller_save_regs rather than call_used_or_fixed_regs
> 	as the set of call-clobbered registers.  Remove the
> 	call_used_or_fixed_regs mask from the calculation of
> 	temp_hard_reg_set and mask its use instead.  Remove special
> 	handling of partially-clobbered registers.
> 	* ira-costs.c (ira_tune_allocno_costs): Use ira_need_caller_save_p.
> 	* ira-lives.c (process_bb_node_lives): Use mode_clobbers to
> 	calculate the set of conflicting registers for calls that
> 	can throw.  Record the ABIs of calls in ALLOCNO_CROSSED_CALLS_ABIS.
> 	Use full_and_partial_reg_clobbers rather than full_reg_clobbers
> 	for the calculation of ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
> 	Use eh_edge_abi to calculate the set of registers that could
> 	be clobbered by an EH edge.  Include partially-clobbered as
> 	well as fully-clobbered registers.
OK
jeff


* Re: [21/32] Remove global call sets: LRA
  2019-09-11 19:14 ` [21/32] Remove global call sets: LRA Richard Sandiford
@ 2019-09-30 15:29   ` Jeff Law
  2019-10-04 18:03   ` H.J. Lu
  1 sibling, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-30 15:29 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/11/19 1:14 PM, Richard Sandiford wrote:
> lra_reg has an actual_call_used_reg_set field that is only used during
> inheritance.  This in turn required a special lra_create_live_ranges
> pass for flag_ipa_ra to set up this field.  This patch instead makes
> the inheritance code do its own live register tracking, using the
> same ABI-mask-and-clobber-set pair as for IRA.
> 
> Tracking ABIs simplifies (and cheapens) the logic in lra-lives.c and
> means we no longer need a separate path for -fipa-ra.  It also means
> we can remove TARGET_RETURN_CALL_WITH_MAX_CLOBBERS.
> 
> The patch also strengthens the sanity check in lra_assigns so that
> we check that reg_renumber is consistent with the whole conflict set,
> not just the call-clobbered registers.
> 
> 
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* target.def (return_call_with_max_clobbers): Delete.
> 	* doc/tm.texi.in (TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): Delete.
> 	* doc/tm.texi: Regenerate.
> 	* config/aarch64/aarch64.c (aarch64_return_call_with_max_clobbers)
> 	(TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): Delete.
> 	* lra-int.h (lra_reg::actual_call_used_reg_set): Delete.
> 	(lra_reg::call_insn): Delete.
> 	* lra.c: Include function-abi.h.
> 	(initialize_lra_reg_info_element): Don't initialize the fields above.
> 	(lra): Use crtl->abi to test whether the current function needs to
> 	save a register in the prologue.  Remove special pre-inheritance
> 	lra_create_live_ranges pass for flag_ipa_ra.
> 	* lra-assigns.c: Include function-abi.h
> 	(find_hard_regno_for_1): Use crtl->abi to test whether the current
> 	function needs to save a register in the prologue.
> 	(lra_assign): Assert that registers aren't allocated to a
> 	conflicting register, rather than checking only for overlaps
> 	with call_used_or_fixed_regs.  Do this even for flag_ipa_ra,
> 	and for registers that are not live across a call.
> 	* lra-constraints.c (last_call_for_abi): New variable.
> 	(full_and_partial_call_clobbers): Likewise.
> 	(setup_next_usage_insn): Remove the register from
> 	full_and_partial_call_clobbers.
> 	(need_for_call_save_p): Use call_clobbered_in_region_p to test
> 	whether the register needs a caller save.
> 	(need_for_split_p): Use full_and_partial_reg_clobbers instead
> 	of call_used_or_fixed_regs.
> 	(inherit_in_ebb): Initialize and maintain last_call_for_abi and
> 	full_and_partial_call_clobbers.
> 	* lra-lives.c (check_pseudos_live_through_calls): Replace
> 	last_call_used_reg_set and call_insn arguments with an abi argument.
> 	Remove handling of lra_reg::call_insn.  Use function_abi::mode_clobbers
> 	as the set of conflicting registers.
> 	(calls_have_same_clobbers_p): Delete.
> 	(process_bb_lives): Track the ABI of the last call instead of an
> 	insn/HARD_REG_SET pair.  Update calls to
> 	check_pseudos_live_through_calls.  Use eh_edge_abi to calculate
> 	the set of registers that could be clobbered by an EH edge.
> 	Include partially-clobbered as well as fully-clobbered registers.
> 	(lra_create_live_ranges_1): Don't initialize lra_reg::call_insn.
> 	* lra-remat.c: Include function-abi.h.
> 	(call_used_regs_arr_len, call_used_regs_arr): Delete.
> 	(set_bb_regs): Use call_insn_abi to get the set of call-clobbered
> 	registers and bitmap_view to combine them into dead_regs.
> 	(call_used_input_regno_present_p): Take a function_abi argument
> 	and use it to test whether a register is call-clobbered.
> 	(calculate_gen_cands): Use call_insn_abi to get the ABI of the
> 	call insn target.  Update the call to call_used_input_regno_present_p.
> 	(do_remat): Likewise.
> 	(lra_remat): Remove the initialization of call_used_regs_arr_len
> 	and call_used_regs_arr.
OK
jeff


* Re: [03/32] Add a function for getting the ABI of a call insn target
  2019-09-25 15:38   ` Richard Sandiford
@ 2019-09-30 15:52     ` Jeff Law
  2019-09-30 16:32       ` Richard Sandiford
  0 siblings, 1 reply; 100+ messages in thread
From: Jeff Law @ 2019-09-30 15:52 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/25/19 9:38 AM, Richard Sandiford wrote:
> Richard Sandiford <richard.sandiford@arm.com> writes:
>> This patch replaces get_call_reg_set_usage with call_insn_abi,
>> which returns the ABI of the target of a call insn.  The ABI's
>> full_reg_clobbers corresponds to regs_invalidated_by_call,
>> whereas many callers instead passed call_used_or_fixed_regs, i.e.:
>>
>>   (regs_invalidated_by_call | fixed_reg_set)
>>
>> The patch slavishly preserves the "| fixed_reg_set" for these callers;
>> later patches will clean this up.
> 
> On reflection, I think insn_callee_abi would be a better name for the
> function than call_insn_abi, since it should make it clearer that the
> function returns the ABI of the target function.  In future we could
> have expr_callee_abi for CALL_EXPRs.
> 
> Also, after Segher's comments for 10/32, I've used "callee_abi" as
> the name of temporary variables, instead of just "abi".
> 
> I've made the same change for later patches (except where I've posted
> new versions instead), but it didn't seem worth spamming the lists
> with that.
> 
> Tested as before.
> 
> Richard
> 
> PS. Ping for the series :-)
> 
> 
> 2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* target.def (insn_callee_abi): New hook.
> 	(remove_extra_call_preserved_regs): Delete.
> 	* doc/tm.texi.in (TARGET_INSN_CALLEE_ABI): New macro.
> 	(TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete.
> 	* doc/tm.texi: Regenerate.
> 	* targhooks.h (default_remove_extra_call_preserved_regs): Delete.
> 	* targhooks.c (default_remove_extra_call_preserved_regs): Delete.
> 	* config/aarch64/aarch64.c (aarch64_simd_call_p): Constify the
> 	insn argument.
> 	(aarch64_remove_extra_call_preserved_regs): Delete.
> 	(aarch64_insn_callee_abi): New function.
> 	(TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete.
> 	(TARGET_INSN_CALLEE_ABI): New macro.
> 	* rtl.h (get_call_fndecl): Declare.
> 	(cgraph_rtl_info): Fix formatting.  Tweak comment for
> 	function_used_regs.  Remove function_used_regs_valid.
> 	* rtlanal.c (get_call_fndecl): Moved from final.c
> 	* function-abi.h (insn_callee_abi): Declare.
> 	(target_function_abi_info): Mention insn_callee_abi.
> 	* function-abi.cc (fndecl_abi): Handle flag_ipa_ra in a similar
> 	way to get_call_reg_set_usage did.
> 	(insn_callee_abi): New function.
> 	* regs.h (get_call_reg_set_usage): Delete.
> 	* final.c: Include function-abi.h.
> 	(collect_fn_hard_reg_usage): Add fixed and stack registers to
> 	function_used_regs before the main loop rather than afterwards.
> 	Use insn_callee_abi instead of get_call_reg_set_usage.  Exit early
> 	if function_used_regs ends up not being useful.
> 	(get_call_fndecl): Move to rtlanal.c
> 	(get_call_cgraph_rtl_info, get_call_reg_set_usage): Delete.
> 	* caller-save.c: Include function-abi.h.
> 	(setup_save_areas, save_call_clobbered_regs): Use insn_callee_abi
> 	instead of get_call_reg_set_usage.
> 	* cfgcleanup.c: Include function-abi.h.
> 	(old_insns_match_p): Use insn_callee_abi instead of
> 	get_call_reg_set_usage.
> 	* cgraph.h (cgraph_node::rtl_info): Take a const_tree instead of
> 	a tree.
> 	* cgraph.c (cgraph_node::rtl_info): Likewise.  Initialize
> 	function_used_regs.
> 	* df-scan.c: Include function-abi.h.
> 	(df_get_call_refs): Use insn_callee_abi instead of
> 	get_call_reg_set_usage.
> 	* ira-lives.c: Include function-abi.h.
> 	(process_bb_node_lives): Use insn_callee_abi instead of
> 	get_call_reg_set_usage.
> 	* lra-lives.c: Include function-abi.h.
> 	(process_bb_lives): Use insn_callee_abi instead of
> 	get_call_reg_set_usage.
> 	* postreload.c: Include function-abi.h.
> 	(reload_combine): Use insn_callee_abi instead of
> 	get_call_reg_set_usage.
> 	* regcprop.c: Include function-abi.h.
> 	(copyprop_hardreg_forward_1): Use insn_callee_abi instead of
> 	get_call_reg_set_usage.
> 	* resource.c: Include function-abi.h.
> 	(mark_set_resources, mark_target_live_regs): Use insn_callee_abi
> 	instead of get_call_reg_set_usage.
> 	* var-tracking.c: Include function-abi.h.
> 	(dataflow_set_clear_at_call): Use insn_callee_abi instead of
> 	get_call_reg_set_usage.
> 
OK.  I think that's the whole set, right?

There were a ton of guality issues over the last couple weeks in the
tester, but those were somewhat expected due to a prior patchkit.  The
affected targets have new baseline results in my tester, so if anything
goes nuts as a result of your patches it should be fairly obvious.

Jeff


* Re: [11/32] Remove global call sets: cse.c
  2019-09-29 21:04     ` Jeff Law
@ 2019-09-30 16:23       ` Richard Sandiford
  0 siblings, 0 replies; 100+ messages in thread
From: Richard Sandiford @ 2019-09-30 16:23 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

Jeff Law <law@redhat.com> writes:
> On 9/25/19 9:57 AM, Richard Sandiford wrote:
>> Richard Sandiford <richard.sandiford@arm.com> writes:
>>> As with the combine.c patch, this one keeps things simple by
>>> invalidating values in partially-clobbered registers, rather than
>>> trying to tell whether the value in a partially-clobbered register
>>> is actually clobbered or not.  Again, this is in principle a bug fix,
>>> but probably never matters in practice.
>> 
>> Similarly to the combine patch, I've updated this to avoid the
>> short "abi" name and use a temporary HARD_REG_SET instead.
>> 
>> Richard
>> 
>> 
>> 2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>
>> 
>> gcc/
>> 	* cse.c: Include regs.h and function-abi.h.
>> 	(invalidate_for_call): Take the call insn as an argument.
>> 	Use insn_callee_abi to get the ABI of the call and invalidate
>> 	partially clobbered registers as well as fully clobbered ones.
>> 	(cse_insn): Update call accordingly.
>> 
>> Index: gcc/cse.c
>> ===================================================================
>> --- gcc/cse.c	2019-09-17 15:27:11.338066929 +0100
>> +++ gcc/cse.c	2019-09-25 16:55:31.202641509 +0100
>> @@ -42,6 +42,8 @@ Software Foundation; either version 3, o
>>  #include "tree-pass.h"
>>  #include "dbgcnt.h"
>>  #include "rtl-iter.h"
>> +#include "regs.h"
>> +#include "function-abi.h"
>>  
>>  /* The basic idea of common subexpression elimination is to go
>>     through the code, keeping a record of expressions that would
>> @@ -566,7 +568,6 @@ static void remove_invalid_subreg_refs (
>>  					machine_mode);
>>  static void rehash_using_reg (rtx);
>>  static void invalidate_memory (void);
>> -static void invalidate_for_call (void);
>>  static rtx use_related_value (rtx, struct table_elt *);
>>  
>>  static inline unsigned canon_hash (rtx, machine_mode);
>> @@ -2091,23 +2092,29 @@ rehash_using_reg (rtx x)
>>  }
>>  \f
>>  /* Remove from the hash table any expression that is a call-clobbered
>> -   register.  Also update their TICK values.  */
>> +   register in INSN.  Also update their TICK values.  */
>>  
>>  static void
>> -invalidate_for_call (void)
>> +invalidate_for_call (rtx_insn *insn)
>>  {
>> -  unsigned int regno, endregno;
>> -  unsigned int i;
>> +  unsigned int regno;
>>    unsigned hash;
>>    struct table_elt *p, *next;
>>    int in_table = 0;
>>    hard_reg_set_iterator hrsi;
>>  
>> -  /* Go through all the hard registers.  For each that is clobbered in
>> -     a CALL_INSN, remove the register from quantity chains and update
>> +  /* Go through all the hard registers.  For each that might be clobbered
>> +     in call insn INSN, remove the register from quantity chains and update
>>       reg_tick if defined.  Also see if any of these registers is currently
>> -     in the table.  */
>> -  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, regno, hrsi)
>> +     in the table.
>> +
>> +     ??? We could be more precise for partially-clobbered registers,
>> +     and only invalidate values that actually occupy the clobbered part
>> +     of the registers.  It doesn't seem worth the effort though, since
>> +     we shouldn't see this situation much before RA.  */
>> +  HARD_REG_SET callee_clobbers
>> +    = insn_callee_abi (insn).full_and_partial_reg_clobbers ();
>> +  EXECUTE_IF_SET_IN_HARD_REG_SET (callee_clobbers, 0, regno, hrsi)
>>      {
>>        delete_reg_equiv (regno);
>>        if (REG_TICK (regno) >= 0)
>> @@ -2132,15 +2139,11 @@ invalidate_for_call (void)
>>  	      || REGNO (p->exp) >= FIRST_PSEUDO_REGISTER)
>>  	    continue;
>>  
>> -	  regno = REGNO (p->exp);
>> -	  endregno = END_REGNO (p->exp);
>> -
>> -	  for (i = regno; i < endregno; i++)
>> -	    if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i))
>> -	      {
>> -		remove_from_table (p, hash);
>> -		break;
>> -	      }
>> +	  /* This must use the same test as above rather than the
>> +	     more accurate clobbers_reg_p.  */
>> +	  if (overlaps_hard_reg_set_p (callee_clobbers, GET_MODE (p->exp),
>> +				       REGNO (p->exp)))
> Is it worth putting a forward comment to the earlier test to the later
> one to help ensure they're kept in sync?
>
> OK with or without that comment update.

Yeah, agree that'd be safer.  Here's what I installed.

Thanks,
Richard


2019-09-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* cse.c: Include regs.h and function-abi.h.
	(invalidate_for_call): Take the call insn as an argument.
	Use insn_callee_abi to get the ABI of the call and invalidate
	partially clobbered registers as well as fully clobbered ones.
	(cse_insn): Update call accordingly.

Index: gcc/cse.c
===================================================================
--- gcc/cse.c	2019-09-30 17:03:36.000000000 +0100
+++ gcc/cse.c	2019-09-30 17:03:36.726099589 +0100
@@ -42,6 +42,8 @@ Software Foundation; either version 3, o
 #include "tree-pass.h"
 #include "dbgcnt.h"
 #include "rtl-iter.h"
+#include "regs.h"
+#include "function-abi.h"
 
 /* The basic idea of common subexpression elimination is to go
    through the code, keeping a record of expressions that would
@@ -566,7 +568,6 @@ static void remove_invalid_subreg_refs (
 					machine_mode);
 static void rehash_using_reg (rtx);
 static void invalidate_memory (void);
-static void invalidate_for_call (void);
 static rtx use_related_value (rtx, struct table_elt *);
 
 static inline unsigned canon_hash (rtx, machine_mode);
@@ -2091,23 +2092,31 @@ rehash_using_reg (rtx x)
 }
 \f
 /* Remove from the hash table any expression that is a call-clobbered
-   register.  Also update their TICK values.  */
+   register in INSN.  Also update their TICK values.  */
 
 static void
-invalidate_for_call (void)
+invalidate_for_call (rtx_insn *insn)
 {
-  unsigned int regno, endregno;
-  unsigned int i;
+  unsigned int regno;
   unsigned hash;
   struct table_elt *p, *next;
   int in_table = 0;
   hard_reg_set_iterator hrsi;
 
-  /* Go through all the hard registers.  For each that is clobbered in
-     a CALL_INSN, remove the register from quantity chains and update
+  /* Go through all the hard registers.  For each that might be clobbered
+     in call insn INSN, remove the register from quantity chains and update
      reg_tick if defined.  Also see if any of these registers is currently
-     in the table.  */
-  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, regno, hrsi)
+     in the table.
+
+     ??? We could be more precise for partially-clobbered registers,
+     and only invalidate values that actually occupy the clobbered part
+     of the registers.  It doesn't seem worth the effort though, since
+     we shouldn't see this situation much before RA.  Whatever choice
+     we make here has to be consistent with the table walk below,
+     so any change to this test will require a change there too.  */
+  HARD_REG_SET callee_clobbers
+    = insn_callee_abi (insn).full_and_partial_reg_clobbers ();
+  EXECUTE_IF_SET_IN_HARD_REG_SET (callee_clobbers, 0, regno, hrsi)
     {
       delete_reg_equiv (regno);
       if (REG_TICK (regno) >= 0)
@@ -2132,15 +2141,11 @@ invalidate_for_call (void)
 	      || REGNO (p->exp) >= FIRST_PSEUDO_REGISTER)
 	    continue;
 
-	  regno = REGNO (p->exp);
-	  endregno = END_REGNO (p->exp);
-
-	  for (i = regno; i < endregno; i++)
-	    if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i))
-	      {
-		remove_from_table (p, hash);
-		break;
-	      }
+	  /* This must use the same test as above rather than the
+	     more accurate clobbers_reg_p.  */
+	  if (overlaps_hard_reg_set_p (callee_clobbers, GET_MODE (p->exp),
+				       REGNO (p->exp)))
+	    remove_from_table (p, hash);
 	}
 }
 \f
@@ -5823,7 +5828,7 @@ cse_insn (rtx_insn *insn)
 	  if (GET_CODE (XEXP (tem, 0)) == USE
 	      && MEM_P (XEXP (XEXP (tem, 0), 0)))
 	    invalidate (XEXP (XEXP (tem, 0), 0), VOIDmode);
-      invalidate_for_call ();
+      invalidate_for_call (insn);
     }
 
   /* Now invalidate everything set by this instruction.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [03/32] Add a function for getting the ABI of a call insn target
  2019-09-30 15:52     ` Jeff Law
@ 2019-09-30 16:32       ` Richard Sandiford
  2019-09-30 16:46         ` Jeff Law
  0 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-09-30 16:32 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

Jeff Law <law@redhat.com> writes:
> On 9/25/19 9:38 AM, Richard Sandiford wrote:
>> Richard Sandiford <richard.sandiford@arm.com> writes:
>>> This patch replaces get_call_reg_set_usage with call_insn_abi,
>>> which returns the ABI of the target of a call insn.  The ABI's
>>> full_reg_clobbers corresponds to regs_invalidated_by_call,
>>> whereas many callers instead passed call_used_or_fixed_regs, i.e.:
>>>
>>>   (regs_invalidated_by_call | fixed_reg_set)
>>>
>>> The patch slavishly preserves the "| fixed_reg_set" for these callers;
>>> later patches will clean this up.
>> 
>> On reflection, I think insn_callee_abi would be a better name for the
>> function than call_insn_abi, since it should make it clearer that the
>> function returns the ABI of the target function.  In future we could
>> have expr_callee_abi for CALL_EXPRs.
>> 
>> Also, after Segher's comments for 10/32, I've used "callee_abi" as
>> the name of temporary variables, instead of just "abi".
>> 
>> I've made the same change for later patches (except where I've posted
>> new versions instead), but it didn't seem worth spamming the lists
>> with that.
>> 
>> Tested as before.
>> 
>> Richard
>> 
>> PS. Ping for the series :-)
>> 
>> 
>> 2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>
>> 
>> gcc/
>> 	* target.def (insn_callee_abi): New hook.
>> 	(remove_extra_call_preserved_regs): Delete.
>> 	* doc/tm.texi.in (TARGET_INSN_CALLEE_ABI): New macro.
>> 	(TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete.
>> 	* doc/tm.texi: Regenerate.
>> 	* targhooks.h (default_remove_extra_call_preserved_regs): Delete.
>> 	* targhooks.c (default_remove_extra_call_preserved_regs): Delete.
>> 	* config/aarch64/aarch64.c (aarch64_simd_call_p): Constify the
>> 	insn argument.
>> 	(aarch64_remove_extra_call_preserved_regs): Delete.
>> 	(aarch64_insn_callee_abi): New function.
>> 	(TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete.
>> 	(TARGET_INSN_CALLEE_ABI): New macro.
>> 	* rtl.h (get_call_fndecl): Declare.
>> 	(cgraph_rtl_info): Fix formatting.  Tweak comment for
>> 	function_used_regs.  Remove function_used_regs_valid.
>> 	* rtlanal.c (get_call_fndecl): Moved from final.c
>> 	* function-abi.h (insn_callee_abi): Declare.
>> 	(target_function_abi_info): Mention insn_callee_abi.
>> 	* function-abi.cc (fndecl_abi): Handle flag_ipa_ra in a similar
>> 	way to get_call_reg_set_usage did.
>> 	(insn_callee_abi): New function.
>> 	* regs.h (get_call_reg_set_usage): Delete.
>> 	* final.c: Include function-abi.h.
>> 	(collect_fn_hard_reg_usage): Add fixed and stack registers to
>> 	function_used_regs before the main loop rather than afterwards.
>> 	Use insn_callee_abi instead of get_call_reg_set_usage.  Exit early
>> 	if function_used_regs ends up not being useful.
>> 	(get_call_fndecl): Move to rtlanal.c
>> 	(get_call_cgraph_rtl_info, get_call_reg_set_usage): Delete.
>> 	* caller-save.c: Include function-abi.h.
>> 	(setup_save_areas, save_call_clobbered_regs): Use insn_callee_abi
>> 	instead of get_call_reg_set_usage.
>> 	* cfgcleanup.c: Include function-abi.h.
>> 	(old_insns_match_p): Use insn_callee_abi instead of
>> 	get_call_reg_set_usage.
>> 	* cgraph.h (cgraph_node::rtl_info): Take a const_tree instead of
>> 	a tree.
>> 	* cgraph.c (cgraph_node::rtl_info): Likewise.  Initialize
>> 	function_used_regs.
>> 	* df-scan.c: Include function-abi.h.
>> 	(df_get_call_refs): Use insn_callee_abi instead of
>> 	get_call_reg_set_usage.
>> 	* ira-lives.c: Include function-abi.h.
>> 	(process_bb_node_lives): Use insn_callee_abi instead of
>> 	get_call_reg_set_usage.
>> 	* lra-lives.c: Include function-abi.h.
>> 	(process_bb_lives): Use insn_callee_abi instead of
>> 	get_call_reg_set_usage.
>> 	* postreload.c: Include function-abi.h.
>> 	(reload_combine): Use insn_callee_abi instead of
>> 	get_call_reg_set_usage.
>> 	* regcprop.c: Include function-abi.h.
>> 	(copyprop_hardreg_forward_1): Use insn_callee_abi instead of
>> 	get_call_reg_set_usage.
>> 	* resource.c: Include function-abi.h.
>> 	(mark_set_resources, mark_target_live_regs): Use insn_callee_abi
>> 	instead of get_call_reg_set_usage.
>> 	* var-tracking.c: Include function-abi.h.
>> 	(dataflow_set_clear_at_call): Use insn_callee_abi instead of
>> 	get_call_reg_set_usage.
>> 
> OK.  I think that's the whole set, right?

Yep, all committed now :-)  Thanks again for the reviews.

> There were a ton of guality issues over the last couple weeks in the
> tester, but those were somewhat expected due to a prior patchkit.  The
> affected targets have new baseline results in my tester, so if anything
> goes nuts as a result of your patches it should be fairly obvious.

There was certainly an extra guality failure on aarch64-linux-gnu with
this series; I think it was:

gcc.dg/guality/pr41616-1.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-object

Will be interesting to see what effect it has on other targets.

Richard

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [03/32] Add a function for getting the ABI of a call insn target
  2019-09-30 16:32       ` Richard Sandiford
@ 2019-09-30 16:46         ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-30 16:46 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/30/19 10:32 AM, Richard Sandiford wrote:
> Jeff Law <law@redhat.com> writes:
>> On 9/25/19 9:38 AM, Richard Sandiford wrote:
>>> Richard Sandiford <richard.sandiford@arm.com> writes:
>>>> This patch replaces get_call_reg_set_usage with call_insn_abi,
>>>> which returns the ABI of the target of a call insn.  The ABI's
>>>> full_reg_clobbers corresponds to regs_invalidated_by_call,
>>>> whereas many callers instead passed call_used_or_fixed_regs, i.e.:
>>>>
>>>>   (regs_invalidated_by_call | fixed_reg_set)
>>>>
>>>> The patch slavishly preserves the "| fixed_reg_set" for these callers;
>>>> later patches will clean this up.
>>>
>>> On reflection, I think insn_callee_abi would be a better name for the
>>> function than call_insn_abi, since it should make it clearer that the
>>> function returns the ABI of the target function.  In future we could
>>> have expr_callee_abi for CALL_EXPRs.
>>>
>>> Also, after Segher's comments for 10/32, I've used "callee_abi" as
>>> the name of temporary variables, instead of just "abi".
>>>
>>> I've made the same change for later patches (except where I've posted
>>> new versions instead), but it didn't seem worth spamming the lists
>>> with that.
>>>
>>> Tested as before.
>>>
>>> Richard
>>>
>>> PS. Ping for the series :-)
>>>
>>>
>>> 2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>
>>>
>>> gcc/
>>> 	* target.def (insn_callee_abi): New hook.
>>> 	(remove_extra_call_preserved_regs): Delete.
>>> 	* doc/tm.texi.in (TARGET_INSN_CALLEE_ABI): New macro.
>>> 	(TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete.
>>> 	* doc/tm.texi: Regenerate.
>>> 	* targhooks.h (default_remove_extra_call_preserved_regs): Delete.
>>> 	* targhooks.c (default_remove_extra_call_preserved_regs): Delete.
>>> 	* config/aarch64/aarch64.c (aarch64_simd_call_p): Constify the
>>> 	insn argument.
>>> 	(aarch64_remove_extra_call_preserved_regs): Delete.
>>> 	(aarch64_insn_callee_abi): New function.
>>> 	(TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete.
>>> 	(TARGET_INSN_CALLEE_ABI): New macro.
>>> 	* rtl.h (get_call_fndecl): Declare.
>>> 	(cgraph_rtl_info): Fix formatting.  Tweak comment for
>>> 	function_used_regs.  Remove function_used_regs_valid.
>>> 	* rtlanal.c (get_call_fndecl): Moved from final.c
>>> 	* function-abi.h (insn_callee_abi): Declare.
>>> 	(target_function_abi_info): Mention insn_callee_abi.
>>> 	* function-abi.cc (fndecl_abi): Handle flag_ipa_ra in a similar
>>> 	way to get_call_reg_set_usage did.
>>> 	(insn_callee_abi): New function.
>>> 	* regs.h (get_call_reg_set_usage): Delete.
>>> 	* final.c: Include function-abi.h.
>>> 	(collect_fn_hard_reg_usage): Add fixed and stack registers to
>>> 	function_used_regs before the main loop rather than afterwards.
>>> 	Use insn_callee_abi instead of get_call_reg_set_usage.  Exit early
>>> 	if function_used_regs ends up not being useful.
>>> 	(get_call_fndecl): Move to rtlanal.c
>>> 	(get_call_cgraph_rtl_info, get_call_reg_set_usage): Delete.
>>> 	* caller-save.c: Include function-abi.h.
>>> 	(setup_save_areas, save_call_clobbered_regs): Use insn_callee_abi
>>> 	instead of get_call_reg_set_usage.
>>> 	* cfgcleanup.c: Include function-abi.h.
>>> 	(old_insns_match_p): Use insn_callee_abi instead of
>>> 	get_call_reg_set_usage.
>>> 	* cgraph.h (cgraph_node::rtl_info): Take a const_tree instead of
>>> 	a tree.
>>> 	* cgraph.c (cgraph_node::rtl_info): Likewise.  Initialize
>>> 	function_used_regs.
>>> 	* df-scan.c: Include function-abi.h.
>>> 	(df_get_call_refs): Use insn_callee_abi instead of
>>> 	get_call_reg_set_usage.
>>> 	* ira-lives.c: Include function-abi.h.
>>> 	(process_bb_node_lives): Use insn_callee_abi instead of
>>> 	get_call_reg_set_usage.
>>> 	* lra-lives.c: Include function-abi.h.
>>> 	(process_bb_lives): Use insn_callee_abi instead of
>>> 	get_call_reg_set_usage.
>>> 	* postreload.c: Include function-abi.h.
>>> 	(reload_combine): Use insn_callee_abi instead of
>>> 	get_call_reg_set_usage.
>>> 	* regcprop.c: Include function-abi.h.
>>> 	(copyprop_hardreg_forward_1): Use insn_callee_abi instead of
>>> 	get_call_reg_set_usage.
>>> 	* resource.c: Include function-abi.h.
>>> 	(mark_set_resources, mark_target_live_regs): Use insn_callee_abi
>>> 	instead of get_call_reg_set_usage.
>>> 	* var-tracking.c: Include function-abi.h.
>>> 	(dataflow_set_clear_at_call): Use insn_callee_abi instead of
>>> 	get_call_reg_set_usage.
>>>
>> OK.  I think that's the whole set, right?
> 
> Yep, all committed now :-)  Thanks again for the reviews.
> 
>> There were a ton of guality issues over the last couple weeks in the
>> tester, but those were somewhat expected due to a prior patchkit.  The
>> affected targets have new baseline results in my tester, so if anything
>> goes nuts as a result of your patches it should be fairly obvious.
> 
> There was certainly an extra guality failure on aarch64-linux-gnu with
> this series, think it was:
> 
> gcc.dg/guality/pr41616-1.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-object
> 
> Will be interesting to see what effect it has on other targets.
Noted.  I'll pass them along rather than ignoring and forcing new baselines.

jeff

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [01/32] Add function_abi.{h,cc}
  2019-09-30  9:19     ` Richard Sandiford
@ 2019-09-30 21:16       ` Jeff Law
  0 siblings, 0 replies; 100+ messages in thread
From: Jeff Law @ 2019-09-30 21:16 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 9/30/19 3:19 AM, Richard Sandiford wrote:
> Hi Jeff,
> 
> Thanks as always for the reviews :-)
> 
> Jeff Law <law@redhat.com> writes:
>> On 9/11/19 1:03 PM, Richard Sandiford wrote:
>>> This patch adds new structures and functions for handling
>>> multiple ABIs in a translation unit.  The structures are:
>>>
>>> - predefined_function_abi: describes a static, predefined ABI
>>> - function_abi: describes either a predefined ABI or a local
>>>   variant of one (e.g. taking -fipa-ra into account)
>>>
>>> The patch adds functions for getting the ABI from a given type
>>> or decl; a later patch will also add a function for getting the
>>> ABI of the target of a call insn.
>>>
>>> Although ABIs are about much more than call-clobber/saved choices,
>>> I wanted to keep the name general in case we add more ABI-related
>>> information in future.
>>>
>>>
>>> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
>>>
>>> gcc/
>>> 	* Makefile.in (OBJS): Add function-abi.o.
>>> 	(GTFILES): Add function-abi.h.
>>> 	* function-abi.cc: New file.
>>> 	* function-abi.h: Likewise.
>>> 	* emit-rtl.h (rtl_data::abi): New field.
>>> 	* function.c: Include function-abi.h.
>>> 	(prepare_function_start): Initialize crtl->abi.
>>> 	* read-rtl-function.c: Include regs.h and function-abi.h.
>>> 	(read_rtl_function_body): Initialize crtl->abi.
>>> 	(read_rtl_function_body_from_file_range): Likewise.
>>> 	* reginfo.c: Include function-abi.h.
>>> 	(init_reg_sets_1): Initialize default_function_abi.
>>> 	(globalize_reg): Call add_full_reg_clobber for each predefined ABI
>>> 	when making a register global.
>>> 	* target-globals.h (this_target_function_abi_info): Declare.
>>> 	(target_globals::function_abi_info): New field.
>>> 	(restore_target_globals): Copy it.
>>> 	* target-globals.c: Include function-abi.h.
>>> 	(default_target_globals): Initialize the function_abi_info field.
>>> 	(target_globals): Allocate it.
>>> 	(save_target_globals): Free it.
>> So no problem with this as-is.  Questions though:
>>
>> 1. Do we need to stream this information for LTO?
> 
> At the moment this is all derived information rather than something we need
> to stream directly.  E.g. the set of available predefined_function_abis
> really only depends on command-line flags.  The mapping from functions
> to predefined_function_abis currently depends only on the function type,
> so streaming the type is enough to recover the ABI too.  function_abi
> additionally depends on RTL stuff that doesn't affect LTO.
OK.  THanks for clarifying.


> 
>> 2. Do we need to support it for the RTL front-end, even if primarily for
>> testing purposes?
> 
> Yeah, I guess it could be useful to be able to pretend that a function
> is defined locally with a certain -fipa-ra clobber set, but I think in
> most cases it'd be possible to do this using:
> 
>   void __attribute__ ((noinline, noclone))
>   callee (void)
>   {
>     asm ("" ::: ...regs...);
>   }
> 
> (That's the kind of test I've used in the support for the SVE PCS FWIW,
> not posted yet.)
Your call on whether or not it's worth it.  Given that it's supposed to
be derived info, I won't lose any sleep if it's not available in the RTL
front-end.

jeff

^ permalink raw reply	[flat|nested] 100+ messages in thread

* build-failure for cris-elf with "[00/32] Support multiple ABIs in the same translation unit"
  2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
                   ` (33 preceding siblings ...)
  2019-09-26 19:24 ` Dimitar Dimitrov
@ 2019-10-01  2:09 ` Hans-Peter Nilsson
  2019-10-01  7:51   ` Richard Sandiford
  34 siblings, 1 reply; 100+ messages in thread
From: Hans-Peter Nilsson @ 2019-10-01  2:09 UTC (permalink / raw)
  To: richard.sandiford; +Cc: gcc-patches

> From: Richard Sandiford <richard.sandiford@arm.com>
> Date: Wed, 11 Sep 2019 21:02:26 +0200

> This series of patches introduces some classes and helpers for handling
> multiple ABIs in the same translation unit.  At the moment "ABI" means
> specifically the choice of call-clobbered registers
[...]

> The series also makes -fipa-ra work for partially-clobbered registers too.
[...]

My autotester for cris-elf complains about a build-breaking
commit in the revision range (working:breaking) 276299:276359
and a glance at those commits and the error message suggests the
cause is likely one of your commits.  Relevant part of
build-log, hopefully sufficient:

-----------------------
g++ -fno-PIE -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE   -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -I. -I. -I/x/hpautotest-gcc1/gcc/gcc -I/x/hpautotest-gcc1/gcc/gcc/. -I/x/hpautotest-gcc1/gcc/gcc/../include -I/x/hpautotest-gcc1/gcc/gcc/../libcpp/include -I/x/hpautotest-gcc1/cris-elf/gccobj/./gmp -I/x/hpautotest-gcc1/gcc/gmp -I/x/hpautotest-gcc1/cris-elf/gccobj/./mpfr/src -I/x/hpautotest-gcc1/gcc/mpfr/src -I/x/hpautotest-gcc1/gcc/mpc/src  -I/x/hpautotest-gcc1/gcc/gcc/../libdecnumber -I/x/hpautotest-gcc1/gcc/gcc/../libdecnumber/dpd -I../libdecnumber -I/x/hpautotest-gcc1/gcc/gcc/../libbacktrace   -o caller-save.o -MT caller-save.o -MMD -MP -MF ./.deps/caller-save.TPo /x/hpautotest-gcc1/gcc/gcc/caller-save.c
In file included from /x/hpautotest-gcc1/gcc/gcc/caller-save.c:31:0:
/x/hpautotest-gcc1/gcc/gcc/caller-save.c: In function 'void init_caller_save()':
/x/hpautotest-gcc1/gcc/gcc/regs.h:195:44: error: cannot convert 'bool' to 'const predefined_function_abi*' for argument '3' to 'machine_mode choose_hard_reg_mode(unsigned int, unsigned int, const predefined_function_abi*)'
   choose_hard_reg_mode (REGNO, NREGS, false)
                                            ^
/x/hpautotest-gcc1/gcc/gcc/caller-save.c:203:26: note: in expansion of macro 'HARD_REGNO_CALLER_SAVE_MODE'
  regno_save_mode[i][j] = HARD_REGNO_CALLER_SAVE_MODE (i, j, VOIDmode);
                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~
/x/hpautotest-gcc1/gcc/gcc/caller-save.c: In function 'void save_call_clobbered_regs()':
/x/hpautotest-gcc1/gcc/gcc/regs.h:195:44: error: cannot convert 'bool' to 'const predefined_function_abi*' for argument '3' to 'machine_mode choose_hard_reg_mode(unsigned int, unsigned int, const predefined_function_abi*)'
   choose_hard_reg_mode (REGNO, NREGS, false)
                                            ^
/x/hpautotest-gcc1/gcc/gcc/caller-save.c:821:12: note: in expansion of macro 'HARD_REGNO_CALLER_SAVE_MODE'
     mode = HARD_REGNO_CALLER_SAVE_MODE
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~
Makefile:1117: recipe for target 'caller-save.o' failed
-----------------------

> Also tested by compiling at least one target per CPU directory and
> checking for no new warnings.

(Hmm...  So maybe a host gcc issue?)

My host is x86-64 Debian 9, i.e. gcc-6.3.0.

brgds, H-P

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: build-failure for cris-elf with "[00/32] Support multiple ABIs in the same translation unit"
  2019-10-01  2:09 ` build-failure for cris-elf with "[00/32] Support multiple ABIs in the same translation unit" Hans-Peter Nilsson
@ 2019-10-01  7:51   ` Richard Sandiford
  2019-10-01 10:58     ` Hans-Peter Nilsson
  0 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-10-01  7:51 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: gcc-patches

Hans-Peter Nilsson <hans-peter.nilsson@axis.com> writes:
>> From: Richard Sandiford <richard.sandiford@arm.com>
>> Date: Wed, 11 Sep 2019 21:02:26 +0200
>
>> This series of patches introduces some classes and helpers for handling
>> multiple ABIs in the same translation unit.  At the moment "ABI" means
>> specifically the choice of call-clobbered registers
> [...]
>
>> The series also makes -fipa-ra work for partially-clobbered registers too.
> [...]
>
> My autotester for cris-elf complains about a build-breaking
> commit in the revision range (working:breaking) 276299:276359
> and a glance at those commits and the error message suggests the
> cause is likely one of your commits.  Relevant part of
> build-log, hopefully sufficient:
>
> -----------------------
> g++ -fno-PIE -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE   -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -I. -I. -I/x/hpautotest-gcc1/gcc/gcc -I/x/hpautotest-gcc1/gcc/gcc/. -I/x/hpautotest-gcc1/gcc/gcc/../include -I/x/hpautotest-gcc1/gcc/gcc/../libcpp/include -I/x/hpautotest-gcc1/cris-elf/gccobj/./gmp -I/x/hpautotest-gcc1/gcc/gmp -I/x/hpautotest-gcc1/cris-elf/gccobj/./mpfr/src -I/x/hpautotest-gcc1/gcc/mpfr/src -I/x/hpautotest-gcc1/gcc/mpc/src  -I/x/hpautotest-gcc1/gcc/gcc/../libdecnumber -I/x/hpautotest-gcc1/gcc/gcc/../libdecnumber/dpd -I../libdecnumber -I/x/hpautotest-gcc1/gcc/gcc/../libbacktrace   -o caller-save.o -MT caller-save.o -MMD -MP -MF ./.deps/caller-save.TPo /x/hpautotest-gcc1/gcc/gcc/caller-save.c
> In file included from /x/hpautotest-gcc1/gcc/gcc/caller-save.c:31:0:
> /x/hpautotest-gcc1/gcc/gcc/caller-save.c: In function 'void init_caller_save()':
> /x/hpautotest-gcc1/gcc/gcc/regs.h:195:44: error: cannot convert 'bool' to 'const predefined_function_abi*' for argument '3' to 'machine_mode choose_hard_reg_mode(unsigned int, unsigned int, const predefined_function_abi*)'
>    choose_hard_reg_mode (REGNO, NREGS, false)
>                                             ^
> /x/hpautotest-gcc1/gcc/gcc/caller-save.c:203:26: note: in expansion of macro 'HARD_REGNO_CALLER_SAVE_MODE'
>   regno_save_mode[i][j] = HARD_REGNO_CALLER_SAVE_MODE (i, j, VOIDmode);
>                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> /x/hpautotest-gcc1/gcc/gcc/caller-save.c: In function 'void save_call_clobbered_regs()':
> /x/hpautotest-gcc1/gcc/gcc/regs.h:195:44: error: cannot convert 'bool' to 'const predefined_function_abi*' for argument '3' to 'machine_mode choose_hard_reg_mode(unsigned int, unsigned int, const predefined_function_abi*)'
>    choose_hard_reg_mode (REGNO, NREGS, false)
>                                             ^
> /x/hpautotest-gcc1/gcc/gcc/caller-save.c:821:12: note: in expansion of macro 'HARD_REGNO_CALLER_SAVE_MODE'
>      mode = HARD_REGNO_CALLER_SAVE_MODE
>             ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> Makefile:1117: recipe for target 'caller-save.o' failed
> -----------------------
>
>> Also tested by compiling at least one target per CPU directory and
>> checking for no new warnings.
>
> (Hmm...  So maybe a host gcc issue?)

Bah, yeah, seems so.

> My host is x86-64 Debian 9, i.e. gcc-6.3.0.

Mine was 5.4.0, which only treats this as a warning.  Normally I check
for extra warnings too, but mustn't have done this time, sorry.

Fixed as below.  I also belatedly see a new definition of
HARD_REGNO_CALLER_SAVE_MODE was added since I posted the patches,
so the patch fixes that too.  Tested by cross-building cris-elf
and sparc-linux-gnu, applied as obvious.

Richard


2019-10-01  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* regs.h (HARD_REGNO_CALLER_SAVE_MODE): Update call to
	choose_hard_reg_mode.
	* config/sparc/sparc.h (HARD_REGNO_CALLER_SAVE_MODE): Likewise.

Index: gcc/regs.h
===================================================================
--- gcc/regs.h	2019-09-30 17:19:45.047128655 +0100
+++ gcc/regs.h	2019-10-01 08:46:22.368168133 +0100
@@ -192,7 +192,7 @@ #define REG_BASIC_BLOCK(N) (reg_info_p[N
 /* Select a register mode required for caller save of hard regno REGNO.  */
 #ifndef HARD_REGNO_CALLER_SAVE_MODE
 #define HARD_REGNO_CALLER_SAVE_MODE(REGNO, NREGS, MODE) \
-  choose_hard_reg_mode (REGNO, NREGS, false)
+  choose_hard_reg_mode (REGNO, NREGS, NULL)
 #endif
 
 /* Target-dependent globals.  */
Index: gcc/config/sparc/sparc.h
===================================================================
--- gcc/config/sparc/sparc.h	2019-09-21 13:56:08.855935013 +0100
+++ gcc/config/sparc/sparc.h	2019-10-01 08:46:22.368168133 +0100
@@ -716,7 +716,7 @@ #define HARD_REGNO_RENAME_OK(FROM, TO) (
    mode but the largest suitable mode for the given (REGNO, NREGS) pair and
    it quickly creates paradoxical subregs that can be problematic.  */
 #define HARD_REGNO_CALLER_SAVE_MODE(REGNO, NREGS, MODE) \
-  ((MODE) == VOIDmode ? choose_hard_reg_mode (REGNO, NREGS, false) : (MODE))
+  ((MODE) == VOIDmode ? choose_hard_reg_mode (REGNO, NREGS, NULL) : (MODE))
 
 /* Specify the registers used for certain standard purposes.
    The values of these macros are register numbers.  */

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [04/32] [x86] Robustify vzeroupper handling across calls
  2019-09-25 15:48   ` Richard Sandiford
  2019-09-25 18:11     ` Uros Bizjak
@ 2019-10-01 10:14     ` Uros Bizjak
  2019-10-08 18:17       ` Uros Bizjak
  1 sibling, 1 reply; 100+ messages in thread
From: Uros Bizjak @ 2019-10-01 10:14 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, Jan Hubicka

On Wed, Sep 25, 2019 at 5:48 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:

> > The comment suggests that this code is only needed for Win64 and that
> > not testing for Win64 is just a simplification.  But in practice it was
> > needed for correctness on GNU/Linux and other targets too, since without
> > it the RA would be able to keep 256-bit and 512-bit values in SSE
> > registers across calls that are known not to clobber them.
> >
> > This patch conservatively treats calls as AVX_U128_ANY if the RA can see
> > that some SSE registers are not touched by a call.  There are then no
> > regressions if the ix86_hard_regno_call_part_clobbered check is disabled
> > for GNU/Linux (not something we should do, was just for testing).

If the RA can see that some SSE regs are not touched by the call, then we
are sure that the called function is part of the current TU. In this
case, the called function will be compiled using VEX instructions,
where there is no AVX-SSE transition penalty. So, skipping VZEROUPPER
is beneficial here.

Uros.

> > If in fact we want -fipa-ra to pretend that all functions clobber
> > SSE registers above 128 bits, it'd certainly be possible to arrange
> > that.  But IMO that would be an optimisation decision, whereas what
> > the patch is fixing is a correctness decision.  So I think we should
> > have this check even so.
>
> 2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * config/i386/i386.c: Include function-abi.h.
>         (ix86_avx_u128_mode_needed): Treat function calls as AVX_U128_ANY
>         if they preserve some 256-bit or 512-bit SSE registers.
>
> Index: gcc/config/i386/i386.c
> ===================================================================
> --- gcc/config/i386/i386.c      2019-09-25 16:47:48.000000000 +0100
> +++ gcc/config/i386/i386.c      2019-09-25 16:47:49.089962608 +0100
> @@ -95,6 +95,7 @@ #define IN_TARGET_CODE 1
>  #include "i386-builtins.h"
>  #include "i386-expand.h"
>  #include "i386-features.h"
> +#include "function-abi.h"
>
>  /* This file should be included last.  */
>  #include "target-def.h"
> @@ -13511,6 +13512,15 @@ ix86_avx_u128_mode_needed (rtx_insn *ins
>             }
>         }
>
> +      /* If the function is known to preserve some SSE registers,
> +        RA and previous passes can legitimately rely on that for
> +        modes wider than 256 bits.  It's only safe to issue a
> +        vzeroupper if all SSE registers are clobbered.  */
> +      const function_abi &abi = insn_callee_abi (insn);
> +      if (!hard_reg_set_subset_p (reg_class_contents[ALL_SSE_REGS],
> +                                 abi.mode_clobbers (V4DImode)))
> +       return AVX_U128_ANY;
> +
>        return AVX_U128_CLEAN;
>      }
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: build-failure for cris-elf with "[00/32] Support multiple ABIs in the same translation unit"
  2019-10-01  7:51   ` Richard Sandiford
@ 2019-10-01 10:58     ` Hans-Peter Nilsson
  0 siblings, 0 replies; 100+ messages in thread
From: Hans-Peter Nilsson @ 2019-10-01 10:58 UTC (permalink / raw)
  To: richard.sandiford; +Cc: gcc-patches

> From: Richard Sandiford <richard.sandiford@arm.com>
> Date: Tue, 1 Oct 2019 09:51:51 +0200

> Hans-Peter Nilsson <hans-peter.nilsson@axis.com> writes:
> > My autotester for cris-elf complains about a build-breaking
> > commit in the revision range (working:breaking) 276299:276359

> Fixed as below.  I also belatedly see a new definition of
> HARD_REGNO_CALLER_SAVE_MODE was added since I posted the patches,
> so the patch fixes that too.  Tested by cross-building cris-elf
> and sparc-linux-gnu, applied as obvious.

Thanks, but...  The build still fails, now when building
gfortran.  I've entered PR91948 (JFTR; CC:ed).

brgds, H-P

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [29/32] Remove global call sets: sched-deps.c
  2019-09-29 22:20   ` Jeff Law
@ 2019-10-04 14:32     ` Christophe Lyon
  2019-10-04 14:35       ` Richard Sandiford
  0 siblings, 1 reply; 100+ messages in thread
From: Christophe Lyon @ 2019-10-04 14:32 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc Patches, Richard Sandiford

On Mon, 30 Sep 2019 at 00:20, Jeff Law <law@redhat.com> wrote:

> On 9/11/19 1:17 PM, Richard Sandiford wrote:
> > This is a straight replacement of an existing "full or partial"
> > call-clobber check.
> >
> >
> > 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> >
> > gcc/
> >       * sched-deps.c (deps_analyze_insn): Use the ABI of the target
> >       function to test whether a register is fully or partly clobbered.
> OK
> jeff
>

Hi Richard,

My testing shows regressions on arm after you applied this patch (r276335):
For instance on arm-none-linux-gnueabi
--with-mode arm
--with-cpu cortex-a9
FAIL:  gcc.dg/strlenopt-18g.c execution test

If you force -march=armv5t via RUNTESTFLAGS, there's an additional failure:
FAIL: gcc.dg/strlenopt-19.c execution test

In fortran, I see different sets of regressions depending on arm vs thumb
mode.
target arm-none-linux-gnueabi
--with-mode arm
--with-cpu cortex-a9
I get these new FAILs:
    gfortran.dg/char4_iunit_1.f03   -O0  execution test
    gfortran.dg/char4_iunit_1.f03   -O1  execution test
    gfortran.dg/char4_iunit_1.f03   -O2  execution test
    gfortran.dg/char4_iunit_1.f03   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
    gfortran.dg/char4_iunit_1.f03   -O3 -g  execution test
    gfortran.dg/char4_iunit_1.f03   -Os  execution test
    gfortran.dg/namelist_16.f90   -O0  execution test
    gfortran.dg/namelist_16.f90   -O1  execution test
    gfortran.dg/namelist_16.f90   -O2  execution test
    gfortran.dg/namelist_16.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
    gfortran.dg/namelist_16.f90   -O3 -g  execution test
    gfortran.dg/namelist_16.f90   -Os  execution test
    gfortran.dg/namelist_95.f90   -O0  execution test
    gfortran.dg/namelist_95.f90   -O1  execution test
    gfortran.dg/namelist_95.f90   -O2  execution test
    gfortran.dg/namelist_95.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
    gfortran.dg/namelist_95.f90   -O3 -g  execution test
    gfortran.dg/namelist_95.f90   -Os  execution test
    gfortran.dg/real_const_3.f90   -O0  execution test
    gfortran.dg/real_const_3.f90   -O1  execution test
    gfortran.dg/real_const_3.f90   -O2  execution test
    gfortran.dg/real_const_3.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
    gfortran.dg/real_const_3.f90   -O3 -g  execution test
    gfortran.dg/real_const_3.f90   -Os  execution test


When defaulting to thumb:
target arm-none-linux-gnueabi
--with-mode thumb
--with-cpu cortex-a9
I get these new FAILs:
    gfortran.dg/f2003_io_5.f03   -O0  execution test
    gfortran.dg/f2003_io_5.f03   -O1  execution test
    gfortran.dg/f2003_io_5.f03   -O2  execution test
    gfortran.dg/f2003_io_5.f03   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
    gfortran.dg/f2003_io_5.f03   -O3 -g  execution test
    gfortran.dg/f2003_io_5.f03   -Os  execution test
    gfortran.dg/real_const_3.f90   -O0  execution test
    gfortran.dg/real_const_3.f90   -O1  execution test
    gfortran.dg/real_const_3.f90   -O2  execution test
    gfortran.dg/real_const_3.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
    gfortran.dg/real_const_3.f90   -O3 -g  execution test
    gfortran.dg/real_const_3.f90   -Os  execution test

This is the most recent validation result I have so far, so maybe you
already fixed the problem?

Thanks,

Christophe

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [29/32] Remove global call sets: sched-deps.c
  2019-10-04 14:32     ` Christophe Lyon
@ 2019-10-04 14:35       ` Richard Sandiford
  2019-10-04 14:37         ` Christophe Lyon
  2019-10-07 13:29         ` Christophe Lyon
  0 siblings, 2 replies; 100+ messages in thread
From: Richard Sandiford @ 2019-10-04 14:35 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: Jeff Law, gcc Patches

Christophe Lyon <christophe.lyon@linaro.org> writes:
> On Mon, 30 Sep 2019 at 00:20, Jeff Law <law@redhat.com> wrote:
>
>     On 9/11/19 1:17 PM, Richard Sandiford wrote:
>     > This is a straight replacement of an existing "full or partial"
>     > call-clobber check.
>     >
>     >
>     > 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
>     >
>     > gcc/
>     >       * sched-deps.c (deps_analyze_insn): Use the ABI of the target
>     >       function to test whether a register is fully or partly clobbered.
>     OK
>     jeff
>
>
> Hi Richard,
>
> My testing shows regressions on arm after you applied this patch (r276335):
> For instance on arm-none-linux-gnueabi
> --with-mode arm
> --with-cpu cortex-a9
> FAIL:  gcc.dg/strlenopt-18g.c execution test
>
> If you force -march=armv5t via RUNTESTFLAGS, there's an additional failure:
> FAIL: gcc.dg/strlenopt-19.c execution test
>
> In fortran, I see different sets of regressions depending on arm vs thumb mode.
> target arm-none-linux-gnueabi
> --with-mode arm
> --with-cpu cortex-a9
> I get these new FAILs:
>     gfortran.dg/char4_iunit_1.f03   -O0  execution test
>     gfortran.dg/char4_iunit_1.f03   -O1  execution test
>     gfortran.dg/char4_iunit_1.f03   -O2  execution test
>     gfortran.dg/char4_iunit_1.f03   -O3 -fomit-frame-pointer -funroll-loops
> -fpeel-loops -ftracer -finline-functions  execution test
>     gfortran.dg/char4_iunit_1.f03   -O3 -g  execution test
>     gfortran.dg/char4_iunit_1.f03   -Os  execution test
>     gfortran.dg/namelist_16.f90   -O0  execution test
>     gfortran.dg/namelist_16.f90   -O1  execution test
>     gfortran.dg/namelist_16.f90   -O2  execution test
>     gfortran.dg/namelist_16.f90   -O3 -fomit-frame-pointer -funroll-loops
> -fpeel-loops -ftracer -finline-functions  execution test
>     gfortran.dg/namelist_16.f90   -O3 -g  execution test
>     gfortran.dg/namelist_16.f90   -Os  execution test
>     gfortran.dg/namelist_95.f90   -O0  execution test
>     gfortran.dg/namelist_95.f90   -O1  execution test
>     gfortran.dg/namelist_95.f90   -O2  execution test
>     gfortran.dg/namelist_95.f90   -O3 -fomit-frame-pointer -funroll-loops
> -fpeel-loops -ftracer -finline-functions  execution test
>     gfortran.dg/namelist_95.f90   -O3 -g  execution test
>     gfortran.dg/namelist_95.f90   -Os  execution test
>     gfortran.dg/real_const_3.f90   -O0  execution test
>     gfortran.dg/real_const_3.f90   -O1  execution test
>     gfortran.dg/real_const_3.f90   -O2  execution test
>     gfortran.dg/real_const_3.f90   -O3 -fomit-frame-pointer -funroll-loops
> -fpeel-loops -ftracer -finline-functions  execution test
>     gfortran.dg/real_const_3.f90   -O3 -g  execution test
>     gfortran.dg/real_const_3.f90   -Os  execution test
>
>
> When defaulting to thumb:
> target arm-none-linux-gnueabi
> --with-mode thumb
> --with-cpu cortex-a9
> I get these new FAILs:
>     gfortran.dg/f2003_io_5.f03   -O0  execution test
>     gfortran.dg/f2003_io_5.f03   -O1  execution test
>     gfortran.dg/f2003_io_5.f03   -O2  execution test
>     gfortran.dg/f2003_io_5.f03   -O3 -fomit-frame-pointer -funroll-loops
> -fpeel-loops -ftracer -finline-functions  execution test
>     gfortran.dg/f2003_io_5.f03   -O3 -g  execution test
>     gfortran.dg/f2003_io_5.f03   -Os  execution test
>     gfortran.dg/real_const_3.f90   -O0  execution test
>     gfortran.dg/real_const_3.f90   -O1  execution test
>     gfortran.dg/real_const_3.f90   -O2  execution test
>     gfortran.dg/real_const_3.f90   -O3 -fomit-frame-pointer -funroll-loops
> -fpeel-loops -ftracer -finline-functions  execution test
>     gfortran.dg/real_const_3.f90   -O3 -g  execution test
>     gfortran.dg/real_const_3.f90   -Os  execution test
>
> This is the most recent validation result I have so far, so maybe you already
> fixed the problem?

This sounds very like https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00170.html
Let me know if you see any remaining failures after that though.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [29/32] Remove global call sets: sched-deps.c
  2019-10-04 14:35       ` Richard Sandiford
@ 2019-10-04 14:37         ` Christophe Lyon
  2019-10-07 13:29         ` Christophe Lyon
  1 sibling, 0 replies; 100+ messages in thread
From: Christophe Lyon @ 2019-10-04 14:37 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Jeff Law, gcc Patches

On Fri, 4 Oct 2019 at 16:35, Richard Sandiford <richard.sandiford@arm.com>
wrote:

> Christophe Lyon <christophe.lyon@linaro.org> writes:
> > On Mon, 30 Sep 2019 at 00:20, Jeff Law <law@redhat.com> wrote:
> >
> >     On 9/11/19 1:17 PM, Richard Sandiford wrote:
> >     > This is a straight replacement of an existing "full or partial"
> >     > call-clobber check.
> >     >
> >     >
> >     > 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> >     >
> >     > gcc/
> >     >       * sched-deps.c (deps_analyze_insn): Use the ABI of the target
> >     >       function to test whether a register is fully or partly
> clobbered.
> >     OK
> >     jeff
> >
> >
> > Hi Richard,
> >
> > My testing shows regressions on arm after you applied this patch
> (r276335):
> > For instance on arm-none-linux-gnueabi
> > --with-mode arm
> > --with-cpu cortex-a9
> > FAIL:  gcc.dg/strlenopt-18g.c execution test
> >
> > If you force -march=armv5t via RUNTESTFLAGS, there's an additional
> failure:
> > FAIL: gcc.dg/strlenopt-19.c execution test
> >
> > In fortran, I see different sets of regressions depending on arm vs
> thumb mode.
> > target arm-none-linux-gnueabi
> > --with-mode arm
> > --with-cpu cortex-a9
> > I get these new FAILs:
> >     gfortran.dg/char4_iunit_1.f03   -O0  execution test
> >     gfortran.dg/char4_iunit_1.f03   -O1  execution test
> >     gfortran.dg/char4_iunit_1.f03   -O2  execution test
> >     gfortran.dg/char4_iunit_1.f03   -O3 -fomit-frame-pointer
> -funroll-loops
> > -fpeel-loops -ftracer -finline-functions  execution test
> >     gfortran.dg/char4_iunit_1.f03   -O3 -g  execution test
> >     gfortran.dg/char4_iunit_1.f03   -Os  execution test
> >     gfortran.dg/namelist_16.f90   -O0  execution test
> >     gfortran.dg/namelist_16.f90   -O1  execution test
> >     gfortran.dg/namelist_16.f90   -O2  execution test
> >     gfortran.dg/namelist_16.f90   -O3 -fomit-frame-pointer -funroll-loops
> > -fpeel-loops -ftracer -finline-functions  execution test
> >     gfortran.dg/namelist_16.f90   -O3 -g  execution test
> >     gfortran.dg/namelist_16.f90   -Os  execution test
> >     gfortran.dg/namelist_95.f90   -O0  execution test
> >     gfortran.dg/namelist_95.f90   -O1  execution test
> >     gfortran.dg/namelist_95.f90   -O2  execution test
> >     gfortran.dg/namelist_95.f90   -O3 -fomit-frame-pointer -funroll-loops
> > -fpeel-loops -ftracer -finline-functions  execution test
> >     gfortran.dg/namelist_95.f90   -O3 -g  execution test
> >     gfortran.dg/namelist_95.f90   -Os  execution test
> >     gfortran.dg/real_const_3.f90   -O0  execution test
> >     gfortran.dg/real_const_3.f90   -O1  execution test
> >     gfortran.dg/real_const_3.f90   -O2  execution test
> >     gfortran.dg/real_const_3.f90   -O3 -fomit-frame-pointer
> -funroll-loops
> > -fpeel-loops -ftracer -finline-functions  execution test
> >     gfortran.dg/real_const_3.f90   -O3 -g  execution test
> >     gfortran.dg/real_const_3.f90   -Os  execution test
> >
> >
> > When defaulting to thumb:
> > target arm-none-linux-gnueabi
> > --with-mode thumb
> > --with-cpu cortex-a9
> > I get these new FAILs:
> >     gfortran.dg/f2003_io_5.f03   -O0  execution test
> >     gfortran.dg/f2003_io_5.f03   -O1  execution test
> >     gfortran.dg/f2003_io_5.f03   -O2  execution test
> >     gfortran.dg/f2003_io_5.f03   -O3 -fomit-frame-pointer -funroll-loops
> > -fpeel-loops -ftracer -finline-functions  execution test
> >     gfortran.dg/f2003_io_5.f03   -O3 -g  execution test
> >     gfortran.dg/f2003_io_5.f03   -Os  execution test
> >     gfortran.dg/real_const_3.f90   -O0  execution test
> >     gfortran.dg/real_const_3.f90   -O1  execution test
> >     gfortran.dg/real_const_3.f90   -O2  execution test
> >     gfortran.dg/real_const_3.f90   -O3 -fomit-frame-pointer
> -funroll-loops
> > -fpeel-loops -ftracer -finline-functions  execution test
> >     gfortran.dg/real_const_3.f90   -O3 -g  execution test
> >     gfortran.dg/real_const_3.f90   -Os  execution test
> >
> > This is the most recent validation result I have so far, so maybe you
> already
> > fixed the problem?
>
> This sounds very like
> https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00170.html
> Let me know if you see any remaining failures after that though.
>
Sure, thanks for the quick reply!


> Thanks,
> Richard
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [21/32] Remove global call sets: LRA
  2019-09-11 19:14 ` [21/32] Remove global call sets: LRA Richard Sandiford
  2019-09-30 15:29   ` Jeff Law
@ 2019-10-04 18:03   ` H.J. Lu
  2019-10-04 21:52     ` H.J. Lu
  1 sibling, 1 reply; 100+ messages in thread
From: H.J. Lu @ 2019-10-04 18:03 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: GCC Patches

On Wed, Sep 11, 2019 at 12:14 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> lra_reg has an actual_call_used_reg_set field that is only used during
> inheritance.  This in turn required a special lra_create_live_ranges
> pass for flag_ipa_ra to set up this field.  This patch instead makes
> the inheritance code do its own live register tracking, using the
> same ABI-mask-and-clobber-set pair as for IRA.
>
> Tracking ABIs simplifies (and cheapens) the logic in lra-lives.c and
> means we no longer need a separate path for -fipa-ra.  It also means
> we can remove TARGET_RETURN_CALL_WITH_MAX_CLOBBERS.
>
> The patch also strengthens the sanity check in lra_assigns so that
> we check that reg_renumber is consistent with the whole conflict set,
> not just the call-clobbered registers.
>
>
> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * target.def (return_call_with_max_clobbers): Delete.
>         * doc/tm.texi.in (TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): Delete.
>         * doc/tm.texi: Regenerate.
>         * config/aarch64/aarch64.c (aarch64_return_call_with_max_clobbers)
>         (TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): Delete.
>         * lra-int.h (lra_reg::actual_call_used_reg_set): Delete.
>         (lra_reg::call_insn): Delete.
>         * lra.c: Include function-abi.h.
>         (initialize_lra_reg_info_element): Don't initialize the fields above.
>         (lra): Use crtl->abi to test whether the current function needs to
>         save a register in the prologue.  Remove special pre-inheritance
>         lra_create_live_ranges pass for flag_ipa_ra.
>         * lra-assigns.c: Include function-abi.h
>         (find_hard_regno_for_1): Use crtl->abi to test whether the current
>         function needs to save a register in the prologue.
>         (lra_assign): Assert that registers aren't allocated to a
>         conflicting register, rather than checking only for overlaps
>         with call_used_or_fixed_regs.  Do this even for flag_ipa_ra,
>         and for registers that are not live across a call.
>         * lra-constraints.c (last_call_for_abi): New variable.
>         (full_and_partial_call_clobbers): Likewise.
>         (setup_next_usage_insn): Remove the register from
>         full_and_partial_call_clobbers.
>         (need_for_call_save_p): Use call_clobbered_in_region_p to test
>         whether the register needs a caller save.
>         (need_for_split_p): Use full_and_partial_reg_clobbers instead
>         of call_used_or_fixed_regs.
>         (inherit_in_ebb): Initialize and maintain last_call_for_abi and
>         full_and_partial_call_clobbers.
>         * lra-lives.c (check_pseudos_live_through_calls): Replace
>         last_call_used_reg_set and call_insn arguments with an abi argument.
>         Remove handling of lra_reg::call_insn.  Use function_abi::mode_clobbers
>         as the set of conflicting registers.
>         (calls_have_same_clobbers_p): Delete.
>         (process_bb_lives): Track the ABI of the last call instead of an
>         insn/HARD_REG_SET pair.  Update calls to
>         check_pseudos_live_through_calls.  Use eh_edge_abi to calculate
>         the set of registers that could be clobbered by an EH edge.
>         Include partially-clobbered as well as fully-clobbered registers.
>         (lra_create_live_ranges_1): Don't initialize lra_reg::call_insn.
>         * lra-remat.c: Include function-abi.h.
>         (call_used_regs_arr_len, call_used_regs_arr): Delete.
>         (set_bb_regs): Use call_insn_abi to get the set of call-clobbered
>         registers and bitmap_view to combine them into dead_regs.
>         (call_used_input_regno_present_p): Take a function_abi argument
>         and use it to test whether a register is call-clobbered.
>         (calculate_gen_cands): Use call_insn_abi to get the ABI of the
>         call insn target.  Update the call to call_used_input_regno_present_p.
>         (do_remat): Likewise.
>         (lra_remat): Remove the initialization of call_used_regs_arr_len
>         and call_used_regs_arr.

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91994

-- 
H.J.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [21/32] Remove global call sets: LRA
  2019-10-04 18:03   ` H.J. Lu
@ 2019-10-04 21:52     ` H.J. Lu
  2019-10-05 13:33       ` Richard Sandiford
  0 siblings, 1 reply; 100+ messages in thread
From: H.J. Lu @ 2019-10-04 21:52 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: GCC Patches

On Fri, Oct 4, 2019 at 11:03 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Wed, Sep 11, 2019 at 12:14 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
> >
> > lra_reg has an actual_call_used_reg_set field that is only used during
> > inheritance.  This in turn required a special lra_create_live_ranges
> > pass for flag_ipa_ra to set up this field.  This patch instead makes
> > the inheritance code do its own live register tracking, using the
> > same ABI-mask-and-clobber-set pair as for IRA.
> >
> > Tracking ABIs simplifies (and cheapens) the logic in lra-lives.c and
> > means we no longer need a separate path for -fipa-ra.  It also means
> > we can remove TARGET_RETURN_CALL_WITH_MAX_CLOBBERS.
> >
> > The patch also strengthens the sanity check in lra_assigns so that
> > we check that reg_renumber is consistent with the whole conflict set,
> > not just the call-clobbered registers.
> >
> >
> > 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> >
> > gcc/
> >         * target.def (return_call_with_max_clobbers): Delete.
> >         * doc/tm.texi.in (TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): Delete.
> >         * doc/tm.texi: Regenerate.
> >         * config/aarch64/aarch64.c (aarch64_return_call_with_max_clobbers)
> >         (TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): Delete.
> >         * lra-int.h (lra_reg::actual_call_used_reg_set): Delete.
> >         (lra_reg::call_insn): Delete.
> >         * lra.c: Include function-abi.h.
> >         (initialize_lra_reg_info_element): Don't initialize the fields above.
> >         (lra): Use crtl->abi to test whether the current function needs to
> >         save a register in the prologue.  Remove special pre-inheritance
> >         lra_create_live_ranges pass for flag_ipa_ra.
> >         * lra-assigns.c: Include function-abi.h
> >         (find_hard_regno_for_1): Use crtl->abi to test whether the current
> >         function needs to save a register in the prologue.
> >         (lra_assign): Assert that registers aren't allocated to a
> >         conflicting register, rather than checking only for overlaps
> >         with call_used_or_fixed_regs.  Do this even for flag_ipa_ra,
> >         and for registers that are not live across a call.
> >         * lra-constraints.c (last_call_for_abi): New variable.
> >         (full_and_partial_call_clobbers): Likewise.
> >         (setup_next_usage_insn): Remove the register from
> >         full_and_partial_call_clobbers.
> >         (need_for_call_save_p): Use call_clobbered_in_region_p to test
> >         whether the register needs a caller save.
> >         (need_for_split_p): Use full_and_partial_reg_clobbers instead
> >         of call_used_or_fixed_regs.
> >         (inherit_in_ebb): Initialize and maintain last_call_for_abi and
> >         full_and_partial_call_clobbers.
> >         * lra-lives.c (check_pseudos_live_through_calls): Replace
> >         last_call_used_reg_set and call_insn arguments with an abi argument.
> >         Remove handling of lra_reg::call_insn.  Use function_abi::mode_clobbers
> >         as the set of conflicting registers.
> >         (calls_have_same_clobbers_p): Delete.
> >         (process_bb_lives): Track the ABI of the last call instead of an
> >         insn/HARD_REG_SET pair.  Update calls to
> >         check_pseudos_live_through_calls.  Use eh_edge_abi to calculate
> >         the set of registers that could be clobbered by an EH edge.
> >         Include partially-clobbered as well as fully-clobbered registers.
> >         (lra_create_live_ranges_1): Don't initialize lra_reg::call_insn.
> >         * lra-remat.c: Include function-abi.h.
> >         (call_used_regs_arr_len, call_used_regs_arr): Delete.
> >         (set_bb_regs): Use call_insn_abi to get the set of call-clobbered
> >         registers and bitmap_view to combine them into dead_regs.
> >         (call_used_input_regno_present_p): Take a function_abi argument
> >         and use it to test whether a register is call-clobbered.
> >         (calculate_gen_cands): Use call_insn_abi to get the ABI of the
> >         call insn target.  Update the call to call_used_input_regno_present_p.
> >         (do_remat): Likewise.
> >         (lra_remat): Remove the initialization of call_used_regs_arr_len
> >         and call_used_regs_arr.
>
> This caused:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91994
>

This change doesn't work with -mzeroupper.  When -mzeroupper is used,
upper bits of vector registers are clobbered upon callee return if any
YMM/ZMM registers are used in the callee.  Even if YMM7 isn't used, upper
bits of YMM7 can still be clobbered by vzeroupper when YMM1 is used.

-- 
H.J.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [21/32] Remove global call sets: LRA
  2019-10-04 21:52     ` H.J. Lu
@ 2019-10-05 13:33       ` Richard Sandiford
  0 siblings, 0 replies; 100+ messages in thread
From: Richard Sandiford @ 2019-10-05 13:33 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GCC Patches

"H.J. Lu" <hjl.tools@gmail.com> writes:
> On Fri, Oct 4, 2019 at 11:03 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>> On Wed, Sep 11, 2019 at 12:14 PM Richard Sandiford
>> <richard.sandiford@arm.com> wrote:
>> >
>> > lra_reg has an actual_call_used_reg_set field that is only used during
>> > inheritance.  This in turn required a special lra_create_live_ranges
>> > pass for flag_ipa_ra to set up this field.  This patch instead makes
>> > the inheritance code do its own live register tracking, using the
>> > same ABI-mask-and-clobber-set pair as for IRA.
>> >
>> > Tracking ABIs simplifies (and cheapens) the logic in lra-lives.c and
>> > means we no longer need a separate path for -fipa-ra.  It also means
>> > we can remove TARGET_RETURN_CALL_WITH_MAX_CLOBBERS.
>> >
>> > The patch also strengthens the sanity check in lra_assigns so that
>> > we check that reg_renumber is consistent with the whole conflict set,
>> > not just the call-clobbered registers.
>> >
>> >
>> > 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
>> >
>> > gcc/
>> >         * target.def (return_call_with_max_clobbers): Delete.
>> >         * doc/tm.texi.in (TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): Delete.
>> >         * doc/tm.texi: Regenerate.
>> >         * config/aarch64/aarch64.c (aarch64_return_call_with_max_clobbers)
>> >         (TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): Delete.
>> >         * lra-int.h (lra_reg::actual_call_used_reg_set): Delete.
>> >         (lra_reg::call_insn): Delete.
>> >         * lra.c: Include function-abi.h.
>> >         (initialize_lra_reg_info_element): Don't initialize the fields above.
>> >         (lra): Use crtl->abi to test whether the current function needs to
>> >         save a register in the prologue.  Remove special pre-inheritance
>> >         lra_create_live_ranges pass for flag_ipa_ra.
>> >         * lra-assigns.c: Include function-abi.h
>> >         (find_hard_regno_for_1): Use crtl->abi to test whether the current
>> >         function needs to save a register in the prologue.
>> >         (lra_assign): Assert that registers aren't allocated to a
>> >         conflicting register, rather than checking only for overlaps
>> >         with call_used_or_fixed_regs.  Do this even for flag_ipa_ra,
>> >         and for registers that are not live across a call.
>> >         * lra-constraints.c (last_call_for_abi): New variable.
>> >         (full_and_partial_call_clobbers): Likewise.
>> >         (setup_next_usage_insn): Remove the register from
>> >         full_and_partial_call_clobbers.
>> >         (need_for_call_save_p): Use call_clobbered_in_region_p to test
>> >         whether the register needs a caller save.
>> >         (need_for_split_p): Use full_and_partial_reg_clobbers instead
>> >         of call_used_or_fixed_regs.
>> >         (inherit_in_ebb): Initialize and maintain last_call_for_abi and
>> >         full_and_partial_call_clobbers.
>> >         * lra-lives.c (check_pseudos_live_through_calls): Replace
>> >         last_call_used_reg_set and call_insn arguments with an abi argument.
>> >         Remove handling of lra_reg::call_insn.  Use function_abi::mode_clobbers
>> >         as the set of conflicting registers.
>> >         (calls_have_same_clobbers_p): Delete.
>> >         (process_bb_lives): Track the ABI of the last call instead of an
>> >         insn/HARD_REG_SET pair.  Update calls to
>> >         check_pseudos_live_through_calls.  Use eh_edge_abi to calculate
>> >         the set of registers that could be clobbered by an EH edge.
>> >         Include partially-clobbered as well as fully-clobbered registers.
>> >         (lra_create_live_ranges_1): Don't initialize lra_reg::call_insn.
>> >         * lra-remat.c: Include function-abi.h.
>> >         (call_used_regs_arr_len, call_used_regs_arr): Delete.
>> >         (set_bb_regs): Use call_insn_abi to get the set of call-clobbered
>> >         registers and bitmap_view to combine them into dead_regs.
>> >         (call_used_input_regno_present_p): Take a function_abi argument
>> >         and use it to test whether a register is call-clobbered.
>> >         (calculate_gen_cands): Use call_insn_abi to get the ABI of the
>> >         call insn target.  Update the call to call_used_input_regno_present_p.
>> >         (do_remat): Likewise.
>> >         (lra_remat): Remove the initialization of call_used_regs_arr_len
>> >         and call_used_regs_arr.
>>
>> This caused:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91994

Thanks for reducing & tracking down the underlying cause.

> This change doesn't work with -mzeroupper.  When -mzeroupper is used,
> upper bits of vector registers are clobbered upon callee return if any
> YMM/ZMM registers are used in callee.  Even if YMM7 isn't used, upper
> bits of YMM7 can still be clobbered by vzeroupper when YMM1 is used.

The problem here really is that the pattern is just:

(define_insn "avx_vzeroupper"
  [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)]
  "TARGET_AVX"
  "vzeroupper"
  ...)

and so its effect on the registers isn't modelled at all in rtl.
Maybe one option would be to add a parallel:

  (set (reg:V2DI N) (reg:V2DI N))

for each register.  Or we could do something like I did for the SVE
tlsdesc calls, although here that would mean using a call pattern for
something that isn't really a call.  Or we could reinstate clobber_high
and use that, but that's very much third out of three.
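
Concretely, the first option would make the pattern look something like
this (register numbers purely illustrative):

(parallel
  [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)
   (set (reg:V2DI 20) (reg:V2DI 20))
   (set (reg:V2DI 21) (reg:V2DI 21))
   ...])

so that the low 128 bits of each register are visibly preserved and
anything wider is visibly clobbered.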

I don't think we should add target hooks to get around this, since that's
IMO papering over the issue.

I'll try the parallel set thing first.

Richard

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [29/32] Remove global call sets: sched-deps.c
  2019-10-04 14:35       ` Richard Sandiford
  2019-10-04 14:37         ` Christophe Lyon
@ 2019-10-07 13:29         ` Christophe Lyon
  1 sibling, 0 replies; 100+ messages in thread
From: Christophe Lyon @ 2019-10-07 13:29 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Jeff Law, gcc Patches

On Fri, 4 Oct 2019 at 16:35, Richard Sandiford <richard.sandiford@arm.com>
wrote:

> Christophe Lyon <christophe.lyon@linaro.org> writes:
> > On Mon, 30 Sep 2019 at 00:20, Jeff Law <law@redhat.com> wrote:
> >
> >     On 9/11/19 1:17 PM, Richard Sandiford wrote:
> >     > This is a straight replacement of an existing "full or partial"
> >     > call-clobber check.
> >     >
> >     >
> >     > 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
> >     >
> >     > gcc/
> >     >       * sched-deps.c (deps_analyze_insn): Use the ABI of the target
> >     >       function to test whether a register is fully or partly clobbered.
> >     OK
> >     jeff
> >
> >
> > Hi Richard,
> >
> > My testing shows regressions on arm after you applied this patch (r276335):
> > For instance on arm-none-linux-gnueabi
> > --with-mode arm
> > --with-cpu cortex-a9
> > FAIL:  gcc.dg/strlenopt-18g.c execution test
> >
> > If you force -march=armv5t via RUNTESTFLAGS, there's an additional failure:
> > FAIL: gcc.dg/strlenopt-19.c execution test
> >
> > In fortran, I see different sets of regressions depending on arm vs thumb mode.
> > target arm-none-linux-gnueabi
> > --with-mode arm
> > --with-cpu cortex-a9
> > I get these new FAILs:
> >     gfortran.dg/char4_iunit_1.f03   -O0  execution test
> >     gfortran.dg/char4_iunit_1.f03   -O1  execution test
> >     gfortran.dg/char4_iunit_1.f03   -O2  execution test
> >     gfortran.dg/char4_iunit_1.f03   -O3 -fomit-frame-pointer -funroll-loops
> > -fpeel-loops -ftracer -finline-functions  execution test
> >     gfortran.dg/char4_iunit_1.f03   -O3 -g  execution test
> >     gfortran.dg/char4_iunit_1.f03   -Os  execution test
> >     gfortran.dg/namelist_16.f90   -O0  execution test
> >     gfortran.dg/namelist_16.f90   -O1  execution test
> >     gfortran.dg/namelist_16.f90   -O2  execution test
> >     gfortran.dg/namelist_16.f90   -O3 -fomit-frame-pointer -funroll-loops
> > -fpeel-loops -ftracer -finline-functions  execution test
> >     gfortran.dg/namelist_16.f90   -O3 -g  execution test
> >     gfortran.dg/namelist_16.f90   -Os  execution test
> >     gfortran.dg/namelist_95.f90   -O0  execution test
> >     gfortran.dg/namelist_95.f90   -O1  execution test
> >     gfortran.dg/namelist_95.f90   -O2  execution test
> >     gfortran.dg/namelist_95.f90   -O3 -fomit-frame-pointer -funroll-loops
> > -fpeel-loops -ftracer -finline-functions  execution test
> >     gfortran.dg/namelist_95.f90   -O3 -g  execution test
> >     gfortran.dg/namelist_95.f90   -Os  execution test
> >     gfortran.dg/real_const_3.f90   -O0  execution test
> >     gfortran.dg/real_const_3.f90   -O1  execution test
> >     gfortran.dg/real_const_3.f90   -O2  execution test
> >     gfortran.dg/real_const_3.f90   -O3 -fomit-frame-pointer -funroll-loops
> > -fpeel-loops -ftracer -finline-functions  execution test
> >     gfortran.dg/real_const_3.f90   -O3 -g  execution test
> >     gfortran.dg/real_const_3.f90   -Os  execution test
> >
> >
> > When defaulting to thumb:
> > target arm-none-linux-gnueabi
> > --with-mode thumb
> > --with-cpu cortex-a9
> > I get these new FAILs:
> >     gfortran.dg/f2003_io_5.f03   -O0  execution test
> >     gfortran.dg/f2003_io_5.f03   -O1  execution test
> >     gfortran.dg/f2003_io_5.f03   -O2  execution test
> >     gfortran.dg/f2003_io_5.f03   -O3 -fomit-frame-pointer -funroll-loops
> > -fpeel-loops -ftracer -finline-functions  execution test
> >     gfortran.dg/f2003_io_5.f03   -O3 -g  execution test
> >     gfortran.dg/f2003_io_5.f03   -Os  execution test
> >     gfortran.dg/real_const_3.f90   -O0  execution test
> >     gfortran.dg/real_const_3.f90   -O1  execution test
> >     gfortran.dg/real_const_3.f90   -O2  execution test
> >     gfortran.dg/real_const_3.f90   -O3 -fomit-frame-pointer -funroll-loops
> > -fpeel-loops -ftracer -finline-functions  execution test
> >     gfortran.dg/real_const_3.f90   -O3 -g  execution test
> >     gfortran.dg/real_const_3.f90   -Os  execution test
> >
> > This is the most recent validation result I have so far, so maybe you
> > already fixed the problem?
>
> This sounds very like
> https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00170.html
> Let me know if you see any remaining failures after that though.
>
>
It looks OK now.
Note that r276489 fixed the strlenopt regressions, while the fortran
regressions seem to have been fixed between r276457 and r276488 (that is,
just before the above patch).

Thanks,

Christophe

> Thanks,
> Richard
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [04/32] [x86] Robustify vzeroupper handling across calls
  2019-10-01 10:14     ` Uros Bizjak
@ 2019-10-08 18:17       ` Uros Bizjak
  0 siblings, 0 replies; 100+ messages in thread
From: Uros Bizjak @ 2019-10-08 18:17 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 3486 bytes --]

The following patch uses the correct SSE register class; vzeroupper
operates only on the lower 16 (8 on 32-bit targets) SSE registers.

2019-10-08  Uroš Bizjak  <ubizjak@gmail.com>

    PR target/91994
    * config/i386/i386.c (x86_avx_u128_mode_needed): Use SSE_REG
    instead of ALL_SSE_REG to check if function call preserves some
    256-bit SSE registers.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.

On Tue, Oct 1, 2019 at 12:14 PM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Wed, Sep 25, 2019 at 5:48 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>
> > > The comment suggests that this code is only needed for Win64 and that
> > > not testing for Win64 is just a simplification.  But in practice it was
> > > needed for correctness on GNU/Linux and other targets too, since without
> > > it the RA would be able to keep 256-bit and 512-bit values in SSE
> > > registers across calls that are known not to clobber them.
> > >
> > > This patch conservatively treats calls as AVX_U128_ANY if the RA can see
> > > that some SSE registers are not touched by a call.  There are then no
> > > regressions if the ix86_hard_regno_call_part_clobbered check is disabled
> > > for GNU/Linux (not something we should do, was just for testing).
>
> If RA can see that some SSE regs are not touched by the call, then we
> are sure that the called function is part of the current TU. In this
> case, the called function will be compiled using VEX instructions,
> where there is no AVX-SSE transition penalty. So, skipping VZEROUPPER
> is beneficial here.
>
> Uros.
>
> > > If in fact we want -fipa-ra to pretend that all functions clobber
> > > SSE registers above 128 bits, it'd certainly be possible to arrange
> > > that.  But IMO that would be an optimisation decision, whereas what
> > > the patch is fixing is a correctness decision.  So I think we should
> > > have this check even so.
> >
> > 2019-09-25  Richard Sandiford  <richard.sandiford@arm.com>
> >
> > gcc/
> >         * config/i386/i386.c: Include function-abi.h.
> >         (ix86_avx_u128_mode_needed): Treat function calls as AVX_U128_ANY
> >         if they preserve some 256-bit or 512-bit SSE registers.
> >
> > Index: gcc/config/i386/i386.c
> > ===================================================================
> > --- gcc/config/i386/i386.c      2019-09-25 16:47:48.000000000 +0100
> > +++ gcc/config/i386/i386.c      2019-09-25 16:47:49.089962608 +0100
> > @@ -95,6 +95,7 @@ #define IN_TARGET_CODE 1
> >  #include "i386-builtins.h"
> >  #include "i386-expand.h"
> >  #include "i386-features.h"
> > +#include "function-abi.h"
> >
> >  /* This file should be included last.  */
> >  #include "target-def.h"
> > @@ -13511,6 +13512,15 @@ ix86_avx_u128_mode_needed (rtx_insn *ins
> >             }
> >         }
> >
> > +      /* If the function is known to preserve some SSE registers,
> > +        RA and previous passes can legitimately rely on that for
> > +        modes wider than 256 bits.  It's only safe to issue a
> > +        vzeroupper if all SSE registers are clobbered.  */
> > +      const function_abi &abi = insn_callee_abi (insn);
> > +      if (!hard_reg_set_subset_p (reg_class_contents[ALL_SSE_REGS],
> > +                                 abi.mode_clobbers (V4DImode)))
> > +       return AVX_U128_ANY;
> > +
> >        return AVX_U128_CLEAN;
> >      }
> >

[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 605 bytes --]

Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 276677)
+++ config/i386/i386.c	(working copy)
@@ -13530,7 +13530,7 @@ ix86_avx_u128_mode_needed (rtx_insn *insn)
 	 modes wider than 256 bits.  It's only safe to issue a
 	 vzeroupper if all SSE registers are clobbered.  */
       const function_abi &abi = insn_callee_abi (insn);
-      if (!hard_reg_set_subset_p (reg_class_contents[ALL_SSE_REGS],
+      if (!hard_reg_set_subset_p (reg_class_contents[SSE_REGS],
 				  abi.mode_clobbers (V4DImode)))
 	return AVX_U128_ANY;
 

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [12/32] Remove global call sets: cselib.c
  2019-09-29 21:05   ` Jeff Law
@ 2019-10-29  9:20     ` Martin Liška
  0 siblings, 0 replies; 100+ messages in thread
From: Martin Liška @ 2019-10-29  9:20 UTC (permalink / raw)
  To: Jeff Law, gcc-patches, richard.sandiford

On 9/29/19 11:05 PM, Jeff Law wrote:
> On 9/11/19 1:09 PM, Richard Sandiford wrote:
>> cselib_invalidate_regno is a no-op if REG_VALUES (i) is null,
>> so we can check that first.  Then, if we know what mode the register
>> currently has, we can check whether it's clobbered in that mode.
>>
>> Using GET_MODE (values->elt->val_rtx) to get the mode of the last
>> set is taken from cselib_reg_set_mode.
>>
>>
>> 2019-09-11  Richard Sandiford  <richard.sandiford@arm.com>
>>
>> gcc/
>> 	* cselib.c (cselib_process_insn): If we know what mode a
>> 	register was set in, check whether it is clobbered in that
>> 	mode by a call.  Only fall back to reg_raw_mode if that fails.
> OK
> jeff
> 

Caused https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92264

Martin

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [21/32] Remove global call sets: LRA
  2019-10-06 14:32 ` Richard Sandiford
@ 2019-10-07  6:04   ` Uros Bizjak
  0 siblings, 0 replies; 100+ messages in thread
From: Uros Bizjak @ 2019-10-07  6:04 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, H. J. Lu

On Sun, Oct 6, 2019 at 4:32 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Uros Bizjak <ubizjak@gmail.com> writes:
> >>>> This caused:
> >>>>
> >>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91994
> >>
> >> Thanks for reducing & tracking down the underlying cause.
> >>
> >>> This change doesn't work with -mzeroupper.  When -mzeroupper is used,
> >>> upper bits of vector registers are clobbered upon callee return if any
> >>> YMM/ZMM registers are used in callee.  Even if YMM7 isn't used, upper
> >>> bits of YMM7 can still be clobbered by vzeroupper when YMM1 is used.
> >>
> >> The problem here really is that the pattern is just:
> >>
> >> (define_insn "avx_vzeroupper"
> >>   [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)]
> >>   "TARGET_AVX"
> >>   "vzeroupper"
> >>   ...)
> >>
> >> and so its effect on the registers isn't modelled at all in rtl.
> >> Maybe one option would be to add a parallel:
> >>
> >>   (set (reg:V2DI N) (reg:V2DI N))
> >>
> >> for each register.  Or we could do something like I did for the SVE
> >> tlsdesc calls, although here that would mean using a call pattern for
> >> something that isn't really a call.  Or we could reinstate clobber_high
> >> and use that, but that's very much third out of three.
> >>
> >> I don't think we should add target hooks to get around this, since that's
> >> IMO papering over the issue.
> >>
> >> I'll try the parallel set thing first.
> >
> > Please note that the vzeroupper insertion pass runs after register
> > allocation, so in effect the vzeroupper pattern is hidden from the register
> > allocator.
>
> Right, but even post-RA passes rely on the register usage being accurate.
> Same for collect_fn_hard_reg_usage, which is the issue here.
>
> The info collected by collect_fn_hard_reg_usage was always wrong for
> vzeroupper.  What changed with my patch is that we now use that info
> for partly call-clobbered registers as well as "normally" clobbered
> registers.  So this is another instance of a problem that was previously
> being masked by having ix86_hard_regno_call_part_clobbered enforce Win64
> rules for all ABIs.
>
> My first idea of adding:
>
>   (set (reg:V2DI N) (reg:V2DI N))
>
> for all clobbered registers didn't work well because it left previously-
> dead registers upwards exposed (obvious in hindsight).  And the second
> idea of using a fake call would require too many "is this really a call?"
> hacks.
>
> So in the end I went for a subpass that chooses between:
>
>   (set (reg:V2DI N) (reg:V2DI N))
>
> and
>
>   (clobber (reg:V2DI N))
>
> depending on whether register N is live or not.  This fixes the testcase
> and doesn't seem to regress code quality for the tests I've tried.
>
> Tested on x86_64-linux-gnu.  OK to install?
>
> Richard
>
>
> 2019-10-06  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         PR target/91994
>         * config/i386/sse.md (avx_vzeroupper): Turn into a define_expand
>         and wrap the unspec_volatile in a parallel.
>         (*avx_vzeroupper): New define_insn.  Use a match_parallel around
>         the unspec_volatile.
>         * config/i386/predicates.md (vzeroupper_pattern): Expect the
>         unspec_volatile to be wrapped in a parallel.
>         * config/i386/i386-features.c (ix86_add_reg_usage_to_vzeroupper)
>         (ix86_add_reg_usage_to_vzerouppers): New functions.
>         (rest_of_handle_insert_vzeroupper): Use them to add register
>         usage information to the vzeroupper instructions.
>
> gcc/testsuite/
>         PR target/91994
>         * gcc.target/i386/pr91994.c: New test.

LGTM.

Thanks,
Uros.

> Index: gcc/config/i386/sse.md
> ===================================================================
> --- gcc/config/i386/sse.md      2019-09-17 15:27:10.214075253 +0100
> +++ gcc/config/i386/sse.md      2019-10-06 15:19:10.062769500 +0100
> @@ -19622,9 +19622,16 @@ (define_insn "*avx_vzeroall"
>     (set_attr "mode" "OI")])
>
>  ;; Clear the upper 128bits of AVX registers, equivalent to a NOP
> -;; if the upper 128bits are unused.
> -(define_insn "avx_vzeroupper"
> -  [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)]
> +;; if the upper 128bits are unused.  Initially we expand the instructions
> +;; as though they had no effect on the SSE registers, but later add SETs and
> +;; CLOBBERs to the PARALLEL to model the real effect.
> +(define_expand "avx_vzeroupper"
> +  [(parallel [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)])]
> +  "TARGET_AVX")
> +
> +(define_insn "*avx_vzeroupper"
> +  [(match_parallel 0 "vzeroupper_pattern"
> +     [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)])]
>    "TARGET_AVX"
>    "vzeroupper"
>    [(set_attr "type" "sse")
> Index: gcc/config/i386/predicates.md
> ===================================================================
> --- gcc/config/i386/predicates.md       2019-09-10 19:56:45.337178032 +0100
> +++ gcc/config/i386/predicates.md       2019-10-06 15:19:10.054769556 +0100
> @@ -1441,8 +1441,9 @@ (define_predicate "vzeroall_pattern"
>
>  ;; return true if OP is a vzeroupper pattern.
>  (define_predicate "vzeroupper_pattern"
> -  (and (match_code "unspec_volatile")
> -       (match_test "XINT (op, 1) == UNSPECV_VZEROUPPER")))
> +  (and (match_code "parallel")
> +       (match_code "unspec_volatile" "a")
> +       (match_test "XINT (XVECEXP (op, 0, 0), 1) == UNSPECV_VZEROUPPER")))
>
>  ;; Return true if OP is an addsub vec_merge operation
>  (define_predicate "addsub_vm_operator"
> Index: gcc/config/i386/i386-features.c
> ===================================================================
> --- gcc/config/i386/i386-features.c     2019-09-21 13:56:08.895934718 +0100
> +++ gcc/config/i386/i386-features.c     2019-10-06 15:19:10.054769556 +0100
> @@ -1757,6 +1757,68 @@ convert_scalars_to_vector (bool timode_p
>    return 0;
>  }
>
> +/* Modify the vzeroupper pattern in INSN so that it describes the effect
> +   that the instruction has on the SSE registers.  LIVE_REGS are the set
> +   of registers that are live across the instruction.
> +
> +   For a live register R we use:
> +
> +     (set (reg:V2DF R) (reg:V2DF R))
> +
> +   which preserves the low 128 bits but clobbers the upper bits.
> +   For a dead register we just use:
> +
> +     (clobber (reg:V2DF R))
> +
> +   which invalidates any previous contents of R and stops R from becoming
> +   live across the vzeroupper in future.  */
> +
> +static void
> +ix86_add_reg_usage_to_vzeroupper (rtx_insn *insn, bitmap live_regs)
> +{
> +  rtx pattern = PATTERN (insn);
> +  unsigned int nregs = TARGET_64BIT ? 16 : 8;
> +  rtvec vec = rtvec_alloc (nregs + 1);
> +  RTVEC_ELT (vec, 0) = XVECEXP (pattern, 0, 0);
> +  for (unsigned int i = 0; i < nregs; ++i)
> +    {
> +      unsigned int regno = GET_SSE_REGNO (i);
> +      rtx reg = gen_rtx_REG (V2DImode, regno);
> +      if (bitmap_bit_p (live_regs, regno))
> +       RTVEC_ELT (vec, i + 1) = gen_rtx_SET (reg, reg);
> +      else
> +       RTVEC_ELT (vec, i + 1) = gen_rtx_CLOBBER (VOIDmode, reg);
> +    }
> +  XVEC (pattern, 0) = vec;
> +  df_insn_rescan (insn);
> +}
> +
> +/* Walk the vzeroupper instructions in the function and annotate them
> +   with the effect that they have on the SSE registers.  */
> +
> +static void
> +ix86_add_reg_usage_to_vzerouppers (void)
> +{
> +  basic_block bb;
> +  rtx_insn *insn;
> +  auto_bitmap live_regs;
> +
> +  df_analyze ();
> +  FOR_EACH_BB_FN (bb, cfun)
> +    {
> +      bitmap_copy (live_regs, df_get_live_out (bb));
> +      df_simulate_initialize_backwards (bb, live_regs);
> +      FOR_BB_INSNS_REVERSE (bb, insn)
> +       {
> +         if (!NONDEBUG_INSN_P (insn))
> +           continue;
> +         if (vzeroupper_pattern (PATTERN (insn), VOIDmode))
> +           ix86_add_reg_usage_to_vzeroupper (insn, live_regs);
> +         df_simulate_one_insn_backwards (bb, insn, live_regs);
> +       }
> +    }
> +}
> +
>  static unsigned int
>  rest_of_handle_insert_vzeroupper (void)
>  {
> @@ -1773,6 +1835,7 @@ rest_of_handle_insert_vzeroupper (void)
>
>    /* Call optimize_mode_switching.  */
>    g->get_passes ()->execute_pass_mode_switching ();
> +  ix86_add_reg_usage_to_vzerouppers ();
>    return 0;
>  }
>
> Index: gcc/testsuite/gcc.target/i386/pr91994.c
> ===================================================================
> --- /dev/null   2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gcc.target/i386/pr91994.c     2019-10-06 15:19:10.062769500 +0100
> @@ -0,0 +1,35 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target avx } */
> +/* { dg-options "-O2 -mavx -mvzeroupper" } */
> +
> +#include "avx-check.h"
> +
> +#include <immintrin.h>
> +
> +__m256i x1, x2, x3;
> +
> +__attribute__ ((noinline))
> +static void
> +foo (void)
> +{
> +  x1 = x2;
> +}
> +
> +void
> +bar (void)
> +{
> +  __m256i x = x1;
> +  foo ();
> +  x3 = x;
> +}
> +
> +__attribute__ ((noinline))
> +void
> +avx_test (void)
> +{
> +  __m256i x = _mm256_set1_epi8 (3);
> +  x1 = x;
> +  bar ();
> +  if (__builtin_memcmp (&x3, &x, sizeof (x)))
> +    __builtin_abort ();
> +}

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [21/32] Remove global call sets: LRA
  2019-10-06  8:45 [21/32] Remove global call sets: LRA Uros Bizjak
@ 2019-10-06 14:32 ` Richard Sandiford
  2019-10-07  6:04   ` Uros Bizjak
  0 siblings, 1 reply; 100+ messages in thread
From: Richard Sandiford @ 2019-10-06 14:32 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, H. J. Lu

Uros Bizjak <ubizjak@gmail.com> writes:
>>>> This caused:
>>>>
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91994
>>
>> Thanks for reducing & tracking down the underlying cause.
>>
>>> This change doesn't work with -mzeroupper.  When -mzeroupper is used,
>>> upper bits of vector registers are clobbered upon callee return if any
>>> YMM/ZMM registers are used in callee.  Even if YMM7 isn't used, upper
>>> bits of YMM7 can still be clobbered by vzeroupper when YMM1 is used.
>>
>> The problem here really is that the pattern is just:
>>
>> (define_insn "avx_vzeroupper"
>>   [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)]
>>   "TARGET_AVX"
>>   "vzeroupper"
>>   ...)
>>
>> and so its effect on the registers isn't modelled at all in rtl.
>> Maybe one option would be to add a parallel:
>>
>>   (set (reg:V2DI N) (reg:V2DI N))
>>
>> for each register.  Or we could do something like I did for the SVE
>> tlsdesc calls, although here that would mean using a call pattern for
>> something that isn't really a call.  Or we could reinstate clobber_high
>> and use that, but that's very much third out of three.
>>
>> I don't think we should add target hooks to get around this, since that's
>> IMO papering over the issue.
>>
>> I'll try the parallel set thing first.
>
> Please note that the vzeroupper insertion pass runs after register
> allocation, so in effect the vzeroupper pattern is hidden from the register
> allocator.

Right, but even post-RA passes rely on the register usage being accurate.
Same for collect_fn_hard_reg_usage, which is the issue here.

The info collected by collect_fn_hard_reg_usage was always wrong for
vzeroupper.  What changed with my patch is that we now use that info
for partly call-clobbered registers as well as "normally" clobbered
registers.  So this is another instance of a problem that was previously
being masked by having ix86_hard_regno_call_part_clobbered enforce Win64
rules for all ABIs.

My first idea of adding:

  (set (reg:V2DI N) (reg:V2DI N))

for all clobbered registers didn't work well because it left previously-
dead registers upwards exposed (obvious in hindsight).  And the second
idea of using a fake call would require too many "is this really a call?"
hacks.

So in the end I went for a subpass that chooses between:

  (set (reg:V2DI N) (reg:V2DI N))

and

  (clobber (reg:V2DI N))

depending on whether register N is live or not.  This fixes the testcase
and doesn't seem to regress code quality for the tests I've tried.

Tested on x86_64-linux-gnu.  OK to install?

Richard


2019-10-06  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	PR target/91994
	* config/i386/sse.md (avx_vzeroupper): Turn into a define_expand
	and wrap the unspec_volatile in a parallel.
	(*avx_vzeroupper): New define_insn.  Use a match_parallel around
	the unspec_volatile.
	* config/i386/predicates.md (vzeroupper_pattern): Expect the
	unspec_volatile to be wrapped in a parallel.
	* config/i386/i386-features.c (ix86_add_reg_usage_to_vzeroupper)
	(ix86_add_reg_usage_to_vzerouppers): New functions.
	(rest_of_handle_insert_vzeroupper): Use them to add register
	usage information to the vzeroupper instructions.

gcc/testsuite/
	PR target/91994
	* gcc.target/i386/pr91994.c: New test.

Index: gcc/config/i386/sse.md
===================================================================
--- gcc/config/i386/sse.md	2019-09-17 15:27:10.214075253 +0100
+++ gcc/config/i386/sse.md	2019-10-06 15:19:10.062769500 +0100
@@ -19622,9 +19622,16 @@ (define_insn "*avx_vzeroall"
    (set_attr "mode" "OI")])
 
 ;; Clear the upper 128bits of AVX registers, equivalent to a NOP
-;; if the upper 128bits are unused.
-(define_insn "avx_vzeroupper"
-  [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)]
+;; if the upper 128bits are unused.  Initially we expand the instructions
+;; as though they had no effect on the SSE registers, but later add SETs and
+;; CLOBBERs to the PARALLEL to model the real effect.
+(define_expand "avx_vzeroupper"
+  [(parallel [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)])]
+  "TARGET_AVX")
+
+(define_insn "*avx_vzeroupper"
+  [(match_parallel 0 "vzeroupper_pattern"
+     [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)])]
   "TARGET_AVX"
   "vzeroupper"
   [(set_attr "type" "sse")
Index: gcc/config/i386/predicates.md
===================================================================
--- gcc/config/i386/predicates.md	2019-09-10 19:56:45.337178032 +0100
+++ gcc/config/i386/predicates.md	2019-10-06 15:19:10.054769556 +0100
@@ -1441,8 +1441,9 @@ (define_predicate "vzeroall_pattern"
 
 ;; return true if OP is a vzeroupper pattern.
 (define_predicate "vzeroupper_pattern"
-  (and (match_code "unspec_volatile")
-       (match_test "XINT (op, 1) == UNSPECV_VZEROUPPER")))
+  (and (match_code "parallel")
+       (match_code "unspec_volatile" "a")
+       (match_test "XINT (XVECEXP (op, 0, 0), 1) == UNSPECV_VZEROUPPER")))
 
 ;; Return true if OP is an addsub vec_merge operation
 (define_predicate "addsub_vm_operator"
Index: gcc/config/i386/i386-features.c
===================================================================
--- gcc/config/i386/i386-features.c	2019-09-21 13:56:08.895934718 +0100
+++ gcc/config/i386/i386-features.c	2019-10-06 15:19:10.054769556 +0100
@@ -1757,6 +1757,68 @@ convert_scalars_to_vector (bool timode_p
   return 0;
 }
 
+/* Modify the vzeroupper pattern in INSN so that it describes the effect
+   that the instruction has on the SSE registers.  LIVE_REGS are the set
+   of registers that are live across the instruction.
+
+   For a live register R we use:
+
+     (set (reg:V2DF R) (reg:V2DF R))
+
+   which preserves the low 128 bits but clobbers the upper bits.
+   For a dead register we just use:
+
+     (clobber (reg:V2DF R))
+
+   which invalidates any previous contents of R and stops R from becoming
+   live across the vzeroupper in future.  */
+
+static void
+ix86_add_reg_usage_to_vzeroupper (rtx_insn *insn, bitmap live_regs)
+{
+  rtx pattern = PATTERN (insn);
+  unsigned int nregs = TARGET_64BIT ? 16 : 8;
+  rtvec vec = rtvec_alloc (nregs + 1);
+  RTVEC_ELT (vec, 0) = XVECEXP (pattern, 0, 0);
+  for (unsigned int i = 0; i < nregs; ++i)
+    {
+      unsigned int regno = GET_SSE_REGNO (i);
+      rtx reg = gen_rtx_REG (V2DImode, regno);
+      if (bitmap_bit_p (live_regs, regno))
+	RTVEC_ELT (vec, i + 1) = gen_rtx_SET (reg, reg);
+      else
+	RTVEC_ELT (vec, i + 1) = gen_rtx_CLOBBER (VOIDmode, reg);
+    }
+  XVEC (pattern, 0) = vec;
+  df_insn_rescan (insn);
+}
+
+/* Walk the vzeroupper instructions in the function and annotate them
+   with the effect that they have on the SSE registers.  */
+
+static void
+ix86_add_reg_usage_to_vzerouppers (void)
+{
+  basic_block bb;
+  rtx_insn *insn;
+  auto_bitmap live_regs;
+
+  df_analyze ();
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      bitmap_copy (live_regs, df_get_live_out (bb));
+      df_simulate_initialize_backwards (bb, live_regs);
+      FOR_BB_INSNS_REVERSE (bb, insn)
+	{
+	  if (!NONDEBUG_INSN_P (insn))
+	    continue;
+	  if (vzeroupper_pattern (PATTERN (insn), VOIDmode))
+	    ix86_add_reg_usage_to_vzeroupper (insn, live_regs);
+	  df_simulate_one_insn_backwards (bb, insn, live_regs);
+	}
+    }
+}
+
 static unsigned int
 rest_of_handle_insert_vzeroupper (void)
 {
@@ -1773,6 +1835,7 @@ rest_of_handle_insert_vzeroupper (void)
 
   /* Call optimize_mode_switching.  */
   g->get_passes ()->execute_pass_mode_switching ();
+  ix86_add_reg_usage_to_vzerouppers ();
   return 0;
 }
 
Index: gcc/testsuite/gcc.target/i386/pr91994.c
===================================================================
--- /dev/null	2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.target/i386/pr91994.c	2019-10-06 15:19:10.062769500 +0100
@@ -0,0 +1,35 @@
+/* { dg-do run } */
+/* { dg-require-effective-target avx } */
+/* { dg-options "-O2 -mavx -mvzeroupper" } */
+
+#include "avx-check.h"
+
+#include <immintrin.h>
+
+__m256i x1, x2, x3;
+
+__attribute__ ((noinline))
+static void
+foo (void)
+{
+  x1 = x2;
+}
+
+void
+bar (void)
+{
+  __m256i x = x1;
+  foo ();
+  x3 = x;
+}
+
+__attribute__ ((noinline))
+void
+avx_test (void)
+{
+  __m256i x = _mm256_set1_epi8 (3);
+  x1 = x;
+  bar ();
+  if (__builtin_memcmp (&x3, &x, sizeof (x)))
+    __builtin_abort ();
+}

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [21/32] Remove global call sets: LRA
@ 2019-10-06  8:45 Uros Bizjak
  2019-10-06 14:32 ` Richard Sandiford
  0 siblings, 1 reply; 100+ messages in thread
From: Uros Bizjak @ 2019-10-06  8:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Sandiford, H. J. Lu

>>> This caused:
>>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91994
>
> Thanks for reducing & tracking down the underlying cause.
>
>> This change doesn't work with -mzeroupper.  When -mzeroupper is used,
>> upper bits of vector registers are clobbered upon callee return if any
>> YMM/ZMM registers are used in callee.  Even if YMM7 isn't used, upper
>> bits of YMM7 can still be clobbered by vzeroupper when YMM1 is used.
>
> The problem here really is that the pattern is just:
>
> (define_insn "avx_vzeroupper"
>   [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)]
>   "TARGET_AVX"
>   "vzeroupper"
>   ...)
>
> and so its effect on the registers isn't modelled at all in rtl.
> Maybe one option would be to add a parallel:
>
>   (set (reg:V2DI N) (reg:V2DI N))
>
> for each register.  Or we could do something like I did for the SVE
> tlsdesc calls, although here that would mean using a call pattern for
> something that isn't really a call.  Or we could reinstate clobber_high
> and use that, but that's very much third out of three.
>
> I don't think we should add target hooks to get around this, since that's
> IMO papering over the issue.
>
> I'll try the parallel set thing first.

Please note that the vzeroupper insertion pass runs after register
allocation, so in effect the vzeroupper pattern is hidden from the register
allocator.

Uros.

^ permalink raw reply	[flat|nested] 100+ messages in thread

end of thread, other threads:[~2019-10-29  9:19 UTC | newest]

Thread overview: 100+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-11 19:02 [00/32] Support multiple ABIs in the same translation unit Richard Sandiford
2019-09-11 19:03 ` [02/32] Add a target hook for getting an ABI from a function type Richard Sandiford
2019-09-29 20:52   ` Jeff Law
2019-09-11 19:03 ` [01/32] Add function_abi.{h,cc} Richard Sandiford
2019-09-29 20:51   ` Jeff Law
2019-09-30  9:19     ` Richard Sandiford
2019-09-30 21:16       ` Jeff Law
2019-09-11 19:04 ` [03/32] Add a function for getting the ABI of a call insn target Richard Sandiford
2019-09-25 15:38   ` Richard Sandiford
2019-09-30 15:52     ` Jeff Law
2019-09-30 16:32       ` Richard Sandiford
2019-09-30 16:46         ` Jeff Law
2019-09-11 19:05 ` [05/32] Pass an ABI identifier to hard_regno_call_part_clobbered Richard Sandiford
2019-09-29 20:58   ` Jeff Law
2019-09-11 19:05 ` [04/32] [x86] Robustify vzeroupper handling across calls Richard Sandiford
2019-09-25 15:48   ` Richard Sandiford
2019-09-25 18:11     ` Uros Bizjak
2019-10-01 10:14     ` Uros Bizjak
2019-10-08 18:17       ` Uros Bizjak
2019-09-11 19:06 ` [06/32] Pass an ABI to choose_hard_reg_mode Richard Sandiford
2019-09-29 21:00   ` Jeff Law
2019-09-11 19:07 ` [08/32] Remove global call sets: cfgcleanup.c Richard Sandiford
2019-09-29 21:02   ` Jeff Law
2019-09-11 19:07 ` [07/32] Remove global call sets: caller-save.c Richard Sandiford
2019-09-29 21:01   ` Jeff Law
2019-09-11 19:08 ` [10/32] Remove global call sets: combine.c Richard Sandiford
2019-09-12  2:18   ` Segher Boessenkool
2019-09-12  7:52     ` Richard Sandiford
2019-09-20  0:43       ` Segher Boessenkool
2019-09-25 15:52         ` Richard Sandiford
2019-09-25 16:30           ` Segher Boessenkool
2019-09-29 22:32           ` Jeff Law
2019-09-29 22:43             ` Segher Boessenkool
2019-09-11 19:08 ` [09/32] Remove global call sets: cfgloopanal.c Richard Sandiford
2019-09-29 21:02   ` Jeff Law
2019-09-11 19:09 ` [12/32] Remove global call sets: cselib.c Richard Sandiford
2019-09-29 21:05   ` Jeff Law
2019-10-29  9:20     ` Martin Liška
2019-09-11 19:09 ` [11/32] Remove global call sets: cse.c Richard Sandiford
2019-09-25 15:57   ` Richard Sandiford
2019-09-29 21:04     ` Jeff Law
2019-09-30 16:23       ` Richard Sandiford
2019-09-11 19:10 ` [13/32] Remove global call sets: DF (EH edges) Richard Sandiford
2019-09-29 21:07   ` Jeff Law
2019-09-11 19:10 ` [14/32] Remove global call sets: DF (entry/exit defs) Richard Sandiford
2019-09-29 21:07   ` Jeff Law
2019-09-11 19:11 ` [17/32] Remove global call sets: gcse.c Richard Sandiford
2019-09-25 16:04   ` Richard Sandiford
2019-09-29 21:10   ` Jeff Law
2019-09-11 19:11 ` [15/32] Remove global call sets: early-remat.c Richard Sandiford
2019-09-29 21:09   ` Jeff Law
2019-09-11 19:11 ` [16/32] Remove global call sets: function.c Richard Sandiford
2019-09-29 21:10   ` Jeff Law
2019-09-11 19:12 ` [18/32] Remove global call sets: haifa-sched.c Richard Sandiford
2019-09-29 21:11   ` Jeff Law
2019-09-11 19:12 ` [19/32] Remove global call sets: IRA Richard Sandiford
2019-09-30 15:16   ` Jeff Law
2019-09-11 19:13 ` [20/32] Remove global call sets: loop-iv.c Richard Sandiford
2019-09-29 21:20   ` Jeff Law
2019-09-11 19:14 ` [21/32] Remove global call sets: LRA Richard Sandiford
2019-09-30 15:29   ` Jeff Law
2019-10-04 18:03   ` H.J. Lu
2019-10-04 21:52     ` H.J. Lu
2019-10-05 13:33       ` Richard Sandiford
2019-09-11 19:14 ` [23/32] Remove global call sets: postreload-gcse.c Richard Sandiford
2019-09-25 16:08   ` Richard Sandiford
2019-09-29 22:22     ` Jeff Law
2019-09-11 19:14 ` [22/32] Remove global call sets: postreload.c Richard Sandiford
2019-09-29 21:33   ` Jeff Law
2019-09-11 19:15 ` [24/32] Remove global call sets: recog.c Richard Sandiford
2019-09-29 21:33   ` Jeff Law
2019-09-11 19:15 ` [25/32] Remove global call sets: regcprop.c Richard Sandiford
2019-09-29 21:34   ` Jeff Law
2019-09-11 19:16 ` [26/32] Remove global call sets: regrename.c Richard Sandiford
2019-09-29 22:25   ` Jeff Law
2019-09-11 19:16 ` [27/32] Remove global call sets: reload.c Richard Sandiford
2019-09-29 22:26   ` Jeff Law
2019-09-11 19:17 ` [28/32] Remove global call sets: rtlanal.c Richard Sandiford
2019-09-29 22:21   ` Jeff Law
2019-09-11 19:17 ` [29/32] Remove global call sets: sched-deps.c Richard Sandiford
2019-09-29 22:20   ` Jeff Law
2019-10-04 14:32     ` Christophe Lyon
2019-10-04 14:35       ` Richard Sandiford
2019-10-04 14:37         ` Christophe Lyon
2019-10-07 13:29         ` Christophe Lyon
2019-09-11 19:18 ` [31/32] Remove global call sets: shrink-wrap.c Richard Sandiford
2019-09-29 22:21   ` Jeff Law
2019-09-11 19:18 ` [30/32] Remove global call sets: sel-sched.c Richard Sandiford
2019-09-30 15:08   ` Jeff Law
2019-09-11 19:19 ` [32/32] Hide regs_invalidated_by_call etc Richard Sandiford
2019-09-29 22:22   ` Jeff Law
2019-09-12 20:42 ` [00/32] Support multiple ABIs in the same translation unit Steven Bosscher
2019-09-26 19:24 ` Dimitar Dimitrov
2019-09-27  8:58   ` Richard Sandiford
2019-10-01  2:09 ` build-failure for cris-elf with "[00/32] Support multiple ABIs in the same translation unit" Hans-Peter Nilsson
2019-10-01  7:51   ` Richard Sandiford
2019-10-01 10:58     ` Hans-Peter Nilsson
2019-10-06  8:45 [21/32] Remove global call sets: LRA Uros Bizjak
2019-10-06 14:32 ` Richard Sandiford
2019-10-07  6:04   ` Uros Bizjak
