public inbox for gcc-patches@gcc.gnu.org
* PowerPC 'future' patches introduction
@ 2019-08-14 21:36 Michael Meissner
  2019-08-14 21:37 ` [PATCH], Patch #1 of 10, Add instruction format enumeration Michael Meissner
                   ` (12 more replies)
  0 siblings, 13 replies; 26+ messages in thread
From: Michael Meissner @ 2019-08-14 21:36 UTC (permalink / raw)
  To: gcc-patches, segher, dje.gcc, meissner; +Cc: Alan Modra

I will be submitting 10 patches that will add support to GCC for a
possible future PowerPC processor.  These patches add support for new
instructions that extend the offsettable memory instructions (D, DS, or DQ
instruction formats) to have 34 bit offsets (instead of 16, 14, or 12 bits
respectively).  These instructions use reserved encodings for the first 32 bits
and the second 32 bits may either be the traditional instruction that is being
extended or a new encoding.  These new 64-bit instructions are called
'prefixed' instructions.
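
As a rough illustration of the offset ranges involved (a sketch only, not code
from the patches; the helper names fits_signed_field, fits_d_form, and
fits_prefixed are made up for this example), the following C fragment checks
whether an offset fits in a traditional signed 16-bit D-form field or in the
signed 34-bit field of a prefixed instruction:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: does VALUE fit in a signed field of BITS bits?  */
static bool
fits_signed_field (int64_t value, unsigned bits)
{
  assert (bits >= 1 && bits <= 63);
  int64_t limit = (int64_t) 1 << (bits - 1);
  return value >= -limit && value < limit;
}

/* A traditional D-form offset is a signed 16-bit value.  */
static bool
fits_d_form (int64_t offset)
{
  return fits_signed_field (offset, 16);
}

/* A prefixed instruction carries a signed 34-bit offset.  */
static bool
fits_prefixed (int64_t offset)
{
  return fits_signed_field (offset, 34);
}
```

With these helpers, an offset of 32768 is rejected by fits_d_form but accepted
by fits_prefixed, which is the case the prefixed instructions are meant to
cover.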

These new instructions also have a mode that uses the 34 bit offset and adds it
to the current location instead of a base register, giving pc-relative
addressing.  Pc-relative addressing will be supported in the next ABI (3.1) as
an alternative to the current TOC based addressing.

The first patch adds the new insn_form enumeration to describe the instruction
format.  This is similar to the previous patch, except the name is now
insn_form instead of offset_format, and I simplified the set up for the
instruction format, using the existing reg_addr structure.

The second patch adds the basic infrastructure using RTL attributes on the
insns to say whether an instruction is prefixed or not.  I tried to simplify
this over previous versions of the patch, by only having a "prefixed" attribute
instead of both "maybe_prefixed" and "prefixed" attributes.

The third patch adds support for all offsettable memory instructions to use the
new instructions.  After this patch is installed, you would be able to generate
the new pc-relative instructions if you use the -mpcrel option.

The fourth patch adjusts the costs when you use prefixed instructions (prefixed
instructions are larger than traditional instructions, so we need to adjust the
costs based on instruction size).

The fifth patch switches the default when you use -mcpu=future to use
pc-relative instructions instead of using the TOC by default.

The sixth patch adds support for the 'future' machine to the target_clones and
target function attributes, as well as the __builtin_cpu_supports built in
function.

The seventh patch adds a new RTL pass to implement the PCREL_OPT relocations
that will be part of the ISA 3.1 specification.  This optimization allows the
linker to optimize accessing external symbols that are local to the main
program in some cases.

The eighth, ninth, and tenth patches add tests for the 'future' machine to the
testsuite.

After these patches are installed, Alan Modra will have a set of patches to
update the thread local storage (TLS) support for use with pc-relative
addressing.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], Patch #1 of 10, Add instruction format enumeration
  2019-08-14 21:36 PowerPC 'future' patches introduction Michael Meissner
@ 2019-08-14 21:37 ` Michael Meissner
  2019-08-14 22:11 ` [PATCH], Patch #2 of 10, Add RTL prefixed attribute Michael Meissner
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Michael Meissner @ 2019-08-14 21:37 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, David Edelsohn,
	Michael Meissner, Alan Modra

This patch implements the insn_form enumeration that identifies which type of
instruction format is used for a memory instruction.  While the PowerPC has
additional formats, the instruction formats that we need to use are:

	INSN_FORM_D	-- Traditional D-form instructions (16 bits offset);
	INSN_FORM_DS	-- Traditional DS-form instructions (14 bits offset);
	INSN_FORM_DQ	-- Traditional DQ-form instructions (12 bits offset).

In the previous patches, these were called offset_format instead of insn_form.
I changed the way the insn_form values are computed to use the mask bits in the
reg_addr data structure, instead of having a function that set this up in a
confusing manner.

Previous patches did not have the support for external addresses, and these are
added in this new patch.

This patch includes the notion of a default instruction format.  This is used
in the absence of the actual register used.  The default instruction format is
based on the anticipated usage.

For example, in 64-bit mode, a DImode integer's default instruction format is
INSN_FORM_DS because the LD and STD instructions use the DS instruction
encoding.  But if you were loading a DImode value into a traditional FPR,
the LFD and STFD instructions are D format.

Similarly, for SFmode and DFmode, the traditional FPR memory instructions use
the D format, but the traditional Altivec memory instructions use the
DS format.  In this case the default instruction format is D format.

The new prefixed memory and pc-relative lookup functions now take the default
insn_form as an argument.  This is important if the default format is DS format
or DQ format, and the offset has the bottom 2 or 4 bits non-zero.  In this
case, we can do the memory operation using a prefixed load or store instead of
requiring the offset to be loaded into a GPR.
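
The decision described above can be sketched in standalone C.  This is only a
simplified model of the check, not the code in the patch: the enum off_form and
the function toy_offset_needs_prefix are hypothetical names, and the model
ignores modes, registers, and pc-relative addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the insn_form values discussed above.  */
enum off_form { OFF_FORM_D, OFF_FORM_DS, OFF_FORM_DQ };

/* Return true if OFFSET can only be encoded with a prefixed instruction,
   given that the traditional instruction would use format FORM.  */
static bool
toy_offset_needs_prefix (int64_t offset, enum off_form form)
{
  const int64_t limit34 = (int64_t) 1 << 33;
  const int64_t limit16 = (int64_t) 1 << 15;

  /* Offsets outside the signed 34-bit range cannot be encoded even with a
     prefixed instruction, so a prefix does not help.  */
  if (offset < -limit34 || offset >= limit34)
    return false;

  /* Anything that does not fit in 16 bits must be prefixed.  */
  if (offset < -limit16 || offset >= limit16)
    return true;

  switch (form)
    {
    case OFF_FORM_D:
      return false;			/* Any 16-bit offset is fine.  */
    case OFF_FORM_DS:
      return (offset & 3) != 0;		/* Bottom 2 bits must be zero.  */
    case OFF_FORM_DQ:
      return (offset & 15) != 0;	/* Bottom 4 bits must be zero.  */
    }

  assert (!"unexpected instruction format");
  return false;
}
```

For instance, an offset of 6 is acceptable for a D-form instruction, but in
this model it requires a prefixed instruction when the traditional format is
DS or DQ.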

The pc-relative match function (pcrel_addr_p) now optionally returns the base
address and offset to allow print_operand_address and other functions that
would otherwise need to decode the instruction to have the values available
directly.
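
The "optionally returns" interface can be pictured with a small standalone
model.  The types toy_addr and toy_pcrel_info and the function
toy_pcrel_addr_p below are invented for this sketch and operate on a plain
struct rather than on RTL; they only illustrate the pattern of a match
function that fills in an optional result structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified address: a symbol name plus a constant offset.  */
struct toy_addr
{
  const char *symbol;		/* NULL if this is not a symbolic address.  */
  int64_t offset;
  bool is_local;
};

/* Hypothetical counterpart of the broken-out address parts.  */
struct toy_pcrel_info
{
  const char *base;
  int64_t offset;
  bool external_p;
};

/* Return true if ADDR is an acceptable pc-relative address.  If INFO is
   non-NULL, also return the component parts through it.  */
static bool
toy_pcrel_addr_p (const struct toy_addr *addr, bool allow_local,
		  bool allow_external, struct toy_pcrel_info *info)
{
  const int64_t limit34 = (int64_t) 1 << 33;

  assert (addr != NULL);

  /* Always leave INFO in a defined state, even on failure.  */
  if (info)
    {
      info->base = NULL;
      info->offset = 0;
      info->external_p = false;
    }

  if (addr->symbol == NULL)
    return false;

  if (addr->offset < -limit34 || addr->offset >= limit34)
    return false;

  bool external = !addr->is_local;
  if (external ? !allow_external : !allow_local)
    return false;

  if (info)
    {
      info->base = addr->symbol;
      info->offset = addr->offset;
      info->external_p = external;
    }

  return true;
}
```

A caller that only needs a yes/no answer passes a null pointer, while a caller
that prints the address passes a structure and reads the parts back without
decoding the address a second time.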

There is a new function (reg_to_insn_form) that takes a register and an address
and returns the instruction format for that particular memory address.  This is
due to the fact that when offset addressing was added to the PowerPC
traditional Altivec registers, the instruction format used was DS format
instead of D format for the scalar values.  If the register is a pseudo
register, the function returns the default instruction format.  This function
will primarily be used in the next patch to identify whether an insn uses a
prefixed instruction or not.
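
A toy model of this lookup is sketched below.  It is not the patch code: the
names are hypothetical, and the register numbering (0-31 GPRs, 32-63 FPRs,
64-95 Altivec, anything higher a pseudo register) is an assumption made only
so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>

enum reg_form { REG_FORM_D, REG_FORM_DS };
enum toy_class { TOY_GPR, TOY_FPR, TOY_VMX, TOY_NUM_CLASSES };

/* Assumed register numbering for this sketch only.  */
enum { TOY_FIRST_FPR = 32, TOY_FIRST_VMX = 64, TOY_FIRST_PSEUDO = 96 };

/* Per-mode table: a default format plus one format per register class.  */
struct toy_mode_info
{
  enum reg_form default_form;
  enum reg_form form[TOY_NUM_CLASSES];
};

/* Return the instruction format for register REGNO in the mode described by
   INFO.  Pseudo registers fall back to the default format.  */
static enum reg_form
toy_reg_to_form (unsigned regno, const struct toy_mode_info *info)
{
  assert (info != NULL);

  if (regno >= TOY_FIRST_PSEUDO)
    return info->default_form;

  if (regno < TOY_FIRST_FPR)
    return info->form[TOY_GPR];

  if (regno < TOY_FIRST_VMX)
    return info->form[TOY_FPR];

  return info->form[TOY_VMX];
}
```

For a DFmode-like entry in 64-bit mode, the table would hold D format as the
default and for FPRs, and DS format for GPRs and Altivec registers, so the
answer changes once the register allocator has picked a hard register.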

2019-08-14   Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (pcrel_address): Rewrite to use
	pcrel_addr_p.
	(pcrel_external_address): Rewrite to use pcrel_addr_p.
	(prefixed_mem_operand): Rewrite to use prefixed_local_addr_p.
	(pcrel_external_mem_operand): Rewrite to use pcrel_addr_p.
	* config/rs6000/rs6000-protos.h (reg_to_insn_form): New
	declaration.
	(pcrel_info_type): New declaration.
	(PCREL_NULL): New macro.
	(pcrel_addr_p): New declaration.
	(rs6000_prefixed_address_mode_p): Delete.
	* config/rs6000/rs6000.c (struct rs6000_reg_addr): Add fields for
	instruction format and prefixed memory support.
	(rs6000_debug_insn_form): New debug function.
	(rs6000_debug_print_mode): Print instruction formats.
	(setup_insn_form): New function.
	(rs6000_init_hard_regno_mode_ok): Call setup_insn_form.
	(print_operand_address): Call pcrel_addr_p instead of
	pcrel_address.  Add support for external pc-relative labels.
	(mode_supports_prefixed_address_p): Delete.
	(rs6000_prefixed_address_mode_p): Delete, replace with
	prefixed_local_addr_p.
	(prefixed_local_addr_p): Replace rs6000_prefixed_address_mode_p.
	Add argument to specify the instruction format.
	(pcrel_addr_p): New function.
	(reg_to_insn_form): New function.
	* config/rs6000/rs6000.md (enum insn_form): New enumeration.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 274173)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -1626,32 +1626,11 @@
   return GET_CODE (op) == UNSPEC && XINT (op, 1) == UNSPEC_TOCREL;
 })
 
-;; Return true if the operand is a pc-relative address.
+;; Return true if the operand is a pc-relative address to a local symbol.
 (define_predicate "pcrel_address"
   (match_code "label_ref,symbol_ref,const")
 {
-  if (!rs6000_pcrel_p (cfun))
-    return false;
-
-  if (GET_CODE (op) == CONST)
-    op = XEXP (op, 0);
-
-  /* Validate offset.  */
-  if (GET_CODE (op) == PLUS)
-    {
-      rtx op0 = XEXP (op, 0);
-      rtx op1 = XEXP (op, 1);
-
-      if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
-	return false;
-
-      op = op0;
-    }
-
-  if (LABEL_REF_P (op))
-    return true;
-
-  return (SYMBOL_REF_P (op) && SYMBOL_REF_LOCAL_P (op));
+  return pcrel_addr_p (op, true, false, PCREL_NULL);
 })
 
 ;; Return true if the operand is an external symbol whose address can be loaded
@@ -1665,32 +1644,14 @@
 (define_predicate "pcrel_external_address"
   (match_code "symbol_ref,const")
 {
-  if (!rs6000_pcrel_p (cfun))
-    return false;
-
-  if (GET_CODE (op) == CONST)
-    op = XEXP (op, 0);
-
-  /* Validate offset.  */
-  if (GET_CODE (op) == PLUS)
-    {
-      rtx op0 = XEXP (op, 0);
-      rtx op1 = XEXP (op, 1);
-
-      if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
-	return false;
-
-      op = op0;
-    }
-
-  return (SYMBOL_REF_P (op) && !SYMBOL_REF_LOCAL_P (op));
+  return pcrel_addr_p (op, false, true, PCREL_NULL);
 })
 
 ;; Return 1 if op is a prefixed memory operand.
 (define_predicate "prefixed_mem_operand"
   (match_code "mem")
 {
-  return rs6000_prefixed_address_mode_p (XEXP (op, 0), GET_MODE (op));
+  return prefixed_local_addr_p (XEXP (op, 0), mode, INSN_FORM_UNKNOWN);
 })
 
 ;; Return 1 if op is a memory operand to an external variable when we
@@ -1699,7 +1660,7 @@
 (define_predicate "pcrel_external_mem_operand"
   (match_code "mem")
 {
-  return pcrel_external_address (XEXP (op, 0), Pmode);
+  return pcrel_addr_p (XEXP (op, 0), false, true, PCREL_NULL);
 })
 
 ;; Match the first insn (addis) in fusing the combination of addis and loads to
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 274173)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -47,7 +47,19 @@ extern bool legitimate_indirect_address_p (rtx, in
 extern bool legitimate_indexed_address_p (rtx, int);
 extern bool avoiding_indexed_address_p (machine_mode);
 extern rtx rs6000_force_indexed_or_indirect_mem (rtx x);
+extern enum insn_form reg_to_insn_form (rtx, machine_mode);
+extern bool prefixed_local_addr_p (rtx, machine_mode, enum insn_form);
 
+/* Pc-relative address broken into component parts by pcrel_addr_p.  */
+typedef struct {
+  rtx base_addr;		/* SYMBOL_REF or LABEL_REF.  */
+  HOST_WIDE_INT offset;		/* Offset from the base address.  */
+  bool external_p;		/* Is the symbol external?  */
+} pcrel_info_type;
+
+#define PCREL_NULL ((pcrel_info_type *)0)
+
+extern bool pcrel_addr_p (rtx, bool, bool, pcrel_info_type *);
 extern rtx rs6000_got_register (rtx);
 extern rtx find_addr_reg (rtx);
 extern rtx gen_easy_altivec_constant (rtx);
@@ -154,7 +166,6 @@ extern align_flags rs6000_loop_align (rtx);
 extern void rs6000_split_logical (rtx [], enum rtx_code, bool, bool, bool);
 extern bool rs6000_pcrel_p (struct function *);
 extern bool rs6000_fndecl_pcrel_p (const_tree);
-extern bool rs6000_prefixed_address_mode_p (rtx, machine_mode);
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 274173)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -369,8 +369,11 @@ struct rs6000_reg_addr {
   enum insn_code reload_fpr_gpr;	/* INSN to move from FPR to GPR.  */
   enum insn_code reload_gpr_vsx;	/* INSN to move from GPR to VSX.  */
   enum insn_code reload_vsx_gpr;	/* INSN to move from VSX to GPR.  */
+  enum insn_form default_insn_form;	/* Default format for offsets.  */
+  enum insn_form insn_form[(int)N_RELOAD_REG]; /* Register insn format.  */
   addr_mask_type addr_mask[(int)N_RELOAD_REG]; /* Valid address masks.  */
   bool scalar_in_vmx_p;			/* Scalar value can go in VMX.  */
+  bool prefixed_memory_p;		/* We can use prefixed memory.  */
 };
 
 static struct rs6000_reg_addr reg_addr[NUM_MACHINE_MODES];
@@ -2053,6 +2056,28 @@ rs6000_debug_vector_unit (enum rs6000_vector v)
   return ret;
 }
 
+/* Return a character that can be printed out to describe an instruction
+   format.  */
+
+DEBUG_FUNCTION char
+rs6000_debug_insn_form (enum insn_form iform)
+{
+  char ret;
+
+  switch (iform)
+    {
+    case INSN_FORM_UNKNOWN:  ret = '-'; break;
+    case INSN_FORM_D:        ret = 'd'; break;
+    case INSN_FORM_DS:       ret = 's'; break;
+    case INSN_FORM_DQ:       ret = 'q'; break;
+    case INSN_FORM_X:        ret = 'x'; break;
+    case INSN_FORM_PREFIXED: ret = 'p'; break;
+    default:                 ret = '?'; break;
+    }
+
+  return ret;
+}
+
 /* Inner function printing just the address mask for a particular reload
    register class.  */
 DEBUG_FUNCTION char *
@@ -2115,6 +2140,12 @@ rs6000_debug_print_mode (ssize_t m)
     fprintf (stderr, " %s: %s", reload_reg_map[rc].name,
 	     rs6000_debug_addr_mask (reg_addr[m].addr_mask[rc], true));
 
+  fprintf (stderr, "  Format: %c:%c%c%c",
+          rs6000_debug_insn_form (reg_addr[m].default_insn_form),
+          rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_GPR]),
+          rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_FPR]),
+          rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_VMX]));
+
   if ((reg_addr[m].reload_store != CODE_FOR_nothing)
       || (reg_addr[m].reload_load != CODE_FOR_nothing))
     {
@@ -2668,6 +2699,153 @@ rs6000_setup_reg_addr_masks (void)
     }
 }
 
+/* Set up the instruction format for each mode and register type from the
+   addr_mask.  */
+
+static void
+setup_insn_form (void)
+{
+  for (ssize_t m = 0; m < NUM_MACHINE_MODES; ++m)
+    {
+      machine_mode scalar_mode = (machine_mode) m;
+
+      /* Convert complex and IBM double double/_Decimal128 into their scalar
+	 parts that the registers will be split into for doing load or
+	 store.  */
+      if (COMPLEX_MODE_P (scalar_mode))
+	scalar_mode = GET_MODE_INNER (scalar_mode);
+
+      if (FLOAT128_2REG_P (scalar_mode))
+	scalar_mode = DFmode;
+
+      for (ssize_t rc = FIRST_RELOAD_REG_CLASS; rc <= LAST_RELOAD_REG_CLASS; rc++)
+	{
+	  machine_mode single_reg_mode = scalar_mode;
+	  size_t msize = GET_MODE_SIZE (scalar_mode);
+	  addr_mask_type addr_mask = reg_addr[scalar_mode].addr_mask[rc];
+	  enum insn_form iform = INSN_FORM_UNKNOWN;
+
+	  /* Is the mode permitted in the GPR/FPR/Altivec registers?  */
+	  if ((addr_mask & RELOAD_REG_VALID) != 0)
+	    {
+	      /* The addr_mask does not have the offsettable or indexed bits
+		 set for modes that are split into multiple registers (like
+		 IFmode).  It doesn't need this set, since typically by time it
+		 is used in secondary reload, the modes are split into
+		 component parts.
+
+		 The instruction format however can be used earlier in the
+		 compilation, so we need to setup what kind of instruction can
+		 be generated for the modes that are split.  */
+	      if ((addr_mask & (RELOAD_REG_MULTIPLE
+				| RELOAD_REG_OFFSET
+				| RELOAD_REG_INDEXED)) == RELOAD_REG_MULTIPLE)
+		{
+		  /* Multiple register types in GPRs depend on whether we can
+		     use DImode in a single register or SImode.  */
+		  if (rc == RELOAD_REG_GPR)
+		    {
+		      if (TARGET_POWERPC64)
+			{
+			  gcc_assert ((msize % 8) == 0);
+			  single_reg_mode = DImode;
+			}
+
+		      else
+			{
+			  gcc_assert ((msize % 4) == 0);
+			  single_reg_mode = SImode;
+			}
+		    }
+
+		  /* Multiple VSX vector sized data items will use a single
+		     vector type as an instruction format.  */
+		  else if (TARGET_VSX)
+		    {
+		      gcc_assert ((rc == RELOAD_REG_FPR)
+				  || (rc == RELOAD_REG_VMX));
+
+		      if ((msize % 16) == 0)
+			single_reg_mode = V2DImode;
+		    }
+
+		  /* Multiple Altivec vector sized data items will use a single
+		     vector type as an instruction format.  */
+		  else if (TARGET_ALTIVEC && rc == RELOAD_REG_VMX
+			   && (msize % 16) == 0
+			   && VECTOR_MODE_P (single_reg_mode))
+		    single_reg_mode = V4SImode;
+
+		  /* If we only have the traditional floating point unit, use
+		     DFmode as the base type.  */
+		  else if (!TARGET_VSX && TARGET_HARD_FLOAT
+			   && rc == RELOAD_REG_FPR && (msize % 8) == 0)
+		    single_reg_mode = DFmode;
+
+		  /* Get the information for the register mode used after
+		     splitting.  */
+		  addr_mask = reg_addr[single_reg_mode].addr_mask[rc];
+		  msize = GET_MODE_SIZE (single_reg_mode);
+		}
+
+	      /* Figure out the instruction format of each mode.
+
+		 For offsettable addresses that aren't specifically quad mode,
+		 see if the default form is D or DS.  GPR 64-bit offsettable
+		 addresses are DS format.  Likewise, all Altivec offsettable
+		 addresses are DS format.  */
+	      if ((addr_mask & RELOAD_REG_OFFSET) != 0)
+		{
+		  if ((addr_mask & RELOAD_REG_QUAD_OFFSET) != 0)
+		    iform = INSN_FORM_DQ;
+
+		  else if (rc == RELOAD_REG_VMX
+			   || (rc == RELOAD_REG_GPR && TARGET_POWERPC64
+			       && (msize >= 8)))
+		    iform = INSN_FORM_DS;
+
+		  else
+		    iform = INSN_FORM_D;
+		}
+
+	      else if ((addr_mask & RELOAD_REG_INDEXED) != 0)
+		iform = INSN_FORM_X;
+	    }
+
+	  reg_addr[m].insn_form[rc] = iform;
+	}
+
+      /* Figure out the default insn format that is used for offsettable memory
+	 instructions.  For scalar floating point use the FPR addressing, for
+	 vectors and IEEE 128-bit use a suitable vector register type, and
+	 otherwise use GPRs.  */
+      ssize_t def_rc;
+      if (TARGET_VSX
+	  && (VECTOR_MODE_P (scalar_mode) || FLOAT128_IEEE_P (scalar_mode)))
+	{
+	  if ((reg_addr[m].addr_mask[RELOAD_REG_FPR] & RELOAD_REG_VALID) != 0)
+	    def_rc = RELOAD_REG_FPR;
+	  else
+	    def_rc = RELOAD_REG_VMX;
+	}
+
+      else if (TARGET_ALTIVEC && !TARGET_VSX && VECTOR_MODE_P (scalar_mode))
+	def_rc = RELOAD_REG_VMX;
+
+      else if (TARGET_HARD_FLOAT && SCALAR_FLOAT_MODE_P (scalar_mode))
+	def_rc = RELOAD_REG_FPR;
+
+      else
+	def_rc = RELOAD_REG_GPR;
+
+      reg_addr[m].default_insn_form = reg_addr[m].insn_form[def_rc];
+
+      /* Don't enable prefixed memory support until all of the infrastructure
+	 changes are in.  */
+      reg_addr[m].prefixed_memory_p = false;
+    }
+}
+
 \f
 /* Initialize the various global tables that are based on register size.  */
 static void
@@ -3181,6 +3359,9 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p
      use.  */
   rs6000_setup_reg_addr_masks ();
 
+  /* Update the instruction formats.  */
+  setup_insn_form ();
+
   if (global_init_p || TARGET_DEBUG_TARGET)
     {
       if (TARGET_DEBUG_REG)
@@ -13070,30 +13251,22 @@ print_operand (FILE *file, rtx x, int code)
 void
 print_operand_address (FILE *file, rtx x)
 {
+  pcrel_info_type pcrel_info;
+
   if (REG_P (x))
     fprintf (file, "0(%s)", reg_names[ REGNO (x) ]);
 
   /* Is it a pc-relative address?  */
-  else if (pcrel_address (x, Pmode))
+  else if (pcrel_addr_p (x, true, true, &pcrel_info))
     {
-      HOST_WIDE_INT offset;
+      output_addr_const (file, pcrel_info.base_addr);
 
-      if (GET_CODE (x) == CONST)
-	x = XEXP (x, 0);
+      if (pcrel_info.offset)
+	fprintf (file, "%+" PRId64, pcrel_info.offset);
 
-      if (GET_CODE (x) == PLUS)
-	{
-	  offset = INTVAL (XEXP (x, 1));
-	  x = XEXP (x, 0);
-	}
-      else
-	offset = 0;
+      if (pcrel_info.external_p)
+	fputs ("@got", file);
 
-      output_addr_const (file, x);
-
-      if (offset)
-	fprintf (file, "%+" PRId64, offset);
-
       fputs ("@pcrel", file);
     }
   else if (SYMBOL_REF_P (x) || GET_CODE (x) == CONST
@@ -13579,70 +13752,204 @@ rs6000_pltseq_template (rtx *operands, int which)
   return str;
 }
 #endif
+\f
+/* Return true if the address ADDR is a prefixed address either with a large
+   offset, an offset that does not fit in the normal instruction form, or a
+   pc-relative address to a local symbol.
 
-/* Helper function to return whether a MODE can do prefixed loads/stores.
-   VOIDmode is used when we are loading the pc-relative address into a base
-   register, but we are not using it as part of a memory operation.  As modes
-   add support for prefixed memory, they will be added here.  */
+   MODE is the mode of the memory.
 
-static bool
-mode_supports_prefixed_address_p (machine_mode mode)
-{
-  return mode == VOIDmode;
-}
+   IFORM is used to determine if the traditional address is either DS format or
+   DQ format and the bottom bits of the offset are non-zero.  */
 
-/* Function to return true if ADDR is a valid prefixed memory address that uses
-   mode MODE.  */
-
 bool
-rs6000_prefixed_address_mode_p (rtx addr, machine_mode mode)
+prefixed_local_addr_p (rtx addr, machine_mode mode, enum insn_form iform)
 {
-  if (!TARGET_PREFIXED_ADDR || !mode_supports_prefixed_address_p (mode))
+  if (!reg_addr[mode].prefixed_memory_p)
     return false;
 
-  /* Check for PC-relative addresses.  */
-  if (pcrel_address (addr, Pmode))
-    return true;
+  if (GET_CODE (addr) == CONST)
+    addr = XEXP (addr, 0);
 
-  /* Check for prefixed memory addresses that have a large numeric offset,
-     or an offset that can't be used for a DS/DQ-form memory operation.  */
+  /* Single register, not prefixed.  */
+  if (REG_P (addr) || SUBREG_P (addr))
+    return false;
+
+  /* Register + offset.  */
+  else if (GET_CODE (addr) == PLUS
+	   && (REG_P (XEXP (addr, 0)) || SUBREG_P (XEXP (addr, 0)))
+	   && CONST_INT_P (XEXP (addr, 1)))
+    {
+      HOST_WIDE_INT offset = INTVAL (XEXP (addr, 1));
+
+      /* Prefixed instructions can only access 34-bits.  Fail if the value
+	 is larger than that.  */
+      if (!SIGNED_34BIT_OFFSET_P (offset))
+	return false;
+
+      /* For small offsets see whether it might be a DS or DQ instruction where
+	 the bottom bits non-zero.  This would require using a prefixed
+	 address.  If the offset is larger than 16 bits, then the instruction
+	 must be prefixed.  */
+      if (SIGNED_16BIT_OFFSET_P (offset))
+	{
+	  /* Use default if we don't know the precise instruction format.  */
+	  if (iform == INSN_FORM_UNKNOWN)
+	    iform = reg_addr[mode].default_insn_form;
+
+	  if (iform == INSN_FORM_DS)
+	    return (offset & 3) != 0;
+
+	  else if (iform == INSN_FORM_DQ)
+	    return (offset & 15) != 0;
+
+	  else if (iform != INSN_FORM_PREFIXED)
+	    return false;
+	}
+
+      return true;
+    }
+
+  else if (!TARGET_PCREL)
+    return false;
+
   if (GET_CODE (addr) == PLUS)
     {
-      rtx op0 = XEXP (addr, 0);
       rtx op1 = XEXP (addr, 1);
 
-      if (!base_reg_operand (op0, Pmode) || !CONST_INT_P (op1))
+      if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
 	return false;
 
-      HOST_WIDE_INT value = INTVAL (op1);
-      if (!SIGNED_34BIT_OFFSET_P (value))
+      addr = XEXP (addr, 0);
+    }
+
+  /* Local pc-relative symbols/labels.  */
+  return (LABEL_REF_P (addr)
+	  || (SYMBOL_REF_P (addr) && SYMBOL_REF_LOCAL_P (addr)));
+}
+
+/* Return true if the address ADDR is a prefixed address that is a pc-relative
+   reference either to a local symbol or to an external symbol.  We break apart
+   the address and return the parts.  LOCAL_SYMBOL_P and EXTERNAL_SYMBOL_P says
+   whether local and external pc-relative symbols are allowed.  P_INFO points
+   to a structure that returns the broken out component parts if desired.  */
+
+bool
+pcrel_addr_p (rtx addr,
+	      bool local_symbol_p,
+	      bool external_symbol_p,
+	      pcrel_info_type *p_info)
+{
+  rtx base_addr = NULL_RTX;
+  HOST_WIDE_INT offset = 0;
+  bool was_external_p = false;
+
+  if (p_info)
+    {
+      p_info->base_addr = NULL_RTX;
+      p_info->offset = 0;
+      p_info->external_p = false;
+    }
+
+  if (!TARGET_PCREL)
+    return false;
+
+  if (GET_CODE (addr) == CONST)
+    addr = XEXP (addr, 0);
+
+  /* Pc-relative symbols/labels without offsets.  */
+  if (SYMBOL_REF_P (addr))
+    {
+      base_addr = addr;
+      was_external_p = !SYMBOL_REF_LOCAL_P (addr);
+    }
+
+  else if (LABEL_REF_P (addr))
+    base_addr = addr;
+
+  /* Pc-relative symbols with offsets.  */
+  else if (GET_CODE (addr) == PLUS
+	   && SYMBOL_REF_P (XEXP (addr, 0))
+	   && CONST_INT_P (XEXP (addr, 1)))
+    {
+      base_addr = XEXP (addr, 0);
+      offset = INTVAL (XEXP (addr, 1));
+      was_external_p = !SYMBOL_REF_LOCAL_P (base_addr);
+
+      if (!SIGNED_34BIT_OFFSET_P (offset))
 	return false;
+    }
 
-      /* Offset larger than 16-bits?  */
-      if (!SIGNED_16BIT_OFFSET_P (value))
-	return true;
+  else
+    return false;
 
-      /* DQ instruction (bottom 4 bits must be 0) for vectors.  */
-      HOST_WIDE_INT mask;
-      if (GET_MODE_SIZE (mode) >= 16)
-	mask = 15;
+  if (was_external_p && !external_symbol_p)
+    return false;
 
-      /* DS instruction (bottom 2 bits must be 0).  For 32-bit integers, we
-	 need to use DS instructions if we are sign-extending the value with
-	 LWA.  For 32-bit floating point, we need DS instructions to load and
-	 store values to the traditional Altivec registers.  */
-      else if (GET_MODE_SIZE (mode) >= 4)
-	mask = 3;
+  if (!was_external_p && !local_symbol_p)
+    return false;
 
-      /* QImode/HImode has no restrictions.  */
+  if (p_info)
+    {
+      p_info->base_addr = base_addr;
+      p_info->offset = offset;
+      p_info->external_p = was_external_p;
+    }
+
+  return true;
+}
+
+/* Given a register and a mode, return the instruction format for that
+   register.  If the register is a pseudo register, use the default format.
+   Otherwise if it is hard register, look to see exactly what type of
+   addressing is used.  */
+
+enum insn_form
+reg_to_insn_form (rtx reg, machine_mode mode)
+{
+  enum insn_form iform;
+
+  /* Handle UNSPECs, such as the special UNSPEC_SF_FROM_SI and
+     UNSPEC_SI_FROM_SF UNSPECs, which are used to hide SF/SI interactions.
+     Look at the first argument, and if it is a register, use that.  */
+  if (GET_CODE (reg) == UNSPEC || GET_CODE (reg) == UNSPEC_VOLATILE)
+    {
+      rtx op0 = XVECEXP (reg, 0, 0);
+      if (REG_P (op0) || SUBREG_P (op0))
+	reg = op0;
+    }
+
+  /* If it isn't a register, use the defaults.  */
+  if (!REG_P (reg) && !SUBREG_P (reg))
+    iform = reg_addr[mode].default_insn_form;
+
+  else
+    {
+      unsigned int r = reg_or_subregno (reg);
+
+      /* If we have a pseudo, use the default instruction format.  */
+      if (r >= FIRST_PSEUDO_REGISTER)
+	iform = reg_addr[mode].default_insn_form;
+
+      /* If we have a hard register, use the address format of that hard
+	 register.  */
       else
-	return true;
+	{
+	  if (IN_RANGE (r, FIRST_GPR_REGNO, LAST_GPR_REGNO))
+	    iform = reg_addr[mode].insn_form[RELOAD_REG_GPR];
 
-      /* Return true if we must use a prefixed instruction.  */
-      return (value & mask) != 0;
+	  else if (IN_RANGE (r, FIRST_FPR_REGNO, LAST_FPR_REGNO))
+	    iform = reg_addr[mode].insn_form[RELOAD_REG_FPR];
+
+	  else if (IN_RANGE (r, FIRST_ALTIVEC_REGNO, LAST_ALTIVEC_REGNO))
+	    iform = reg_addr[mode].insn_form[RELOAD_REG_VMX];
+
+	  else
+	    gcc_unreachable ();
+	}
     }
 
-  return false;
+  return iform;
 }
 \f
 #if defined (HAVE_GAS_HIDDEN) && !TARGET_MACHO
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 274173)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -252,6 +252,23 @@
 ;; Is copying of this instruction disallowed?
 (define_attr "cannot_copy" "no,yes" (const_string "no"))
 
+;; Enumeration of the PowerPC instruction formats.  We only list the
+;; instruction formats that are used by the code, and not every possible
+;; instruction format that the machine supports.
+
+;; The main use for this enumeration is to determine if a particular
+;; offsettable instruction has a valid offset field for a traditional
+;; instruction, or whether a prefixed instruction might be needed to hold the
+;; offset.  For DS/DQ format instructions, if we have an offset that has the
+;; bottom bits non-zero, we can use a prefixed instruction instead of pushing
+;; the offset to an index register.
+(define_enum "insn_form" [unknown	; Unknown format
+			  d		; Offset addressing uses 16 bits
+			  ds		; Offset addressing uses 14 bits
+			  dq		; Offset addressing uses 12 bits
+			  x		; Indexed addressing
+			  prefixed])	; Prefixed instruction
+
 ;; Length of the instruction (in bytes).
 (define_attr "length" "" (const_int 4))
 

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH], Patch #2 of 10, Add RTL prefixed attribute
  2019-08-14 21:36 PowerPC 'future' patches introduction Michael Meissner
  2019-08-14 21:37 ` [PATCH], Patch #1 of 10, Add instruction format enumeration Michael Meissner
@ 2019-08-14 22:11 ` Michael Meissner
  2019-08-19 19:15   ` Segher Boessenkool
  2019-08-14 22:12 ` [PATCH], Patch #3 of 10, Add prefixed addressing support Michael Meissner
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 26+ messages in thread
From: Michael Meissner @ 2019-08-14 22:11 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, David Edelsohn,
	Michael Meissner, Alan Modra

This patch adds the RTL attribute "prefixed" that says this particular
instruction is a prefixed instruction.

The target hooks FINAL_PRESCAN_INSN and ASM_OUTPUT_OPCODE are defined.  If the
insn is prefixed, ASM_OUTPUT_OPCODE will emit a leading 'p' before the
instruction is emitted.  For example, a load word and zero extend instruction
would have the output template:

	lwz%U1%X1 %0,%1

If the insn is prefixed, ASM_OUTPUT_OPCODE will emit the leading 'p' and the
assembler would see something like:

	plwz 3,foo@pcrel
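
The effect on the output can be mimicked with a few lines of standalone C.
The function toy_format_opcode is a made-up name for this sketch; the real
hooks work on the assembler output stream and the current insn rather than on
a string buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Write OPCODE into BUF, preceded by 'p' if the insn is prefixed.  Returns
   the number of characters that the full string needs, as snprintf does.  */
static int
toy_format_opcode (char *buf, size_t size, const char *opcode,
		   bool prefixed_p)
{
  assert (buf != NULL && opcode != NULL);
  return snprintf (buf, size, "%s%s", prefixed_p ? "p" : "", opcode);
}
```

Calling it with "lwz 3,foo@pcrel" and prefixed_p set produces the
"plwz 3,foo@pcrel" form shown above, and the same template is emitted
unchanged when the flag is clear.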

The RTL length attribute looks at the RTL prefixed attribute to set the default
length to either 4 or 12.

In order to simplify setting the length for complex insns that aren't yet
split, I have added two new RTL attributes ("non_prefixed_length" and
"prefixed_length") that the length attribute uses.  Normally these values would
be 4 and 12 bytes, unless this is overridden by the insn attributes.
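
In C terms, the selection amounts to something like the sketch below.  The
names are hypothetical and the real computation is written as RTL attribute
expressions in rs6000.md, not as a C function.

```c
#include <assert.h>
#include <stdbool.h>

/* Default values mentioned above, in bytes.  */
enum { TOY_NON_PREFIXED_LENGTH = 4, TOY_PREFIXED_LENGTH = 12 };

/* Pick the insn length from the prefixed flag and the two per-insn length
   values, which an individual insn may override.  */
static unsigned
toy_insn_length (bool prefixed_p, unsigned non_prefixed_length,
		 unsigned prefixed_length)
{
  assert (non_prefixed_length > 0 && prefixed_length > 0);
  return prefixed_p ? prefixed_length : non_prefixed_length;
}
```

An insn that expands to more than one instruction would pass larger values
for both lengths, and the prefixed flag still selects between them.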

In previous versions of the patch, I had a maybe_prefixed attribute that was
used to say this instruction might be prefixed, and to check whether the
instruction was prefixed externally.  Now, I use the type RTL attribute, and I
only look if the type is one of the load types, one of the store types, or one
of the integer and add types.

Due to some of the existing load and store insns not using the traditional
operands[0] and operands[1], the functions that test whether an insn is
prefixed only use the insn and not the operands directly.

Most of the new code is in a new file (rs6000-prefixed.c).

2019-08-14   Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000-prefixed.c: New file.
	* config/rs6000/rs6000-protos.h (rs6000_final_prescan_insn):
	Update calling signature.
	(prefixed_load_p): New function.
	(prefixed_store_p): New function.
	(prefixed_paddi_p): New function.
	* config/rs6000/rs6000.c (rs6000_emit_move): Add support for
	loading up pc-relative addresses.
	* config/rs6000/rs6000.h (FINAL_PRESCAN_INSN): New target hook.
	(ASM_OUTPUT_OPCODE): New target hook.
	* config/rs6000/rs6000.md (prefixed attribute): New attribute.
	(prefixed_length attribute): New attribute.
	(non_prefixed_length attribute): New attribute.
	(length attribute): Calculate length in terms of the prefixed,
	prefixed_length, and non_prefixed_length attributes.
	(pcrel_addr): New insn for pc-relative support.
	(pcrel_ext_addr): New insn for pc-relative support.
	* config/rs6000/t-rs6000 (rs6000-prefixed.o): Add build rule.
	* config.gcc (powerpc*-*-*): Add rs6000-prefixed.c.
	(rs6000*-*-*): Add rs6000-prefixed.c.

Index: gcc/config/rs6000/rs6000-prefixed.c
===================================================================
--- gcc/config/rs6000/rs6000-prefixed.c	(revision 0)
+++ gcc/config/rs6000/rs6000-prefixed.c	(working copy)
@@ -0,0 +1,188 @@
+/* Subroutines used to support prefixed addressing on the PowerPC.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "tree.h"
+#include "memmodel.h"
+#include "df.h"
+#include "tm_p.h"
+#include "ira.h"
+#include "print-tree.h"
+#include "varasm.h"
+#include "explow.h"
+#include "expr.h"
+#include "output.h"
+#include "tree-pass.h"
+#include "rtx-vector-builder.h"
+#include "print-rtl.h"
+#include "insn-attr.h"
+#include "insn-config.h"
+#include "recog.h"
+#include "tm-constrs.h"
+
+/* Whether the next instruction needs a 'p' prefix issued before the
+   instruction is printed out.  */
+static bool next_insn_prefixed_p;
+
+/* Define FINAL_PRESCAN_INSN if some processing needs to be done before
+   outputting the assembler code.  On the PowerPC, we remember if the current
+   insn is a prefixed insn where we need to emit a 'p' before the insn.  */
+void
+rs6000_final_prescan_insn (rtx_insn *insn)
+{
+  next_insn_prefixed_p = (get_attr_prefixed (insn) != PREFIXED_NO);
+  return;
+}
+
+/* Define ASM_OUTPUT_OPCODE to do anything special before emitting an opcode.
+   We use it to emit a 'p' for prefixed insns that is set in
+   FINAL_PRESCAN_INSN.  We also use it for PCREL_OPT to emit the relocation
+   that ties the load of the GOT pointer with the load/store that uses the GOT
+   number.  */
+void
+rs6000_asm_output_opcode (FILE *stream, const char *)
+{
+  if (next_insn_prefixed_p)
+    {
+      next_insn_prefixed_p = false;
+      fprintf (stream, "p");
+    }
+
+  return;
+}
+
+\f
+/* Whether a load instruction is a prefixed instruction.  This is called from
+   the prefixed attribute processing.  We can't use operands[0] and
+   operands[1], because there are several load insns that don't use the
+   standard destination and source operands (mov<mode>_update1, etc.).  */
+
+bool
+prefixed_load_p (rtx_insn *insn)
+{
+  rtx set = single_set (insn);
+  if (!set)
+    return false;
+
+  rtx reg = SET_DEST (set);
+  rtx mem = SET_SRC (set);
+  bool sign_p = false;
+
+  /* Allow sign/zero/float extend as part of the load.  */
+  if (GET_CODE (mem) == SIGN_EXTEND)
+    {
+      sign_p = true;
+      mem = XEXP (mem, 0);
+    }
+
+  else if (GET_CODE (mem) == ZERO_EXTEND || GET_CODE (mem) == FLOAT_EXTEND)
+    mem = XEXP (mem, 0);
+
+  /* Is this a load?  */
+  if (!MEM_P (mem))
+    return false;
+
+  machine_mode mode = GET_MODE (mem);
+  rtx addr = XEXP (mem, 0);
+
+  /* Special case LWA, which uses the DS instruction format, instead of the D
+     instruction format.  */
+  enum insn_form iform = (sign_p && mode == SImode && GET_CODE (addr) == PLUS
+			  ? INSN_FORM_DS
+			  : reg_to_insn_form (reg, mode));
+
+  return prefixed_local_addr_p (addr, mode, iform);
+}
+
+/* Whether a store instruction is a prefixed instruction.  This is called from
+   the prefixed attribute processing.  */
+
+bool
+prefixed_store_p (rtx_insn *insn)
+{
+  rtx set = single_set (insn);
+  if (!set)
+    return false;
+
+  rtx mem = SET_DEST (set);
+  rtx reg = SET_SRC (set);
+
+  /* Is this a store?  */
+  if (!MEM_P (mem))
+    return false;
+
+  machine_mode mode = GET_MODE (mem);
+  enum insn_form iform = reg_to_insn_form (reg, mode);
+
+  return prefixed_local_addr_p (XEXP (mem, 0), mode, iform);
+}
+
+/* Whether a load immediate or add instruction is a prefixed instruction.  This
+   is called from the prefixed attribute processing.  */
+
+bool
+prefixed_paddi_p (rtx_insn *insn)
+{
+  rtx set = single_set (insn);
+  if (!set)
+    return false;
+
+  rtx dest = SET_DEST (set);
+  rtx src = SET_SRC (set);
+
+  if (!REG_P (dest) && !SUBREG_P (dest))
+    return false;
+
+  /* Is this a load immediate that can't be done with a simple ADDI or
+     ADDIS?  */
+  if (CONST_INT_P (src))
+    return (satisfies_constraint_eI (src)
+	    && !satisfies_constraint_I (src)
+	    && !satisfies_constraint_L (src));
+
+  /* Is this a PADDI instruction that can't be done with a simple ADDI or
+     ADDIS?  */
+  if (GET_CODE (src) == PLUS)
+    {
+      rtx op1 = XEXP (src, 1);
+
+      return (CONST_INT_P (op1)
+	      && satisfies_constraint_eI (op1)
+	      && !satisfies_constraint_I (op1)
+	      && !satisfies_constraint_L (op1));
+    }
+
+  /* If not, is it a load of a pc-relative address?  */
+  if (!TARGET_PCREL)
+    return false;
+
+  if (!SYMBOL_REF_P (src) && !LABEL_REF_P (src) && GET_CODE (src) != CONST)
+    return false;
+
+  /* Look for either pc-relative addresses of local symbols that we can use a
+     PLA to load or external symbols that we can load a GOT address via a
+     pc-relative load.  */
+  return pcrel_addr_p (src, true, true, PCREL_NULL);
+}
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 274174)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -245,7 +245,14 @@ extern void rs6000_d_target_versions (void);
 const char * rs6000_xcoff_strip_dollar (const char *);
 #endif
 
-void rs6000_final_prescan_insn (rtx_insn *, rtx *operand, int num_operands);
+/* Declare functions in rs6000-prefixed.c  */
+#ifdef RTX_CODE
+extern bool prefixed_load_p (rtx_insn *);
+extern bool prefixed_store_p (rtx_insn *);
+extern bool prefixed_paddi_p (rtx_insn *);
+extern void rs6000_asm_output_opcode (FILE *, const char *);
+void rs6000_final_prescan_insn (rtx_insn *);
+#endif
 
 extern unsigned char rs6000_class_max_nregs[][LIM_REG_CLASSES];
 extern unsigned char rs6000_hard_regno_nregs[][FIRST_PSEUDO_REGISTER];
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 274174)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -9815,6 +9815,17 @@ rs6000_emit_move (rtx dest, rtx source, machine_mo
 	  return;
 	}
 
+      /* Handle pc-relative addresses, either external symbols or internal
+	 within the function.  */
+      if (TARGET_PCREL)
+	{
+	  if (pcrel_addr_p (operands[1], true, true, PCREL_NULL))
+	    {
+	      emit_insn (gen_rtx_SET (operands[0], operands[1]));
+	      return;
+	    }
+	}
+
       if (DEFAULT_ABI == ABI_V4
 	  && mode == Pmode && mode == SImode
 	  && flag_pic == 1 && got_operand (operands[1], mode))
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 274173)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -2572,3 +2572,24 @@ typedef struct GTY(()) machine_function
   IN_RANGE ((VALUE),							\
 	    -(HOST_WIDE_INT_1 << 33),					\
 	    (HOST_WIDE_INT_1 << 33) - 1 - (EXTRA))
+
+/* Define this if some processing needs to be done before outputting the
+   assembler code.  On the PowerPC, we remember if the current insn is a normal
+   prefixed insn where we need to emit a 'p' before the insn.  */
+#define FINAL_PRESCAN_INSN(INSN, OPERANDS, NOPERANDS)			\
+do									\
+  {									\
+    if (TARGET_PREFIXED_ADDR)						\
+      rs6000_final_prescan_insn (INSN);					\
+  }									\
+while (0)
+
+/* Do anything special before emitting an opcode.  We use it to emit a 'p' for
+   prefixed insns that is set in FINAL_PRESCAN_INSN.  */
+#define ASM_OUTPUT_OPCODE(STREAM, OPCODE)				\
+  do									\
+    {									\
+     if (TARGET_PREFIXED_ADDR)						\
+       rs6000_asm_output_opcode (STREAM, OPCODE);			\
+    }									\
+  while (0)
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 274174)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -269,9 +269,48 @@
 			  x		; Indexed addressing
 			  prefixed])	; Prefixed instruction
 
-;; Length of the instruction (in bytes).
-(define_attr "length" "" (const_int 4))
+;; Whether an insn is a prefixed insn, and an initial 'p' should be printed
+;; before the instruction.  A prefixed instruction has a prefix instruction
+;; word that extends the immediate value of the instructions from 12-16 bits to
+;; 34 bits.  The macro ASM_OUTPUT_OPCODE emits a leading 'p' for prefixed
+;; insns.  The default "length" attribute will also be adjusted by default to
+;; be 12 bytes.
+(define_attr "prefixed" "no,yes"
+  (cond [(ior (match_test "!TARGET_PREFIXED_ADDR")
+	      (match_test "!NONJUMP_INSN_P (insn)"))
+	 (const_string "no")
 
+	 (eq_attr "type" "load,fpload,vecload")
+	 (if_then_else (match_test "prefixed_load_p (insn)")
+		       (const_string "yes")
+		       (const_string "no"))
+
+	 (eq_attr "type" "store,fpstore,vecstore")
+	 (if_then_else (match_test "prefixed_store_p (insn)")
+		       (const_string "yes")
+		       (const_string "no"))
+
+	 (eq_attr "type" "integer,add")
+	 (if_then_else (match_test "prefixed_paddi_p (insn)")
+		       (const_string "yes")
+		       (const_string "no"))]
+	(const_string "no")))
+
+;; Length in bytes of instructions that use prefixed addressing and length in
+;; bytes of instructions that do not use prefixed addressing.  This allows
+;; both lengths to be defined as constants, and the length attribute can pick
+;; the size as appropriate.
+(define_attr "prefixed_length" "" (const_int 12))
+(define_attr "non_prefixed_length" "" (const_int 4))
+
+;; Length of the instruction (in bytes).  Prefixed insns are 8 bytes, but the
+;; assembler might need to issue a NOP so that the prefixed instruction
+;; does not cross a cache boundary, which makes them possibly 12 bytes.
+(define_attr "length" ""
+  (if_then_else (eq_attr "prefixed" "yes")
+		(attr "prefixed_length")
+		(attr "non_prefixed_length")))
+
 ;; Processor type -- this attribute must exactly match the processor_type
 ;; enumeration in rs6000-opts.h.
 (define_attr "cpu"
@@ -9888,6 +9927,25 @@
   operands[6] = gen_rtx_PARALLEL (VOIDmode, p);
 })
 \f
+;; Load up a pc-relative address.  ASM_OUTPUT_OPCODE will emit the initial "p".
+(define_insn "*pcrel_addr"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=b*r")
+	(match_operand:DI 1 "pcrel_address"))]
+  "TARGET_PCREL"
+  "la %0,%a1"
+  [(set_attr "prefixed" "yes")])
+
+;; Load up a pc-relative address to an external symbol.  If the symbol and the
+;; program are both defined in the main program, the linker will optimize this
+;; to a PADDI.  Otherwise, it will create a GOT address that is relocated by
+;; the dynamic linker and loaded up.
+(define_insn "*pcrel_ext_addr"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=b*r")
+	(match_operand:DI 1 "pcrel_external_address"))]
+  "TARGET_PCREL"
+  "ld %0,%a1"
+  [(set_attr "prefixed" "yes")])
+
 ;; TOC register handling.
 
 ;; Code to initialize the TOC register...
Index: gcc/config/rs6000/t-rs6000
===================================================================
--- gcc/config/rs6000/t-rs6000	(revision 274173)
+++ gcc/config/rs6000/t-rs6000	(working copy)
@@ -47,6 +47,10 @@ rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call
 	$(COMPILE) $<
 	$(POSTCOMPILE)
 
+rs6000-prefixed.o: $(srcdir)/config/rs6000/rs6000-prefixed.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
+
 $(srcdir)/config/rs6000/rs6000-tables.opt: $(srcdir)/config/rs6000/genopt.sh \
   $(srcdir)/config/rs6000/rs6000-cpus.def
 	$(SHELL) $(srcdir)/config/rs6000/genopt.sh $(srcdir)/config/rs6000 > \
Index: gcc/config.gcc
===================================================================
--- gcc/config.gcc	(revision 274173)
+++ gcc/config.gcc	(working copy)
@@ -500,6 +500,7 @@ or1k*-*-*)
 powerpc*-*-*)
 	cpu_type=rs6000
 	extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o rs6000-call.o"
+	extra_objs="${extra_objs} rs6000-prefixed.o"
 	extra_headers="ppc-asm.h altivec.h htmintrin.h htmxlintrin.h"
 	extra_headers="${extra_headers} bmi2intrin.h bmiintrin.h"
 	extra_headers="${extra_headers} xmmintrin.h mm_malloc.h emmintrin.h"
@@ -514,6 +515,7 @@ powerpc*-*-*)
 	esac
 	extra_options="${extra_options} g.opt fused-madd.opt rs6000/rs6000-tables.opt"
 	target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-logue.c \$(srcdir)/config/rs6000/rs6000-call.c"
+	target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-prefixed.c"
 	;;
 pru-*-*)
 	cpu_type=pru
@@ -526,7 +528,9 @@ riscv*)
 rs6000*-*-*)
 	extra_options="${extra_options} g.opt fused-madd.opt rs6000/rs6000-tables.opt"
 	extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o rs6000-call.o"
+	extra_objs="${extra_objs} rs6000-prefixed.o"
 	target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-logue.c \$(srcdir)/config/rs6000/rs6000-call.c"
+	target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-prefixed.c"
 	;;
 sparc*-*-*)
 	cpu_type=sparc

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH], Patch #3 of 10, Add prefixed addressing support
  2019-08-14 21:36 PowerPC 'future' patches introduction Michael Meissner
  2019-08-14 21:37 ` [PATCH], Patch #1 of 10, Add instruction format enumeration Michael Meissner
  2019-08-14 22:11 ` [PATCH], Patch #2 of 10, Add RTL prefixed attribute Michael Meissner
@ 2019-08-14 22:12 ` Michael Meissner
  2019-08-16  1:59   ` Bill Schmidt
  2019-08-14 22:15 ` [PATCH], Patch #4 of 10, Adjust costs based on insn sizes Michael Meissner
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 26+ messages in thread
From: Michael Meissner @ 2019-08-14 22:12 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, David Edelsohn,
	Michael Meissner, Alan Modra

This patch adds prefixed memory support to all offsettable instructions.

Unlike previous versions of the patch, this patch combines all of the
modifications for addressing to one patch.  Previously, I had 3 separate
patches (one for PADDI, one for scalar types, and one for vector types).
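
To make the addressing side concrete, here is a rough C model of when an
offset would need the prefixed form.  It is a sketch under stated assumptions,
not the logic of prefixed_local_addr_p: it assumes the traditional D, DS, and
DQ forms take a signed 16-bit offset with 0, 2, and 4 low bits required to be
zero, and that the prefixed form takes any signed 34-bit offset.  The enum and
function names are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the D/DS/DQ members of insn_form.  */
enum offset_form
{
  FORM_D,	/* No alignment restriction on the offset.  */
  FORM_DS,	/* Bottom two bits of the offset must be 0.  */
  FORM_DQ	/* Bottom four bits of the offset must be 0.  */
};

/* Whether the traditional (non-prefixed) instruction can encode OFFSET.  */
bool
traditional_offset_ok (int64_t offset, enum offset_form form)
{
  if (offset < -32768 || offset > 32767)
    return false;

  switch (form)
    {
    case FORM_DS:
      return offset % 4 == 0;
    case FORM_DQ:
      return offset % 16 == 0;
    default:
      return true;
    }
}

/* Whether OFFSET is only reachable with a prefixed instruction: it does not
   fit the traditional form but does fit in a signed 34-bit field.  */
bool
prefixed_offset_needed (int64_t offset, enum offset_form form)
{
  return (!traditional_offset_ok (offset, form)
	  && offset >= -(INT64_C (1) << 33)
	  && offset <= (INT64_C (1) << 33) - 1);
}
```

Under this model a small but misaligned DS or DQ offset selects the prefixed
instruction instead of forcing the offset into a GPR.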

2019-08-14   Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (add_operand): Add support for the
	PADDI instruction.
	(non_add_cint_operand): Add support for the PADDI instruction.
	(lwa_operand): Add support for the prefixed PLWA instruction.
	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok_uncached):
	Only treat modes < 16 bytes as scalars.
	(rs6000_debug_print_mode): Print whether the mode supports
	prefixed addressing.
	(setup_insn_form): Enable prefixed addressing for all modes whose
	default instruction form includes offset addressing.
	(num_insns_constant_gpr): Add support for the PADDI instruction.
	(quad_address_p): Add support for prefixed addressing.
	(mem_operand_gpr): Add support for prefixed addressing.
	(mem_operand_ds_form): Add support for prefixed addressing.
	(rs6000_legitimate_offset_address_p): Add support for prefixed
	addressing.
	(rs6000_legitimate_address_p): Add support for prefixed
	addressing.
	(rs6000_mode_dependent_address): Add support for prefixed
	addressing.
	(rs6000_rtx_costs): Make PADDI cost the same as ADDI or ADDIS.
	* config/rs6000/rs6000.md (add<mode>3): Add support for PADDI.
	(movsi_internal1): Add support for prefixed addressing, and using
	PADDI to load up large integers.
	(movsi splitter): Do not split up a PADDI instruction.
	(mov<mode>_64bit_dm): Add support for prefixed addressing.
	(movtd_64bit_nodm): Add support for prefixed addressing.
	(movdi_internal64): Add support for prefixed addressing, and using
	PADDI to load up large integers.
	(movdi splitter): Update comment about PADDI.
	(stack_protect_setdi): Add support for prefixed addressing.
	(stack_protect_testdi): Add support for prefixed addressing.
	* config/rs6000/vsx.md (vsx_mov<mode>_64bit): Add support for
	prefixed addressing.
	(vsx_extract_<P:mode>_<VSX_D:mode>_load): Add support for prefixed
	addressing.
	(vsx_extract_<P:mode>_<VSX_D:mode>_load): Add support for prefixed
	addressing.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 274174)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -839,7 +839,8 @@
 (define_predicate "add_operand"
   (if_then_else (match_code "const_int")
     (match_test "satisfies_constraint_I (op)
-		 || satisfies_constraint_L (op)")
+		 || satisfies_constraint_L (op)
+		 || satisfies_constraint_eI (op)")
     (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is either a non-special register, or 0, or -1.
@@ -852,7 +853,8 @@
 (define_predicate "non_add_cint_operand"
   (and (match_code "const_int")
        (match_test "!satisfies_constraint_I (op)
-		    && !satisfies_constraint_L (op)")))
+		    && !satisfies_constraint_L (op)
+		    && !satisfies_constraint_eI (op)")))
 
 ;; Return 1 if the operand is a constant that can be used as the operand
 ;; of an AND, OR or XOR.
@@ -933,6 +935,13 @@
     return false;
 
   addr = XEXP (inner, 0);
+
+  /* The LWA instruction uses the DS-form format where the bottom two bits of
+     the offset must be 0.  The prefixed PLWA does not have this
+     restriction.  */
+  if (prefixed_local_addr_p (addr, mode, INSN_FORM_DS))
+    return true;
+
   if (GET_CODE (addr) == PRE_INC
       || GET_CODE (addr) == PRE_DEC
       || (GET_CODE (addr) == PRE_MODIFY
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 274175)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -1828,7 +1828,7 @@ rs6000_hard_regno_mode_ok_uncached (int regno, mac
 
       if (ALTIVEC_REGNO_P (regno))
 	{
-	  if (GET_MODE_SIZE (mode) != 16 && !reg_addr[mode].scalar_in_vmx_p)
+	  if (GET_MODE_SIZE (mode) < 16 && !reg_addr[mode].scalar_in_vmx_p)
 	    return 0;
 
 	  return ALTIVEC_REGNO_P (last_regno);
@@ -2146,6 +2146,11 @@ rs6000_debug_print_mode (ssize_t m)
           rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_FPR]),
           rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_VMX]));
 
+  if (reg_addr[m].prefixed_memory_p)
+    fprintf (stderr, "  Prefix");
+  else
+    spaces += sizeof ("  Prefix") - 1;
+
   if ((reg_addr[m].reload_store != CODE_FOR_nothing)
       || (reg_addr[m].reload_load != CODE_FOR_nothing))
     {
@@ -2838,11 +2843,16 @@ setup_insn_form (void)
       else
 	def_rc = RELOAD_REG_GPR;
 
-      reg_addr[m].default_insn_form = reg_addr[m].insn_form[def_rc];
+      enum insn_form def_iform = reg_addr[m].insn_form[def_rc];
+      reg_addr[m].default_insn_form = def_iform;
 
-      /* Don't enable prefixed memory support until all of the infrastructure
-	 changes are in.  */
-      reg_addr[m].prefixed_memory_p = false;
+      /* Only modes that support offset addressing by default can be
+	 prefixed.  */
+      reg_addr[m].prefixed_memory_p = (TARGET_PREFIXED_ADDR
+				       && (def_iform == INSN_FORM_D
+					   || def_iform == INSN_FORM_DS
+					   || def_iform == INSN_FORM_DQ));
+
     }
 }
 
@@ -5693,7 +5703,7 @@ static int
 num_insns_constant_gpr (HOST_WIDE_INT value)
 {
   /* signed constant loadable with addi */
-  if (((unsigned HOST_WIDE_INT) value + 0x8000) < 0x10000)
+  if (SIGNED_16BIT_OFFSET_P (value))
     return 1;
 
   /* constant loadable with addis */
@@ -5701,6 +5711,10 @@ num_insns_constant_gpr (HOST_WIDE_INT value)
 	   && (value >> 31 == -1 || value >> 31 == 0))
     return 1;
 
+  /* PADDI can support up to 34 bit signed integers.  */
+  else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (value))
+    return 1;
+
   else if (TARGET_POWERPC64)
     {
       HOST_WIDE_INT low  = ((value & 0xffffffff) ^ 0x80000000) - 0x80000000;
@@ -7411,7 +7425,7 @@ quad_address_p (rtx addr, machine_mode mode, bool
 {
   rtx op0, op1;
 
-  if (GET_MODE_SIZE (mode) != 16)
+  if (GET_MODE_SIZE (mode) < 16)
     return false;
 
   if (legitimate_indirect_address_p (addr, strict))
@@ -7420,6 +7434,13 @@ quad_address_p (rtx addr, machine_mode mode, bool
   if (VECTOR_MODE_P (mode) && !mode_supports_dq_form (mode))
     return false;
 
+  /* Is this a valid prefixed address?  If the bottom four bits of the offset
+     are non-zero, we could use a prefixed instruction (which does not have the
+     DQ-form constraint that the traditional instruction had) instead of
+     forcing the unaligned offset to a GPR.  */
+  if (prefixed_local_addr_p (addr, mode, INSN_FORM_DQ))
+    return true;
+
   if (GET_CODE (addr) != PLUS)
     return false;
 
@@ -7521,6 +7542,13 @@ mem_operand_gpr (rtx op, machine_mode mode)
       && legitimate_indirect_address_p (XEXP (addr, 0), false))
     return true;
 
+  /* Allow prefixed instructions if supported.  If the bottom two bits of the
+     offset are non-zero, we could use a prefixed instruction (which does not
+     have the DS-form constraint that the traditional instruction had) instead
+     of forcing the unaligned offset to a GPR.  */
+  if (prefixed_local_addr_p (addr, mode, INSN_FORM_DS))
+    return true;
+
   /* Don't allow non-offsettable addresses.  See PRs 83969 and 84279.  */
   if (!rs6000_offsettable_memref_p (op, mode, false))
     return false;
@@ -7542,7 +7570,7 @@ mem_operand_gpr (rtx op, machine_mode mode)
        causes a wrap, so test only the low 16 bits.  */
     offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
 
-  return offset + 0x8000 < 0x10000u - extra;
+  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 
 /* As above, but for DS-FORM VSX insns.  Unlike mem_operand_gpr,
@@ -7555,6 +7583,13 @@ mem_operand_ds_form (rtx op, machine_mode mode)
   int extra;
   rtx addr = XEXP (op, 0);
 
+  /* Allow prefixed instructions if supported.  If the bottom two bits of the
+     offset are non-zero, we could use a prefixed instruction (which does not
+     have the DS-form constraint that the traditional instruction had) instead
+     of forcing the unaligned offset to a GPR.  */
+  if (prefixed_local_addr_p (addr, mode, INSN_FORM_DS))
+    return true;
+
   if (!offsettable_address_p (false, mode, addr))
     return false;
 
@@ -7575,7 +7610,7 @@ mem_operand_ds_form (rtx op, machine_mode mode)
        causes a wrap, so test only the low 16 bits.  */
     offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
 
-  return offset + 0x8000 < 0x10000u - extra;
+  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 \f
 /* Subroutines of rs6000_legitimize_address and rs6000_legitimate_address_p.  */
@@ -7924,8 +7959,10 @@ rs6000_legitimate_offset_address_p (machine_mode m
       break;
     }
 
-  offset += 0x8000;
-  return offset < 0x10000 - extra;
+  if (TARGET_PREFIXED_ADDR)
+    return SIGNED_34BIT_OFFSET_EXTRA_P (offset, extra);
+  else
+    return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 
 bool
@@ -8822,6 +8859,11 @@ rs6000_legitimate_address_p (machine_mode mode, rt
       && mode_supports_pre_incdec_p (mode)
       && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
     return 1;
+
+  /* Handle prefixed addresses (pc-relative or 34-bit offset).  */
+  if (prefixed_local_addr_p (x, mode, INSN_FORM_UNKNOWN))
+    return 1;
+
   /* Handle restricted vector d-form offsets in ISA 3.0.  */
   if (quad_offset_p)
     {
@@ -8880,7 +8922,10 @@ rs6000_legitimate_address_p (machine_mode mode, rt
 	  || (!avoiding_indexed_address_p (mode)
 	      && legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict)))
       && rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0)))
-    return 1;
+    {
+      /* There is no prefixed version of the load/store with update.  */
+      return !prefixed_local_addr_p (XEXP (x, 1), mode, INSN_FORM_UNKNOWN);
+    }
   if (reg_offset_p && !quad_offset_p
       && legitimate_lo_sum_address_p (mode, x, reg_ok_strict))
     return 1;
@@ -8942,8 +8987,12 @@ rs6000_mode_dependent_address (const_rtx addr)
 	  && XEXP (addr, 0) != arg_pointer_rtx
 	  && CONST_INT_P (XEXP (addr, 1)))
 	{
-	  unsigned HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
-	  return val + 0x8000 >= 0x10000 - (TARGET_POWERPC64 ? 8 : 12);
+	  HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
+	  HOST_WIDE_INT extra = TARGET_POWERPC64 ? 8 : 12;
+	  if (TARGET_PREFIXED_ADDR)
+	    return !SIGNED_34BIT_OFFSET_EXTRA_P (val, extra);
+	  else
+	    return !SIGNED_16BIT_OFFSET_EXTRA_P (val, extra);
 	}
       break;
 
@@ -20939,7 +20988,8 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int ou
 	    || outer_code == PLUS
 	    || outer_code == MINUS)
 	   && (satisfies_constraint_I (x)
-	       || satisfies_constraint_L (x)))
+	       || satisfies_constraint_L (x)
+	       || satisfies_constraint_eI (x)))
 	  || (outer_code == AND
 	      && (satisfies_constraint_K (x)
 		  || (mode == SImode
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 274175)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -1768,15 +1768,17 @@
 })
 
 (define_insn "*add<mode>3"
-  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
-	(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
-		  (match_operand:GPR 2 "add_operand" "r,I,L")))]
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r,r")
+	(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b,b")
+		  (match_operand:GPR 2 "add_operand" "r,I,L,eI")))]
   ""
   "@
    add %0,%1,%2
    addi %0,%1,%2
-   addis %0,%1,%v2"
-  [(set_attr "type" "add")])
+   addis %0,%1,%v2
+   addi %0,%1,%2"
+  [(set_attr "type" "add")
+   (set_attr "isa" "*,*,*,fut")])
 
 (define_insn "*addsi3_high"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=b")
@@ -6916,22 +6918,22 @@
 
 ;;		MR           LA           LWZ          LFIWZX       LXSIWZX
 ;;		STW          STFIWX       STXSIWX      LI           LIS
-;;		#            XXLOR        XXSPLTIB 0   XXSPLTIB -1  VSPLTISW
-;;		XXLXOR 0     XXLORC -1    P9 const     MTVSRWZ      MFVSRWZ
-;;		MF%1         MT%0         NOP
+;;		PLI          #            XXLOR        XXSPLTIB 0   XXSPLTIB -1
+;;		VSPLTISW     XXLXOR 0     XXLORC -1    P9 const     MTVSRWZ
+;;		MFVSRWZ      MF%1         MT%0         NOP
 (define_insn "*movsi_internal1"
   [(set (match_operand:SI 0 "nonimmediate_operand"
 		"=r,         r,           r,           d,           v,
 		 m,          Z,           Z,           r,           r,
-		 r,          wa,          wa,          wa,          v,
-		 wa,         v,           v,           wa,          r,
-		 r,          *h,          *h")
+		 r,          r,           wa,          wa,          wa,
+		 v,          wa,          v,           v,           wa,
+		 r,          r,           *h,          *h")
 	(match_operand:SI 1 "input_operand"
 		"r,          U,           m,           Z,           Z,
 		 r,          d,           v,           I,           L,
-		 n,          wa,          O,           wM,          wB,
-		 O,          wM,          wS,          r,           wa,
-		 *h,         r,           0"))]
+		 eI,         n,           wa,          O,           wM,
+		 wB,         O,           wM,          wS,          r,
+		 wa,         *h,          r,           0"))]
   "gpc_reg_operand (operands[0], SImode)
    || gpc_reg_operand (operands[1], SImode)"
   "@
@@ -6945,6 +6947,7 @@
    stxsiwx %x1,%y0
    li %0,%1
    lis %0,%v1
+   li %0,%1
    #
    xxlor %x0,%x1,%x1
    xxspltib %x0,0
@@ -6961,21 +6964,21 @@
   [(set_attr "type"
 		"*,          *,           load,        fpload,      fpload,
 		 store,      fpstore,     fpstore,     *,           *,
-		 *,          veclogical,  vecsimple,   vecsimple,   vecsimple,
-		 veclogical, veclogical,  vecsimple,   mffgpr,      mftgpr,
-		 *,          *,           *")
+		 *,          *,           veclogical,  vecsimple,   vecsimple,
+		 vecsimple,  veclogical,  veclogical,  vecsimple,   mffgpr,
+		 mftgpr,     *,           *,           *")
    (set_attr "length"
 		"*,          *,           *,           *,           *,
 		 *,          *,           *,           *,           *,
-		 8,          *,           *,           *,           *,
-		 *,          *,           8,           *,           *,
-		 *,          *,           *")
+		 *,          8,           *,           *,           *,
+		 *,          *,           *,           8,           *,
+		 *,          *,           *,           *")
    (set_attr "isa"
 		"*,          *,           *,           p8v,         p8v,
 		 *,          p8v,         p8v,         *,           *,
-		 *,          p8v,         p9v,         p9v,         p8v,
-		 p9v,        p8v,         p9v,         p8v,         p8v,
-		 *,          *,           *")])
+		 fut,        *,           p8v,         p9v,         p9v,
+		 p8v,        p9v,         p8v,         p9v,         p8v,
+		 p8v,        *,           *,           *")])
 
 ;; Like movsi, but adjust a SF value to be used in a SI context, i.e.
 ;; (set (reg:SI ...) (subreg:SI (reg:SF ...) 0))
@@ -7120,14 +7123,15 @@
   "xscvdpsp %x0,%x1"
   [(set_attr "type" "fp")])
 
-;; Split a load of a large constant into the appropriate two-insn
-;; sequence.
+;; Split a load of a large constant into the appropriate two-insn sequence.  On
+;; systems that support PADDI (PLI), we can use PLI to load any 32-bit constant
+;; in one instruction.
 
 (define_split
   [(set (match_operand:SI 0 "gpc_reg_operand")
 	(match_operand:SI 1 "const_int_operand"))]
   "(unsigned HOST_WIDE_INT) (INTVAL (operands[1]) + 0x8000) >= 0x10000
-   && (INTVAL (operands[1]) & 0xffff) != 0"
+   && (INTVAL (operands[1]) & 0xffff) != 0 && !TARGET_PREFIXED_ADDR"
   [(set (match_dup 0)
 	(match_dup 2))
    (set (match_dup 0)
@@ -7766,9 +7770,18 @@
 ;; not swapped like they are for TImode or TFmode.  Subregs therefore are
 ;; problematical.  Don't allow direct move for this case.
 
+;;		FPR load    FPR store   FPR move    FPR zero    GPR load
+;;		GPR store   GPR move    GPR zero    MFVSRD      MTVSRD
+
 (define_insn_and_split "*mov<mode>_64bit_dm"
-  [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r,r,d")
-	(match_operand:FMOVE128_FPR 1 "input_operand" "d,m,d,<zero_fp>,r,<zero_fp>Y,r,d,r"))]
+  [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand"
+		"=m,        d,          d,          d,          Y,
+		 r,         r,          r,          r,          d")
+
+	(match_operand:FMOVE128_FPR 1 "input_operand"
+		"d,         m,          d,          <zero_fp>,  r,
+		 <zero_fp>, Y,          r,          d,          r"))]
+
   "TARGET_HARD_FLOAT && TARGET_POWERPC64 && FLOAT128_2REG_P (<MODE>mode)
    && (<MODE>mode != TDmode || WORDS_BIG_ENDIAN)
    && (gpc_reg_operand (operands[0], <MODE>mode)
@@ -7776,9 +7789,13 @@
   "#"
   "&& reload_completed"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8,8,8,8,12,12,8,8,8")
-   (set_attr "isa" "*,*,*,*,*,*,*,p8v,p8v")])
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
+  [(set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")
+   (set_attr "non_prefixed_length" "8")
+   (set_attr "prefixed_length" "20")])
 
 (define_insn_and_split "*movtd_64bit_nodm"
   [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
@@ -7789,8 +7806,12 @@
   "#"
   "&& reload_completed"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8,8,8,12,12,8")])
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
+  [(set_attr "non_prefixed_length" "8")
+   (set_attr "prefixed_length" "20")])
 
 (define_insn_and_split "*mov<mode>_32bit"
   [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r")
@@ -8800,24 +8821,24 @@
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
 
-;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR #
-;;              FPR store  FPR load   FPR move   AVX store  AVX store   AVX load
-;;              AVX load   VSX move   P9 0       P9 -1      AVX 0/-1    VSX 0
-;;              VSX -1     P9 const   AVX const  From SPR   To SPR      SPR<->SPR
-;;              VSX->GPR   GPR->VSX
+;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR pli
+;;              GPR #      FPR store  FPR load   FPR move   AVX store   AVX store
+;;              AVX load   AVX load   VSX move   P9 0       P9 -1       AVX 0/-1
+;;              VSX 0      VSX -1     P9 const   AVX const  From SPR    To SPR
+;;              SPR<->SPR  VSX->GPR   GPR->VSX
 (define_insn "*movdi_internal64"
   [(set (match_operand:DI 0 "nonimmediate_operand"
                "=YZ,       r,         r,         r,         r,          r,
-                m,         ^d,        ^d,        wY,        Z,          $v,
-                $v,        ^wa,       wa,        wa,        v,          wa,
-                wa,        v,         v,         r,         *h,         *h,
-                ?r,        ?wa")
+                r,         m,         ^d,        ^d,        wY,         Z,
+                $v,        $v,        ^wa,       wa,        wa,         v,
+                wa,        wa,        v,         v,         r,          *h,
+                *h,        ?r,        ?wa")
 	(match_operand:DI 1 "input_operand"
-               "r,         YZ,        r,         I,         L,          nF,
-                ^d,        m,         ^d,        ^v,        $v,         wY,
-                Z,         ^wa,       Oj,        wM,        OjwM,       Oj,
-                wM,        wS,        wB,        *h,        r,          0,
-                wa,        r"))]
+               "r,         YZ,        r,         I,         L,          eI,
+                nF,        ^d,        m,         ^d,        ^v,         $v,
+                wY,        Z,         ^wa,       Oj,        wM,         OjwM,
+                Oj,        wM,        wS,        wB,        *h,         r,
+                0,         wa,        r"))]
   "TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], DImode)
        || gpc_reg_operand (operands[1], DImode))"
@@ -8827,6 +8848,7 @@
    mr %0,%1
    li %0,%1
    lis %0,%v1
+   li %0,%1
    #
    stfd%U0%X0 %1,%0
    lfd%U1%X1 %0,%1
@@ -8850,26 +8872,28 @@
    mtvsrd %x0,%1"
   [(set_attr "type"
                "store,      load,	*,         *,         *,         *,
-                fpstore,    fpload,     fpsimple,  fpstore,   fpstore,   fpload,
-                fpload,     veclogical, vecsimple, vecsimple, vecsimple, veclogical,
-                veclogical, vecsimple,  vecsimple, mfjmpr,    mtjmpr,    *,
-                mftgpr,    mffgpr")
+                *,          fpstore,    fpload,    fpsimple,  fpstore,   fpstore,
+                fpload,     fpload,     veclogical,vecsimple, vecsimple, vecsimple,
+                veclogical, veclogical, vecsimple,  vecsimple, mfjmpr,   mtjmpr,
+                *,          mftgpr,    mffgpr")
    (set_attr "size" "64")
    (set_attr "length"
-               "*,         *,         *,         *,         *,          20,
+               "*,         *,         *,         *,         *,          *,
+                20,        *,         *,         *,         *,          *,
                 *,         *,         *,         *,         *,          *,
-                *,         *,         *,         *,         *,          *,
-                *,         8,         *,         *,         *,          *,
-                *,         *")
+                *,         *,         8,         *,         *,          *,
+                *,         *,         *")
    (set_attr "isa"
-               "*,         *,         *,         *,         *,          *,
-                *,         *,         *,         p9v,       p7v,        p9v,
-                p7v,       *,         p9v,       p9v,       p7v,        *,
-                *,         p7v,       p7v,       *,         *,          *,
-                p8v,       p8v")])
+               "*,         *,         *,         *,         *,          fut,
+                *,         *,         *,         *,         p9v,        p7v,
+                p9v,       p7v,       *,         p9v,       p9v,        p7v,
+                *,         *,         p7v,       p7v,       *,          *,
+                *,         p8v,       p8v")])
 
 ; Some DImode loads are best done as a load of -1 followed by a mask
-; instruction.
+; instruction.  On systems that support the PADDI (PLI) instruction,
+; num_insns_constant returns 1, so this splitter would not be used for
+; constants that can be loaded with PLI.
 (define_split
   [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
 	(match_operand:DI 1 "const_int_operand"))]
@@ -8987,7 +9011,8 @@
   return rs6000_output_move_128bit (operands);
 }
   [(set_attr "type" "store,store,load,load,*,*")
-   (set_attr "length" "8")])
+   (set_attr "non_prefixed_length" "8,8,8,8,8,40")
+   (set_attr "prefixed_length" "20,20,20,20,8,40")])
 
 (define_split
   [(set (match_operand:TI2 0 "int_reg_operand")
@@ -11501,15 +11526,43 @@
   [(set_attr "type" "three")
    (set_attr "length" "12")])
 
+;; We can't use the prefixed attribute here because there are two memory
+;; instructions, and we can't split the insn due to the fact that this
+;; operation needs to be done in one piece.
 (define_insn "stack_protect_setdi"
   [(set (match_operand:DI 0 "memory_operand" "=Y")
 	(unspec:DI [(match_operand:DI 1 "memory_operand" "Y")] UNSPEC_SP_SET))
    (set (match_scratch:DI 2 "=&r") (const_int 0))]
   "TARGET_64BIT"
-  "ld%U1%X1 %2,%1\;std%U0%X0 %2,%0\;li %2,0"
+{
+  if (prefixed_mem_operand (operands[1], DImode))
+    output_asm_insn ("pld %2,%1", operands);
+  else
+    output_asm_insn ("ld%U1%X1 %2,%1", operands);
+
+  if (prefixed_mem_operand (operands[0], DImode))
+    output_asm_insn ("pstd %2,%0", operands);
+  else
+    output_asm_insn ("std%U0%X0 %2,%0", operands);
+
+  return "li %2,0";
+}
   [(set_attr "type" "three")
-   (set_attr "length" "12")])
 
+  ;; Back to back prefixed memory instructions take 20 bytes (8 bytes for each
+  ;; prefixed instruction + 4 bytes for the possible NOP).
+   (set_attr "prefixed" "no")
+   (set (attr "length")
+	(cond [(and (match_operand 0 "prefixed_mem_operand")
+		    (match_operand 1 "prefixed_mem_operand"))
+	       (const_string "24")
+
+	       (ior (match_operand 0 "prefixed_mem_operand")
+		    (match_operand 1 "prefixed_mem_operand"))
+	       (const_string "20")]
+
+	      (const_string "12")))])
+
 (define_expand "stack_protect_test"
   [(match_operand 0 "memory_operand")
    (match_operand 1 "memory_operand")
@@ -11547,6 +11600,9 @@
    lwz%U1%X1 %3,%1\;lwz%U2%X2 %4,%2\;cmplw %0,%3,%4\;li %3,0\;li %4,0"
   [(set_attr "length" "16,20")])
 
+;; We can't use the prefixed attribute here because there are two memory
+;; instructions, and we can't split the insn due to the fact that this
+;; operation needs to be done in one piece.
 (define_insn "stack_protect_testdi"
   [(set (match_operand:CCEQ 0 "cc_reg_operand" "=x,?y")
         (unspec:CCEQ [(match_operand:DI 1 "memory_operand" "Y,Y")
@@ -11555,11 +11611,44 @@
    (set (match_scratch:DI 4 "=r,r") (const_int 0))
    (clobber (match_scratch:DI 3 "=&r,&r"))]
   "TARGET_64BIT"
-  "@
-   ld%U1%X1 %3,%1\;ld%U2%X2 %4,%2\;xor. %3,%3,%4\;li %4,0
-   ld%U1%X1 %3,%1\;ld%U2%X2 %4,%2\;cmpld %0,%3,%4\;li %3,0\;li %4,0"
-  [(set_attr "length" "16,20")])
+{
+  if (prefixed_mem_operand (operands[1], DImode))
+    output_asm_insn ("pld %3,%1", operands);
+  else
+    output_asm_insn ("ld%U1%X1 %3,%1", operands);
 
+  if (prefixed_mem_operand (operands[2], DImode))
+    output_asm_insn ("pld %4,%2", operands);
+  else
+    output_asm_insn ("ld%U2%X2 %4,%2", operands);
+
+  if (which_alternative == 0)
+    output_asm_insn ("xor. %3,%3,%4", operands);
+  else
+    output_asm_insn ("cmpld %0,%3,%4\;li %3,0", operands);
+
+  return "li %4,0";
+}
+  ;; Back to back prefixed memory instructions take 20 bytes (8 bytes for each
+  ;; prefixed instruction + 4 bytes for the possible NOP).
+  [(set (attr "length")
+	(cond [(and (match_operand 1 "prefixed_mem_operand")
+		    (match_operand 2 "prefixed_mem_operand"))
+	       (if_then_else (eq_attr "alternative" "0")
+			     (const_string "28")
+			     (const_string "32"))
+
+	       (ior (match_operand 1 "prefixed_mem_operand")
+		    (match_operand 2 "prefixed_mem_operand"))
+	       (if_then_else (eq_attr "alternative" "0")
+			     (const_string "20")
+			     (const_string "24"))]
+
+	      (if_then_else (eq_attr "alternative" "0")
+			    (const_string "16")
+			    (const_string "20"))))
+   (set_attr "prefixed" "no")])
+
 \f
 ;; Here are the actual compare insns.
 (define_insn "*cmp<mode>_signed"
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 274173)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -1149,10 +1149,30 @@
                "vecstore,  vecload,   vecsimple, mffgpr,    mftgpr,    load,
                 store,     load,      store,     *,         vecsimple, vecsimple,
                 vecsimple, *,         *,         vecstore,  vecload")
-   (set_attr "length"
-               "*,         *,         *,         8,         *,         8,
-                8,         8,         8,         8,         *,         *,
-                *,         20,        8,         *,         *")
+   (set (attr "non_prefixed_length")
+	(cond [(and (eq_attr "alternative" "4")		;; MTVSRDD
+		    (match_test "TARGET_P9_VECTOR"))
+	       (const_string "4")
+
+	       (eq_attr "alternative" "3,4")		;; GPR <-> VSX
+	       (const_string "8")
+
+	       (eq_attr "alternative" "5,6,7,8")	;; GPR load/store
+	       (const_string "8")]
+	      (const_string "*")))
+
+   (set (attr "prefixed_length")
+	(cond [(and (eq_attr "alternative" "4")		;; MTVSRDD
+		    (match_test "TARGET_P9_VECTOR"))
+	       (const_string "4")
+
+	       (eq_attr "alternative" "3,4")		;; GPR <-> VSX
+	       (const_string "8")
+
+	       (eq_attr "alternative" "5,6,7,8")	;; GPR load/store
+	       (const_string "20")]
+	      (const_string "*")))
+
    (set_attr "isa"
                "<VSisa>,   <VSisa>,   <VSisa>,   *,         *,         *,
                 *,         *,         *,         *,         p9v,       *,
@@ -3199,7 +3219,12 @@
 					   operands[3], <VSX_D:VS_scalar>mode);
 }
   [(set_attr "type" "fpload,load")
-   (set_attr "length" "8")])
+   (set (attr "prefixed")
+	(if_then_else (match_operand 1 "prefixed_mem_operand")
+		      (const_string "yes")
+		      (const_string "no")))
+   (set_attr "non_prefixed_length" "8")
+   (set_attr "prefixed_length" "16")])
 
 ;; Optimize storing a single scalar element that is the right location to
 ;; memory
@@ -3294,6 +3319,8 @@
 }
   [(set_attr "type" "fpload,fpload,fpload,load")
    (set_attr "length" "8")
+   (set_attr "non_prefixed_length" "8")
+   (set_attr "prefixed_length" "16")
    (set_attr "isa" "*,p7v,p9v,*")])
 
 ;; Variable V4SF extract

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], Patch #4 of 10, Adjust costs based on insn sizes
  2019-08-14 21:36 PowerPC 'future' patches introduction Michael Meissner
                   ` (2 preceding siblings ...)
  2019-08-14 22:12 ` [PATCH], Patch #3 of 10, Add prefixed addressing support Michael Meissner
@ 2019-08-14 22:15 ` Michael Meissner
  2019-08-14 22:23 ` [PATCH], Patch #5 of 10, Make -mpcrel default for -mcpu=future Michael Meissner
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Michael Meissner @ 2019-08-14 22:15 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, David Edelsohn,
	Michael Meissner, Alan Modra

Some of the cost functions in the PowerPC compiler use the length of the
instruction to factor in the costs.  This patch adjusts this calculation so
that prefixed instructions are treated as having the same cost as non-prefixed
instructions.

I forgot to mention in the previous patches, all 10 of the patches have been
bootstrapped on a little endian power8 system in progression, and there were no
regressions.  Once the previous patches have been checked in, can I check this
patch into the trunk?
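
As a rough illustration of the counting rule this patch introduces, here is a
stand-alone sketch.  It is not the code in the patch: model_num_insns is a
hypothetical name, and it takes the byte length and a "prefixed" flag as plain
arguments, where the real function reads them from the insn attributes and
also checks TARGET_PREFIXED_ADDR.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the instruction count used for costing.  LENGTH is
   the insn length in bytes; PREFIXED says whether the insn contains a
   prefixed instruction.  A single prefixed instruction is assumed to have
   length 12 (8 bytes plus a possible 4 byte alignment NOP).  */
int
model_num_insns (int length, bool prefixed)
{
  if (prefixed && length >= 12)
    {
      /* Single prefixed instruction.  */
      if (length == 12)
	return 1;

      /* A normal instruction plus a prefixed instruction (16), or two back
	 to back prefixed instructions (20).  */
      if (length == 16 || length == 20)
	return 2;

      /* Guess for larger sizes: one more instruction per extra 4 bytes.  */
      return 2 + (length - 20) / 4;
    }

  /* Non-prefixed insns: every instruction is 4 bytes.  */
  return length / 4;
}
```

With this model a 12 byte prefixed load counts as one instruction, whereas
dividing the length by 4 would have counted it as three.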

2019-08-14   Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_num_insns): New function.
	(rs6000_insn_cost): Use rs6000_num_insns to treat prefixed
	load/store instructions with the same cost as non-prefixed
	instructions.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 274177)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -21369,7 +21369,43 @@ rs6000_debug_rtx_costs (rtx x, machine_mode mode,
   return ret;
 }
 
+/* How many real instructions are generated for this insn?  This is slightly
+   different from the length attribute, in that the length attribute counts the
+   number of bytes.  With prefixed instructions, we don't want to count a
+   prefixed instruction (length 12 bytes including possible NOP) as taking 3
+   instructions, but just one.  */
+
 static int
+rs6000_num_insns (rtx_insn *insn)
+{
+  /* Try to figure it out based on the length and whether there are prefixed
+     instructions.  While prefixed instructions are only 8 bytes, we have to
+     use 12 as the size of the first prefixed instruction in case the
+     instruction needs to be aligned.  Back to back prefixed instructions would
+     only take 20 bytes, since it is guaranteed that one of the prefixed
+     instructions does not need the alignment.  */
+  int length = get_attr_length (insn);
+
+  if (length >= 12 && TARGET_PREFIXED_ADDR
+      && get_attr_prefixed (insn) == PREFIXED_YES)
+    {
+      /* Single prefixed instruction.  */
+      if (length == 12)
+	return 1;
+
+      /* A normal instruction and a prefixed instruction (16) or two back
+	 to back prefixed instructions (20).  */
+      if (length == 16 || length == 20)
+	return 2;
+
+      /* Guess for larger instruction sizes.  */
+      return 2 + (length - 20) / 4;
+    }
+
+  return length / 4;
+}
+
+static int
 rs6000_insn_cost (rtx_insn *insn, bool speed)
 {
   if (recog_memoized (insn) < 0)
@@ -21382,7 +21418,7 @@ rs6000_insn_cost (rtx_insn *insn, bool speed)
   if (cost > 0)
     return cost;
 
-  int n = get_attr_length (insn) / 4;
+  int n = rs6000_num_insns (insn);
   enum attr_type type = get_attr_type (insn);
 
   switch (type)

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], Patch #5 of 10, Make -mpcrel default for -mcpu=future
  2019-08-14 21:36 PowerPC 'future' patches introduction Michael Meissner
                   ` (3 preceding siblings ...)
  2019-08-14 22:15 ` [PATCH], Patch #4 of 10, Adjust costs based on insn sizes Michael Meissner
@ 2019-08-14 22:23 ` Michael Meissner
  2019-08-14 23:10 ` [PATCH], Patch #6 of 10, Add 'future' support to function attributes Michael Meissner
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Michael Meissner @ 2019-08-14 22:23 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, David Edelsohn,
	Michael Meissner, Alan Modra

This patch makes -mcpu=future turn on pc-relative addressing by default.

I have built each of the patches in turn on a little endian power8 system doing
a bootstrap and make check.  There were no regressions.  Can I check this patch
into the trunk once the previous patches are checked in?

2019-08-14   Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Enable
	pc-relative support by default on 'future' systems.

Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(revision 274173)
+++ gcc/config/rs6000/rs6000-cpus.def	(working copy)
@@ -75,10 +75,10 @@
 				 | OPTION_MASK_P8_VECTOR		\
 				 | OPTION_MASK_P9_VECTOR)
 
-/* Support for a future processor's features.  Do not enable -mpcrel until it
-   is fully functional.  */
+/* Support for a future processor's features.  */
 #define ISA_FUTURE_MASKS_SERVER	(ISA_3_0_MASKS_SERVER			\
 				 | OPTION_MASK_FUTURE			\
+				 | OPTION_MASK_PCREL			\
 				 | OPTION_MASK_PREFIXED_ADDR)
 
 /* Flags that need to be turned off if -mno-future.  */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], Patch #6 of 10, Add 'future' support to function attributes
  2019-08-14 21:36 PowerPC 'future' patches introduction Michael Meissner
                   ` (4 preceding siblings ...)
  2019-08-14 22:23 ` [PATCH], Patch #5 of 10, Make -mpcrel default for -mcpu=future Michael Meissner
@ 2019-08-14 23:10 ` Michael Meissner
  2019-08-14 23:13 ` [PATCH], Patch #7 of 10, Add support for PCREL_OPT Michael Meissner
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Michael Meissner @ 2019-08-14 23:10 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, David Edelsohn,
	Michael Meissner, Alan Modra

This patch adds support for using cpu=future in the "target" function
attribute, the "target" pragma, and the "target_clones" function attribute.

In addition, it adds support for the following arguments to
__builtin_cpu_supports:

	"arch_3_1"	Whether ISA 3.1 is supported by the machine;
	"mma"		Whether the MMA extension is supported by the machine.

The hwcap2 bits used in the auxv table will be appearing in future Linux
kernels.  At present, there is no support for:

	__builtin_cpu_is ("future")
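
To show how a name such as "arch_3_1" relates to a hwcap2 bit, here is a
stand-alone sketch of a table lookup.  The two bit values are the ones proposed
in this patch (which marks them as not yet official).  The table layout and
model_cpu_supports are hypothetical and only illustrate the idea; they are not
the code GCC generates for __builtin_cpu_supports.

```c
#include <assert.h>
#include <string.h>

/* Bit values as proposed in this patch; not yet official.  */
#define PPC_FEATURE2_ARCH_3_1	0x00040000
#define PPC_FEATURE2_MMA	0x00020000

/* Hypothetical name to mask table.  */
struct feature_entry
{
  const char *name;
  unsigned int mask;
};

static const struct feature_entry feature_table[] = {
  { "arch_3_1",	PPC_FEATURE2_ARCH_3_1 },
  { "mma",	PPC_FEATURE2_MMA },
};

/* Return 1 if the bit for NAME is set in HWCAP2, 0 if it is clear, and -1 if
   NAME is not in the table.  */
int
model_cpu_supports (unsigned int hwcap2, const char *name)
{
  size_t i;

  for (i = 0; i < sizeof (feature_table) / sizeof (feature_table[0]); i++)
    if (strcmp (feature_table[i].name, name) == 0)
      return (hwcap2 & feature_table[i].mask) != 0;

  return -1;
}
```

On a real system the hwcap2 word would come from the AT_HWCAP2 entry of the
auxv table; here it is simply passed in as an argument.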

I have built each of the patches on a little endian power8 system and there
were no regressions in either the bootstrap or make check operations.  Can I
check this patch into the trunk after the other patches have been checked in?

[gcc]
2019-08-14   Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/ppc-auxv.h (PPC_FEATURE2_ARCH_3_1): New hwcap2
	bit.
	(PPC_FEATURE2_MMA): New hwcap2 bit.
	* config/rs6000/rs6000-call.c (cpu_supports_info): Add arch 3.1
	and mma bits.
	* config/rs6000/rs6000.c (rs6000_clone_map): Add 'future' system
	to target_clone support.

[gcc/testsuite]
2019-08-14  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/clone3.c: New test for using 'future' with
	the target_clones attribute.

Index: gcc/config/rs6000/ppc-auxv.h
===================================================================
--- gcc/config/rs6000/ppc-auxv.h	(revision 274173)
+++ gcc/config/rs6000/ppc-auxv.h	(working copy)
@@ -93,6 +93,9 @@
 #define PPC_FEATURE2_SCV            0x00100000
 #define PPC_FEATURE2_HTM_NO_SUSPEND 0x00080000
 
+/* These are not yet official.  */
+#define PPC_FEATURE2_ARCH_3_1       0x00040000
+#define PPC_FEATURE2_MMA            0x00020000
 
 /* Thread Control Block (TCB) offsets of the AT_PLATFORM, AT_HWCAP and
    AT_HWCAP2 values.  These must match the values defined in GLIBC.  */
Index: gcc/config/rs6000/rs6000-call.c
===================================================================
--- gcc/config/rs6000/rs6000-call.c	(revision 274173)
+++ gcc/config/rs6000/rs6000-call.c	(working copy)
@@ -171,7 +171,9 @@ static const struct
   { "arch_3_00",	PPC_FEATURE2_ARCH_3_00,		1 },
   { "ieee128",		PPC_FEATURE2_HAS_IEEE128,	1 },
   { "darn",		PPC_FEATURE2_DARN,		1 },
-  { "scv",		PPC_FEATURE2_SCV,		1 }
+  { "scv",		PPC_FEATURE2_SCV,		1 },
+  { "arch_3_1",		PPC_FEATURE2_ARCH_3_1,		1 },
+  { "mma",		PPC_FEATURE2_MMA,		1 },
 };
 
 static void altivec_init_builtins (void);
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 274178)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -259,6 +259,7 @@ enum {
   CLONE_ISA_2_06,			/* ISA 2.06 (power7).  */
   CLONE_ISA_2_07,			/* ISA 2.07 (power8).  */
   CLONE_ISA_3_00,			/* ISA 3.00 (power9).  */
+  CLONE_ISA_3_1,			/* ISA 3.1 (future).  */
   CLONE_MAX
 };
 
@@ -274,6 +275,7 @@ static const struct clone_map rs6000_clone_map[CLO
   { OPTION_MASK_POPCNTD,	"arch_2_06" },	/* ISA 2.06 (power7).  */
   { OPTION_MASK_P8_VECTOR,	"arch_2_07" },	/* ISA 2.07 (power8).  */
   { OPTION_MASK_P9_VECTOR,	"arch_3_00" },	/* ISA 3.00 (power9).  */
+  { OPTION_MASK_FUTURE,		"arch_3_1" },	/* ISA 3.1 (future).  */
 };
 
 
Index: gcc/testsuite/gcc.target/powerpc/clone3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/clone3.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/clone3.c	(working copy)
@@ -0,0 +1,33 @@
+/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-require-effective-target ppc_cpu_supports_hw } */
+
+/* Power9 (aka, ISA 3.0) has a MODSD instruction to do modulus, while Power8
+   (aka, ISA 2.07) has to do modulus with divide and multiply.  Make sure
+   both clone functions are generated.
+
+   FUTURE has pc-relative instructions to access static values, while earlier
+   systems used TOC addressing.
+
+   Restrict ourselves to Linux, since IFUNC might not be supported in other
+   operating systems.  */
+
+static long s;
+long *p = &s;
+
+__attribute__((target_clones("cpu=future,cpu=power9,default")))
+long mod_func (long a, long b)
+{
+  return (a % b) + s;
+}
+
+long mod_func_or (long a, long b, long c)
+{
+  return mod_func (a, b) | c;
+}
+
+/* { dg-final { scan-assembler-times {\mdivd\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mmulld\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmodsd\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mpld\M}   1 } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], Patch #7 of 10, Add support for PCREL_OPT
  2019-08-14 21:36 PowerPC 'future' patches introduction Michael Meissner
                   ` (5 preceding siblings ...)
  2019-08-14 23:10 ` [PATCH], Patch #6 of 10, Add 'future' support to function attributes Michael Meissner
@ 2019-08-14 23:13 ` Michael Meissner
  2019-08-14 23:16 ` [PATCH], Patch #8 of 10, Miscellaneous future tests Michael Meissner
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Michael Meissner @ 2019-08-14 23:13 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, David Edelsohn,
	Michael Meissner, Alan Modra

This patch adds a new RTL pass that runs before the final pass and marks code
for the PCREL_OPT optimization, which is carried out by the linker.

Without this optimization, access to external symbols loads up the address from
a .GOT section and does the normal operation.  For example:

	extern unsigned int esym;

	/* ... */

	esym = 1;

would generate:

        pld 9,esym@got@pcrel
        li 10,1
        stw 10,0(9)

I.e. load the address of 'esym' into r9, and do a normal 'stw'.

With the PCREL_OPT optimization, the compiler would generate:

        li 9,1
        pld 10,esym@got@pcrel
.Lpcrel1:
        .reloc .Lpcrel1-8,R_PPC64_PCREL_OPT,.-(.Lpcrel1-8)
        stw 9,0(10)

When the module is linked, if the object file is in the main program and the
'esym' variable is also in the main program, the linker will change the code
to:

	li 9,1
	pstw 9,esym@pcrel
	nop

If either the object file is in a shared library, or the variable 'esym' is in
a shared library, then the old code is used:

	li 9,1
	pld 10,esym.got@pcrel
	stw 9,0(10)

	.section .got
esym.got:
	.quad esym

When optimizing loads with PCREL_OPT, this patch makes sure that the register
being loaded is not live between the PLD instruction loading the address and
the normal load instruction.

Similarly, when optimizing stores with PCREL_OPT, this patch makes sure that
the value being stored is live at the time the address is loaded and still
live at the time the store is done.

If there is more than one reference to the external symbol in the basic block,
or the load of the address is in one basic block and the memory reference is in
another basic block, this pass does not optimize the reference to use
PCREL_OPT.  For example:

	extern unsigned int esym;

	void inc (void)
	{
	  esym++;
	}

Generates:

        pld 10,esym@got@pcrel
        lwz 9,0(10)
        addi 9,9,1
        stw 9,0(10)
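
The two structural restrictions above can be summarized in a small stand-alone
sketch.  The function name and its arguments are hypothetical and are not taken
from rs6000-pcrel.c.  The sketch also leaves out the register liveness checks
described earlier, so it models only part of the decision.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: ADDR_LOAD_BB is the basic block holding the pld of the
   GOT address, MEM_REF_BB is the basic block holding the load or store that
   uses it, and NUM_REFS is the number of memory references made through that
   address.  Return true only if there is exactly one reference and it is in
   the same basic block as the address load.  */
bool
model_pcrel_opt_candidate (int addr_load_bb, int mem_ref_bb, int num_refs)
{
  return num_refs == 1 && addr_load_bb == mem_ref_bb;
}
```

In this model the 'esym = 1' store has one reference in the same block and is a
candidate, while 'esym++' has two references (the lwz and the stw) and is not.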

As with the other patches, I have bootstrapped the changes on a little endian
power8 system and there were no regressions.  Once the previous patches are
checked in, can I check this patch into the trunk?

[gcc]
2019-08-14   Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/pcrel.md: New file.
	* config/rs6000/predicates.md (one_reg_memory_operand): New
	predicate.
	* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Add
	-mpcrel-opt.
	(OTHER_FUTURE_MASKS): Add -mpcrel-opt.
	(POWERPC_MASKS): Add -mpcrel-opt.
	* config/rs6000/rs6000-passes.def: Add pc-relative optimization
	pass.
	* config/rs6000/rs6000-pcrel.c: New file.
	* config/rs6000/rs6000-prefixed.c (pcrel_opt_label_num): New
	static variable.
	(rs6000_final_prescan_insn): Add support for pc-relative
	optimization pass.
	(rs6000_asm_output_opcode): Add support for pc-relative
	optimization pass.
	* config/rs6000/rs6000-protos.h (rs6000_final_prescan_insn):
	Change calling signature.
	(make_pass_pcrel_opt): New declaration.
	* config/rs6000/rs6000.c (rs6000_option_override_internal): Add
	support for -mpcrel-opt.
	(rs6000_opt_masks): Add -mpcrel-opt.
	* config/rs6000/rs6000.h (FINAL_PRESCAN_INSN): Update
	rs6000_final_prescan_insn call.
	* config/rs6000/rs6000.md: Include pcrel.md.
	(pcrel_opt attribute): New RTL attribute.
	* config/rs6000/rs6000.opt (-mpcrel-opt): New option.
	* config/rs6000/t-rs6000 (rs6000-pcrel.o): Add build rule.
	(MD_INCLUDES): Add pcrel.md.
	* config.gcc (powerpc*-*-*): Add rs6000-pcrel.o.
	(rs6000*-*-*): Add rs6000-pcrel.o.

[gcc/testsuite]
2019-08-07   Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/pcrel-opt-di.c: New test.

Index: gcc/config/rs6000/pcrel.md
===================================================================
--- gcc/config/rs6000/pcrel.md	(revision 0)
+++ gcc/config/rs6000/pcrel.md	(working copy)
@@ -0,0 +1,563 @@
+;; PC relative support.
+;; Copyright (C) 2019 Free Software Foundation, Inc.
+;; Contributed by Peter Bergner <bergner@linux.ibm.com> and
+;;		  Michael Meissner <meissner@linux.ibm.com>
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;;
+;; UNSPEC usage
+;;
+
+(define_c_enum "unspec"
+  [UNSPEC_PCREL_LD
+   UNSPEC_PCREL_ST
+  ])
+
+
+;; Optimize references to external variables to combine loading up the external
+;; address from the GOT and doing the load or store operation.
+;;
+;; A typical optimization looks like:
+;;
+;;		pld b,var@pcrel@got(0),1
+;;	100:
+;;		...
+;;		.reloc 100b-8,R_PPC64_PCREL_OPT,0
+;;		lwz r,0(b)
+;;
+;; If 'var' is an external variable defined in another module in the main
+;; program, and the code is being linked for the main program, then the
+;; linker can optimize this to:
+;;
+;;		plwz r,var(0),1
+;;	100:
+;;		...
+;;		nop
+;;
+;; If either the variable or the code being linked is defined in a shared
+;; library, then the linker puts the address in the GOT area, and the pld will
+;; load up the pointer, and then that pointer is used for the load or store.
+;; If there is more than one reference to the GOT pointer, the compiler will
+;; not do this optimization, and use the GOT pointer normally.
+;;
+;; Having the label after the pld instruction and using label-8 in the .reloc
+;; addresses the prefixed instruction properly.  If we put the label before the
+;; pld instruction, then the relocation might point to the NOP that is
+;; generated if the prefixed instruction is not aligned.
+;;
+;; We need to rewrite the normal GOT load operation before register allocation
+;; to include setting the eventual destination register for loads, or referring
+;; to the value being stored for store operations so that the proper register
+;; lifetime is set in case the optimization is done and the pld/lwz is
+;; converted to plwz/nop.
+
+(define_mode_iterator PO [QI HI SI DI SF DF
+			  V16QI V8HI V4SI V4SF V2DI V2DF V1TI KF
+			  (TF "FLOAT128_IEEE_P (TFmode)")])
+
+;; Vector types for pcrel optimization
+(define_mode_iterator POV [V16QI V8HI V4SI V4SF V2DI V2DF V1TI KF
+			   (TF "FLOAT128_IEEE_P (TFmode)")])
+
+;; Define the constraints for each mode for pcrel_opt.  The order of the
+;; constraints should have the most natural register class first.
+(define_mode_attr PO_constraint [(QI    "r,d,v")
+				 (HI    "r,d,v")
+				 (SI    "r,d,v")
+				 (DI    "r,d,v")
+				 (SF    "d,v,r")
+				 (DF    "d,v,r")
+				 (V16QI "wa,wn,wn")
+				 (V8HI  "wa,wn,wn")
+				 (V4SI  "wa,wn,wn")
+				 (V4SF  "wa,wn,wn")
+				 (V2DI  "wa,wn,wn")
+				 (V2DF  "wa,wn,wn")
+				 (V1TI  "wa,wn,wn")
+				 (KF    "wa,wn,wn")
+				 (TF    "wa,wn,wn")])
+
+;; Combiner pattern that combines the load of the GOT along with the load.  The
+;; first split pass before register allocation will split this into the load of
+;; the GOT that indicates the resultant value may be created if the PCREL_OPT
+;; relocation is done.
+;;
+;; The (set (match_dup 0)
+;;	    (unspec:<MODE> [(const_int 0)] UNSPEC_PCREL_LD))
+;;
+;; Is to signal to the register allocator that the destination register may be
+;; set by the GOT operation (if the linker does the optimization).
+;;
+;; We need to set the "cost" explicitly so that the instruction length is not
+;; used.  We return the same cost as a normal load (4 if we are not optimizing
+;; for speed, 8 if we are optimizing for speed)
+
+(define_insn_and_split "*mov<mode>_pcrel_opt_load"
+  [(set (match_operand:PO 0 "gpc_reg_operand")
+	(match_operand:PO 1 "pcrel_external_mem_operand"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(parallel [(set (match_dup 2)
+		   (match_dup 3))
+	      (set (match_dup 0)
+		   (unspec:<MODE> [(const_int 0)] UNSPEC_PCREL_LD))
+	      (use (const_int 0))])
+   (parallel [(set (match_dup 0)
+		   (match_dup 4))
+	      (use (match_dup 0))
+	      (use (const_int 0))])]
+{
+  rtx mem = operands[1];
+  rtx got = gen_reg_rtx (DImode);
+
+  operands[2] = got;
+  operands[3] = XEXP (mem, 0);
+  operands[4] = change_address (mem, <MODE>mode, got);
+}
+  [(set_attr "type" "load")
+   (set_attr "length" "16")
+   (set (attr "cost")
+	(if_then_else (match_test "optimize_function_for_speed_p (cfun)")
+		      (const_string "8")
+		      (const_string "4")))
+   (set_attr "prefixed" "yes")])
+
+;; Zero extend combiner patterns
+(define_insn_and_split "*mov<mode>_pcrel_opt_zero_extend"
+  [(set (match_operand:DI 0 "gpc_reg_operand")
+	(zero_extend:DI
+	 (match_operand:QHSI 1 "pcrel_external_mem_operand")))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(parallel [(set (match_dup 2)
+		   (match_dup 3))
+	      (set (match_dup 0)
+		   (unspec:DI [(const_int 0)] UNSPEC_PCREL_LD))
+	      (use (const_int 0))])
+   (parallel [(set (match_dup 0)
+		   (zero_extend:DI
+		    (match_dup 4)))
+	      (use (match_dup 0))
+	      (use (const_int 0))])]
+{
+  rtx mem = operands[1];
+  rtx got = gen_reg_rtx (DImode);
+
+  operands[2] = got;
+  operands[3] = XEXP (mem, 0);
+  operands[4] = change_address (mem, <MODE>mode, got);
+}
+  [(set_attr "type" "load")
+   (set_attr "length" "16")
+   (set (attr "cost")
+	(if_then_else (match_test "optimize_function_for_speed_p (cfun)")
+		      (const_string "8")
+		      (const_string "4")))
+   (set_attr "prefixed" "yes")])
+
+;; Sign extend combiner patterns
+(define_insn_and_split "*mov<mode>_pcrel_opt_sign_extend"
+  [(set (match_operand:DI 0 "gpc_reg_operand")
+	(sign_extend:DI
+	 (match_operand:HSI 1 "pcrel_external_mem_operand")))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(parallel [(set (match_dup 2)
+		   (match_dup 3))
+	      (set (match_dup 0)
+		   (unspec:DI [(const_int 0)] UNSPEC_PCREL_LD))
+	      (use (const_int 0))])
+   (parallel [(set (match_dup 0)
+		   (sign_extend:DI
+		    (match_dup 4)))
+	      (use (match_dup 0))
+	      (use (const_int 0))])]
+{
+  rtx mem = operands[1];
+  rtx got = gen_reg_rtx (DImode);
+
+  operands[2] = got;
+  operands[3] = XEXP (mem, 0);
+  operands[4] = change_address (mem, <MODE>mode, got);
+}
+  [(set_attr "type" "load")
+   (set_attr "length" "16")
+   (set (attr "cost")
+	(if_then_else (match_test "optimize_function_for_speed_p (cfun)")
+		      (const_string "8")
+		      (const_string "4")))
+   (set_attr "prefixed" "yes")])
+
+;; Float extend combiner pattern
+(define_insn_and_split "*movdf_pcrel_opt_float_extend"
+  [(set (match_operand:DF 0 "gpc_reg_operand")
+	(float_extend:DF
+	 (match_operand:SF 1 "pcrel_external_mem_operand")))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(parallel [(set (match_dup 2)
+		   (match_dup 3))
+	      (set (match_dup 0)
+		   (unspec:DF [(const_int 0)] UNSPEC_PCREL_LD))
+	      (use (const_int 0))])
+   (parallel [(set (match_dup 0)
+		   (float_extend:DF
+		    (match_dup 4)))
+	      (use (match_dup 0))
+	      (use (const_int 0))])]
+{
+  rtx mem = operands[1];
+  rtx got = gen_reg_rtx (DImode);
+
+  operands[2] = got;
+  operands[3] = XEXP (mem, 0);
+  operands[4] = change_address (mem, SFmode, got);
+}
+  [(set_attr "type" "load")
+   (set_attr "length" "16")
+   (set (attr "cost")
+	(if_then_else (match_test "optimize_function_for_speed_p (cfun)")
+		      (const_string "8")
+		      (const_string "4")))
+   (set_attr "prefixed" "yes")])
+
+;; Patterns to load up the GOT address that may be changed into the load of the
+;; actual variable.
+(define_insn "*mov<mode>_pcrel_opt_load_got"
+  [(set (match_operand:DI 0 "base_reg_operand" "=b,b,b")
+	(match_operand:DI 1 "pcrel_external_address"))
+   (set (match_operand:PO 2 "gpc_reg_operand" "=<PO_constraint>")
+	(unspec:PO [(const_int 0)] UNSPEC_PCREL_LD))
+   (use (match_operand:DI 3 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+{
+  return (INTVAL (operands[3])) ? "ld %0,%a1\n.Lpcrel%3:" : "ld %0,%a1";
+}
+  [(set_attr "type" "load")
+   (set_attr "length" "12")
+   (set_attr "pcrel_opt" "load_got")
+   (set (attr "cost")
+	(if_then_else (match_test "optimize_function_for_speed_p (cfun)")
+		      (const_string "8")
+		      (const_string "4")))
+   (set_attr "prefixed" "yes")])
+
+;; The secondary load insns that use the GOT pointer and may become a NOP.
+(define_insn "*mov<mode>_pcrel_opt_load_mem"
+  [(set (match_operand:QHI 0 "gpc_reg_operand" "+r,wa")
+	(match_operand:QHI 1 "one_reg_memory_operand" "Q,Q"))
+   (use (match_operand:QHI 2 "gpc_reg_operand" "0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   l<wd>z %0,%1
+   lxsi<wd>zx %x0,%y1"
+  [(set_attr "type" "load,fpload")
+   (set_attr "pcrel_opt" "load,no")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movsi_pcrel_opt_load_mem"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "+r,d,v")
+	(match_operand:SI 1 "one_reg_memory_operand" "Q,Q,Q"))
+   (use (match_operand:SI 2 "gpc_reg_operand" "0,0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   lwz %0,%1
+   lfiwzx %0,%y1
+   lxsiwzx %x0,%y1"
+  [(set_attr "type" "load,fpload,fpload")
+   (set_attr "pcrel_opt" "load,no,no")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movdi_pcrel_opt_load_mem"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "+r,d,v")
+	(match_operand:DI 1 "one_reg_memory_operand" "Q,Q,Q"))
+   (use (match_operand:DI 2 "gpc_reg_operand" "0,0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   ld %0,%1
+   lfd %0,%1
+   lxsd %0,%1"
+  [(set_attr "type" "load,fpload,fpload")
+   (set_attr "pcrel_opt" "load")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movsf_pcrel_opt_load_mem"
+  [(set (match_operand:SF 0 "gpc_reg_operand" "+d,v,r")
+	(match_operand:SF 1 "one_reg_memory_operand" "Q,Q,Q"))
+   (use (match_operand:SF 2 "gpc_reg_operand" "0,0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   lfs %0,%1
+   lxssp %0,%1
+   lwz %0,%1"
+  [(set_attr "type" "fpload,fpload,load")
+   (set_attr "pcrel_opt" "load")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movdf_pcrel_opt_load_mem"
+  [(set (match_operand:DF 0 "gpc_reg_operand" "+d,v,r")
+	(match_operand:DF 1 "one_reg_memory_operand" "Q,Q,Q"))
+   (use (match_operand:DF 2 "gpc_reg_operand" "0,0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   lfd %0,%1
+   lxsd %0,%1
+   ld %0,%1"
+  [(set_attr "type" "fpload,fpload,load")
+   (set_attr "pcrel_opt" "load")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*mov<mode>_pcrel_opt_load_mem"
+  [(set (match_operand:POV 0 "gpc_reg_operand" "+wa")
+	(match_operand:POV 1 "one_reg_memory_operand" "Q"))
+   (use (match_operand:POV 2 "gpc_reg_operand" "0"))
+   (use (match_operand:DI 3 "const_int_operand" "n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "lxv %x0,%1"
+  [(set_attr "type" "vecload")
+   (set_attr "pcrel_opt" "load")
+   (set_attr "prefixed" "no")])
+
+;; Zero extend insns
+(define_insn "*mov<mode>_pcrel_opt_load_zero_extend2"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "+r,wa")
+	(zero_extend:DI
+	 (match_operand:QHI 1 "one_reg_memory_operand" "Q,Q")))
+   (use (match_operand:DI 2 "gpc_reg_operand" "0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   l<wd>z %0,%1
+   lxsi<wd>zx %x0,%y1"
+  [(set_attr "type" "load,fpload")
+   (set_attr "pcrel_opt" "load,no")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movsi_pcrel_opt_load_zero_extend2"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "+r,d,v")
+	(zero_extend:DI
+	 (match_operand:SI 1 "one_reg_memory_operand" "Q,Q,Q")))
+   (use (match_operand:DI 2 "gpc_reg_operand" "0,0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   lwz %0,%1
+   lfiwzx %0,%y1
+   lxsiwzx %x0,%y1"
+  [(set_attr "type" "load,fpload,fpload")
+   (set_attr "pcrel_opt" "load,no,no")
+   (set_attr "prefixed" "no")])
+
+;; Sign extend insns
+(define_insn "*movsi_pcrel_opt_load_sign_extend2"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "+r,d,v")
+	(sign_extend:DI
+	 (match_operand:SI 1 "one_reg_memory_operand" "Q,Q,Q")))
+   (use (match_operand:DI 2 "gpc_reg_operand" "0,0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   lwa %0,%1
+   lfiwax %0,%y1
+   lxsiwax %x0,%y1"
+  [(set_attr "type" "load,fpload,fpload")
+   (set_attr "pcrel_opt" "load,no,no")
+   (set_attr "prefixed" "no")])
+
+(define_insn_and_split "*movhi_pcrel_opt_load_sign_extend2"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "+r,v")
+	(sign_extend:DI
+	 (match_operand:HI 1 "one_reg_memory_operand" "Q,Q")))
+   (use (match_operand:DI 2 "gpc_reg_operand" "0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   lha %0,%1
+   #"
+  "&& reload_completed && altivec_register_operand (operands[0], HImode)"
+  [(parallel [(set (match_dup 4)
+		   (match_dup 1))
+	      (use (match_dup 4))
+	      (use (const_int 0))])
+   (set (match_dup 0)
+	(sign_extend:DI
+	 (match_dup 4)))]
+{
+  operands[4] = gen_rtx_REG (HImode, REGNO (operands[0]));
+}
+  [(set_attr "type" "load,fpload")
+   (set_attr "pcrel_opt" "load,no")
+   (set_attr "length" "4,8")
+   (set_attr "prefixed" "no")])
+
+;; Floating point extend insn
+(define_insn "*movsf_pcrel_opt_load_float_extend2"
+  [(set (match_operand:DF 0 "gpc_reg_operand" "+d,v")
+	(float_extend:DF
+	 (match_operand:SF 1 "one_reg_memory_operand" "Q,Q")))
+   (use (match_operand:DF 2 "gpc_reg_operand" "0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   lfs %0,%1
+   lxssp %0,%1"
+  [(set_attr "type" "fpload")
+   (set_attr "pcrel_opt" "load")
+   (set_attr "prefixed" "no")])
+
+;; Store combiner insns that merge together loading up the address of the
+;; external variable and doing the store.  This is split in the first split
+;; pass before register allocation.
+;;
+;; We need to set the "cost" explicitly so that the instruction length is not
+;; used.  We return the same cost as a normal store (4).
+(define_insn_and_split "*mov<mode>_pcrel_opt_store"
+  [(set (match_operand:PO 0 "pcrel_external_mem_operand")
+ 	(match_operand:PO 1 "gpc_reg_operand"))]
+   "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64
+    && can_create_pseudo_p ()"
+   "#"
+   "&& 1"
+   [(set (match_dup 2)
+	 (unspec:DI [(match_dup 1)
+		     (match_dup 3)
+		     (const_int 0)] UNSPEC_PCREL_ST))
+    (parallel [(set (match_dup 4)
+		    (match_dup 1))
+	       (use (const_int 0))])]
+{
+  rtx mem = operands[0];
+  rtx addr = XEXP (mem, 0);
+  rtx got = gen_reg_rtx (DImode);
+
+  operands[2] = got;
+  operands[3] = addr;
+  operands[4] = change_address (mem, <MODE>mode, got);
+}
+  [(set_attr "type" "load")
+   (set_attr "length" "20")
+   (set_attr "pcrel_opt" "store_got")
+   (set_attr "cost" "4")
+   (set_attr "prefixed" "yes")])
+
+;; Load of the GOT address for a store operation that may be converted into a
+;; direct store.
+(define_insn "*mov<mode>_pcrel_opt_store_got"
+  [(set (match_operand:DI 0 "base_reg_operand" "=&b,&b,&b")
+	(unspec:DI [(match_operand:PO 1 "gpc_reg_operand" "<PO_constraint>")
+		    (match_operand:DI 2 "pcrel_external_address")
+		    (match_operand:DI 3 "const_int_operand" "n,n,n")]
+		   UNSPEC_PCREL_ST))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+{
+  return (INTVAL (operands[3])) ? "ld %0,%a2\n.Lpcrel%3:" : "ld %0,%a2";
+}
+  [(set_attr "type" "load")
+   (set_attr "length" "12")
+   (set_attr "pcrel_opt" "store_got")
+   (set_attr "cost" "4")
+   (set_attr "prefixed" "yes")])
+
+;; Secondary store instruction that uses the GOT pointer, and may be optimized
+;; into a NOP instruction.
+(define_insn "*mov<mode>_pcrel_opt_store_mem"
+  [(set (match_operand:QHI 0 "one_reg_memory_operand" "=Q,Q")
+	(match_operand:QHI 1 "gpc_reg_operand" "r,wa"))
+   (use (match_operand:DI 2 "const_int_operand" "n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+  st<wd> %1,%0
+  stxsi<wd>x %x1,%y0"
+  [(set_attr "type" "store,fpstore")
+   (set_attr "pcrel_opt" "store,no")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movsi_pcrel_opt_store_mem"
+  [(set (match_operand:SI 0 "one_reg_memory_operand" "=Q,Q,Q")
+	(match_operand:SI 1 "gpc_reg_operand" "r,d,v"))
+   (use (match_operand:DI 2 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+  stw %1,%0
+  stfiwx %1,%y0
+  stxsiwx %x1,%y0"
+  [(set_attr "type" "store,fpstore,fpstore")
+   (set_attr "pcrel_opt" "store,no,no")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movdi_pcrel_opt_store_mem"
+  [(set (match_operand:DI 0 "one_reg_memory_operand" "=Q,Q,Q")
+	(match_operand:DI 1 "gpc_reg_operand" "r,d,v"))
+   (use (match_operand:DI 2 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+  std %1,%0
+  stfd %1,%0
+  stxsd %1,%0"
+  [(set_attr "type" "store,fpstore,fpstore")
+   (set_attr "pcrel_opt" "store")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movsf_pcrel_opt_store_mem"
+  [(set (match_operand:SF 0 "one_reg_memory_operand" "=Q,Q,Q")
+	(match_operand:SF 1 "gpc_reg_operand" "d,v,r"))
+   (use (match_operand:DI 2 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+  stfs %1,%0
+  stxssp %1,%0
+  stw %1,%0"
+  [(set_attr "type" "fpstore,fpstore,store")
+   (set_attr "pcrel_opt" "store")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movdf_pcrel_opt_store_mem"
+  [(set (match_operand:DF 0 "one_reg_memory_operand" "=Q,Q,Q")
+	(match_operand:DF 1 "gpc_reg_operand" "d,v,r"))
+   (use (match_operand:DI 2 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+  stfd %1,%0
+  stxsd %1,%0
+  std %1,%0"
+  [(set_attr "type" "fpstore,fpstore,store")
+   (set_attr "pcrel_opt" "store")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*mov<mode>_pcrel_opt_store_mem"
+  [(set (match_operand:POV 0 "one_reg_memory_operand" "=Q")
+	(match_operand:POV 1 "gpc_reg_operand" "wa"))
+   (use (match_operand:DI 2 "const_int_operand" "n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "stxv %x1,%0"
+  [(set_attr "type" "vecstore")
+   (set_attr "pcrel_opt" "store")
+   (set_attr "prefixed" "no")])
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 274194)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -775,6 +775,13 @@ (define_predicate "indexed_or_indirect_o
   return indexed_or_indirect_address (op, mode);
 })
 
+;; Return 1 if the operand uses a single register for the address.
+(define_predicate "one_reg_memory_operand"
+  (match_code "mem")
+{
+  return REG_P (XEXP (op, 0));
+})
+
 ;; Like indexed_or_indirect_operand, but also allow a GPR register if direct
 ;; moves are supported.
 (define_predicate "reg_or_indexed_operand"
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(revision 274194)
+++ gcc/config/rs6000/rs6000-cpus.def	(working copy)
@@ -79,10 +79,12 @@
 #define ISA_FUTURE_MASKS_SERVER	(ISA_3_0_MASKS_SERVER			\
 				 | OPTION_MASK_FUTURE			\
 				 | OPTION_MASK_PCREL			\
+				 | OPTION_MASK_PCREL_OPT		\
 				 | OPTION_MASK_PREFIXED_ADDR)
 
 /* Flags that need to be turned off if -mno-future.  */
 #define OTHER_FUTURE_MASKS	(OPTION_MASK_PCREL			\
+				 | OPTION_MASK_PCREL_OPT		\
 				 | OPTION_MASK_PREFIXED_ADDR)
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
@@ -138,6 +140,7 @@
 				 | OPTION_MASK_P9_MISC			\
 				 | OPTION_MASK_P9_VECTOR		\
 				 | OPTION_MASK_PCREL			\
+				 | OPTION_MASK_PCREL_OPT		\
 				 | OPTION_MASK_POPCNTB			\
 				 | OPTION_MASK_POPCNTD			\
 				 | OPTION_MASK_POWERPC64		\
Index: gcc/config/rs6000/rs6000-passes.def
===================================================================
--- gcc/config/rs6000/rs6000-passes.def	(revision 274194)
+++ gcc/config/rs6000/rs6000-passes.def	(working copy)
@@ -25,3 +25,12 @@ along with GCC; see the file COPYING3.
  */
 
   INSERT_PASS_BEFORE (pass_cse, 1, pass_analyze_swaps);
+
+/* The pcrel_opt pass must be the last pass run before final.  This pass pairs
+   references to external pc-relative variables with their use.  There must be
+   only one reference to the external pointer loaded in order to do the
+   optimization.  Otherwise we load up the address (either via PADDI if the
+   label is local or via a PLD from the .got section if it is defined in
+   another module) and use it as a base pointer.  */
+
+  INSERT_PASS_BEFORE (pass_final, 1, pass_pcrel_opt);
Index: gcc/config/rs6000/rs6000-pcrel.c
===================================================================
--- gcc/config/rs6000/rs6000-pcrel.c	(revision 0)
+++ gcc/config/rs6000/rs6000-pcrel.c	(working copy)
@@ -0,0 +1,463 @@
+/* Subroutines used to support the pc-relative linker optimization.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file implements an RTL pass that looks for pc-relative loads of the
+   address of an external variable using the PCREL_GOT relocation and a single
+   load/store that uses that GOT pointer.  If that is found we create the
+   PCREL_OPT relocation to possibly convert:
+
+	pld b,var@pcrel@got(0),1
+
+	# possibly other instructions that do not use the base register 'b' or
+        # the result register 'r'.
+
+	lwz r,0(b)
+
+   into:
+
+	plwz r,var@pcrel(0),1
+
+	# possibly other instructions that do not use the base register 'b' or
+        # the result register 'r'.
+
+	nop
+
+   If the variable is not defined in the main program or the code using it is
+   not in the main program, the linker puts the address in the .got section
+   and does:
+
+	.section .got
+	.Lvar_got:	.dword var
+
+	.section .text
+	pld b,.Lvar_got@pcrel(0),1
+
+	# possibly other instructions that do not use the base register 'b' or
+        # the result register 'r'.
+
+	lwz r,0(b)
+
+   We only look for a single usage in the basic block where the GOT pointer is
+   loaded.  Multiple uses or references in another basic block will force us to
+   not use the PCREL_OPT relocation.
+
+   This file also contains the support function for prefixed memory to emit the
+   leading 'p' in front of prefixed instructions, and to create the necessary
+   relocations needed for PCREL_OPT.  */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "tree.h"
+#include "memmodel.h"
+#include "df.h"
+#include "tm_p.h"
+#include "ira.h"
+#include "print-tree.h"
+#include "varasm.h"
+#include "explow.h"
+#include "expr.h"
+#include "output.h"
+#include "tree-pass.h"
+#include "rtx-vector-builder.h"
+#include "print-rtl.h"
+#include "insn-attr.h"
+
+\f
+// Optimize pc-relative references
+const pass_data pass_data_pcrel =
+{
+  RTL_PASS,			// type
+  "pcrel",			// name
+  OPTGROUP_NONE,		// optinfo_flags
+  TV_NONE,			// tv_id
+  0,				// properties_required
+  0,				// properties_provided
+  0,				// properties_destroyed
+  0,				// todo_flags_start
+  TODO_df_finish,		// todo_flags_finish
+};
+
+// Pass data structures
+class pcrel : public rtl_opt_pass
+{
+private:
+  // Function to optimize pc relative loads/stores
+  unsigned int do_pcrel_opt (function *);
+
+  // A GOT pointer used for a load
+  void load_got (rtx_insn *);
+
+  // A load insn that uses the GOT pointer
+  void load_insn (rtx_insn *);
+
+  // A GOT pointer used for a store
+  void store_got (rtx_insn *);
+
+  // A store insn that uses the GOT pointer
+  void store_insn (rtx_insn *);
+
+  // Record the number of loads and stores optimized
+  unsigned long num_got_loads;
+  unsigned long num_got_stores;
+  unsigned long num_loads;
+  unsigned long num_stores;
+  unsigned long num_opt_loads;
+  unsigned long num_opt_stores;
+
+  // We record the GOT insn for each register that sets a GOT for a load or a
+  // store instruction.
+  rtx_insn *got_reg[32];
+
+public:
+  pcrel (gcc::context *ctxt)
+  : rtl_opt_pass (pass_data_pcrel, ctxt),
+    num_got_loads (0),
+    num_got_stores (0),
+    num_loads (0),
+    num_stores (0),
+    num_opt_loads (0),
+    num_opt_stores (0)
+  {}
+
+  ~pcrel (void)
+  {}
+
+  // opt_pass methods:
+  virtual bool gate (function *)
+  {
+    return TARGET_PCREL && TARGET_PCREL_OPT && optimize;
+  }
+
+  virtual unsigned int execute (function *fun)
+  {
+    return do_pcrel_opt (fun);
+  }
+
+  opt_pass *clone ()
+  {
+    return new pcrel (m_ctxt);
+  }
+};
+
+\f
+/* Return a marker to create the backward pointing label that links the load or
+   store to the insn that loads the address of an external label with
+   PCREL_GOT.  This allows us to create the necessary R_PPC64_PCREL_OPT
+   relocation to link the two instructions.  */
+
+static rtx
+pcrel_marker (void)
+{
+  static unsigned int label_number = 0;
+
+  label_number++;
+  return GEN_INT (label_number);
+}
+
+\f
+// Record the current PCREL_OPT load GOT insn, indexed by the register number
+// of the GOT pointer that is loaded.
+//
+// The PCREL_OPT LOAD_GOT insn looks like:
+//
+//	(parallel [(set (base) (addr))
+//		   (set (reg)  (unspec [(const_int 0)] UNSPEC_PCREL_LD))
+//		   (use (marker))])
+//
+// The base register is the GOT address, and the marker is a numeric label that
+// is created in this pass if the only use of the GOT load pointer is for a
+// single load.
+
+void
+pcrel::load_got (rtx_insn *insn)
+{
+  rtx pattern = PATTERN (insn);
+  rtx set = XVECEXP (pattern, 0, 0);
+  int got = REGNO (SET_DEST (set));
+
+  gcc_assert (IN_RANGE (got, FIRST_GPR_REGNO+1, LAST_GPR_REGNO));
+  got_reg[got] = insn;
+  num_got_loads++;
+}
+
+// See if the use of this load of a GOT pointer is the only usage.  If so,
+// allocate a marker to create a label.
+//
+// The PCREL_OPT LOAD insn looks like:
+//
+//	(parallel [(set (reg) (mem))
+//		   (use (reg))
+//		   (use (marker))])
+//
+// Between the reg and the memory might be a SIGN_EXTEND, ZERO_EXTEND, or
+// FLOAT_EXTEND:
+//
+//	(parallel [(set (reg) (sign_extend (mem)))
+//		   (use (reg))
+//		   (use (marker))])
+
+void
+pcrel::load_insn (rtx_insn *insn)
+{
+  num_loads++;
+
+  /* If the optimizer has changed the load instruction, just use the GOT
+     pointer as an address.  */
+  rtx pattern = PATTERN (insn);
+  if (GET_CODE (pattern) != PARALLEL || XVECLEN (pattern, 0) != 3)
+    return;
+
+  rtx set = XVECEXP (pattern, 0, 0);
+  if (GET_CODE (set) != SET
+      || GET_CODE (XVECEXP (pattern, 0, 1)) != USE
+      || GET_CODE (XVECEXP (pattern, 0, 2)) != USE)
+    return;
+
+  rtx dest = SET_DEST (set);
+  rtx src = SET_SRC (set);
+
+  if (!rtx_equal_p (dest, XEXP (XVECEXP (pattern, 0, 1), 0)))
+    return;
+
+  if (GET_CODE (src) == SIGN_EXTEND || GET_CODE (src) == ZERO_EXTEND
+      || GET_CODE (src) == FLOAT_EXTEND)
+    src = XEXP (src, 0);
+
+  if (!MEM_P (src))
+    return;
+
+  rtx addr = XEXP (src, 0);
+  if (!REG_P (addr))
+    return;
+
+  int r = REGNO (addr);
+  if (!IN_RANGE (r, FIRST_GPR_REGNO+1, LAST_GPR_REGNO))
+    return;
+
+  rtx_insn *got_insn = got_reg[r];
+
+  // See if this is the only reference, and there is a set of the GOT pointer
+  // previously in the same basic block.  If this is the only reference,
+  // optimize it.
+  if (got_insn
+      && get_attr_pcrel_opt (got_insn) == PCREL_OPT_LOAD_GOT
+      && !reg_used_between_p (addr, got_insn, insn)
+      && (find_reg_note (insn, REG_DEAD, addr) || rtx_equal_p (dest, addr)))
+    {
+      rtx marker = pcrel_marker ();
+      rtx got_use = XVECEXP (PATTERN (got_insn), 0, 2);
+      rtx insn_use = XVECEXP (pattern, 0, 2);
+
+      gcc_checking_assert (rtx_equal_p (XEXP (got_use, 0), const0_rtx));
+      gcc_checking_assert (rtx_equal_p (XEXP (insn_use, 0), const0_rtx));
+
+      XEXP (got_use, 0) = marker;
+      XEXP (insn_use, 0) = marker;
+      num_opt_loads++;
+    }
+
+  // Forget the GOT now that we've used it.
+  got_reg[r] = (rtx_insn *)0;
+}
+
+// Record the current PCREL_OPT store GOT insn, indexed by the register number
+// of the GOT pointer that is loaded.
+//
+// The PCREL_OPT STORE_GOT insn looks like:
+//
+//	(set (base)
+//	     (unspec:DI [(src)
+//			 (addr)
+//			 (marker)] UNSPEC_PCREL_ST))
+//
+// The base register is the GOT address, and the marker is a numeric label that
+// is created in this pass or 0 to indicate there are other uses of the GOT
+// pointer.
+
+void
+pcrel::store_got (rtx_insn *insn)
+{
+  rtx pattern = PATTERN (insn);
+  int got = REGNO (SET_DEST (pattern));
+
+  gcc_checking_assert (IN_RANGE (got, FIRST_GPR_REGNO+1, LAST_GPR_REGNO));
+  got_reg[got] = insn;
+  num_got_stores++;
+}
+
+// See if the use of this store using a GOT pointer is the only usage.  If so,
+// allocate a marker to create a label.
+//
+// The PCREL_OPT STORE insn looks like:
+//
+//	(parallel [(set (mem) (reg))
+//		   (use (marker))])
+
+void
+pcrel::store_insn (rtx_insn *insn)
+{
+  num_stores++;
+
+  /* If the optimizer has changed the store instruction, just use the GOT
+     pointer as an address.  */
+  rtx pattern = PATTERN (insn);
+  if (GET_CODE (pattern) != PARALLEL || XVECLEN (pattern, 0) != 2)
+    return;
+
+  rtx set = XVECEXP (pattern, 0, 0);
+  if (GET_CODE (set) != SET || GET_CODE (XVECEXP (pattern, 0, 1)) != USE)
+    return;
+
+  rtx dest = SET_DEST (set);
+
+  if (!MEM_P (dest))
+    return;
+
+  rtx addr = XEXP (dest, 0);
+  if (!REG_P (addr))
+    return;
+
+  int r = REGNO (addr);
+  if (!IN_RANGE (r, FIRST_GPR_REGNO+1, LAST_GPR_REGNO))
+    return;
+
+  rtx_insn *got_insn = got_reg[r];
+
+  // See if this is the only reference, and there is a GOT pointer previously.
+  // If this is the only reference, optimize it.
+  if (got_insn
+      && get_attr_pcrel_opt (got_insn) == PCREL_OPT_STORE_GOT
+      && !reg_used_between_p (addr, got_insn, insn)
+      && find_reg_note (insn, REG_DEAD, addr))
+    {
+      rtx marker = pcrel_marker ();
+      rtx got_src = SET_SRC (PATTERN (got_insn));
+      rtx insn_use = XVECEXP (pattern, 0, 1);
+
+      gcc_checking_assert (rtx_equal_p (XVECEXP (got_src, 0, 2), const0_rtx));
+      gcc_checking_assert (rtx_equal_p (XEXP (insn_use, 0), const0_rtx));
+
+      XVECEXP (got_src, 0, 2) = marker;
+      XEXP (insn_use, 0) = marker;
+      num_opt_stores++;
+    }
+
+  // Forget the GOT now
+  got_reg[r] = (rtx_insn *)0;
+}
+
+// Optimize pcrel external variable references
+
+unsigned int
+pcrel::do_pcrel_opt (function *fun)
+{
+  basic_block bb;
+  rtx_insn *insn, *curr_insn = 0;
+
+  // Dataflow analysis for use-def chains.
+  df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
+  df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
+  df_analyze ();
+  df_set_flags (DF_DEFER_INSN_RESCAN | DF_LR_RUN_DCE);
+
+  // Look at each basic block to see if there is a load of an external
+  // variable's GOT address, and a single load/store using that GOT address.
+  FOR_ALL_BB_FN (bb, fun)
+    {
+      bool clear_got_p = true;
+
+      FOR_BB_INSNS_SAFE (bb, insn, curr_insn)
+	{
+	  if (clear_got_p)
+	    {
+	      memset ((void *) &got_reg[0], 0, sizeof (got_reg));
+	      clear_got_p = false;
+	    }
+
+	  if (NONJUMP_INSN_P (insn))
+	    {
+	      rtx pattern = PATTERN (insn);
+	      if (GET_CODE (pattern) == SET || GET_CODE (pattern) == PARALLEL)
+		{
+		  switch (get_attr_pcrel_opt (insn))
+		    {
+		    case PCREL_OPT_NO:
+		      break;
+
+		    case PCREL_OPT_LOAD_GOT:
+		      load_got (insn);
+		      break;
+
+		    case PCREL_OPT_LOAD:
+		      load_insn (insn);
+		      break;
+
+		    case PCREL_OPT_STORE_GOT:
+		      store_got (insn);
+		      break;
+
+		    case PCREL_OPT_STORE:
+		      store_insn (insn);
+		      break;
+
+		    default:
+		      gcc_unreachable ();
+		    }
+		}
+	    }
+
+	  /* Don't let the GOT load be moved before a label, jump, or call and
+	     the dependent load/store after the label, jump, or call.  */
+	  else if (JUMP_P (insn) || CALL_P (insn) || LABEL_P (insn))
+	    clear_got_p = true;
+	}
+    }
+
+  // Rebuild ud chains.
+  df_remove_problem (df_chain);
+  df_process_deferred_rescans ();
+  df_set_flags (DF_RD_PRUNE_DEAD_DEFS | DF_LR_RUN_DCE);
+  df_chain_add_problem (DF_UD_CHAIN);
+  df_analyze ();
+
+  if (dump_file)
+    {
+      fprintf (dump_file, "\npc-relative optimizations:\n");
+      fprintf (dump_file, "\tgot loads        = %lu\n", num_got_loads);
+      fprintf (dump_file, "\tpotential loads  = %lu\n", num_loads);
+      fprintf (dump_file, "\toptimized loads  = %lu\n", num_opt_loads);
+      fprintf (dump_file, "\tgot stores       = %lu\n", num_got_stores);
+      fprintf (dump_file, "\tpotential stores = %lu\n", num_stores);
+      fprintf (dump_file, "\toptimized stores = %lu\n\n", num_opt_stores);
+    }
+
+  return 0;
+}
+
+\f
+rtl_opt_pass *
+make_pass_pcrel_opt (gcc::context *ctxt)
+{
+  return new pcrel (ctxt);
+}
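Reviewer note (not part of the patch): the pairing rule that pcrel::load_insn
applies can be sketched outside of GCC as follows.  This is a simplified
model, not the pass itself.  The Kind enum, the Insn struct, and
pair_got_loads are hypothetical names invented for illustration; the
base_dies flag stands in for the REG_DEAD note (or dest == base) check, and
an explicit OtherUse entry stands in for reg_used_between_p.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the insn classes the pass dispatches on.
enum class Kind { LoadGot, Load, OtherUse, Barrier };

struct Insn {
  Kind kind;
  int base_reg;     // GPR (0..31) holding the GOT address; unused for Barrier
  bool base_dies;   // models REG_DEAD on the base register for a Load
  unsigned marker;  // 0 = not paired; nonzero = label number shared by a pair
};

// Pair each GOT-address load with its single dependent load, using one
// table slot per register.  Returns the number of pairs formed.
unsigned pair_got_loads(std::vector<Insn> &insns) {
  std::array<int, 32> got_at;  // index of the pending LoadGot, or -1
  got_at.fill(-1);
  unsigned next_marker = 0, paired = 0;

  for (int i = 0; i < static_cast<int>(insns.size()); ++i) {
    Insn &insn = insns[i];
    switch (insn.kind) {
    case Kind::Barrier:        // label, jump, or call: forget all pending GOTs
      got_at.fill(-1);
      break;
    case Kind::LoadGot:        // remember which insn set this base register
      got_at[insn.base_reg] = i;
      break;
    case Kind::OtherUse:       // a second use disqualifies the pending GOT load
      got_at[insn.base_reg] = -1;
      break;
    case Kind::Load: {
      int g = got_at[insn.base_reg];
      if (g >= 0 && insn.base_dies) {
        insns[g].marker = insn.marker = ++next_marker;
        ++paired;
      }
      got_at[insn.base_reg] = -1;  // the GOT pointer is consumed either way
      break;
    }
    }
  }
  return paired;
}
```

In this model a GOT load followed directly by its one load is paired, while
a GOT load whose register is used elsewhere first, or whose load sits on the
far side of a label, jump, or call, keeps marker 0 and is left as a plain
base-register access.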
Index: gcc/config/rs6000/rs6000-prefixed.c
===================================================================
--- gcc/config/rs6000/rs6000-prefixed.c	(revision 274194)
+++ gcc/config/rs6000/rs6000-prefixed.c	(working copy)
@@ -46,13 +46,39 @@
    instruction is printed out.  */
 static bool next_insn_prefixed_p;
 
+/* Numeric label at the address of the GOT load instruction + 8.  The next
+   instruction's R_PPC64_PCREL_OPT relocation is linked to this label.  */
+static unsigned int pcrel_opt_label_num;
+
 /* Define FINAL_PRESCAN_INSN if some processing needs to be done before
    outputting the assembler code.  On the PowerPC, we remember if the current
-   insn is a prefixed insn where we need to emit a 'p' before the insn.  */
+   insn is a prefixed insn where we need to emit a 'p' before the insn.
+
+   In addition, if the insn is part of an optimized pc-relative reference to
+   an external label, that is recorded as well.  */
 void
-rs6000_final_prescan_insn (rtx_insn *insn)
+rs6000_final_prescan_insn (rtx_insn *insn, rtx operands[], int noperands)
 {
   next_insn_prefixed_p = (get_attr_prefixed (insn) != PREFIXED_NO);
+
+  enum attr_pcrel_opt pcrel_attr = get_attr_pcrel_opt (insn);
+
+  /* For the load and store instructions that are tied to a GOT pointer, we
+     know that operand 3 contains the marker for loads and operand 2 contains
+     the marker for stores.  If it is non-zero, it is the numeric label placed
+     8 bytes after the start of the insn that loads the address.  */
+  if (pcrel_attr == PCREL_OPT_LOAD)
+    {
+      gcc_assert (noperands > 3);
+      pcrel_opt_label_num = INTVAL (operands[3]);
+    }
+  else if (pcrel_attr == PCREL_OPT_STORE)
+    {
+      gcc_assert (noperands >= 2);
+      pcrel_opt_label_num = INTVAL (operands[2]);
+    }
+  else
+    pcrel_opt_label_num = 0;
   return;
 }
 
@@ -64,6 +90,13 @@ rs6000_final_prescan_insn (rtx_insn *ins
 void
 rs6000_asm_output_opcode (FILE *stream, const char *)
 {
+  if (pcrel_opt_label_num)
+    {
+      fprintf (stream, ".reloc .Lpcrel%u-8,R_PPC64_PCREL_OPT,.-(.Lpcrel%u-8)\n\t",
+	       pcrel_opt_label_num, pcrel_opt_label_num);
+      pcrel_opt_label_num = 0;
+    }
+
   if (next_insn_prefixed_p)
     {
       next_insn_prefixed_p = false;
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 274194)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -251,7 +251,7 @@ extern bool prefixed_load_p (rtx_insn *)
 extern bool prefixed_store_p (rtx_insn *);
 extern bool prefixed_paddi_p (rtx_insn *);
 extern void rs6000_asm_output_opcode (FILE *, const char *);
-void rs6000_final_prescan_insn (rtx_insn *);
+void rs6000_final_prescan_insn (rtx_insn *, rtx [], int);
 #endif
 
 extern unsigned char rs6000_class_max_nregs[][LIM_REG_CLASSES];
@@ -263,6 +263,7 @@ extern bool rs6000_linux_float_exception
 namespace gcc { class context; }
 class rtl_opt_pass;
 
+extern rtl_opt_pass *make_pass_pcrel_opt (gcc::context *);
 extern rtl_opt_pass *make_pass_analyze_swaps (gcc::context *);
 extern bool rs6000_sum_of_two_registers_p (const_rtx expr);
 extern bool rs6000_quadword_masked_address_p (const_rtx exp);
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 274194)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -4212,7 +4212,17 @@ rs6000_option_override_internal (bool gl
       if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
 	error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
 
-      rs6000_isa_flags &= ~OPTION_MASK_PCREL;
+      rs6000_isa_flags &= ~(OPTION_MASK_PCREL
+			    | OPTION_MASK_PCREL_OPT);
+    }
+
+  /* Check -mfuture debug switches.  */
+  if (!TARGET_PCREL && TARGET_PCREL_OPT)
+    {
+      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL_OPT) != 0)
+	error ("%qs requires %qs", "-mpcrel-opt", "-mpcrel");
+
+      rs6000_isa_flags &= ~OPTION_MASK_PCREL_OPT;
     }
 
   /* Print the options after updating the defaults.  */
@@ -4353,7 +4363,8 @@ rs6000_option_override_internal (bool gl
       if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
 	error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");
 
-      rs6000_isa_flags &= ~OPTION_MASK_PCREL;
+      rs6000_isa_flags &= ~(OPTION_MASK_PCREL
+			    | OPTION_MASK_PCREL_OPT);
     }
 
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
@@ -23169,6 +23180,7 @@ static struct rs6000_opt_mask const rs60
   { "mulhw",			OPTION_MASK_MULHW,		false, true  },
   { "multiple",			OPTION_MASK_MULTIPLE,		false, true  },
   { "pcrel",			OPTION_MASK_PCREL,		false, true  },
+  { "pcrel-opt",		OPTION_MASK_PCREL_OPT,		false, true  },
   { "popcntb",			OPTION_MASK_POPCNTB,		false, true  },
   { "popcntd",			OPTION_MASK_POPCNTD,		false, true  },
   { "power8-fusion",		OPTION_MASK_P8_FUSION,		false, true  },
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 274194)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -2580,7 +2580,7 @@ typedef struct GTY(()) machine_function
 do									\
   {									\
     if (TARGET_PREFIXED_ADDR)						\
-      rs6000_final_prescan_insn (INSN);					\
+      rs6000_final_prescan_insn (INSN, OPERANDS, NOPERANDS);		\
   }									\
 while (0)
 
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 274194)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -269,6 +269,31 @@ (define_enum "insn_form" [unknown	; Unkn
 			  x		; Indexed addressing
 			  prefixed])	; Prefixed instruction
 
+;; Whether this instruction is part of the two instruction sequence that
+;; supports PCREL_OPT optimizations, where the linker can change code of the
+;; form:
+;;
+;;		pld b,var@got@pcrel
+;;	100:
+;;		# possibly other instructions
+;;		.reloc 100b-8,R_PPC64_PCREL_OPT,0
+;;		lwz r,0(b)
+;;
+;; into the following if 'var' is in the main program:
+;;
+;;		plwz r,var@pcrel
+;;		# possibly other instructions
+;;		nop
+;;
+;; The states are:
+;;	no		-- insn is not involved with PCREL_OPT optimizations
+;;	load_got	-- insn loads up the GOT pointer for a load instruction
+;;	load		-- insn is an offsettable load that uses the GOT pointer
+;;	store_got	-- insn loads up the GOT pointer for a store instruction
+;;	store		-- insn is an offsettable store that uses the GOT pointer
+
+(define_attr "pcrel_opt" "no,load_got,load,store_got,store" (const_string "no"))
+
 ;; Whether an insn is a prefixed insn, and an initial 'p' should be printed
 ;; before the instruction.  A prefixed instruction has a prefix instruction
 ;; word that extends the immediate value of the instructions from 12-16 bits to
@@ -14543,6 +14568,7 @@ (define_insn "*cmp<mode>_hw"
 
 \f
 
+(include "pcrel.md")
 (include "sync.md")
 (include "vector.md")
 (include "vsx.md")
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt	(revision 274194)
+++ gcc/config/rs6000/rs6000.opt	(working copy)
@@ -577,3 +577,7 @@ Generate (do not generate) prefixed memo
 mpcrel
 Target Report Mask(PCREL) Var(rs6000_isa_flags)
 Generate (do not generate) pc-relative memory addressing.
+
+mpcrel-opt
+Target Undocumented Mask(PCREL_OPT) Var(rs6000_isa_flags)
+Generate (do not generate) pc-relative memory optimizations for externals.
Index: gcc/config/rs6000/t-rs6000
===================================================================
--- gcc/config/rs6000/t-rs6000	(revision 274194)
+++ gcc/config/rs6000/t-rs6000	(working copy)
@@ -47,6 +47,10 @@ rs6000-call.o: $(srcdir)/config/rs6000/r
 	$(COMPILE) $<
 	$(POSTCOMPILE)
 
+rs6000-pcrel.o: $(srcdir)/config/rs6000/rs6000-pcrel.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
+
 rs6000-prefixed.o: $(srcdir)/config/rs6000/rs6000-prefixed.c
 	$(COMPILE) $<
 	$(POSTCOMPILE)
@@ -83,6 +87,7 @@ MD_INCLUDES = $(srcdir)/config/rs6000/rs
 	$(srcdir)/config/rs6000/predicates.md \
 	$(srcdir)/config/rs6000/constraints.md \
 	$(srcdir)/config/rs6000/darwin.md \
+	$(srcdir)/config/rs6000/pcrel.md \
 	$(srcdir)/config/rs6000/sync.md \
 	$(srcdir)/config/rs6000/vector.md \
 	$(srcdir)/config/rs6000/vsx.md \
Index: gcc/config.gcc
===================================================================
--- gcc/config.gcc	(revision 274194)
+++ gcc/config.gcc	(working copy)
@@ -500,7 +500,7 @@ or1k*-*-*)
 powerpc*-*-*)
 	cpu_type=rs6000
 	extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o rs6000-call.o"
-	extra_objs="${extra_objs} rs6000-prefixed.o"
+	extra_objs="${extra_objs} rs6000-prefixed.o rs6000-pcrel.o"
 	extra_headers="ppc-asm.h altivec.h htmintrin.h htmxlintrin.h"
 	extra_headers="${extra_headers} bmi2intrin.h bmiintrin.h"
 	extra_headers="${extra_headers} xmmintrin.h mm_malloc.h emmintrin.h"
@@ -516,6 +516,7 @@ powerpc*-*-*)
 	extra_options="${extra_options} g.opt fused-madd.opt rs6000/rs6000-tables.opt"
 	target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-logue.c \$(srcdir)/config/rs6000/rs6000-call.c"
 	target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-prefixed.c"
+	target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-pcrel.c"
 	;;
 pru-*-*)
 	cpu_type=pru
@@ -528,9 +529,10 @@ riscv*)
 rs6000*-*-*)
 	extra_options="${extra_options} g.opt fused-madd.opt rs6000/rs6000-tables.opt"
 	extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o rs6000-call.o"
-	extra_objs="${extra_objs} rs6000-prefixed.o"
+	extra_objs="${extra_objs} rs6000-prefixed.o rs6000-pcrel.o"
 	target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-logue.c \$(srcdir)/config/rs6000/rs6000-call.c"
 	target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-prefixed.c"
+	target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-pcrel.c"
 	;;
 sparc*-*-*)
 	cpu_type=sparc
Index: gcc/testsuite/gcc.target/powerpc/pcrel-opt-di.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pcrel-opt-di.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pcrel-opt-di.c	(working copy)
@@ -0,0 +1,53 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_elfv2 } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+
+/* Determine if the pc-relative optimization using the R_PPC64_PCREL_OPT
+   relocation is supported.  */
+
+#ifndef TYPE
+#define TYPE long
+#endif
+
+extern TYPE ext;
+
+/* This should generate:
+		PLD 9,ext@got@pcrel
+	.Label:
+		.reloc .Label-8,R_PPC64_PCREL_OPT,0
+		LD 3,0(9)  */
+TYPE
+get_ext (void)
+{
+  return ext;
+}
+
+/* This should generate:
+		PLD 9,ext@got@pcrel
+	.Label:
+		.reloc .Label-8,R_PPC64_PCREL_OPT,0
+		STD 3,0(9)  */
+
+void
+set_ext (TYPE a)
+{
+  ext = a;
+}
+
+/* Because it has two references to 'ext', this should not generate a
+   R_PPC64_PCREL_OPT relocation.  Instead it should generate:
+		PLD 10,ext@got@pcrel
+		LD 9,0(10)
+		ADDI 9,9,1
+		STD 9,0(10)  */
+
+void
+inc_ext (void)
+{
+  ext++;
+}
+
+/* { dg-final { scan-assembler-times "ext@got@pcrel"     3 } } */
+/* { dg-final { scan-assembler-times "R_PPC64_PCREL_OPT" 2 } } */
+

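For readers following the rs6000-prefixed.c changes above, here is a minimal
standalone sketch of the hand-off between the prescan hook and the opcode
output hook.  It is not the GCC code: the sketch_* functions and the
pending_label variable are hypothetical stand-ins, and only the .reloc format
string is taken from the patch.

```c
#include <stdio.h>

/* Hypothetical stand-in for the static pcrel_opt_label_num in the patch:
   zero means that no PCREL_OPT relocation is pending.  */
static unsigned int pending_label;

/* Plays the role of rs6000_final_prescan_insn: remember the label number
   carried by the dependent load or store (zero for every other insn).  */
static void
sketch_prescan (unsigned int label_num)
{
  pending_label = label_num;
}

/* Plays the role of rs6000_asm_output_opcode: write the .reloc directive, if
   one is pending, followed by OPCODE into BUF, then clear the pending label.
   Returns the value reported by snprintf.  */
static int
sketch_output_opcode (char *buf, size_t size, const char *opcode)
{
  int n;

  if (pending_label)
    {
      n = snprintf (buf, size,
		    ".reloc .Lpcrel%u-8,R_PPC64_PCREL_OPT,.-(.Lpcrel%u-8)\n\t%s",
		    pending_label, pending_label, opcode);
      pending_label = 0;
    }
  else
    n = snprintf (buf, size, "%s", opcode);

  return n;
}
```

Calling sketch_prescan (7) and then sketch_output_opcode with "lwz" yields the
.reloc line for .Lpcrel7 followed by the opcode; a second call without another
prescan yields only the opcode, because the pending label is cleared after use.
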
-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH], Patch #8 of 10, Miscellaneous future tests
  2019-08-14 21:36 PowerPC 'future' patches introduction Michael Meissner
                   ` (6 preceding siblings ...)
  2019-08-14 23:13 ` [PATCH], Patch #7 of 10, Add support for PCREL_OPT Michael Meissner
@ 2019-08-14 23:16 ` Michael Meissner
  2019-08-14 23:17 ` [PATCH], Patch #9 of 10, Add tests with large memory offsets Michael Meissner
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Michael Meissner @ 2019-08-14 23:16 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, David Edelsohn,
	Michael Meissner, Alan Modra

This patch adds miscellaneous tests for the new prefixed addressing.

With patches 1-7 applied, these tests all succeed.  Can I check this patch
into the trunk?
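
As background for the prefix-odd-memory.c expectations, the following is a
rough sketch of the alignment rule the test exercises.  The enum and function
names are hypothetical and not part of GCC; the sketch only assumes that
DS-form offsets must be a multiple of 4, that DQ-form offsets must be a
multiple of 16, and that D-form accepts any signed 16-bit offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the three traditional offsettable formats.  */
enum sketch_form { SKETCH_D, SKETCH_DS, SKETCH_DQ };

/* Return true if OFFSET, which must already fit in a signed 16-bit field,
   cannot be encoded in FORM and therefore needs a prefixed instruction.  */
static bool
sketch_needs_prefix (enum sketch_form form, int64_t offset)
{
  assert (offset >= -32768 && offset <= 32767);

  switch (form)
    {
    case SKETCH_DS:
      return (offset & 3) != 0;		/* e.g. lwa, ld, std  */
    case SKETCH_DQ:
      return (offset & 15) != 0;	/* e.g. lxv, stxv  */
    default:
      return false;			/* e.g. lbz, lwz, stfd  */
    }
}
```

With an offset of 1, as used throughout the test, the D-form case returns
false (LWZ stays unprefixed) while the DS-form and DQ-form cases return true
(PLWA, PLD, PSTD, PLXV and PSTXV are expected).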

2019-08-14  Michael Meissner  <meissner@linux.ibm.com>

	* gcc/testsuite/gcc.target/powerpc/prefix-odd-memory.c: New test.
	* gcc/testsuite/gcc.target/powerpc/paddi-1.c: New test.
	* gcc/testsuite/gcc.target/powerpc/paddi-2.c: New test.
	* gcc/testsuite/gcc.target/powerpc/paddi-3.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-premodify.c: New test.

Index: gcc/testsuite/gcc.target/powerpc/paddi-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/paddi-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/paddi-1.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PADDI is generated to add a large constant.  */
+unsigned long
+add (unsigned long a)
+{
+  return a + 0x12345678UL;
+}
+
+/* { dg-final { scan-assembler {\mpaddi\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/paddi-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/paddi-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/paddi-2.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant.  */
+unsigned long
+large (void)
+{
+  return 0x12345678UL;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/paddi-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/paddi-3.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/paddi-3.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant for SImode.  */
+void
+large_si (unsigned int *p)
+{
+  *p = 0x12345U;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-odd-memory.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-odd-memory.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-odd-memory.c	(working copy)
@@ -0,0 +1,156 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests whether we can generate a prefixed load/store operation for addresses
+   that don't meet DS/DQ alignment constraints.  */
+
+unsigned long
+load_uc_odd (unsigned char *p)
+{
+  return p[1];				/* should generate LBZ.  */
+}
+
+long
+load_sc_odd (signed char *p)
+{
+  return p[1];				/* should generate LBZ + EXTSB.  */
+}
+
+unsigned long
+load_us_odd (unsigned char *p)
+{
+  return *(unsigned short *)(p + 1);	/* should generate LHZ.  */
+}
+
+long
+load_ss_odd (unsigned char *p)
+{
+  return *(short *)(p + 1);		/* should generate LHA.  */
+}
+
+unsigned long
+load_ui_odd (unsigned char *p)
+{
+  return *(unsigned int *)(p + 1);	/* should generate LWZ.  */
+}
+
+long
+load_si_odd (unsigned char *p)
+{
+  return *(int *)(p + 1);		/* should generate PLWA.  */
+}
+
+unsigned long
+load_ul_odd (unsigned char *p)
+{
+  return *(unsigned long *)(p + 1);	/* should generate PLD.  */
+}
+
+long
+load_sl_odd (unsigned char *p)
+{
+  return *(long *)(p + 1);	/* should generate PLD.  */
+}
+
+float
+load_float_odd (unsigned char *p)
+{
+  return *(float *)(p + 1);		/* should generate LFS.  */
+}
+
+double
+load_double_odd (unsigned char *p)
+{
+  return *(double *)(p + 1);		/* should generate LFD.  */
+}
+
+__ieee128
+load_ieee128_odd (unsigned char *p)
+{
+  return *(__ieee128 *)(p + 1);		/* should generate PLXV.  */
+}
+
+void
+store_uc_odd (unsigned char uc, unsigned char *p)
+{
+  p[1] = uc;				/* should generate STB.  */
+}
+
+void
+store_sc_odd (signed char sc, signed char *p)
+{
+  p[1] = sc;				/* should generate STB.  */
+}
+
+void
+store_us_odd (unsigned short us, unsigned char *p)
+{
+  *(unsigned short *)(p + 1) = us;	/* should generate STH.  */
+}
+
+void
+store_ss_odd (signed short ss, unsigned char *p)
+{
+  *(signed short *)(p + 1) = ss;	/* should generate STH.  */
+}
+
+void
+store_ui_odd (unsigned int ui, unsigned char *p)
+{
+  *(unsigned int *)(p + 1) = ui;	/* should generate STW.  */
+}
+
+void
+store_si_odd (signed int si, unsigned char *p)
+{
+  *(signed int *)(p + 1) = si;		/* should generate STW.  */
+}
+
+void
+store_ul_odd (unsigned long ul, unsigned char *p)
+{
+  *(unsigned long *)(p + 1) = ul;	/* should generate PSTD.  */
+}
+
+void
+store_sl_odd (signed long sl, unsigned char *p)
+{
+  *(signed long *)(p + 1) = sl;		/* should generate PSTD.  */
+}
+
+void
+store_float_odd (float f, unsigned char *p)
+{
+  *(float *)(p + 1) = f;		/* should generate STFS.  */
+}
+
+void
+store_double_odd (double d, unsigned char *p)
+{
+  *(double *)(p + 1) = d;		/* should generate STFD.  */
+}
+
+void
+store_ieee128_odd (__ieee128 ieee, unsigned char *p)
+{
+  *(__ieee128 *)(p + 1) = ieee;		/* should generate PSTXV.  */
+}
+
+/* { dg-final { scan-assembler-times {\mextsb\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mlbz\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mlfd\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlfs\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlha\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlhz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlwz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mpld\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mplwa\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mplxv\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstb\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mstfd\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mstfs\M}  1 } } */
+/* { dg-final { scan-assembler-times {\msth\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mstw\M}   2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-premodify.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-premodify.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-premodify.c	(working copy)
@@ -0,0 +1,47 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Make sure that we don't try to generate a prefixed form of the load and
+   store with update instructions.  */
+
+#ifndef SIZE
+#define SIZE 50000
+#endif
+
+struct foo {
+  unsigned int field;
+  char pad[SIZE];
+};
+
+struct foo *inc_load (struct foo *p, unsigned int *q)
+{
+  *q = (++p)->field;
+  return p;
+}
+
+struct foo *dec_load (struct foo *p, unsigned int *q)
+{
+  *q = (--p)->field;
+  return p;
+}
+
+struct foo *inc_store (struct foo *p, unsigned int *q)
+{
+  (++p)->field = *q;
+  return p;
+}
+
+struct foo *dec_store (struct foo *p, unsigned int *q)
+{
+  (--p)->field = *q;
+  return p;
+}
+
+/* { dg-final { scan-assembler-times {\mpli\M|\mpla\M|\mpaddi\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mplwz\M}                  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}                  2 } } */
+/* { dg-final { scan-assembler-not   {\mp?lwzu\M}                  } } */
+/* { dg-final { scan-assembler-not   {\mp?stwu\M}                  } } */
+/* { dg-final { scan-assembler-not   {\maddis\M}                   } } */
+/* { dg-final { scan-assembler-not   {\maddi\M}                    } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH], Patch #9 of 10, Add tests with large memory offsets
  2019-08-14 21:36 PowerPC 'future' patches introduction Michael Meissner
                   ` (7 preceding siblings ...)
  2019-08-14 23:16 ` [PATCH], Patch #8 of 10, Miscellaneous future tests Michael Meissner
@ 2019-08-14 23:17 ` Michael Meissner
  2019-08-15  3:48 ` [PATCH], Patch #10 of 10, Add pc-relative tests Michael Meissner
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Michael Meissner @ 2019-08-14 23:17 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, David Edelsohn,
	Michael Meissner, Alan Modra

This patch adds tests for all of the types using large address offsets that
would not fit into 16 bits, and verifies that prefixed instructions are generated.

The tests in this patch all succeed on a little-endian power8 system when
patches 1-7 are applied.  Can I check this patch into the trunk when the
previous patches have been applied?
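
To illustrate why the offsets in these tests need a prefixed instruction, here
is a small sketch of the range arithmetic.  The helper names are hypothetical;
the index value used in the usage note is the default CONSTANT from
prefix-large.h (0x123450), and the element size of 8 assumes a 64-bit long.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if VALUE fits in a signed field that is BITS wide.  */
static bool
sketch_fits_signed (int64_t value, int bits)
{
  int64_t limit = (int64_t) 1 << (bits - 1);
  return value >= -limit && value < limit;
}

/* Byte offset of p[INDEX] when each element is ELEM_SIZE bytes.  */
static int64_t
sketch_byte_offset (int64_t index, int64_t elem_size)
{
  return index * elem_size;
}
```

For example, sketch_byte_offset (0x123450, 8) is 0x91a280, which does not fit
in a signed 16-bit field but does fit in a signed 34-bit field, so a single
prefixed load or store can reach it.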

2019-08-14  Michael Meissner  <meissner@linux.ibm.com>

	* gcc/testsuite/gcc.target/powerpc/prefix-large.h: New set of
	tests to test prefixed addressing on 'future' system with large
	numeric offsets.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-df.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-di.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-si.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c: New test.

Index: gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE _Decimal64
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-df.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-df.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE double
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-di.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-di.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-di.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE long
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE short
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplh[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE __float128
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE signed char
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c	(working copy)
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE _Decimal32
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mpaddi\M|\mpli\M|\mpla\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mlfiwzx\M}              2 } } */
+/* { dg-final { scan-assembler-times {\mstfiwx\M}              2 } } */
+
+
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE float
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfs\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfs\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-si.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-si.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-si.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE int
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplw[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE unsigned long
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE unsigned short
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplhz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE unsigned char
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE unsigned int
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplwz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE vector double
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large.h
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large.h	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large.h	(working copy)
@@ -0,0 +1,59 @@
+/* Common tests for prefixed instructions testing whether we can generate a
+   34-bit offset using 1 instruction.  */
+
+typedef signed char	schar;
+typedef unsigned char	uchar;
+typedef unsigned short	ushort;
+typedef unsigned int	uint;
+typedef unsigned long	ulong;
+typedef long double	ldouble;
+typedef vector double	v2df;
+typedef vector long	v2di;
+typedef vector float	v4sf;
+typedef vector int	v4si;
+
+#ifndef TYPE
+#define TYPE ulong
+#endif
+
+#ifndef ITYPE
+#define ITYPE TYPE
+#endif
+
+#ifndef OTYPE
+#define OTYPE TYPE
+#endif
+
+#if !defined(DO_ADD) && !defined(DO_VALUE) && !defined(DO_SET)
+#define DO_ADD		1
+#define DO_VALUE	1
+#define DO_SET		1
+#endif
+
+#ifndef CONSTANT
+#define CONSTANT	0x123450UL
+#endif
+
+#if DO_ADD
+void
+add (TYPE *p, TYPE a)
+{
+  p[CONSTANT] += a;
+}
+#endif
+
+#if DO_VALUE
+OTYPE
+value (TYPE *p)
+{
+  return p[CONSTANT];
+}
+#endif
+
+#if DO_SET
+void
+set (TYPE *p, ITYPE a)
+{
+  p[CONSTANT] = a;
+}
+#endif

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH], Patch #10 of 10, Add pc-relative tests
  2019-08-14 21:36 PowerPC 'future' patches introduction Michael Meissner
                   ` (8 preceding siblings ...)
  2019-08-14 23:17 ` [PATCH], Patch #9 of 10, Add tests with large memory offsets Michael Meissner
@ 2019-08-15  3:48 ` Michael Meissner
  2019-08-15  4:05 ` PowerPC 'future' patches introduction Segher Boessenkool
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Michael Meissner @ 2019-08-15  3:48 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, David Edelsohn,
	Michael Meissner, Alan Modra

This patch adds tests to make sure the appropriate pc-relative instructions are
generated for -mcpu=future.

The tests in this patch all pass with patches 1-7 applied on a little endian
power8 system running Linux.  Once patches 1-7 have been applied, can I check
these patches into the trunk?
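
For readers who have not followed the earlier patches, the addressing rule
these tests exercise can be sketched in plain C.  This is only an
illustration and not code from the patch: the helper names
fits_signed_34bit and pcrel_displacement are made up here, and the range
check is assumed to match what SIGNED_34BIT_OFFSET_P tests in the compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if OFFSET fits in a signed 34-bit field,
   i.e. -2^33 <= OFFSET < 2^33.  */
static bool
fits_signed_34bit (int64_t offset)
{
  const int64_t limit = INT64_C (1) << 33;
  return offset >= -limit && offset < limit;
}

/* Hypothetical helper: compute the displacement from the address of the
   prefixed instruction to TARGET and store it in *DISP.  Returns false if
   one pc-relative instruction could not reach the target.  The addresses
   are assumed to be small enough that the subtraction cannot overflow.  */
static bool
pcrel_displacement (int64_t insn_addr, int64_t target, int64_t *disp)
{
  assert (disp != NULL);
  *disp = target - insn_addr;
  return fits_signed_34bit (*disp);
}
```

A displacement that passes this check is what lets the tests expect a single
prefixed load or store per access instead of a TOC-based sequence.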

2019-08-14  Michael Meissner  <meissner@linux.ibm.com>

	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h: New set of
	tests to test prefixed addressing on 'future' system with
	pc-relative addresses.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c: New test.

Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for DDmode.  */
+
+#define TYPE _Decimal64
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for DFmode.  */
+
+#define TYPE double
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for DImode.  */
+
+#define TYPE long
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for HImode.  */
+
+#define TYPE short
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplh[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for KFmode.  */
+
+#define TYPE __float128
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for QImode.  */
+
+#define TYPE signed char
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for SDmode.  */
+
+#define TYPE _Decimal32
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mpaddi|\mpla\M} 3 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for SFmode.  */
+
+#define TYPE float
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplfs\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfs\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for SImode.  */
+
+#define TYPE int
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplw[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for unsigned DImode.  */
+
+#define TYPE unsigned long
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for unsigned HImode.  */
+
+#define TYPE unsigned short
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplhz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for unsigned QImode.  */
+
+#define TYPE unsigned char
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for unsigned SImode.  */
+
+#define TYPE unsigned int
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplwz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for V2DFmode.  */
+
+#define TYPE vector double
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h	(working copy)
@@ -0,0 +1,58 @@
+/* Common tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for each type.  */
+
+typedef signed char	schar;
+typedef unsigned char	uchar;
+typedef unsigned short	ushort;
+typedef unsigned int	uint;
+typedef unsigned long	ulong;
+typedef long double	ldouble;
+typedef vector double	v2df;
+typedef vector long	v2di;
+typedef vector float	v4sf;
+typedef vector int	v4si;
+
+#ifndef TYPE
+#define TYPE ulong
+#endif
+
+#ifndef ITYPE
+#define ITYPE TYPE
+#endif
+
+#ifndef OTYPE
+#define OTYPE TYPE
+#endif
+
+static TYPE a;
+TYPE *p = &a;
+
+#if !defined(DO_ADD) && !defined(DO_VALUE) && !defined(DO_SET)
+#define DO_ADD		1
+#define DO_VALUE	1
+#define DO_SET		1
+#endif
+
+#if DO_ADD
+void
+add (TYPE b)
+{
+  a += b;
+}
+#endif
+
+#if DO_VALUE
+OTYPE
+value (void)
+{
+  return (OTYPE)a;
+}
+#endif
+
+#if DO_SET
+void
+set (ITYPE b)
+{
+  a = (TYPE)b;
+}
+#endif

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PowerPC 'future' patches introduction
  2019-08-14 21:36 PowerPC 'future' patches introduction Michael Meissner
                   ` (9 preceding siblings ...)
  2019-08-15  3:48 ` [PATCH], Patch #10 of 10, Add pc-relative tests Michael Meissner
@ 2019-08-15  4:05 ` Segher Boessenkool
  2019-08-15  8:10 ` PC-relative TLS support Alan Modra
  2019-08-15 21:35 ` [PATCH], Patch #1 replacement (fix issues with future TLS patches) Michael Meissner
  12 siblings, 0 replies; 26+ messages in thread
From: Segher Boessenkool @ 2019-08-15  4:05 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc, Alan Modra

Hi Mike,

On Wed, Aug 14, 2019 at 04:57:32PM -0400, Michael Meissner wrote:
> to the current location instead of a base register, giving pc-relative
> addressing.  Pc-relative addressing will be supported in the next ABI (3.1) as
> an alternative to the current TOC based addressing.

That's not an ABI version, that's an ISA version.  But it will be in a
future ELFv2 ABI version, yes.

> The fifth patch switches the default when you use -mcpu=future to use
> pc-relative instructions instead of using the TOC by default.

As David reminded me, you should only do this on OSes where this works.
Only for ABIs that support PCREL, even?  Or both.

> The seventh patch adds a new RTL pass to implement the PCREL_OPT relocations
> that will be part of the ISA 3.1 specification.

Some version of the ELFv2 ABI again?


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* PC-relative TLS support
  2019-08-14 21:36 PowerPC 'future' patches introduction Michael Meissner
                   ` (10 preceding siblings ...)
  2019-08-15  4:05 ` PowerPC 'future' patches introduction Segher Boessenkool
@ 2019-08-15  8:10 ` Alan Modra
  2019-08-15 19:47   ` Segher Boessenkool
  2019-08-15 21:35 ` [PATCH], Patch #1 replacement (fix issues with future TLS patches) Michael Meissner
  12 siblings, 1 reply; 26+ messages in thread
From: Alan Modra @ 2019-08-15  8:10 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

Supporting TLS for -mpcrel turns out to be relatively simple, in part
due to deciding that !TARGET_TLS_MARKERS with -mpcrel is silly.  No
assembler that I know of supporting prefix insns lacks TLS marker
support.  Also, at some point powerpc gcc ought to remove
!TARGET_TLS_MARKERS generally and simplify all the occurrences of
IS_NOMARK_TLSGETADDR in rs6000.md rather than complicating them.

Mike, the rs6000_option_override_internal hunk is new compared to
the patch you had from me.
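
The intent of that hunk, reduced to a standalone sketch: if -mpcrel was
requested explicitly without TLS marker support it is an error, and if it
was only enabled implicitly it is quietly turned off.  The struct, the mask
names and override_pcrel below are hypothetical stand-ins for
rs6000_isa_flags, rs6000_isa_flags_explicit and the OPTION_MASK_* bits, not
the real definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits, for illustration only.  */
enum
{
  MASK_PCREL = 1 << 0,
  MASK_PCREL_OPT = 1 << 1
};

struct options
{
  unsigned flags;		/* Options currently in effect.  */
  unsigned explicit_flags;	/* Options the user set by hand.  */
  bool tls_markers;		/* Stand-in for TARGET_TLS_MARKERS.  */
};

/* Disable pc-relative addressing when TLS markers are unavailable.
   Returns true if the user asked for the combination explicitly, which
   the real code reports with error ().  */
static bool
override_pcrel (struct options *opts)
{
  bool is_error = false;

  assert (opts != NULL);
  if ((opts->flags & MASK_PCREL) != 0 && !opts->tls_markers)
    {
      if ((opts->explicit_flags & MASK_PCREL) != 0)
	is_error = true;

      /* Either way, neither pc-relative option survives.  */
      opts->flags &= ~(unsigned) (MASK_PCREL | MASK_PCREL_OPT);
    }

  return is_error;
}
```

Clearing both bits even in the error case keeps later checks from seeing a
half-enabled configuration.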

        * config/rs6000/predicates.md (unspec_tls): Allow const0_rtx for got
	element of unspec vec.
        * config/rs6000/rs6000.c (rs6000_option_override_internal): Disable
	-mpcrel if -mno-tls-markers.
	(rs6000_legitimize_tls_address): Support PC-relative TLS.
        * config/rs6000/rs6000.md (UNSPEC_TLSTLS_PCREL): New unspec.
	(tls_gd_pcrel, tls_ld_pcrel): New insns.
        (tls_dtprel, tls_tprel): Set attr prefixed when tls_size is not 16.
        (tls_got_tprel_pcrel, tls_tls_pcrel): New insns.

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index fba87946ec7..4ea588e1027 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -995,9 +995,9 @@
   if (CONST_INT_P (op))
     return 1;
   if (XINT (op, 1) == UNSPEC_TLSGD)
-    return REG_P (XVECEXP (op, 0, 1));
+    return REG_P (XVECEXP (op, 0, 1)) || XVECEXP (op, 0, 1) == const0_rtx;
   if (XINT (op, 1) == UNSPEC_TLSLD)
-    return REG_P (XVECEXP (op, 0, 0));
+    return REG_P (XVECEXP (op, 0, 0)) || XVECEXP (op, 0, 0) == const0_rtx;
   return 0;
 })
 
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 6aca0ce5bf3..c04206ab139 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4216,6 +4216,16 @@ rs6000_option_override_internal (bool global_init_p)
 			    | OPTION_MASK_PCREL_OPT);
     }
 
+  /* -mpcrel requires tls marker support.  */
+  if (TARGET_PCREL && !TARGET_TLS_MARKERS)
+    {
+      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+	error ("%qs requires %qs", "-mpcrel", "-mtls-markers");
+
+      rs6000_isa_flags &= ~(OPTION_MASK_PCREL
+			    | OPTION_MASK_PCREL_OPT);
+    }
+
   /* Check -mfuture debug switches.  */
   if (!TARGET_PCREL && TARGET_PCREL_OPT)
     {
@@ -8613,7 +8623,8 @@ rs6000_legitimize_tls_address (rtx addr, enum tls_model model)
     return rs6000_legitimize_tls_address_aix (addr, model);
 
   dest = gen_reg_rtx (Pmode);
-  if (model == TLS_MODEL_LOCAL_EXEC && rs6000_tls_size == 16)
+  if (model == TLS_MODEL_LOCAL_EXEC
+      && (rs6000_tls_size == 16 || rs6000_pcrel_p (cfun)))
     {
       rtx tlsreg;
 
@@ -8660,7 +8671,9 @@ rs6000_legitimize_tls_address (rtx addr, enum tls_model model)
 	 them in the .got section.  So use a pointer to the .got section,
 	 not one to secondary TOC sections used by 64-bit -mminimal-toc,
 	 or to secondary GOT sections used by 32-bit -fPIC.  */
-      if (TARGET_64BIT)
+      if (rs6000_pcrel_p (cfun))
+	got = const0_rtx;
+      else if (TARGET_64BIT)
 	got = gen_rtx_REG (Pmode, 2);
       else
 	{
@@ -8735,7 +8748,7 @@ rs6000_legitimize_tls_address (rtx addr, enum tls_model model)
 	  rtx uns = gen_rtx_UNSPEC (Pmode, vec, UNSPEC_TLS_GET_ADDR);
 	  set_unique_reg_note (get_last_insn (), REG_EQUAL, uns);
 
-	  if (rs6000_tls_size == 16)
+	  if (rs6000_tls_size == 16 || rs6000_pcrel_p (cfun))
 	    {
 	      if (TARGET_64BIT)
 		insn = gen_tls_dtprel_64 (dest, tmp1, addr);
@@ -8776,7 +8789,14 @@ rs6000_legitimize_tls_address (rtx addr, enum tls_model model)
 	  else
 	    insn = gen_tls_got_tprel_32 (tmp2, got, addr);
 	  emit_insn (insn);
-	  if (TARGET_64BIT)
+	  if (rs6000_pcrel_p (cfun))
+	    {
+	      if (TARGET_64BIT)
+		insn = gen_tls_tls_pcrel_64 (dest, tmp2, addr);
+	      else
+		insn = gen_tls_tls_pcrel_32 (dest, tmp2, addr);
+	    }
+	  else if (TARGET_64BIT)
 	    insn = gen_tls_tls_64 (dest, tmp2, addr);
 	  else
 	    insn = gen_tls_tls_32 (dest, tmp2, addr);
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 0e7d90e5357..6e32d8fdff1 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -88,6 +88,7 @@
    UNSPEC_TLSTPRELLO
    UNSPEC_TLSGOTTPREL
    UNSPEC_TLSTLS
+   UNSPEC_TLSTLS_PCREL
    UNSPEC_FIX_TRUNC_TF		; fadd, rounding towards zero
    UNSPEC_STFIWX
    UNSPEC_POPCNTB
@@ -9514,6 +9515,15 @@
 \f
 ;; TLS support.
 
+(define_insn "*tls_gd_pcrel<bits>"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=b")
+	(unspec:P [(match_operand:P 1 "rs6000_tls_symbol_ref" "")
+		   (const_int 0)]
+		  UNSPEC_TLSGD))]
+  "HAVE_AS_TLS && TARGET_TLS_MARKERS"
+  "la %0,%1@got@tlsgd@pcrel"
+  [(set_attr "prefixed" "yes")])
+
 (define_insn_and_split "*tls_gd<bits>"
   [(set (match_operand:P 0 "gpc_reg_operand" "=b")
 	(unspec:P [(match_operand:P 1 "rs6000_tls_symbol_ref" "")
@@ -9554,6 +9564,14 @@
   "HAVE_AS_TLS && TARGET_TLS_MARKERS && TARGET_CMODEL != CMODEL_SMALL"
   "addi %0,%1,%2@got@tlsgd@l")
 
+(define_insn "*tls_ld_pcrel<bits>"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=b")
+	(unspec:P [(const_int 0)]
+		  UNSPEC_TLSLD))]
+  "HAVE_AS_TLS && TARGET_TLS_MARKERS"
+  "la %0,%&@got@tlsld@pcrel"
+  [(set_attr "prefixed" "yes")])
+
 (define_insn_and_split "*tls_ld<bits>"
   [(set (match_operand:P 0 "gpc_reg_operand" "=b")
 	(unspec:P [(match_operand:P 1 "gpc_reg_operand" "b")]
@@ -9597,7 +9615,11 @@
 		   (match_operand:P 2 "rs6000_tls_symbol_ref" "")]
 		  UNSPEC_TLSDTPREL))]
   "HAVE_AS_TLS"
-  "addi %0,%1,%2@dtprel")
+  "addi %0,%1,%2@dtprel"
+  [(set (attr "prefixed")
+	(if_then_else (match_test "rs6000_tls_size == 16")
+		      (const_string "no")
+		      (const_string "yes")))])
 
 (define_insn "tls_dtprel_ha_<bits>"
   [(set (match_operand:P 0 "gpc_reg_operand" "=r")
@@ -9661,7 +9683,11 @@
 		   (match_operand:P 2 "rs6000_tls_symbol_ref" "")]
 		  UNSPEC_TLSTPREL))]
   "HAVE_AS_TLS"
-  "addi %0,%1,%2@tprel")
+  "addi %0,%1,%2@tprel"
+  [(set (attr "prefixed")
+	(if_then_else (match_test "rs6000_tls_size == 16")
+		      (const_string "no")
+		      (const_string "yes")))])
 
 (define_insn "tls_tprel_ha_<bits>"
   [(set (match_operand:P 0 "gpc_reg_operand" "=r")
@@ -9679,6 +9705,15 @@
   "HAVE_AS_TLS"
   "addi %0,%1,%2@tprel@l")
 
+(define_insn "*tls_got_tprel_pcrel_<bits>"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=b")
+	(unspec:P [(const_int 0)
+		   (match_operand:P 1 "rs6000_tls_symbol_ref" "")]
+		  UNSPEC_TLSGOTTPREL))]
+  "HAVE_AS_TLS"
+  "<ptrload> %0,%1@got@tprel@pcrel"
+  [(set_attr "prefixed" "yes")])
+
 ;; "b" output constraint here and on tls_tls input to support linker tls
 ;; optimization.  The linker may edit the instructions emitted by a
 ;; tls_got_tprel/tls_tls pair to addis,addi.
@@ -9722,6 +9757,14 @@
   "HAVE_AS_TLS && TARGET_CMODEL != CMODEL_SMALL"
   "<ptrload> %0,%2@got@tprel@l(%1)")
 
+(define_insn "tls_tls_pcrel_<bits>"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
+	(unspec:P [(match_operand:P 1 "gpc_reg_operand" "b")
+		   (match_operand:P 2 "rs6000_tls_symbol_ref" "")]
+		  UNSPEC_TLSTLS_PCREL))]
+  "TARGET_ELF && HAVE_AS_TLS"
+  "add %0,%1,%2@tls@pcrel")
+
 (define_insn "tls_tls_<bits>"
   [(set (match_operand:P 0 "gpc_reg_operand" "=r")
 	(unspec:P [(match_operand:P 1 "gpc_reg_operand" "b")

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PC-relative TLS support
  2019-08-15  8:10 ` PC-relative TLS support Alan Modra
@ 2019-08-15 19:47   ` Segher Boessenkool
  2019-08-16  4:09     ` Alan Modra
  0 siblings, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2019-08-15 19:47 UTC (permalink / raw)
  To: Alan Modra; +Cc: Michael Meissner, gcc-patches, dje.gcc

Hi!

On Thu, Aug 15, 2019 at 01:35:10PM +0930, Alan Modra wrote:
> Supporting TLS for -mpcrel turns out to be relatively simple, in part
> due to deciding that !TARGET_TLS_MARKERS with -mpcrel is silly.  No
> assembler that I know of supporting prefix insns lacks TLS marker
> support.

Will this stay that way?  (Or do we not care, not now anyway?)

> Also, at some point powerpc gcc ought to remove
> !TARGET_TLS_MARKERS generally and simplify all the occurrences of
> IS_NOMARK_TLSGETADDR in rs6000.md rather than complicating them.

The last time this came up (a year ago) the conclusion was that we first
would have to remove AIX support.

>         * config/rs6000/predicates.md (unspec_tls): Allow const0_rtx for got
> 	element of unspec vec.
>         * config/rs6000/rs6000.c (rs6000_option_override_internal): Disable
> 	-mpcrel if -mno-tls-markers.
> 	(rs6000_legitimize_tls_address): Support PC-relative TLS.
>         * config/rs6000/rs6000.md (UNSPEC_TLSTLS_PCREL): New unspec.
> 	(tls_gd_pcrel, tls_ld_pcrel): New insns.
>         (tls_dtprel, tls_tprel): Set attr prefixed when tls_size is not 16.
>         (tls_got_tprel_pcrel, tls_tls_pcrel): New insns.

(Changelog has whitespace damage, I guess that is just from how you
mailed this?  Please fix when applying it).

The patch is fine when its prerequisites are in.  Thanks,


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH], Patch #1 replacement (fix issues with future TLS patches)
  2019-08-14 21:36 PowerPC 'future' patches introduction Michael Meissner
                   ` (11 preceding siblings ...)
  2019-08-15  8:10 ` PC-relative TLS support Alan Modra
@ 2019-08-15 21:35 ` Michael Meissner
  2019-08-16  0:25   ` Segher Boessenkool
  2019-08-16  0:42   ` Bill Schmidt
  12 siblings, 2 replies; 26+ messages in thread
From: Michael Meissner @ 2019-08-15 21:35 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, David Edelsohn,
	Michael Meissner, Alan Modra

After I submitted the patches, Aaron Sawdey tested the branch that has
the patches on it, along with Alan's TLS patches.  Alan's patch causes
the functions that determine if the insn is prefixed or not to be run
earlier than before.  The compiler was dying because the virtual arg
pointer and frame pointer registers weren't eliminated at that point,
and because I was only checking if the regno was between 0 and 31 for
GPRs.

I rewrote the test in reg_to_insn_form to use INT_REGNO_P macro (which
includes tests for arg pointer and frame pointer virtual registers).  I
used the other two macros (FP_REGNO_P and ALTIVEC_REGNO_P) for
consistency.

In addition, I removed the gcc_unreachable call if the register class
is not a GPR, FPR, or VMX register, and used the GPR defaults.  This is
in case the function gets called in the middle of reload where the
final moves are not done.
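
As a rough sketch of the shape of the new test (this is not the code in the
patch): classify the register with range predicates that also accept the
virtual arg and frame pointers, and fall back to the GPR case instead of
aborting.  The register numbers, the reg_kind enumeration and classify_regno
are invented for illustration; the real function returns an insn_form and
uses INT_REGNO_P, FP_REGNO_P and ALTIVEC_REGNO_P.

```c
#include <assert.h>

/* Illustrative register numbers; the real values live in rs6000.h and may
   differ.  */
enum
{
  LAST_GPR = 31,
  FIRST_FPR = 32,
  LAST_FPR = 63,
  FIRST_VMX = 64,
  LAST_VMX = 95,
  ARG_POINTER = 110,
  FRAME_POINTER = 111
};

enum reg_kind { KIND_GPR, KIND_FPR, KIND_VMX };

/* Stand-in for INT_REGNO_P: GPRs plus the two virtual pointers.  */
static int
is_int_regno (int regno)
{
  return (regno <= LAST_GPR
	  || regno == ARG_POINTER
	  || regno == FRAME_POINTER);
}

/* Stand-in for FP_REGNO_P.  */
static int
is_fp_regno (int regno)
{
  return regno >= FIRST_FPR && regno <= LAST_FPR;
}

/* Stand-in for ALTIVEC_REGNO_P.  */
static int
is_vmx_regno (int regno)
{
  return regno >= FIRST_VMX && regno <= LAST_VMX;
}

static enum reg_kind
classify_regno (int regno)
{
  assert (regno >= 0);

  if (is_int_regno (regno))
    return KIND_GPR;

  if (is_fp_regno (regno))
    return KIND_FPR;

  if (is_vmx_regno (regno))
    return KIND_VMX;

  /* Anything else (for example a register seen in the middle of reload)
     gets the GPR defaults rather than an abort.  */
  return KIND_GPR;
}
```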

This patch replaces patch #1.  I have bootstrapped the compiler with
these changes and verified it fixed the problem Aaron was seeing.  Can
I check this into the FSF trunk?

2019-08-15   Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (pcrel_address): Rewrite to use
	pcrel_addr_p.
	(pcrel_external_address): Rewrite to use pcrel_addr_p.
	(prefixed_mem_operand): Rewrite to use prefixed_local_addr_p.
	(pcrel_external_mem_operand): Rewrite to use pcrel_addr_p.
	* config/rs6000/rs6000-protos.h (reg_to_insn_form): New
	declaration.
	(pcrel_info_type): New declaration.
	(PCREL_NULL): New macro.
	(pcrel_addr_p): New declaration.
	(rs6000_prefixed_address_mode_p): Delete.
	* config/rs6000/rs6000.c (struct rs6000_reg_addr): Add fields for
	instruction format and prefixed memory support.
	(rs6000_debug_insn_form): New debug function.
	(rs6000_debug_print_mode): Print instruction formats.
	(setup_insn_form): New function.
	(rs6000_init_hard_regno_mode_ok): Call setup_insn_form.
	(print_operand_address): Call pcrel_addr_p instead of
	pcrel_address.  Add support for external pc-relative labels.
	(mode_supports_prefixed_address_p): Delete.
	(rs6000_prefixed_address_mode_p): Delete, replace with
	prefixed_local_addr_p.
	(prefixed_local_addr_p): Replace rs6000_prefixed_address_mode_p.
	Add argument to specify the instruction format.
	(pcrel_addr_p): New function.
	(reg_to_insn_form): New function.
	* config/rs6000/rs6000.md (enum insn_form): New enumeration.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 274172)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -1626,32 +1626,11 @@ (define_predicate "small_toc_ref"
   return GET_CODE (op) == UNSPEC && XINT (op, 1) == UNSPEC_TOCREL;
 })
 
-;; Return true if the operand is a pc-relative address.
+;; Return true if the operand is a pc-relative address to a local symbol.
 (define_predicate "pcrel_address"
   (match_code "label_ref,symbol_ref,const")
 {
-  if (!rs6000_pcrel_p (cfun))
-    return false;
-
-  if (GET_CODE (op) == CONST)
-    op = XEXP (op, 0);
-
-  /* Validate offset.  */
-  if (GET_CODE (op) == PLUS)
-    {
-      rtx op0 = XEXP (op, 0);
-      rtx op1 = XEXP (op, 1);
-
-      if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
-	return false;
-
-      op = op0;
-    }
-
-  if (LABEL_REF_P (op))
-    return true;
-
-  return (SYMBOL_REF_P (op) && SYMBOL_REF_LOCAL_P (op));
+  return pcrel_addr_p (op, true, false, PCREL_NULL);
 })
 
 ;; Return true if the operand is an external symbol whose address can be loaded
@@ -1665,32 +1644,14 @@ (define_predicate "pcrel_address"
 (define_predicate "pcrel_external_address"
   (match_code "symbol_ref,const")
 {
-  if (!rs6000_pcrel_p (cfun))
-    return false;
-
-  if (GET_CODE (op) == CONST)
-    op = XEXP (op, 0);
-
-  /* Validate offset.  */
-  if (GET_CODE (op) == PLUS)
-    {
-      rtx op0 = XEXP (op, 0);
-      rtx op1 = XEXP (op, 1);
-
-      if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
-	return false;
-
-      op = op0;
-    }
-
-  return (SYMBOL_REF_P (op) && !SYMBOL_REF_LOCAL_P (op));
+  return pcrel_addr_p (op, false, true, PCREL_NULL);
 })
 
 ;; Return 1 if op is a prefixed memory operand.
 (define_predicate "prefixed_mem_operand"
   (match_code "mem")
 {
-  return rs6000_prefixed_address_mode_p (XEXP (op, 0), GET_MODE (op));
+  return prefixed_local_addr_p (XEXP (op, 0), mode, INSN_FORM_UNKNOWN);
 })
 
 ;; Return 1 if op is a memory operand to an external variable when we
@@ -1699,7 +1660,7 @@ (define_predicate "prefixed_mem_operand"
 (define_predicate "pcrel_external_mem_operand"
   (match_code "mem")
 {
-  return pcrel_external_address (XEXP (op, 0), Pmode);
+  return pcrel_addr_p (XEXP (op, 0), false, true, PCREL_NULL);
 })
 
 ;; Match the first insn (addis) in fusing the combination of addis and loads to
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 274172)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -47,7 +47,19 @@ extern bool legitimate_indirect_address_
 extern bool legitimate_indexed_address_p (rtx, int);
 extern bool avoiding_indexed_address_p (machine_mode);
 extern rtx rs6000_force_indexed_or_indirect_mem (rtx x);
+extern enum insn_form reg_to_insn_form (rtx, machine_mode);
+extern bool prefixed_local_addr_p (rtx, machine_mode, enum insn_form);
 
+/* Pc-relative address broken into component parts by pcrel_addr_p.  */
+typedef struct {
+  rtx base_addr;		/* SYMBOL_REF or LABEL_REF.  */
+  HOST_WIDE_INT offset;		/* Offset from the base address.  */
+  bool external_p;		/* Is the symbol external?  */
+} pcrel_info_type;
+
+#define PCREL_NULL ((pcrel_info_type *)0)
+
+extern bool pcrel_addr_p (rtx, bool, bool, pcrel_info_type *);
 extern rtx rs6000_got_register (rtx);
 extern rtx find_addr_reg (rtx);
 extern rtx gen_easy_altivec_constant (rtx);
@@ -154,7 +166,6 @@ extern align_flags rs6000_loop_align (rt
 extern void rs6000_split_logical (rtx [], enum rtx_code, bool, bool, bool);
 extern bool rs6000_pcrel_p (struct function *);
 extern bool rs6000_fndecl_pcrel_p (const_tree);
-extern bool rs6000_prefixed_address_mode_p (rtx, machine_mode);
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 274172)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -369,8 +369,11 @@ struct rs6000_reg_addr {
   enum insn_code reload_fpr_gpr;	/* INSN to move from FPR to GPR.  */
   enum insn_code reload_gpr_vsx;	/* INSN to move from GPR to VSX.  */
   enum insn_code reload_vsx_gpr;	/* INSN to move from VSX to GPR.  */
+  enum insn_form default_insn_form;	/* Default format for offsets.  */
+  enum insn_form insn_form[(int)N_RELOAD_REG]; /* Register insn format.  */
   addr_mask_type addr_mask[(int)N_RELOAD_REG]; /* Valid address masks.  */
   bool scalar_in_vmx_p;			/* Scalar value can go in VMX.  */
+  bool prefixed_memory_p;		/* We can use prefixed memory.  */
 };
 
 static struct rs6000_reg_addr reg_addr[NUM_MACHINE_MODES];
@@ -2053,6 +2056,28 @@ rs6000_debug_vector_unit (enum rs6000_ve
   return ret;
 }
 
+/* Return a character that can be printed out to describe an instruction
+   format.  */
+
+DEBUG_FUNCTION char
+rs6000_debug_insn_form (enum insn_form iform)
+{
+  char ret;
+
+  switch (iform)
+    {
+    case INSN_FORM_UNKNOWN:  ret = '-'; break;
+    case INSN_FORM_D:        ret = 'd'; break;
+    case INSN_FORM_DS:       ret = 's'; break;
+    case INSN_FORM_DQ:       ret = 'q'; break;
+    case INSN_FORM_X:        ret = 'x'; break;
+    case INSN_FORM_PREFIXED: ret = 'p'; break;
+    default:                 ret = '?'; break;
+    }
+
+  return ret;
+}
+
 /* Inner function printing just the address mask for a particular reload
    register class.  */
 DEBUG_FUNCTION char *
@@ -2115,6 +2140,12 @@ rs6000_debug_print_mode (ssize_t m)
     fprintf (stderr, " %s: %s", reload_reg_map[rc].name,
 	     rs6000_debug_addr_mask (reg_addr[m].addr_mask[rc], true));
 
+  fprintf (stderr, "  Format: %c:%c%c%c",
+          rs6000_debug_insn_form (reg_addr[m].default_insn_form),
+          rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_GPR]),
+          rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_FPR]),
+          rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_VMX]));
+
   if ((reg_addr[m].reload_store != CODE_FOR_nothing)
       || (reg_addr[m].reload_load != CODE_FOR_nothing))
     {
@@ -2668,6 +2699,153 @@ rs6000_setup_reg_addr_masks (void)
     }
 }
 
+/* Set up the instruction format for each mode and register type from the
+   addr_mask.  */
+
+static void
+setup_insn_form (void)
+{
+  for (ssize_t m = 0; m < NUM_MACHINE_MODES; ++m)
+    {
+      machine_mode scalar_mode = (machine_mode) m;
+
+      /* Convert complex and IBM double double/_Decimal128 into their scalar
+	 parts that the registers will be split into for doing load or
+	 store.  */
+      if (COMPLEX_MODE_P (scalar_mode))
+	scalar_mode = GET_MODE_INNER (scalar_mode);
+
+      if (FLOAT128_2REG_P (scalar_mode))
+	scalar_mode = DFmode;
+
+      for (ssize_t rc = FIRST_RELOAD_REG_CLASS; rc <= LAST_RELOAD_REG_CLASS; rc++)
+	{
+	  machine_mode single_reg_mode = scalar_mode;
+	  size_t msize = GET_MODE_SIZE (scalar_mode);
+	  addr_mask_type addr_mask = reg_addr[scalar_mode].addr_mask[rc];
+	  enum insn_form iform = INSN_FORM_UNKNOWN;
+
+	  /* Is the mode permitted in the GPR/FPR/Altivec registers?  */
+	  if ((addr_mask & RELOAD_REG_VALID) != 0)
+	    {
+	      /* The addr_mask does not have the offsettable or indexed bits
+		 set for modes that are split into multiple registers (like
+		 IFmode).  It doesn't need this set, since typically by the time
+		 it is used in secondary reload, the modes are split into
+		 component parts.
+
+		 The instruction format however can be used earlier in the
+		 compilation, so we need to setup what kind of instruction can
+		 be generated for the modes that are split.  */
+	      if ((addr_mask & (RELOAD_REG_MULTIPLE
+				| RELOAD_REG_OFFSET
+				| RELOAD_REG_INDEXED)) == RELOAD_REG_MULTIPLE)
+		{
+		  /* Multiple register types in GPRs depend on whether we can
+		     use DImode in a single register or SImode.  */
+		  if (rc == RELOAD_REG_GPR)
+		    {
+		      if (TARGET_POWERPC64)
+			{
+			  gcc_assert ((msize % 8) == 0);
+			  single_reg_mode = DImode;
+			}
+
+		      else
+			{
+			  gcc_assert ((msize % 4) == 0);
+			  single_reg_mode = SImode;
+			}
+		    }
+
+		  /* Multiple VSX vector sized data items will use a single
+		     vector type as an instruction format.  */
+		  else if (TARGET_VSX)
+		    {
+		      gcc_assert ((rc == RELOAD_REG_FPR)
+				  || (rc == RELOAD_REG_VMX));
+
+		      if ((msize % 16) == 0)
+			single_reg_mode = V2DImode;
+		    }
+
+		  /* Multiple Altivec vector sized data items will use a single
+		     vector type as an instruction format.  */
+		  else if (TARGET_ALTIVEC && rc == RELOAD_REG_VMX
+			   && (msize % 16) == 0
+			   && VECTOR_MODE_P (single_reg_mode))
+		    single_reg_mode = V4SImode;
+
+		  /* If we only have the traditional floating point unit, use
+		     DFmode as the base type.  */
+		  else if (!TARGET_VSX && TARGET_HARD_FLOAT
+			   && rc == RELOAD_REG_FPR && (msize % 8) == 0)
+		    single_reg_mode = DFmode;
+
+		  /* Get the information for the register mode used after
+		     splitting.  */
+		  addr_mask = reg_addr[single_reg_mode].addr_mask[rc];
+		  msize = GET_MODE_SIZE (single_reg_mode);
+		}
+
+	      /* Figure out the instruction format of each mode.
+
+		 For offsettable addresses that aren't specifically quad mode,
+		 see if the default form is D or DS.  GPR 64-bit offsettable
+		 addresses are DS format.  Likewise, all Altivec offsettable
+		 addresses are DS format.  */
+	      if ((addr_mask & RELOAD_REG_OFFSET) != 0)
+		{
+		  if ((addr_mask & RELOAD_REG_QUAD_OFFSET) != 0)
+		    iform = INSN_FORM_DQ;
+
+		  else if (rc == RELOAD_REG_VMX
+			   || (rc == RELOAD_REG_GPR && TARGET_POWERPC64
+			       && (msize >= 8)))
+		    iform = INSN_FORM_DS;
+
+		  else
+		    iform = INSN_FORM_D;
+		}
+
+	      else if ((addr_mask & RELOAD_REG_INDEXED) != 0)
+		iform = INSN_FORM_X;
+	    }
+
+	  reg_addr[m].insn_form[rc] = iform;
+	}
+
+      /* Figure out the default insn format that is used for offsettable memory
+	 instructions.  For scalar floating point use the FPR addressing, for
+	 vectors and IEEE 128-bit use a suitable vector register type, and
+	 otherwise use GPRs.  */
+      ssize_t def_rc;
+      if (TARGET_VSX
+	  && (VECTOR_MODE_P (scalar_mode) || FLOAT128_IEEE_P (scalar_mode)))
+	{
+	  if ((reg_addr[m].addr_mask[RELOAD_REG_FPR] & RELOAD_REG_VALID) != 0)
+	    def_rc = RELOAD_REG_FPR;
+	  else
+	    def_rc = RELOAD_REG_VMX;
+	}
+
+      else if (TARGET_ALTIVEC && !TARGET_VSX && VECTOR_MODE_P (scalar_mode))
+	def_rc = RELOAD_REG_VMX;
+
+      else if (TARGET_HARD_FLOAT && SCALAR_FLOAT_MODE_P (scalar_mode))
+	def_rc = RELOAD_REG_FPR;
+
+      else
+	def_rc = RELOAD_REG_GPR;
+
+      reg_addr[m].default_insn_form = reg_addr[m].insn_form[def_rc];
+
+      /* Don't enable prefixed memory support until all of the infrastructure
+	 changes are in.  */
+      reg_addr[m].prefixed_memory_p = false;
+    }
+}
+
 \f
 /* Initialize the various global tables that are based on register size.  */
 static void
@@ -3181,6 +3359,9 @@ rs6000_init_hard_regno_mode_ok (bool glo
      use.  */
   rs6000_setup_reg_addr_masks ();
 
+  /* Update the instruction formats.  */
+  setup_insn_form ();
+
   if (global_init_p || TARGET_DEBUG_TARGET)
     {
       if (TARGET_DEBUG_REG)
@@ -13070,29 +13251,21 @@ print_operand (FILE *file, rtx x, int co
 void
 print_operand_address (FILE *file, rtx x)
 {
+  pcrel_info_type pcrel_info;
+
   if (REG_P (x))
     fprintf (file, "0(%s)", reg_names[ REGNO (x) ]);
 
   /* Is it a pc-relative address?  */
-  else if (pcrel_address (x, Pmode))
+  else if (pcrel_addr_p (x, true, true, &pcrel_info))
     {
-      HOST_WIDE_INT offset;
+      output_addr_const (file, pcrel_info.base_addr);
 
-      if (GET_CODE (x) == CONST)
-	x = XEXP (x, 0);
+      if (pcrel_info.offset)
+	fprintf (file, "%+" PRId64, pcrel_info.offset);
 
-      if (GET_CODE (x) == PLUS)
-	{
-	  offset = INTVAL (XEXP (x, 1));
-	  x = XEXP (x, 0);
-	}
-      else
-	offset = 0;
-
-      output_addr_const (file, x);
-
-      if (offset)
-	fprintf (file, "%+" PRId64, offset);
+      if (pcrel_info.external_p)
+	fputs ("@got", file);
 
       fputs ("@pcrel", file);
     }
@@ -13579,70 +13752,206 @@ rs6000_pltseq_template (rtx *operands, i
   return str;
 }
 #endif
+\f
+/* Return true if the address ADDR is a prefixed address either with a large
+   offset, an offset that does not fit in the normal instruction form, or a
+   pc-relative address to a local symbol.
 
-/* Helper function to return whether a MODE can do prefixed loads/stores.
-   VOIDmode is used when we are loading the pc-relative address into a base
-   register, but we are not using it as part of a memory operation.  As modes
-   add support for prefixed memory, they will be added here.  */
-
-static bool
-mode_supports_prefixed_address_p (machine_mode mode)
-{
-  return mode == VOIDmode;
-}
+   MODE is the mode of the memory.
 
-/* Function to return true if ADDR is a valid prefixed memory address that uses
-   mode MODE.  */
+   IFORM is used to determine if the traditional address is either DS format or
+   DQ format and the bottom bits of the offset are non-zero.  */
 
 bool
-rs6000_prefixed_address_mode_p (rtx addr, machine_mode mode)
+prefixed_local_addr_p (rtx addr, machine_mode mode, enum insn_form iform)
 {
-  if (!TARGET_PREFIXED_ADDR || !mode_supports_prefixed_address_p (mode))
+  if (!reg_addr[mode].prefixed_memory_p)
     return false;
 
-  /* Check for PC-relative addresses.  */
-  if (pcrel_address (addr, Pmode))
-    return true;
+  if (GET_CODE (addr) == CONST)
+    addr = XEXP (addr, 0);
+
+  /* Single register, not prefixed.  */
+  if (REG_P (addr) || SUBREG_P (addr))
+    return false;
+
+  /* Register + offset.  */
+  else if (GET_CODE (addr) == PLUS
+	   && (REG_P (XEXP (addr, 0)) || SUBREG_P (XEXP (addr, 0)))
+	   && CONST_INT_P (XEXP (addr, 1)))
+    {
+      HOST_WIDE_INT offset = INTVAL (XEXP (addr, 1));
+
+      /* Prefixed instructions can only access 34-bits.  Fail if the value
+	 is larger than that.  */
+      if (!SIGNED_34BIT_OFFSET_P (offset))
+	return false;
+
+      /* For small offsets see whether it might be a DS or DQ instruction where
+	 the bottom bits are non-zero.  This would require using a prefixed
+	 address.  If the offset is larger than 16 bits, then the instruction
+	 must be prefixed.  */
+      if (SIGNED_16BIT_OFFSET_P (offset))
+	{
+	  /* Use default if we don't know the precise instruction format.  */
+	  if (iform == INSN_FORM_UNKNOWN)
+	    iform = reg_addr[mode].default_insn_form;
+
+	  if (iform == INSN_FORM_DS)
+	    return (offset & 3) != 0;
+
+	  else if (iform == INSN_FORM_DQ)
+	    return (offset & 15) != 0;
+
+	  else if (iform != INSN_FORM_PREFIXED)
+	    return false;
+	}
+
+      return true;
+    }
+
+  else if (!TARGET_PCREL)
+    return false;
 
-  /* Check for prefixed memory addresses that have a large numeric offset,
-     or an offset that can't be used for a DS/DQ-form memory operation.  */
   if (GET_CODE (addr) == PLUS)
     {
-      rtx op0 = XEXP (addr, 0);
       rtx op1 = XEXP (addr, 1);
 
-      if (!base_reg_operand (op0, Pmode) || !CONST_INT_P (op1))
+      if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
 	return false;
 
-      HOST_WIDE_INT value = INTVAL (op1);
-      if (!SIGNED_34BIT_OFFSET_P (value))
+      addr = XEXP (addr, 0);
+    }
+
+  /* Local pc-relative symbols/labels.  */
+  return (LABEL_REF_P (addr)
+	  || (SYMBOL_REF_P (addr) && SYMBOL_REF_LOCAL_P (addr)));
+}
+
+/* Return true if the address ADDR is a prefixed address that is a pc-relative
+   reference either to a local symbol or to an external symbol.  We break apart
+   the address and return the parts.  LOCAL_SYMBOL_P and EXTERNAL_SYMBOL_P say
+   whether local and external pc-relative symbols are allowed.  P_INFO points
+   to a structure that returns the broken out component parts if desired.  */
+
+bool
+pcrel_addr_p (rtx addr,
+	      bool local_symbol_p,
+	      bool external_symbol_p,
+	      pcrel_info_type *p_info)
+{
+  rtx base_addr = NULL_RTX;
+  HOST_WIDE_INT offset = 0;
+  bool was_external_p = false;
+
+  if (p_info)
+    {
+      p_info->base_addr = NULL_RTX;
+      p_info->offset = 0;
+      p_info->external_p = false;
+    }
+
+  if (!TARGET_PCREL)
+    return false;
+
+  if (GET_CODE (addr) == CONST)
+    addr = XEXP (addr, 0);
+
+  /* Pc-relative symbols/labels without offsets.  */
+  if (SYMBOL_REF_P (addr))
+    {
+      base_addr = addr;
+      was_external_p = !SYMBOL_REF_LOCAL_P (addr);
+    }
+
+  else if (LABEL_REF_P (addr))
+    base_addr = addr;
+
+  /* Pc-relative symbols with offsets.  */
+  else if (GET_CODE (addr) == PLUS
+	   && SYMBOL_REF_P (XEXP (addr, 0))
+	   && CONST_INT_P (XEXP (addr, 1)))
+    {
+      base_addr = XEXP (addr, 0);
+      offset = INTVAL (XEXP (addr, 1));
+      was_external_p = !SYMBOL_REF_LOCAL_P (base_addr);
+
+      if (!SIGNED_34BIT_OFFSET_P (offset))
 	return false;
+    }
 
-      /* Offset larger than 16-bits?  */
-      if (!SIGNED_16BIT_OFFSET_P (value))
-	return true;
+  else
+    return false;
+
+  if (was_external_p && !external_symbol_p)
+    return false;
+
+  if (!was_external_p && !local_symbol_p)
+    return false;
 
-      /* DQ instruction (bottom 4 bits must be 0) for vectors.  */
-      HOST_WIDE_INT mask;
-      if (GET_MODE_SIZE (mode) >= 16)
-	mask = 15;
-
-      /* DS instruction (bottom 2 bits must be 0).  For 32-bit integers, we
-	 need to use DS instructions if we are sign-extending the value with
-	 LWA.  For 32-bit floating point, we need DS instructions to load and
-	 store values to the traditional Altivec registers.  */
-      else if (GET_MODE_SIZE (mode) >= 4)
-	mask = 3;
+  if (p_info)
+    {
+      p_info->base_addr = base_addr;
+      p_info->offset = offset;
+      p_info->external_p = was_external_p;
+    }
+
+  return true;
+}
+
+/* Given a register and a mode, return the instruction format for that
+   register.  If the register is a pseudo register, use the default format.
+   Otherwise if it is hard register, look to see exactly what type of
+   addressing is used.  */
+
+enum insn_form
+reg_to_insn_form (rtx reg, machine_mode mode)
+{
+  enum insn_form iform;
 
-      /* QImode/HImode has no restrictions.  */
+  /* Handle UNSPECs, such as the special UNSPEC_SF_FROM_SI and
+     UNSPEC_SI_FROM_SF UNSPECs, which are used to hide SF/SI interactions.
+     Look at the first argument, and if it is a register, use that.  */
+  if (GET_CODE (reg) == UNSPEC || GET_CODE (reg) == UNSPEC_VOLATILE)
+    {
+      rtx op0 = XVECEXP (reg, 0, 0);
+      if (REG_P (op0) || SUBREG_P (op0))
+	reg = op0;
+    }
+
+  /* If it isn't a register, use the defaults.  */
+  if (!REG_P (reg) && !SUBREG_P (reg))
+    iform = reg_addr[mode].default_insn_form;
+
+  else
+    {
+      unsigned int r = reg_or_subregno (reg);
+
+      /* If we have a pseudo, use the default instruction format.  */
+      if (r >= FIRST_PSEUDO_REGISTER)
+	iform = reg_addr[mode].default_insn_form;
+
+      /* If we have a hard register, use the address format of that hard
+	 register.  */
       else
-	return true;
+	{
+	  if (INT_REGNO_P (r))
+	    iform = reg_addr[mode].insn_form[RELOAD_REG_GPR];
+
+	  else if (FP_REGNO_P (r))
+	    iform = reg_addr[mode].insn_form[RELOAD_REG_FPR];
 
-      /* Return true if we must use a prefixed instruction.  */
-      return (value & mask) != 0;
+	  else if (ALTIVEC_REGNO_P (r))
+	    iform = reg_addr[mode].insn_form[RELOAD_REG_VMX];
+
+	  /* For anything else (SPR, CA, etc.) assume the GPR registers will be
+	     used to load or store the value.  */
+	  else
+	    iform = reg_addr[mode].insn_form[RELOAD_REG_GPR];
+	}
     }
 
-  return false;
+  return iform;
 }
 \f
 #if defined (HAVE_GAS_HIDDEN) && !TARGET_MACHO
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 274172)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -252,6 +252,23 @@ (define_attr "var_shift" "no,yes"
 ;; Is copying of this instruction disallowed?
 (define_attr "cannot_copy" "no,yes" (const_string "no"))
 
+;; Enumeration of the PowerPC instruction formats.  We only list the
+;; instruction formats that are used by the code, and not every possible
+;; instruction format that the machine supports.
+
+;; The main use for this enumeration is to determine if a particular
+;; offsettable instruction has a valid offset field for a traditional
+;; instruction, or whether a prefixed instruction might be needed to hold the
+;; offset.  For DS/DQ format instructions, if we have an offset that has the
+;; bottom bits non-zero, we can use a prefixed instruction instead of pushing
+;; the offset to an index register.
+(define_enum "insn_form" [unknown	; Unknown format
+			  d		; Offset addressing uses 16 bits
+			  ds		; Offset addressing uses 14 bits
+			  dq		; Offset addressing uses 12 bits
+			  x		; Indexed addressing
+			  prefixed])	; Prefixed instruction
+
 ;; Length of the instruction (in bytes).
 (define_attr "length" "" (const_int 4))
 

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* Re: [PATCH], Patch #1 replacement (fix issues with future TLS patches)
  2019-08-15 21:35 ` [PATCH], Patch #1 replacement (fix issues with future TLS patches) Michael Meissner
@ 2019-08-16  0:25   ` Segher Boessenkool
  2019-08-16  0:42   ` Bill Schmidt
  1 sibling, 0 replies; 26+ messages in thread
From: Segher Boessenkool @ 2019-08-16  0:25 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn, Alan Modra

Hi Mike,

On Thu, Aug 15, 2019 at 05:19:16PM -0400, Michael Meissner wrote:
> -;; Return true if the operand is a pc-relative address.
> +;; Return true if the operand is a pc-relative address to a local symbol.

The pcrel_addr_p comment says it is *not* just for local symbols.  So
which is it?

Having both something called "pcrel_address" and something called
"pcrel_addr_p" is confusing.  And "pcrel_addr_p" isn't a predicate at
all, so it should not be called that.  Or you can remove that "info"
argument, which is probably a good idea.

>  (define_predicate "pcrel_address"
>    (match_code "label_ref,symbol_ref,const")
>  {
> +  return pcrel_addr_p (op, true, false, PCREL_NULL);
>  })

Please avoid boolean arguments altogether; it isn't clear at all what
they mean here.

Ah, they say only locals are allowed here.  So this RTL predicate
shouldn't be called "pcrel_address"; it should have "local" in the name
somewhere.
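
To illustrate the point about boolean arguments, here is a minimal standalone
sketch of one alternative: a small flags enumeration, so that each call site
names what it accepts.  None of these names exist in the tree; the enum, the
toy address structure and pcrel_sym_allowed are all made up for illustration,
and the real code would look at RTL rather than at a struct.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags saying which kinds of symbols a caller accepts.  */
enum pcrel_sym_kind
{
  PCREL_SYM_LOCAL = 1 << 0,
  PCREL_SYM_EXTERNAL = 1 << 1,
  PCREL_SYM_ANY = PCREL_SYM_LOCAL | PCREL_SYM_EXTERNAL
};

/* Toy stand-in for an address; the real code inspects a SYMBOL_REF.  */
struct toy_addr
{
  const char *symbol;
  bool external;
};

/* Return true if ADDR refers to a kind of symbol listed in ALLOWED.  */
static bool
pcrel_sym_allowed (const struct toy_addr *addr, unsigned allowed)
{
  /* Only the two known flag bits may be set.  */
  assert ((allowed & ~(unsigned) PCREL_SYM_ANY) == 0);

  unsigned kind = addr->external ? PCREL_SYM_EXTERNAL : PCREL_SYM_LOCAL;
  return (allowed & kind) != 0;
}
```

A call such as pcrel_sym_allowed (&addr, PCREL_SYM_LOCAL) then reads as
"locals only" without looking up the parameter order.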

>  ;; Return 1 if op is a prefixed memory operand.
>  (define_predicate "prefixed_mem_operand"
>    (match_code "mem")
>  {
> -  return rs6000_prefixed_address_mode_p (XEXP (op, 0), GET_MODE (op));
> +  return prefixed_local_addr_p (XEXP (op, 0), mode, INSN_FORM_UNKNOWN);
>  })

Similar issues with "local" here.

>  (define_predicate "pcrel_external_mem_operand"
>    (match_code "mem")
>  {
> -  return pcrel_external_address (XEXP (op, 0), Pmode);
> +  return pcrel_addr_p (XEXP (op, 0), false, true, PCREL_NULL);
>  })

Why this change?

> +/* Pc-relative address broken into component parts by pcrel_addr_p.  */
> +typedef struct {
> +  rtx base_addr;		/* SYMBOL_REF or LABEL_REF.  */
> +  HOST_WIDE_INT offset;		/* Offset from the base address.  */
> +  bool external_p;		/* Is the symbol external?  */
> +} pcrel_info_type;

Don't use typedefs please.

Don't call booleans xxx_p; xxx_p is a name used for a predicate, that
is, a pure (or "const" in GCC terms) function returning a boolean.

Don't name types "*_type".

> +#define PCREL_NULL ((pcrel_info_type *)0)

Please just use NULL where you use this.  (Or 0 as far as I care, but
that's not the GCC coding style :-) ).

> --- gcc/config/rs6000/rs6000.c	(revision 274172)
> +++ gcc/config/rs6000/rs6000.c	(working copy)
> @@ -369,8 +369,11 @@ struct rs6000_reg_addr {
>    enum insn_code reload_fpr_gpr;	/* INSN to move from FPR to GPR.  */
>    enum insn_code reload_gpr_vsx;	/* INSN to move from GPR to VSX.  */
>    enum insn_code reload_vsx_gpr;	/* INSN to move from VSX to GPR.  */
> +  enum insn_form default_insn_form;	/* Default format for offsets.  */
> +  enum insn_form insn_form[(int)N_RELOAD_REG]; /* Register insn format.  */
>    addr_mask_type addr_mask[(int)N_RELOAD_REG]; /* Valid address masks.  */

Why the casts here?  Not all places use this cast, so why is it needed
here and not in all cases?

> +/* Return a character that can be printed out to describe an instruction
> +   format.  */
> +
> +DEBUG_FUNCTION char
> +rs6000_debug_insn_form (enum insn_form iform)
> +{
> +  char ret;
> +
> +  switch (iform)
> +    {
> +    case INSN_FORM_UNKNOWN:  ret = '-'; break;
> +    case INSN_FORM_D:        ret = 'd'; break;
> +    case INSN_FORM_DS:       ret = 's'; break;
> +    case INSN_FORM_DQ:       ret = 'q'; break;
> +    case INSN_FORM_X:        ret = 'x'; break;
> +    case INSN_FORM_PREFIXED: ret = 'p'; break;
> +    default:                 ret = '?'; break;
> +    }
> +
> +  return ret;
> +}

This doesn't follow the coding style.

> +  fprintf (stderr, "  Format: %c:%c%c%c",
> +          rs6000_debug_insn_form (reg_addr[m].default_insn_form),
> +          rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_GPR]),
> +          rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_FPR]),
> +          rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_VMX]));

Is this useful?  For others I mean, not just for you.

> +/* Set up the instruction format for each mode and register type from the
> +   addr_mask.  */
> +
> +static void
> +setup_insn_form (void)
> +{
> +  for (ssize_t m = 0; m < NUM_MACHINE_MODES; ++m)

Why ssize_t?  Most places just use int.

> +    {
> +      machine_mode scalar_mode = (machine_mode) m;
> +
> +      /* Convert complex and IBM double double/_Decimal128 into their scalar
> +	 parts that the registers will be split into for doing load or
> +	 store.  */
> +      if (COMPLEX_MODE_P (scalar_mode))
> +	scalar_mode = GET_MODE_INNER (scalar_mode);

Do you also need to handle some vector modes here?

> +      if (FLOAT128_2REG_P (scalar_mode))
> +	scalar_mode = DFmode;
> +
> +      for (ssize_t rc = FIRST_RELOAD_REG_CLASS; rc <= LAST_RELOAD_REG_CLASS; rc++)

(overwide line)

> +	{
> +	  machine_mode single_reg_mode = scalar_mode;
> +	  size_t msize = GET_MODE_SIZE (scalar_mode);
> +	  addr_mask_type addr_mask = reg_addr[scalar_mode].addr_mask[rc];
> +	  enum insn_form iform = INSN_FORM_UNKNOWN;
> +
> +	  /* Is the mode permitted in the GPR/FPR/Altivec registers?  */
> +	  if ((addr_mask & RELOAD_REG_VALID) != 0)
> +	    {
> +	      /* The addr_mask does not have the offsettable or indexed bits
> +		 set for modes that are split into multiple registers (like
> +		 IFmode).  It doesn't need this set, since typically by the time
> +		 it is used in secondary reload, the modes are split into
> +		 component parts.
> +
> +		 The instruction format however can be used earlier in the
> +		 compilation, so we need to setup what kind of instruction can
> +		 be generated for the modes that are split.  */

If it's only "typically" split, is it true that the bits do not have to
be set?

Snip the rest of this function...  It needs much better factoring and
better commenting; I cannot make heads or tails of it, sorry.

> +  /* Update the instruction formats.  */
> +  setup_insn_form ();

"Update"?  And s/format/form/ throughout.

>  void
>  print_operand_address (FILE *file, rtx x)
>  {
> +  pcrel_info_type pcrel_info;
> +
>    if (REG_P (x))
>      fprintf (file, "0(%s)", reg_names[ REGNO (x) ]);
>  
>    /* Is it a pc-relative address?  */
> -  else if (pcrel_address (x, Pmode))
> +  else if (pcrel_addr_p (x, true, true, &pcrel_info))

This is the only place that uses non-null for the info argument.  So
delete that from the normal function please, and handle that here, maybe
with some helper functions.

Don't do two (or more) different things in one function, in general.
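
A rough standalone sketch of that split, using a toy address structure in
place of RTL: the predicate only answers yes or no, and the printing helper
is the only code that looks at the component parts.  The names toy_addr,
toy_pcrel_p and toy_print_pcrel are invented for this example; the output
format (symbol, optional signed offset, "@got" for externals, then "@pcrel")
follows what the print_operand_address hunk in the patch emits.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a pc-relative address.  */
struct toy_addr
{
  const char *symbol;	/* NULL if this is not a symbolic address.  */
  int64_t offset;	/* Offset from the symbol.  */
  bool external;	/* Is the symbol external?  */
};

/* Predicate: is ADDR a symbol plus an offset that fits in 34 signed bits?  */
static bool
toy_pcrel_p (const struct toy_addr *addr)
{
  const int64_t limit = (int64_t) 1 << 33;
  return (addr->symbol != NULL
	  && addr->offset >= -limit && addr->offset < limit);
}

/* Printer: format ADDR into BUF.  The caller has already checked it.  */
static void
toy_print_pcrel (char *buf, size_t size, const struct toy_addr *addr)
{
  assert (toy_pcrel_p (addr));

  /* Sign, up to 19 digits and the terminating NUL fit in 24 bytes.  */
  char off[24] = "";
  if (addr->offset != 0)
    snprintf (off, sizeof off, "%+" PRId64, addr->offset);

  snprintf (buf, size, "%s%s%s@pcrel", addr->symbol, off,
	    addr->external ? "@got" : "");
}
```

With this shape the predicate needs no out-parameter, and the decomposition
lives next to the one place that uses it.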

> +/* Return true if the address ADDR is a prefixed address either with a large
> +   offset, an offset that does not fit in the normal instruction form, or a
> +   pc-relative address to a local symbol.

"Return true if ADDR is a local address that needs a prefix insn to
encode."?

You don't need to describe all exact reasons here...  That probably is
out-of-date (and/or not so exact) anyway.

>  
> +   MODE is the mode of the memory.
>  
> +   IFORM is used to determine if the traditional address is either DS format or
> +   DQ format and the bottom bits of the offset are non-zero.  */

s/traditional/non-prefixed/?  "Traditional" is a word like "old"; it is
stale almost before you write it.

>  bool
> +prefixed_local_addr_p (rtx addr, machine_mode mode, enum insn_form iform)

> +      if (!SIGNED_34BIT_OFFSET_P (offset))
> +	return false;

      /* Can this be described with a non-prefixed insn?  */
      if (!SIGNED_16BIT_OFFSET_P (offset))
	return true;

(and then the DS/DQ forms that need a prefix).
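
Spelled out as a standalone sketch, outside of GCC, that ordering could look
like the following.  The enum, fits_signed and offset_needs_prefix are
stand-ins invented for this example, not names from rs6000.c, and the sketch
assumes the caller has already replaced an unknown form with a concrete one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the insn_form enumeration in the patch.  */
enum form { FORM_D, FORM_DS, FORM_DQ, FORM_X, FORM_PREFIXED };

/* Does VALUE fit in a signed field of BITS bits?  */
static bool
fits_signed (int64_t value, int bits)
{
  assert (bits > 0 && bits < 64);
  int64_t limit = (int64_t) 1 << (bits - 1);
  return value >= -limit && value < limit;
}

/* Return true if a register + OFFSET address needs a prefixed insn when
   the non-prefixed insn would use FORM.  */
static bool
offset_needs_prefix (int64_t offset, enum form form)
{
  /* Not encodable even with a prefix.  */
  if (!fits_signed (offset, 34))
    return false;

  /* Can this be described with a non-prefixed insn?  */
  if (!fits_signed (offset, 16))
    return true;

  /* DS and DQ forms need a prefix if the low offset bits are non-zero.  */
  if (form == FORM_DS)
    return (offset & 3) != 0;

  if (form == FORM_DQ)
    return (offset & 15) != 0;

  /* Small offset in any other form: prefixed only if already marked so.  */
  return form == FORM_PREFIXED;
}
```

For example, an offset of 6 is fine for a D-form insn but needs a prefix for
a DS-form one, and 8 is fine for DS but needs a prefix for DQ.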

> +	  /* Use default if we don't know the precise instruction format.  */
> +	  if (iform == INSN_FORM_UNKNOWN)
> +	    iform = reg_addr[mode].default_insn_form;

So is this default as strict as possible?  As lenient as possible?  Or
what?

> +/* Given a register and a mode, return the instruction format for that
> +   register.  If the register is a pseudo register, use the default format.

What *is* the default form?  How is that decided?

Nowhere do you say if what something returns is as optimistic or as
pessimistic as possible.  Either option has something to say for it,
but it is important we know what it is :-)

> +enum insn_form
> +reg_to_insn_form (rtx reg, machine_mode mode)

What does this describe?  The insn form used for loading/storing that
register?

> +  /* If it isn't a register, use the defaults.  */
> +  if (!REG_P (reg) && !SUBREG_P (reg))
> +    iform = reg_addr[mode].default_insn_form;

If so...  How can that happen?

> +	  /* For anything else (SPR, CA, etc.) assume the GPR registers will be
> +	     used to load or store the value.  */

That's a big assumption.  gcc_unreachable instead?


Segher


* Re: [PATCH], Patch #1 replacement (fix issues with future TLS patches)
  2019-08-15 21:35 ` [PATCH], Patch #1 replacement (fix issues with future TLS patches) Michael Meissner
  2019-08-16  0:25   ` Segher Boessenkool
@ 2019-08-16  0:42   ` Bill Schmidt
  1 sibling, 0 replies; 26+ messages in thread
From: Bill Schmidt @ 2019-08-16  0:42 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Alan Modra

Hi Mike, just a couple points from me...

On 8/15/19 4:19 PM, Michael Meissner wrote:

<snip>
> Index: gcc/config/rs6000/rs6000.c
> ===================================================================
> --- gcc/config/rs6000/rs6000.c	(revision 274172)
> +++ gcc/config/rs6000/rs6000.c	(working copy)
> @@ -369,8 +369,11 @@ struct rs6000_reg_addr {
>    enum insn_code reload_fpr_gpr;	/* INSN to move from FPR to GPR.  */
>    enum insn_code reload_gpr_vsx;	/* INSN to move from GPR to VSX.  */
>    enum insn_code reload_vsx_gpr;	/* INSN to move from VSX to GPR.  */
> +  enum insn_form default_insn_form;	/* Default format for offsets.  */
> +  enum insn_form insn_form[(int)N_RELOAD_REG]; /* Register insn format.  */
>    addr_mask_type addr_mask[(int)N_RELOAD_REG]; /* Valid address masks.  */
>    bool scalar_in_vmx_p;			/* Scalar value can go in VMX.  */
> +  bool prefixed_memory_p;		/* We can use prefixed memory.  */
>  };
>
>  static struct rs6000_reg_addr reg_addr[NUM_MACHINE_MODES];
> @@ -2053,6 +2056,28 @@ rs6000_debug_vector_unit (enum rs6000_ve
>    return ret;
>  }
>
> +/* Return a character that can be printed out to describe an instruction
> +   format.  */
> +
> +DEBUG_FUNCTION char
> +rs6000_debug_insn_form (enum insn_form iform)
> +{
> +  char ret;
> +
> +  switch (iform)
> +    {
> +    case INSN_FORM_UNKNOWN:  ret = '-'; break;
> +    case INSN_FORM_D:        ret = 'd'; break;
> +    case INSN_FORM_DS:       ret = 's'; break;
> +    case INSN_FORM_DQ:       ret = 'q'; break;
> +    case INSN_FORM_X:        ret = 'x'; break;
> +    case INSN_FORM_PREFIXED: ret = 'p'; break;
> +    default:                 ret = '?'; break;
> +    }
> +
> +  return ret;
> +}
> +
>  /* Inner function printing just the address mask for a particular reload
>     register class.  */
>  DEBUG_FUNCTION char *
> @@ -2115,6 +2140,12 @@ rs6000_debug_print_mode (ssize_t m)
>      fprintf (stderr, " %s: %s", reload_reg_map[rc].name,
>  	     rs6000_debug_addr_mask (reg_addr[m].addr_mask[rc], true));
>
> +  fprintf (stderr, "  Format: %c:%c%c%c",
> +          rs6000_debug_insn_form (reg_addr[m].default_insn_form),
> +          rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_GPR]),
> +          rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_FPR]),
> +          rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_VMX]));
> +
>    if ((reg_addr[m].reload_store != CODE_FOR_nothing)
>        || (reg_addr[m].reload_load != CODE_FOR_nothing))
>      {
> @@ -2668,6 +2699,153 @@ rs6000_setup_reg_addr_masks (void)
>      }
>  }
>
> +/* Set up the instruction format for each mode and register type from the
> +   addr_mask.  */
> +
> +static void
> +setup_insn_form (void)
> +{
> +  for (ssize_t m = 0; m < NUM_MACHINE_MODES; ++m)
> +    {
> +      machine_mode scalar_mode = (machine_mode) m;
> +
> +      /* Convert complex and IBM double double/_Decimal128 into their scalar
> +	 parts that the registers will be split into for doing load or
> +	 store.  */
> +      if (COMPLEX_MODE_P (scalar_mode))
> +	scalar_mode = GET_MODE_INNER (scalar_mode);
> +
> +      if (FLOAT128_2REG_P (scalar_mode))
> +	scalar_mode = DFmode;
> +
> +      for (ssize_t rc = FIRST_RELOAD_REG_CLASS; rc <= LAST_RELOAD_REG_CLASS; rc++)
> +	{
> +	  machine_mode single_reg_mode = scalar_mode;
> +	  size_t msize = GET_MODE_SIZE (scalar_mode);
> +	  addr_mask_type addr_mask = reg_addr[scalar_mode].addr_mask[rc];
> +	  enum insn_form iform = INSN_FORM_UNKNOWN;
> +
> +	  /* Is the mode permitted in the GPR/FPR/Altivec registers?  */
> +	  if ((addr_mask & RELOAD_REG_VALID) != 0)

To help with readability and maintainability, may I suggest factoring
the following into a separate function...
> +	    {
> +	      /* The addr_mask does not have the offsettable or indexed bits
> +		 set for modes that are split into multiple registers (like
> +		 IFmode).  It doesn't need this set, since typically by time it
> +		 is used in secondary reload, the modes are split into
> +		 component parts.
> +
> +		 The instruction format however can be used earlier in the
> +		 compilation, so we need to setup what kind of instruction can
> +		 be generated for the modes that are split.  */
> +	      if ((addr_mask & (RELOAD_REG_MULTIPLE
> +				| RELOAD_REG_OFFSET
> +				| RELOAD_REG_INDEXED)) == RELOAD_REG_MULTIPLE)
> +		{
> +		  /* Multiple register types in GPRs depend on whether we can
> +		     use DImode in a single register or SImode.  */
> +		  if (rc == RELOAD_REG_GPR)
> +		    {
> +		      if (TARGET_POWERPC64)
> +			{
> +			  gcc_assert ((msize % 8) == 0);
> +			  single_reg_mode = DImode;
> +			}
> +
> +		      else
> +			{
> +			  gcc_assert ((msize % 4) == 0);
> +			  single_reg_mode = SImode;
> +			}
> +		    }
> +
> +		  /* Multiple VSX vector sized data items will use a single
> +		     vector type as an instruction format.  */
> +		  else if (TARGET_VSX)
> +		    {
> +		      gcc_assert ((rc == RELOAD_REG_FPR)
> +				  || (rc == RELOAD_REG_VMX));
> +
> +		      if ((msize % 16) == 0)
> +			single_reg_mode = V2DImode;
> +		    }
> +
> +		  /* Multiple Altivec vector sized data items will use a single
> +		     vector type as an instruction format.  */
> +		  else if (TARGET_ALTIVEC && rc == RELOAD_REG_VMX
> +			   && (msize % 16) == 0
> +			   && VECTOR_MODE_P (single_reg_mode))
> +		    single_reg_mode = V4SImode;
> +
> +		  /* If we only have the traditional floating point unit, use
> +		     DFmode as the base type.  */
> +		  else if (!TARGET_VSX && TARGET_HARD_FLOAT
> +			   && rc == RELOAD_REG_FPR && (msize % 8) == 0)
> +		    single_reg_mode = DFmode;
> +
> +		  /* Get the information for the register mode used after
> +		     splitting.  */
> +		  addr_mask = reg_addr[single_reg_mode].addr_mask[rc];
> +		  msize = GET_MODE_SIZE (single_reg_mode);
> +		}
> +
> +	      /* Figure out the instruction format of each mode.
> +
> +		 For offsettable addresses that aren't specifically quad mode,
> +		 see if the default form is D or DS.  GPR 64-bit offsettable
> +		 addresses are DS format.  Likewise, all Altivec offsettable
> +		 addresses are DS format.  */
> +	      if ((addr_mask & RELOAD_REG_OFFSET) != 0)
> +		{
> +		  if ((addr_mask & RELOAD_REG_QUAD_OFFSET) != 0)
> +		    iform = INSN_FORM_DQ;
> +
> +		  else if (rc == RELOAD_REG_VMX
> +			   || (rc == RELOAD_REG_GPR && TARGET_POWERPC64
> +			       && (msize >= 8)))
> +		    iform = INSN_FORM_DS;
> +
> +		  else
> +		    iform = INSN_FORM_D;
> +		}
> +
> +	      else if ((addr_mask & RELOAD_REG_INDEXED) != 0)
> +		iform = INSN_FORM_X;
> +	    }
> +
> +	  reg_addr[m].insn_form[rc] = iform;
> +	}

... until here.  Having all this in a doubly nested loop makes it
difficult to read.
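
As a rough sketch of the kind of factoring suggested above, the final
"figure out the instruction format" step could move into a helper along
these lines.  The enum values, mask bits, and the function name
insn_form_for_class are stand-ins invented for illustration, not the
definitions in rs6000.c, and the multi-register mode remapping is left
out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the rs6000.c definitions; names and values are
   hypothetical and chosen only so the sketch compiles on its own.  */
enum insn_form { FORM_UNKNOWN, FORM_D, FORM_DS, FORM_DQ, FORM_X };
enum reload_rc { RC_GPR, RC_FPR, RC_VMX };

#define MASK_VALID       0x01u
#define MASK_INDEXED     0x02u
#define MASK_OFFSET      0x04u
#define MASK_QUAD_OFFSET 0x08u

/* Hypothetical helper: pick the instruction format for one register
   class, given the address mask and size of the (already split) mode.  */
enum insn_form
insn_form_for_class (unsigned addr_mask, enum reload_rc rc,
		     bool powerpc64, size_t msize)
{
  /* Mode not permitted in this register class.  */
  if ((addr_mask & MASK_VALID) == 0)
    return FORM_UNKNOWN;

  if ((addr_mask & MASK_OFFSET) != 0)
    {
      if ((addr_mask & MASK_QUAD_OFFSET) != 0)
	return FORM_DQ;

      /* Altivec offsettable addresses and 64-bit GPR accesses are DS.  */
      if (rc == RC_VMX || (rc == RC_GPR && powerpc64 && msize >= 8))
	return FORM_DS;

      return FORM_D;
    }

  if ((addr_mask & MASK_INDEXED) != 0)
    return FORM_X;

  return FORM_UNKNOWN;
}
```

With something like this, the inner loop body reduces to the mode
remapping followed by a single call, which should be easier to follow.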
> +
> +      /* Figure out the default insn format that is used for offsettable memory
> +	 instructions.  For scalar floating point use the FPR addressing, for
> +	 vectors and IEEE 128-bit use a suitable vector register type, and
> +	 otherwise use GPRs.  */
> +      ssize_t def_rc;
> +      if (TARGET_VSX
> +	  && (VECTOR_MODE_P (scalar_mode) || FLOAT128_IEEE_P (scalar_mode)))
> +	{
> +	  if ((reg_addr[m].addr_mask[RELOAD_REG_FPR] & RELOAD_REG_VALID) != 0)
> +	    def_rc = RELOAD_REG_FPR;
> +	  else
> +	    def_rc = RELOAD_REG_VMX;
> +	}
> +
> +      else if (TARGET_ALTIVEC && !TARGET_VSX && VECTOR_MODE_P (scalar_mode))
> +	def_rc = RELOAD_REG_VMX;
> +
> +      else if (TARGET_HARD_FLOAT && SCALAR_FLOAT_MODE_P (scalar_mode))
> +	def_rc = RELOAD_REG_FPR;
> +
> +      else
> +	def_rc = RELOAD_REG_GPR;
> +
> +      reg_addr[m].default_insn_form = reg_addr[m].insn_form[def_rc];
> +
> +      /* Don't enable prefixed memory support until all of the infrastructure
> +	 changes are in.  */
> +      reg_addr[m].prefixed_memory_p = false;
> +    }
> +}
> +
>  \f
>  /* Initialize the various global tables that are based on register size.  */
>  static void
> @@ -3181,6 +3359,9 @@ rs6000_init_hard_regno_mode_ok (bool glo
>       use.  */
>    rs6000_setup_reg_addr_masks ();
>
> +  /* Update the instruction formats.  */
> +  setup_insn_form ();
> +
>    if (global_init_p || TARGET_DEBUG_TARGET)
>      {
>        if (TARGET_DEBUG_REG)
> @@ -13070,29 +13251,21 @@ print_operand (FILE *file, rtx x, int co
>  void
>  print_operand_address (FILE *file, rtx x)
>  {
> +  pcrel_info_type pcrel_info;
> +
>    if (REG_P (x))
>      fprintf (file, "0(%s)", reg_names[ REGNO (x) ]);
>
>    /* Is it a pc-relative address?  */
> -  else if (pcrel_address (x, Pmode))
> +  else if (pcrel_addr_p (x, true, true, &pcrel_info))
>      {
> -      HOST_WIDE_INT offset;
> +      output_addr_const (file, pcrel_info.base_addr);
>
> -      if (GET_CODE (x) == CONST)
> -	x = XEXP (x, 0);
> +      if (pcrel_info.offset)
> +	fprintf (file, "%+" PRId64, pcrel_info.offset);
>
> -      if (GET_CODE (x) == PLUS)
> -	{
> -	  offset = INTVAL (XEXP (x, 1));
> -	  x = XEXP (x, 0);
> -	}
> -      else
> -	offset = 0;
> -
> -      output_addr_const (file, x);
> -
> -      if (offset)
> -	fprintf (file, "%+" PRId64, offset);
> +      if (pcrel_info.external_p)
> +	fputs ("@got", file);
>
>        fputs ("@pcrel", file);
>      }
> @@ -13579,70 +13752,206 @@ rs6000_pltseq_template (rtx *operands, i
>    return str;
>  }
>  #endif
> +\f
> +/* Return true if the address ADDR is a prefixed address either with a large
> +   offset, an offset that does not fit in the normal instruction form, or a
> +   pc-relative address to a local symbol.

I was confused as to the difference between the first two clauses in the
above comment.  I think that in the second perhaps you mean it doesn't
"fit" because the low-order 2 or 4 bits are nonzero for DS/DQ; is that
right?  If so, this comment could be clarified.  Not fitting sounds like
it requires more than 16 bits (possibly shifted) to describe.
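
To illustrate the distinction I think the comment is drawing, here is a
simplified standalone sketch of the register + offset case.  The names
fits_signed_bits and offset_needs_prefix are made up for this example,
and the sketch ignores the prefixed_memory_p check and the
unknown/prefixed forms that the patch also handles.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the traditional offset formats.  */
enum insn_form { FORM_D, FORM_DS, FORM_DQ };

/* True if VALUE is representable as a signed BITS-bit integer.  */
bool
fits_signed_bits (int64_t value, unsigned bits)
{
  int64_t limit = (int64_t) 1 << (bits - 1);
  return value >= -limit && value < limit;
}

/* True if a reg+OFFSET access needs the prefixed encoding.  */
bool
offset_needs_prefix (int64_t offset, enum insn_form form)
{
  /* Not encodable at all, even with a prefix.  */
  if (!fits_signed_bits (offset, 34))
    return false;

  /* Clause 1: the offset is too large for 16 bits.  */
  if (!fits_signed_bits (offset, 16))
    return true;

  /* Clause 2: the offset is small, but the low-order bits that DS/DQ
     forms require to be zero are set.  */
  if (form == FORM_DS)
    return (offset & 3) != 0;
  if (form == FORM_DQ)
    return (offset & 15) != 0;

  return false;
}
```

Under this reading, an offset of 6 with a DS-form instruction falls under
the second clause, while an offset of 0x12345 falls under the first
regardless of form.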

Thanks,
Bill
>
> -/* Helper function to return whether a MODE can do prefixed loads/stores.
> -   VOIDmode is used when we are loading the pc-relative address into a base
> -   register, but we are not using it as part of a memory operation.  As modes
> -   add support for prefixed memory, they will be added here.  */
> -
> -static bool
> -mode_supports_prefixed_address_p (machine_mode mode)
> -{
> -  return mode == VOIDmode;
> -}
> +   MODE is the mode of the memory.
>
> -/* Function to return true if ADDR is a valid prefixed memory address that uses
> -   mode MODE.  */
> +   IFORM is used to determine if the traditional address is either DS format or
> +   DQ format and the bottom bits of the offset are non-zero.  */
>
>  bool
> -rs6000_prefixed_address_mode_p (rtx addr, machine_mode mode)
> +prefixed_local_addr_p (rtx addr, machine_mode mode, enum insn_form iform)
>  {
> -  if (!TARGET_PREFIXED_ADDR || !mode_supports_prefixed_address_p (mode))
> +  if (!reg_addr[mode].prefixed_memory_p)
>      return false;
>
> -  /* Check for PC-relative addresses.  */
> -  if (pcrel_address (addr, Pmode))
> -    return true;
> +  if (GET_CODE (addr) == CONST)
> +    addr = XEXP (addr, 0);
> +
> +  /* Single register, not prefixed.  */
> +  if (REG_P (addr) || SUBREG_P (addr))
> +    return false;
> +
> +  /* Register + offset.  */
> +  else if (GET_CODE (addr) == PLUS
> +	   && (REG_P (XEXP (addr, 0)) || SUBREG_P (XEXP (addr, 0)))
> +	   && CONST_INT_P (XEXP (addr, 1)))
> +    {
> +      HOST_WIDE_INT offset = INTVAL (XEXP (addr, 1));
> +
> +      /* Prefixed instructions can only access 34-bits.  Fail if the value
> +	 is larger than that.  */
> +      if (!SIGNED_34BIT_OFFSET_P (offset))
> +	return false;
> +
> +      /* For small offsets see whether it might be a DS or DQ instruction where
> +	 the bottom bits non-zero.  This would require using a prefixed
> +	 address.  If the offset is larger than 16 bits, then the instruction
> +	 must be prefixed.  */
> +      if (SIGNED_16BIT_OFFSET_P (offset))
> +	{
> +	  /* Use default if we don't know the precise instruction format.  */
> +	  if (iform == INSN_FORM_UNKNOWN)
> +	    iform = reg_addr[mode].default_insn_form;
> +
> +	  if (iform == INSN_FORM_DS)
> +	    return (offset & 3) != 0;
> +
> +	  else if (iform == INSN_FORM_DQ)
> +	    return (offset & 15) != 0;
> +
> +	  else if (iform != INSN_FORM_PREFIXED)
> +	    return false;
> +	}
> +
> +      return true;
> +    }
> +
> +  else if (!TARGET_PCREL)
> +    return false;
>
> -  /* Check for prefixed memory addresses that have a large numeric offset,
> -     or an offset that can't be used for a DS/DQ-form memory operation.  */
>    if (GET_CODE (addr) == PLUS)
>      {
> -      rtx op0 = XEXP (addr, 0);
>        rtx op1 = XEXP (addr, 1);
>
> -      if (!base_reg_operand (op0, Pmode) || !CONST_INT_P (op1))
> +      if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
>  	return false;
>
> -      HOST_WIDE_INT value = INTVAL (op1);
> -      if (!SIGNED_34BIT_OFFSET_P (value))
> +      addr = XEXP (addr, 0);
> +    }
> +
> +  /* Local pc-relative symbols/labels.  */
> +  return (LABEL_REF_P (addr)
> +	  || (SYMBOL_REF_P (addr) && SYMBOL_REF_LOCAL_P (addr)));
> +}
> +
> +/* Return true if the address ADDR is a prefixed address that is a pc-relative
> +   reference either to a local symbol or to an external symbol.  We break apart
> +   the address and return the parts.  LOCAL_SYMBOL_P and EXTERNAL_SYMBOL_P says
> +   whether local and external pc-relative symbols are allowed.  P_INFO points
> +   to a structure that returns the broken out component parts if desired.  */
> +
> +bool
> +pcrel_addr_p (rtx addr,
> +	      bool local_symbol_p,
> +	      bool external_symbol_p,
> +	      pcrel_info_type *p_info)
> +{
> +  rtx base_addr = NULL_RTX;
> +  HOST_WIDE_INT offset = 0;
> +  bool was_external_p = false;
> +
> +  if (p_info)
> +    {
> +      p_info->base_addr = NULL_RTX;
> +      p_info->offset = 0;
> +      p_info->external_p = false;
> +    }
> +
> +  if (!TARGET_PCREL)
> +    return false;
> +
> +  if (GET_CODE (addr) == CONST)
> +    addr = XEXP (addr, 0);
> +
> +  /* Pc-relative symbols/labels without offsets.  */
> +  if (SYMBOL_REF_P (addr))
> +    {
> +      base_addr = addr;
> +      was_external_p = !SYMBOL_REF_LOCAL_P (addr);
> +    }
> +
> +  else if (LABEL_REF_P (addr))
> +    base_addr = addr;
> +
> +  /* Pc-relative symbols with offsets.  */
> +  else if (GET_CODE (addr) == PLUS
> +	   && SYMBOL_REF_P (XEXP (addr, 0))
> +	   && CONST_INT_P (XEXP (addr, 1)))
> +    {
> +      base_addr = XEXP (addr, 0);
> +      offset = INTVAL (XEXP (addr, 1));
> +      was_external_p = !SYMBOL_REF_LOCAL_P (base_addr);
> +
> +      if (!SIGNED_34BIT_OFFSET_P (offset))
>  	return false;
> +    }
>
> -      /* Offset larger than 16-bits?  */
> -      if (!SIGNED_16BIT_OFFSET_P (value))
> -	return true;
> +  else
> +    return false;
> +
> +  if (was_external_p && !external_symbol_p)
> +    return false;
> +
> +  if (!was_external_p && !local_symbol_p)
> +    return false;
>
> -      /* DQ instruction (bottom 4 bits must be 0) for vectors.  */
> -      HOST_WIDE_INT mask;
> -      if (GET_MODE_SIZE (mode) >= 16)
> -	mask = 15;
> -
> -      /* DS instruction (bottom 2 bits must be 0).  For 32-bit integers, we
> -	 need to use DS instructions if we are sign-extending the value with
> -	 LWA.  For 32-bit floating point, we need DS instructions to load and
> -	 store values to the traditional Altivec registers.  */
> -      else if (GET_MODE_SIZE (mode) >= 4)
> -	mask = 3;
> +  if (p_info)
> +    {
> +      p_info->base_addr = base_addr;
> +      p_info->offset = offset;
> +      p_info->external_p = was_external_p;
> +    }
> +
> +  return true;
> +}
> +
> +/* Given a register and a mode, return the instruction format for that
> +   register.  If the register is a pseudo register, use the default format.
> +   Otherwise if it is hard register, look to see exactly what type of
> +   addressing is used.  */
> +
> +enum insn_form
> +reg_to_insn_form (rtx reg, machine_mode mode)
> +{
> +  enum insn_form iform;
>
> -      /* QImode/HImode has no restrictions.  */
> +  /* Handle UNSPECs, such as the special UNSPEC_SF_FROM_SI and
> +     UNSPEC_SI_FROM_SF UNSPECs, which are used to hide SF/SI interactions.
> +     Look at the first argument, and if it is a register, use that.  */
> +  if (GET_CODE (reg) == UNSPEC || GET_CODE (reg) == UNSPEC_VOLATILE)
> +    {
> +      rtx op0 = XVECEXP (reg, 0, 0);
> +      if (REG_P (op0) || SUBREG_P (op0))
> +	reg = op0;
> +    }
> +
> +  /* If it isn't a register, use the defaults.  */
> +  if (!REG_P (reg) && !SUBREG_P (reg))
> +    iform = reg_addr[mode].default_insn_form;
> +
> +  else
> +    {
> +      unsigned int r = reg_or_subregno (reg);
> +
> +      /* If we have a pseudo, use the default instruction format.  */
> +      if (r >= FIRST_PSEUDO_REGISTER)
> +	iform = reg_addr[mode].default_insn_form;
> +
> +      /* If we have a hard register, use the address format of that hard
> +	 register.  */
>        else
> -	return true;
> +	{
> +	  if (INT_REGNO_P (r))
> +	    iform = reg_addr[mode].insn_form[RELOAD_REG_GPR];
> +
> +	  else if (FP_REGNO_P (r))
> +	    iform = reg_addr[mode].insn_form[RELOAD_REG_FPR];
>
> -      /* Return true if we must use a prefixed instruction.  */
> -      return (value & mask) != 0;
> +	  else if (ALTIVEC_REGNO_P (r))
> +	    iform = reg_addr[mode].insn_form[RELOAD_REG_VMX];
> +
> +	  /* For anything else (SPR, CA, etc.) assume the GPR registers will be
> +	     used to load or store the value.  */
> +	  else
> +	    iform = reg_addr[mode].insn_form[RELOAD_REG_GPR];
> +	}
>      }
>
> -  return false;
> +  return iform;
>  }
>  \f
>  #if defined (HAVE_GAS_HIDDEN) && !TARGET_MACHO
> Index: gcc/config/rs6000/rs6000.md
> ===================================================================
> --- gcc/config/rs6000/rs6000.md	(revision 274172)
> +++ gcc/config/rs6000/rs6000.md	(working copy)
> @@ -252,6 +252,23 @@ (define_attr "var_shift" "no,yes"
>  ;; Is copying of this instruction disallowed?
>  (define_attr "cannot_copy" "no,yes" (const_string "no"))
>
> +;; Enumeration of the PowerPC instruction formats.  We only list the
> +;; instruction formats that are used by the code, and not every possible
> +;; instruction format that the machine supports.
> +
> +;; The main use for this enumeration is to determine if a particular
> +;; offsettable instruction has a valid offset field for a traditional
> +;; instruction, or whether a prefixed instruction might be needed to hold the
> +;; offset.  For DS/DQ format instructions, if we have an offset that has the
> +;; bottom bits non-zero, we can use a prefixed instruction instead of pushing
> +;; the offset to an index register.
> +(define_enum "insn_form" [unknown	; Unknown format
> +			  d		; Offset addressing uses 16 bits
> +			  ds		; Offset addressing uses 14 bits
> +			  dq		; Offset addressing uses 12 bits
> +			  x		; Indexed addressing
> +			  prefixed])	; Prefixed instruction
> +
>  ;; Length of the instruction (in bytes).
>  (define_attr "length" "" (const_int 4))
>
>

* Re: [PATCH], Patch #3 of 10, Add prefixed addressing support
  2019-08-14 22:12 ` [PATCH], Patch #3 of 10, Add prefixed addressing support Michael Meissner
@ 2019-08-16  1:59   ` Bill Schmidt
  0 siblings, 0 replies; 26+ messages in thread
From: Bill Schmidt @ 2019-08-16  1:59 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Alan Modra

On 8/14/19 5:06 PM, Michael Meissner wrote:
> This patch adds prefixed memory support to all offsettable instructions.
>
> Unlike previous versions of the patch, this patch combines all of the
> modifications for addressing to one patch.  Previously, I had 3 separate
> patches (one for PADDI, one for scalar types, and one for vector types).
>
> 2019-08-14   Michael Meissner  <meissner@linux.ibm.com>
>
> 	* config/rs6000/predicates.md (add_operand): Add support for the
> 	PADDI instruction.
> 	(non_add_cint_operand): Add support for the PADDI instruction.
> 	(lwa_operand): Add support for the prefixed PLWA instruction.
> 	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok_uncached):
> 	Only treat modes < 16 bytes as scalars.
> 	(rs6000_debug_print_mode): Print whether the mode supports
> 	prefixed addressing.
> 	(setup_insn_form): Enable prefixed addressing for all modes whose
> 	default instruction form includes offset addressing.
> 	(num_insns_constant_gpr): Add support for the PADDI instruction.
> 	(quad_address_p): Add support for prefixed addressing.
> 	(mem_operand_gpr): Add support for prefixed addressing.
> 	(mem_operand_ds_form): Add support for prefixed addressing.
> 	(rs6000_legitimate_offset_address_p): Add support for prefixed
> 	addressing.
> 	(rs6000_legitimate_address_p): Add support for prefixed
> 	addressing.
> 	(rs6000_mode_dependent_address): Add support for prefixed
> 	addressing.
> 	(rs6000_rtx_costs): Make PADDI cost the same as ADDI or ADDIS.
> 	* config/rs6000/rs6000.md (add<mode>3): Add support for PADDI.
> 	(movsi_internal1): Add support for prefixed addressing, and using
> 	PADDI to load up large integers.
> 	(movsi splitter): Do not split up a PADDI instruction.
> 	(mov<mode>_64bit_dm): Add support for prefixed addressing.
> 	(movtd_64bit_nodm): Add support for prefixed addressing.
> 	(movdi_internal64): Add support for prefixed addressing, and using
> 	PADDI to load up large integers.
> 	(movdi splitter): Update comment about PADDI.
> 	(stack_protect_setdi): Add support for prefixed addressing.
> 	(stack_protect_testdi): Add support for prefixed addressing.
> 	* config/rs6000/vsx.md (vsx_mov<mode>_64bit): Add support for
> 	prefixed addressing.
> 	(vsx_extract_<P:mode>_<VSX_D:mode>_load): Add support for prefixed
> 	addressing.
> 	(vsx_extract_<P:mode>_<VSX_D:mode>_load): Add support for prefixed
> 	addressing.
>
> Index: gcc/config/rs6000/predicates.md
> ===================================================================
> --- gcc/config/rs6000/predicates.md	(revision 274174)
> +++ gcc/config/rs6000/predicates.md	(working copy)
> @@ -839,7 +839,8 @@
>  (define_predicate "add_operand"
>    (if_then_else (match_code "const_int")
>      (match_test "satisfies_constraint_I (op)
> -		 || satisfies_constraint_L (op)")
> +		 || satisfies_constraint_L (op)
> +		 || satisfies_constraint_eI (op)")
>      (match_operand 0 "gpc_reg_operand")))
>
>  ;; Return 1 if the operand is either a non-special register, or 0, or -1.
> @@ -852,7 +853,8 @@
>  (define_predicate "non_add_cint_operand"
>    (and (match_code "const_int")
>         (match_test "!satisfies_constraint_I (op)
> -		    && !satisfies_constraint_L (op)")))
> +		    && !satisfies_constraint_L (op)
> +		    && !satisfies_constraint_eI (op)")))
>
>  ;; Return 1 if the operand is a constant that can be used as the operand
>  ;; of an AND, OR or XOR.
> @@ -933,6 +935,13 @@
>      return false;
>
>    addr = XEXP (inner, 0);
> +
> +  /* The LWA instruction uses the DS-form format where the bottom two bits of
> +     the offset must be 0.  The prefixed PLWA does not have this
> +     restriction.  */
> +  if (prefixed_local_addr_p (addr, mode, INSN_FORM_DS))
> +    return true;
> +
>    if (GET_CODE (addr) == PRE_INC
>        || GET_CODE (addr) == PRE_DEC
>        || (GET_CODE (addr) == PRE_MODIFY
> Index: gcc/config/rs6000/rs6000.c
> ===================================================================
> --- gcc/config/rs6000/rs6000.c	(revision 274175)
> +++ gcc/config/rs6000/rs6000.c	(working copy)
> @@ -1828,7 +1828,7 @@ rs6000_hard_regno_mode_ok_uncached (int regno, mac
>
>        if (ALTIVEC_REGNO_P (regno))
>  	{
> -	  if (GET_MODE_SIZE (mode) != 16 && !reg_addr[mode].scalar_in_vmx_p)
> +	  if (GET_MODE_SIZE (mode) < 16 && !reg_addr[mode].scalar_in_vmx_p)
>  	    return 0;

Is this an unrelated change?  I don't quite understand why it was made.
Is this to do with vector_pair support?  If so, maybe it belongs with a
different patch?
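
To make the question concrete: the old and new conditions only disagree
for modes larger than 16 bytes.  The sketch below uses hypothetical
function names and plain sizes instead of machine modes; the 32-byte case
is my assumption about what the change is aimed at.

```c
#include <stdbool.h>
#include <stddef.h>

/* Old test: reject any mode that is not exactly 16 bytes, unless it is
   a scalar allowed in the Altivec registers.  */
bool
rejected_old (size_t mode_size, bool scalar_in_vmx)
{
  return mode_size != 16 && !scalar_in_vmx;
}

/* New test: reject only modes smaller than 16 bytes.  */
bool
rejected_new (size_t mode_size, bool scalar_in_vmx)
{
  return mode_size < 16 && !scalar_in_vmx;
}
```

For sizes 8 and 16 the two agree; for a 32-byte mode the old test rejects
and the new one falls through to the last_regno check.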
>
>  	  return ALTIVEC_REGNO_P (last_regno);
> @@ -2146,6 +2146,11 @@ rs6000_debug_print_mode (ssize_t m)
>            rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_FPR]),
>            rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_VMX]));
>
> +  if (reg_addr[m].prefixed_memory_p)
> +    fprintf (stderr, "  Prefix");
> +  else
> +    spaces += sizeof ("  Prefix") - 1;
> +
>    if ((reg_addr[m].reload_store != CODE_FOR_nothing)
>        || (reg_addr[m].reload_load != CODE_FOR_nothing))
>      {
> @@ -2838,11 +2843,16 @@ setup_insn_form (void)
>        else
>  	def_rc = RELOAD_REG_GPR;
>
> -      reg_addr[m].default_insn_form = reg_addr[m].insn_form[def_rc];
> +      enum insn_form def_iform = reg_addr[m].insn_form[def_rc];
> +      reg_addr[m].default_insn_form = def_iform;
>
> -      /* Don't enable prefixed memory support until all of the infrastructure
> -	 changes are in.  */
> -      reg_addr[m].prefixed_memory_p = false;
> +      /* Only modes that support offset addressing by default can be
> +	 prefixed.  */
> +      reg_addr[m].prefixed_memory_p = (TARGET_PREFIXED_ADDR
> +				       && (def_iform == INSN_FORM_D
> +					   || def_iform == INSN_FORM_DS
> +					   || def_iform == INSN_FORM_DQ));
> +
>      }
>  }
>
> @@ -5693,7 +5703,7 @@ static int
>  num_insns_constant_gpr (HOST_WIDE_INT value)
>  {
>    /* signed constant loadable with addi */
> -  if (((unsigned HOST_WIDE_INT) value + 0x8000) < 0x10000)
> +  if (SIGNED_16BIT_OFFSET_P (value))
>      return 1;
>
>    /* constant loadable with addis */
> @@ -5701,6 +5711,10 @@ num_insns_constant_gpr (HOST_WIDE_INT value)
>  	   && (value >> 31 == -1 || value >> 31 == 0))
>      return 1;
>
> +  /* PADDI can support up to 34 bit signed integers.  */
> +  else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (value))
> +    return 1;
> +
>    else if (TARGET_POWERPC64)
>      {
>        HOST_WIDE_INT low  = ((value & 0xffffffff) ^ 0x80000000) - 0x80000000;
> @@ -7411,7 +7425,7 @@ quad_address_p (rtx addr, machine_mode mode, bool
>  {
>    rtx op0, op1;
>
> -  if (GET_MODE_SIZE (mode) != 16)
> +  if (GET_MODE_SIZE (mode) < 16)
>      return false;

Same question about whether this is an unrelated change, perhaps to do
with vector_pair support?
>
>    if (legitimate_indirect_address_p (addr, strict))
> @@ -7420,6 +7434,13 @@ quad_address_p (rtx addr, machine_mode mode, bool
>    if (VECTOR_MODE_P (mode) && !mode_supports_dq_form (mode))
>      return false;
>
> +  /* Is this a valid prefixed address?  If the bottom four bits of the offset
> +     are non-zero, we could use a prefixed instruction (which does not have the
> +     DQ-form constraint that the traditional instruction had) instead of
> +     forcing the unaligned offset to a GPR.  */
> +  if (prefixed_local_addr_p (addr, mode, INSN_FORM_DQ))
> +    return true;
> +
>    if (GET_CODE (addr) != PLUS)
>      return false;
>
> @@ -7521,6 +7542,13 @@ mem_operand_gpr (rtx op, machine_mode mode)
>        && legitimate_indirect_address_p (XEXP (addr, 0), false))
>      return true;
>
> +  /* Allow prefixed instructions if supported.  If the bottom two bits of the
> +     offset are non-zero, we could use a prefixed instruction (which does not
> +     have the DS-form constraint that the traditional instruction had) instead
> +     of forcing the unaligned offset to a GPR.  */
> +  if (prefixed_local_addr_p (addr, mode, INSN_FORM_DS))
> +    return true;
> +
>    /* Don't allow non-offsettable addresses.  See PRs 83969 and 84279.  */
>    if (!rs6000_offsettable_memref_p (op, mode, false))
>      return false;
> @@ -7542,7 +7570,7 @@ mem_operand_gpr (rtx op, machine_mode mode)
>         causes a wrap, so test only the low 16 bits.  */
>      offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
>
> -  return offset + 0x8000 < 0x10000u - extra;
> +  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
>  }
>
>  /* As above, but for DS-FORM VSX insns.  Unlike mem_operand_gpr,
> @@ -7555,6 +7583,13 @@ mem_operand_ds_form (rtx op, machine_mode mode)
>    int extra;
>    rtx addr = XEXP (op, 0);
>
> +  /* Allow prefixed instructions if supported.  If the bottom two bits of the
> +     offset are non-zero, we could use a prefixed instruction (which does not
> +     have the DS-form constraint that the traditional instruction had) instead
> +     of forcing the unaligned offset to a GPR.  */
> +  if (prefixed_local_addr_p (addr, mode, INSN_FORM_DS))
> +    return true;
> +
>    if (!offsettable_address_p (false, mode, addr))
>      return false;
>
> @@ -7575,7 +7610,7 @@ mem_operand_ds_form (rtx op, machine_mode mode)
>         causes a wrap, so test only the low 16 bits.  */
>      offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
>
> -  return offset + 0x8000 < 0x10000u - extra;
> +  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
>  }
>  \f
>  /* Subroutines of rs6000_legitimize_address and rs6000_legitimate_address_p.  */
> @@ -7924,8 +7959,10 @@ rs6000_legitimate_offset_address_p (machine_mode m
>        break;
>      }
>
> -  offset += 0x8000;
> -  return offset < 0x10000 - extra;
> +  if (TARGET_PREFIXED_ADDR)
> +    return SIGNED_34BIT_OFFSET_EXTRA_P (offset, extra);
> +  else
> +    return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
>  }
>
>  bool
> @@ -8822,6 +8859,11 @@ rs6000_legitimate_address_p (machine_mode mode, rt
>        && mode_supports_pre_incdec_p (mode)
>        && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
>      return 1;
> +
> +  /* Handle prefixed addresses (pc-relative or 34-bit offset).  */
> +  if (prefixed_local_addr_p (x, mode, INSN_FORM_UNKNOWN))
> +    return 1;
> +
>    /* Handle restricted vector d-form offsets in ISA 3.0.  */
>    if (quad_offset_p)
>      {
> @@ -8880,7 +8922,10 @@ rs6000_legitimate_address_p (machine_mode mode, rt
>  	  || (!avoiding_indexed_address_p (mode)
>  	      && legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict)))
>        && rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0)))
> -    return 1;
> +    {
> +      /* There is no prefixed version of the load/store with update.  */
> +      return !prefixed_local_addr_p (XEXP (x, 1), mode, INSN_FORM_UNKNOWN);
> +    }
>    if (reg_offset_p && !quad_offset_p
>        && legitimate_lo_sum_address_p (mode, x, reg_ok_strict))
>      return 1;
> @@ -8942,8 +8987,12 @@ rs6000_mode_dependent_address (const_rtx addr)
>  	  && XEXP (addr, 0) != arg_pointer_rtx
>  	  && CONST_INT_P (XEXP (addr, 1)))
>  	{
> -	  unsigned HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
> -	  return val + 0x8000 >= 0x10000 - (TARGET_POWERPC64 ? 8 : 12);
> +	  HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
> +	  HOST_WIDE_INT extra = TARGET_POWERPC64 ? 8 : 12;
> +	  if (TARGET_PREFIXED_ADDR)
> +	    return !SIGNED_34BIT_OFFSET_EXTRA_P (val, extra);
> +	  else
> +	    return !SIGNED_16BIT_OFFSET_EXTRA_P (val, extra);
>  	}
>        break;
>
> @@ -20939,7 +20988,8 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int ou
>  	    || outer_code == PLUS
>  	    || outer_code == MINUS)
>  	   && (satisfies_constraint_I (x)
> -	       || satisfies_constraint_L (x)))
> +	       || satisfies_constraint_L (x)
> +	       || satisfies_constraint_eI (x)))
>  	  || (outer_code == AND
>  	      && (satisfies_constraint_K (x)
>  		  || (mode == SImode
> Index: gcc/config/rs6000/rs6000.md
> ===================================================================
> --- gcc/config/rs6000/rs6000.md	(revision 274175)
> +++ gcc/config/rs6000/rs6000.md	(working copy)
> @@ -1768,15 +1768,17 @@
>  })
>
>  (define_insn "*add<mode>3"
> -  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
> -	(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
> -		  (match_operand:GPR 2 "add_operand" "r,I,L")))]
> +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r,r")
> +	(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b,b")
> +		  (match_operand:GPR 2 "add_operand" "r,I,L,eI")))]
>    ""
>    "@
>     add %0,%1,%2
>     addi %0,%1,%2
> -   addis %0,%1,%v2"
> -  [(set_attr "type" "add")])
> +   addis %0,%1,%v2
> +   addi %0,%1,%2"
> +  [(set_attr "type" "add")
> +   (set_attr "isa" "*,*,*,fut")])
>
>  (define_insn "*addsi3_high"
>    [(set (match_operand:SI 0 "gpc_reg_operand" "=b")
> @@ -6916,22 +6918,22 @@
>
>  ;;		MR           LA           LWZ          LFIWZX       LXSIWZX
>  ;;		STW          STFIWX       STXSIWX      LI           LIS
> -;;		#            XXLOR        XXSPLTIB 0   XXSPLTIB -1  VSPLTISW
> -;;		XXLXOR 0     XXLORC -1    P9 const     MTVSRWZ      MFVSRWZ
> -;;		MF%1         MT%0         NOP
> +;;		PLI          #            XXLOR        XXSPLTIB 0   XXSPLTIB -1
> +;;		VSPLTISW     XXLXOR 0     XXLORC -1    P9 const     MTVSRWZ
> +;;		MFVSRWZ      MF%1         MT%0         NOP
>  (define_insn "*movsi_internal1"
>    [(set (match_operand:SI 0 "nonimmediate_operand"
>  		"=r,         r,           r,           d,           v,
>  		 m,          Z,           Z,           r,           r,
> -		 r,          wa,          wa,          wa,          v,
> -		 wa,         v,           v,           wa,          r,
> -		 r,          *h,          *h")
> +		 r,          r,           wa,          wa,          wa,
> +		 v,          wa,          v,           v,           wa,
> +		 r,          r,           *h,          *h")
>  	(match_operand:SI 1 "input_operand"
>  		"r,          U,           m,           Z,           Z,
>  		 r,          d,           v,           I,           L,
> -		 n,          wa,          O,           wM,          wB,
> -		 O,          wM,          wS,          r,           wa,
> -		 *h,         r,           0"))]
> +		 eI,         n,           wa,          O,           wM,
> +		 wB,         O,           wM,          wS,          r,
> +		 wa,         *h,          r,           0"))]
>    "gpc_reg_operand (operands[0], SImode)
>     || gpc_reg_operand (operands[1], SImode)"
>    "@
> @@ -6945,6 +6947,7 @@
>     stxsiwx %x1,%y0
>     li %0,%1
>     lis %0,%v1
> +   li %0,%1
>     #
>     xxlor %x0,%x1,%x1
>     xxspltib %x0,0
> @@ -6961,21 +6964,21 @@
>    [(set_attr "type"
>  		"*,          *,           load,        fpload,      fpload,
>  		 store,      fpstore,     fpstore,     *,           *,
> -		 *,          veclogical,  vecsimple,   vecsimple,   vecsimple,
> -		 veclogical, veclogical,  vecsimple,   mffgpr,      mftgpr,
> -		 *,          *,           *")
> +		 *,          *,           veclogical,  vecsimple,   vecsimple,
> +		 vecsimple,  veclogical,  veclogical,  vecsimple,   mffgpr,
> +		 mftgpr,     *,           *,           *")
>     (set_attr "length"
>  		"*,          *,           *,           *,           *,
>  		 *,          *,           *,           *,           *,
> -		 8,          *,           *,           *,           *,
> -		 *,          *,           8,           *,           *,
> -		 *,          *,           *")
> +		 *,          8,           *,           *,           *,
> +		 *,          *,           *,           8,           *,
> +		 *,          *,           *,           *")
>     (set_attr "isa"
>  		"*,          *,           *,           p8v,         p8v,
>  		 *,          p8v,         p8v,         *,           *,
> -		 *,          p8v,         p9v,         p9v,         p8v,
> -		 p9v,        p8v,         p9v,         p8v,         p8v,
> -		 *,          *,           *")])
> +		 fut,        *,           p8v,         p9v,         p9v,
> +		 p8v,        p9v,         p8v,         p9v,         p8v,
> +		 p8v,        *,           *,           *")])
>
>  ;; Like movsi, but adjust a SF value to be used in a SI context, i.e.
>  ;; (set (reg:SI ...) (subreg:SI (reg:SF ...) 0))
> @@ -7120,14 +7123,15 @@
>    "xscvdpsp %x0,%x1"
>    [(set_attr "type" "fp")])
>
> -;; Split a load of a large constant into the appropriate two-insn
> -;; sequence.
> +;; Split a load of a large constant into the appropriate two-insn sequence.  On
> +;; systems that support PADDI (PLI), we can use PLI to load any 32-bit constant
> +;; in one instruction.
>
>  (define_split
>    [(set (match_operand:SI 0 "gpc_reg_operand")
>  	(match_operand:SI 1 "const_int_operand"))]
>    "(unsigned HOST_WIDE_INT) (INTVAL (operands[1]) + 0x8000) >= 0x10000
> -   && (INTVAL (operands[1]) & 0xffff) != 0"
> +   && (INTVAL (operands[1]) & 0xffff) != 0 && !TARGET_PREFIXED_ADDR"
>    [(set (match_dup 0)
>  	(match_dup 2))
>     (set (match_dup 0)
> @@ -7766,9 +7770,18 @@
>  ;; not swapped like they are for TImode or TFmode.  Subregs therefore are
>  ;; problematical.  Don't allow direct move for this case.
>
> +;;		FPR load    FPR store   FPR move    FPR zero    GPR load
> +;;		GPR store   GPR move    GPR zero    MFVSRD      MTVSRD
> +
>  (define_insn_and_split "*mov<mode>_64bit_dm"
> -  [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r,r,d")
> -	(match_operand:FMOVE128_FPR 1 "input_operand" "d,m,d,<zero_fp>,r,<zero_fp>Y,r,d,r"))]
> +  [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand"
> +		"=m,        d,          d,          d,          Y,
> +		 r,         r,          r,          r,          d")
> +
> +	(match_operand:FMOVE128_FPR 1 "input_operand"
> +		"d,         m,          d,          <zero_fp>,  r,
> +		 <zero_fp>, Y,          r,          d,          r"))]
> +
>    "TARGET_HARD_FLOAT && TARGET_POWERPC64 && FLOAT128_2REG_P (<MODE>mode)
>     && (<MODE>mode != TDmode || WORDS_BIG_ENDIAN)
>     && (gpc_reg_operand (operands[0], <MODE>mode)
> @@ -7776,9 +7789,13 @@
>    "#"
>    "&& reload_completed"
>    [(pc)]
> -{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
> -  [(set_attr "length" "8,8,8,8,12,12,8,8,8")
> -   (set_attr "isa" "*,*,*,*,*,*,*,p8v,p8v")])
> +{
> +  rs6000_split_multireg_move (operands[0], operands[1]);
> +  DONE;
> +}
> +  [(set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")
> +   (set_attr "non_prefixed_length" "8")
> +   (set_attr "prefixed_length" "20")])
>
>  (define_insn_and_split "*movtd_64bit_nodm"
>    [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
> @@ -7789,8 +7806,12 @@
>    "#"
>    "&& reload_completed"
>    [(pc)]
> -{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
> -  [(set_attr "length" "8,8,8,12,12,8")])
> +{
> +  rs6000_split_multireg_move (operands[0], operands[1]);
> +  DONE;
> +}
> +  [(set_attr "non_prefixed_length" "8")
> +   (set_attr "prefixed_length" "20")])
>
>  (define_insn_and_split "*mov<mode>_32bit"
>    [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r")
> @@ -8800,24 +8821,24 @@
>    [(pc)]
>  { rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
>
> -;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR #
> -;;              FPR store  FPR load   FPR move   AVX store  AVX store   AVX load
> -;;              AVX load   VSX move   P9 0       P9 -1      AVX 0/-1    VSX 0
> -;;              VSX -1     P9 const   AVX const  From SPR   To SPR      SPR<->SPR
> -;;              VSX->GPR   GPR->VSX
> +;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR pli
> +;;              GPR #      FPR store  FPR load   FPR move   AVX store   AVX store
> +;;              AVX load   AVX load   VSX move   P9 0       P9 -1       AVX 0/-1
> +;;              VSX 0      VSX -1     P9 const   AVX const  From SPR    To SPR
> +;;              SPR<->SPR  VSX->GPR   GPR->VSX
>  (define_insn "*movdi_internal64"
>    [(set (match_operand:DI 0 "nonimmediate_operand"
>                 "=YZ,       r,         r,         r,         r,          r,
> -                m,         ^d,        ^d,        wY,        Z,          $v,
> -                $v,        ^wa,       wa,        wa,        v,          wa,
> -                wa,        v,         v,         r,         *h,         *h,
> -                ?r,        ?wa")
> +                r,         m,         ^d,        ^d,        wY,         Z,
> +                $v,        $v,        ^wa,       wa,        wa,         v,
> +                wa,        wa,        v,         v,         r,          *h,
> +                *h,        ?r,        ?wa")
>  	(match_operand:DI 1 "input_operand"
> -               "r,         YZ,        r,         I,         L,          nF,
> -                ^d,        m,         ^d,        ^v,        $v,         wY,
> -                Z,         ^wa,       Oj,        wM,        OjwM,       Oj,
> -                wM,        wS,        wB,        *h,        r,          0,
> -                wa,        r"))]
> +               "r,         YZ,        r,         I,         L,          eI,
> +                nF,        ^d,        m,         ^d,        ^v,         $v,
> +                wY,        Z,         ^wa,       Oj,        wM,         OjwM,
> +                Oj,        wM,        wS,        wB,        *h,         r,
> +                0,         wa,        r"))]
>    "TARGET_POWERPC64
>     && (gpc_reg_operand (operands[0], DImode)
>         || gpc_reg_operand (operands[1], DImode))"
> @@ -8827,6 +8848,7 @@
>     mr %0,%1
>     li %0,%1
>     lis %0,%v1
> +   li %0,%1
>     #
>     stfd%U0%X0 %1,%0
>     lfd%U1%X1 %0,%1
> @@ -8850,26 +8872,28 @@
>     mtvsrd %x0,%1"
>    [(set_attr "type"
>                 "store,      load,	*,         *,         *,         *,
> -                fpstore,    fpload,     fpsimple,  fpstore,   fpstore,   fpload,
> -                fpload,     veclogical, vecsimple, vecsimple, vecsimple, veclogical,
> -                veclogical, vecsimple,  vecsimple, mfjmpr,    mtjmpr,    *,
> -                mftgpr,    mffgpr")
> +                *,          fpstore,    fpload,    fpsimple,  fpstore,   fpstore,
> +                fpload,     fpload,     veclogical,vecsimple, vecsimple, vecsimple,
> +                veclogical, veclogical, vecsimple,  vecsimple, mfjmpr,   mtjmpr,
> +                *,          mftgpr,    mffgpr")
>     (set_attr "size" "64")
>     (set_attr "length"
> -               "*,         *,         *,         *,         *,          20,
> +               "*,         *,         *,         *,         *,          *,
> +                20,        *,         *,         *,         *,          *,
>                  *,         *,         *,         *,         *,          *,
> -                *,         *,         *,         *,         *,          *,
> -                *,         8,         *,         *,         *,          *,
> -                *,         *")
> +                *,         *,         8,         *,         *,          *,
> +                *,         *,         *")
>     (set_attr "isa"
> -               "*,         *,         *,         *,         *,          *,
> -                *,         *,         *,         p9v,       p7v,        p9v,
> -                p7v,       *,         p9v,       p9v,       p7v,        *,
> -                *,         p7v,       p7v,       *,         *,          *,
> -                p8v,       p8v")])
> +               "*,         *,         *,         *,         *,          fut,
> +                *,         *,         *,         *,         p9v,        p7v,
> +                p9v,       p7v,       *,         p9v,       p9v,        p7v,
> +                *,         *,         p7v,       p7v,       *,          *,
> +                *,         p8v,       p8v")])
>
>  ; Some DImode loads are best done as a load of -1 followed by a mask
> -; instruction.
> +; instruction.  On systems that support the PADDI (PLI) instruction,
> +; num_insns_constant returns 1, so these splitters would not be used for things
> +; that can be loaded with PLI.
>  (define_split
>    [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
>  	(match_operand:DI 1 "const_int_operand"))]
> @@ -8987,7 +9011,8 @@
>    return rs6000_output_move_128bit (operands);
>  }
>    [(set_attr "type" "store,store,load,load,*,*")
> -   (set_attr "length" "8")])
> +   (set_attr "non_prefixed_length" "8,8,8,8,8,40")
> +   (set_attr "prefixed_length" "20,20,20,20,8,40")])
>
>  (define_split
>    [(set (match_operand:TI2 0 "int_reg_operand")
> @@ -11501,15 +11526,43 @@
>    [(set_attr "type" "three")
>     (set_attr "length" "12")])
>
> +;; We can't use the prefixed attribute here because there are two memory
> +;; instructions, and we can't split the insn due to the fact that this
> +;; operation needs to be done in one piece.
>  (define_insn "stack_protect_setdi"
>    [(set (match_operand:DI 0 "memory_operand" "=Y")
>  	(unspec:DI [(match_operand:DI 1 "memory_operand" "Y")] UNSPEC_SP_SET))
>     (set (match_scratch:DI 2 "=&r") (const_int 0))]
>    "TARGET_64BIT"
> -  "ld%U1%X1 %2,%1\;std%U0%X0 %2,%0\;li %2,0"
> +{
> +  if (prefixed_mem_operand (operands[1], DImode))
> +    output_asm_insn ("pld %2,%1", operands);
> +  else
> +    output_asm_insn ("ld%U1%X1 %2,%1", operands);
> +
> +  if (prefixed_mem_operand (operands[0], DImode))
> +    output_asm_insn ("pstd %2,%0", operands);
> +  else
> +    output_asm_insn ("std%U0%X0 %2,%0", operands);
> +
> +  return "li %2,0";
> +}
>    [(set_attr "type" "three")
> -   (set_attr "length" "12")])
>
> +  ;; Back to back prefixed memory instructions take 20 bytes (8 bytes for each
> +  ;; prefixed instruction + 4 bytes for the possible NOP).
> +   (set_attr "prefixed" "no")

Should "prefixed" be conditional?  "no" seems to break rs6000_num_insns
in patch #4.
> +   (set (attr "length")
> +	(cond [(and (match_operand 0 "prefixed_mem_operand")
> +		    (match_operand 1 "prefixed_mem_operand"))
> +	       (const_string "24")
> +
> +	       (ior (match_operand 0 "prefixed_mem_operand")
> +		    (match_operand 1 "prefixed_mem_operand"))
> +	       (const_string "20")]
> +
> +	      (const_string "12")))])
> +
>  (define_expand "stack_protect_test"
>    [(match_operand 0 "memory_operand")
>     (match_operand 1 "memory_operand")
> @@ -11547,6 +11600,9 @@
>     lwz%U1%X1 %3,%1\;lwz%U2%X2 %4,%2\;cmplw %0,%3,%4\;li %3,0\;li %4,0"
>    [(set_attr "length" "16,20")])
>
> +;; We can't use the prefixed attribute here because there are two memory
> +;; instructions, and we can't split the insn due to the fact that this
> +;; operation needs to be done in one piece.
>  (define_insn "stack_protect_testdi"
>    [(set (match_operand:CCEQ 0 "cc_reg_operand" "=x,?y")
>          (unspec:CCEQ [(match_operand:DI 1 "memory_operand" "Y,Y")
> @@ -11555,11 +11611,44 @@
>     (set (match_scratch:DI 4 "=r,r") (const_int 0))
>     (clobber (match_scratch:DI 3 "=&r,&r"))]
>    "TARGET_64BIT"
> -  "@
> -   ld%U1%X1 %3,%1\;ld%U2%X2 %4,%2\;xor. %3,%3,%4\;li %4,0
> -   ld%U1%X1 %3,%1\;ld%U2%X2 %4,%2\;cmpld %0,%3,%4\;li %3,0\;li %4,0"
> -  [(set_attr "length" "16,20")])
> +{
> +  if (prefixed_mem_operand (operands[1], DImode))
> +    output_asm_insn ("pld %3,%1", operands);
> +  else
> +    output_asm_insn ("ld%U1%X1 %3,%1", operands);
>
> +  if (prefixed_mem_operand (operands[2], DImode))
> +    output_asm_insn ("pld %4,%2", operands);
> +  else
> +    output_asm_insn ("ld%U2%X2 %4,%2", operands);
> +
> +  if (which_alternative == 0)
> +    output_asm_insn ("xor. %3,%3,%4", operands);
> +  else
> +    output_asm_insn ("cmpld %0,%3,%4\;li %3,0", operands);
> +
> +  return "li %4,0";
> +}
> +  ;; Back to back prefixed memory instructions take 20 bytes (8 bytes for each
> +  ;; prefixed instruction + 4 bytes for the possible NOP).
> +  [(set (attr "length")
> +	(cond [(and (match_operand 1 "prefixed_mem_operand")
> +		    (match_operand 2 "prefixed_mem_operand"))
> +	       (if_then_else (eq_attr "alternative" "0")
> +			     (const_string "28")
> +			     (const_string "32"))
> +
> +	       (ior (match_operand 1 "prefixed_mem_operand")
> +		    (match_operand 2 "prefixed_mem_operand"))
> +	       (if_then_else (eq_attr "alternative" "0")
> +			     (const_string "20")
> +			     (const_string "24"))]
> +
> +	      (if_then_else (eq_attr "alternative" "0")
> +			    (const_string "16")
> +			    (const_string "20"))))
> +   (set_attr "prefixed" "no")])

Same question about "prefixed" being conditional; again seems to break
patch #4.
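
For reference while reviewing, the length selection in the quoted
stack_protect_testdi cond can be mirrored in plain C.  This only restates
the numbers from the patch; the function name is made up for illustration
and nothing here is derived independently:

```c
#include <stdbool.h>

/* Hypothetical helper mirroring the (cond ...) length expression quoted
   above for stack_protect_testdi.  ALTERNATIVE is 0 for the xor. form
   and 1 for the cmpld form.  */
static int
sketch_testdi_length (bool op1_prefixed, bool op2_prefixed, int alternative)
{
  /* Both memory operands need prefixed loads.  */
  if (op1_prefixed && op2_prefixed)
    return alternative == 0 ? 28 : 32;

  /* Exactly one memory operand needs a prefixed load.  */
  if (op1_prefixed || op2_prefixed)
    return alternative == 0 ? 20 : 24;

  /* Neither operand is prefixed: the original "16,20" lengths.  */
  return alternative == 0 ? 16 : 20;
}
```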

Thanks,
Bill
> +
>  \f
>  ;; Here are the actual compare insns.
>  (define_insn "*cmp<mode>_signed"
> Index: gcc/config/rs6000/vsx.md
> ===================================================================
> --- gcc/config/rs6000/vsx.md	(revision 274173)
> +++ gcc/config/rs6000/vsx.md	(working copy)
> @@ -1149,10 +1149,30 @@
>                 "vecstore,  vecload,   vecsimple, mffgpr,    mftgpr,    load,
>                  store,     load,      store,     *,         vecsimple, vecsimple,
>                  vecsimple, *,         *,         vecstore,  vecload")
> -   (set_attr "length"
> -               "*,         *,         *,         8,         *,         8,
> -                8,         8,         8,         8,         *,         *,
> -                *,         20,        8,         *,         *")
> +   (set (attr "non_prefixed_length")
> +	(cond [(and (eq_attr "alternative" "4")		;; MTVSRDD
> +		    (match_test "TARGET_P9_VECTOR"))
> +	       (const_string "4")
> +
> +	       (eq_attr "alternative" "3,4")		;; GPR <-> VSX
> +	       (const_string "8")
> +
> +	       (eq_attr "alternative" "5,6,7,8")	;; GPR load/store
> +	       (const_string "8")]
> +	      (const_string "*")))
> +
> +   (set (attr "prefixed_length")
> +	(cond [(and (eq_attr "alternative" "4")		;; MTVSRDD
> +		    (match_test "TARGET_P9_VECTOR"))
> +	       (const_string "4")
> +
> +	       (eq_attr "alternative" "3,4")		;; GPR <-> VSX
> +	       (const_string "8")
> +
> +	       (eq_attr "alternative" "5,6,7,8")	;; GPR load/store
> +	       (const_string "20")]
> +	      (const_string "*")))
> +
>     (set_attr "isa"
>                 "<VSisa>,   <VSisa>,   <VSisa>,   *,         *,         *,
>                  *,         *,         *,         *,         p9v,       *,
> @@ -3199,7 +3219,12 @@
>  					   operands[3], <VSX_D:VS_scalar>mode);
>  }
>    [(set_attr "type" "fpload,load")
> -   (set_attr "length" "8")])
> +   (set (attr "prefixed")
> +	(if_then_else (match_operand 1 "prefixed_mem_operand")
> +		      (const_string "yes")
> +		      (const_string "no")))
> +   (set_attr "non_prefixed_length" "8")
> +   (set_attr "prefixed_length" "16")])
>
>  ;; Optimize storing a single scalar element that is the right location to
>  ;; memory
> @@ -3294,6 +3319,8 @@
>  }
>    [(set_attr "type" "fpload,fpload,fpload,load")
>     (set_attr "length" "8")
> +   (set_attr "non_prefixed_length" "8")
> +   (set_attr "prefixed_length" "16")
>     (set_attr "isa" "*,p7v,p9v,*")])
>
>  ;; Variable V4SF extract
>


* Re: PC-relative TLS support
  2019-08-15 19:47   ` Segher Boessenkool
@ 2019-08-16  4:09     ` Alan Modra
  2019-08-19 13:39       ` Segher Boessenkool
  0 siblings, 1 reply; 26+ messages in thread
From: Alan Modra @ 2019-08-16  4:09 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Thu, Aug 15, 2019 at 01:24:07PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Aug 15, 2019 at 01:35:10PM +0930, Alan Modra wrote:
> > Supporting TLS for -mpcrel turns out to be relatively simple, in part
> > due to deciding that !TARGET_TLS_MARKERS with -mpcrel is silly.  No
> > assembler that I know of supporting prefix insns lacks TLS marker
> > support.
> 
> Will this stay that way?  (Or do we not care, not now anyway?)

I'd say we leave that problem to someone who wants pcrel without tls
markers.  It's not hard to do, just extend rs6000_output_tlsargs and
adjust IS_NOMARK_TLSGETADDR length attribute expressions.

> > Also, at some point powerpc gcc ought to remove
> > !TARGET_TLS_MARKERS generally and simplify all the occurrences of
> > IS_NOMARK_TLSGETADDR in rs6000.md rather than complicating them.
> 
> The last time this came up (a year ago) the conclusion was that we first
> would have to remove AIX support.

Hmm, I wonder has that changed?  A quick look at the source says the
AIX TLS support uses completely different patterns and shouldn't care.

> (Changelog has whitespace damage, I guess that is just from how you
> mailed this?  Please fix when applying it).

Fixed.  (It wasn't the mailer..)

-- 
Alan Modra
Australia Development Lab, IBM


* Re: PC-relative TLS support
  2019-08-16  4:09     ` Alan Modra
@ 2019-08-19 13:39       ` Segher Boessenkool
  2019-08-21 13:34         ` Alan Modra
  0 siblings, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2019-08-19 13:39 UTC (permalink / raw)
  To: Alan Modra; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Fri, Aug 16, 2019 at 11:29:30AM +0930, Alan Modra wrote:
> On Thu, Aug 15, 2019 at 01:24:07PM -0500, Segher Boessenkool wrote:
> > On Thu, Aug 15, 2019 at 01:35:10PM +0930, Alan Modra wrote:
> > > Supporting TLS for -mpcrel turns out to be relatively simple, in part
> > > due to deciding that !TARGET_TLS_MARKERS with -mpcrel is silly.  No
> > > assembler that I know of supporting prefix insns lacks TLS marker
> > > support.
> > 
> > Will this stay that way?  (Or do we not care, not now anyway?)
> 
> I'd say we leave that problem to someone who wants pcrel without tls
> markers.  It's not hard to do, just extend rs6000_output_tlsargs and
> adjust IS_NOMARK_TLSGETADDR length attribute expressions.

Okay, so the latter option :-)

> > > Also, at some point powerpc gcc ought to remove
> > > !TARGET_TLS_MARKERS generally and simplify all the occurrences of
> > > IS_NOMARK_TLSGETADDR in rs6000.md rather than complicating them.
> > 
> > The last time this came up (a year ago) the conclusion was that we first
> > would have to remove AIX support.
> 
> Hmm, I wonder has that changed?  A quick look at the source says the
> AIX TLS support uses completely different patterns and shouldn't care.

https://gcc.gnu.org/ml/gcc-patches/2018-11/msg02259.html

But if you think we can remove the !TARGET_TLS_MARKERS everywhere it
is relevant at all, now is the time, patches very welcome, it would be
a nice cleanup :-)  Needs testing everywhere of course, but now is
stage 1 :-)


Segher


* Re: [PATCH], Patch #2 of 10, Add RTL prefixed attribute
  2019-08-14 22:11 ` [PATCH], Patch #2 of 10, Add RTL prefixed attribute Michael Meissner
@ 2019-08-19 19:15   ` Segher Boessenkool
  0 siblings, 0 replies; 26+ messages in thread
From: Segher Boessenkool @ 2019-08-19 19:15 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn, Alan Modra

Hi Mike,

Some comments on this patch:

On Wed, Aug 14, 2019 at 05:59:13PM -0400, Michael Meissner wrote:
> Due to some of the existing load and store insns not using the traditional
> operands[0] and operands[1], the functions that test whether an insn is
> prefixed only use the insn and not the operands directly.

Both the "update" and the "indexed" attributes have no problem with
this: the insns that have the problem set the attribute value directly.
This is mainly all the various update insns, but there are a bunch more,
and they all need different settings for their attributes.

> 	* config/rs6000/rs6000.c (rs6000_emit_move): Add support for
> 	loading up pc-relatve addresses.

(typo btw)

> +void
> +rs6000_final_prescan_insn (rtx_insn *insn)
> +{
> +  next_insn_prefixed_p = (get_attr_prefixed (insn) != PREFIXED_NO);
> +  return;
> +}

Don't say "return;" at the end of a function please.

> +void
> +rs6000_asm_output_opcode (FILE *stream, const char *)
> +{
> +  if (next_insn_prefixed_p)
> +    {
> +      next_insn_prefixed_p = false;
> +      fprintf (stream, "p");
> +    }

You don't need to clear the flag here; the next call to
rs6000_final_prescan_insn will.
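
A minimal stand-alone sketch of that point, assuming exactly one opcode is
output per insn and that the prescan hook always runs before the opcode
hook.  The mock_* names and struct mock_insn are invented for illustration
and are not GCC interfaces:

```c
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the file-scope flag in the quoted patch.  */
static bool mock_next_insn_prefixed_p;

/* Hypothetical insn: PREFIXED stands in for
   get_attr_prefixed (insn) != PREFIXED_NO.  */
struct mock_insn
{
  const char *mnemonic;
  bool prefixed;
};

/* Overwrites the flag for every insn, so no stale value can survive.  */
static void
mock_final_prescan_insn (const struct mock_insn *insn)
{
  mock_next_insn_prefixed_p = insn->prefixed;
}

/* Emits the optional "p" without resetting the flag afterwards.  */
static void
mock_asm_output_opcode (char *out, size_t size, const struct mock_insn *insn)
{
  snprintf (out, size, "%s%s",
	    mock_next_insn_prefixed_p ? "p" : "", insn->mnemonic);
}

/* One insn: prescan first, then the opcode.  */
static void
mock_emit (char *out, size_t size, const struct mock_insn *insn)
{
  mock_final_prescan_insn (insn);
  mock_asm_output_opcode (out, size, insn);
}
```

Emitting a prefixed insn followed by a non-prefixed one gives the right
mnemonic both times under those assumptions, because the second prescan
call rewrites the flag before it is read again.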

> +#define FINAL_PRESCAN_INSN(INSN, OPERANDS, NOPERANDS)			\
> +do									\
> +  {									\
> +    if (TARGET_PREFIXED_ADDR)						\
> +      rs6000_final_prescan_insn (INSN);					\
> +  }									\
> +while (0)

Either have the function only do what it needs to for prefixed, and call
it something with prefixed in the name, or put the TARGET_PREFIXED_ADDR
test in the function itself.
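
Roughly what the second option could look like, as a stand-alone sketch.
The guard_* names are placeholders, and guard_target_prefixed_addr stands
in for TARGET_PREFIXED_ADDR; none of this is the actual rs6000 code:

```c
#include <stdbool.h>

/* Hypothetical stand-in for TARGET_PREFIXED_ADDR.  */
static bool guard_target_prefixed_addr;

/* Stand-in for the flag set by the prescan hook.  */
static bool guard_next_insn_prefixed_p;

/* Hypothetical insn carrying only the "prefixed" attribute value.  */
struct guard_insn
{
  bool prefixed;
};

/* The target test lives in the function, so the macro stays trivial.  */
static void
guard_final_prescan_insn (const struct guard_insn *insn)
{
  if (!guard_target_prefixed_addr)
    return;

  guard_next_insn_prefixed_p = insn->prefixed;
}

/* The macro no longer needs a do/while wrapper or its own condition.  */
#define GUARD_FINAL_PRESCAN_INSN(INSN, OPERANDS, NOPERANDS) \
  guard_final_prescan_insn (INSN)
```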

> +;; Load up a pc-relative address.  ASM_OUTPUT_OPCODE will emit the initial "p".
> +(define_insn "*pcrel_addr"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=b*r")
> +	(match_operand:DI 1 "pcrel_address"))]
> +  "TARGET_PCREL"
> +  "la %0,%a1"
> +  [(set_attr "prefixed" "yes")])

(use P for addresses please)

> +;; Load up a pc-relative address to an external symbol.  If the symbol and the
> +;; program are both defined in the main program, the linker will optimize this
> +;; to a PADDI.  Otherwise, it will create a GOT address that is relocated by
> +;; the dynamic linker and loaded up.
> +(define_insn "*pcrel_ext_addr"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=b*r")
> +	(match_operand:DI 1 "pcrel_external_address"))]
> +  "TARGET_PCREL"
> +  "ld %0,%a1"
> +  [(set_attr "prefixed" "yes")])

pld does one more indirection than pla does, but this is not clear at all
from the RTL or from the predicate names.  All this is *before* the linker
has done its thing, so pcrel_external_address is really some GOT memory,
so it should have that in its name.
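
To make the extra indirection concrete, here is a plain C analogy.  It is
not how the compiler represents either operand; sketch_got_slot and the
function names are invented stand-ins for "an address" versus "memory that
holds an address":

```c
/* A symbol whose address is known directly (the pla case).  */
static long sketch_local_sym = 42;

/* A symbol reached only through a GOT entry (the pld case).  */
static long sketch_external_sym = 7;

/* Hypothetical GOT slot: memory that holds the symbol's address.  */
static long *sketch_got_slot = &sketch_external_sym;

/* pla-like: the pc-relative address itself is the result.  */
static long *
sketch_pla_like (void)
{
  return &sketch_local_sym;
}

/* pld-like: the pc-relative address names the GOT slot, and the result
   is whatever is stored there, i.e. one more memory access.  */
static long *
sketch_pld_like (long **slot)
{
  return *slot;
}
```

Both functions return a pointer to the symbol, but only the second one
reads memory to get it, which is the difference a name mentioning the GOT
would make visible.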


Segher


* Re: PC-relative TLS support
  2019-08-19 13:39       ` Segher Boessenkool
@ 2019-08-21 13:34         ` Alan Modra
  2019-11-11  7:40           ` Alan Modra
  2019-11-11 12:10           ` Segher Boessenkool
  0 siblings, 2 replies; 26+ messages in thread
From: Alan Modra @ 2019-08-21 13:34 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Mon, Aug 19, 2019 at 07:45:19AM -0500, Segher Boessenkool wrote:
> But if you think we can remove the !TARGET_TLS_MARKERS everywhere it
> is relevant at all, now is the time, patches very welcome, it would be
> a nice cleanup :-)  Needs testing everywhere of course, but now is
> stage 1 :-)

This patch removes !TARGET_TLS_MARKERS support.  -mtls-markers (and
-mno-tls-markers) disappear as valid options too, because I figure
they haven't been used too much except by people testing the
compiler.  Bootstrapped and regression tested powerpc64le-linux and
powerpc-ibm-aix7.1.3.0 (on gcc111).  I believe powerpc*-darwin doesn't
support TLS.

Requiring an 8 year old binutils-2.20 shouldn't be that onerous.

Note that this patch doesn't remove the configure test to set
HAVE_AS_TLS_MARKERS.  I was wondering whether I ought to hook that
into a "sorry, your assembler is too old" error?

	* config/rs6000/rs6000-protos.h (rs6000_output_tlsargs): Delete.
	* config/rs6000/rs6000.c (rs6000_output_tlsargs): Delete.
	(rs6000_legitimize_tls_address): Remove !TARGET_TLS_MARKERS code.
	(rs6000_call_template_1): Delete TARGET_TLS_MARKERS test and
	allow other UNSPECs besides UNSPEC_TLSGD and UNSPEC_TLSLD.
	(rs6000_indirect_call_template_1): Likewise.
	(rs6000_pltseq_template): Likewise.
	(rs6000_opt_vars): Remove "tls-markers" entry.
	* config/rs6000/rs6000.h (TARGET_TLS_MARKERS): Don't define.
	(IS_NOMARK_TLSGETADDR): Likewise.
	* config/rs6000/rs6000.md (tls_gd<bits>): Replace TARGET_TLS_MARKERS
	with !TARGET_XCOFF.
	(tls_gd_high<bits>, tls_gd_low<bits>): Likewise.
	(tls_ld<bits>, tls_ld_high<bits>, tls_ld_low<bits>): Likewise.
	(pltseq_plt_pcrel<mode>): Likewise.
	(call_value_local32): Remove IS_NOMARK_TLSGETADDR predicate test.
	(call_value_local64): Likewise.
	(call_value_indirect_nonlocal_sysv<mode>): Remove IS_NOMARK_TLSGETADDR
	output and length attribute sub-expression.
	(call_value_nonlocal_sysv<mode>),
	(call_value_nonlocal_sysv_secure<mode>),
	(call_value_local_aix<mode>, call_value_nonlocal_aix<mode>),
	(call_value_indirect_aix<mode>, call_value_indirect_elfv2<mode>),
	(call_value_indirect_pcrel<mode>): Likewise.
	* config/rs6000/rs6000.opt (mtls-markers): Delete.
	* doc/install.texi (powerpc-*-*): Require binutils-2.20.
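
As a stand-alone illustration of the rs6000_call_template_1 change listed
above, the marker selection after the patch behaves roughly like the
sketch below.  The enum values and function name are hypothetical (the
real UNSPEC_* codes come from rs6000.md); only the format strings are
taken from the patch:

```c
#include <stdio.h>

/* Hypothetical stand-ins for the UNSPEC codes.  */
enum sketch_unspec
{
  SKETCH_UNSPEC_TLSGD = 1,
  SKETCH_UNSPEC_TLSLD = 2,
  SKETCH_UNSPEC_OTHER = 3
};

/* Fill ARG (12 bytes, as in the patch) with the marker suffix for a
   __tls_get_addr call, or leave it empty for any other UNSPEC.  */
static void
sketch_tls_marker_arg (char arg[12], enum sketch_unspec unspec,
		       unsigned int funop)
{
  arg[0] = 0;
  if (unspec == SKETCH_UNSPEC_TLSGD)
    snprintf (arg, 12, "(%%%u@tlsgd)", funop + 1);
  else if (unspec == SKETCH_UNSPEC_TLSLD)
    snprintf (arg, 12, "(%%&@tlsld)");
  /* No gcc_unreachable () equivalent here: other UNSPECs get no marker.  */
}
```

With funop == 1 the TLSGD case yields "(%2@tlsgd)", the TLSLD case yields
"(%&@tlsld)", and any other UNSPEC leaves the string empty instead of
aborting.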

diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 06e40d94b17..88b5b7cec55 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -139,7 +139,6 @@ extern bool valid_sf_si_move (rtx, rtx, machine_mode);
 extern void rs6000_emit_move (rtx, rtx, machine_mode);
 extern bool rs6000_legitimate_offset_address_p (machine_mode, rtx,
 						bool, bool);
-extern void rs6000_output_tlsargs (rtx *);
 extern rtx rs6000_find_base_term (rtx);
 extern rtx rs6000_return_addr (int, rtx);
 extern void rs6000_output_symbol_ref (FILE*, rtx);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index e792116fb40..5e2b08c3c72 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -8329,41 +8329,6 @@ rs6000_legitimize_tls_address_aix (rtx addr, enum tls_model model)
   return dest;
 }
 
-/* Output arg setup instructions for a !TARGET_TLS_MARKERS
-   __tls_get_addr call.  */
-
-void
-rs6000_output_tlsargs (rtx *operands)
-{
-  /* Set up operands for output_asm_insn, without modifying OPERANDS.  */
-  rtx op[3];
-
-  /* The set dest of the call, ie. r3, which is also the first arg reg.  */
-  op[0] = operands[0];
-  /* The TLS symbol from global_tlsarg stashed as CALL operand 2.  */
-  op[1] = XVECEXP (operands[2], 0, 0);
-  if (XINT (operands[2], 1) == UNSPEC_TLSGD)
-    {
-      /* The GOT register.  */
-      op[2] = XVECEXP (operands[2], 0, 1);
-      if (TARGET_CMODEL != CMODEL_SMALL)
-	output_asm_insn ("addis %0,%2,%1@got@tlsgd@ha\n\t"
-			 "addi %0,%0,%1@got@tlsgd@l", op);
-      else
-	output_asm_insn ("addi %0,%2,%1@got@tlsgd", op);
-    }
-  else if (XINT (operands[2], 1) == UNSPEC_TLSLD)
-    {
-      if (TARGET_CMODEL != CMODEL_SMALL)
-	output_asm_insn ("addis %0,%1,%&@got@tlsld@ha\n\t"
-			 "addi %0,%0,%&@got@tlsld@l", op);
-      else
-	output_asm_insn ("addi %0,%1,%&@got@tlsld", op);
-    }
-  else
-    gcc_unreachable ();
-}
-
 /* Passes the tls arg value for global dynamic and local dynamic
    emit_library_call_value in rs6000_legitimize_tls_address to
    rs6000_call_aix and rs6000_call_sysv.  This is used to emit the
@@ -8465,16 +8430,10 @@ rs6000_legitimize_tls_address (rtx addr, enum tls_model model)
 	  rtx arg = gen_rtx_UNSPEC (Pmode, gen_rtvec (2, addr, got),
 				    UNSPEC_TLSGD);
 	  tga = rs6000_tls_get_addr ();
+	  rtx argreg = gen_rtx_REG (Pmode, 3);
+	  emit_insn (gen_rtx_SET (argreg, arg));
 	  global_tlsarg = arg;
-	  if (TARGET_TLS_MARKERS)
-	    {
-	      rtx argreg = gen_rtx_REG (Pmode, 3);
-	      emit_insn (gen_rtx_SET (argreg, arg));
-	      emit_library_call_value (tga, dest, LCT_CONST, Pmode,
-				       argreg, Pmode);
-	    }
-	  else
-	    emit_library_call_value (tga, dest, LCT_CONST, Pmode);
+	  emit_library_call_value (tga, dest, LCT_CONST, Pmode, argreg, Pmode);
 	  global_tlsarg = NULL_RTX;
 
 	  /* Make a note so that the result of this call can be CSEd.  */
@@ -8487,16 +8446,10 @@ rs6000_legitimize_tls_address (rtx addr, enum tls_model model)
 	  rtx arg = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, got), UNSPEC_TLSLD);
 	  tga = rs6000_tls_get_addr ();
 	  tmp1 = gen_reg_rtx (Pmode);
+	  rtx argreg = gen_rtx_REG (Pmode, 3);
+	  emit_insn (gen_rtx_SET (argreg, arg));
 	  global_tlsarg = arg;
-	  if (TARGET_TLS_MARKERS)
-	    {
-	      rtx argreg = gen_rtx_REG (Pmode, 3);
-	      emit_insn (gen_rtx_SET (argreg, arg));
-	      emit_library_call_value (tga, tmp1, LCT_CONST, Pmode,
-				       argreg, Pmode);
-	    }
-	  else
-	    emit_library_call_value (tga, tmp1, LCT_CONST, Pmode);
+	  emit_library_call_value (tga, tmp1, LCT_CONST, Pmode, argreg, Pmode);
 	  global_tlsarg = NULL_RTX;
 
 	  /* Make a note so that the result of this call can be CSEd.  */
@@ -13270,14 +13223,12 @@ rs6000_call_template_1 (rtx *operands, unsigned int funop, bool sibcall)
 
   char arg[12];
   arg[0] = 0;
-  if (TARGET_TLS_MARKERS && GET_CODE (operands[funop + 1]) == UNSPEC)
+  if (GET_CODE (operands[funop + 1]) == UNSPEC)
     {
       if (XINT (operands[funop + 1], 1) == UNSPEC_TLSGD)
 	sprintf (arg, "(%%%u@tlsgd)", funop + 1);
       else if (XINT (operands[funop + 1], 1) == UNSPEC_TLSLD)
 	sprintf (arg, "(%%&@tlsld)");
-      else
-	gcc_unreachable ();
     }
 
   /* The magic 32768 offset here corresponds to the offset of
@@ -13418,7 +13369,7 @@ rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop,
       const char *rel64 = TARGET_64BIT ? "64" : "";
       char tls[29];
       tls[0] = 0;
-      if (TARGET_TLS_MARKERS && GET_CODE (operands[funop + 1]) == UNSPEC)
+      if (GET_CODE (operands[funop + 1]) == UNSPEC)
 	{
 	  if (XINT (operands[funop + 1], 1) == UNSPEC_TLSGD)
 	    sprintf (tls, ".reloc .,R_PPC%s_TLSGD,%%%u\n\t",
@@ -13426,8 +13377,6 @@ rs6000_indirect_call_template_1 (rtx *operands, unsigned int funop,
 	  else if (XINT (operands[funop + 1], 1) == UNSPEC_TLSLD)
 	    sprintf (tls, ".reloc .,R_PPC%s_TLSLD,%%&\n\t",
 		     rel64);
-	  else
-	    gcc_unreachable ();
 	}
 
       const char *notoc = rs6000_pcrel_p (cfun) ? "_NOTOC" : "";
@@ -13514,7 +13463,7 @@ rs6000_pltseq_template (rtx *operands, int which)
   const char *rel64 = TARGET_64BIT ? "64" : "";
   char tls[30];
   tls[0] = 0;
-  if (TARGET_TLS_MARKERS && GET_CODE (operands[3]) == UNSPEC)
+  if (GET_CODE (operands[3]) == UNSPEC)
     {
       char off = which == RS6000_PLTSEQ_PLT_PCREL34 ? '8' : '4';
       if (XINT (operands[3], 1) == UNSPEC_TLSGD)
@@ -13523,8 +13472,6 @@ rs6000_pltseq_template (rtx *operands, int which)
       else if (XINT (operands[3], 1) == UNSPEC_TLSLD)
 	sprintf (tls, ".reloc .-%c,R_PPC%s_TLSLD,%%&\n\t",
 		 off, rel64);
-      else
-	gcc_unreachable ();
     }
 
   gcc_assert (DEFAULT_ABI == ABI_ELFv2 || DEFAULT_ABI == ABI_V4);
@@ -22866,9 +22813,6 @@ static struct rs6000_opt_var const rs6000_opt_vars[] =
   { "align-branch-targets",
     offsetof (struct gcc_options, x_TARGET_ALIGN_BRANCH_TARGETS),
     offsetof (struct cl_target_option, x_TARGET_ALIGN_BRANCH_TARGETS), },
-  { "tls-markers",
-    offsetof (struct gcc_options, x_tls_markers),
-    offsetof (struct cl_target_option, x_tls_markers), },
   { "sched-prolog",
     offsetof (struct gcc_options, x_TARGET_SCHED_PROLOG),
     offsetof (struct cl_target_option, x_TARGET_SCHED_PROLOG), },
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 9c11a3e4d46..b263213ad75 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -228,15 +228,6 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
 #define TARGET_MFCRF 0
 #endif
 
-/* Define TARGET_TLS_MARKERS if the target assembler does not support
-   arg markers for __tls_get_addr calls.  */
-#ifndef HAVE_AS_TLS_MARKERS
-#undef  TARGET_TLS_MARKERS
-#define TARGET_TLS_MARKERS 0
-#else
-#define TARGET_TLS_MARKERS tls_markers
-#endif
-
 #ifndef TARGET_SECURE_PLT
 #define TARGET_SECURE_PLT 0
 #endif
@@ -1513,13 +1504,6 @@ enum rs6000_pltseq_enum {
 #define IS_V4_FP_ARGS(OP) \
   ((INTVAL (OP) & (CALL_V4_CLEAR_FP_ARGS | CALL_V4_SET_FP_ARGS)) != 0)
 
-/* Whether OP is an UNSPEC used in !TARGET_TLS_MARKER calls.  */
-#define IS_NOMARK_TLSGETADDR(OP)		\
-  (!TARGET_TLS_MARKERS				\
-   && GET_CODE (OP) == UNSPEC			\
-   && (XINT (OP, 1) == UNSPEC_TLSGD		\
-       || XINT (OP, 1) == UNSPEC_TLSLD))
-
 /* We don't have prologue and epilogue functions to save/restore
    everything for most ABIs.  */
 #define WORLD_SAVE_P(INFO) 0
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 9a7a1da987f..b5b4bc1587e 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -9413,7 +9413,7 @@ (define_insn_and_split "*tls_gd<bits>"
 	(unspec:P [(match_operand:P 1 "rs6000_tls_symbol_ref" "")
 		   (match_operand:P 2 "gpc_reg_operand" "b")]
 		  UNSPEC_TLSGD))]
-  "HAVE_AS_TLS && TARGET_TLS_MARKERS"
+  "HAVE_AS_TLS && !TARGET_XCOFF"
   "addi %0,%2,%1@got@tlsgd"
   "&& TARGET_CMODEL != CMODEL_SMALL"
   [(set (match_dup 3)
@@ -9436,7 +9436,7 @@ (define_insn "*tls_gd_high<bits>"
        (unspec:P [(match_operand:P 1 "rs6000_tls_symbol_ref" "")
 		  (match_operand:P 2 "gpc_reg_operand" "b")]
 		 UNSPEC_TLSGD)))]
-  "HAVE_AS_TLS && TARGET_TLS_MARKERS && TARGET_CMODEL != CMODEL_SMALL"
+  "HAVE_AS_TLS && !TARGET_XCOFF && TARGET_CMODEL != CMODEL_SMALL"
   "addis %0,%2,%1@got@tlsgd@ha")
 
 (define_insn "*tls_gd_low<bits>"
@@ -9445,14 +9445,14 @@ (define_insn "*tls_gd_low<bits>"
        (unspec:P [(match_operand:P 2 "rs6000_tls_symbol_ref" "")
 		  (match_operand:P 3 "gpc_reg_operand" "b")]
 		 UNSPEC_TLSGD)))]
-  "HAVE_AS_TLS && TARGET_TLS_MARKERS && TARGET_CMODEL != CMODEL_SMALL"
+  "HAVE_AS_TLS && !TARGET_XCOFF && TARGET_CMODEL != CMODEL_SMALL"
   "addi %0,%1,%2@got@tlsgd@l")
 
 (define_insn_and_split "*tls_ld<bits>"
   [(set (match_operand:P 0 "gpc_reg_operand" "=b")
 	(unspec:P [(match_operand:P 1 "gpc_reg_operand" "b")]
 		  UNSPEC_TLSLD))]
-  "HAVE_AS_TLS && TARGET_TLS_MARKERS"
+  "HAVE_AS_TLS && !TARGET_XCOFF"
   "addi %0,%1,%&@got@tlsld"
   "&& TARGET_CMODEL != CMODEL_SMALL"
   [(set (match_dup 2)
@@ -9474,7 +9474,7 @@ (define_insn "*tls_ld_high<bits>"
      (high:P
        (unspec:P [(match_operand:P 1 "gpc_reg_operand" "b")]
 		 UNSPEC_TLSLD)))]
-  "HAVE_AS_TLS && TARGET_TLS_MARKERS && TARGET_CMODEL != CMODEL_SMALL"
+  "HAVE_AS_TLS && !TARGET_XCOFF && TARGET_CMODEL != CMODEL_SMALL"
   "addis %0,%1,%&@got@tlsld@ha")
 
 (define_insn "*tls_ld_low<bits>"
@@ -9482,7 +9482,7 @@ (define_insn "*tls_ld_low<bits>"
      (lo_sum:P (match_operand:P 1 "gpc_reg_operand" "b")
        (unspec:P [(match_operand:P 2 "gpc_reg_operand" "b")]
 		 UNSPEC_TLSLD)))]
-  "HAVE_AS_TLS && TARGET_TLS_MARKERS && TARGET_CMODEL != CMODEL_SMALL"
+  "HAVE_AS_TLS && !TARGET_XCOFF && TARGET_CMODEL != CMODEL_SMALL"
   "addi %0,%1,%&@got@tlsld@l")
 
 (define_insn "tls_dtprel_<bits>"
@@ -10193,7 +10193,7 @@ (define_insn "*pltseq_plt_pcrel<mode>"
 		   (match_operand:P 2 "symbol_ref_operand" "s")
 		   (match_operand:P 3 "" "")]
 		  UNSPEC_PLT_PCREL))]
-  "HAVE_AS_PLTSEQ && TARGET_TLS_MARKERS
+  "HAVE_AS_PLTSEQ && !TARGET_XCOFF
    && rs6000_pcrel_p (cfun)"
 {
   return rs6000_pltseq_template (operands, RS6000_PLTSEQ_PLT_PCREL34);
@@ -10308,8 +10308,7 @@ (define_insn "*call_value_local32"
 	      (match_operand 2)))
    (use (match_operand:SI 3 "immediate_operand" "O,n"))
    (clobber (reg:SI LR_REGNO))]
-  "(INTVAL (operands[3]) & CALL_LONG) == 0
-   && !IS_NOMARK_TLSGETADDR (operands[2])"
+  "(INTVAL (operands[3]) & CALL_LONG) == 0"
 {
   if (INTVAL (operands[3]) & CALL_V4_SET_FP_ARGS)
     output_asm_insn ("crxor 6,6,6", operands);
@@ -10329,8 +10328,7 @@ (define_insn "*call_value_local64"
 	      (match_operand 2)))
    (use (match_operand:SI 3 "immediate_operand" "O,n"))
    (clobber (reg:DI LR_REGNO))]
-  "TARGET_64BIT && (INTVAL (operands[3]) & CALL_LONG) == 0
-   && !IS_NOMARK_TLSGETADDR (operands[2])"
+  "TARGET_64BIT && (INTVAL (operands[3]) & CALL_LONG) == 0"
 {
   if (INTVAL (operands[3]) & CALL_V4_SET_FP_ARGS)
     output_asm_insn ("crxor 6,6,6", operands);
@@ -10428,10 +10426,7 @@ (define_insn "*call_value_indirect_nonlocal_sysv<mode>"
   "DEFAULT_ABI == ABI_V4
    || DEFAULT_ABI == ABI_DARWIN"
 {
-  if (IS_NOMARK_TLSGETADDR (operands[2]))
-    rs6000_output_tlsargs (operands);
-
-  else if (INTVAL (operands[3]) & CALL_V4_SET_FP_ARGS)
+  if (INTVAL (operands[3]) & CALL_V4_SET_FP_ARGS)
     output_asm_insn ("crxor 6,6,6", operands);
 
   else if (INTVAL (operands[3]) & CALL_V4_CLEAR_FP_ARGS)
@@ -10442,8 +10437,7 @@ (define_insn "*call_value_indirect_nonlocal_sysv<mode>"
   [(set_attr "type" "jmpreg")
    (set (attr "length")
 	(plus
-	  (if_then_else (ior (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
-			     (match_test "IS_V4_FP_ARGS (operands[3])"))
+	  (if_then_else (match_test "IS_V4_FP_ARGS (operands[3])")
 	    (const_int 4)
 	    (const_int 0))
 	  (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
@@ -10461,10 +10455,7 @@ (define_insn "*call_value_nonlocal_sysv<mode>"
     || (DEFAULT_ABI == ABI_V4
 	&& (INTVAL (operands[3]) & CALL_LONG) == 0))"
 {
-  if (IS_NOMARK_TLSGETADDR (operands[2]))
-    rs6000_output_tlsargs (operands);
-
-  else if (INTVAL (operands[3]) & CALL_V4_SET_FP_ARGS)
+  if (INTVAL (operands[3]) & CALL_V4_SET_FP_ARGS)
     output_asm_insn ("crxor 6,6,6", operands);
 
   else if (INTVAL (operands[3]) & CALL_V4_CLEAR_FP_ARGS)
@@ -10474,8 +10465,7 @@ (define_insn "*call_value_nonlocal_sysv<mode>"
 }
   [(set_attr "type" "branch")
    (set (attr "length")
-	(if_then_else (ior (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
-			   (match_test "IS_V4_FP_ARGS (operands[3])"))
+	(if_then_else (match_test "IS_V4_FP_ARGS (operands[3])")
 	  (const_int 8)
 	  (const_int 4)))])
 
@@ -10490,10 +10480,7 @@ (define_insn "*call_value_nonlocal_sysv_secure<mode>"
     && TARGET_SECURE_PLT && flag_pic && !SYMBOL_REF_LOCAL_P (operands[1])
     && (INTVAL (operands[3]) & CALL_LONG) == 0)"
 {
-  if (IS_NOMARK_TLSGETADDR (operands[2]))
-    rs6000_output_tlsargs (operands);
-
-  else if (INTVAL (operands[3]) & CALL_V4_SET_FP_ARGS)
+  if (INTVAL (operands[3]) & CALL_V4_SET_FP_ARGS)
     output_asm_insn ("crxor 6,6,6", operands);
 
   else if (INTVAL (operands[3]) & CALL_V4_CLEAR_FP_ARGS)
@@ -10503,8 +10490,7 @@ (define_insn "*call_value_nonlocal_sysv_secure<mode>"
 }
   [(set_attr "type" "branch")
    (set (attr "length")
-	(if_then_else (ior (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
-			   (match_test "IS_V4_FP_ARGS (operands[3])"))
+	(if_then_else (match_test "IS_V4_FP_ARGS (operands[3])")
 	  (const_int 8)
 	  (const_int 4)))])
 
@@ -10527,8 +10513,7 @@ (define_insn "*call_value_local_aix<mode>"
 	(call (mem:SI (match_operand:P 1 "current_file_function_operand" "s"))
 	      (match_operand 2)))
    (clobber (reg:P LR_REGNO))]
-  "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
-   && !IS_NOMARK_TLSGETADDR (operands[2])"
+  "DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2"
 {
   if (rs6000_pcrel_p (cfun))
     return "bl %z1@notoc";
@@ -10560,21 +10545,13 @@ (define_insn "*call_value_nonlocal_aix<mode>"
    (clobber (reg:P LR_REGNO))]
   "DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2"
 {
-  if (IS_NOMARK_TLSGETADDR (operands[2]))
-    rs6000_output_tlsargs (operands);
-
   return rs6000_call_template (operands, 1);
 }
   [(set_attr "type" "branch")
    (set (attr "length")
-	(plus (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
-		(if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
-		  (const_int 8)
-		  (const_int 4))
-		(const_int 0))
-	      (if_then_else (match_test "rs6000_pcrel_p (cfun)")
-		(const_int 4)
-		(const_int 8))))])
+	(if_then_else (match_test "rs6000_pcrel_p (cfun)")
+	    (const_int 4)
+	    (const_int 8)))])
 
 ;; Call to indirect functions with the AIX abi using a 3 word descriptor.
 ;; Operand0 is the addresss of the function to call
@@ -10609,23 +10586,14 @@ (define_insn "*call_value_indirect_aix<mode>"
    (clobber (reg:P LR_REGNO))]
   "DEFAULT_ABI == ABI_AIX"
 {
-  if (IS_NOMARK_TLSGETADDR (operands[2]))
-    rs6000_output_tlsargs (operands);
-
   return rs6000_indirect_call_template (operands, 1);
 }
   [(set_attr "type" "jmpreg")
    (set (attr "length")
-	(plus
-	  (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
-	    (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
-	      (const_int 8)
-	      (const_int 4))
-	    (const_int 0))
-	  (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
-			     (match_test "which_alternative != 1"))
+	(if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
+			   (match_test "which_alternative != 1"))
 	    (const_string "16")
-	    (const_string "12"))))])
+	    (const_string "12")))])
 
 ;; Call to indirect functions with the ELFv2 ABI.
 ;; Operand0 is the addresss of the function to call
@@ -10672,23 +10640,14 @@ (define_insn "*call_value_indirect_elfv2<mode>"
    (clobber (reg:P LR_REGNO))]
   "DEFAULT_ABI == ABI_ELFv2"
 {
-  if (IS_NOMARK_TLSGETADDR (operands[2]))
-    rs6000_output_tlsargs (operands);
-
   return rs6000_indirect_call_template (operands, 1);
 }
   [(set_attr "type" "jmpreg")
    (set (attr "length")
-	(plus
-	  (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
-	    (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
-	      (const_int 8)
-	      (const_int 4))
-	    (const_int 0))
-	  (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
-			     (match_test "which_alternative != 1"))
+	(if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
+			   (match_test "which_alternative != 1"))
 	    (const_string "12")
-	    (const_string "8"))))])
+	    (const_string "8")))])
 
 (define_insn "*call_value_indirect_pcrel<mode>"
   [(set (match_operand 0 "" "")
@@ -10697,23 +10656,14 @@ (define_insn "*call_value_indirect_pcrel<mode>"
    (clobber (reg:P LR_REGNO))]
   "rs6000_pcrel_p (cfun)"
 {
-  if (IS_NOMARK_TLSGETADDR (operands[2]))
-    rs6000_output_tlsargs (operands);
-
   return rs6000_indirect_call_template (operands, 1);
 }
   [(set_attr "type" "jmpreg")
    (set (attr "length")
-	(plus
-	  (if_then_else (match_test "IS_NOMARK_TLSGETADDR (operands[2])")
-	    (if_then_else (match_test "TARGET_CMODEL != CMODEL_SMALL")
-	      (const_int 8)
-	      (const_int 4))
-	    (const_int 0))
-	  (if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
-			     (match_test "which_alternative != 1"))
+	(if_then_else (and (match_test "!rs6000_speculate_indirect_jumps")
+			   (match_test "which_alternative != 1"))
 	    (const_string "8")
-	    (const_string "4"))))])
+	    (const_string "4")))])
 
 ;; Call subroutine returning any type.
 (define_expand "untyped_call"
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 1b69507cfa8..29803b753eb 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -246,10 +246,6 @@ mavoid-indexed-addresses
 Target Report Var(TARGET_AVOID_XFORM) Init(-1) Save
 Avoid generation of indexed load/store instructions when possible.
 
-mtls-markers
-Target Report Var(tls_markers) Init(1) Save
-Mark __tls_get_addr calls with argument info.
-
 msched-epilog
 Target Undocumented Var(TARGET_SCHED_PROLOG) Init(1) Save
 
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index df6fefd72b9..feeda9fcc0e 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -4354,7 +4354,7 @@ The OpenRISC 1000 32-bit processor with delay slots.
 You can specify a default version for the @option{-mcpu=@var{cpu_type}}
 switch by using the configure option @option{--with-cpu-@var{cpu_type}}.
 
-You will need GNU binutils 2.15 or newer.
+You will need GNU binutils 2.20 or newer.
 
 @html
 <hr />

-- 
Alan Modra
Australia Development Lab, IBM


* Re: PC-relative TLS support
  2019-08-21 13:34         ` Alan Modra
@ 2019-11-11  7:40           ` Alan Modra
  2019-11-11 11:45             ` Segher Boessenkool
  2019-11-11 12:10           ` Segher Boessenkool
  1 sibling, 1 reply; 26+ messages in thread
From: Alan Modra @ 2019-11-11  7:40 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches

On Wed, Aug 21, 2019 at 09:55:28PM +0930, Alan Modra wrote:
> On Mon, Aug 19, 2019 at 07:45:19AM -0500, Segher Boessenkool wrote:
> > But if you think we can remove the !TARGET_TLS_MARKERS everywhere it
> > is relevant at all, now is the time, patches very welcome, it would be
> > a nice cleanup :-)  Needs testing everywhere of course, but now is
> > stage 1 :-)
> 
> This patch removes !TARGET_TLS_MARKERS support.  -mtls-markers (and
> -mno-tls-markers) disappear as valid options too, because I figure
> they haven't been used too much except by people testing the
> compiler.  Bootstrapped and regression tested powerpc64le-linux and
> powerpc-ibm-aix7.1.3.0 (on gcc111).  I believe powerpc*-darwin doesn't
> support TLS.
> 
> Requiring an 8 year old binutils-2.20 shouldn't be that onerous.
> 
> Note that this patch doesn't remove the configure test to set
> HAVE_AS_TLS_MARKERS.  I was wondering whether I ought to hook that
> into a "sorry, your assembler is too old" error?

https://gcc.gnu.org/ml/gcc-patches/2019-08/msg01487.html

I should have pinged this before now, and really I think the following
additional patch makes more sense than any sort of sorry message.
Mostly people will be running the assembler anyway so will discover
quickly that their assembler is too old.

	* configure.ac (HAVE_AS_TLS_MARKERS): Delete test.
	* configure: Regenerate.
	* config.in: Regenerate.

diff --git a/gcc/configure.ac b/gcc/configure.ac
index 5f32fd4d5e4..44d816630e9 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -4811,12 +4811,6 @@ LCF0:
       [AC_DEFINE(HAVE_AS_GNU_ATTRIBUTE, 1,
 	  [Define if your assembler supports .gnu_attribute.])])
 
-    gcc_GAS_CHECK_FEATURE([tls marker support],
-      gcc_cv_as_powerpc_tls_markers, [2,20,0],,
-      [ bl __tls_get_addr(x@tlsgd)],,
-      [AC_DEFINE(HAVE_AS_TLS_MARKERS, 1,
-	  [Define if your assembler supports arg info for __tls_get_addr.])])
-
     gcc_GAS_CHECK_FEATURE([prologue entry point marker support],
       gcc_cv_as_powerpc_entry_markers, [2,26,0],-a64 --fatal-warnings,
       [ .reloc .,R_PPC64_ENTRY; nop],,

> 	* config/rs6000/rs6000-protos.h (rs6000_output_tlsargs): Delete.
> 	* config/rs6000/rs6000.c (rs6000_output_tlsargs): Delete.
> 	(rs6000_legitimize_tls_address): Remove !TARGET_TLS_MARKERS code.
> 	(rs6000_call_template_1): Delete TARGET_TLS_MARKERS test and
> 	allow other UNSPECs besides UNSPEC_TLSGD and UNSPEC_TLSLD.
> 	(rs6000_indirect_call_template_1): Likewise.
> 	(rs6000_pltseq_template): Likewise.
> 	(rs6000_opt_vars): Remove "tls-markers" entry.
> 	* config/rs6000/rs6000.h (TARGET_TLS_MARKERS): Don't define.
> 	(IS_NOMARK_TLSGETADDR): Likewise.
> 	* config/rs6000/rs6000.md (tls_gd<bits>): Replace TARGET_TLS_MARKERS
> 	with !TARGET_XCOFF.
> 	(tls_gd_high<bits>, tls_gd_low<bits>): Likewise.
> 	(tls_ld<bits>, tls_ld_high<bits>, tls_ld_low<bits>): Likewise.
> 	(pltseq_plt_pcrel<mode>): Likewise.
> 	(call_value_local32): Remove IS_NOMARK_TLSGETADDR predicate test.
> 	(call_value_local64): Likewise.
> 	(call_value_indirect_nonlocal_sysv<mode>): Remove IS_NOMARK_TLSGETADDR
> 	output and length attribute sub-expression.
> 	(call_value_nonlocal_sysv<mode>),
> 	(call_value_nonlocal_sysv_secure<mode>),
> 	(call_value_local_aix<mode>, call_value_nonlocal_aix<mode>),
> 	(call_value_indirect_aix<mode>, call_value_indirect_elfv2<mode>),
> 	(call_value_indirect_pcrel<mode>): Likewise.
> 	* config/rs6000/rs6000.opt (mtls-markers): Delete.
> 	* doc/install.texi (powerpc-*-*): Require binutils-2.20.

-- 
Alan Modra
Australia Development Lab, IBM


* Re: PC-relative TLS support
  2019-11-11  7:40           ` Alan Modra
@ 2019-11-11 11:45             ` Segher Boessenkool
  0 siblings, 0 replies; 26+ messages in thread
From: Segher Boessenkool @ 2019-11-11 11:45 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

Hi Alan,

On Mon, Nov 11, 2019 at 05:46:01PM +1030, Alan Modra wrote:
> On Wed, Aug 21, 2019 at 09:55:28PM +0930, Alan Modra wrote:
> > On Mon, Aug 19, 2019 at 07:45:19AM -0500, Segher Boessenkool wrote:
> > > But if you think we can remove the !TARGET_TLS_MARKERS everywhere it
> > > is relevant at all, now is the time, patches very welcome, it would be
> > > a nice cleanup :-)  Needs testing everywhere of course, but now is
> > > stage 1 :-)
> > 
> > This patch removes !TARGET_TLS_MARKERS support.  -mtls-markers (and
> > -mno-tls-markers) disappear as valid options too, because I figure
> > they haven't been used too much except by people testing the
> > compiler.  Bootstrapped and regression tested powerpc64le-linux and
> > powerpc-ibm-aix7.1.3.0 (on gcc111).  I believe powerpc*-darwin doesn't
> > support TLS.

Excellent :-)

> > Requiring an 8 year old binutils-2.20 shouldn't be that onerous.

Right.  Your binutils should be about the same vintage as your GCC, or
various things will not work anyway.  And building binutils is easy and
cheap anyway, compared to building GCC.

> I should have pinged this before now,

And I shouldn't have dropped it :-)

> and really I think the following
> additional patch makes more sense than any sort of sorry message.
> Mostly people will be running the assembler anyway so will discover
> quickly that their assembler is too old.

Yeah, that's fine.  Almost no one would ever hit that anyway.

I'll reply to the rest in a reply to the original mail.


Segher


* Re: PC-relative TLS support
  2019-08-21 13:34         ` Alan Modra
  2019-11-11  7:40           ` Alan Modra
@ 2019-11-11 12:10           ` Segher Boessenkool
  2019-11-11 13:36             ` Alan Modra
  1 sibling, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2019-11-11 12:10 UTC (permalink / raw)
  To: Alan Modra; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Wed, Aug 21, 2019 at 09:55:28PM +0930, Alan Modra wrote:
> This patch removes !TARGET_TLS_MARKERS support.  -mtls-markers (and
> -mno-tls-markers) disappear as valid options too, because I figure
> they haven't been used too much except by people testing the
> compiler.

Okay.

> 	(rs6000_call_template_1): Delete TARGET_TLS_MARKERS test and
> 	allow other UNSPECs besides UNSPEC_TLSGD and UNSPEC_TLSLD.

Why is that?  Should we allow the other code that can happen and keep
the gcc_unreachable?  Or do we know that no other code can happen here
ever, and the extra documentation isn't useful?

> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -9413,7 +9413,7 @@ (define_insn_and_split "*tls_gd<bits>"
>  	(unspec:P [(match_operand:P 1 "rs6000_tls_symbol_ref" "")
>  		   (match_operand:P 2 "gpc_reg_operand" "b")]
>  		  UNSPEC_TLSGD))]
> -  "HAVE_AS_TLS && TARGET_TLS_MARKERS"
> +  "HAVE_AS_TLS && !TARGET_XCOFF"

Should that be TARGET_ELF instead?

Okay for trunk with those two things looked at.  Thanks!


Segher


* Re: PC-relative TLS support
  2019-11-11 12:10           ` Segher Boessenkool
@ 2019-11-11 13:36             ` Alan Modra
  0 siblings, 0 replies; 26+ messages in thread
From: Alan Modra @ 2019-11-11 13:36 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Mon, Nov 11, 2019 at 05:56:47AM -0600, Segher Boessenkool wrote:
> On Wed, Aug 21, 2019 at 09:55:28PM +0930, Alan Modra wrote:
> > This patch removes !TARGET_TLS_MARKERS support.  -mtls-markers (and
> > -mno-tls-markers) disappear as valid options too, because I figure
> > they haven't been used too much except by people testing the
> > compiler.
> 
> Okay.
> 
> > 	(rs6000_call_template_1): Delete TARGET_TLS_MARKERS test and
> > 	allow other UNSPECs besides UNSPEC_TLSGD and UNSPEC_TLSLD.
> 
> Why is that?  Should we allow the other code that can happen and keep
> the gcc_unreachable?  Or do we know that no other code can happen here
> ever, and the extra documentation isn't useful?

The code in question is just printing the @tlsgd or @tlsld arg.  I
don't see any point in asserting that no other UNSPEC could ever be
used in a call operand.  Other places dealing with UNSPEC_TLSGD
and UNSPEC_TLSLD don't check, and if another UNSPEC is invented for
some fancy future call insn it's quite unlikely to want to output
anything here.

(I don't think I found such an UNSPEC already extant..)
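
To illustrate the point above, here is a minimal standalone sketch in C.
It is not the rs6000.c code: the enum, the function name tls_arg_suffix
and the buffer interface are hypothetical, and the symbol is passed in
directly for both cases.  It only shows the shape being argued for:
print the marker for the two TLS UNSPECs, and print nothing (rather than
abort) for any other code.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-ins for the UNSPEC codes; the values are arbitrary
   and do not match the real rs6000.md constants.  */
enum unspec_code { UNSPEC_TLSGD, UNSPEC_TLSLD, UNSPEC_OTHER };

/* Write the argument marker that follows "__tls_get_addr" into BUF.
   Codes other than TLSGD/TLSLD yield an empty string; there is no
   assertion that such codes cannot occur.  */
static const char *
tls_arg_suffix (enum unspec_code code, const char *sym,
		char *buf, size_t len)
{
  buf[0] = '\0';
  if (code == UNSPEC_TLSGD)
    snprintf (buf, len, "(%s@tlsgd)", sym);
  else if (code == UNSPEC_TLSLD)
    snprintf (buf, len, "(%s@tlsld)", sym);
  return buf;
}
```

With this shape, a call operand carrying some future UNSPEC simply gets
no suffix, which matches the expectation that such an insn would not
want anything printed here.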

> > --- a/gcc/config/rs6000/rs6000.md
> > +++ b/gcc/config/rs6000/rs6000.md
> > @@ -9413,7 +9413,7 @@ (define_insn_and_split "*tls_gd<bits>"
> >  	(unspec:P [(match_operand:P 1 "rs6000_tls_symbol_ref" "")
> >  		   (match_operand:P 2 "gpc_reg_operand" "b")]
> >  		  UNSPEC_TLSGD))]
> > -  "HAVE_AS_TLS && TARGET_TLS_MARKERS"
> > +  "HAVE_AS_TLS && !TARGET_XCOFF"
> 
> Should that be TARGET_ELF instead?

Either should work.  So, yes, probably better with TARGET_ELF.
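
As a rough illustration of why either spelling should work, the sketch
below models the two conditions in plain C.  Everything in it is an
assumption made for the example: the obj_format enum, the target struct
and the two helper functions are hypothetical and are not GCC macros.
It assumes a third object format exists where neither the ELF nor the
XCOFF test is true, and that such a target has no assembler TLS support
(in line with the earlier remark that powerpc*-darwin is believed not to
support TLS).

```c
#include <stdbool.h>

/* Hypothetical object-format tags, for illustration only.  */
enum obj_format { FMT_ELF, FMT_XCOFF, FMT_MACHO };

/* Hypothetical model of a target: its object format and whether the
   assembler supports TLS (standing in for HAVE_AS_TLS).  */
struct target
{
  enum obj_format fmt;
  bool have_as_tls;
};

/* Models "HAVE_AS_TLS && !TARGET_XCOFF".  */
static bool
cond_not_xcoff (const struct target *t)
{
  return t->have_as_tls && t->fmt != FMT_XCOFF;
}

/* Models "HAVE_AS_TLS && TARGET_ELF".  */
static bool
cond_elf (const struct target *t)
{
  return t->have_as_tls && t->fmt == FMT_ELF;
}
```

Under these assumptions the two functions agree for every target that
has assembler TLS support and is either ELF or XCOFF.  They would only
differ on a non-ELF, non-XCOFF target with TLS support, which is the
case TARGET_ELF rules out explicitly.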

-- 
Alan Modra
Australia Development Lab, IBM


Thread overview: 26+ messages
2019-08-14 21:36 PowerPC 'future' patches introduction Michael Meissner
2019-08-14 21:37 ` [PATCH], Patch #1 of 10, Add instruction format enumeration Michael Meissner
2019-08-14 22:11 ` [PATCH], Patch #2 of 10, Add RTL prefixed attribute Michael Meissner
2019-08-19 19:15   ` Segher Boessenkool
2019-08-14 22:12 ` [PATCH], Patch #3 of 10, Add prefixed addressing support Michael Meissner
2019-08-16  1:59   ` Bill Schmidt
2019-08-14 22:15 ` [PATCH], Patch #4 of 10, Adjust costs based on insn sizes Michael Meissner
2019-08-14 22:23 ` [PATCH], Patch #5 of 10, Make -mpcrel default for -mcpu=future Michael Meissner
2019-08-14 23:10 ` [PATCH], Patch #6 of 10, Add 'future' support to function attributes Michael Meissner
2019-08-14 23:13 ` [PATCH], Patch #7 of 10, Add support for PCREL_OPT Michael Meissner
2019-08-14 23:16 ` [PATCH], Patch #8 of 10, Miscellaneous future tests Michael Meissner
2019-08-14 23:17 ` [PATCH], Patch #9 of 10, Add tests with large memory offsets Michael Meissner
2019-08-15  3:48 ` [PATCH], Patch #10 of 10, Add pc-relative tests Michael Meissner
2019-08-15  4:05 ` PowerPC 'future' patches introduction Segher Boessenkool
2019-08-15  8:10 ` PC-relative TLS support Alan Modra
2019-08-15 19:47   ` Segher Boessenkool
2019-08-16  4:09     ` Alan Modra
2019-08-19 13:39       ` Segher Boessenkool
2019-08-21 13:34         ` Alan Modra
2019-11-11  7:40           ` Alan Modra
2019-11-11 11:45             ` Segher Boessenkool
2019-11-11 12:10           ` Segher Boessenkool
2019-11-11 13:36             ` Alan Modra
2019-08-15 21:35 ` [PATCH], Patch #1 replacement (fix issues with future TLS patches) Michael Meissner
2019-08-16  0:25   ` Segher Boessenkool
2019-08-16  0:42   ` Bill Schmidt
