public inbox for gcc-patches@gcc.gnu.org
* PowerPC future machine patches, version 4
@ 2019-09-18 23:42 Michael Meissner
  2019-09-18 23:49 ` [PATCH] V4, patch #1: Rework prefixed/pc-relative lookup Michael Meissner
                   ` (21 more replies)
  0 siblings, 22 replies; 37+ messages in thread
From: Michael Meissner @ 2019-09-18 23:42 UTC (permalink / raw)
  To: gcc-patches, segher, dje.gcc, meissner

This is a reworking of the patches that I posted as V3 at the end of August.

Unlike the last set of patches, I do not use the address mask bits in reg_addr,
but instead, I have a separate function that takes an address and decodes it
into the various different flavors (single register address, D-form 16-bit
address, X-form indexed address, numeric 34-bit offset, local pc-relative
address, etc.).  The caller then decides whether the address matches what they
are looking for.

I have two enumerations that I added to this series:

    1)	insn_form: This is the address format (D, DS, DQ, X, etc.);

    2)	non_prefixed: This is a limited enum that just describes the format of
	the non-prefixed instruction to decide if an address needs to be
	prefixed or not.

Originally, I was trying to re-use the same insn_form enumeration for both the
output and the input to say what the traditional instruction uses, but I
ultimately separated them to make it clearer.

As I said to you at the Cauldron, when I replaced some of the predicates, I put
them in a different location in predicates.md, so that it would be clear that
the old version was completely eliminated and replaced with a new version.

I also removed the two boolean arguments for the pc-relative matching, and
instead address_to_insn_form just returns different values (34-bit numeric
offset, 34-bit pc-relative reference to a local symbol, 34-bit pc-relative
reference to an external symbol, etc.).
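
As a rough illustration of why the local/external distinction matters to the
caller, here is a minimal standalone sketch (not code from the patch) of how
the two pc-relative flavors end up with different operand suffixes, following
the print_operand_address change in patch #1.  The names sketch_pcrel and
format_pcrel_operand are hypothetical, and the real code works on RTL rather
than on strings.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the two pc-relative insn_form values.  */
enum sketch_pcrel { SKETCH_PCREL_LOCAL, SKETCH_PCREL_EXTERNAL };

/* Format "symbol[+offset][@got]@pcrel" into BUF.  External symbols get the
   extra "@got" because their address has to be loaded from the GOT.
   Returns the value of snprintf.  */
int
format_pcrel_operand (char *buf, size_t size, const char *symbol,
		      int64_t offset, enum sketch_pcrel kind)
{
  char off[32] = "";

  assert (buf != NULL && symbol != NULL);

  /* Only print a non-zero offset, always with an explicit sign.  */
  if (offset != 0)
    snprintf (off, sizeof off, "%+" PRId64, offset);

  return snprintf (buf, size, "%s%s%s@pcrel", symbol, off,
		   kind == SKETCH_PCREL_EXTERNAL ? "@got" : "");
}
```

With these assumptions a local symbol "foo" with offset 8 is formatted as
"foo+8@pcrel", while an external symbol "bar" is formatted as "bar@got@pcrel".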

I did collapse the fix for vector extracts into the patches that enable general
prefixed addressing, so we don't have a possibility of bad code being
generated.

Right now, I'm not going to add the PCREL_OPT patches to this set, but I will
do it later, if these patches get applied.  I will rework it to meet the
comments you raised.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V4, patch #1: Rework prefixed/pc-relative lookup
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
@ 2019-09-18 23:49 ` Michael Meissner
  2019-09-21  1:29   ` Segher Boessenkool
  2019-09-18 23:56 ` [PATCH], V4, patch #2: Add prefixed insn attribute Michael Meissner
                   ` (20 subsequent siblings)
  21 siblings, 1 reply; 37+ messages in thread
From: Michael Meissner @ 2019-09-18 23:49 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch reworks the prefixed and pc-relative memory matching functions.

As I said in the intro message, I do not re-use the address mask bits in
reg_addr, but instead, I have a separate function that takes an address and
decodes it into the various different flavors (single register address, D-form
16-bit address, X-form indexed address, numeric 34-bit offset, local
pc-relative address, etc.).  The caller then decides whether the address
matches what they are looking for.

I have two enumerations that I added to this series:

    1)	insn_form: This is the address format (D, DS, DQ, X, etc.);

    2)	non_prefixed: This is a limited enum that just describes the format of
	the non-prefixed instruction to decide if an address needs to be
	prefixed or not.

Originally, I was trying to re-use the same insn_form enumeration for both the
output and the input to say what the traditional instruction uses, but I
ultimately separated them to make it clearer.
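
To make the interaction between the two enumerations concrete, here is a
minimal standalone sketch of just the numeric-offset part of the decoding.  It
is a simplification under stated assumptions, not the patch itself: the names
sketch_form, sketch_non_prefixed, fits_signed and classify_offset are
hypothetical, the base register is assumed to be valid already, and the
pc-relative, indexed and update forms are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the offset-related insn_form values.  */
enum sketch_form
{
  FORM_BAD, FORM_D, FORM_DS, FORM_DQ, FORM_PREFIXED_NUMERIC
};

/* Simplified stand-in for the non_prefixed enumeration.  */
enum sketch_non_prefixed { NP_D, NP_DS, NP_DQ };

/* Return true if VALUE fits in a signed field that is BITS bits wide.  */
bool
fits_signed (int64_t value, int bits)
{
  assert (bits > 0 && bits < 64);
  int64_t limit = (int64_t) 1 << (bits - 1);
  return value >= -limit && value < limit;
}

/* Classify a "base register + OFFSET" address.  NP is the format of the
   traditional (non-prefixed) instruction, and HAVE_PREFIXED says whether
   prefixed instructions may be used as a fallback.  */
enum sketch_form
classify_offset (int64_t offset, enum sketch_non_prefixed np,
		 bool have_prefixed)
{
  /* Nothing can encode more than a signed 34-bit offset.  */
  if (!fits_signed (offset, 34))
    return FORM_BAD;

  /* Offsets that do not fit in 16 bits must be prefixed.  */
  if (!fits_signed (offset, 16))
    return have_prefixed ? FORM_PREFIXED_NUMERIC : FORM_BAD;

  switch (np)
    {
    case NP_D:			/* All 16 bits are usable.  */
      return FORM_D;

    case NP_DS:			/* Bottom 2 bits must be 0.  */
      if ((offset & 3) == 0)
	return FORM_DS;
      break;

    case NP_DQ:			/* Bottom 4 bits must be 0.  */
      if ((offset & 15) == 0)
	return FORM_DQ;
      break;
    }

  /* A misaligned DS/DQ offset needs a prefixed instruction.  */
  return have_prefixed ? FORM_PREFIXED_NUMERIC : FORM_BAD;
}
```

In this sketch an offset of 6 with a DS-form instruction is classified as
prefixed when prefixed addressing is available and as bad otherwise, which is
the decision the caller then acts on.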

This is an infrastructure patch.  It needs the second patch to enable basic
pc-relative support.

I have done a bootstrap build with all of the patches applied, and there were
no regressions in the test suite.  After posting these patches, I will start a
job to build each set of patches in turn just to make sure there are no extra
warnings.

Can I commit this patch to the trunk?

2019-09-18  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (pcrel_address): Delete predicate.
	(pcrel_local_address): Replace pcrel_address predicate, use the
	new function address_to_insn_form.
	(pcrel_external_address): Replace with new implementation using
	address_to_insn_form.
	(prefixed_mem_operand): Delete predicate which is now unused.
	(pcrel_external_mem_operand): Delete predicate which is now
	unused.
	* config/rs6000/rs6000-protos.h (enum insn_form): New
	enumeration.
	(enum non_prefixed): New enumeration.
	(address_to_insn_form): New declaration.
	* config/rs6000/rs6000.c (print_operand_address): Check for either
	pc-relative local symbols or pc-relative external symbols.
	(mode_supports_prefixed_address_p): Delete, no longer used.
	(rs6000_prefixed_address_mode_p): Delete, no longer used.
	(address_to_insn_form): New function to decode an address format.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 275903)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -1625,82 +1625,7 @@ (define_predicate "small_toc_ref"
   return GET_CODE (op) == UNSPEC && XINT (op, 1) == UNSPEC_TOCREL;
 })
 
-;; Return true if the operand is a pc-relative address.
-(define_predicate "pcrel_address"
-  (match_code "label_ref,symbol_ref,const")
-{
-  if (!rs6000_pcrel_p (cfun))
-    return false;
-
-  if (GET_CODE (op) == CONST)
-    op = XEXP (op, 0);
-
-  /* Validate offset.  */
-  if (GET_CODE (op) == PLUS)
-    {
-      rtx op0 = XEXP (op, 0);
-      rtx op1 = XEXP (op, 1);
-
-      if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
-	return false;
-
-      op = op0;
-    }
-
-  if (LABEL_REF_P (op))
-    return true;
-
-  return (SYMBOL_REF_P (op) && SYMBOL_REF_LOCAL_P (op));
-})
-
-;; Return true if the operand is an external symbol whose address can be loaded
-;; into a register using:
-;;	PLD reg,label@pcrel@got
-;;
-;; The linker will either optimize this to either a PADDI if the label is
-;; defined locally in another module or a PLD of the address if the label is
-;; defined in another module.
-
-(define_predicate "pcrel_external_address"
-  (match_code "symbol_ref,const")
-{
-  if (!rs6000_pcrel_p (cfun))
-    return false;
-
-  if (GET_CODE (op) == CONST)
-    op = XEXP (op, 0);
-
-  /* Validate offset.  */
-  if (GET_CODE (op) == PLUS)
-    {
-      rtx op0 = XEXP (op, 0);
-      rtx op1 = XEXP (op, 1);
-
-      if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
-	return false;
-
-      op = op0;
-    }
-
-  return (SYMBOL_REF_P (op) && !SYMBOL_REF_LOCAL_P (op));
-})
-
-;; Return 1 if op is a prefixed memory operand.
-(define_predicate "prefixed_mem_operand"
-  (match_code "mem")
-{
-  return rs6000_prefixed_address_mode_p (XEXP (op, 0), GET_MODE (op));
-})
-
-;; Return 1 if op is a memory operand to an external variable when we
-;; support pc-relative addressing and the PCREL_OPT relocation to
-;; optimize references to it.
-(define_predicate "pcrel_external_mem_operand"
-  (match_code "mem")
-{
-  return pcrel_external_address (XEXP (op, 0), Pmode);
-})
-
+\f
 ;; Match the first insn (addis) in fusing the combination of addis and loads to
 ;; GPR registers on power8.
 (define_predicate "fusion_gpr_addis"
@@ -1857,3 +1782,31 @@ (define_predicate "fusion_addis_mem_comb
 
   return 0;
 })
+
+\f
+;; Return true if the operand is a pc-relative address to a local symbol or a
+;; label that can be used directly in a memory operation.
+(define_predicate "pcrel_local_address"
+  (match_code "label_ref,symbol_ref,const")
+{
+  enum insn_form iform = address_to_insn_form (op, mode, NON_PREFIXED_DEFAULT);
+  return iform == INSN_FORM_PCREL_LOCAL;
+})
+
+;; Return true if the operand is an external symbol whose address can be loaded
+;; into a register.
+(define_predicate "pcrel_external_address"
+  (match_code "symbol_ref,const")
+{
+  enum insn_form iform = address_to_insn_form (op, mode, NON_PREFIXED_DEFAULT);
+  return iform == INSN_FORM_PCREL_EXTERNAL;
+})
+
+;; Return true if the address is pc-relative and the symbol is either local or
+;; external.
+(define_predicate "pcrel_local_or_external_address"
+  (match_code "label_ref,symbol_ref,const")
+{
+  enum insn_form iform = address_to_insn_form (op, mode, NON_PREFIXED_DEFAULT);
+  return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
+})
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 275903)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -154,7 +154,41 @@ extern align_flags rs6000_loop_align (rt
 extern void rs6000_split_logical (rtx [], enum rtx_code, bool, bool, bool);
 extern bool rs6000_pcrel_p (struct function *);
 extern bool rs6000_fndecl_pcrel_p (const_tree);
-extern bool rs6000_prefixed_address_mode_p (rtx, machine_mode);
+
+/* Different PowerPC instruction formats that are used by GCC.  There are
+   various other instruction formats used by the PowerPC hardware, but
+   these formats are not currently used by GCC.  */
+
+enum insn_form {
+  INSN_FORM_BAD,		/* Bad instruction format.  */
+  INSN_FORM_BASE_REG,		/* Base register only.  */
+  INSN_FORM_D,			/* Base register + 16-bit numeric offset.  */
+  INSN_FORM_DS,			/* Base register + 14-bit offset + 00.  */
+  INSN_FORM_DQ,			/* Base register + 12-bit offset + 0000.  */
+  INSN_FORM_X,			/* Base register + index register.  */
+  INSN_FORM_UPDATE,		/* Address updates base register.  */
+  INSN_FORM_LO_SUM,		/* Special offset instruction.  */
+  INSN_FORM_PREFIXED_NUMERIC,	/* Base register + 34 bit numeric offset.  */
+  INSN_FORM_PCREL_LOCAL,	/* Pc-relative local symbol.  */
+  INSN_FORM_PCREL_EXTERNAL	/* Pc-relative external symbol.  */
+};
+
+/* Instruction format for the non-prefixed version of a load or store.  This is
+   used to determine if a 16-bit offset is valid to be used with a non-prefixed
+   (traditional) instruction or if the bottom bits of the offset cannot be used
+   with a DS or DQ instruction format, and GCC has to use a prefixed
+   instruction for the load or store.  */
+
+enum non_prefixed {
+  NON_PREFIXED_DEFAULT,		/* Use the default.  */
+  NON_PREFIXED_D,		/* All 16-bits are valid.  */
+  NON_PREFIXED_DS,		/* Bottom 2 bits must be 0.  */
+  NON_PREFIXED_DQ,		/* Bottom 4 bits must be 0.  */
+  NON_PREFIXED_X		/* No offset memory form exists.  */
+};
+
+extern enum insn_form address_to_insn_form (rtx, machine_mode,
+					    enum non_prefixed);
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 275903)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -13079,7 +13079,7 @@ print_operand_address (FILE *file, rtx x
     fprintf (file, "0(%s)", reg_names[ REGNO (x) ]);
 
   /* Is it a pc-relative address?  */
-  else if (pcrel_address (x, Pmode))
+  else if (TARGET_PCREL && pcrel_local_or_external_address (x, VOIDmode))
     {
       HOST_WIDE_INT offset;
 
@@ -13099,6 +13099,9 @@ print_operand_address (FILE *file, rtx x
       if (offset)
 	fprintf (file, "%+" PRId64, offset);
 
+      if (SYMBOL_REF_P (x) && !SYMBOL_REF_LOCAL_P (x))
+	fputs ("@got", file);
+
       fputs ("@pcrel", file);
     }
   else if (SYMBOL_REF_P (x) || GET_CODE (x) == CONST
@@ -13584,71 +13587,6 @@ rs6000_pltseq_template (rtx *operands, i
   return str;
 }
 #endif
-
-/* Helper function to return whether a MODE can do prefixed loads/stores.
-   VOIDmode is used when we are loading the pc-relative address into a base
-   register, but we are not using it as part of a memory operation.  As modes
-   add support for prefixed memory, they will be added here.  */
-
-static bool
-mode_supports_prefixed_address_p (machine_mode mode)
-{
-  return mode == VOIDmode;
-}
-
-/* Function to return true if ADDR is a valid prefixed memory address that uses
-   mode MODE.  */
-
-bool
-rs6000_prefixed_address_mode_p (rtx addr, machine_mode mode)
-{
-  if (!TARGET_PREFIXED_ADDR || !mode_supports_prefixed_address_p (mode))
-    return false;
-
-  /* Check for PC-relative addresses.  */
-  if (pcrel_address (addr, Pmode))
-    return true;
-
-  /* Check for prefixed memory addresses that have a large numeric offset,
-     or an offset that can't be used for a DS/DQ-form memory operation.  */
-  if (GET_CODE (addr) == PLUS)
-    {
-      rtx op0 = XEXP (addr, 0);
-      rtx op1 = XEXP (addr, 1);
-
-      if (!base_reg_operand (op0, Pmode) || !CONST_INT_P (op1))
-	return false;
-
-      HOST_WIDE_INT value = INTVAL (op1);
-      if (!SIGNED_34BIT_OFFSET_P (value))
-	return false;
-
-      /* Offset larger than 16-bits?  */
-      if (!SIGNED_16BIT_OFFSET_P (value))
-	return true;
-
-      /* DQ instruction (bottom 4 bits must be 0) for vectors.  */
-      HOST_WIDE_INT mask;
-      if (GET_MODE_SIZE (mode) >= 16)
-	mask = 15;
-
-      /* DS instruction (bottom 2 bits must be 0).  For 32-bit integers, we
-	 need to use DS instructions if we are sign-extending the value with
-	 LWA.  For 32-bit floating point, we need DS instructions to load and
-	 store values to the traditional Altivec registers.  */
-      else if (GET_MODE_SIZE (mode) >= 4)
-	mask = 3;
-
-      /* QImode/HImode has no restrictions.  */
-      else
-	return true;
-
-      /* Return true if we must use a prefixed instruction.  */
-      return (value & mask) != 0;
-    }
-
-  return false;
-}
 \f
 #if defined (HAVE_GAS_HIDDEN) && !TARGET_MACHO
 /* Emit an assembler directive to set symbol visibility for DECL to
@@ -24627,6 +24565,158 @@ rs6000_pcrel_p (struct function *fn)
   return rs6000_fndecl_pcrel_p (fn->decl);
 }
 
+\f
+/* Given an address (ADDR), a mode (MODE), and what the format of the
+   non-prefixed address (NON_PREFIXED_INSN) is, return the instruction format
+   for the address.  */
+
+enum insn_form
+address_to_insn_form (rtx addr,
+		      machine_mode mode,
+		      enum non_prefixed non_prefixed_insn)
+{
+  rtx op0, op1;
+
+  /* Single register is easy.  */
+  if (REG_P (addr) || SUBREG_P (addr))
+    return INSN_FORM_BASE_REG;
+
+  /* If we don't support offset addressing, make sure only indexed addressing
+     is allowed.  We special case SDmode so that the register allocator does
+     not try to move SDmode through GPR registers, but instead uses the 32-bit
+     integer read/write instructions for the floating point registers.  */
+  if (non_prefixed_insn == NON_PREFIXED_X || mode == SDmode)
+    {
+      if (GET_CODE (addr) != PLUS)
+	return INSN_FORM_BAD;
+
+      op0 = XEXP (addr, 0);
+      op1 = XEXP (addr, 1);
+      if (!REG_P (op0) && !SUBREG_P (op0))
+	return INSN_FORM_BAD;
+
+      if (!REG_P (op1) && !SUBREG_P (op1))
+	return INSN_FORM_BAD;
+
+      return INSN_FORM_X;
+    }
+
+  /* Deal with update forms.  */
+  if (GET_RTX_CLASS (GET_CODE (addr)) == RTX_AUTOINC)
+    return INSN_FORM_UPDATE;
+
+  /* Handle pc-relative symbols and labels.  Check for both local and external
+     symbols.  Assume labels are always local.  */
+  if (TARGET_PCREL)
+    {
+      if (SYMBOL_REF_P (addr))
+	return (SYMBOL_REF_LOCAL_P (addr)
+		? INSN_FORM_PCREL_LOCAL
+		: INSN_FORM_PCREL_EXTERNAL);
+
+      if (LABEL_REF_P (addr))
+	return INSN_FORM_PCREL_LOCAL;
+    }
+
+  /* Check whether this is an offsettable address.  Deal with LO_SUM addresses
+     used with TOC and 32-bit addressing and with indexed addresses.  */
+  if (GET_CODE (addr) == CONST)
+    addr = XEXP (addr, 0);
+
+  if (GET_CODE (addr) != PLUS)
+    return GET_CODE (addr) == LO_SUM ? INSN_FORM_LO_SUM : INSN_FORM_BAD;
+
+  op0 = XEXP (addr, 0);
+  op1 = XEXP (addr, 1);
+
+  if (REG_P (op1) || SUBREG_P (op1))
+    return INSN_FORM_X;
+
+  if (!CONST_INT_P (op1))
+    return INSN_FORM_BAD;
+
+  HOST_WIDE_INT offset = INTVAL (op1);
+  if (!SIGNED_34BIT_OFFSET_P (offset))
+    return INSN_FORM_BAD;
+
+  /* Check for local and external pc-relative addresses.  Labels are always
+     local.  */
+  if (TARGET_PCREL)
+    {
+      if (SYMBOL_REF_P (op0))
+	return (SYMBOL_REF_LOCAL_P (op0)
+		? INSN_FORM_PCREL_LOCAL
+		: INSN_FORM_PCREL_EXTERNAL);
+
+      if (LABEL_REF_P (op0))
+	return INSN_FORM_PCREL_LOCAL;
+    }
+
+  /* If it isn't pc-relative, check for 16-bit D/DS/DQ-form.  */
+  if (!REG_P (op0) && !SUBREG_P (op0))
+    return INSN_FORM_BAD;
+
+  /* Large offsets must be prefixed.  */
+  if (!SIGNED_16BIT_OFFSET_P (offset))
+    return (TARGET_PREFIXED_ADDR
+	    ? INSN_FORM_PREFIXED_NUMERIC
+	    : INSN_FORM_BAD);
+
+  /* 16-bit offset, see what default instruction format to use.  */
+  if (non_prefixed_insn == NON_PREFIXED_DEFAULT)
+    {
+      unsigned size = GET_MODE_SIZE (mode);
+
+      /* On 64-bit systems, assume 64-bit integers need to use DS form
+	 addresses.  VSX vectors need to use DQ form addresses.  */
+      if (TARGET_POWERPC64 && size >= 8 && GET_MODE_CLASS (mode) == MODE_INT)
+	non_prefixed_insn = NON_PREFIXED_DS;
+
+      else if (TARGET_VSX && size >= 16
+	       && ALTIVEC_OR_VSX_VECTOR_MODE (mode))
+	non_prefixed_insn = NON_PREFIXED_DQ;
+
+      else
+	non_prefixed_insn = NON_PREFIXED_D;
+    }
+
+  /* Classify the D/DS/DQ-form addresses.  */
+  switch (non_prefixed_insn)
+    {
+      /* Instruction format D, all 16 bits are valid.  */
+    case NON_PREFIXED_D:
+      return INSN_FORM_D;
+
+      /* Instruction format DS, bottom 2 bits must be 0.  */
+    case NON_PREFIXED_DS:
+      if ((offset & 3) == 0)
+	return INSN_FORM_DS;
+
+      else if (TARGET_PREFIXED_ADDR)
+	return INSN_FORM_PREFIXED_NUMERIC;
+
+      else
+	return INSN_FORM_BAD;
+
+      /* Instruction format DQ, bottom 4 bits must be 0.  */
+    case NON_PREFIXED_DQ:
+      if ((offset & 15) == 0)
+	return INSN_FORM_DQ;
+
+      else if (TARGET_PREFIXED_ADDR)
+	return INSN_FORM_PREFIXED_NUMERIC;
+
+      else
+	return INSN_FORM_BAD;
+
+    default:
+      break;
+    }
+
+  return INSN_FORM_BAD;
+}
+
+\f
 #ifdef HAVE_GAS_HIDDEN
 # define USE_HIDDEN_LINKONCE 1
 #else


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], V4, patch #2: Add prefixed insn attribute
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
  2019-09-18 23:49 ` [PATCH] V4, patch #1: Rework prefixed/pc-relative lookup Michael Meissner
@ 2019-09-18 23:56 ` Michael Meissner
  2019-09-18 23:58 ` [PATCH], V4, patch #3: Fix up mov<mode>_64bit_dm Michael Meissner
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-09-18 23:56 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch adds the "prefixed" insn attribute that says whether the insn
generates a prefixed instruction.

The attributes "prefixed_length" and "non_prefixed_length" give the length in
bytes (12 and 4 by default) of the insn depending on whether it is prefixed.

The "length" attribute is set based on the "prefixed" attribute.
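
As a small illustration of that selection, here is a standalone sketch that
mirrors the default values of the two length attributes.  The names
SKETCH_PREFIXED_LENGTH, SKETCH_NON_PREFIXED_LENGTH and sketch_insn_length are
hypothetical.  The real logic lives in the define_attr expressions in
rs6000.md, and individual insns can override the defaults.

```c
#include <assert.h>
#include <stdbool.h>

/* Default lengths in bytes, mirroring the defaults of the "prefixed_length"
   and "non_prefixed_length" attributes.  */
enum
{
  SKETCH_NON_PREFIXED_LENGTH = 4,
  SKETCH_PREFIXED_LENGTH = 12
};

/* Per the comment in the patch, a prefixed insn is 8 bytes, plus a possible
   4-byte NOP that the assembler may insert for alignment.  */
static_assert (SKETCH_PREFIXED_LENGTH == 8 + 4,
	       "prefixed length is the insn plus one possible NOP");

/* Pick the insn length based on whether the insn is prefixed.  */
int
sketch_insn_length (bool prefixed)
{
  return prefixed ? SKETCH_PREFIXED_LENGTH : SKETCH_NON_PREFIXED_LENGTH;
}
```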

I use the target hooks ASM_OUTPUT_OPCODE and FINAL_PRESCAN_INSN to decide
whether to emit a leading "p" before the insn.
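
The two hooks cooperate through a single piece of state.  The following is a
minimal standalone sketch of that hand-off, under the assumption that the
prescan step is told directly whether the insn is prefixed.  The names
sketch_prescan and sketch_emit_opcode are hypothetical, and the sketch writes
the whole mnemonic into a buffer, whereas the real ASM_OUTPUT_OPCODE hook only
emits the 'p' and leaves the rest of the opcode to the normal output code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Set by the prescan step, consumed when the opcode is emitted.  */
static bool sketch_next_insn_prefixed;

/* Stand-in for FINAL_PRESCAN_INSN: remember whether the insn that is about
   to be printed is prefixed.  */
void
sketch_prescan (bool insn_is_prefixed)
{
  sketch_next_insn_prefixed = insn_is_prefixed;
}

/* Stand-in for ASM_OUTPUT_OPCODE: write MNEMONIC into BUF, preceded by a
   'p' if the prescan step marked the insn as prefixed.  Returns the value
   of snprintf.  */
int
sketch_emit_opcode (char *buf, size_t size, const char *mnemonic)
{
  assert (buf != NULL && mnemonic != NULL);
  return snprintf (buf, size, "%s%s",
		   sketch_next_insn_prefixed ? "p" : "", mnemonic);
}
```

With this arrangement the insn templates can keep the traditional mnemonic
(for example "ld"), and the prefixed spelling ("pld") is produced only when
the attribute says the insn is prefixed.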

There are functions (prefixed_load_p, prefixed_store_p, and prefixed_paddi_p)
that given an insn type, say whether that particular insn type is prefixed or
not.

In addition, this patch adds support in rs6000_emit_move for loading
pc-relative addresses, both local addresses defined in the same compilation
unit and external addresses that might need to be loaded from a .GOT
address table.

I have done a bootstrap build with all of the patches applied, and there were
no regressions in the test suite.  After posting these patches, I will start a
job to build each set of patches in turn just to make sure there are no extra
warnings.

Can I commit this patch to the trunk?

2019-09-18  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000-protos.h (prefixed_load_p): New
	declaration.
	(prefixed_store_p): New declaration.
	(prefixed_paddi_p): New declaration.
	(rs6000_asm_output_opcode): New declaration.
	(rs6000_final_prescan_insn): Move declaration and update calling
	signature.
	(address_is_prefixed): New helper inline function.
	* config/rs6000/rs6000.c (rs6000_emit_move): Support loading
	pc-relative addresses.
	(reg_to_non_prefixed): New function to identify what the
	non-prefixed memory instruction format is for a register.
	(prefixed_load_p): New function to identify prefixed loads.
	(prefixed_store_p): New function to identify prefixed stores.
	(prefixed_paddi_p): New function to identify prefixed load
	immediates.
	(next_insn_prefixed_p): New static state variable.
	(rs6000_final_prescan_insn): New function to determine if an insn
	uses a prefixed instruction.
	(rs6000_asm_output_opcode): New function to emit 'p' in front of a
	prefixed instruction.
	* config/rs6000/rs6000.h (FINAL_PRESCAN_INSN): New target hook.
	(ASM_OUTPUT_OPCODE): New target hook.
	* config/rs6000/rs6000.md (prefixed): New insn attribute for
	prefixed instructions.
	(prefixed_length): New insn attribute for the size of prefixed
	instructions.
	(non_prefixed_length): New insn attribute for the size of
	non-prefixed instructions.
	(pcrel_local_addr): New insn to load up a local pc-relative
	address.
	(pcrel_extern_addr): New insn to load up an external pc-relative
	address.

Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 275908)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -189,6 +189,30 @@ enum non_prefixed {
 
 extern enum insn_form address_to_insn_form (rtx, machine_mode,
 					    enum non_prefixed);
+extern bool prefixed_load_p (rtx_insn *);
+extern bool prefixed_store_p (rtx_insn *);
+extern bool prefixed_paddi_p (rtx_insn *);
+extern void rs6000_asm_output_opcode (FILE *);
+extern void rs6000_final_prescan_insn (rtx_insn *, rtx [], int);
+
+/* Return true if the address is a prefixed instruction that can be directly
+   used in a memory instruction (i.e. using numeric offset or a pc-relative
+   reference to a local symbol).
+
+   References to external pc-relative symbols aren't allowed, because GCC has
+   to load the address into a register and then issue a separate load or
+   store.  */
+
+static inline bool
+address_is_prefixed (rtx addr,
+		     machine_mode mode,
+		     enum non_prefixed non_prefixed_insn)
+{
+  enum insn_form iform = address_to_insn_form (addr, mode,
+					       non_prefixed_insn);
+  return (iform == INSN_FORM_PREFIXED_NUMERIC
+	  || iform == INSN_FORM_PCREL_LOCAL);
+}
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
@@ -268,8 +292,6 @@ extern void rs6000_d_target_versions (vo
 const char * rs6000_xcoff_strip_dollar (const char *);
 #endif
 
-void rs6000_final_prescan_insn (rtx_insn *, rtx *operand, int num_operands);
-
 extern unsigned char rs6000_class_max_nregs[][LIM_REG_CLASSES];
 extern unsigned char rs6000_hard_regno_nregs[][FIRST_PSEUDO_REGISTER];
 
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 275908)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -9639,6 +9639,22 @@ rs6000_emit_move (rtx dest, rtx source,
 	  return;
 	}
 
+      /* Use the default pattern for loading up pc-relative addresses.  */
+      if (TARGET_PCREL && mode == Pmode
+	  && (SYMBOL_REF_P (operands[1]) || LABEL_REF_P (operands[1])
+	      || GET_CODE (operands[1]) == CONST))
+	{
+	  enum insn_form iform = address_to_insn_form (operands[1], mode,
+						       NON_PREFIXED_DEFAULT);
+
+	  if (iform == INSN_FORM_PCREL_LOCAL
+	      || iform == INSN_FORM_PCREL_EXTERNAL)
+	    {
+	      emit_insn (gen_rtx_SET (operands[0], operands[1]));
+	      return;
+	    }
+	}
+
       if (DEFAULT_ABI == ABI_V4
 	  && mode == Pmode && mode == SImode
 	  && flag_pic == 1 && got_operand (operands[1], mode))
@@ -24716,6 +24732,203 @@ address_to_insn_form (rtx addr,
   return INSN_FORM_BAD;
 }
 
+/* Helper function to take a REG and a MODE and turn it into the non-prefixed
+   instruction format (D/DS/DQ) used for offset memory.  */
+
+static enum non_prefixed
+reg_to_non_prefixed (rtx reg, machine_mode mode)
+{
+  /* If it isn't a register, use the defaults.  */
+  if (!REG_P (reg) && !SUBREG_P (reg))
+    return NON_PREFIXED_DEFAULT;
+
+  unsigned int r = reg_or_subregno (reg);
+
+  /* If we have a pseudo, use the default instruction format.  */
+  if (r >= FIRST_PSEUDO_REGISTER)
+    return NON_PREFIXED_DEFAULT;
+
+  unsigned size = GET_MODE_SIZE (mode);
+
+  /* FPR registers use D-mode for scalars, and DQ-mode for vectors.  */
+  if (FP_REGNO_P (r))
+    {
+      if (mode == SFmode || size == 8 || FLOAT128_2REG_P (mode))
+	return NON_PREFIXED_D;
+
+      else if (size < 8)
+	return NON_PREFIXED_X;
+
+      else if (TARGET_VSX && size >= 16 && ALTIVEC_OR_VSX_VECTOR_MODE (mode))
+	return NON_PREFIXED_DQ;
+
+      else
+	return NON_PREFIXED_DEFAULT;
+    }
+
+  /* Altivec registers use DS-mode for scalars, and DQ-mode for vectors.  */
+  else if (ALTIVEC_REGNO_P (r))
+    {
+      if (mode == SFmode || size == 8 || FLOAT128_2REG_P (mode))
+	return NON_PREFIXED_DS;
+
+      else if (size < 8)
+	return NON_PREFIXED_X;
+
+      else if (TARGET_VSX && size >= 16 && ALTIVEC_OR_VSX_VECTOR_MODE (mode))
+	return NON_PREFIXED_DQ;
+
+      else
+	return NON_PREFIXED_DEFAULT;
+    }
+
+  /* GPR registers use DS-mode for 64-bit items on 64-bit systems, and D-mode
+     otherwise.  Assume that any other register, such as LR, CRs, etc. will go
+     through the GPR registers for memory operations.  */
+  else if (TARGET_POWERPC64 && size >= 8)
+    return NON_PREFIXED_DS;
+
+  return NON_PREFIXED_D;
+}
+
+\f
+/* Whether a load instruction is a prefixed instruction.  This is called from
+   the prefixed attribute processing.  */
+
+bool
+prefixed_load_p (rtx_insn *insn)
+{
+  /* Validate the insn to make sure it is a normal load insn.  */
+  extract_insn_cached (insn);
+  if (recog_data.n_operands < 2)
+    return false;
+
+  rtx reg = recog_data.operand[0];
+  rtx mem = recog_data.operand[1];
+
+  if (!REG_P (reg) && !SUBREG_P (reg))
+    return false;
+
+  if (!MEM_P (mem))
+    return false;
+
+  /* LWA uses the DS format instead of the D format that LWZ uses.  */
+  enum non_prefixed non_prefixed_insn;
+  machine_mode reg_mode = GET_MODE (reg);
+  machine_mode mem_mode = GET_MODE (mem);
+
+  if (mem_mode == SImode && reg_mode == DImode
+      && get_attr_sign_extend (insn) == SIGN_EXTEND_YES)
+    non_prefixed_insn = NON_PREFIXED_DS;
+
+  else
+    non_prefixed_insn = reg_to_non_prefixed (reg, mem_mode);
+
+  return address_is_prefixed (XEXP (mem, 0), mem_mode, non_prefixed_insn);
+}
+
+/* Whether a store instruction is a prefixed instruction.  This is called from
+   the prefixed attribute processing.  */
+
+bool
+prefixed_store_p (rtx_insn *insn)
+{
+  /* Validate the insn to make sure it is a normal store insn.  */
+  extract_insn_cached (insn);
+  if (recog_data.n_operands < 2)
+    return false;
+
+  rtx mem = recog_data.operand[0];
+  rtx reg = recog_data.operand[1];
+
+  if (!REG_P (reg) && !SUBREG_P (reg))
+    return false;
+
+  if (!MEM_P (mem))
+    return false;
+
+  machine_mode mem_mode = GET_MODE (mem);
+  enum non_prefixed non_prefixed_insn = reg_to_non_prefixed (reg, mem_mode);
+  return address_is_prefixed (XEXP (mem, 0), mem_mode, non_prefixed_insn);
+}
+
+/* Whether a load immediate or add instruction is a prefixed instruction.  This
+   is called from the prefixed attribute processing.  */
+
+bool
+prefixed_paddi_p (rtx_insn *insn)
+{
+  rtx set = single_set (insn);
+  if (!set)
+    return false;
+
+  rtx dest = SET_DEST (set);
+  rtx src = SET_SRC (set);
+
+  if (!REG_P (dest) && !SUBREG_P (dest))
+    return false;
+
+  /* Is this a load immediate that can't be done with a simple ADDI or
+     ADDIS?  */
+  if (CONST_INT_P (src))
+    return (satisfies_constraint_eI (src)
+	    && !satisfies_constraint_I (src)
+	    && !satisfies_constraint_L (src));
+
+  /* Is this a PADDI instruction that can't be done with a simple ADDI or
+     ADDIS?  */
+  if (GET_CODE (src) == PLUS)
+    {
+      rtx op1 = XEXP (src, 1);
+
+      return (CONST_INT_P (op1)
+	      && satisfies_constraint_eI (op1)
+	      && !satisfies_constraint_I (op1)
+	      && !satisfies_constraint_L (op1));
+    }
+
+  /* If not, is it a load of a pc-relative address?  */
+  if (!TARGET_PCREL || GET_MODE (dest) != Pmode)
+    return false;
+
+  if (!SYMBOL_REF_P (src) && !LABEL_REF_P (src) && GET_CODE (src) != CONST)
+    return false;
+
+  enum insn_form iform = address_to_insn_form (src, Pmode,
+					       NON_PREFIXED_DEFAULT);
+
+  return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
+}
+
+/* Whether the next instruction needs a 'p' prefix issued before the
+   instruction is printed out.  */
+static bool next_insn_prefixed_p;
+
+/* Define FINAL_PRESCAN_INSN if some processing needs to be done before
+   outputting the assembler code.  On the PowerPC, we remember if the current
+   insn is a prefixed insn where we need to emit a 'p' before the insn.
+
+   In addition, if the insn is part of a pc-relative reference to an external
+   label optimization, this is recorded also.  */
+void
+rs6000_final_prescan_insn (rtx_insn *insn, rtx [], int)
+{
+  next_insn_prefixed_p = (get_attr_prefixed (insn) != PREFIXED_NO);
+  return;
+}
+
+/* Define ASM_OUTPUT_OPCODE to do anything special before emitting an opcode.
+   We use it to emit a 'p' for prefixed insns that is set in
+   FINAL_PRESCAN_INSN.  */
+void
+rs6000_asm_output_opcode (FILE *stream)
+{
+  if (next_insn_prefixed_p)
+    fputc ('p', stream);
+
+  return;
+}
+
 \f
 #ifdef HAVE_GAS_HIDDEN
 # define USE_HIDDEN_LINKONCE 1
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 275894)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -2547,3 +2547,24 @@ typedef struct GTY(()) machine_function
   IN_RANGE ((VALUE),							\
 	    -(HOST_WIDE_INT_1 << 33),					\
 	    (HOST_WIDE_INT_1 << 33) - 1 - (EXTRA))
+
+/* Define this if some processing needs to be done before outputting the
+   assembler code.  On the PowerPC, we remember if the current insn is a normal
+   prefixed insn where we need to emit a 'p' before the insn.  */
+#define FINAL_PRESCAN_INSN(INSN, OPERANDS, NOPERANDS)			\
+do									\
+  {									\
+    if (TARGET_PREFIXED_ADDR)						\
+      rs6000_final_prescan_insn (INSN, OPERANDS, NOPERANDS);		\
+  }									\
+while (0)
+
+/* Do anything special before emitting an opcode.  We use it to emit a 'p' for
+   prefixed insns that is set in FINAL_PRESCAN_INSN.  */
+#define ASM_OUTPUT_OPCODE(STREAM, OPCODE)				\
+  do									\
+    {									\
+     if (TARGET_PREFIXED_ADDR)						\
+       rs6000_asm_output_opcode (STREAM);				\
+    }									\
+  while (0)
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 275894)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -256,8 +256,52 @@ (define_attr "var_shift" "no,yes"
 ;; Is copying of this instruction disallowed?
 (define_attr "cannot_copy" "no,yes" (const_string "no"))
 
-;; Length of the instruction (in bytes).
-(define_attr "length" "" (const_int 4))
+
+;; Whether an insn is a prefixed insn, and an initial 'p' should be printed
+;; before the instruction.  A prefixed instruction has a prefix instruction
+;; word that extends the immediate value of the instructions from 12-16 bits to
+;; 34 bits.  The macro ASM_OUTPUT_OPCODE emits a leading 'p' for prefixed
+;; insns.  The "length" attribute also defaults to 12 bytes for prefixed
+;; insns.
+(define_attr "prefixed" "no,yes"
+  (cond [(ior (match_test "!TARGET_PREFIXED_ADDR")
+	      (match_test "!NONJUMP_INSN_P (insn)"))
+	 (const_string "no")
+
+	 (eq_attr "type" "load,fpload,vecload")
+	 (if_then_else (and (eq_attr "indexed" "no")
+			    (eq_attr "update" "no")
+			    (match_test "prefixed_load_p (insn)"))
+		       (const_string "yes")
+		       (const_string "no"))
+
+	 (eq_attr "type" "store,fpstore,vecstore")
+	 (if_then_else (and (eq_attr "indexed" "no")
+			    (eq_attr "update" "no")
+			    (match_test "prefixed_store_p (insn)"))
+		       (const_string "yes")
+		       (const_string "no"))
+
+	 (eq_attr "type" "integer,add")
+	 (if_then_else (match_test "prefixed_paddi_p (insn)")
+		       (const_string "yes")
+		       (const_string "no"))]
+	(const_string "no")))
+
+;; Length in bytes of instructions that use prefixed addressing and length in
+;; bytes of instructions that do not use prefixed addressing.  This allows
+;; both lengths to be defined as constants, and the length attribute can pick
+;; the size as appropriate.
+(define_attr "prefixed_length" "" (const_int 12))
+(define_attr "non_prefixed_length" "" (const_int 4))
+
+;; Length of the instruction (in bytes).  Prefixed insns are 8 bytes, but the
+;; assembler might need to issue a NOP so that the prefixed instruction does
+;; not cross a cache boundary, which makes them possibly 12 bytes.
+(define_attr "length" ""
+  (if_then_else (eq_attr "prefixed" "yes")
+		(attr "prefixed_length")
+		(attr "non_prefixed_length")))
 
 ;; Processor type -- this attribute must exactly match the processor_type
 ;; enumeration in rs6000-opts.h.
@@ -9875,6 +9919,28 @@ (define_expand "restore_stack_nonlocal"
   operands[6] = gen_rtx_PARALLEL (VOIDmode, p);
 })
 \f
+;; Load up a pc-relative address.  Print_operand_address will append a @pcrel
+;; to the symbol or label.
+(define_insn "*pcrel_local_addr"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+	(match_operand:DI 1 "pcrel_local_address"))]
+  "TARGET_PCREL"
+  "la %0,%a1"
+  [(set_attr "prefixed" "yes")])
+
+;; Load up a pc-relative address to an external symbol.  If the symbol is
+;; defined in the main program, the linker will optimize this to a PADDI.
+;; Otherwise, it will create a GOT address that is relocated by the dynamic
+;; linker and loaded up.  Print_operand_address will append a @got@pcrel to
+;; the symbol.
+(define_insn "*pcrel_extern_addr"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+	(match_operand:DI 1 "pcrel_external_address"))]
+  "TARGET_PCREL"
+  "ld %0,%a1"
+  [(set_attr "prefixed" "yes")
+   (set_attr "type" "load")])
+
 ;; TOC register handling.
 
 ;; Code to initialize the TOC register...

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], V4, patch #3: Fix up mov<mode>_64bit_dm
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
  2019-09-18 23:49 ` [PATCH] V4, patch #1: Rework prefixed/pc-relative lookup Michael Meissner
  2019-09-18 23:56 ` [PATCH], V4, patch #2: Add prefixed insn attribute Michael Meissner
@ 2019-09-18 23:58 ` Michael Meissner
  2019-09-27 23:33   ` Segher Boessenkool
  2019-09-19  0:06 ` [PATCH], V4, patch #4: Enable prefixed/pc-rel addressing Michael Meissner
                   ` (18 subsequent siblings)
  21 siblings, 1 reply; 37+ messages in thread
From: Michael Meissner @ 2019-09-18 23:58 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

In doing the patches, I noticed that mov<mode>_64bit_dm had two alternatives
combined together.  This patch fixes the problem, before the next patch that
will need to modify mov<mode>_64bit_dm for prefixed addressing.

I have done a bootstrap build with all of the patches applied, and there were
no regressions in the test suite.  After posting these patches, I will start a
job to build each set of patches in turn just to make sure there are no extra
warnings.

Can I commit this patch to the trunk?

2019-09-18  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.md (mov<mode>_64bit_dm): Split the
	alternatives for loading 0.0 to a GPR and loading a 128-bit
	floating point type to a GPR.

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 275816)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -7758,9 +7758,18 @@ (define_expand "mov<mode>"
 ;; not swapped like they are for TImode or TFmode.  Subregs therefore are
 ;; problematical.  Don't allow direct move for this case.
 
+;;		FPR load    FPR store   FPR move    FPR zero    GPR load
+;;		GPR zero    GPR store   GPR move    MFVSRD      MTVSRD
+
 (define_insn_and_split "*mov<mode>_64bit_dm"
-  [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r,r,d")
-	(match_operand:FMOVE128_FPR 1 "input_operand" "d,m,d,<zero_fp>,r,<zero_fp>Y,r,d,r"))]
+  [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand"
+		"=m,        d,          d,          d,          Y,
+		 r,         r,          r,          r,          d")
+
+	(match_operand:FMOVE128_FPR 1 "input_operand"
+		"d,         m,          d,          <zero_fp>,  r,
+		 <zero_fp>, Y,          r,          d,          r"))]
+
   "TARGET_HARD_FLOAT && TARGET_POWERPC64 && FLOAT128_2REG_P (<MODE>mode)
    && (<MODE>mode != TDmode || WORDS_BIG_ENDIAN)
    && (gpc_reg_operand (operands[0], <MODE>mode)
@@ -7769,8 +7778,8 @@ (define_insn_and_split "*mov<mode>_64bit
   "&& reload_completed"
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8,8,8,8,12,12,8,8,8")
-   (set_attr "isa" "*,*,*,*,*,*,*,p8v,p8v")])
+  [(set_attr "length" "8")
+   (set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")])
 
 (define_insn_and_split "*movtd_64bit_nodm"
   [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], V4, patch #4: Enable prefixed/pc-rel addressing
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (2 preceding siblings ...)
  2019-09-18 23:58 ` [PATCH], V4, patch #3: Fix up mov<mode>_64bit_dm Michael Meissner
@ 2019-09-19  0:06 ` Michael Meissner
  2019-09-19  0:09 ` [PATCH] V4, patch #5: Use PLI (PADDI) to load up 34-bit DImode Michael Meissner
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-09-19  0:06 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch enables prefixed and pc-relative addressing on all modes, except
for SDmode.  SDmode is special in that for its main use, you can only use
X-form addressing.  While you can do D-form addressing to load/store SDmode in
GPR registers, I found you really don't want to do that, as the register
allocator will load/store the value and do a direct move.
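
To illustrate what "prefixed" means for a numeric offset in this patch, here is
a minimal standalone sketch.  It is not the code in rs6000.c: the names
offset_kind, fits_signed_bits and classify_offset are hypothetical, and the
sketch assumes prefixed addressing is enabled.  It only encodes the rules stated
in the patch comments: DS-form needs the bottom two bits of the offset to be
zero, DQ-form needs the bottom four bits to be zero, and a prefixed instruction
takes any signed 34-bit offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the functions in rs6000.c.  */
enum offset_kind { OFFSET_NON_PREFIXED, OFFSET_PREFIXED, OFFSET_INVALID };

/* True if VALUE fits in a signed field that is BITS wide.  */
static bool fits_signed_bits(int64_t value, unsigned bits)
{
  int64_t limit = (int64_t)1 << (bits - 1);
  return value >= -limit && value < limit;
}

/* ALIGN_MASK is 0 for D-form, 3 for DS-form and 15 for DQ-form.  */
static enum offset_kind classify_offset(int64_t offset, unsigned align_mask)
{
  /* A traditional instruction needs a 16-bit offset with the low bits
     required by its instruction form set to zero.  */
  if (fits_signed_bits(offset, 16) && (offset & (int64_t)align_mask) == 0)
    return OFFSET_NON_PREFIXED;

  /* A prefixed instruction has no alignment restriction on the offset.  */
  if (fits_signed_bits(offset, 34))
    return OFFSET_PREFIXED;

  return OFFSET_INVALID;
}
```

Under these assumptions, an offset of 6 with a DS-form instruction would be
classified as prefixed, which matches the intent of the lwa_operand and
mem_operand_gpr changes: use the prefixed form instead of forcing the unaligned
offset into a GPR.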

I also discovered that if you did a vector extract of a variable offset where
the vector address is pc-relative, the code was incorrect because it re-used
the base register temporary.  This patch prevents the vector extract from
combining the extract operation with a pc-relative memory operand.

As you suggested in the last series of patches, the actual stack_protect_setdi
and stack_protect_testdi insns no longer accept prefixed addresses, and the
expanders convert the memory address to a non-prefixed form.

I have done a bootstrap build with all of the patches applied, and there were
no regressions in the test suite.  After posting these patches, I will start a
job to build each set of patches in turn just to make sure there are no extra
warnings.

Can I commit this patch to the trunk?

2019-09-18  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/constraints.md (em constraint): New constraint for
	non pc-relative memory.
	* config/rs6000/predicates.md (lwa_operand): Allow odd offsets if
	we have prefixed addressing.
	(non_prefixed_memory): New predicate.
	(non_pcrel_memory): New predicate.
	(reg_or_non_pcrel_memory): New predicate.
	* config/rs6000/rs6000-protos.h (make_memory_non_prefixed): New
	declaration.
	* config/rs6000/rs6000.c (rs6000_adjust_vec_address): Optimize
	pc-relative addresses with constant offsets.  Signal an error if
	we have a pc-relative address and a variable offset.
	(rs6000_split_vec_extract_var): Signal an error if we have a
	pc-relative address and a variable offset.
	(quad_address_p): Add support for prefixed addresses.
	(mem_operand_gpr): Add support for prefixed addresses.
	(mem_operand_ds_form): Add support for prefixed addresses.
	(rs6000_legitimate_offset_address_p): Add support for prefixed
	addresses.
	(rs6000_legitimate_address_p): Add support for prefixed
	addresses.
	(rs6000_mode_dependent_address): Add support for prefixed
	addresses.
	(rs6000_num_insns): New helper function.
	(rs6000_insn_cost): Treat prefixed instructions as having the same
	cost as non prefixed instructions, even though the prefixed
	instructions are larger.
	(make_memory_non_prefixed): New function to make a non-prefixed
	memory operand.
	* config/rs6000/rs6000.md (mov<mode>_64bit_dm): Add support for
	prefixed addresses.
	(movtd_64bit_nodm): Add support for prefixed addresses.
	(stack_protect_setdi): Convert prefixed addresses to non-prefixed
	addresses.  Allow for indexed addressing as well as offsettable.
	(stack_protect_testdi): Convert prefixed addresses to non-prefixed
	addresses.  Allow for indexed addressing as well as offsettable.
	* config/rs6000/vsx.md (vsx_mov<mode>_64bit): Add support for
	prefixed addresses.
	(vsx_extract_<mode>_var, VSX_D iterator): Do not allow a vector in
	memory with a prefixed address to combine with variable offsets.
	(vsx_extract_v4sf_var): Do not allow a vector in memory with a
	prefixed address to combine with variable offsets.
	(vsx_extract_<mode>_var, VSX_EXTRACT_I iterator): Do not allow a
	vector in memory with a prefixed address to combine with variable
	offsets.
	(vsx_extract_<mode>_<VS_scalar>mode_var): Do not allow a vector in
	memory with a prefixed address to combine with variable offsets.
	* doc/md.texi (PowerPC constraints): Document 'em' constraint.
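
The instruction-count rule behind the rs6000_num_insns entry above can be tried
outside of the compiler with the standalone sketch below.  The function name
num_insns_from_length is hypothetical; it takes the length in bytes and a
prefixed flag as plain arguments in place of get_attr_length and
get_attr_prefixed, and otherwise mirrors the arithmetic in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical standalone version of the length to insn count mapping.
   LENGTH is the "length" attribute in bytes; PREFIXED says whether the
   insn is marked as prefixed.  */
static int num_insns_from_length(int length, bool prefixed)
{
  if (length >= 12 && prefixed)
    {
      /* Single prefixed instruction: 8 bytes plus a possible 4 byte NOP.  */
      if (length == 12)
        return 1;

      /* A normal instruction and a prefixed instruction (16), or two
         back to back prefixed instructions (20).  */
      if (length == 16 || length == 20)
        return 2;

      /* Guess for larger instruction sizes.  */
      return 2 + (length - 20) / 4;
    }

  /* Non-prefixed instructions are always 4 bytes each.  */
  return length / 4;
}
```

With this mapping, a 12 byte prefixed load counts as one instruction, while a
12 byte non-prefixed sequence still counts as three.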

Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(revision 275894)
+++ gcc/config/rs6000/constraints.md	(working copy)
@@ -210,6 +210,11 @@ several times, or that might not access
   (and (match_code "mem")
        (match_test "GET_RTX_CLASS (GET_CODE (XEXP (op, 0))) != RTX_AUTOINC")))
 
+(define_memory_constraint "em"
+  "A memory operand that does not contain a pc-relative reference."
+  (and (match_code "mem")
+       (match_test "non_pcrel_memory (op, mode)")))
+
 (define_memory_constraint "Q"
   "Memory operand that is an offset from a register (it is usually better
 to use @samp{m} or @samp{es} in @code{asm} statements)"
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 275908)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -932,6 +932,14 @@ (define_predicate "lwa_operand"
     return false;
 
   addr = XEXP (inner, 0);
+
+  /* The LWA instruction uses the DS-form format where the bottom two bits of
+     the offset must be 0.  The prefixed PLWA does not have this
+     restriction.  */
+  if (TARGET_PREFIXED_ADDR
+      && address_is_prefixed (addr, DImode, NON_PREFIXED_DS))
+    return true;
+
   if (GET_CODE (addr) == PRE_INC
       || GET_CODE (addr) == PRE_DEC
       || (GET_CODE (addr) == PRE_MODIFY
@@ -1810,3 +1818,43 @@ (define_predicate "pcrel_local_or_extern
   enum insn_form iform = address_to_insn_form (op, mode, NON_PREFIXED_DEFAULT);
   return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
 })
+
+;; Return 1 if op is a memory operand that is not prefixed.
+(define_predicate "non_prefixed_memory"
+  (match_code "mem")
+{
+  if (!memory_operand (op, mode))
+    return false;
+
+  enum insn_form iform
+    = address_to_insn_form (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
+
+  return (iform != INSN_FORM_PREFIXED_NUMERIC
+          && iform != INSN_FORM_PCREL_LOCAL
+          && iform != INSN_FORM_BAD);
+})
+
+(define_predicate "non_pcrel_memory"
+  (match_code "mem")
+{
+  if (!memory_operand (op, mode))
+    return false;
+
+  enum insn_form iform
+    = address_to_insn_form (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
+
+  return (iform != INSN_FORM_PCREL_EXTERNAL
+          && iform != INSN_FORM_PCREL_LOCAL
+          && iform != INSN_FORM_BAD);
+})
+
+;; Return 1 if op is either a register operand or a memory operand that does
+;; not use a pc-relative address.
+(define_predicate "reg_or_non_pcrel_memory"
+  (match_code "reg,subreg,mem")
+{
+  if (REG_P (op) || SUBREG_P (op))
+    return register_operand (op, mode);
+
+  return non_pcrel_memory (op, mode);
+})
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 275909)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -192,6 +192,7 @@ extern enum insn_form address_to_insn_fo
 extern bool prefixed_load_p (rtx_insn *);
 extern bool prefixed_store_p (rtx_insn *);
 extern bool prefixed_paddi_p (rtx_insn *);
+extern rtx make_memory_non_prefixed (rtx);
 extern void rs6000_asm_output_opcode (FILE *);
 extern void rs6000_final_prescan_insn (rtx_insn *, rtx [], int);
 
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 275909)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6700,6 +6700,7 @@ rs6000_adjust_vec_address (rtx scalar_re
   rtx element_offset;
   rtx new_addr;
   bool valid_addr_p;
+  bool pcrel_p = TARGET_PCREL && pcrel_local_address (addr, Pmode);
 
   /* Vector addresses should not have PRE_INC, PRE_DEC, or PRE_MODIFY.  */
   gcc_assert (GET_RTX_CLASS (GET_CODE (addr)) != RTX_AUTOINC);
@@ -6737,6 +6738,40 @@ rs6000_adjust_vec_address (rtx scalar_re
   else if (REG_P (addr) || SUBREG_P (addr))
     new_addr = gen_rtx_PLUS (Pmode, addr, element_offset);
 
+  /* Optimize pc-relative addresses.  */
+  else if (pcrel_p)
+    {
+      if (CONST_INT_P (element_offset))
+	{
+	  rtx addr2 = addr;
+	  HOST_WIDE_INT offset = INTVAL (element_offset);
+
+	  if (GET_CODE (addr2) == CONST)
+	    addr2 = XEXP (addr2, 0);
+
+	  if (GET_CODE (addr2) == PLUS)
+	    {
+	      offset += INTVAL (XEXP (addr2, 1));
+	      addr2 = XEXP (addr2, 0);
+	    }
+
+	  gcc_assert (SIGNED_34BIT_OFFSET_P (offset));
+	  if (offset)
+	    {
+	      addr2 = gen_rtx_PLUS (Pmode, addr2, GEN_INT (offset));
+	      new_addr = gen_rtx_CONST (Pmode, addr2);
+	    }
+	  else
+	    new_addr = addr2;
+	}
+
+      /* Make sure we do not have a pc-relative address with a variable offset,
+	 since we only have one temporary base register, and we would need two
+	 registers in that case.  */
+      else
+	gcc_unreachable ();
+    }
+
   /* Optimize D-FORM addresses with constant offset with a constant element, to
      include the element offset in the address directly.  */
   else if (GET_CODE (addr) == PLUS)
@@ -6800,11 +6835,11 @@ rs6000_adjust_vec_address (rtx scalar_re
       new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
     }
 
-  /* If we have a PLUS, we need to see whether the particular register class
-     allows for D-FORM or X-FORM addressing.  */
-  if (GET_CODE (new_addr) == PLUS)
+  /* If we have a PLUS or a pc-relative address without the PLUS, we need to
+     see whether the particular register class allows for D-FORM or X-FORM
+     addressing.  */
+  if (GET_CODE (new_addr) == PLUS || pcrel_p)
     {
-      rtx op1 = XEXP (new_addr, 1);
       addr_mask_type addr_mask;
       unsigned int scalar_regno = reg_or_subregno (scalar_reg);
 
@@ -6821,10 +6856,16 @@ rs6000_adjust_vec_address (rtx scalar_re
       else
 	gcc_unreachable ();
 
-      if (REG_P (op1) || SUBREG_P (op1))
-	valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;
-      else
+      if (pcrel_p)
 	valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
+      else
+	{
+	  rtx op1 = XEXP (new_addr, 1);
+	  if (REG_P (op1) || SUBREG_P (op1))
+	    valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;
+	  else
+	    valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
+	}
     }
 
   else if (REG_P (new_addr) || SUBREG_P (new_addr))
@@ -6860,6 +6901,12 @@ rs6000_split_vec_extract_var (rtx dest,
      systems.  */
   if (MEM_P (src))
     {
+      /* If this is a pc-relative address, we would need another register to
+	 hold the address of the vector along with the variable offset.  The
+	 callers should use reg_or_non_pcrel_memory to make sure we don't
+	 get a pc-relative address here.  */
+      gcc_assert (non_pcrel_memory (src, mode));
+
       int num_elements = GET_MODE_NUNITS (mode);
       rtx num_ele_m1 = GEN_INT (num_elements - 1);
 
@@ -7249,6 +7296,13 @@ quad_address_p (rtx addr, machine_mode m
   if (VECTOR_MODE_P (mode) && !mode_supports_dq_form (mode))
     return false;
 
+  /* Is this a valid prefixed address?  If the bottom four bits of the offset
+     are non-zero, we could use a prefixed instruction (which does not have the
+     DQ-form constraint that the traditional instruction had) instead of
+     forcing the unaligned offset to a GPR.  */
+  if (address_is_prefixed (addr, mode, NON_PREFIXED_DQ))
+    return true;
+
   if (GET_CODE (addr) != PLUS)
     return false;
 
@@ -7350,6 +7404,13 @@ mem_operand_gpr (rtx op, machine_mode mo
       && legitimate_indirect_address_p (XEXP (addr, 0), false))
     return true;
 
+  /* Allow prefixed instructions if supported.  If the bottom two bits of the
+     offset are non-zero, we could use a prefixed instruction (which does not
+     have the DS-form constraint that the traditional instruction had) instead
+     of forcing the unaligned offset to a GPR.  */
+  if (address_is_prefixed (addr, mode, NON_PREFIXED_DS))
+    return true;
+
   /* Don't allow non-offsettable addresses.  See PRs 83969 and 84279.  */
   if (!rs6000_offsettable_memref_p (op, mode, false))
     return false;
@@ -7371,7 +7432,7 @@ mem_operand_gpr (rtx op, machine_mode mo
        causes a wrap, so test only the low 16 bits.  */
     offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
 
-  return offset + 0x8000 < 0x10000u - extra;
+  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 
 /* As above, but for DS-FORM VSX insns.  Unlike mem_operand_gpr,
@@ -7384,6 +7445,13 @@ mem_operand_ds_form (rtx op, machine_mod
   int extra;
   rtx addr = XEXP (op, 0);
 
+  /* Allow prefixed instructions if supported.  If the bottom two bits of the
+     offset are non-zero, we could use a prefixed instruction (which does not
+     have the DS-form constraint that the traditional instruction had) instead
+     of forcing the unaligned offset to a GPR.  */
+  if (address_is_prefixed (addr, mode, NON_PREFIXED_DS))
+    return true;
+
   if (!offsettable_address_p (false, mode, addr))
     return false;
 
@@ -7404,7 +7472,7 @@ mem_operand_ds_form (rtx op, machine_mod
        causes a wrap, so test only the low 16 bits.  */
     offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
 
-  return offset + 0x8000 < 0x10000u - extra;
+  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 \f
 /* Subroutines of rs6000_legitimize_address and rs6000_legitimate_address_p.  */
@@ -7753,8 +7821,10 @@ rs6000_legitimate_offset_address_p (mach
       break;
     }
 
-  offset += 0x8000;
-  return offset < 0x10000 - extra;
+  if (TARGET_PREFIXED_ADDR)
+    return SIGNED_34BIT_OFFSET_EXTRA_P (offset, extra);
+  else
+    return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 
 bool
@@ -8651,6 +8721,11 @@ rs6000_legitimate_address_p (machine_mod
       && mode_supports_pre_incdec_p (mode)
       && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
     return 1;
+
+  /* Handle prefixed addresses (pc-relative or 34-bit offset).  */
+  if (address_is_prefixed (x, mode, NON_PREFIXED_DEFAULT))
+    return 1;
+
   /* Handle restricted vector d-form offsets in ISA 3.0.  */
   if (quad_offset_p)
     {
@@ -8709,7 +8784,11 @@ rs6000_legitimate_address_p (machine_mod
 	  || (!avoiding_indexed_address_p (mode)
 	      && legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict)))
       && rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0)))
-    return 1;
+    {
+      /* There is no prefixed version of the load/store with update.  */
+      rtx addr = XEXP (x, 1);
+      return !address_is_prefixed (addr, mode, NON_PREFIXED_DEFAULT);
+    }
   if (reg_offset_p && !quad_offset_p
       && legitimate_lo_sum_address_p (mode, x, reg_ok_strict))
     return 1;
@@ -8771,8 +8850,12 @@ rs6000_mode_dependent_address (const_rtx
 	  && XEXP (addr, 0) != arg_pointer_rtx
 	  && CONST_INT_P (XEXP (addr, 1)))
 	{
-	  unsigned HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
-	  return val + 0x8000 >= 0x10000 - (TARGET_POWERPC64 ? 8 : 12);
+	  HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
+	  HOST_WIDE_INT extra = TARGET_POWERPC64 ? 8 : 12;
+	  if (TARGET_PREFIXED_ADDR)
+	    return !SIGNED_34BIT_OFFSET_EXTRA_P (val, extra);
+	  else
+	    return !SIGNED_16BIT_OFFSET_EXTRA_P (val, extra);
 	}
       break;
 
@@ -20948,6 +21031,42 @@ rs6000_debug_rtx_costs (rtx x, machine_m
   return ret;
 }
 
+/* How many real instructions are generated for this insn?  This is slightly
+   different from the length attribute, in that the length attribute counts the
+   number of bytes.  With prefixed instructions, we don't want to count a
+   prefixed instruction (length 12 bytes including possible NOP) as taking 3
+   instructions, but just one.  */
+
+static int
+rs6000_num_insns (rtx_insn *insn)
+{
+  /* Try to figure it out based on the length and whether there are prefixed
+     instructions.  While prefixed instructions are only 8 bytes, we have to
+     use 12 as the size of the first prefixed instruction in case the
+     instruction needs to be aligned.  Back to back prefixed instructions would
+     only take 20 bytes, since it is guaranteed that one of the prefixed
+     instructions does not need the alignment.  */
+  int length = get_attr_length (insn);
+
+  if (length >= 12 && TARGET_PREFIXED_ADDR
+      && get_attr_prefixed (insn) == PREFIXED_YES)
+    {
+      /* Single prefixed instruction.  */
+      if (length == 12)
+	return 1;
+
+      /* A normal instruction and a prefixed instruction (16) or two back
+	 to back prefixed instructions (20).  */
+      if (length == 16 || length == 20)
+	return 2;
+
+      /* Guess for larger instruction sizes.  */
+      return 2 + (length - 20) / 4;
+    }
+
+  return length / 4;
+}
+
 static int
 rs6000_insn_cost (rtx_insn *insn, bool speed)
 {
@@ -20961,7 +21080,7 @@ rs6000_insn_cost (rtx_insn *insn, bool s
   if (cost > 0)
     return cost;
 
-  int n = get_attr_length (insn) / 4;
+  int n = rs6000_num_insns (insn);
   enum attr_type type = get_attr_type (insn);
 
   switch (type)
@@ -24900,6 +25019,34 @@ prefixed_paddi_p (rtx_insn *insn)
   return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
 }
 
+/* Make a memory address non-prefixed if it is prefixed.  */
+
+rtx
+make_memory_non_prefixed (rtx mem)
+{
+  gcc_assert (MEM_P (mem));
+
+  rtx old_addr = XEXP (mem, 0);
+  if (address_is_prefixed (old_addr, GET_MODE (mem), NON_PREFIXED_DEFAULT))
+    {
+      rtx new_addr;
+
+      if (GET_CODE (old_addr) == PLUS
+	  && (REG_P (XEXP (old_addr, 0)) || SUBREG_P (XEXP (old_addr, 0)))
+	  && CONST_INT_P (XEXP (old_addr, 1)))
+	{
+	  rtx tmp_reg = force_reg (Pmode, XEXP (old_addr, 1));
+	  new_addr = gen_rtx_PLUS (Pmode, XEXP (old_addr, 0), tmp_reg);
+	}
+      else
+	new_addr = force_reg (Pmode, old_addr);
+
+      mem = change_address (mem, VOIDmode, new_addr);
+    }
+
+  return mem;
+}
+
 /* Whether the next instruction needs a 'p' prefix issued before the
    instruction is printed out.  */
 static bool next_insn_prefixed_p;
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 275910)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -7778,8 +7778,9 @@ (define_insn_and_split "*mov<mode>_64bit
   "&& reload_completed"
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8")
-   (set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")])
+  [(set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")
+   (set_attr "non_prefixed_length" "8")
+   (set_attr "prefixed_length" "20")])
 
 (define_insn_and_split "*movtd_64bit_nodm"
   [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
@@ -7790,8 +7791,12 @@ (define_insn_and_split "*movtd_64bit_nod
   "#"
   "&& reload_completed"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8,8,8,12,12,8")])
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
+  [(set_attr "non_prefixed_length" "8")
+   (set_attr "prefixed_length" "20")])
 
 (define_insn_and_split "*mov<mode>_32bit"
   [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r")
@@ -11505,9 +11510,25 @@ (define_insn "stack_protect_setsi"
   [(set_attr "type" "three")
    (set_attr "length" "12")])
 
-(define_insn "stack_protect_setdi"
-  [(set (match_operand:DI 0 "memory_operand" "=Y")
-	(unspec:DI [(match_operand:DI 1 "memory_operand" "Y")] UNSPEC_SP_SET))
+(define_expand "stack_protect_setdi"
+  [(parallel [(set (match_operand:DI 0 "memory_operand")
+		   (unspec:DI [(match_operand:DI 1 "memory_operand")]
+		   UNSPEC_SP_SET))
+	      (set (match_scratch:DI 2)
+		   (const_int 0))])]
+  "TARGET_64BIT"
+{
+  if (TARGET_PREFIXED_ADDR)
+    {
+      operands[0] = make_memory_non_prefixed (operands[0]);
+      operands[1] = make_memory_non_prefixed (operands[1]);
+    }
+})
+
+(define_insn "*stack_protect_setdi"
+  [(set (match_operand:DI 0 "non_prefixed_memory" "=YZ")
+	(unspec:DI [(match_operand:DI 1 "non_prefixed_memory" "YZ")]
+		   UNSPEC_SP_SET))
    (set (match_scratch:DI 2 "=&r") (const_int 0))]
   "TARGET_64BIT"
   "ld%U1%X1 %2,%1\;std%U0%X0 %2,%0\;li %2,0"
@@ -11551,10 +11572,27 @@ (define_insn "stack_protect_testsi"
    lwz%U1%X1 %3,%1\;lwz%U2%X2 %4,%2\;cmplw %0,%3,%4\;li %3,0\;li %4,0"
   [(set_attr "length" "16,20")])
 
-(define_insn "stack_protect_testdi"
+(define_expand "stack_protect_testdi"
+  [(parallel [(set (match_operand:CCEQ 0 "cc_reg_operand")
+		   (unspec:CCEQ [(match_operand:DI 1 "memory_operand")
+				 (match_operand:DI 2 "memory_operand")]
+				UNSPEC_SP_TEST))
+	      (set (match_scratch:DI 4)
+		   (const_int 0))
+	      (clobber (match_scratch:DI 3))])]
+  "TARGET_64BIT"
+{
+  if (TARGET_PREFIXED_ADDR)
+    {
+      operands[1] = make_memory_non_prefixed (operands[1]);
+      operands[2] = make_memory_non_prefixed (operands[2]);
+    }
+})
+
+(define_insn "*stack_protect_testdi"
   [(set (match_operand:CCEQ 0 "cc_reg_operand" "=x,?y")
-        (unspec:CCEQ [(match_operand:DI 1 "memory_operand" "Y,Y")
-		      (match_operand:DI 2 "memory_operand" "Y,Y")]
+        (unspec:CCEQ [(match_operand:DI 1 "non_prefixed_memory" "YZ,YZ")
+		      (match_operand:DI 2 "non_prefixed_memory" "YZ,YZ")]
 		     UNSPEC_SP_TEST))
    (set (match_scratch:DI 4 "=r,r") (const_int 0))
    (clobber (match_scratch:DI 3 "=&r,&r"))]
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 275894)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -1149,10 +1149,30 @@ (define_insn "vsx_mov<mode>_64bit"
                "vecstore,  vecload,   vecsimple, mffgpr,    mftgpr,    load,
                 store,     load,      store,     *,         vecsimple, vecsimple,
                 vecsimple, *,         *,         vecstore,  vecload")
-   (set_attr "length"
-               "*,         *,         *,         8,         *,         8,
-                8,         8,         8,         8,         *,         *,
-                *,         20,        8,         *,         *")
+   (set (attr "non_prefixed_length")
+	(cond [(and (eq_attr "alternative" "4")		;; MTVSRDD
+		    (match_test "TARGET_P9_VECTOR"))
+	       (const_string "4")
+
+	       (eq_attr "alternative" "3,4")		;; GPR <-> VSX
+	       (const_string "8")
+
+	       (eq_attr "alternative" "5,6,7,8")	;; GPR load/store
+	       (const_string "8")]
+	      (const_string "*")))
+
+   (set (attr "prefixed_length")
+	(cond [(and (eq_attr "alternative" "4")		;; MTVSRDD
+		    (match_test "TARGET_P9_VECTOR"))
+	       (const_string "4")
+
+	       (eq_attr "alternative" "3,4")		;; GPR <-> VSX
+	       (const_string "8")
+
+	       (eq_attr "alternative" "5,6,7,8")	;; GPR load/store
+	       (const_string "20")]
+	      (const_string "*")))
+
    (set_attr "isa"
                "<VSisa>,   <VSisa>,   <VSisa>,   *,         *,         *,
                 *,         *,         *,         *,         p9v,       *,
@@ -3229,9 +3249,10 @@ (define_insn "vsx_vslo_<mode>"
 ;; Variable V2DI/V2DF extract
 (define_insn_and_split "vsx_extract_<mode>_var"
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v,wa,r")
-	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v,m,m")
-			     (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
-			    UNSPEC_VSX_EXTRACT))
+	(unspec:<VS_scalar>
+	 [(match_operand:VSX_D 1 "reg_or_non_pcrel_memory" "v,em,em")
+	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
+	 UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,&b,&b"))
    (clobber (match_scratch:V2DI 4 "=&v,X,X"))]
   "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_DIRECT_MOVE_64BIT"
@@ -3299,9 +3320,10 @@ (define_insn_and_split "*vsx_extract_v4s
 ;; Variable V4SF extract
 (define_insn_and_split "vsx_extract_v4sf_var"
   [(set (match_operand:SF 0 "gpc_reg_operand" "=wa,wa,?r")
-	(unspec:SF [(match_operand:V4SF 1 "input_operand" "v,m,m")
-		    (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
-		   UNSPEC_VSX_EXTRACT))
+	(unspec:SF
+	 [(match_operand:V4SF 1 "reg_or_non_pcrel_memory" "v,em,em")
+	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
+	 UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,&b,&b"))
    (clobber (match_scratch:V2DI 4 "=&v,X,X"))]
   "VECTOR_MEM_VSX_P (V4SFmode) && TARGET_DIRECT_MOVE_64BIT"
@@ -3662,7 +3684,7 @@ (define_insn_and_split "*vsx_extract_<mo
 (define_insn_and_split "vsx_extract_<mode>_var"
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
 	(unspec:<VS_scalar>
-	 [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m")
+	 [(match_operand:VSX_EXTRACT_I 1 "reg_or_non_pcrel_memory" "v,v,em")
 	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 	 UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,r,&b"))
@@ -3682,7 +3704,7 @@ (define_insn_and_split "*vsx_extract_<mo
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
 	(zero_extend:<VS_scalar>
 	 (unspec:<VSX_EXTRACT_I:VS_scalar>
-	  [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m")
+	  [(match_operand:VSX_EXTRACT_I 1 "reg_or_non_pcrel_memory" "v,v,em")
 	   (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 	  UNSPEC_VSX_EXTRACT)))
    (clobber (match_scratch:DI 3 "=r,r,&b"))
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 275894)
+++ gcc/doc/md.texi	(working copy)
@@ -3373,6 +3373,9 @@ asm ("st %1,%0" : "=m<>" (mem) : "r" (va
 
 is not.
 
+@item em
+A memory operand that does not contain a pc-relative address.
+
 @item es
 A ``stable'' memory operand; that is, one which does not include any
 automodification of the base register.  This used to be useful when

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V4, patch #5: Use PLI (PADDI) to load up 34-bit DImode
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (3 preceding siblings ...)
  2019-09-19  0:06 ` [PATCH], V4, patch #4: Enable prefixed/pc-rel addressing Michael Meissner
@ 2019-09-19  0:09 ` Michael Meissner
  2019-09-19  0:11 ` [PATCH] V4, patch #6: Use PLI (PADDI) to load up 32-bit SImode constants Michael Meissner
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-09-19  0:09 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This is a simple patch to enable loading up 34-bit DImode integer constants via
the PLI (PADDI) instruction.  At your suggestion, I moved it from the previous
patch.
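
As a rough sketch of the range checks involved (the helper names below are made
up for illustration and are not the macros in the rs6000 headers; the LIS test
is my reading of the existing code and should be treated as an assumption):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for SIGNED_16BIT_OFFSET_P: VALUE fits the signed
   16-bit immediate field of ADDI/LI.  */
static bool
fits_signed_16bit (int64_t value)
{
  return value >= -INT64_C (0x8000) && value <= INT64_C (0x7fff);
}

/* Hypothetical stand-in for SIGNED_34BIT_OFFSET_P: VALUE fits the signed
   34-bit immediate field of PADDI/PLI.  */
static bool
fits_signed_34bit (int64_t value)
{
  return value >= -(INT64_C (1) << 33) && value < (INT64_C (1) << 33);
}

/* Simplified model of the first tests in num_insns_constant_gpr: can VALUE
   be loaded into a GPR with a single instruction?  HAVE_PREFIXED models
   TARGET_PREFIXED_ADDR.  */
static bool
loads_in_one_insn (int64_t value, bool have_prefixed)
{
  /* LI: signed 16-bit constant.  */
  if (fits_signed_16bit (value))
    return true;

  /* LIS: low 16 bits are zero and the value is a sign-extended 32-bit
     quantity.  */
  if ((value & 0xffff) == 0 && value >= INT32_MIN && value <= INT32_MAX)
    return true;

  /* PLI: any signed 34-bit constant, if prefixed instructions exist.  */
  return have_prefixed && fits_signed_34bit (value);
}
```

With this model, a constant such as 0x12345678 takes one instruction only when
prefixed instructions are available; without them it still needs the longer
sequence.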

Because the new PLI alternative is inserted after the LIS alternative, every
alternative after it moves down by one.

I have done a bootstrap build with all of the patches applied, and there were
no regressions in the test suite.  After posting these patches, I will start a
job to build each set of patches in turn just to make sure there are no extra
warnings.

Can I commit this patch to the trunk?

2019-09-18  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (num_insns_constant_gpr): Add support for
	PADDI to load up and/or add 34-bit integer constants.
	(rs6000_rtx_costs): Treat constants loaded up with PADDI with the
	same cost as normal 16-bit constants.
	* config/rs6000/rs6000.md (movdi_internal64): Add support to load
	up 34-bit integer constants with PADDI.
	(movdi integer constant splitter): Add comment about PADDI.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 275911)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -5522,7 +5522,7 @@ static int
 num_insns_constant_gpr (HOST_WIDE_INT value)
 {
   /* signed constant loadable with addi */
-  if (((unsigned HOST_WIDE_INT) value + 0x8000) < 0x10000)
+  if (SIGNED_16BIT_OFFSET_P (value))
     return 1;
 
   /* constant loadable with addis */
@@ -5530,6 +5530,10 @@ num_insns_constant_gpr (HOST_WIDE_INT va
 	   && (value >> 31 == -1 || value >> 31 == 0))
     return 1;
 
+  /* PADDI can support up to 34 bit signed integers.  */
+  else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (value))
+    return 1;
+
   else if (TARGET_POWERPC64)
     {
       HOST_WIDE_INT low  = ((value & 0xffffffff) ^ 0x80000000) - 0x80000000;
@@ -20663,7 +20667,8 @@ rs6000_rtx_costs (rtx x, machine_mode mo
 	    || outer_code == PLUS
 	    || outer_code == MINUS)
 	   && (satisfies_constraint_I (x)
-	       || satisfies_constraint_L (x)))
+	       || satisfies_constraint_L (x)
+	       || satisfies_constraint_eI (x)))
 	  || (outer_code == AND
 	      && (satisfies_constraint_K (x)
 		  || (mode == SImode
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 275911)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -8806,24 +8806,24 @@ (define_split
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
 
-;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR #
-;;              FPR store  FPR load   FPR move   AVX store  AVX store   AVX load
-;;              AVX load   VSX move   P9 0       P9 -1      AVX 0/-1    VSX 0
-;;              VSX -1     P9 const   AVX const  From SPR   To SPR      SPR<->SPR
-;;              VSX->GPR   GPR->VSX
+;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR pli
+;;              GPR #      FPR store  FPR load   FPR move   AVX store   AVX store
+;;              AVX load   AVX load   VSX move   P9 0       P9 -1       AVX 0/-1
+;;              VSX 0      VSX -1     P9 const   AVX const  From SPR    To SPR
+;;              SPR<->SPR  VSX->GPR   GPR->VSX
 (define_insn "*movdi_internal64"
   [(set (match_operand:DI 0 "nonimmediate_operand"
                "=YZ,       r,         r,         r,         r,          r,
-                m,         ^d,        ^d,        wY,        Z,          $v,
-                $v,        ^wa,       wa,        wa,        v,          wa,
-                wa,        v,         v,         r,         *h,         *h,
-                ?r,        ?wa")
+                r,         m,         ^d,        ^d,        wY,         Z,
+                $v,        $v,        ^wa,       wa,        wa,         v,
+                wa,        wa,        v,         v,         r,          *h,
+                *h,        ?r,        ?wa")
 	(match_operand:DI 1 "input_operand"
-               "r,         YZ,        r,         I,         L,          nF,
-                ^d,        m,         ^d,        ^v,        $v,         wY,
-                Z,         ^wa,       Oj,        wM,        OjwM,       Oj,
-                wM,        wS,        wB,        *h,        r,          0,
-                wa,        r"))]
+               "r,         YZ,        r,         I,         L,          eI,
+                nF,        ^d,        m,         ^d,        ^v,         $v,
+                wY,        Z,         ^wa,       Oj,        wM,         OjwM,
+                Oj,        wM,        wS,        wB,        *h,         r,
+                0,         wa,        r"))]
   "TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], DImode)
        || gpc_reg_operand (operands[1], DImode))"
@@ -8833,6 +8833,7 @@ (define_insn "*movdi_internal64"
    mr %0,%1
    li %0,%1
    lis %0,%v1
+   li %0,%1
    #
    stfd%U0%X0 %1,%0
    lfd%U1%X1 %0,%1
@@ -8856,26 +8857,28 @@ (define_insn "*movdi_internal64"
    mtvsrd %x0,%1"
   [(set_attr "type"
                "store,      load,	*,         *,         *,         *,
-                fpstore,    fpload,     fpsimple,  fpstore,   fpstore,   fpload,
-                fpload,     veclogical, vecsimple, vecsimple, vecsimple, veclogical,
-                veclogical, vecsimple,  vecsimple, mfjmpr,    mtjmpr,    *,
-                mftgpr,    mffgpr")
+                *,          fpstore,    fpload,    fpsimple,  fpstore,   fpstore,
+                fpload,     fpload,     veclogical,vecsimple, vecsimple, vecsimple,
+                veclogical, veclogical, vecsimple,  vecsimple, mfjmpr,   mtjmpr,
+                *,          mftgpr,    mffgpr")
    (set_attr "size" "64")
    (set_attr "length"
-               "*,         *,         *,         *,         *,          20,
-                *,         *,         *,         *,         *,          *,
+               "*,         *,         *,         *,         *,          *,
+                20,        *,         *,         *,         *,          *,
                 *,         *,         *,         *,         *,          *,
-                *,         8,         *,         *,         *,          *,
-                *,         *")
+                *,         *,         8,         *,         *,          *,
+                *,         *,         *")
    (set_attr "isa"
-               "*,         *,         *,         *,         *,          *,
-                *,         *,         *,         p9v,       p7v,        p9v,
-                p7v,       *,         p9v,       p9v,       p7v,        *,
-                *,         p7v,       p7v,       *,         *,          *,
-                p8v,       p8v")])
+               "*,         *,         *,         *,         *,          fut,
+                *,         *,         *,         *,         p9v,        p7v,
+                p9v,       p7v,       *,         p9v,       p9v,        p7v,
+                *,         *,         p7v,       p7v,       *,          *,
+                *,         p8v,       p8v")])
 
 ; Some DImode loads are best done as a load of -1 followed by a mask
-; instruction.
+; instruction.  On systems that support the PADDI (PLI) instruction,
+; num_insns_constant returns 1, so these splitters would not be used for things
+; that can be loaded with PLI.
 (define_split
   [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
 	(match_operand:DI 1 "const_int_operand"))]

* [PATCH] V4, patch #6: Use PLI (PADDI) to load up 32-bit SImode constants
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (4 preceding siblings ...)
  2019-09-19  0:09 ` [PATCH] V4, patch #5: Use PLI (PADDI) to load up 34-bit DImode Michael Meissner
@ 2019-09-19  0:11 ` Michael Meissner
  2019-09-19  0:13 ` [PATCH] V4, patch #7: Use PADDI to add 34-bit constants Michael Meissner
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-09-19  0:11 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch is similar to the previous patch, except it loads up 32-bit SImode
constants instead of DImode constants.
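
To illustrate when the constant splitter still fires, here is a small sketch of
its condition.  The function name and the have_prefixed parameter are
hypothetical; have_prefixed stands in for TARGET_PREFIXED_ADDR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the movsi constant splitter condition: return true if VALUE has
   to be broken into a two-instruction sequence.  */
static bool
simode_const_needs_split (int32_t value, bool have_prefixed)
{
  /* LI handles signed 16-bit constants.  */
  bool fits_li = value >= -0x8000 && value <= 0x7fff;

  /* LIS handles constants whose low 16 bits are zero.  */
  bool fits_lis = (value & 0xffff) == 0;

  /* With prefixed instructions, PLI loads any 32-bit constant directly, so
     the splitter is never needed.  */
  return !fits_li && !fits_lis && !have_prefixed;
}
```

Any constant that LI or LIS already handles is left alone in both cases; only
the remaining constants change behavior when prefixed instructions are enabled.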

I have done a bootstrap build with all of the patches applied, and there were
no regressions in the test suite.  After posting these patches, I will start a
job to build each set of patches in turn just to make sure there are no extra
warnings.

Can I commit this patch to the trunk?

2019-09-18  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.md (movsi_internal1): Add support to load
	up 32-bit SImode integer constants with PADDI.
	(movsi integer constant splitter): Do not split constant if PADDI
	can load it up directly.

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 275912)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -6908,22 +6908,22 @@ (define_insn "movsi_low"
 
 ;;		MR           LA           LWZ          LFIWZX       LXSIWZX
 ;;		STW          STFIWX       STXSIWX      LI           LIS
-;;		#            XXLOR        XXSPLTIB 0   XXSPLTIB -1  VSPLTISW
-;;		XXLXOR 0     XXLORC -1    P9 const     MTVSRWZ      MFVSRWZ
-;;		MF%1         MT%0         NOP
+;;		PLI          #            XXLOR        XXSPLTIB 0   XXSPLTIB -1
+;;		VSPLTISW     XXLXOR 0     XXLORC -1    P9 const     MTVSRWZ
+;;		MFVSRWZ      MF%1         MT%0         NOP
 (define_insn "*movsi_internal1"
   [(set (match_operand:SI 0 "nonimmediate_operand"
 		"=r,         r,           r,           d,           v,
 		 m,          Z,           Z,           r,           r,
-		 r,          wa,          wa,          wa,          v,
-		 wa,         v,           v,           wa,          r,
-		 r,          *h,          *h")
+		 r,          r,           wa,          wa,          wa,
+		 v,          wa,          v,           v,           wa,
+		 r,          r,           *h,          *h")
 	(match_operand:SI 1 "input_operand"
 		"r,          U,           m,           Z,           Z,
 		 r,          d,           v,           I,           L,
-		 n,          wa,          O,           wM,          wB,
-		 O,          wM,          wS,          r,           wa,
-		 *h,         r,           0"))]
+		 eI,         n,           wa,          O,           wM,
+		 wB,         O,           wM,          wS,          r,
+		 wa,         *h,          r,           0"))]
   "gpc_reg_operand (operands[0], SImode)
    || gpc_reg_operand (operands[1], SImode)"
   "@
@@ -6937,6 +6937,7 @@ (define_insn "*movsi_internal1"
    stxsiwx %x1,%y0
    li %0,%1
    lis %0,%v1
+   li %0,%1
    #
    xxlor %x0,%x1,%x1
    xxspltib %x0,0
@@ -6953,21 +6954,21 @@ (define_insn "*movsi_internal1"
   [(set_attr "type"
 		"*,          *,           load,        fpload,      fpload,
 		 store,      fpstore,     fpstore,     *,           *,
-		 *,          veclogical,  vecsimple,   vecsimple,   vecsimple,
-		 veclogical, veclogical,  vecsimple,   mffgpr,      mftgpr,
-		 *,          *,           *")
+		 *,          *,           veclogical,  vecsimple,   vecsimple,
+		 vecsimple,  veclogical,  veclogical,  vecsimple,   mffgpr,
+		 mftgpr,     *,           *,           *")
    (set_attr "length"
 		"*,          *,           *,           *,           *,
 		 *,          *,           *,           *,           *,
-		 8,          *,           *,           *,           *,
-		 *,          *,           8,           *,           *,
-		 *,          *,           *")
+		 *,          8,           *,           *,           *,
+		 *,          *,           *,           8,           *,
+		 *,          *,           *,           *")
    (set_attr "isa"
 		"*,          *,           *,           p8v,         p8v,
 		 *,          p8v,         p8v,         *,           *,
-		 *,          p8v,         p9v,         p9v,         p8v,
-		 p9v,        p8v,         p9v,         p8v,         p8v,
-		 *,          *,           *")])
+		 fut,        *,           p8v,         p9v,         p9v,
+		 p8v,        p9v,         p8v,         p9v,         p8v,
+		 p8v,        *,           *,           *")])
 
 ;; Like movsi, but adjust a SF value to be used in a SI context, i.e.
 ;; (set (reg:SI ...) (subreg:SI (reg:SF ...) 0))
@@ -7112,14 +7113,15 @@ (define_insn "*movsi_from_df"
   "xscvdpsp %x0,%x1"
   [(set_attr "type" "fp")])
 
-;; Split a load of a large constant into the appropriate two-insn
-;; sequence.
+;; Split a load of a large constant into the appropriate two-insn sequence.  On
+;; systems that support PADDI (PLI), we can use PLI to load any 32-bit constant
+;; in one instruction.
 
 (define_split
   [(set (match_operand:SI 0 "gpc_reg_operand")
 	(match_operand:SI 1 "const_int_operand"))]
   "(unsigned HOST_WIDE_INT) (INTVAL (operands[1]) + 0x8000) >= 0x10000
-   && (INTVAL (operands[1]) & 0xffff) != 0"
+   && (INTVAL (operands[1]) & 0xffff) != 0 && !TARGET_PREFIXED_ADDR"
   [(set (match_dup 0)
 	(match_dup 2))
    (set (match_dup 0)

* [PATCH] V4, patch #7: Use PADDI to add 34-bit constants
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (5 preceding siblings ...)
  2019-09-19  0:11 ` [PATCH] V4, patch #6: Use PLI (PADDI) to load up 32-bit SImode constants Michael Meissner
@ 2019-09-19  0:13 ` Michael Meissner
  2019-09-19  0:17 ` [PATCH] V4, patch #8: Enable -mpcrel on Linux 64-bit, but not on other targets Michael Meissner
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-09-19  0:13 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch now allows GCC to generate PADDI to add 34-bit constants.
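
The choice between the immediate alternatives can be sketched as follows.  This
is only an illustration: the enum and function names are invented, and the
range tests are my paraphrase of the I, L, and eI constraints rather than the
constraint definitions themselves.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the immediate alternatives of *add<mode>3.  */
enum add_alt
{
  ADD_ALT_ADDI,   /* constraint I: signed 16-bit constant */
  ADD_ALT_ADDIS,  /* constraint L: signed 16-bit constant shifted left 16 */
  ADD_ALT_PADDI,  /* constraint eI: signed 34-bit constant */
  ADD_ALT_NONE    /* constant must go into a register first */
};

/* Pick the first alternative that accepts VALUE.  HAVE_PREFIXED models
   whether the "fut" alternative is enabled.  */
static enum add_alt
pick_add_alternative (int64_t value, bool have_prefixed)
{
  if (value >= -INT64_C (0x8000) && value <= INT64_C (0x7fff))
    return ADD_ALT_ADDI;

  if ((value & 0xffff) == 0 && value >= INT32_MIN && value <= INT32_MAX)
    return ADD_ALT_ADDIS;

  if (have_prefixed
      && value >= -(INT64_C (1) << 33)
      && value < (INT64_C (1) << 33))
    return ADD_ALT_PADDI;

  return ADD_ALT_NONE;
}
```

The order matters: the shorter non-prefixed forms are tried first, so PADDI is
only picked for constants that neither ADDI nor ADDIS can encode.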

I have done a bootstrap build with all of the patches applied, and there were
no regressions in the test suite.  After posting these patches, I will start a
job to build each set of patches in turn just to make sure there are no extra
warnings.

Can I commit this patch to the trunk?

2019-09-18  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (add_operand): Add support for
	PADDI.
	* config/rs6000/rs6000.md (add<mode>3): Add support for PADDI.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 275911)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -839,7 +839,8 @@ (define_special_predicate "indexed_addre
 (define_predicate "add_operand"
   (if_then_else (match_code "const_int")
     (match_test "satisfies_constraint_I (op)
-		 || satisfies_constraint_L (op)")
+		 || satisfies_constraint_L (op)
+		 || satisfies_constraint_eI (op)")
     (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is either a non-special register, or 0, or -1.
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 275913)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -1760,15 +1760,17 @@ (define_expand "add<mode>3"
 })
 
 (define_insn "*add<mode>3"
-  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
-	(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
-		  (match_operand:GPR 2 "add_operand" "r,I,L")))]
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r,r")
+	(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b,b")
+		  (match_operand:GPR 2 "add_operand" "r,I,L,eI")))]
   ""
   "@
    add %0,%1,%2
    addi %0,%1,%2
-   addis %0,%1,%v2"
-  [(set_attr "type" "add")])
+   addis %0,%1,%v2
+   addi %0,%1,%2"
+  [(set_attr "type" "add")
+   (set_attr "isa" "*,*,*,fut")])
 
 (define_insn "*addsi3_high"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=b")

* [PATCH] V4, patch #8: Enable -mpcrel on Linux 64-bit, but not on other targets
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (6 preceding siblings ...)
  2019-09-19  0:13 ` [PATCH] V4, patch #7: Use PADDI to add 34-bit constants Michael Meissner
@ 2019-09-19  0:17 ` Michael Meissner
  2019-09-24  5:59 ` [PATCH] V4.1, patch #1: Rework prefixed/pc-relative lookup (revised) Michael Meissner
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-09-19  0:17 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This is the last patch in the compiler series for now.  On Linux 64-bit systems
it will enable -mpcrel (and -mprefixed-addr) by default.  On other systems, it
will not enable these switches until the tm.h for the OS enables it.
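
The resulting decision logic can be modeled roughly as below.  This is a
simplified sketch, not the code in rs6000_option_override_internal: the struct
and function are hypothetical, plain booleans replace the rs6000_isa_flags bit
masks, and the error messages for explicit options are left out.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the option state.  */
struct addr_opts
{
  bool future;            /* -mcpu=future */
  bool powerpc64;         /* TARGET_POWERPC64 */
  bool elfv2;             /* rs6000_current_abi == ABI_ELFv2 */
  bool cmodel_medium;     /* TARGET_CMODEL == CMODEL_MEDIUM */
  bool prefixed;          /* -mprefixed-addr currently on */
  bool pcrel;             /* -mpcrel currently on */
  bool explicit_prefixed; /* user gave -mprefixed-addr or its negation */
  bool explicit_pcrel;    /* user gave -mpcrel or its negation */
  bool prefixed_default;  /* TARGET_PREFIXED_ADDR_DEFAULT from the OS tm.h */
  bool pcrel_default;     /* TARGET_PCREL_DEFAULT from the OS tm.h */
};

/* Resolve the final prefixed/pcrel settings in place.  */
static void
resolve_addressing (struct addr_opts *o)
{
  if (o->future)
    {
      /* Prefixed addressing needs 64-bit registers and the ELFv2 ABI.  */
      if (!o->powerpc64 || !o->elfv2)
        o->prefixed = o->pcrel = false;

      /* Pc-relative addressing needs the medium code model.  */
      else if (o->pcrel && !o->cmodel_medium)
        o->pcrel = false;

      /* Otherwise apply the OS defaults unless the user chose explicitly.  */
      else
        {
          if (!o->explicit_prefixed
              && (o->prefixed_default || o->pcrel || o->pcrel_default))
            o->prefixed = true;

          if (!o->explicit_pcrel && o->pcrel_default && o->cmodel_medium)
            o->pcrel = true;
        }
    }

  /* -mprefixed-addr (and hence -mpcrel) requires -mcpu=future.  */
  if (o->prefixed && !o->future)
    o->prefixed = o->pcrel = false;

  /* -mpcrel requires prefixed addressing.  */
  if (o->pcrel && !o->prefixed)
    o->pcrel = false;
}
```

In this model, a 64-bit ELFv2 target with both defaults set ends up with both
options on, while a target whose tm.h leaves the defaults at 0 keeps them off
unless the user asks for them.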

I have three test suite patches that will follow this one once things settle
down.  At the moment, the tests have not been modified, but I will look at your
comments to see if I need to modify anything.

In addition, I will be re-vamping PCREL_OPT to take into account your comments.
That will come at a later date.

I have done a bootstrap build with all of the patches applied, and there were
no regressions in the test suite (this includes running all of the tests not
yet submitted).  After posting these patches, I will start a job to build each
set of patches in turn just to make sure there are no extra warnings.

Can I commit this patch to the trunk?

2019-09-18  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/linux64.h (TARGET_PREFIXED_ADDR_DEFAULT): Enable
	prefixed addressing by default.
	(TARGET_PCREL_DEFAULT): Enable pc-relative addressing by default.
	* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Only
	enable -mprefixed-addr and -mpcrel if the OS tm.h says to enable
	it.
	(ADDRESSING_FUTURE_MASKS): New mask macro.
	(OTHER_FUTURE_MASKS): Use ADDRESSING_FUTURE_MASKS.
	* config/rs6000/rs6000.c (TARGET_PREFIXED_ADDR_DEFAULT): Do not
	enable -mprefixed-addr unless the OS tm.h says to.
	(TARGET_PCREL_DEFAULT): Do not enable -mpcrel unless the OS tm.h
	says to.
	(rs6000_option_override_internal): Do not enable -mprefixed-addr
	or -mpcrel unless the OS tm.h says to enable it.  Add more checks
	for -mcpu=future.

Index: gcc/config/rs6000/linux64.h
===================================================================
--- gcc/config/rs6000/linux64.h	(revision 275894)
+++ gcc/config/rs6000/linux64.h	(working copy)
@@ -640,3 +640,11 @@ extern int dot_symbols;
    enabling the __float128 keyword.  */
 #undef	TARGET_FLOAT128_ENABLE_TYPE
 #define TARGET_FLOAT128_ENABLE_TYPE 1
+
+/* Enable support for pc-relative and numeric prefixed addressing on the
+   'future' system.  */
+#undef  TARGET_PREFIXED_ADDR_DEFAULT
+#define TARGET_PREFIXED_ADDR_DEFAULT	1
+
+#undef  TARGET_PCREL_DEFAULT
+#define TARGET_PCREL_DEFAULT		1
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(revision 275894)
+++ gcc/config/rs6000/rs6000-cpus.def	(working copy)
@@ -75,15 +75,21 @@
 				 | OPTION_MASK_P8_VECTOR		\
 				 | OPTION_MASK_P9_VECTOR)
 
-/* Support for a future processor's features.  Do not enable -mpcrel until it
-   is fully functional.  */
+/* Support for a future processor's features.  The prefixed and pc-relative
+   addressing bits are not added here.  Instead, rs6000.c adds them if the OS
+   tm.h says that it supports the addressing modes.  */
 #define ISA_FUTURE_MASKS_SERVER	(ISA_3_0_MASKS_SERVER			\
-				 | OPTION_MASK_FUTURE			\
+				 | OPTION_MASK_FUTURE)
+
+/* Addressing related flags on a future processor.  These flags are broken out
+   because not all targets will support either pc-relative addressing, or even
+   prefixed addressing, and we want to clear all of the addressing bits
+   on targets that cannot support prefixed/pcrel addressing.  */
+#define ADDRESSING_FUTURE_MASKS	(OPTION_MASK_PCREL			\
 				 | OPTION_MASK_PREFIXED_ADDR)
 
 /* Flags that need to be turned off if -mno-future.  */
-#define OTHER_FUTURE_MASKS	(OPTION_MASK_PCREL			\
-				 | OPTION_MASK_PREFIXED_ADDR)
+#define OTHER_FUTURE_MASKS	ADDRESSING_FUTURE_MASKS
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
 #define OTHER_P9_VECTOR_MASKS	(OPTION_MASK_FLOAT128_HW		\
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 275912)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -98,6 +98,16 @@
 #endif
 #endif
 
+/* Set up the defaults for whether prefixed addressing is used, and if it is
+   used, whether we want to turn on pc-relative support by default.  */
+#ifndef TARGET_PREFIXED_ADDR_DEFAULT
+#define TARGET_PREFIXED_ADDR_DEFAULT	0
+#endif
+
+#ifndef TARGET_PCREL_DEFAULT
+#define TARGET_PCREL_DEFAULT		0
+#endif
+
 /* Support targetm.vectorize.builtin_mask_for_load.  */
 GTY(()) tree altivec_builtin_mask_for_load;
 
@@ -2532,6 +2542,14 @@ rs6000_debug_reg_global (void)
   if (TARGET_DIRECT_MOVE_128)
     fprintf (stderr, DEBUG_FMT_D, "VSX easy 64-bit mfvsrld element",
 	     (int)VECTOR_ELEMENT_MFVSRLD_64BIT);
+
+  if (TARGET_FUTURE)
+    {
+      fprintf (stderr, DEBUG_FMT_D, "TARGET_PREFIXED_ADDR_DEFAULT",
+	       TARGET_PREFIXED_ADDR_DEFAULT);
+      fprintf (stderr, DEBUG_FMT_D, "TARGET_PCREL_DEFAULT",
+	       TARGET_PCREL_DEFAULT);
+    }
 }
 
 \f
@@ -4012,26 +4030,6 @@ rs6000_option_override_internal (bool gl
       rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_HW;
     }
 
-  /* -mprefixed-addr (and hence -mpcrel) requires -mcpu=future.  */
-  if (TARGET_PREFIXED_ADDR && !TARGET_FUTURE)
-    {
-      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mcpu=future");
-      else if ((rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED_ADDR) != 0)
-	error ("%qs requires %qs", "-mprefixed-addr", "-mcpu=future");
-
-      rs6000_isa_flags &= ~(OPTION_MASK_PCREL | OPTION_MASK_PREFIXED_ADDR);
-    }
-
-  /* -mpcrel requires prefixed load/store addressing.  */
-  if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
-    {
-      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
-
-      rs6000_isa_flags &= ~OPTION_MASK_PCREL;
-    }
-
   /* Print the options after updating the defaults.  */
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
     rs6000_print_isa_options (stderr, 0, "after defaults", rs6000_isa_flags);
@@ -4163,12 +4161,89 @@ rs6000_option_override_internal (bool gl
   SUB3TARGET_OVERRIDE_OPTIONS;
 #endif
 
-  /* -mpcrel requires -mcmodel=medium, but we can't check TARGET_CMODEL until
-      after the subtarget override options are done.  */
-  if (TARGET_PCREL && TARGET_CMODEL != CMODEL_MEDIUM)
+  /* Enable prefixed addressing and pc-relative addressing on 64-bit ELF v2
+     systems if the OS tm.h file says that it is supported and the user did not
+     explicitly use -mprefixed-addr or -mpcrel.  At the present time, only
+     64-bit Linux enables this.
+
+     Pc-relative support also requires the medium code model.
+
+     However, we can't check for ELFv2 or -mcmodel=medium until after the
+     subtarget macros are run.
+
+     If prefixed addressing is disabled by default, and the user does -mpcrel,
+     don't force them to also specify -mprefixed-addr.  */
+  if (TARGET_FUTURE)
+    {
+      bool explicit_prefixed = ((rs6000_isa_flags_explicit
+				 & OPTION_MASK_PREFIXED_ADDR) != 0);
+      bool explicit_pcrel = ((rs6000_isa_flags_explicit
+			      & OPTION_MASK_PCREL) != 0);
+
+      /* Prefixed addressing requires 64-bit registers.  */
+      if (!TARGET_POWERPC64)
+	{
+	  if (TARGET_PCREL && explicit_pcrel)
+	    error ("%qs requires %qs", "-mpcrel", "-m64");
+
+	  else if (TARGET_PREFIXED_ADDR && explicit_prefixed)
+	    error ("%qs requires %qs", "-mprefixed-addr", "-m64");
+
+	  rs6000_isa_flags &= ~ADDRESSING_FUTURE_MASKS;
+	}
+
+      /* Only ELFv2 currently supports prefixed/pcrel addressing.  */
+      else if (rs6000_current_abi != ABI_ELFv2)
+	{
+	  if (TARGET_PCREL && explicit_pcrel)
+	    error ("%qs requires %qs", "-mpcrel", "-mabi=elfv2");
+
+	  else if (TARGET_PREFIXED_ADDR && explicit_prefixed)
+	    error ("%qs requires %qs", "-mprefixed-addr", "-mabi=elfv2");
+
+	  rs6000_isa_flags &= ~ADDRESSING_FUTURE_MASKS;
+	}
+
+      /* Pc-relative requires the medium code model.  */
+      else if (TARGET_PCREL && TARGET_CMODEL != CMODEL_MEDIUM)
+	{
+	  if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+	    error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");
+
+	  rs6000_isa_flags &= ~OPTION_MASK_PCREL;
+	}
+
+      /* Enable defaults if desired.  */
+      else
+	{
+	  if (!explicit_prefixed
+	      && (TARGET_PREFIXED_ADDR_DEFAULT
+		  || TARGET_PCREL
+		  || TARGET_PCREL_DEFAULT))
+	    rs6000_isa_flags |= OPTION_MASK_PREFIXED_ADDR;
+
+	  if (!explicit_pcrel && TARGET_PCREL_DEFAULT
+	      && TARGET_CMODEL == CMODEL_MEDIUM)
+	    rs6000_isa_flags |= OPTION_MASK_PCREL;
+	}
+    }
+
+  /* -mprefixed-addr (and hence -mpcrel) requires -mcpu=future.  */
+  if (TARGET_PREFIXED_ADDR && !TARGET_FUTURE)
     {
       if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");
+	error ("%qs requires %qs", "-mpcrel", "-mcpu=future");
+      else if ((rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED_ADDR) != 0)
+	error ("%qs requires %qs", "-mprefixed-addr", "-mcpu=future");
+
+      rs6000_isa_flags &= ~(OPTION_MASK_PCREL | OPTION_MASK_PREFIXED_ADDR);
+    }
+
+  /* -mpcrel requires prefixed load/store addressing.  */
+  if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
+    {
+      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+	error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
 
       rs6000_isa_flags &= ~OPTION_MASK_PCREL;
     }

* Re: [PATCH] V4, patch #1: Rework prefixed/pc-relative lookup
  2019-09-18 23:49 ` [PATCH] V4, patch #1: Rework prefixed/pc-relative lookup Michael Meissner
@ 2019-09-21  1:29   ` Segher Boessenkool
  2019-09-23 17:49     ` Michael Meissner
  0 siblings, 1 reply; 37+ messages in thread
From: Segher Boessenkool @ 2019-09-21  1:29 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

Hi Mike,

On Wed, Sep 18, 2019 at 07:49:18PM -0400, Michael Meissner wrote:
> This patch reworks the prefixed and pc-relative memory matching functions.

This mostly looks fine, thanks!  A few smaller things:


> 	(pcrel_external_address): Replace with new implementation using
> 	address_to_insn_form..

(Two dots is one too many).


> +(define_predicate "pcrel_local_or_external_address"
> +  (match_code "label_ref,symbol_ref,const")
> +{
> +  enum insn_form iform = address_to_insn_form (op, mode, NON_PREFIXED_DEFAULT);
> +  return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
> +})

(define_predicate "pcrel_local_or_external_address"
  (ior (match_operand 0 "pcrel_local_address")
       (match_operand 0 "pcrel_external_address")))

(or similar) please.  genpreds will generate effectively the same code
as you had automatically.


> +/* Different PowerPC instruction formats that are used by GCC.  There are
> +   various other instruction formats used by the PowerPC hardware, but the
> +   these formats are not currently used by GCC.  */
> +
> +enum insn_form {
> +  INSN_FORM_BAD,		/* Bad instruction format.  */
> +  INSN_FORM_BASE_REG,		/* Base register only.  */
> +  INSN_FORM_D,			/* Base register + 16-bit numeric offset.  */
> +  INSN_FORM_DS,			/* Base register + 14-bit offset + 00.  */
> +  INSN_FORM_DQ,			/* Base register + 12-bit offset + 0000.  */

It may be easier to describe DS-form as "D-form, with the offset aligned
to a (single) word" and DQ-form as "D-form, with the offset aligned to a
quad-word".  (Or what you do below; see below).

> +  INSN_FORM_X,			/* Base register + index register.  */
> +  INSN_FORM_UPDATE,		/* Address udpates base register.  */

(typo, "updates").

> +  INSN_FORM_LO_SUM,		/* Special offset instruction.  */

That's a somewhat lame description :-)  It's not really a separate insn
form anyway, hrm.  Can you come up with a better comment?  I have no
suggestions, so yeah maybe just keep it like you have.


> +  INSN_FORM_PREFIXED_NUMERIC,	/* Base register + 34 bit numeric offset.  */
> +  INSN_FORM_PCREL_LOCAL,	/* Pc-relative local symbol.  */
> +  INSN_FORM_PCREL_EXTERNAL	/* Pc-relative external symbol.  */
> +};

Either pc or PC please.  It's an initialism.


> +/* Instruction format for the non-prefixed version of a load or store.  This is
> +   used to determine if a 16-bit offset is valid to be used with a non-prefixed
> +   (traditional) instruction or if the bottom bits of the offset cannot be used
> +   with a DS or DQ instruction format, and GCC has to use a prefixed
> +   instruction for the load or store.  */
> +
> +enum non_prefixed {
> +  NON_PREFIXED_DEFAULT,		/* Use the default.  */
> +  NON_PREFIXED_D,		/* All 16-bits are valid.  */
> +  NON_PREFIXED_DS,		/* Bottom 2 bits must be 0.  */
> +  NON_PREFIXED_DQ,		/* Bottom 4 bits must be 0.  */
> +  NON_PREFIXED_X		/* No offset memory form exists.  */
> +};

Yeah the DS- and DQ-form descriptions here are nicer I think, thanks.

Maybe non_prefixed_form is a clearer name?  But it is longer of course.
You decide.
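
For what it's worth, the way I read the enum, the check it feeds looks roughly
like the sketch below.  The names are made up, the default case is left out,
and the offset is assumed to already fit in 34 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the non_prefixed enum (without the default).  */
enum non_prefixed_form
{
  NP_FORM_D,   /* all 16 bits are valid */
  NP_FORM_DS,  /* bottom 2 bits must be 0 */
  NP_FORM_DQ,  /* bottom 4 bits must be 0 */
  NP_FORM_X    /* no offset memory form exists */
};

/* Return true if a base register + OFFSET address cannot be encoded in the
   traditional instruction of form FORM, so a prefixed instruction has to be
   used instead.  */
static bool
offset_needs_prefix (int64_t offset, enum non_prefixed_form form)
{
  /* Anything outside the signed 16-bit range needs the 34-bit offset.  */
  if (offset < -INT64_C (0x8000) || offset > INT64_C (0x7fff))
    return true;

  switch (form)
    {
    case NP_FORM_D:
      return false;
    case NP_FORM_DS:
      return (offset & 3) != 0;
    case NP_FORM_DQ:
      return (offset & 15) != 0;
    case NP_FORM_X:
      return true;
    }

  return true;
}
```

So an offset of 6 is fine for a D-form load, but forces the prefixed form for
a DS-form or DQ-form one.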


> +      if (SYMBOL_REF_P (x) && !SYMBOL_REF_LOCAL_P (x))
> +	fputs ("@got", file);
> +
>        fputs ("@pcrel", file);

I'd just use fprintf btw; GCC has known for decades how to optimise that to
fputs, and it is easier to read IMO.  Not that fputs is so super bad,
but every little bit we do not have to think about helps (helps us think
about more important matters!)


> +/* Given an address (ADDR), a mode (MODE), and what the format of the
> +   non-prefixed address (NON_PREFIXED_INSN) is, return the instruction format
> +   for the address.  */
> +
> +enum insn_form
> +address_to_insn_form (rtx addr,
> +		      machine_mode mode,
> +		      enum non_prefixed non_prefixed_insn)

non_prefixed_form, instead?


> +{
> +  rtx op0, op1;

You can declare these at first use.  Declaring things in multiple blocks
(so with shorter scopes) is a bit nicer.

> +  /* If we don't support offset addressing, make sure only indexed addressing
> +     is allowed.  We special case SDmode so that the register allocator does
> +     try to move SDmode through GPR registers, but instead uses the 32-bit
> +     integer read/write instructions for the floating point registers.  */

Does *not* try?

Read/write, do you mean load/store?  lfiwzx and friends?


> +  if (GET_CODE (addr) != PLUS)
> +    return GET_CODE (addr) == LO_SUM ? INSN_FORM_LO_SUM : INSN_FORM_BAD;

  if (GET_CODE (addr) == LO_SUM)
    return INSN_FORM_LO_SUM;

  if (GET_CODE (addr) != PLUS)
    return INSN_FORM_BAD;

(Avoid using the conditional operator if you can use an "if" statement
just as well; easier to read).

> +  op0 = XEXP (addr, 0);
> +  op1 = XEXP (addr, 1);
> +
> +  if (REG_P (op1) || SUBREG_P (op1))
> +    return INSN_FORM_X;

I think you should have checked op0 here as well?

> +  /* If it isn't pc-relative, check for 16-bit D/DS/DQ-form.  */
> +  if (!REG_P (op0) && !SUBREG_P (op0))
> +    return INSN_FORM_BAD;

(Instead of only later here).

Overall this looks like it will work nicely, thanks!


Segher


* Re: [PATCH] V4, patch #1: Rework prefixed/pc-relative lookup
  2019-09-21  1:29   ` Segher Boessenkool
@ 2019-09-23 17:49     ` Michael Meissner
  0 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-09-23 17:49 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Fri, Sep 20, 2019 at 08:29:04PM -0500, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Wed, Sep 18, 2019 at 07:49:18PM -0400, Michael Meissner wrote:
> > This patch reworks the prefixed and pc-relative memory matching functions.
> 
> This mostly looks fine, thanks!  A few smaller things:
> 
> 
> > 	(pcrel_external_address): Replace with new implementation using
> > 	address_to_insn_form..
> 
> (Two dots is one too many).

Yes.

> > +(define_predicate "pcrel_local_or_external_address"
> > +  (match_code "label_ref,symbol_ref,const")
> > +{
> > +  enum insn_form iform = address_to_insn_form (op, mode, NON_PREFIXED_DEFAULT);
> > +  return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
> > +})
> 
> (define_predicate "pcrel_local_or_external_address"
>   (ior (match_operand 0 "pcrel_local_address")
>        (match_operand 0 "pcrel_external_address")))
> 
> (or similar) please.  genpreds will generate effectively the same code
> as you had automatically.

Ok.

> > +/* Different PowerPC instruction formats that are used by GCC.  There are
> > +   various other instruction formats used by the PowerPC hardware, but the
> > +   these formats are not currently used by GCC.  */
> > +
> > +enum insn_form {
> > +  INSN_FORM_BAD,		/* Bad instruction format.  */
> > +  INSN_FORM_BASE_REG,		/* Base register only.  */
> > +  INSN_FORM_D,			/* Base register + 16-bit numeric offset.  */
> > +  INSN_FORM_DS,			/* Base register + 14-bit offset + 00.  */
> > +  INSN_FORM_DQ,			/* Base register + 12-bit offset + 0000.  */
> 
> It may be easier to describe DS-form as "D-form, with the offset aligned
> to a (single) word" and DQ-form as "D-form, with the offset aligned to a
> quad-word".  (Or what you do below; see below).

Ok.

> > +  INSN_FORM_X,			/* Base register + index register.  */
> > +  INSN_FORM_UPDATE,		/* Address udpates base register.  */
> 
> (typo, "updates").
> 
> > +  INSN_FORM_LO_SUM,		/* Special offset instruction.  */
> 
> That's a somewhat lame description :-)  It's not really a separate form
> insn anyway, hrm.  Can you come up with a better comment?  I have no
> suggestions, so yeah maybe just keep it like you have.

None of my code uses INSN_FORM_LO_SUM, but I was adding it for completeness,
since it is a valid instruction format (as used within the compiler).

> 
> > +  INSN_FORM_PREFIXED_NUMERIC,	/* Base register + 34 bit numeric offset.  */
> > +  INSN_FORM_PCREL_LOCAL,	/* Pc-relative local symbol.  */
> > +  INSN_FORM_PCREL_EXTERNAL	/* Pc-relative external symbol.  */
> > +};
> 
> Either pc or PC please.  It's an initialism.

Yep.

> 
> > +/* Instruction format for the non-prefixed version of a load or store.  This is
> > +   used to determine if a 16-bit offset is valid to be used with a non-prefixed
> > +   (traditional) instruction or if the bottom bits of the offset cannot be used
> > +   with a DS or DQ instruction format, and GCC has to use a prefixed
> > +   instruction for the load or store.  */
> > +
> > +enum non_prefixed {
> > +  NON_PREFIXED_DEFAULT,		/* Use the default.  */
> > +  NON_PREFIXED_D,		/* All 16-bits are valid.  */
> > +  NON_PREFIXED_DS,		/* Bottom 2 bits must be 0.  */
> > +  NON_PREFIXED_DQ,		/* Bottom 4 bits must be 0.  */
> > +  NON_PREFIXED_X		/* No offset memory form exists.  */
> > +};
> 
> Yeah the DS- and DQ-form descriptions here are nicer I think, thanks.
> 
> Maybe non_prefixed_form is a clearer name?  But it is longer of course.
> You decide.

That is fine.

> 
> > +      if (SYMBOL_REF_P (x) && !SYMBOL_REF_LOCAL_P (x))
> > +	fputs ("@got", file);
> > +
> >        fputs ("@pcrel", file);
> 
> I'd just use fprintf btw, GCC knows since decades to optimise that to
> fputs, and it is easier to read IMO.  Not that fputs is so super bad,
> but every little we do not have to think helps (helps us think, think
> about more important matters!)

Ok.  It is just that the many years of writing C code before GCC did the
optimization are ingrained in me.
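
For readers following along, here is a small standalone sketch of the two
styles being discussed.  It is not code from the patch: the helper names
emit_suffix_fputs and emit_suffix_fprintf and the external_p flag are made up
for illustration; the real logic lives in print_operand_address.  Because the
strings contain no '%' directives, both helpers write the same bytes.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper (not in the patch): emit the relocation suffix with
   fputs, as the V4 patch did.  */
static void
emit_suffix_fputs (FILE *file, bool external_p)
{
  if (external_p)
    fputs ("@got", file);

  fputs ("@pcrel", file);
}

/* Hypothetical helper (not in the patch): the same output using fprintf.
   The format strings have no '%' directives, so the text is written
   unchanged.  */
static void
emit_suffix_fprintf (FILE *file, bool external_p)
{
  if (external_p)
    fprintf (file, "@got");

  fprintf (file, "@pcrel");
}
```

Either helper writes "@got@pcrel" for an external symbol and "@pcrel" for a
local one, so the choice is purely about readability.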
 
> > +/* Given an address (ADDR), a mode (MODE), and what the format of the
> > +   non-prefixed address (NON_PREFIXED_INSN) is, return the instruction format
> > +   for the address.  */
> > +
> > +enum insn_form
> > +address_to_insn_form (rtx addr,
> > +		      machine_mode mode,
> > +		      enum non_prefixed non_prefixed_insn)
> 
> non_prefixed_form, instead?
> 
> 
> > +{
> > +  rtx op0, op1;
> 
> You can declare these at first use.  Declaring things in multiple blocks
> (so with shorter scopes) is a bit nicer.
> 
> > +  /* If we don't support offset addressing, make sure only indexed addressing
> > +     is allowed.  We special case SDmode so that the register allocator does
> > +     try to move SDmode through GPR registers, but instead uses the 32-bit
> > +     integer read/write instructions for the floating point registers.  */
> 
> Does *not* try?
> 
> Read/write, do you mean load/store?  lfiwzx and friends?
> 
> 
> > +  if (GET_CODE (addr) != PLUS)
> > +    return GET_CODE (addr) == LO_SUM ? INSN_FORM_LO_SUM : INSN_FORM_BAD;
> 
>   if (GET_CODE (addr) == LO_SUM)
>     return INSN_FORM_LO_SUM;
> 
>   if (GET_CODE (addr) != PLUS)
>     return INSN_FORM_BAD;
> 
> (Avoid using the conditional operator if you can use an "if" statement
> just as well; easier to read).
> 
> > +  op0 = XEXP (addr, 0);
> > +  op1 = XEXP (addr, 1);
> > +
> > +  if (REG_P (op1) || SUBREG_P (op1))
> > +    return INSN_FORM_X;
> 
> I think you should have checked op0 here as well?

Yes probably.  Too many years of writing secondary reload patterns where the
first argument can be funny during the reload phase. :-)

> > +  /* If it isn't pc-relative, check for 16-bit D/DS/DQ-form.  */
> > +  if (!REG_P (op0) && !SUBREG_P (op0))
> > +    return INSN_FORM_BAD;
> 
> (Instead of only later here).
> 
> Overall this looks like it will work nicely, thanks!

Did you want the patch updated now, or should I wait for you to comment on the
next set of patches?

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V4.1, patch #1: Rework prefixed/pc-relative lookup (revised)
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (7 preceding siblings ...)
  2019-09-19  0:17 ` [PATCH] V4, patch #8: Enable -mpcrel on Linux 64-bit, but not on other targets Michael Meissner
@ 2019-09-24  5:59 ` Michael Meissner
  2019-09-27 22:59   ` Segher Boessenkool
  2019-09-24  6:10 ` [PATCH], V4.1, patch #2: Add prefixed insn attribute (revised) Michael Meissner
                   ` (12 subsequent siblings)
  21 siblings, 1 reply; 37+ messages in thread
From: Michael Meissner @ 2019-09-24  5:59 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch replaces patch #1.  It tries to address most/all of the review
comments.

This patch bootstraps fine, and there are no regressions.  With this patch, the
revised patch #2 (which will be posted next), and the other patches applied, GCC
builds Spec 2017 with -mcpu=future.  The other patches are unchanged apart from
making the spelling of PC-relative consistent (and line numbers adjusted because
patches #1/#2 were reworked).  Can I check this into trunk?

2019-09-23  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (pcrel_address): Delete predicate.
	(pcrel_local_address): Replace pcrel_address predicate, use the
	new function address_to_insn_form.
	(pcrel_external_address): Replace with new implementation using
	address_to_insn_form.
	(prefixed_mem_operand): Delete predicate which is now unused.
	(pcrel_external_mem_operand): Delete predicate which is now
	unused.
	* config/rs6000/rs6000-protos.h (enum insn_form): New
	enumeration.
	(enum non_prefixed): New enumeration.
	(address_to_insn_form): New declaration.
	* config/rs6000/rs6000.c (print_operand_address): Check for either
	PC-relative local symbols or PC-relative external symbols.
	(mode_supports_prefixed_address_p): Delete, no longer used.
	(rs6000_prefixed_address_mode_p): Delete, no longer used.
	(address_to_insn_form): New function to decode an address format.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 276068)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -1625,82 +1625,7 @@ (define_predicate "small_toc_ref"
   return GET_CODE (op) == UNSPEC && XINT (op, 1) == UNSPEC_TOCREL;
 })
 
-;; Return true if the operand is a pc-relative address.
-(define_predicate "pcrel_address"
-  (match_code "label_ref,symbol_ref,const")
-{
-  if (!rs6000_pcrel_p (cfun))
-    return false;
-
-  if (GET_CODE (op) == CONST)
-    op = XEXP (op, 0);
-
-  /* Validate offset.  */
-  if (GET_CODE (op) == PLUS)
-    {
-      rtx op0 = XEXP (op, 0);
-      rtx op1 = XEXP (op, 1);
-
-      if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
-	return false;
-
-      op = op0;
-    }
-
-  if (LABEL_REF_P (op))
-    return true;
-
-  return (SYMBOL_REF_P (op) && SYMBOL_REF_LOCAL_P (op));
-})
-
-;; Return true if the operand is an external symbol whose address can be loaded
-;; into a register using:
-;;	PLD reg,label@pcrel@got
-;;
-;; The linker will either optimize this to either a PADDI if the label is
-;; defined locally in another module or a PLD of the address if the label is
-;; defined in another module.
-
-(define_predicate "pcrel_external_address"
-  (match_code "symbol_ref,const")
-{
-  if (!rs6000_pcrel_p (cfun))
-    return false;
-
-  if (GET_CODE (op) == CONST)
-    op = XEXP (op, 0);
-
-  /* Validate offset.  */
-  if (GET_CODE (op) == PLUS)
-    {
-      rtx op0 = XEXP (op, 0);
-      rtx op1 = XEXP (op, 1);
-
-      if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
-	return false;
-
-      op = op0;
-    }
-
-  return (SYMBOL_REF_P (op) && !SYMBOL_REF_LOCAL_P (op));
-})
-
-;; Return 1 if op is a prefixed memory operand.
-(define_predicate "prefixed_mem_operand"
-  (match_code "mem")
-{
-  return rs6000_prefixed_address_mode_p (XEXP (op, 0), GET_MODE (op));
-})
-
-;; Return 1 if op is a memory operand to an external variable when we
-;; support pc-relative addressing and the PCREL_OPT relocation to
-;; optimize references to it.
-(define_predicate "pcrel_external_mem_operand"
-  (match_code "mem")
-{
-  return pcrel_external_address (XEXP (op, 0), Pmode);
-})
-
+\f
 ;; Match the first insn (addis) in fusing the combination of addis and loads to
 ;; GPR registers on power8.
 (define_predicate "fusion_gpr_addis"
@@ -1857,3 +1782,28 @@ (define_predicate "fusion_addis_mem_comb
 
   return 0;
 })
+
+\f
+;; Return true if the operand is a PC-relative address to a local symbol or a
+;; label that can be used directly in a memory operation.
+(define_predicate "pcrel_local_address"
+  (match_code "label_ref,symbol_ref,const")
+{
+  enum insn_form iform = address_to_insn_form (op, mode, NON_PREFIXED_DEFAULT);
+  return iform == INSN_FORM_PCREL_LOCAL;
+})
+
+;; Return true if the operand is a PC-relative external symbol whose address
+;; can be loaded into a register.
+(define_predicate "pcrel_external_address"
+  (match_code "symbol_ref,const")
+{
+  enum insn_form iform = address_to_insn_form (op, mode, NON_PREFIXED_DEFAULT);
+  return iform == INSN_FORM_PCREL_EXTERNAL;
+})
+
+;; Return true if the address is PC-relative and the symbol is either local or
+;; external.
+(define_predicate "pcrel_local_or_external_address"
+  (ior (match_operand 0 "pcrel_local_address")
+       (match_operand 0 "pcrel_external_address")))
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 276068)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -154,7 +154,41 @@ extern align_flags rs6000_loop_align (rt
 extern void rs6000_split_logical (rtx [], enum rtx_code, bool, bool, bool);
 extern bool rs6000_pcrel_p (struct function *);
 extern bool rs6000_fndecl_pcrel_p (const_tree);
-extern bool rs6000_prefixed_address_mode_p (rtx, machine_mode);
+
+/* Different PowerPC instruction formats that are used by GCC.  There are
+   various other instruction formats used by the PowerPC hardware, but
+   these formats are not currently used by GCC.  */
+
+enum insn_form {
+  INSN_FORM_BAD,		/* Bad instruction format.  */
+  INSN_FORM_BASE_REG,		/* Base register only.  */
+  INSN_FORM_D,			/* Reg + 16-bit numeric offset.  */
+  INSN_FORM_DS,			/* Reg + offset, bottom 2 bits must be 0.  */
+  INSN_FORM_DQ,			/* Reg + offset, bottom 4 bits must be 0.  */
+  INSN_FORM_X,			/* Base register + index register.  */
+  INSN_FORM_UPDATE,		/* Address updates base register.  */
+  INSN_FORM_LO_SUM,		/* Reg + offset using symbol.  */
+  INSN_FORM_PREFIXED_NUMERIC,	/* Reg + 34 bit numeric offset.  */
+  INSN_FORM_PCREL_LOCAL,	/* PC-relative local symbol.  */
+  INSN_FORM_PCREL_EXTERNAL	/* PC-relative external symbol.  */
+};
+
+/* Instruction format for the non-prefixed version of a load or store.  This is
+   used to determine if a 16-bit offset is valid to be used with a non-prefixed
+   (traditional) instruction or if the bottom bits of the offset cannot be used
+   with a DS or DQ instruction format, and GCC has to use a prefixed
+   instruction for the load or store.  */
+
+enum non_prefixed {
+  NON_PREFIXED_DEFAULT,		/* Use the default.  */
+  NON_PREFIXED_D,		/* All 16-bits are valid.  */
+  NON_PREFIXED_DS,		/* Bottom 2 bits must be 0.  */
+  NON_PREFIXED_DQ,		/* Bottom 4 bits must be 0.  */
+  NON_PREFIXED_X		/* No offset memory form exists.  */
+};
+
+extern enum insn_form address_to_insn_form (rtx, machine_mode,
+					    enum non_prefixed);
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276068)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -13078,8 +13078,8 @@ print_operand_address (FILE *file, rtx x
   if (REG_P (x))
     fprintf (file, "0(%s)", reg_names[ REGNO (x) ]);
 
-  /* Is it a pc-relative address?  */
-  else if (pcrel_address (x, Pmode))
+  /* Is it a PC-relative address?  */
+  else if (TARGET_PCREL && pcrel_local_or_external_address (x, VOIDmode))
     {
       HOST_WIDE_INT offset;
 
@@ -13099,7 +13099,10 @@ print_operand_address (FILE *file, rtx x
       if (offset)
 	fprintf (file, "%+" PRId64, offset);
 
-      fputs ("@pcrel", file);
+      if (SYMBOL_REF_P (x) && !SYMBOL_REF_LOCAL_P (x))
+	fprintf (file, "@got");
+
+      fprintf (file, "@pcrel");
     }
   else if (SYMBOL_REF_P (x) || GET_CODE (x) == CONST
 	   || GET_CODE (x) == LABEL_REF)
@@ -13584,71 +13587,6 @@ rs6000_pltseq_template (rtx *operands, i
   return str;
 }
 #endif
-
-/* Helper function to return whether a MODE can do prefixed loads/stores.
-   VOIDmode is used when we are loading the pc-relative address into a base
-   register, but we are not using it as part of a memory operation.  As modes
-   add support for prefixed memory, they will be added here.  */
-
-static bool
-mode_supports_prefixed_address_p (machine_mode mode)
-{
-  return mode == VOIDmode;
-}
-
-/* Function to return true if ADDR is a valid prefixed memory address that uses
-   mode MODE.  */
-
-bool
-rs6000_prefixed_address_mode_p (rtx addr, machine_mode mode)
-{
-  if (!TARGET_PREFIXED_ADDR || !mode_supports_prefixed_address_p (mode))
-    return false;
-
-  /* Check for PC-relative addresses.  */
-  if (pcrel_address (addr, Pmode))
-    return true;
-
-  /* Check for prefixed memory addresses that have a large numeric offset,
-     or an offset that can't be used for a DS/DQ-form memory operation.  */
-  if (GET_CODE (addr) == PLUS)
-    {
-      rtx op0 = XEXP (addr, 0);
-      rtx op1 = XEXP (addr, 1);
-
-      if (!base_reg_operand (op0, Pmode) || !CONST_INT_P (op1))
-	return false;
-
-      HOST_WIDE_INT value = INTVAL (op1);
-      if (!SIGNED_34BIT_OFFSET_P (value))
-	return false;
-
-      /* Offset larger than 16-bits?  */
-      if (!SIGNED_16BIT_OFFSET_P (value))
-	return true;
-
-      /* DQ instruction (bottom 4 bits must be 0) for vectors.  */
-      HOST_WIDE_INT mask;
-      if (GET_MODE_SIZE (mode) >= 16)
-	mask = 15;
-
-      /* DS instruction (bottom 2 bits must be 0).  For 32-bit integers, we
-	 need to use DS instructions if we are sign-extending the value with
-	 LWA.  For 32-bit floating point, we need DS instructions to load and
-	 store values to the traditional Altivec registers.  */
-      else if (GET_MODE_SIZE (mode) >= 4)
-	mask = 3;
-
-      /* QImode/HImode has no restrictions.  */
-      else
-	return true;
-
-      /* Return true if we must use a prefixed instruction.  */
-      return (value & mask) != 0;
-    }
-
-  return false;
-}
 \f
 #if defined (HAVE_GAS_HIDDEN) && !TARGET_MACHO
 /* Emit an assembler directive to set symbol visibility for DECL to
@@ -24613,6 +24551,170 @@ rs6000_pcrel_p (struct function *fn)
   return rs6000_fndecl_pcrel_p (fn->decl);
 }
 
+\f
+/* Given an address (ADDR), a mode (MODE), and what the format of the
+   non-prefixed address (NON_PREFIXED_FORM) is, return the instruction format
+   for the address.  */
+
+enum insn_form
+address_to_insn_form (rtx addr,
+		      machine_mode mode,
+		      enum non_prefixed non_prefixed_form)
+{
+  /* Single register is easy.  */
+  if (REG_P (addr) || SUBREG_P (addr))
+    return INSN_FORM_BASE_REG;
+
+  /* If the non prefixed instruction format doesn't support offset addressing,
+     make sure only indexed addressing is allowed.
+
+     We special case SDmode so that the register allocator does not try to move
+     SDmode through GPR registers, but instead uses the 32-bit integer load and
+     store instructions for the floating point registers.  */
+  if (non_prefixed_form == NON_PREFIXED_X || (mode == SDmode && TARGET_DFP))
+    {
+      if (GET_CODE (addr) != PLUS)
+	return INSN_FORM_BAD;
+
+      rtx op0 = XEXP (addr, 0);
+      rtx op1 = XEXP (addr, 1);
+      if (!REG_P (op0) && !SUBREG_P (op0))
+	return INSN_FORM_BAD;
+
+      if (!REG_P (op1) && !SUBREG_P (op1))
+	return INSN_FORM_BAD;
+
+      return INSN_FORM_X;
+    }
+
+  /* Deal with update forms.  */
+  if (GET_RTX_CLASS (GET_CODE (addr)) == RTX_AUTOINC)
+    return INSN_FORM_UPDATE;
+
+  /* Handle PC-relative symbols and labels.  Check for both local and external
+     symbols.  Assume labels are always local.  */
+  if (TARGET_PCREL)
+    {
+      if (SYMBOL_REF_P (addr) && !SYMBOL_REF_LOCAL_P (addr))
+	return INSN_FORM_PCREL_EXTERNAL;
+
+      if (SYMBOL_REF_P (addr) || LABEL_REF_P (addr))
+	return INSN_FORM_PCREL_LOCAL;
+    }
+
+  if (GET_CODE (addr) == CONST)
+    addr = XEXP (addr, 0);
+
+  /* Recognize LO_SUM addresses used with TOC and 32-bit addressing.  */
+  if (GET_CODE (addr) == LO_SUM)
+    return INSN_FORM_LO_SUM;
+
+  /* Everything below must be an offset address of some form.  */
+  if (GET_CODE (addr) != PLUS)
+    return INSN_FORM_BAD;
+
+  rtx op0 = XEXP (addr, 0);
+  rtx op1 = XEXP (addr, 1);
+
+  /* Check for indexed addresses.  */
+  if (REG_P (op1) || SUBREG_P (op1))
+    {
+      if (REG_P (op0) || SUBREG_P (op0))
+	return INSN_FORM_X;
+
+      return INSN_FORM_BAD;
+    }
+
+  if (!CONST_INT_P (op1))
+    return INSN_FORM_BAD;
+
+  HOST_WIDE_INT offset = INTVAL (op1);
+  if (!SIGNED_34BIT_OFFSET_P (offset))
+    return INSN_FORM_BAD;
+
+  /* Check for local and external PC-relative addresses.  Labels are always
+     local.  */
+  if (TARGET_PCREL)
+    {
+      if (SYMBOL_REF_P (op0) && !SYMBOL_REF_LOCAL_P (op0))
+	return INSN_FORM_PCREL_EXTERNAL;
+
+      if (SYMBOL_REF_P (op0) || LABEL_REF_P (op0))
+	return INSN_FORM_PCREL_LOCAL;
+    }
+
+  /* If it isn't PC-relative, the address must use a base register.  */
+  if (!REG_P (op0) && !SUBREG_P (op0))
+    return INSN_FORM_BAD;
+
+  /* Large offsets must be prefixed.  */
+  if (!SIGNED_16BIT_OFFSET_P (offset))
+    {
+      if (TARGET_PREFIXED_ADDR)
+	return INSN_FORM_PREFIXED_NUMERIC;
+
+      return INSN_FORM_BAD;
+    }
+
+  /* We have a 16-bit offset, see what default instruction format to use.  */
+  if (non_prefixed_form == NON_PREFIXED_DEFAULT)
+    {
+      unsigned size = GET_MODE_SIZE (mode);
+
+      /* On 64-bit systems, assume 64-bit integers need to use DS form
+	 addresses (for LD/STD).  VSX vectors need to use DQ form addresses
+	 (for LXV and STXV).  TImode is problematical in that its normal usage
+	 is expected to be GPRs where it wants a DS instruction format, but if
+	 it goes into the vector registers, it wants a DQ instruction
+	 format.  */
+      if (TARGET_POWERPC64 && size >= 8 && GET_MODE_CLASS (mode) == MODE_INT)
+	non_prefixed_form = NON_PREFIXED_DS;
+
+      else if (TARGET_VSX && size >= 16
+	       && (VECTOR_MODE_P (mode) || FLOAT128_VECTOR_P (mode)))
+	non_prefixed_form = NON_PREFIXED_DQ;
+
+      else
+	non_prefixed_form = NON_PREFIXED_D;
+    }
+
+  /* Classify the D/DS/DQ-form addresses.  */
+  switch (non_prefixed_form)
+    {
+      /* Instruction format D, all 16 bits are valid.  */
+    case NON_PREFIXED_D:
+      return INSN_FORM_D;
+
+      /* Instruction format DS, bottom 2 bits must be 0.  */
+    case NON_PREFIXED_DS:
+      if ((offset & 3) == 0)
+	return INSN_FORM_DS;
+
+      else if (TARGET_PREFIXED_ADDR)
+	return INSN_FORM_PREFIXED_NUMERIC;
+
+      else
+	return INSN_FORM_BAD;
+
+      /* Instruction format DQ, bottom 4 bits must be 0.  */
+    case NON_PREFIXED_DQ:
+      if ((offset & 15) == 0)
+	return INSN_FORM_DQ;
+
+      else if (TARGET_PREFIXED_ADDR)
+	return INSN_FORM_PREFIXED_NUMERIC;
+
+      else
+	return INSN_FORM_BAD;
+
+    default:
+      break;
+    }
+
+  return INSN_FORM_BAD;
+}
+
+\f
 #ifdef HAVE_GAS_HIDDEN
 # define USE_HIDDEN_LINKONCE 1
 #else

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], V4.1, patch #2: Add prefixed insn attribute (revised)
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (8 preceding siblings ...)
  2019-09-24  5:59 ` [PATCH] V4.1, patch #1: Rework prefixed/pc-relative lookup (revised) Michael Meissner
@ 2019-09-24  6:10 ` Michael Meissner
  2019-09-27 23:27   ` Segher Boessenkool
  2019-09-30 14:13 ` [PATCH], V4, patch #4.1: Enable prefixed/pc-rel addressing (revised) Michael Meissner
                   ` (11 subsequent siblings)
  21 siblings, 1 reply; 37+ messages in thread
From: Michael Meissner @ 2019-09-24  6:10 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch revises patch #2, fixing an issue that shows up when compiling large
code such as the Spec 2017 benchmark suite.  The issue was that when a vector
register holds a TImode value, the code needs to assume the non-prefixed
instruction uses the DQ encoding.
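
As a rough illustration of why the DS/DQ choice matters, here is a standalone
sketch (not code from the patch).  The names offset_form, fits_signed,
classify_offset, and the FORM_* values are made up for the example; only the
rules come from the patch comments: DS needs the bottom 2 bits of the offset to
be 0, DQ needs the bottom 4 bits to be 0, and anything that fails those rules
or does not fit in 16 bits needs a prefixed instruction with a 34-bit offset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the patch's enums; not GCC's definitions.  */
enum offset_form { FORM_D, FORM_DS, FORM_DQ, FORM_PREFIXED, FORM_BAD };

/* Return true if VALUE fits in a signed field that is BITS bits wide.  */
static bool
fits_signed (int64_t value, int bits)
{
  int64_t limit = (int64_t) 1 << (bits - 1);
  return value >= -limit && value < limit;
}

/* Classify OFFSET for a non-prefixed form whose low bits must be zero.
   LOW_BITS_MASK is 0 for D-form, 3 for DS-form, and 15 for DQ-form.
   HAVE_PREFIXED says whether prefixed instructions are available.  */
static enum offset_form
classify_offset (int64_t offset, int low_bits_mask, bool have_prefixed)
{
  /* Nothing can encode an offset wider than 34 bits.  */
  if (!fits_signed (offset, 34))
    return FORM_BAD;

  /* Too large for 16 bits, or misaligned for DS/DQ: prefixed or nothing.  */
  if (!fits_signed (offset, 16) || (offset & low_bits_mask) != 0)
    return have_prefixed ? FORM_PREFIXED : FORM_BAD;

  if (low_bits_mask == 15)
    return FORM_DQ;

  if (low_bits_mask == 3)
    return FORM_DS;

  return FORM_D;
}
```

With an offset of 8, classify_offset (8, 3, true) yields FORM_DS, but
classify_offset (8, 15, true) yields FORM_PREFIXED.  That is the TImode case
above: the same address is fine for a DS-form GPR access but must be prefixed
once the value lives in a vector register and the DQ rules apply.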

I also changed the spelling of PC-relative to be consistent.

The patch is also adjusted due to the changes made in the revised patch #1.

Assuming the revised patch #1 is checked in, can I check in this revised patch
into the trunk?  I did a bootstrap and make check with the patch and there were
no regressions.  I applied the remaining patches, and they also have no
regressions, and they can build the Spec 2017 test suite.

2019-09-23  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000-protos.h (prefixed_load_p): New
	declaration.
	(prefixed_store_p): New declaration.
	(prefixed_paddi_p): New declaration.
	(rs6000_asm_output_opcode): New declaration.
	(rs6000_final_prescan_insn): Move declaration and update calling
	signature.
	(address_is_prefixed): New helper inline function.
	* config/rs6000/rs6000.c (rs6000_emit_move): Support loading
	PC-relative addresses.
	(reg_to_non_prefixed): New function to identify what the
	non-prefixed memory instruction format is for a register.
	(prefixed_load_p): New function to identify prefixed loads.
	(prefixed_store_p): New function to identify prefixed stores.
	(prefixed_paddi_p): New function to identify prefixed load
	immediates.
	(next_insn_prefixed_p): New static state variable.
	(rs6000_final_prescan_insn): New function to determine if an insn
	uses a prefixed instruction.
	(rs6000_asm_output_opcode): New function to emit 'p' in front of a
	prefixed instruction.
	* config/rs6000/rs6000.h (FINAL_PRESCAN_INSN): New target hook.
	(ASM_OUTPUT_OPCODE): New target hook.
	* config/rs6000/rs6000.md (prefixed): New insn attribute for
	prefixed instructions.
	(prefixed_length): New insn attribute for the size of prefixed
	instructions.
	(non_prefixed_length): New insn attribute for the size of
	non-prefixed instructions.
	(pcrel_local_addr): New insn to load up a local PC-relative
	address.
	(pcrel_extern_addr): New insn to load up an external PC-relative
	address.

Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 276069)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -189,6 +189,30 @@ enum non_prefixed {
 
 extern enum insn_form address_to_insn_form (rtx, machine_mode,
 					    enum non_prefixed);
+extern bool prefixed_load_p (rtx_insn *);
+extern bool prefixed_store_p (rtx_insn *);
+extern bool prefixed_paddi_p (rtx_insn *);
+extern void rs6000_asm_output_opcode (FILE *);
+extern void rs6000_final_prescan_insn (rtx_insn *, rtx [], int);
+
+/* Return true if the address is a prefixed instruction that can be directly
+   used in a memory instruction (i.e. using numeric offset or a PC-relative
+   reference to a local symbol).
+
+   References to external PC-relative symbols aren't allowed, because GCC has
+   to load the address into a register and then issue a separate load or
+   store.  */
+
+static inline bool
+address_is_prefixed (rtx addr,
+		     machine_mode mode,
+		     enum non_prefixed non_prefixed_insn)
+{
+  enum insn_form iform = address_to_insn_form (addr, mode,
+					       non_prefixed_insn);
+  return (iform == INSN_FORM_PREFIXED_NUMERIC
+	  || iform == INSN_FORM_PCREL_LOCAL);
+}
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
@@ -268,8 +292,6 @@ extern void rs6000_d_target_versions (vo
 const char * rs6000_xcoff_strip_dollar (const char *);
 #endif
 
-void rs6000_final_prescan_insn (rtx_insn *, rtx *operand, int num_operands);
-
 extern unsigned char rs6000_class_max_nregs[][LIM_REG_CLASSES];
 extern unsigned char rs6000_hard_regno_nregs[][FIRST_PSEUDO_REGISTER];
 
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276069)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -9639,6 +9639,22 @@ rs6000_emit_move (rtx dest, rtx source,
 	  return;
 	}
 
+      /* Use the default pattern for loading up PC-relative addresses.  */
+      if (TARGET_PCREL && mode == Pmode
+	  && (SYMBOL_REF_P (operands[1]) || LABEL_REF_P (operands[1])
+	      || GET_CODE (operands[1]) == CONST))
+	{
+	  enum insn_form iform = address_to_insn_form (operands[1], mode,
+						       NON_PREFIXED_DEFAULT);
+
+	  if (iform == INSN_FORM_PCREL_LOCAL
+	      || iform == INSN_FORM_PCREL_EXTERNAL)
+	    {
+	      emit_insn (gen_rtx_SET (operands[0], operands[1]));
+	      return;
+	    }
+	}
+
       if (DEFAULT_ABI == ABI_V4
 	  && mode == Pmode && mode == SImode
 	  && flag_pic == 1 && got_operand (operands[1], mode))
@@ -24714,6 +24730,211 @@ address_to_insn_form (rtx addr,
   return INSN_FORM_BAD;
 }
 
+/* Helper function to take a REG and a MODE and turn it into the non-prefixed
+   instruction format (D/DS/DQ) used for offset memory.  */
+
+static enum non_prefixed
+reg_to_non_prefixed (rtx reg, machine_mode mode)
+{
+  /* If it isn't a register, use the defaults.  */
+  if (!REG_P (reg) && !SUBREG_P (reg))
+    return NON_PREFIXED_DEFAULT;
+
+  unsigned int r = reg_or_subregno (reg);
+
+  /* If we have a pseudo, use the default instruction format.  */
+  if (r >= FIRST_PSEUDO_REGISTER)
+    return NON_PREFIXED_DEFAULT;
+
+  unsigned size = GET_MODE_SIZE (mode);
+
+  /* FPR registers use D-mode for scalars, and DQ-mode for vectors, IEEE
+     128-bit floating point, and 128-bit integers.  */
+  if (FP_REGNO_P (r))
+    {
+      if (mode == SFmode || size == 8 || FLOAT128_2REG_P (mode))
+	return NON_PREFIXED_D;
+
+      else if (size < 8)
+	return NON_PREFIXED_X;
+
+      else if (TARGET_VSX && size >= 16
+	       && (VECTOR_MODE_P (mode)
+		   || FLOAT128_VECTOR_P (mode)
+		   || mode == TImode || mode == CTImode))
+	return NON_PREFIXED_DQ;
+
+      else
+	return NON_PREFIXED_DEFAULT;
+    }
+
+  /* Altivec registers use DS-mode for scalars, and DQ-mode for vectors, IEEE
+     128-bit floating point, and 128-bit integers.  */
+  else if (ALTIVEC_REGNO_P (r))
+    {
+      if (mode == SFmode || size == 8 || FLOAT128_2REG_P (mode))
+	return NON_PREFIXED_DS;
+
+      else if (size < 8)
+	return NON_PREFIXED_X;
+
+      else if (TARGET_VSX && size >= 16
+	       && (VECTOR_MODE_P (mode)
+		   || FLOAT128_VECTOR_P (mode)
+		   || mode == TImode || mode == CTImode))
+	return NON_PREFIXED_DQ;
+
+      else
+	return NON_PREFIXED_DEFAULT;
+    }
+
+  /* GPR registers use DS-mode for 64-bit items on 64-bit systems, and D-mode
+     otherwise.  Assume that any other register, such as LR, CRs, etc. will go
+     through the GPR registers for memory operations.  */
+  else if (TARGET_POWERPC64 && size >= 8)
+    return NON_PREFIXED_DS;
+
+  return NON_PREFIXED_D;
+}
+
+\f
+/* Whether a load instruction is a prefixed instruction.  This is called from
+   the prefixed attribute processing.  */
+
+bool
+prefixed_load_p (rtx_insn *insn)
+{
+  /* Validate the insn to make sure it is a normal load insn.  */
+  extract_insn_cached (insn);
+  if (recog_data.n_operands < 2)
+    return false;
+
+  rtx reg = recog_data.operand[0];
+  rtx mem = recog_data.operand[1];
+
+  if (!REG_P (reg) && !SUBREG_P (reg))
+    return false;
+
+  if (!MEM_P (mem))
+    return false;
+
+  /* LWA uses the DS format instead of the D format that LWZ uses.  */
+  enum non_prefixed non_prefixed_insn;
+  machine_mode reg_mode = GET_MODE (reg);
+  machine_mode mem_mode = GET_MODE (mem);
+
+  if (mem_mode == SImode && reg_mode == DImode
+      && get_attr_sign_extend (insn) == SIGN_EXTEND_YES)
+    non_prefixed_insn = NON_PREFIXED_DS;
+
+  else
+    non_prefixed_insn = reg_to_non_prefixed (reg, mem_mode);
+
+  return address_is_prefixed (XEXP (mem, 0), mem_mode, non_prefixed_insn);
+}
+
+/* Whether a store instruction is a prefixed instruction.  This is called from
+   the prefixed attribute processing.  */
+
+bool
+prefixed_store_p (rtx_insn *insn)
+{
+  /* Validate the insn to make sure it is a normal store insn.  */
+  extract_insn_cached (insn);
+  if (recog_data.n_operands < 2)
+    return false;
+
+  rtx mem = recog_data.operand[0];
+  rtx reg = recog_data.operand[1];
+
+  if (!REG_P (reg) && !SUBREG_P (reg))
+    return false;
+
+  if (!MEM_P (mem))
+    return false;
+
+  machine_mode mem_mode = GET_MODE (mem);
+  enum non_prefixed non_prefixed_insn = reg_to_non_prefixed (reg, mem_mode);
+  return address_is_prefixed (XEXP (mem, 0), mem_mode, non_prefixed_insn);
+}
+
+/* Whether a load immediate or add instruction is a prefixed instruction.  This
+   is called from the prefixed attribute processing.  */
+
+bool
+prefixed_paddi_p (rtx_insn *insn)
+{
+  rtx set = single_set (insn);
+  if (!set)
+    return false;
+
+  rtx dest = SET_DEST (set);
+  rtx src = SET_SRC (set);
+
+  if (!REG_P (dest) && !SUBREG_P (dest))
+    return false;
+
+  /* Is this a load immediate that can't be done with a simple ADDI or
+     ADDIS?  */
+  if (CONST_INT_P (src))
+    return (satisfies_constraint_eI (src)
+	    && !satisfies_constraint_I (src)
+	    && !satisfies_constraint_L (src));
+
+  /* Is this a PADDI instruction that can't be done with a simple ADDI or
+     ADDIS?  */
+  if (GET_CODE (src) == PLUS)
+    {
+      rtx op1 = XEXP (src, 1);
+
+      return (CONST_INT_P (op1)
+	      && satisfies_constraint_eI (op1)
+	      && !satisfies_constraint_I (op1)
+	      && !satisfies_constraint_L (op1));
+    }
+
+  /* If not, is it a load of a PC-relative address?  */
+  if (!TARGET_PCREL || GET_MODE (dest) != Pmode)
+    return false;
+
+  if (!SYMBOL_REF_P (src) && !LABEL_REF_P (src) && GET_CODE (src) != CONST)
+    return false;
+
+  enum insn_form iform = address_to_insn_form (src, Pmode,
+					       NON_PREFIXED_DEFAULT);
+
+  return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
+}
+
+/* Whether the next instruction needs a 'p' prefix issued before the
+   instruction is printed out.  */
+static bool next_insn_prefixed_p;
+
+/* Define FINAL_PRESCAN_INSN if some processing needs to be done before
+   outputting the assembler code.  On the PowerPC, we remember if the current
+   insn is a prefixed insn where we need to emit a 'p' before the insn.
+
+   In addition, if the insn is part of a PC-relative reference to an external
+   label optimization, this is recorded also.  */
+void
+rs6000_final_prescan_insn (rtx_insn *insn, rtx [], int)
+{
+  next_insn_prefixed_p = (get_attr_prefixed (insn) != PREFIXED_NO);
+  return;
+}
+
+/* Define ASM_OUTPUT_OPCODE to do anything special before emitting an opcode.
+   We use it to emit a 'p' for prefixed insns, based on the flag set in
+   FINAL_PRESCAN_INSN.  */
+void
+rs6000_asm_output_opcode (FILE *stream)
+{
+  if (next_insn_prefixed_p)
+    fprintf (stream, "p");
+
+  return;
+}
+
 \f
 #ifdef HAVE_GAS_HIDDEN
 # define USE_HIDDEN_LINKONCE 1
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 276061)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -2547,3 +2547,24 @@ typedef struct GTY(()) machine_function
   IN_RANGE ((VALUE),							\
 	    -(HOST_WIDE_INT_1 << 33),					\
 	    (HOST_WIDE_INT_1 << 33) - 1 - (EXTRA))
+
+/* Define this if some processing needs to be done before outputting the
+   assembler code.  On the PowerPC, we remember if the current insn is a normal
+   prefixed insn where we need to emit a 'p' before the insn.  */
+#define FINAL_PRESCAN_INSN(INSN, OPERANDS, NOPERANDS)			\
+do									\
+  {									\
+    if (TARGET_PREFIXED_ADDR)						\
+      rs6000_final_prescan_insn (INSN, OPERANDS, NOPERANDS);		\
+  }									\
+while (0)
+
+/* Do anything special before emitting an opcode.  We use it to emit a 'p' for
+   prefixed insns, based on the flag set in FINAL_PRESCAN_INSN.  */
+#define ASM_OUTPUT_OPCODE(STREAM, OPCODE)				\
+  do									\
+    {									\
+     if (TARGET_PREFIXED_ADDR)						\
+       rs6000_asm_output_opcode (STREAM);				\
+    }									\
+  while (0)
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 276061)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -256,8 +256,52 @@ (define_attr "var_shift" "no,yes"
 ;; Is copying of this instruction disallowed?
 (define_attr "cannot_copy" "no,yes" (const_string "no"))
 
-;; Length of the instruction (in bytes).
-(define_attr "length" "" (const_int 4))
+
+;; Whether an insn is a prefixed insn, and an initial 'p' should be printed
+;; before the instruction.  A prefixed instruction has a prefix instruction
+;; word that extends the immediate value of the instructions from 12-16 bits to
+;; 34 bits.  The macro ASM_OUTPUT_OPCODE emits a leading 'p' for prefixed
+;; insns.  The "length" attribute of a prefixed insn also defaults to 12
+;; bytes.
+(define_attr "prefixed" "no,yes"
+  (cond [(ior (match_test "!TARGET_PREFIXED_ADDR")
+	      (match_test "!NONJUMP_INSN_P (insn)"))
+	 (const_string "no")
+
+	 (eq_attr "type" "load,fpload,vecload")
+	 (if_then_else (and (eq_attr "indexed" "no")
+			    (eq_attr "update" "no")
+			    (match_test "prefixed_load_p (insn)"))
+		       (const_string "yes")
+		       (const_string "no"))
+
+	 (eq_attr "type" "store,fpstore,vecstore")
+	 (if_then_else (and (eq_attr "indexed" "no")
+			    (eq_attr "update" "no")
+			    (match_test "prefixed_store_p (insn)"))
+		       (const_string "yes")
+		       (const_string "no"))
+
+	 (eq_attr "type" "integer,add")
+	 (if_then_else (match_test "prefixed_paddi_p (insn)")
+		       (const_string "yes")
+		       (const_string "no"))]
+	(const_string "no")))
+
+;; Length in bytes of instructions that use prefixed addressing and length in
+;; bytes of instructions that do not use prefixed addressing.  This allows
+;; both lengths to be defined as constants, and the length attribute can pick
+;; the size as appropriate.
+(define_attr "prefixed_length" "" (const_int 12))
+(define_attr "non_prefixed_length" "" (const_int 4))
+
+;; Length of the instruction (in bytes).  Prefixed insns are 8 bytes, but the
+;; assembler might need to issue a NOP so that the prefixed instruction
+;; does not cross a cache boundary, which makes them possibly 12 bytes.
+(define_attr "length" ""
+  (if_then_else (eq_attr "prefixed" "yes")
+		(attr "prefixed_length")
+		(attr "non_prefixed_length")))
 
 ;; Processor type -- this attribute must exactly match the processor_type
 ;; enumeration in rs6000-opts.h.
@@ -9875,6 +9919,28 @@ (define_expand "restore_stack_nonlocal"
   operands[6] = gen_rtx_PARALLEL (VOIDmode, p);
 })
 \f
+;; Load up a PC-relative address.  Print_operand_address will append a @pcrel
+;; to the symbol or label.
+(define_insn "*pcrel_local_addr"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+	(match_operand:DI 1 "pcrel_local_address"))]
+  "TARGET_PCREL"
+  "la %0,%a1"
+  [(set_attr "prefixed" "yes")])
+
+;; Load up a PC-relative address to an external symbol.  If the symbol and the
+;; program are both defined in the main program, the linker will optimize this
+;; to a PADDI.  Otherwise, it will create a GOT address that is relocated by
+;; the dynamic linker and loaded up.  Print_operand_address will append a
+;; @got@pcrel to the symbol.
+(define_insn "*pcrel_extern_addr"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+	(match_operand:DI 1 "pcrel_external_address"))]
+  "TARGET_PCREL"
+  "ld %0,%a1"
+  [(set_attr "prefixed" "yes")
+   (set_attr "type" "load")])
+
 ;; TOC register handling.
 
 ;; Code to initialize the TOC register...
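
To illustrate the "length" comment in the rs6000.md hunk above: a prefixed
insn is 8 bytes, and the 12-byte worst case comes from one 4-byte NOP of
padding.  The following is only a minimal sketch, not code from the patch.  It
assumes the boundary is 64 bytes (the patch comment only says "cache
boundary") and that instructions are 4-byte aligned.  The names
SKETCH_BOUNDARY, prefixed_crosses_boundary and prefixed_bytes_at are made up
for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: prefixed instructions may not straddle a 64-byte boundary.  */
#define SKETCH_BOUNDARY 64

/* Return true if an 8-byte prefixed instruction placed at ADDR would start in
   one SKETCH_BOUNDARY-sized block and end in the next one.  */
bool
prefixed_crosses_boundary (uint64_t addr)
{
  assert (addr % 4 == 0);	/* Instructions are 4-byte aligned.  */
  return addr / SKETCH_BOUNDARY != (addr + 7) / SKETCH_BOUNDARY;
}

/* Bytes consumed when a prefixed instruction is requested at ADDR: 8 in the
   normal case, 12 if a 4-byte NOP has to be inserted in front of it.  */
unsigned
prefixed_bytes_at (uint64_t addr)
{
  return prefixed_crosses_boundary (addr) ? 12 : 8;
}
```

With these assumptions only addresses of the form 64*n + 60 need the NOP.  The
compiler does not know the final address when it computes insn lengths, which
is presumably why the attribute uses the 12-byte worst case.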

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* Re: [PATCH] V4.1, patch #1: Rework prefixed/pc-relative lookup (revised)
  2019-09-24  5:59 ` [PATCH] V4.1, patch #1: Rework prefixed/pc-relative lookup (revised) Michael Meissner
@ 2019-09-27 22:59   ` Segher Boessenkool
  2019-09-30 13:51     ` [PATCH, committed] V4.2, patch #1: Rework prefixed/pc-relative lookup (revised #2) Michael Meissner
  0 siblings, 1 reply; 37+ messages in thread
From: Segher Boessenkool @ 2019-09-27 22:59 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

Hi!

On Tue, Sep 24, 2019 at 01:59:07AM -0400, Michael Meissner wrote:
> +;; Return true if the operand is a PC-relative address to a local symbol or a
> +;; label that can be used directly in a memory operation.

"address of", not "address to"?

> +/* Different PowerPC instruction formats that are used by GCC.  There are
> +   various other instruction formats used by the PowerPC hardware, but the
> +   these formats are not currently used by GCC.  */

"the these".

> +enum non_prefixed {
> +  NON_PREFIXED_DEFAULT,		/* Use the default.  */
> +  NON_PREFIXED_D,		/* All 16-bits are valid.  */
> +  NON_PREFIXED_DS,		/* Bottom 2 bits must be 0.  */
> +  NON_PREFIXED_DQ,		/* Bottom 4 bits must be 0.  */
> +  NON_PREFIXED_X		/* No offset memory form exists.  */
> +};

Please call the enum non_prefixed_form.

With those nits fixed, this is fine for trunk.  Thank you!


Segher


* Re: [PATCH], V4.1, patch #2: Add prefixed insn attribute (revised)
  2019-09-24  6:10 ` [PATCH], V4.1, patch #2: Add prefixed insn attribute (revised) Michael Meissner
@ 2019-09-27 23:27   ` Segher Boessenkool
  2019-09-30 13:53     ` [PATCH, committed], V4.2, patch #2: Add prefixed insn attribute (revised #2) Michael Meissner
  0 siblings, 1 reply; 37+ messages in thread
From: Segher Boessenkool @ 2019-09-27 23:27 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

Hi!

On Tue, Sep 24, 2019 at 02:10:10AM -0400, Michael Meissner wrote:
> +/* Return true if the address is a prefixed instruction that can be directly
> +   used in a memory instruction (i.e. using numeric offset or a PC-relative
> +   reference to a local symbol).

This could use a bit of a rewrite...  "Return whether the address is valid
for a prefixed memory instruction [...]"?

> +      /* Use the default pattern for loading up PC-relative addresses.  */
> +      if (TARGET_PCREL && mode == Pmode
> +	  && (SYMBOL_REF_P (operands[1]) || LABEL_REF_P (operands[1])
> +	      || GET_CODE (operands[1]) == CONST))

Maybe this can use some predicate function?  That will make the CONST
stand out more as being the special case here, too?

> +  unsigned int r = reg_or_subregno (reg);
> +
> +  /* If we have a pseudo, use the default instruction format.  */
> +  if (r >= FIRST_PSEUDO_REGISTER)
> +    return NON_PREFIXED_DEFAULT;

Please use

  if (!HARD_REGISTER_NUM_P (r))

> +	 (eq_attr "type" "load,fpload,vecload")
> +	 (if_then_else (and (eq_attr "indexed" "no")
> +			    (eq_attr "update" "no")
> +			    (match_test "prefixed_load_p (insn)"))
> +		       (const_string "yes")
> +		       (const_string "no"))

It looks like prefixed_load_p and prefixed_store_p should test for
"indexed" "no" and "update" "no" themselves?  The code here simplifies
a bit then.

(blank line before the default case please).

> +	(const_string "no")))


Okay for trunk with those things fixed.  Thanks!


Segher


* Re: [PATCH], V4, patch #3: Fix up mov<mode>_64bit_dm
  2019-09-18 23:58 ` [PATCH], V4, patch #3: Fix up mov<mode>_64bit_dm Michael Meissner
@ 2019-09-27 23:33   ` Segher Boessenkool
  0 siblings, 0 replies; 37+ messages in thread
From: Segher Boessenkool @ 2019-09-27 23:33 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

On Wed, Sep 18, 2019 at 07:58:46PM -0400, Michael Meissner wrote:
> In doing the patches, I noticed that mov<mode>_64bit_dm had two alternatives
> combined together.  This patch fixes the problem, before the next patch that
> will need to modify mov<mode>_64bit_dm for prefixed addressing.

This is okay for trunk.  Thanks!


Segher


> 2019-09-18  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/rs6000.md (mov<mode>_64bit_dm): Split the
> 	alternatives for loading 0.0 to a GPR and loading a 128-bit
> 	floating point type to a GPR.


* Re: [PATCH, committed] V4.2, patch #1: Rework prefixed/pc-relative lookup (revised #2)
  2019-09-27 22:59   ` Segher Boessenkool
@ 2019-09-30 13:51     ` Michael Meissner
  0 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-09-30 13:51 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

This is the reworked version of patch #1 that I committed:

2019-09-30  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (pcrel_address): Delete predicate.
	(pcrel_local_address): Replace pcrel_address predicate, use the
	new function address_to_insn_form.
	(pcrel_external_address): Replace with new implementation using
	address_to_insn_form.
	(prefixed_mem_operand): Delete predicate which is now unused.
	(pcrel_external_mem_operand): Delete predicate which is now
	unused.
	* config/rs6000/rs6000-protos.h (enum insn_form): New
	enumeration.
	(enum non_prefixed_form): New enumeration.
	(address_to_insn_form): New declaration.
	* config/rs6000/rs6000.c (print_operand_address): Check for either
	PC-relative local symbols or PC-relative external symbols.
	(mode_supports_prefixed_address_p): Delete, no longer used.
	(rs6000_prefixed_address_mode_p): Delete, no longer used.
	(address_to_insn_form): New function to decode an address format.
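
As a reading aid for the numeric-offset part of address_to_insn_form in the
patch below, here is a small standalone sketch of the same decision: a
register plus constant offset is D/DS/DQ-form if the offset fits in 16 bits
and has the required low bits clear, and otherwise needs a prefixed
instruction if prefixed addressing is available.  This is not the GCC code.
The sketch_* names and the have_prefixed flag are hypothetical stand-ins for
the real enumerations and TARGET_PREFIXED_ADDR, and the PC-relative, indexed
and update cases are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the enumerations in rs6000-protos.h.  */
enum sketch_non_prefixed { SK_NP_D, SK_NP_DS, SK_NP_DQ };
enum sketch_form { SK_BAD, SK_D, SK_DS, SK_DQ, SK_PREFIXED_NUMERIC };

/* Return true if VALUE fits in a signed field that is BITS wide.  */
static bool
fits_signed (int64_t value, int bits)
{
  assert (bits > 0 && bits < 64);
  int64_t limit = (int64_t) 1 << (bits - 1);
  return value >= -limit && value < limit;
}

/* Classify a base register + OFFSET address, given the format NP of the
   non-prefixed instruction and whether prefixed addressing is available.  */
enum sketch_form
classify_offset (int64_t offset, enum sketch_non_prefixed np,
		 bool have_prefixed)
{
  /* Nothing can encode more than a signed 34-bit offset.  */
  if (!fits_signed (offset, 34))
    return SK_BAD;

  /* Offsets that need more than 16 bits must be prefixed.  */
  if (!fits_signed (offset, 16))
    return have_prefixed ? SK_PREFIXED_NUMERIC : SK_BAD;

  /* DS needs the bottom 2 bits clear, DQ the bottom 4, D has no limit.  */
  int64_t mask = np == SK_NP_DS ? 3 : np == SK_NP_DQ ? 15 : 0;
  if ((offset & mask) == 0)
    return np == SK_NP_DS ? SK_DS : np == SK_NP_DQ ? SK_DQ : SK_D;

  /* A misaligned offset can still be encoded by a prefixed instruction.  */
  return have_prefixed ? SK_PREFIXED_NUMERIC : SK_BAD;
}
```

For example, an offset of 6 is fine for a D-form instruction, but with a
DS-form instruction it is only usable when prefixed addressing is available.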

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 276276)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -1625,82 +1625,7 @@ (define_predicate "small_toc_ref"
   return GET_CODE (op) == UNSPEC && XINT (op, 1) == UNSPEC_TOCREL;
 })
 
-;; Return true if the operand is a pc-relative address.
-(define_predicate "pcrel_address"
-  (match_code "label_ref,symbol_ref,const")
-{
-  if (!rs6000_pcrel_p (cfun))
-    return false;
-
-  if (GET_CODE (op) == CONST)
-    op = XEXP (op, 0);
-
-  /* Validate offset.  */
-  if (GET_CODE (op) == PLUS)
-    {
-      rtx op0 = XEXP (op, 0);
-      rtx op1 = XEXP (op, 1);
-
-      if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
-	return false;
-
-      op = op0;
-    }
-
-  if (LABEL_REF_P (op))
-    return true;
-
-  return (SYMBOL_REF_P (op) && SYMBOL_REF_LOCAL_P (op));
-})
-
-;; Return true if the operand is an external symbol whose address can be loaded
-;; into a register using:
-;;	PLD reg,label@pcrel@got
-;;
-;; The linker will either optimize this to either a PADDI if the label is
-;; defined locally in another module or a PLD of the address if the label is
-;; defined in another module.
-
-(define_predicate "pcrel_external_address"
-  (match_code "symbol_ref,const")
-{
-  if (!rs6000_pcrel_p (cfun))
-    return false;
-
-  if (GET_CODE (op) == CONST)
-    op = XEXP (op, 0);
-
-  /* Validate offset.  */
-  if (GET_CODE (op) == PLUS)
-    {
-      rtx op0 = XEXP (op, 0);
-      rtx op1 = XEXP (op, 1);
-
-      if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
-	return false;
-
-      op = op0;
-    }
-
-  return (SYMBOL_REF_P (op) && !SYMBOL_REF_LOCAL_P (op));
-})
-
-;; Return 1 if op is a prefixed memory operand.
-(define_predicate "prefixed_mem_operand"
-  (match_code "mem")
-{
-  return rs6000_prefixed_address_mode_p (XEXP (op, 0), GET_MODE (op));
-})
-
-;; Return 1 if op is a memory operand to an external variable when we
-;; support pc-relative addressing and the PCREL_OPT relocation to
-;; optimize references to it.
-(define_predicate "pcrel_external_mem_operand"
-  (match_code "mem")
-{
-  return pcrel_external_address (XEXP (op, 0), Pmode);
-})
-
+\f
 ;; Match the first insn (addis) in fusing the combination of addis and loads to
 ;; GPR registers on power8.
 (define_predicate "fusion_gpr_addis"
@@ -1857,3 +1782,28 @@ (define_predicate "fusion_addis_mem_comb
 
   return 0;
 })
+
+\f
+;; Return true if the operand is a PC-relative address of a local symbol or a
+;; label that can be used directly in a memory operation.
+(define_predicate "pcrel_local_address"
+  (match_code "label_ref,symbol_ref,const")
+{
+  enum insn_form iform = address_to_insn_form (op, mode, NON_PREFIXED_DEFAULT);
+  return iform == INSN_FORM_PCREL_LOCAL;
+})
+
+;; Return true if the operand is a PC-relative external symbol whose address
+;; can be loaded into a register.
+(define_predicate "pcrel_external_address"
+  (match_code "symbol_ref,const")
+{
+  enum insn_form iform = address_to_insn_form (op, mode, NON_PREFIXED_DEFAULT);
+  return iform == INSN_FORM_PCREL_EXTERNAL;
+})
+
+;; Return true if the address is PC-relative and the symbol is either local or
+;; external.
+(define_predicate "pcrel_local_or_external_address"
+  (ior (match_operand 0 "pcrel_local_address")
+       (match_operand 0 "pcrel_external_address")))
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 276276)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -154,7 +154,41 @@ extern align_flags rs6000_loop_align (rt
 extern void rs6000_split_logical (rtx [], enum rtx_code, bool, bool, bool);
 extern bool rs6000_pcrel_p (struct function *);
 extern bool rs6000_fndecl_pcrel_p (const_tree);
-extern bool rs6000_prefixed_address_mode_p (rtx, machine_mode);
+
+/* Different PowerPC instruction formats that are used by GCC.  There are
+   various other instruction formats used by the PowerPC hardware, but these
+   formats are not currently used by GCC.  */
+
+enum insn_form {
+  INSN_FORM_BAD,		/* Bad instruction format.  */
+  INSN_FORM_BASE_REG,		/* Base register only.  */
+  INSN_FORM_D,			/* Reg + 16-bit numeric offset.  */
+  INSN_FORM_DS,			/* Reg + offset, bottom 2 bits must be 0.  */
+  INSN_FORM_DQ,			/* Reg + offset, bottom 4 bits must be 0.  */
+  INSN_FORM_X,			/* Base register + index register.  */
+  INSN_FORM_UPDATE,		/* Address updates base register.  */
+  INSN_FORM_LO_SUM,		/* Reg + offset using symbol.  */
+  INSN_FORM_PREFIXED_NUMERIC,	/* Reg + 34 bit numeric offset.  */
+  INSN_FORM_PCREL_LOCAL,	/* PC-relative local symbol.  */
+  INSN_FORM_PCREL_EXTERNAL	/* PC-relative external symbol.  */
+};
+
+/* Instruction format for the non-prefixed version of a load or store.  This is
+   used to determine if a 16-bit offset is valid to be used with a non-prefixed
+   (traditional) instruction or if the bottom bits of the offset cannot be used
+   with a DS or DQ instruction format, and GCC has to use a prefixed
+   instruction for the load or store.  */
+
+enum non_prefixed_form {
+  NON_PREFIXED_DEFAULT,		/* Use the default.  */
+  NON_PREFIXED_D,		/* All 16-bits are valid.  */
+  NON_PREFIXED_DS,		/* Bottom 2 bits must be 0.  */
+  NON_PREFIXED_DQ,		/* Bottom 4 bits must be 0.  */
+  NON_PREFIXED_X		/* No offset memory form exists.  */
+};
+
+extern enum insn_form address_to_insn_form (rtx, machine_mode,
+					    enum non_prefixed_form);
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276276)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -13078,8 +13078,8 @@ print_operand_address (FILE *file, rtx x
   if (REG_P (x))
     fprintf (file, "0(%s)", reg_names[ REGNO (x) ]);
 
-  /* Is it a pc-relative address?  */
-  else if (pcrel_address (x, Pmode))
+  /* Is it a PC-relative address?  */
+  else if (TARGET_PCREL && pcrel_local_or_external_address (x, VOIDmode))
     {
       HOST_WIDE_INT offset;
 
@@ -13099,7 +13099,10 @@ print_operand_address (FILE *file, rtx x
       if (offset)
 	fprintf (file, "%+" PRId64, offset);
 
-      fputs ("@pcrel", file);
+      if (SYMBOL_REF_P (x) && !SYMBOL_REF_LOCAL_P (x))
+	fprintf (file, "@got");
+
+      fprintf (file, "@pcrel");
     }
   else if (SYMBOL_REF_P (x) || GET_CODE (x) == CONST
 	   || GET_CODE (x) == LABEL_REF)
@@ -13584,71 +13587,6 @@ rs6000_pltseq_template (rtx *operands, i
   return str;
 }
 #endif
-
-/* Helper function to return whether a MODE can do prefixed loads/stores.
-   VOIDmode is used when we are loading the pc-relative address into a base
-   register, but we are not using it as part of a memory operation.  As modes
-   add support for prefixed memory, they will be added here.  */
-
-static bool
-mode_supports_prefixed_address_p (machine_mode mode)
-{
-  return mode == VOIDmode;
-}
-
-/* Function to return true if ADDR is a valid prefixed memory address that uses
-   mode MODE.  */
-
-bool
-rs6000_prefixed_address_mode_p (rtx addr, machine_mode mode)
-{
-  if (!TARGET_PREFIXED_ADDR || !mode_supports_prefixed_address_p (mode))
-    return false;
-
-  /* Check for PC-relative addresses.  */
-  if (pcrel_address (addr, Pmode))
-    return true;
-
-  /* Check for prefixed memory addresses that have a large numeric offset,
-     or an offset that can't be used for a DS/DQ-form memory operation.  */
-  if (GET_CODE (addr) == PLUS)
-    {
-      rtx op0 = XEXP (addr, 0);
-      rtx op1 = XEXP (addr, 1);
-
-      if (!base_reg_operand (op0, Pmode) || !CONST_INT_P (op1))
-	return false;
-
-      HOST_WIDE_INT value = INTVAL (op1);
-      if (!SIGNED_34BIT_OFFSET_P (value))
-	return false;
-
-      /* Offset larger than 16-bits?  */
-      if (!SIGNED_16BIT_OFFSET_P (value))
-	return true;
-
-      /* DQ instruction (bottom 4 bits must be 0) for vectors.  */
-      HOST_WIDE_INT mask;
-      if (GET_MODE_SIZE (mode) >= 16)
-	mask = 15;
-
-      /* DS instruction (bottom 2 bits must be 0).  For 32-bit integers, we
-	 need to use DS instructions if we are sign-extending the value with
-	 LWA.  For 32-bit floating point, we need DS instructions to load and
-	 store values to the traditional Altivec registers.  */
-      else if (GET_MODE_SIZE (mode) >= 4)
-	mask = 3;
-
-      /* QImode/HImode has no restrictions.  */
-      else
-	return true;
-
-      /* Return true if we must use a prefixed instruction.  */
-      return (value & mask) != 0;
-    }
-
-  return false;
-}
 \f
 #if defined (HAVE_GAS_HIDDEN) && !TARGET_MACHO
 /* Emit an assembler directive to set symbol visibility for DECL to
@@ -24613,6 +24551,170 @@ rs6000_pcrel_p (struct function *fn)
   return rs6000_fndecl_pcrel_p (fn->decl);
 }
 
+\f
+/* Given an address (ADDR), a mode (MODE), and what the format of the
+   non-prefixed address (NON_PREFIXED_FORMAT) is, return the instruction format
+   for the address.  */
+
+enum insn_form
+address_to_insn_form (rtx addr,
+		      machine_mode mode,
+		      enum non_prefixed_form non_prefixed_format)
+{
+  /* Single register is easy.  */
+  if (REG_P (addr) || SUBREG_P (addr))
+    return INSN_FORM_BASE_REG;
+
+  /* If the non prefixed instruction format doesn't support offset addressing,
+     make sure only indexed addressing is allowed.
+
+     We special case SDmode so that the register allocator does not try to move
+     SDmode through GPR registers, but instead uses the 32-bit integer load and
+     store instructions for the floating point registers.  */
+  if (non_prefixed_format == NON_PREFIXED_X || (mode == SDmode && TARGET_DFP))
+    {
+      if (GET_CODE (addr) != PLUS)
+	return INSN_FORM_BAD;
+
+      rtx op0 = XEXP (addr, 0);
+      rtx op1 = XEXP (addr, 1);
+      if (!REG_P (op0) && !SUBREG_P (op0))
+	return INSN_FORM_BAD;
+
+      if (!REG_P (op1) && !SUBREG_P (op1))
+	return INSN_FORM_BAD;
+
+      return INSN_FORM_X;
+    }
+
+  /* Deal with update forms.  */
+  if (GET_RTX_CLASS (GET_CODE (addr)) == RTX_AUTOINC)
+    return INSN_FORM_UPDATE;
+
+  /* Handle PC-relative symbols and labels.  Check for both local and external
+     symbols.  Assume labels are always local.  */
+  if (TARGET_PCREL)
+    {
+      if (SYMBOL_REF_P (addr) && !SYMBOL_REF_LOCAL_P (addr))
+	return INSN_FORM_PCREL_EXTERNAL;
+
+      if (SYMBOL_REF_P (addr) || LABEL_REF_P (addr))
+	return INSN_FORM_PCREL_LOCAL;
+    }
+
+  if (GET_CODE (addr) == CONST)
+    addr = XEXP (addr, 0);
+
+  /* Recognize LO_SUM addresses used with TOC and 32-bit addressing.  */
+  if (GET_CODE (addr) == LO_SUM)
+    return INSN_FORM_LO_SUM;
+
+  /* Everything below must be an offset address of some form.  */
+  if (GET_CODE (addr) != PLUS)
+    return INSN_FORM_BAD;
+
+  rtx op0 = XEXP (addr, 0);
+  rtx op1 = XEXP (addr, 1);
+
+  /* Check for indexed addresses.  */
+  if (REG_P (op1) || SUBREG_P (op1))
+    {
+      if (REG_P (op0) || SUBREG_P (op0))
+	return INSN_FORM_X;
+
+      return INSN_FORM_BAD;
+    }
+
+  if (!CONST_INT_P (op1))
+    return INSN_FORM_BAD;
+
+  HOST_WIDE_INT offset = INTVAL (op1);
+  if (!SIGNED_34BIT_OFFSET_P (offset))
+    return INSN_FORM_BAD;
+
+  /* Check for local and external PC-relative addresses.  Labels are always
+     local.  */
+  if (TARGET_PCREL)
+    {
+      if (SYMBOL_REF_P (op0) && !SYMBOL_REF_LOCAL_P (op0))
+	return INSN_FORM_PCREL_EXTERNAL;
+
+      if (SYMBOL_REF_P (op0) || LABEL_REF_P (op0))
+	return INSN_FORM_PCREL_LOCAL;
+    }
+
+  /* If it isn't PC-relative, the address must use a base register.  */
+  if (!REG_P (op0) && !SUBREG_P (op0))
+    return INSN_FORM_BAD;
+
+  /* Large offsets must be prefixed.  */
+  if (!SIGNED_16BIT_OFFSET_P (offset))
+    {
+      if (TARGET_PREFIXED_ADDR)
+	return INSN_FORM_PREFIXED_NUMERIC;
+
+      return INSN_FORM_BAD;
+    }
+
+  /* We have a 16-bit offset, see what default instruction format to use.  */
+  if (non_prefixed_format == NON_PREFIXED_DEFAULT)
+    {
+      unsigned size = GET_MODE_SIZE (mode);
+
+      /* On 64-bit systems, assume 64-bit integers need to use DS form
+	 addresses (for LD/STD).  VSX vectors need to use DQ form addresses
+	 (for LXV and STXV).  TImode is problematical in that its normal usage
+	 is expected to be GPRs where it wants a DS instruction format, but if
+	 it goes into the vector registers, it wants a DQ instruction
+	 format.  */
+      if (TARGET_POWERPC64 && size >= 8 && GET_MODE_CLASS (mode) == MODE_INT)
+	non_prefixed_format = NON_PREFIXED_DS;
+
+      else if (TARGET_VSX && size >= 16
+	       && (VECTOR_MODE_P (mode) || FLOAT128_VECTOR_P (mode)))
+	non_prefixed_format = NON_PREFIXED_DQ;
+
+      else
+	non_prefixed_format = NON_PREFIXED_D;
+    }
+
+  /* Classify the D/DS/DQ-form addresses.  */
+  switch (non_prefixed_format)
+    {
+      /* Instruction format D, all 16 bits are valid.  */
+    case NON_PREFIXED_D:
+      return INSN_FORM_D;
+
+      /* Instruction format DS, bottom 2 bits must be 0.  */
+    case NON_PREFIXED_DS:
+      if ((offset & 3) == 0)
+	return INSN_FORM_DS;
+
+      else if (TARGET_PREFIXED_ADDR)
+	return INSN_FORM_PREFIXED_NUMERIC;
+
+      else
+	return INSN_FORM_BAD;
+
+      /* Instruction format DQ, bottom 4 bits must be 0.  */
+    case NON_PREFIXED_DQ:
+      if ((offset & 15) == 0)
+	return INSN_FORM_DQ;
+
+      else if (TARGET_PREFIXED_ADDR)
+	return INSN_FORM_PREFIXED_NUMERIC;
+
+      else
+	return INSN_FORM_BAD;
+
+    default:
+      break;
+    }
+
+  return INSN_FORM_BAD;
+}
+
+\f
 #ifdef HAVE_GAS_HIDDEN
 # define USE_HIDDEN_LINKONCE 1
 #else

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* Re: [PATCH, committed], V4.2, patch #2: Add prefixed insn attribute (revised #2)
  2019-09-27 23:27   ` Segher Boessenkool
@ 2019-09-30 13:53     ` Michael Meissner
  0 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-09-30 13:53 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

This is the patch that I committed (along with revised patch #1 and #3).  In
addition to the changes suggested, I needed to update the uses of the
enumeration non_prefixed to its new name, non_prefixed_form.
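
For readers who have not seen the earlier versions, the way the 'p' gets in
front of the opcode can be sketched outside of GCC roughly as follows: one
hook records whether the insn about to be printed is prefixed, and a second
hook consults that flag just before the mnemonic is written.  This is only an
illustration under simplifying assumptions.  The sketch_* names are made up,
and the real hooks take an rtx_insn and a FILE * rather than a bool and a
buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for next_insn_prefixed_p: state shared by the two hooks.  */
static bool sketch_next_prefixed;

/* Stand-in for the FINAL_PRESCAN_INSN hook: remember whether the insn that is
   about to be output is prefixed.  */
void
sketch_prescan (bool insn_is_prefixed)
{
  sketch_next_prefixed = insn_is_prefixed;
}

/* Stand-in for the ASM_OUTPUT_OPCODE hook plus the opcode output itself:
   write MNEMONIC into BUF, with a leading 'p' if the flag is set.  Returns
   the value of snprintf, i.e. the length of the full mnemonic.  */
int
sketch_output_opcode (char *buf, size_t size, const char *mnemonic)
{
  assert (buf != NULL && mnemonic != NULL);
  return snprintf (buf, size, "%s%s",
		   sketch_next_prefixed ? "p" : "", mnemonic);
}
```

With the flag set, the "ld" and "la" templates used by the new pcrel insns
come out as "pld" and "pla".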

2019-09-30  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000-protos.h (prefixed_load_p): New
	declaration.
	(prefixed_store_p): New declaration.
	(prefixed_paddi_p): New declaration.
	(rs6000_asm_output_opcode): New declaration.
	(rs6000_final_prescan_insn): Move declaration and update calling
	signature.
	(address_is_prefixed): New helper inline function.
	* config/rs6000/rs6000.c (rs6000_emit_move): Support loading
	PC-relative addresses.
	(reg_to_non_prefixed): New function to identify what the
	non-prefixed memory instruction format is for a register.
	(prefixed_load_p): New function to identify prefixed loads.
	(prefixed_store_p): New function to identify prefixed stores.
	(prefixed_paddi_p): New function to identify prefixed load
	immediates.
	(next_insn_prefixed_p): New static state variable.
	(rs6000_final_prescan_insn): New function to determine if an insn
	uses a prefixed instruction.
	(rs6000_asm_output_opcode): New function to emit 'p' in front of a
	prefixed instruction.
	* config/rs6000/rs6000.h (FINAL_PRESCAN_INSN): New target hook.
	(ASM_OUTPUT_OPCODE): New target hook.
	* config/rs6000/rs6000.md (prefixed): New insn attribute for
	prefixed instructions.
	(prefixed_length): New insn attribute for the size of prefixed
	instructions.
	(non_prefixed_length): New insn attribute for the size of
	non-prefixed instructions.
	(pcrel_local_addr): New insn to load up a local PC-relative
	address.
	(pcrel_extern_addr): New insn to load up an external PC-relative
	address.
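
The immediate test in prefixed_paddi_p can be pictured with plain integers.
The sketch below is not the GCC code.  It assumes the documented meanings of
the constraints (I: signed 16-bit constant, L: signed 16-bit constant shifted
left 16 bits, eI: signed 34-bit constant) and models them with the made-up
helpers sketch_I, sketch_L and sketch_eI.  A constant needs the prefixed form
when it fits in 34 bits but neither a single ADDI nor a single ADDIS can
produce it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if VALUE fits in a signed field that is BITS wide.  */
static bool
fits_signed (int64_t value, int bits)
{
  assert (bits > 0 && bits < 64);
  int64_t limit = (int64_t) 1 << (bits - 1);
  return value >= -limit && value < limit;
}

/* Model of constraint "I": signed 16-bit constant (ADDI).  */
static bool
sketch_I (int64_t v)
{
  return fits_signed (v, 16);
}

/* Model of constraint "L": signed 16-bit constant shifted left 16 bits
   (ADDIS).  Division is used instead of a shift so that negative values are
   handled portably.  */
static bool
sketch_L (int64_t v)
{
  return v % 65536 == 0 && fits_signed (v / 65536, 16);
}

/* Model of constraint "eI": signed 34-bit constant.  */
static bool
sketch_eI (int64_t v)
{
  return fits_signed (v, 34);
}

/* Return true if loading or adding V needs the prefixed instruction.  */
bool
needs_paddi (int64_t v)
{
  return sketch_eI (v) && !sketch_I (v) && !sketch_L (v);
}
```

So 100 and 0x10000 stay with the traditional instructions, while 0x12345
needs the prefixed form.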

Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 276277)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -189,6 +189,31 @@ enum non_prefixed_form {
 
 extern enum insn_form address_to_insn_form (rtx, machine_mode,
 					    enum non_prefixed_form);
+extern bool prefixed_load_p (rtx_insn *);
+extern bool prefixed_store_p (rtx_insn *);
+extern bool prefixed_paddi_p (rtx_insn *);
+extern void rs6000_asm_output_opcode (FILE *);
+extern void rs6000_final_prescan_insn (rtx_insn *, rtx [], int);
+
+/* Return true if the address must be used with a prefixed load, store, or add
+   immediate instruction because no non-prefixed instruction can use it.
+   For example, using a numeric offset that is not valid for the non-prefixed
+   instruction or a PC-relative reference to a local symbol would return true,
+   but an address with an offset of 64 would not return true.
+
+   References to external PC-relative symbols aren't allowed, because GCC has
+   to load the address into a register and then issue a separate load or
+   store.  */
+
+static inline bool
+address_is_prefixed (rtx addr,
+		     machine_mode mode,
+		     enum non_prefixed_form non_prefixed)
+{
+  enum insn_form iform = address_to_insn_form (addr, mode, non_prefixed);
+  return (iform == INSN_FORM_PREFIXED_NUMERIC
+	  || iform == INSN_FORM_PCREL_LOCAL);
+}
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
@@ -268,8 +293,6 @@ extern void rs6000_d_target_versions (vo
 const char * rs6000_xcoff_strip_dollar (const char *);
 #endif
 
-void rs6000_final_prescan_insn (rtx_insn *, rtx *operand, int num_operands);
-
 extern unsigned char rs6000_class_max_nregs[][LIM_REG_CLASSES];
 extern unsigned char rs6000_hard_regno_nregs[][FIRST_PSEUDO_REGISTER];
 
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276277)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -9639,6 +9639,14 @@ rs6000_emit_move (rtx dest, rtx source,
 	  return;
 	}
 
+      /* Use the default pattern for loading up PC-relative addresses.  */
+      if (TARGET_PCREL && mode == Pmode
+	  && pcrel_local_or_external_address (operands[1], Pmode))
+	{
+	  emit_insn (gen_rtx_SET (operands[0], operands[1]));
+	  return;
+	}
+
       if (DEFAULT_ABI == ABI_V4
 	  && mode == Pmode && mode == SImode
 	  && flag_pic == 1 && got_operand (operands[1], mode))
@@ -24714,6 +24722,221 @@ address_to_insn_form (rtx addr,
   return INSN_FORM_BAD;
 }
 
+/* Helper function to take a REG and a MODE and turn it into the non-prefixed
+   instruction format (D/DS/DQ) used for offset memory.  */
+
+static enum non_prefixed_form
+reg_to_non_prefixed (rtx reg, machine_mode mode)
+{
+  /* If it isn't a register, use the defaults.  */
+  if (!REG_P (reg) && !SUBREG_P (reg))
+    return NON_PREFIXED_DEFAULT;
+
+  unsigned int r = reg_or_subregno (reg);
+
+  /* If we have a pseudo, use the default instruction format.  */
+  if (!HARD_REGISTER_NUM_P (r))
+    return NON_PREFIXED_DEFAULT;
+
+  unsigned size = GET_MODE_SIZE (mode);
+
+  /* FPR registers use D-mode for scalars, and DQ-mode for vectors, IEEE
+     128-bit floating point, and 128-bit integers.  */
+  if (FP_REGNO_P (r))
+    {
+      if (mode == SFmode || size == 8 || FLOAT128_2REG_P (mode))
+	return NON_PREFIXED_D;
+
+      else if (size < 8)
+	return NON_PREFIXED_X;
+
+      else if (TARGET_VSX && size >= 16
+	       && (VECTOR_MODE_P (mode)
+		   || FLOAT128_VECTOR_P (mode)
+		   || mode == TImode || mode == CTImode))
+	return NON_PREFIXED_DQ;
+
+      else
+	return NON_PREFIXED_DEFAULT;
+    }
+
+  /* Altivec registers use DS-mode for scalars, and DQ-mode for vectors, IEEE
+     128-bit floating point, and 128-bit integers.  */
+  else if (ALTIVEC_REGNO_P (r))
+    {
+      if (mode == SFmode || size == 8 || FLOAT128_2REG_P (mode))
+	return NON_PREFIXED_DS;
+
+      else if (size < 8)
+	return NON_PREFIXED_X;
+
+      else if (TARGET_VSX && size >= 16
+	       && (VECTOR_MODE_P (mode)
+		   || FLOAT128_VECTOR_P (mode)
+		   || mode == TImode || mode == CTImode))
+	return NON_PREFIXED_DQ;
+
+      else
+	return NON_PREFIXED_DEFAULT;
+    }
+
+  /* GPR registers use DS-mode for 64-bit items on 64-bit systems, and D-mode
+     otherwise.  Assume that any other register, such as LR, CRs, etc. will go
+     through the GPR registers for memory operations.  */
+  else if (TARGET_POWERPC64 && size >= 8)
+    return NON_PREFIXED_DS;
+
+  return NON_PREFIXED_D;
+}
+
+\f
+/* Whether a load instruction is a prefixed instruction.  This is called from
+   the prefixed attribute processing.  */
+
+bool
+prefixed_load_p (rtx_insn *insn)
+{
+  /* Validate the insn to make sure it is a normal load insn.  */
+  extract_insn_cached (insn);
+  if (recog_data.n_operands < 2)
+    return false;
+
+  rtx reg = recog_data.operand[0];
+  rtx mem = recog_data.operand[1];
+
+  if (!REG_P (reg) && !SUBREG_P (reg))
+    return false;
+
+  if (!MEM_P (mem))
+    return false;
+
+  /* Prefixed load instructions do not support update or indexed forms.  */
+  if (get_attr_indexed (insn) == INDEXED_YES
+      || get_attr_update (insn) == UPDATE_YES)
+    return false;
+
+  /* LWA uses the DS format instead of the D format that LWZ uses.  */
+  enum non_prefixed_form non_prefixed;
+  machine_mode reg_mode = GET_MODE (reg);
+  machine_mode mem_mode = GET_MODE (mem);
+
+  if (mem_mode == SImode && reg_mode == DImode
+      && get_attr_sign_extend (insn) == SIGN_EXTEND_YES)
+    non_prefixed = NON_PREFIXED_DS;
+
+  else
+    non_prefixed = reg_to_non_prefixed (reg, mem_mode);
+
+  return address_is_prefixed (XEXP (mem, 0), mem_mode, non_prefixed);
+}
+
+/* Whether a store instruction is a prefixed instruction.  This is called from
+   the prefixed attribute processing.  */
+
+bool
+prefixed_store_p (rtx_insn *insn)
+{
+  /* Validate the insn to make sure it is a normal store insn.  */
+  extract_insn_cached (insn);
+  if (recog_data.n_operands < 2)
+    return false;
+
+  rtx mem = recog_data.operand[0];
+  rtx reg = recog_data.operand[1];
+
+  if (!REG_P (reg) && !SUBREG_P (reg))
+    return false;
+
+  if (!MEM_P (mem))
+    return false;
+
+  /* Prefixed store instructions do not support update or indexed forms.  */
+  if (get_attr_indexed (insn) == INDEXED_YES
+      || get_attr_update (insn) == UPDATE_YES)
+    return false;
+
+  machine_mode mem_mode = GET_MODE (mem);
+  enum non_prefixed_form non_prefixed = reg_to_non_prefixed (reg, mem_mode);
+  return address_is_prefixed (XEXP (mem, 0), mem_mode, non_prefixed);
+}
+
+/* Whether a load immediate or add instruction is a prefixed instruction.  This
+   is called from the prefixed attribute processing.  */
+
+bool
+prefixed_paddi_p (rtx_insn *insn)
+{
+  rtx set = single_set (insn);
+  if (!set)
+    return false;
+
+  rtx dest = SET_DEST (set);
+  rtx src = SET_SRC (set);
+
+  if (!REG_P (dest) && !SUBREG_P (dest))
+    return false;
+
+  /* Is this a load immediate that can't be done with a simple ADDI or
+     ADDIS?  */
+  if (CONST_INT_P (src))
+    return (satisfies_constraint_eI (src)
+	    && !satisfies_constraint_I (src)
+	    && !satisfies_constraint_L (src));
+
+  /* Is this a PADDI instruction that can't be done with a simple ADDI or
+     ADDIS?  */
+  if (GET_CODE (src) == PLUS)
+    {
+      rtx op1 = XEXP (src, 1);
+
+      return (CONST_INT_P (op1)
+	      && satisfies_constraint_eI (op1)
+	      && !satisfies_constraint_I (op1)
+	      && !satisfies_constraint_L (op1));
+    }
+
+  /* If not, is it a load of a PC-relative address?  */
+  if (!TARGET_PCREL || GET_MODE (dest) != Pmode)
+    return false;
+
+  if (!SYMBOL_REF_P (src) && !LABEL_REF_P (src) && GET_CODE (src) != CONST)
+    return false;
+
+  enum insn_form iform = address_to_insn_form (src, Pmode,
+					       NON_PREFIXED_DEFAULT);
+
+  return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
+}
+
+/* Whether the next instruction needs a 'p' prefix issued before the
+   instruction is printed out.  */
+static bool next_insn_prefixed_p;
+
+/* Define FINAL_PRESCAN_INSN if some processing needs to be done before
+   outputting the assembler code.  On the PowerPC, we remember if the current
+   insn is a prefixed insn where we need to emit a 'p' before the insn.
+
+   In addition, if the insn is part of a PC-relative reference to an external
+   label optimization, this is recorded also.  */
+void
+rs6000_final_prescan_insn (rtx_insn *insn, rtx [], int)
+{
+  next_insn_prefixed_p = (get_attr_prefixed (insn) != PREFIXED_NO);
+  return;
+}
+
+/* Define ASM_OUTPUT_OPCODE to do anything special before emitting an opcode.
+   We use it to emit a 'p' for prefixed insns, using the flag that is set in
+   FINAL_PRESCAN_INSN.  */
+void
+rs6000_asm_output_opcode (FILE *stream)
+{
+  if (next_insn_prefixed_p)
+    fprintf (stream, "p");
+
+  return;
+}
+
 \f
 #ifdef HAVE_GAS_HIDDEN
 # define USE_HIDDEN_LINKONCE 1
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 276276)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -2547,3 +2547,24 @@ typedef struct GTY(()) machine_function
   IN_RANGE ((VALUE),							\
 	    -(HOST_WIDE_INT_1 << 33),					\
 	    (HOST_WIDE_INT_1 << 33) - 1 - (EXTRA))
+
+/* Define this if some processing needs to be done before outputting the
+   assembler code.  On the PowerPC, we remember if the current insn is a normal
+   prefixed insn where we need to emit a 'p' before the insn.  */
+#define FINAL_PRESCAN_INSN(INSN, OPERANDS, NOPERANDS)			\
+do									\
+  {									\
+    if (TARGET_PREFIXED_ADDR)						\
+      rs6000_final_prescan_insn (INSN, OPERANDS, NOPERANDS);		\
+  }									\
+while (0)
+
+/* Do anything special before emitting an opcode.  We use it to emit a 'p' for
+   prefixed insns, using the flag that is set in FINAL_PRESCAN_INSN.  */
+#define ASM_OUTPUT_OPCODE(STREAM, OPCODE)				\
+  do									\
+    {									\
+     if (TARGET_PREFIXED_ADDR)						\
+       rs6000_asm_output_opcode (STREAM);				\
+    }									\
+  while (0)
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 276276)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -256,8 +256,49 @@ (define_attr "var_shift" "no,yes"
 ;; Is copying of this instruction disallowed?
 (define_attr "cannot_copy" "no,yes" (const_string "no"))
 
-;; Length of the instruction (in bytes).
-(define_attr "length" "" (const_int 4))
+
+;; Whether an insn is a prefixed insn, and an initial 'p' should be printed
+;; before the instruction.  A prefixed instruction has a prefix instruction
+;; word that extends the immediate value of the instructions from 12-16 bits to
+;; 34 bits.  The macro ASM_OUTPUT_OPCODE emits a leading 'p' for prefixed
+;; insns.  The "length" attribute of a prefixed insn also defaults to
+;; 12 bytes.
+(define_attr "prefixed" "no,yes"
+  (cond [(ior (match_test "!TARGET_PREFIXED_ADDR")
+	      (match_test "!NONJUMP_INSN_P (insn)"))
+	 (const_string "no")
+
+	 (eq_attr "type" "load,fpload,vecload")
+	 (if_then_else (match_test "prefixed_load_p (insn)")
+		       (const_string "yes")
+		       (const_string "no"))
+
+	 (eq_attr "type" "store,fpstore,vecstore")
+	 (if_then_else (match_test "prefixed_store_p (insn)")
+		       (const_string "yes")
+		       (const_string "no"))
+
+	 (eq_attr "type" "integer,add")
+	 (if_then_else (match_test "prefixed_paddi_p (insn)")
+		       (const_string "yes")
+		       (const_string "no"))]
+
+	(const_string "no")))
+
+;; Length in bytes of instructions that use prefixed addressing and length in
+;; bytes of instructions that do not use prefixed addressing.  This allows
+;; both lengths to be defined as constants, and the length attribute can pick
+;; the size as appropriate.
+(define_attr "prefixed_length" "" (const_int 12))
+(define_attr "non_prefixed_length" "" (const_int 4))
+
+;; Length of the instruction (in bytes).  Prefixed insns are 8 bytes, but the
+;; assembler might need to issue a NOP so that the prefixed instruction
+;; does not cross a cache boundary, which makes them possibly 12 bytes.
+(define_attr "length" ""
+  (if_then_else (eq_attr "prefixed" "yes")
+		(attr "prefixed_length")
+		(attr "non_prefixed_length")))
 
 ;; Processor type -- this attribute must exactly match the processor_type
 ;; enumeration in rs6000-opts.h.
@@ -9874,6 +9915,28 @@ (define_expand "restore_stack_nonlocal"
   operands[6] = gen_rtx_PARALLEL (VOIDmode, p);
 })
 \f
+;; Load up a PC-relative address.  Print_operand_address will append a @pcrel
+;; to the symbol or label.
+(define_insn "*pcrel_local_addr"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+	(match_operand:DI 1 "pcrel_local_address"))]
+  "TARGET_PCREL"
+  "la %0,%a1"
+  [(set_attr "prefixed" "yes")])
+
+;; Load up a PC-relative address to an external symbol.  If the symbol and the
+;; program are both defined in the main program, the linker will optimize this
+;; to a PADDI.  Otherwise, it will create a GOT address that is relocated by
+;; the dynamic linker and loaded up.  Print_operand_address will append a
+;; @got@pcrel to the symbol.
+(define_insn "*pcrel_extern_addr"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+	(match_operand:DI 1 "pcrel_external_address"))]
+  "TARGET_PCREL"
+  "ld %0,%a1"
+  [(set_attr "prefixed" "yes")
+   (set_attr "type" "load")])
+
 ;; TOC register handling.
 
 ;; Code to initialize the TOC register...

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], V4, patch #4.1: Enable prefixed/pc-rel addressing (revised)
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (9 preceding siblings ...)
  2019-09-24  6:10 ` [PATCH], V4.1, patch #2: Add prefixed insn attribute (revised) Michael Meissner
@ 2019-09-30 14:13 ` Michael Meissner
  2019-10-01 23:56   ` Segher Boessenkool
  2019-10-04 12:29 ` [PATCH], V4, patch #9 [part of patch #4.2], Add prefixed address offset checks Michael Meissner
                   ` (10 subsequent siblings)
  21 siblings, 1 reply; 37+ messages in thread
From: Michael Meissner @ 2019-09-30 14:13 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

As we discussed privately on Friday, I had a few nits to patch #4 that came up
after I submitted the patch.  The changes between that patch and this patch are:

1) I needed to change the enum non_prefixed_form to its new spelling in the
revised patch #1 that was committed.

2) In building glibc for 'future', we discovered the stack protector insns did
not use a constraint that prevented the register allocator from creating a
prefixed address.  Even though the predicate did not allow prefixed
instructions, the compiler still generated the address due to the constraint.
The previous patch used 'ZY', but 'Y' now allows prefixed addresses.  I needed
to add a second memory constraint ('eM') that prevents using a prefixed
address.

3) The function rs6000_adjust_vec_address did not have the optimizations I had
in my private branch that allowed folding extracting a constant element from a
vector in memory into a single memory instruction.  I have put those
optimizations into this patch.

4) In the previous patch, I missed setting the prefixed size and non-prefixed
size for mov<mode>_ppc64 in the insn.  This pattern is used for moving PTImode
in GPR registers (on non-VSX systems, it would move TImode also).  By the time
it gets to final, it will have been split, but it is still useful to get the
sizes correct before the mode is split.
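
To make the size bookkeeping in item 4 concrete, here is a minimal standalone
sketch (not the code in the patch) of how the two length attributes and the
instruction count relate.  The names pick_length and count_insns are
hypothetical; the arithmetic follows rs6000_num_insns in this patch, which
assumes 12 bytes for a single prefixed instruction (8 bytes plus a possible
alignment NOP).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the "length" attribute: choose between the
   per-pattern prefixed and non-prefixed byte lengths.  */
static int
pick_length (bool prefixed, int prefixed_length, int non_prefixed_length)
{
  return prefixed ? prefixed_length : non_prefixed_length;
}

/* Hypothetical stand-in for rs6000_num_insns: convert a byte length into a
   count of real instructions, so that one prefixed instruction (length 12)
   is counted once rather than three times.  */
static int
count_insns (int length, bool prefixed)
{
  /* Lengths in this scheme are always a multiple of 4 bytes.  */
  assert (length >= 4 && length % 4 == 0);

  if (prefixed && length >= 12)
    {
      /* Single prefixed instruction.  */
      if (length == 12)
	return 1;

      /* A normal plus a prefixed instruction (16), or two back to back
	 prefixed instructions (20).  */
      if (length == 16 || length == 20)
	return 2;

      /* Guess for larger sizes, as the patch does.  */
      return 2 + (length - 20) / 4;
    }

  return length / 4;
}
```

With the mov<mode>_ppc64 values in this patch, a prefixed load or store
alternative has length 20 and is counted as 2 instructions, the same count as
the non-prefixed length of 8.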

Note, patches #5-8 do not need any modification (other than adjusting the line
numbers, which the patch program does).  As I stated in another mail, patches
#1-3 have been checked in.
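
As background for the prefixed versus non-prefixed distinction in items 2 and
3 above, the following is a rough sketch of the offset test only.  It assumes
the rules stated in the comments of this patch: non-prefixed instructions take
a signed 16-bit offset, DS-form needs the bottom two bits to be zero, DQ-form
needs the bottom four bits to be zero, and prefixed instructions take a signed
34-bit offset with no such restriction.  The enum and function names are
hypothetical, and the real address_to_insn_form also handles PC-relative
symbols, which this sketch ignores.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the non-prefixed offset formats.  */
enum offset_form { FORM_D, FORM_DS, FORM_DQ };

static bool
fits_signed_16 (int64_t offset)
{
  return offset >= -32768 && offset <= 32767;
}

static bool
fits_signed_34 (int64_t offset)
{
  return offset >= -(INT64_C (1) << 33) && offset <= (INT64_C (1) << 33) - 1;
}

/* Can the traditional (non-prefixed) instruction encode this offset?  */
static bool
fits_non_prefixed (int64_t offset, enum offset_form form)
{
  if (!fits_signed_16 (offset))
    return false;

  switch (form)
    {
    case FORM_DS:
      return ((uint64_t) offset & 3u) == 0;
    case FORM_DQ:
      return ((uint64_t) offset & 15u) == 0;
    default:
      return true;
    }
}

/* True when only the prefixed instruction can encode the offset.  */
static bool
offset_needs_prefix (int64_t offset, enum offset_form form)
{
  return !fits_non_prefixed (offset, form) && fits_signed_34 (offset);
}
```

Under these assumptions an offset of 64 never needs a prefix, while an offset
of 6 needs one for a DS-form instruction such as LWA, which is why lwa_operand
now accepts such offsets when prefixed addressing is enabled.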

I have done a bootstrap/make check on a little endian power8 system and there
were no regressions.  I have also built both Spec 2017 rate and Spec 2006 CPU
benchmarks for each of -mcpu=power8, -mcpu=power9, and -mcpu=future, and there
were no build failures.

Can I check this into the trunk?

2019-09-30  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/constraints.md (em constraint): New constraint for
	non PC-relative memory.
	(eM constraint): New constraint for non-prefixed memory.
	* config/rs6000/predicates.md (lwa_operand): Allow odd offsets if
	we have prefixed addressing.
	(non_prefixed_memory): New predicate.
	(non_pcrel_memory): New predicate.
	(reg_or_non_pcrel_memory): New predicate.
	* config/rs6000/rs6000-protos.h (make_memory_non_prefixed): New
	declaration.
	* config/rs6000/rs6000.c (rs6000_adjust_vec_address): Optimize
	PC-relative addresses with constant offsets.  Signal an error if
	we have a PC-relative address and a variable offset.  Use the
	SIGNED_16BIT_OFFSET_P macro.
	(rs6000_split_vec_extract_var): Signal an error if we have a
	PC-relative address and a variable offset.
	(quad_address_p): Add support for prefixed addresses.
	(mem_operand_gpr): Add support for prefixed addresses.
	(mem_operand_ds_form): Add support for prefixed addresses.
	(rs6000_legitimate_offset_address_p): Add support for prefixed
	addresses.
	(rs6000_legitimate_address_p): Add support for prefixed
	addresses.
	(rs6000_mode_dependent_address): Add support for prefixed
	addresses.
	(rs6000_num_insns): New helper function.
	(rs6000_insn_cost): Treat prefixed instructions as having the same
	cost as non-prefixed instructions, even though the prefixed
	instructions are larger.
	(make_memory_non_prefixed): New function to make a non-prefixed
	memory operand.
	* config/rs6000/rs6000.md (mov<mode>_64bit_dm): Add support for
	prefixed addresses.
	(movtd_64bit_nodm): Add support for prefixed addresses.
	(mov<mode>_ppc64): Add support for prefixed addresses.
	(stack_protect_setdi): Convert prefixed addresses to non-prefixed
	addresses.  Allow for indexed addressing as well as offsettable.
	(stack_protect_testdi): Convert prefixed addresses to non-prefixed
	addresses.  Allow for indexed addressing as well as offsettable.
	* config/rs6000/vsx.md (vsx_mov<mode>_64bit): Add support for
	prefixed addresses.
	(vsx_extract_<mode>_var, VSX_D iterator): Do not allow a vector in
	memory with a prefixed address to combine with variable offsets.
	(vsx_extract_v4sf_var): Do not allow a vector in memory with a
	prefixed address to combine with variable offsets.
	(vsx_extract_<mode>_var, VSX_EXTRACT_I iterator): Do not allow a
	vector in memory with a prefixed address to combine with variable
	offsets.
	(vsx_extract_<mode>_<VS_scalar>mode_var): Do not allow a vector in
	memory with a prefixed address to combine with variable offsets.
	* doc/md.texi (PowerPC constraints): Document the 'em' and 'eM'
	constraints.

Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(revision 276284)
+++ gcc/config/rs6000/constraints.md	(working copy)
@@ -210,6 +210,16 @@ several times, or that might not access
   (and (match_code "mem")
        (match_test "GET_RTX_CLASS (GET_CODE (XEXP (op, 0))) != RTX_AUTOINC")))
 
+(define_memory_constraint "em"
+  "A memory operand that does not contain a PC-relative reference."
+  (and (match_code "mem")
+       (match_test "non_pcrel_memory (op, mode)")))
+
+(define_memory_constraint "eM"
+  "A memory operand that does not contain a prefixed address."
+  (and (match_code "mem")
+       (match_test "non_prefixed_memory (op, mode)")))
+
 (define_memory_constraint "Q"
   "Memory operand that is an offset from a register (it is usually better
 to use @samp{m} or @samp{es} in @code{asm} statements)"
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 276284)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -932,6 +932,14 @@ (define_predicate "lwa_operand"
     return false;
 
   addr = XEXP (inner, 0);
+
+  /* The LWA instruction uses the DS-form format where the bottom two bits of
+     the offset must be 0.  The prefixed PLWA does not have this
+     restriction.  */
+  if (TARGET_PREFIXED_ADDR
+      && address_is_prefixed (addr, DImode, NON_PREFIXED_DS))
+    return true;
+
   if (GET_CODE (addr) == PRE_INC
       || GET_CODE (addr) == PRE_DEC
       || (GET_CODE (addr) == PRE_MODIFY
@@ -1807,3 +1815,43 @@ (define_predicate "pcrel_external_addres
 (define_predicate "pcrel_local_or_external_address"
   (ior (match_operand 0 "pcrel_local_address")
        (match_operand 0 "pcrel_external_address")))
+
+;; Return 1 if op is a memory operand that is not prefixed.
+(define_predicate "non_prefixed_memory"
+  (match_code "mem")
+{
+  if (!memory_operand (op, mode))
+    return false;
+
+  enum insn_form iform
+    = address_to_insn_form (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
+
+  return (iform != INSN_FORM_PREFIXED_NUMERIC
+          && iform != INSN_FORM_PCREL_LOCAL
+          && iform != INSN_FORM_BAD);
+})
+
+(define_predicate "non_pcrel_memory"
+  (match_code "mem")
+{
+  if (!memory_operand (op, mode))
+    return false;
+
+  enum insn_form iform
+    = address_to_insn_form (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
+
+  return (iform != INSN_FORM_PCREL_EXTERNAL
+          && iform != INSN_FORM_PCREL_LOCAL
+          && iform != INSN_FORM_BAD);
+})
+
+;; Return 1 if op is either a register operand or a memory operand that does
+;; not use a PC-relative address.
+(define_predicate "reg_or_non_pcrel_memory"
+  (match_code "reg,subreg,mem")
+{
+  if (REG_P (op) || SUBREG_P (op))
+    return register_operand (op, mode);
+
+  return non_pcrel_memory (op, mode);
+})
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 276284)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -192,6 +192,7 @@ extern enum insn_form address_to_insn_fo
 extern bool prefixed_load_p (rtx_insn *);
 extern bool prefixed_store_p (rtx_insn *);
 extern bool prefixed_paddi_p (rtx_insn *);
+extern rtx make_memory_non_prefixed (rtx);
 extern void rs6000_asm_output_opcode (FILE *);
 extern void rs6000_final_prescan_insn (rtx_insn *, rtx [], int);
 
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276284)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6700,6 +6700,7 @@ rs6000_adjust_vec_address (rtx scalar_re
   rtx element_offset;
   rtx new_addr;
   bool valid_addr_p;
+  bool pcrel_p = TARGET_PCREL && pcrel_local_address (addr, Pmode);
 
   /* Vector addresses should not have PRE_INC, PRE_DEC, or PRE_MODIFY.  */
   gcc_assert (GET_RTX_CLASS (GET_CODE (addr)) != RTX_AUTOINC);
@@ -6737,6 +6738,40 @@ rs6000_adjust_vec_address (rtx scalar_re
   else if (REG_P (addr) || SUBREG_P (addr))
     new_addr = gen_rtx_PLUS (Pmode, addr, element_offset);
 
+  /* Optimize PC-relative addresses.  */
+  else if (pcrel_p)
+    {
+      if (CONST_INT_P (element_offset))
+	{
+	  rtx addr2 = addr;
+	  HOST_WIDE_INT offset = INTVAL (element_offset);
+
+	  if (GET_CODE (addr2) == CONST)
+	    addr2 = XEXP (addr2, 0);
+
+	  if (GET_CODE (addr2) == PLUS)
+	    {
+	      offset += INTVAL (XEXP (addr2, 1));
+	      addr2 = XEXP (addr2, 0);
+	    }
+
+	  gcc_assert (SIGNED_34BIT_OFFSET_P (offset));
+	  if (offset)
+	    {
+	      addr2 = gen_rtx_PLUS (Pmode, addr2, GEN_INT (offset));
+	      new_addr = gen_rtx_CONST (Pmode, addr2);
+	    }
+	  else
+	    new_addr = addr2;
+	}
+
+      /* Make sure we do not have a PC-relative address with a variable offset,
+	 since we only have one temporary base register, and we would need two
+	 registers in that case.  */
+      else
+	gcc_unreachable ();
+    }
+
   /* Optimize D-FORM addresses with constant offset with a constant element, to
      include the element offset in the address directly.  */
   else if (GET_CODE (addr) == PLUS)
@@ -6751,8 +6786,11 @@ rs6000_adjust_vec_address (rtx scalar_re
 	  HOST_WIDE_INT offset = INTVAL (op1) + INTVAL (element_offset);
 	  rtx offset_rtx = GEN_INT (offset);
 
-	  if (IN_RANGE (offset, -32768, 32767)
-	      && (scalar_size < 8 || (offset & 0x3) == 0))
+	  if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (offset))
+	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
+
+	  else if (SIGNED_16BIT_OFFSET_P (offset)
+		   && (scalar_size < 8 || (offset & 0x3) == 0))
 	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
 	  else
 	    {
@@ -6800,11 +6838,11 @@ rs6000_adjust_vec_address (rtx scalar_re
       new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
     }
 
-  /* If we have a PLUS, we need to see whether the particular register class
-     allows for D-FORM or X-FORM addressing.  */
-  if (GET_CODE (new_addr) == PLUS)
+  /* If we have a PLUS or a PC-relative address without the PLUS, we need to
+     see whether the particular register class allows for D-FORM or X-FORM
+     addressing.  */
+  if (GET_CODE (new_addr) == PLUS || pcrel_p)
     {
-      rtx op1 = XEXP (new_addr, 1);
       addr_mask_type addr_mask;
       unsigned int scalar_regno = reg_or_subregno (scalar_reg);
 
@@ -6821,10 +6859,16 @@ rs6000_adjust_vec_address (rtx scalar_re
       else
 	gcc_unreachable ();
 
-      if (REG_P (op1) || SUBREG_P (op1))
-	valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;
-      else
+      if (pcrel_p)
 	valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
+      else
+	{
+	  rtx op1 = XEXP (new_addr, 1);
+	  if (REG_P (op1) || SUBREG_P (op1))
+	    valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;
+	  else
+	    valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
+	}
     }
 
   else if (REG_P (new_addr) || SUBREG_P (new_addr))
@@ -6860,6 +6904,12 @@ rs6000_split_vec_extract_var (rtx dest,
      systems.  */
   if (MEM_P (src))
     {
+      /* If this is a PC-relative address, we would need another register to
+	 hold the address of the vector along with the variable offset.  The
+	 callers should use reg_or_non_pcrel_memory to make sure we don't
+	 get a PC-relative address here.  */
+      gcc_assert (non_pcrel_memory (src, mode));
+
       int num_elements = GET_MODE_NUNITS (mode);
       rtx num_ele_m1 = GEN_INT (num_elements - 1);
 
@@ -7249,6 +7299,13 @@ quad_address_p (rtx addr, machine_mode m
   if (VECTOR_MODE_P (mode) && !mode_supports_dq_form (mode))
     return false;
 
+  /* Is this a valid prefixed address?  If the bottom four bits of the offset
+     are non-zero, we could use a prefixed instruction (which does not have the
+     DQ-form constraint that the traditional instruction had) instead of
+     forcing the unaligned offset to a GPR.  */
+  if (address_is_prefixed (addr, mode, NON_PREFIXED_DQ))
+    return true;
+
   if (GET_CODE (addr) != PLUS)
     return false;
 
@@ -7350,6 +7407,13 @@ mem_operand_gpr (rtx op, machine_mode mo
       && legitimate_indirect_address_p (XEXP (addr, 0), false))
     return true;
 
+  /* Allow prefixed instructions if supported.  If the bottom two bits of the
+     offset are non-zero, we could use a prefixed instruction (which does not
+     have the DS-form constraint that the traditional instruction had) instead
+     of forcing the unaligned offset to a GPR.  */
+  if (address_is_prefixed (addr, mode, NON_PREFIXED_DS))
+    return true;
+
   /* Don't allow non-offsettable addresses.  See PRs 83969 and 84279.  */
   if (!rs6000_offsettable_memref_p (op, mode, false))
     return false;
@@ -7371,7 +7435,7 @@ mem_operand_gpr (rtx op, machine_mode mo
        causes a wrap, so test only the low 16 bits.  */
     offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
 
-  return offset + 0x8000 < 0x10000u - extra;
+  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 
 /* As above, but for DS-FORM VSX insns.  Unlike mem_operand_gpr,
@@ -7384,6 +7448,13 @@ mem_operand_ds_form (rtx op, machine_mod
   int extra;
   rtx addr = XEXP (op, 0);
 
+  /* Allow prefixed instructions if supported.  If the bottom two bits of the
+     offset are non-zero, we could use a prefixed instruction (which does not
+     have the DS-form constraint that the traditional instruction had) instead
+     of forcing the unaligned offset to a GPR.  */
+  if (address_is_prefixed (addr, mode, NON_PREFIXED_DS))
+    return true;
+
   if (!offsettable_address_p (false, mode, addr))
     return false;
 
@@ -7404,7 +7475,7 @@ mem_operand_ds_form (rtx op, machine_mod
        causes a wrap, so test only the low 16 bits.  */
     offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
 
-  return offset + 0x8000 < 0x10000u - extra;
+  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 \f
 /* Subroutines of rs6000_legitimize_address and rs6000_legitimate_address_p.  */
@@ -7753,8 +7824,10 @@ rs6000_legitimate_offset_address_p (mach
       break;
     }
 
-  offset += 0x8000;
-  return offset < 0x10000 - extra;
+  if (TARGET_PREFIXED_ADDR)
+    return SIGNED_34BIT_OFFSET_EXTRA_P (offset, extra);
+  else
+    return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 
 bool
@@ -8651,6 +8724,11 @@ rs6000_legitimate_address_p (machine_mod
       && mode_supports_pre_incdec_p (mode)
       && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
     return 1;
+
+  /* Handle prefixed addresses (PC-relative or 34-bit offset).  */
+  if (address_is_prefixed (x, mode, NON_PREFIXED_DEFAULT))
+    return 1;
+
   /* Handle restricted vector d-form offsets in ISA 3.0.  */
   if (quad_offset_p)
     {
@@ -8709,7 +8787,11 @@ rs6000_legitimate_address_p (machine_mod
 	  || (!avoiding_indexed_address_p (mode)
 	      && legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict)))
       && rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0)))
-    return 1;
+    {
+      /* There is no prefixed version of the load/store with update.  */
+      rtx addr = XEXP (x, 1);
+      return !address_is_prefixed (addr, mode, NON_PREFIXED_DEFAULT);
+    }
   if (reg_offset_p && !quad_offset_p
       && legitimate_lo_sum_address_p (mode, x, reg_ok_strict))
     return 1;
@@ -8771,8 +8853,12 @@ rs6000_mode_dependent_address (const_rtx
 	  && XEXP (addr, 0) != arg_pointer_rtx
 	  && CONST_INT_P (XEXP (addr, 1)))
 	{
-	  unsigned HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
-	  return val + 0x8000 >= 0x10000 - (TARGET_POWERPC64 ? 8 : 12);
+	  HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
+	  HOST_WIDE_INT extra = TARGET_POWERPC64 ? 8 : 12;
+	  if (TARGET_PREFIXED_ADDR)
+	    return !SIGNED_34BIT_OFFSET_EXTRA_P (val, extra);
+	  else
+	    return !SIGNED_16BIT_OFFSET_EXTRA_P (val, extra);
 	}
       break;
 
@@ -20926,6 +21012,42 @@ rs6000_debug_rtx_costs (rtx x, machine_m
   return ret;
 }
 
+/* How many real instructions are generated for this insn?  This is slightly
+   different from the length attribute, in that the length attribute counts the
+   number of bytes.  With prefixed instructions, we don't want to count a
+   prefixed instruction (length 12 bytes including possible NOP) as taking 3
+   instructions, but just one.  */
+
+static int
+rs6000_num_insns (rtx_insn *insn)
+{
+  /* Try to figure it out based on the length and whether there are prefixed
+     instructions.  While prefixed instructions are only 8 bytes, we have to
+     use 12 as the size of the first prefixed instruction in case the
+     instruction needs to be aligned.  Back to back prefixed instructions would
+     only take 20 bytes, since it is guaranteed that one of the prefixed
+     instructions does not need the alignment.  */
+  int length = get_attr_length (insn);
+
+  if (length >= 12 && TARGET_PREFIXED_ADDR
+      && get_attr_prefixed (insn) == PREFIXED_YES)
+    {
+      /* Single prefixed instruction.  */
+      if (length == 12)
+	return 1;
+
+      /* A normal instruction and a prefixed instruction (16) or two back
+	 to back prefixed instructions (20).  */
+      if (length == 16 || length == 20)
+	return 2;
+
+      /* Guess for larger instruction sizes.  */
+      return 2 + (length - 20) / 4;
+    }
+
+  return length / 4;
+}
+
 static int
 rs6000_insn_cost (rtx_insn *insn, bool speed)
 {
@@ -20939,7 +21061,7 @@ rs6000_insn_cost (rtx_insn *insn, bool s
   if (cost > 0)
     return cost;
 
-  int n = get_attr_length (insn) / 4;
+  int n = rs6000_num_insns (insn);
   enum attr_type type = get_attr_type (insn);
 
   switch (type)
@@ -24908,6 +25030,34 @@ prefixed_paddi_p (rtx_insn *insn)
   return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
 }
 
+/* Make a memory address non-prefixed if it is prefixed.  */
+
+rtx
+make_memory_non_prefixed (rtx mem)
+{
+  gcc_assert (MEM_P (mem));
+
+  rtx old_addr = XEXP (mem, 0);
+  if (address_is_prefixed (old_addr, GET_MODE (mem), NON_PREFIXED_DEFAULT))
+    {
+      rtx new_addr;
+
+      if (GET_CODE (old_addr) == PLUS
+	  && (REG_P (XEXP (old_addr, 0)) || SUBREG_P (XEXP (old_addr, 0)))
+	  && CONST_INT_P (XEXP (old_addr, 1)))
+	{
+	  rtx tmp_reg = force_reg (Pmode, XEXP (old_addr, 1));
+	  new_addr = gen_rtx_PLUS (Pmode, XEXP (old_addr, 0), tmp_reg);
+	}
+      else
+	new_addr = force_reg (Pmode, old_addr);
+
+      mem = change_address (mem, VOIDmode, new_addr);
+    }
+
+  return mem;
+}
+
 /* Whether the next instruction needs a 'p' prefix issued before the
    instruction is printed out.  */
 static bool next_insn_prefixed_p;
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 276284)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -7774,8 +7774,9 @@ (define_insn_and_split "*mov<mode>_64bit
   "&& reload_completed"
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8")
-   (set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")])
+  [(set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")
+   (set_attr "non_prefixed_length" "8")
+   (set_attr "prefixed_length" "20")])
 
 (define_insn_and_split "*movtd_64bit_nodm"
   [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
@@ -7786,8 +7787,12 @@ (define_insn_and_split "*movtd_64bit_nod
   "#"
   "&& reload_completed"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8,8,8,12,12,8")])
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
+  [(set_attr "non_prefixed_length" "8")
+   (set_attr "prefixed_length" "20")])
 
 (define_insn_and_split "*mov<mode>_32bit"
   [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r")
@@ -8984,7 +8989,8 @@ (define_insn "*mov<mode>_ppc64"
   return rs6000_output_move_128bit (operands);
 }
   [(set_attr "type" "store,store,load,load,*,*")
-   (set_attr "length" "8")])
+   (set_attr "non_prefixed_length" "8,8,8,8,8,40")
+   (set_attr "prefixed_length" "20,20,20,20,8,40")])
 
 (define_split
   [(set (match_operand:TI2 0 "int_reg_operand")
@@ -11502,9 +11508,25 @@ (define_insn "stack_protect_setsi"
   [(set_attr "type" "three")
    (set_attr "length" "12")])
 
-(define_insn "stack_protect_setdi"
-  [(set (match_operand:DI 0 "memory_operand" "=Y")
-	(unspec:DI [(match_operand:DI 1 "memory_operand" "Y")] UNSPEC_SP_SET))
+(define_expand "stack_protect_setdi"
+  [(parallel [(set (match_operand:DI 0 "memory_operand")
+		   (unspec:DI [(match_operand:DI 1 "memory_operand")]
+		   UNSPEC_SP_SET))
+	      (set (match_scratch:DI 2)
+		   (const_int 0))])]
+  "TARGET_64BIT"
+{
+  if (TARGET_PREFIXED_ADDR)
+    {
+      operands[0] = make_memory_non_prefixed (operands[0]);
+      operands[1] = make_memory_non_prefixed (operands[1]);
+    }
+})
+
+(define_insn "*stack_protect_setdi"
+  [(set (match_operand:DI 0 "non_prefixed_memory" "=eM")
+	(unspec:DI [(match_operand:DI 1 "non_prefixed_memory" "eM")]
+		   UNSPEC_SP_SET))
    (set (match_scratch:DI 2 "=&r") (const_int 0))]
   "TARGET_64BIT"
   "ld%U1%X1 %2,%1\;std%U0%X0 %2,%0\;li %2,0"
@@ -11548,10 +11570,27 @@ (define_insn "stack_protect_testsi"
    lwz%U1%X1 %3,%1\;lwz%U2%X2 %4,%2\;cmplw %0,%3,%4\;li %3,0\;li %4,0"
   [(set_attr "length" "16,20")])
 
-(define_insn "stack_protect_testdi"
+(define_expand "stack_protect_testdi"
+  [(parallel [(set (match_operand:CCEQ 0 "cc_reg_operand")
+		   (unspec:CCEQ [(match_operand:DI 1 "memory_operand")
+				 (match_operand:DI 2 "memory_operand")]
+				UNSPEC_SP_TEST))
+	      (set (match_scratch:DI 4)
+		   (const_int 0))
+	      (clobber (match_scratch:DI 3))])]
+  "TARGET_64BIT"
+{
+  if (TARGET_PREFIXED_ADDR)
+    {
+      operands[1] = make_memory_non_prefixed (operands[1]);
+      operands[2] = make_memory_non_prefixed (operands[2]);
+    }
+})
+
+(define_insn "*stack_protect_testdi"
   [(set (match_operand:CCEQ 0 "cc_reg_operand" "=x,?y")
-        (unspec:CCEQ [(match_operand:DI 1 "memory_operand" "Y,Y")
-		      (match_operand:DI 2 "memory_operand" "Y,Y")]
+        (unspec:CCEQ [(match_operand:DI 1 "non_prefixed_memory" "eM,eM")
+		      (match_operand:DI 2 "non_prefixed_memory" "eM,eM")]
 		     UNSPEC_SP_TEST))
    (set (match_scratch:DI 4 "=r,r") (const_int 0))
    (clobber (match_scratch:DI 3 "=&r,&r"))]
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 276284)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -1149,10 +1149,30 @@ (define_insn "vsx_mov<mode>_64bit"
                "vecstore,  vecload,   vecsimple, mffgpr,    mftgpr,    load,
                 store,     load,      store,     *,         vecsimple, vecsimple,
                 vecsimple, *,         *,         vecstore,  vecload")
-   (set_attr "length"
-               "*,         *,         *,         8,         *,         8,
-                8,         8,         8,         8,         *,         *,
-                *,         20,        8,         *,         *")
+   (set (attr "non_prefixed_length")
+	(cond [(and (eq_attr "alternative" "4")		;; MTVSRDD
+		    (match_test "TARGET_P9_VECTOR"))
+	       (const_string "4")
+
+	       (eq_attr "alternative" "3,4")		;; GPR <-> VSX
+	       (const_string "8")
+
+	       (eq_attr "alternative" "5,6,7,8")	;; GPR load/store
+	       (const_string "8")]
+	      (const_string "*")))
+
+   (set (attr "prefixed_length")
+	(cond [(and (eq_attr "alternative" "4")		;; MTVSRDD
+		    (match_test "TARGET_P9_VECTOR"))
+	       (const_string "4")
+
+	       (eq_attr "alternative" "3,4")		;; GPR <-> VSX
+	       (const_string "8")
+
+	       (eq_attr "alternative" "5,6,7,8")	;; GPR load/store
+	       (const_string "20")]
+	      (const_string "*")))
+
    (set_attr "isa"
                "<VSisa>,   <VSisa>,   <VSisa>,   *,         *,         *,
                 *,         *,         *,         *,         p9v,       *,
@@ -3235,9 +3255,10 @@ (define_insn "vsx_vslo_<mode>"
 ;; Variable V2DI/V2DF extract
 (define_insn_and_split "vsx_extract_<mode>_var"
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v,wa,r")
-	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v,m,m")
-			     (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
-			    UNSPEC_VSX_EXTRACT))
+	(unspec:<VS_scalar>
+	 [(match_operand:VSX_D 1 "reg_or_non_pcrel_memory" "v,em,em")
+	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
+	 UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,&b,&b"))
    (clobber (match_scratch:V2DI 4 "=&v,X,X"))]
   "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_DIRECT_MOVE_64BIT"
@@ -3305,9 +3326,10 @@ (define_insn_and_split "*vsx_extract_v4s
 ;; Variable V4SF extract
 (define_insn_and_split "vsx_extract_v4sf_var"
   [(set (match_operand:SF 0 "gpc_reg_operand" "=wa,wa,?r")
-	(unspec:SF [(match_operand:V4SF 1 "input_operand" "v,m,m")
-		    (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
-		   UNSPEC_VSX_EXTRACT))
+	(unspec:SF
+	 [(match_operand:V4SF 1 "reg_or_non_pcrel_memory" "v,em,em")
+	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
+	 UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,&b,&b"))
    (clobber (match_scratch:V2DI 4 "=&v,X,X"))]
   "VECTOR_MEM_VSX_P (V4SFmode) && TARGET_DIRECT_MOVE_64BIT"
@@ -3668,7 +3690,7 @@ (define_insn_and_split "*vsx_extract_<mo
 (define_insn_and_split "vsx_extract_<mode>_var"
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
 	(unspec:<VS_scalar>
-	 [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m")
+	 [(match_operand:VSX_EXTRACT_I 1 "reg_or_non_pcrel_memory" "v,v,em")
 	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 	 UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,r,&b"))
@@ -3688,7 +3710,7 @@ (define_insn_and_split "*vsx_extract_<mo
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
 	(zero_extend:<VS_scalar>
 	 (unspec:<VSX_EXTRACT_I:VS_scalar>
-	  [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m")
+	  [(match_operand:VSX_EXTRACT_I 1 "reg_or_non_pcrel_memory" "v,v,em")
 	   (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 	  UNSPEC_VSX_EXTRACT)))
    (clobber (match_scratch:DI 3 "=r,r,&b"))
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 276284)
+++ gcc/doc/md.texi	(working copy)
@@ -3373,6 +3373,12 @@ asm ("st %1,%0" : "=m<>" (mem) : "r" (va
 
 is not.
 
+@item em
+A memory operand that does not contain a PC-relative address.
+
+@item eM
+A memory operand that does not contain a prefixed address.
+
 @item es
 A ``stable'' memory operand; that is, one which does not include any
 automodification of the base register.  This used to be useful when

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* Re: [PATCH], V4, patch #4.1: Enable prefixed/pc-rel addressing (revised)
  2019-09-30 14:13 ` [PATCH], V4, patch #4.1: Enable prefixed/pc-rel addressing (revised) Michael Meissner
@ 2019-10-01 23:56   ` Segher Boessenkool
  2019-10-02 19:04     ` Michael Meissner
  0 siblings, 1 reply; 37+ messages in thread
From: Segher Boessenkool @ 2019-10-01 23:56 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

Hi Mike,

On Mon, Sep 30, 2019 at 10:12:54AM -0400, Michael Meissner wrote:
> I needed
> to add a second memory constraint ('eM') that prevents using a prefixed
> address.

Do we need both em and eM?  Do we ever want to allow prefixed insns but not
pcrel?  Or, alternatively, this only uses eM for some insns where the "p"
prefix trick won't work (because there are multiple insns in the template);
we could solve that some other way (by inserting the p's manually for
example).

But what should inline asm users do wrt prefixed/pcrel memory?  Should
they not use prefixed memory at all?  That means for asm we should always
disallow prefixed for "m".

Having both em and eM names is a bit confusing (which is what?)  The eM
one should not be documented in the user manual, probably.

Maybe just using wY here will work just as well?  That is also not ideal
of course, but we already have that one anyway.

> 4) In the previous patch, I missed setting the prefixed size and non prefixed
> size for mov<mode>_ppc64 in the insn.  This pattern is used for moving PTImode
> in GPR registers (on non-VSX systems, it would move TImode also).  By the time
> it gets to final, it will have been split, but it is still useful to get the
> sizes correct before the mode is split.

So that is a separate patch?  Please send it as one, then?

> +  /* The LWA instruction uses the DS-form format where the bottom two bits of
> +     the offset must be 0.  The prefixed PLWA does not have this
> +     restriction.  */
> +  if (TARGET_PREFIXED_ADDR
> +      && address_is_prefixed (addr, DImode, NON_PREFIXED_DS))
> +    return true;

Should TARGET_PREFIXED_ADDR be part of address_is_prefixed, instead of
part of all its callers?

> +;; Return 1 if op is a memory operand that is not prefixed.
> +(define_predicate "non_prefixed_memory"
> +  (match_code "mem")
> +{
> +  if (!memory_operand (op, mode))
> +    return false;
> +
> +  enum insn_form iform
> +    = address_to_insn_form (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
> +
> +  return (iform != INSN_FORM_PREFIXED_NUMERIC
> +          && iform != INSN_FORM_PCREL_LOCAL
> +          && iform != INSN_FORM_BAD);
> +})
> +
> +(define_predicate "non_pcrel_memory"
> +  (match_code "mem")
> +{
> +  if (!memory_operand (op, mode))
> +    return false;
> +
> +  enum insn_form iform
> +    = address_to_insn_form (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
> +
> +  return (iform != INSN_FORM_PCREL_EXTERNAL
> +          && iform != INSN_FORM_PCREL_LOCAL
> +          && iform != INSN_FORM_BAD);
> +})

Why does non_prefixed_memory not check INSN_FORM_PCREL_EXTERNAL?  Why does
non_prefixed_memory not use non_pcrel_memory, instead of open-coding it?

What is INSN_FORM_BAD about, in both functions?

> +;; Return 1 if op is either a register operand or a memory operand that does
> +;; not use a PC-relative address.
> +(define_predicate "reg_or_non_pcrel_memory"
> +  (match_code "reg,subreg,mem")
> +{
> +  if (REG_P (op) || SUBREG_P (op))
> +    return register_operand (op, mode);
> +
> +  return non_pcrel_memory (op, mode);
> +})

Why do we need this predicate?  Should it use register_operand like this,
or should it use gpc_reg_operand?

> +  bool pcrel_p = TARGET_PCREL && pcrel_local_address (addr, Pmode);

Similar as above: should TARGET_PCREL be part of pcrel_local_address?

> @@ -6860,6 +6904,12 @@ rs6000_split_vec_extract_var (rtx dest,
>       systems.  */
>    if (MEM_P (src))
>      {
> +      /* If this is a PC-relative address, we would need another register to
> +	 hold the address of the vector along with the variable offset.  The
> +	 callers should use reg_or_non_pcrel_memory to make sure we don't
> +	 get a PC-relative address here.  */

I don't understand this comment, nor the problem.  Please expand?

> +  /* Allow prefixed instructions if supported.  If the bottom two bits of the
> +     offset are non-zero, we could use a prefixed instruction (which does not
> +     have the DS-form constraint that the traditional instruction had) instead
> +     of forcing the unaligned offset to a GPR.  */
> +  if (address_is_prefixed (addr, mode, NON_PREFIXED_DS))
> +    return true;

Here (and for DQ) you aren't testing TARGET_PREFIXED?

> @@ -7371,7 +7435,7 @@ mem_operand_gpr (rtx op, machine_mode mo
>         causes a wrap, so test only the low 16 bits.  */
>      offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
>  
> -  return offset + 0x8000 < 0x10000u - extra;
> +  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);

This is a separate patch (and pre-approved).

> @@ -7404,7 +7475,7 @@ mem_operand_ds_form (rtx op, machine_mod
>         causes a wrap, so test only the low 16 bits.  */
>      offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
>  
> -  return offset + 0x8000 < 0x10000u - extra;
> +  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);

Together with this.

> -  offset += 0x8000;
> -  return offset < 0x10000 - extra;
> +  if (TARGET_PREFIXED_ADDR)
> +    return SIGNED_34BIT_OFFSET_EXTRA_P (offset, extra);
> +  else
> +    return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);

And the 16-bit part of this.

> -	  unsigned HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
> -	  return val + 0x8000 >= 0x10000 - (TARGET_POWERPC64 ? 8 : 12);
> +	  HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
> +	  HOST_WIDE_INT extra = TARGET_POWERPC64 ? 8 : 12;
> +	  if (TARGET_PREFIXED_ADDR)
> +	    return !SIGNED_34BIT_OFFSET_EXTRA_P (val, extra);
> +	  else
> +	    return !SIGNED_16BIT_OFFSET_EXTRA_P (val, extra);

And this.

> +/* How many real instructions are generated for this insn?  This is slightly
> +   different from the length attribute, in that the length attribute counts the
> +   number of bytes.  With prefixed instructions, we don't want to count a
> +   prefixed instruction (length 12 bytes including possible NOP) as taking 3
> +   instructions, but just one.  */
> +
> +static int
> +rs6000_num_insns (rtx_insn *insn)

Separate patch please.  This is a whole different issue.

> --- gcc/config/rs6000/rs6000.md	(revision 276284)
> +++ gcc/config/rs6000/rs6000.md	(working copy)
> @@ -7774,8 +7774,9 @@ (define_insn_and_split "*mov<mode>_64bit
>    "&& reload_completed"
>    [(pc)]
>  { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
> -  [(set_attr "length" "8")
> -   (set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")])
> +  [(set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")
> +   (set_attr "non_prefixed_length" "8")
> +   (set_attr "prefixed_length" "20")])

This one is fine.

> @@ -7786,8 +7787,12 @@ (define_insn_and_split "*movtd_64bit_nod
>    "#"
>    "&& reload_completed"
>    [(pc)]
> -{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
> -  [(set_attr "length" "8,8,8,12,12,8")])
> +{
> +  rs6000_split_multireg_move (operands[0], operands[1]);
> +  DONE;
> +}
> +  [(set_attr "non_prefixed_length" "8")
> +   (set_attr "prefixed_length" "20")])

But this one I don't get...  Various alternatives were 12 before, what
changed?  Or was it wrong?

> @@ -8984,7 +8989,8 @@ (define_insn "*mov<mode>_ppc64"
>    return rs6000_output_move_128bit (operands);
>  }
>    [(set_attr "type" "store,store,load,load,*,*")
> -   (set_attr "length" "8")])
> +   (set_attr "non_prefixed_length" "8,8,8,8,8,40")
> +   (set_attr "prefixed_length" "20,20,20,20,8,40")])

What about the 40 here?

> -(define_insn "stack_protect_setdi"
> -  [(set (match_operand:DI 0 "memory_operand" "=Y")
> -	(unspec:DI [(match_operand:DI 1 "memory_operand" "Y")] UNSPEC_SP_SET))
> +(define_expand "stack_protect_setdi"
> +  [(parallel [(set (match_operand:DI 0 "memory_operand")
> +		   (unspec:DI [(match_operand:DI 1 "memory_operand")]
> +		   UNSPEC_SP_SET))
> +	      (set (match_scratch:DI 2)
> +		   (const_int 0))])]
> +  "TARGET_64BIT"
> +{
> +  if (TARGET_PREFIXED_ADDR)
> +    {
> +      operands[0] = make_memory_non_prefixed (operands[0]);
> +      operands[1] = make_memory_non_prefixed (operands[1]);
> +    }
> +})

I don't understand why this is needed...  Won't a better predicate do
this easier, safer, simpler?  Better than memory_operand, which of course
allows pretty much anything.

> @@ -1149,10 +1149,30 @@ (define_insn "vsx_mov<mode>_64bit"
>                 "vecstore,  vecload,   vecsimple, mffgpr,    mftgpr,    load,
>                  store,     load,      store,     *,         vecsimple, vecsimple,
>                  vecsimple, *,         *,         vecstore,  vecload")
> -   (set_attr "length"
> -               "*,         *,         *,         8,         *,         8,
> -                8,         8,         8,         8,         *,         *,
> -                *,         20,        8,         *,         *")
> +   (set (attr "non_prefixed_length")
> +	(cond [(and (eq_attr "alternative" "4")		;; MTVSRDD
> +		    (match_test "TARGET_P9_VECTOR"))
> +	       (const_string "4")
> +
> +	       (eq_attr "alternative" "3,4")		;; GPR <-> VSX
> +	       (const_string "8")

TARGET_P9_VECTOR is always true for alt 4 (it uses the "we" constraint).
And exactly the same is true for alt 3?  And MTVSRDD has nothing to do
with it?

> +
> +	       (eq_attr "alternative" "5,6,7,8")	;; GPR load/store
> +	       (const_string "8")]
> +	      (const_string "*")))

This loses that alt 13 was len 20, what happened there?  And alts 9 and 14?

> @@ -3235,9 +3255,10 @@ (define_insn "vsx_vslo_<mode>"
>  ;; Variable V2DI/V2DF extract
>  (define_insn_and_split "vsx_extract_<mode>_var"
>    [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v,wa,r")
> -	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v,m,m")
> -			     (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
> -			    UNSPEC_VSX_EXTRACT))
> +	(unspec:<VS_scalar>
> +	 [(match_operand:VSX_D 1 "reg_or_non_pcrel_memory" "v,em,em")
> +	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
> +	 UNSPEC_VSX_EXTRACT))

Please explain why this needs to be non-pcrel (and split this to a separate
patch if possible).


Segher


* Re: [PATCH], V4, patch #4.1: Enable prefixed/pc-rel addressing (revised)
  2019-10-01 23:56   ` Segher Boessenkool
@ 2019-10-02 19:04     ` Michael Meissner
  2019-10-02 22:52       ` Segher Boessenkool
  0 siblings, 1 reply; 37+ messages in thread
From: Michael Meissner @ 2019-10-02 19:04 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Tue, Oct 01, 2019 at 06:56:01PM -0500, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Mon, Sep 30, 2019 at 10:12:54AM -0400, Michael Meissner wrote:
> > I needed
> > to add a second memory constraint ('eM') that prevents using a prefixed
> > address.
> 
> Do we need both em and eM?  Do we ever want to allow prefixed insns but not
> pcrel?  Or, alternatively, this only uses eM for some insns where the "p"
> prefix trick won't work (because there are multiple insns in the template);
> we could solve that some other way (by inserting the p's manually for
> example).

No, right now we need one (no prefix), and the other (no pcrel) is desirable.
If we can only have one, it will mean extra instructions being generated.

In the case of no prefix, we need this for stack_protect_testdi and
stack_protect_setdi.  If we have a large stack frame and enable stack
protection, we don't want the register allocator re-combining the insns into
an insn with a large offset, which it will do even though the predicate does
not allow this case.  This was discovered when we tried to build glibc: one
module (vfwprintf-internal.c) has a large stack frame and is built with
-fstack-protector-all.

In the case of no pc-rel, this occurs in optimizing vector extracts where the
vector is in memory.  In this code, we only have one temporary base register.
In the case where the address is PC-relative, and the element number being
extracted is variable, we would need two temporary base registers, one to hold
the PC-relative address, and the other to hold the offset from the start of the
vector.  So here, we disallow PC-relative addresses, but numeric addresses that
result in a prefixed instruction are fine, since the code calculates the
offset, adds in the offset, and then does the memory operation.

Aaron found this bug, where the code previously tried to use the single
temporary for two different purposes.

If we only had a no prefixed constraint, the code would not combine the vector
extract from memory, and instead, it would load the whole vector into a
register, and then do the appropriate shifts, VSLO, etc. to extract the
element.  I imagine the case shows up when you have large stack frames (or very
large structures).

> But what should inline asm users do wrt prefixed/pcrel memory?  Should
> they not use prefixed memory at all?  That means for asm we should always
> disallow prefixed for "m".

Yes, I've been thinking that for some time.  But I'm not going to worry about
that until the patches are in.

> Having both em and eM names is a bit confusing (which is what?)  The eM
> one should not be documented in the user manual, probably.
> 
> Maybe just using wY here will work just as well?  That is also not ideal
> of course, but we already have that one anyway.

Well, wY does allow prefixed addresses (because it calls mem_operand_ds_form
which also was modified to support prefixed addresses), so it wouldn't help.

> > 4) In the previous patch, I missed setting the prefixed size and non prefixed
> > size for mov<mode>_ppc64 in the insn.  This pattern is used for moving PTImode
> > in GPR registers (on non-VSX systems, it would move TImode also).  By the time
> > it gets to final, it will have been split, but it is still useful to get the
> > sizes correct before the mode is split.
> 
> So that is a separate patch?  Please send it as one, then?

No, it needs to be part of this patch.  It was just missing from the patch I
sent out.

> > +  /* The LWA instruction uses the DS-form format where the bottom two bits of
> > +     the offset must be 0.  The prefixed PLWA does not have this
> > +     restriction.  */
> > +  if (TARGET_PREFIXED_ADDR
> > +      && address_is_prefixed (addr, DImode, NON_PREFIXED_DS))
> > +    return true;
> 
> Should TARGET_PREFIXED_ADDR be part of address_is_prefixed, instead of
> part of all its callers?

Just trying to do the test before the call, so the default case (not 'future')
won't have the call/return.

> > +;; Return 1 if op is a memory operand that is not prefixed.
> > +(define_predicate "non_prefixed_memory"
> > +  (match_code "mem")
> > +{
> > +  if (!memory_operand (op, mode))
> > +    return false;
> > +
> > +  enum insn_form iform
> > +    = address_to_insn_form (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
> > +
> > +  return (iform != INSN_FORM_PREFIXED_NUMERIC
> > +          && iform != INSN_FORM_PCREL_LOCAL
> > +          && iform != INSN_FORM_BAD);
> > +})
> > +
> > +(define_predicate "non_pcrel_memory"
> > +  (match_code "mem")
> > +{
> > +  if (!memory_operand (op, mode))
> > +    return false;
> > +
> > +  enum insn_form iform
> > +    = address_to_insn_form (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
> > +
> > +  return (iform != INSN_FORM_PCREL_EXTERNAL
> > +          && iform != INSN_FORM_PCREL_LOCAL
> > +          && iform != INSN_FORM_BAD);
> > +})
> 
> Why does non_prefixed_memory not check INSN_FORM_PCREL_EXTERNAL?  Why does
> non_prefixed_memory not use non_pcrel_memory, instead of open-coding it?

Again, I'm trying not to have the extra call/return in the pipeline.

> What is INSN_FORM_BAD about, in both functions?

Because I think it is a bad idea to return true in the case where the memory is
not correct (for example, if somebody created an address with 35-bit offsets,
the address_to_insn_form would return INSN_FORM_BAD, but you would get a false
positive if non_pcrel_memory returned true for it).
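
As a rough illustration of the two tests quoted above, here is a standalone
sketch in plain C.  The enum is a hypothetical stand-in for insn_form (only
the values named in this thread plus two made-up non-prefixed forms), and the
function names are invented for the example; it is not the predicates.md code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the patch's insn_form enumeration.
   INSN_FORM_D and INSN_FORM_X are assumed names for ordinary
   non-prefixed forms.  */
enum insn_form
{
  INSN_FORM_BAD,
  INSN_FORM_D,
  INSN_FORM_X,
  INSN_FORM_PREFIXED_NUMERIC,
  INSN_FORM_PCREL_LOCAL,
  INSN_FORM_PCREL_EXTERNAL
};

/* Same comparison as the quoted non_prefixed_memory predicate.  */
bool
non_prefixed_form_p (enum insn_form iform)
{
  return (iform != INSN_FORM_PREFIXED_NUMERIC
	  && iform != INSN_FORM_PCREL_LOCAL
	  && iform != INSN_FORM_BAD);
}

/* Same comparison as the quoted non_pcrel_memory predicate.  */
bool
non_pcrel_form_p (enum insn_form iform)
{
  return (iform != INSN_FORM_PCREL_EXTERNAL
	  && iform != INSN_FORM_PCREL_LOCAL
	  && iform != INSN_FORM_BAD);
}
```

In this model both tests reject INSN_FORM_BAD, and a prefixed numeric offset
still passes the no-pcrel test, which is what the vector extract case relies
on.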

> > +;; Return 1 if op is either a register operand or a memory operand that does
> > +;; not use a PC-relative address.
> > +(define_predicate "reg_or_non_pcrel_memory"
> > +  (match_code "reg,subreg,mem")
> > +{
> > +  if (REG_P (op) || SUBREG_P (op))
> > +    return register_operand (op, mode);
> > +
> > +  return non_pcrel_memory (op, mode);
> > +})
> 
> Why do we need this predicate?  Should it use register_operand like this,
> or should it use gpc_reg_operand?

Again, just micro-optimizing.

> > +  bool pcrel_p = TARGET_PCREL && pcrel_local_address (addr, Pmode);
> 
> Similar as above: should TARGET_PCREL be part of pcrel_local_address?

Sure it can.

> > @@ -6860,6 +6904,12 @@ rs6000_split_vec_extract_var (rtx dest,
> >       systems.  */
> >    if (MEM_P (src))
> >      {
> > +      /* If this is a PC-relative address, we would need another register to
> > +	 hold the address of the vector along with the variable offset.  The
> > +	 callers should use reg_or_non_pcrel_memory to make sure we don't
> > +	 get a PC-relative address here.  */
> 
> I don't understand this comment, nor the problem.  Please expand?

If you have:

	static vector double v;

	double
	get_v_n (size_t n)
	{
	  return vec_extract (v, n);
	}

It calls rs6000_split_vec_extract_var with:

   src		= (mem:V2DF (SYMBOL_REF:DImode "v"))
   element	= (reg:DI <xxx>)
   tmp_gpr	= (reg:DI <yyy>)
   tmp_altivec	= (scratch:V2DF)

So to turn this into a (MEM:DF ...) to access the element, it needs to
calculate the offset from the beginning of the vector into tmp_gpr.  But since
the hardware doesn't allow indexed PC-relative instructions, we have to load
the address into a register to do an indexed load.

Because that combination is not allowed, the compiler then loads up the address
into a temporary register, and calls rs6000_split_vec_extract_var with:
	(MEM:V2DF (reg:DI <zzz>))

I.e.

        rldicl 3,3,0,63
        pla 9,.LANCHOR0@pcrel
        sldi 10,3,3
        lfdx 1,9,10
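
The four instructions above can be read as the following portable C sketch.
The function name and the array parameter are made up for illustration; the
point is only that a variable element needs a base address in one register
and a computed byte offset in another.

```c
#include <stddef.h>

/* Hypothetical model of the variable V2DF extract from memory.  The
   comments map each step to the instruction shown in the listing.  */
double
extract_v2df_element (const double vec[2], size_t n)
{
  size_t index = n & 1;				/* rldicl 3,3,0,63 */
  const char *base = (const char *) vec;	/* pla 9,.LANCHOR0@pcrel */
  size_t byte_offset = index << 3;		/* sldi 10,3,3 */
  return *(const double *) (base + byte_offset);	/* lfdx 1,9,10 */
}
```

Both the base and the byte offset are live at the load, so a single scratch
register cannot hold both.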


> > +  /* Allow prefixed instructions if supported.  If the bottom two bits of the
> > +     offset are non-zero, we could use a prefixed instruction (which does not
> > +     have the DS-form constraint that the traditional instruction had) instead
> > +     of forcing the unaligned offset to a GPR.  */
> > +  if (address_is_prefixed (addr, mode, NON_PREFIXED_DS))
> > +    return true;
> 
> Here (and for DQ) you aren't testing TARGET_PREFIXED?

Probably it should be (or remove the tests elsewhere).

> > @@ -7371,7 +7435,7 @@ mem_operand_gpr (rtx op, machine_mode mo
> >         causes a wrap, so test only the low 16 bits.  */
> >      offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
> >  
> > -  return offset + 0x8000 < 0x10000u - extra;
> > +  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
> 
> This is a separate patch (and pre-approved).
> 
> > @@ -7404,7 +7475,7 @@ mem_operand_ds_form (rtx op, machine_mod
> >         causes a wrap, so test only the low 16 bits.  */
> >      offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
> >  
> > -  return offset + 0x8000 < 0x10000u - extra;
> > +  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
> 
> Together with this.
> 
> > -  offset += 0x8000;
> > -  return offset < 0x10000 - extra;
> > +  if (TARGET_PREFIXED_ADDR)
> > +    return SIGNED_34BIT_OFFSET_EXTRA_P (offset, extra);
> > +  else
> > +    return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
> 
> And the 16-bit part of this.
>
> > -	  unsigned HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
> > -	  return val + 0x8000 >= 0x10000 - (TARGET_POWERPC64 ? 8 : 12);
> > +	  HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
> > +	  HOST_WIDE_INT extra = TARGET_POWERPC64 ? 8 : 12;
> > +	  if (TARGET_PREFIXED_ADDR)
> > +	    return !SIGNED_34BIT_OFFSET_EXTRA_P (val, extra);
> > +	  else
> > +	    return !SIGNED_16BIT_OFFSET_EXTRA_P (val, extra);
> 
> And this.

Ok.
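
For reference, a standalone sketch of the range checks being discussed.  It
assumes the two macros mean "value and value + extra both fit in a signed
16-bit (or 34-bit) field", which is what the removed expression
offset + 0x8000 < 0x10000u - extra computes for the 16-bit case.  The
function names are made up; the real macros may be written differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if VALUE is at least -0x8000 and VALUE + EXTRA stays below
   0x8000.  Written as VALUE < limit - EXTRA to avoid overflow.  */
bool
signed_16bit_offset_extra_p (int64_t value, int64_t extra)
{
  const int64_t limit = INT64_C (0x8000);
  return value >= -limit && value < limit - extra;
}

/* Same test for the 34-bit displacement of a prefixed instruction.  */
bool
signed_34bit_offset_extra_p (int64_t value, int64_t extra)
{
  const int64_t limit = INT64_C (1) << 33;
  return value >= -limit && value < limit - extra;
}
```

With extra = 8 (the TARGET_POWERPC64 case quoted above), an offset of 32760
no longer qualifies as a 16-bit offset but still qualifies as a 34-bit one.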

> > +/* How many real instructions are generated for this insn?  This is slightly
> > +   different from the length attribute, in that the length attribute counts the
> > +   number of bytes.  With prefixed instructions, we don't want to count a
> > +   prefixed instruction (length 12 bytes including possible NOP) as taking 3
> > +   instructions, but just one.  */
> > +
> > +static int
> > +rs6000_num_insns (rtx_insn *insn)
> 
> Separate patch please.  This is a whole different issue.

Ok.
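
To make the counting rule easy to check by hand, here is the arithmetic from
rs6000_num_insns pulled out into a standalone function.  The name and the
boolean parameter are hypothetical; the boolean stands in for the condition
in the patch that ends in get_attr_prefixed (insn) == PREFIXED_YES.

```c
#include <stdbool.h>

/* Hypothetical standalone version of the length-to-count mapping.
   LENGTH is the insn length in bytes.  */
int
num_insns_from_length (int length, bool prefixed)
{
  if (prefixed)
    {
      /* Single prefixed instruction (8 bytes plus a possible NOP).  */
      if (length == 12)
	return 1;

      /* A normal plus a prefixed instruction (16), or two back to back
	 prefixed instructions (20).  */
      if (length == 16 || length == 20)
	return 2;

      /* Guess for larger sizes, as in the patch.  */
      return 2 + (length - 20) / 4;
    }

  return length / 4;
}
```

So a 12-byte prefixed insn counts as one instruction for the cost
calculation, where the old length / 4 formula would have counted three.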

> > --- gcc/config/rs6000/rs6000.md	(revision 276284)
> > +++ gcc/config/rs6000/rs6000.md	(working copy)
> > @@ -7774,8 +7774,9 @@ (define_insn_and_split "*mov<mode>_64bit
> >    "&& reload_completed"
> >    [(pc)]
> >  { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
> > -  [(set_attr "length" "8")
> > -   (set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")])
> > +  [(set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")
> > +   (set_attr "non_prefixed_length" "8")
> > +   (set_attr "prefixed_length" "20")])
> 
> This one is fine.
> 
> > @@ -7786,8 +7787,12 @@ (define_insn_and_split "*movtd_64bit_nod
> >    "#"
> >    "&& reload_completed"
> >    [(pc)]
> > -{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
> > -  [(set_attr "length" "8,8,8,12,12,8")])
> > +{
> > +  rs6000_split_multireg_move (operands[0], operands[1]);
> > +  DONE;
> > +}
> > +  [(set_attr "non_prefixed_length" "8")
> > +   (set_attr "prefixed_length" "20")])
> 
> But this one I don't get...  Various alternatives were 12 before, what
> changed?  Or was it wrong?

As far as I can tell it was wrong.  But I can add 4 to the values if you think
it might be needed.

> > @@ -8984,7 +8989,8 @@ (define_insn "*mov<mode>_ppc64"
> >    return rs6000_output_move_128bit (operands);
> >  }
> >    [(set_attr "type" "store,store,load,load,*,*")
> > -   (set_attr "length" "8")])
> > +   (set_attr "non_prefixed_length" "8,8,8,8,8,40")
> > +   (set_attr "prefixed_length" "20,20,20,20,8,40")])
> 
> What about the 40 here?

Dunno.  The original code just used 8, which is wrong.  I was trying to guess
what the maximum # of insns to load up a 128-bit value in two GPRs was.  I can
certainly use 8 like we have now.  Fortunately, by the time it matters, it
will have been split, and then the normal counts for loading up a single
register will be correct.

> > -(define_insn "stack_protect_setdi"
> > -  [(set (match_operand:DI 0 "memory_operand" "=Y")
> > -	(unspec:DI [(match_operand:DI 1 "memory_operand" "Y")] UNSPEC_SP_SET))
> > +(define_expand "stack_protect_setdi"
> > +  [(parallel [(set (match_operand:DI 0 "memory_operand")
> > +		   (unspec:DI [(match_operand:DI 1 "memory_operand")]
> > +		   UNSPEC_SP_SET))
> > +	      (set (match_scratch:DI 2)
> > +		   (const_int 0))])]
> > +  "TARGET_64BIT"
> > +{
> > +  if (TARGET_PREFIXED_ADDR)
> > +    {
> > +      operands[0] = make_memory_non_prefixed (operands[0]);
> > +      operands[1] = make_memory_non_prefixed (operands[1]);
> > +    }
> > +})
> 
> I don't understand why this is needed...  Won't a better predicate do
> this easier, safer, simpler?  Better than memory_operand, which of course
> allows pretty much anything.

For the define_expand, you need it to be more general, so that you can call
make_memory_non_prefixed.
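
A toy model of what make_memory_non_prefixed does to an address, for readers
who don't want to trace the RTL.  All of the types and names here are
invented for the example, and the "is prefixed" test is simplified to "the
constant does not fit in 16 bits, or the address is something else such as a
pc-relative symbol"; it ignores the DS-form low-bit restriction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical address shapes, not GCC RTL codes.  */
enum toy_addr_kind
{
  TOY_REG,		/* (reg) */
  TOY_REG_PLUS_CONST,	/* (plus (reg) (const_int)) */
  TOY_REG_PLUS_REG,	/* (plus (reg) (reg)), X-form */
  TOY_OTHER		/* e.g. a pc-relative symbol */
};

struct toy_addr
{
  enum toy_addr_kind kind;
  int64_t offset;	/* Only meaningful for TOY_REG_PLUS_CONST.  */
};

/* Simplified assumption about which addresses need a prefixed insn.  */
bool
toy_address_is_prefixed (struct toy_addr addr)
{
  if (addr.kind == TOY_REG_PLUS_CONST)
    return addr.offset < -0x8000 || addr.offset > 0x7fff;

  return addr.kind == TOY_OTHER;
}

/* Mirror of the two branches in make_memory_non_prefixed: a reg+const
   address has the constant moved to a register (giving reg+reg); any
   other prefixed address is loaded into a single register.  */
struct toy_addr
toy_make_non_prefixed (struct toy_addr addr)
{
  struct toy_addr result = addr;

  if (toy_address_is_prefixed (addr))
    {
      result.offset = 0;
      result.kind = (addr.kind == TOY_REG_PLUS_CONST
		     ? TOY_REG_PLUS_REG : TOY_REG);
    }

  return result;
}
```

The expander accepts any memory operand and rewrites it this way, so that the
insn pattern only ever sees an address its stricter predicate accepts.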

> > @@ -1149,10 +1149,30 @@ (define_insn "vsx_mov<mode>_64bit"
> >                 "vecstore,  vecload,   vecsimple, mffgpr,    mftgpr,    load,
> >                  store,     load,      store,     *,         vecsimple, vecsimple,
> >                  vecsimple, *,         *,         vecstore,  vecload")
> > -   (set_attr "length"
> > -               "*,         *,         *,         8,         *,         8,
> > -                8,         8,         8,         8,         *,         *,
> > -                *,         20,        8,         *,         *")
> > +   (set (attr "non_prefixed_length")
> > +	(cond [(and (eq_attr "alternative" "4")		;; MTVSRDD
> > +		    (match_test "TARGET_P9_VECTOR"))
> > +	       (const_string "4")
> > +
> > +	       (eq_attr "alternative" "3,4")		;; GPR <-> VSX
> > +	       (const_string "8")
> 
> TARGET_P9_VECTOR is always true for alt 4 (it uses the "we" constraint).
> And exactly the same is true for alt 3?  And MTVSRDD has nothing to do
> with it?

I thought it was P8 code, not P9.  On P8, you would have to do two MTVSRD's and
a combine insn, which you can't express in a normal move.

> 
> > +
> > +	       (eq_attr "alternative" "5,6,7,8")	;; GPR load/store
> > +	       (const_string "8")]
> > +	      (const_string "*")))
> 
> This loses that alt 13 was len 20, what happened there?  And alts 9 and 14?
> 
> > @@ -3235,9 +3255,10 @@ (define_insn "vsx_vslo_<mode>"
> >  ;; Variable V2DI/V2DF extract
> >  (define_insn_and_split "vsx_extract_<mode>_var"
> >    [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v,wa,r")
> > -	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v,m,m")
> > -			     (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
> > -			    UNSPEC_VSX_EXTRACT))
> > +	(unspec:<VS_scalar>
> > +	 [(match_operand:VSX_D 1 "reg_or_non_pcrel_memory" "v,em,em")
> > +	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
> > +	 UNSPEC_VSX_EXTRACT))
> 
> Please explain why this needs to be non-pcrel (and split this to a separate
> patch if possible).

See above.  If it is a separate patch, it will at least need to have the
gcc_unreachable's to flag when it is used.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* Re: [PATCH], V4, patch #4.1: Enable prefixed/pc-rel addressing (revised)
  2019-10-02 19:04     ` Michael Meissner
@ 2019-10-02 22:52       ` Segher Boessenkool
  0 siblings, 0 replies; 37+ messages in thread
From: Segher Boessenkool @ 2019-10-02 22:52 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

On Wed, Oct 02, 2019 at 03:04:35PM -0400, Michael Meissner wrote:
> On Tue, Oct 01, 2019 at 06:56:01PM -0500, Segher Boessenkool wrote:
> > On Mon, Sep 30, 2019 at 10:12:54AM -0400, Michael Meissner wrote:
> > > I needed
> > > to add a second memory constraint ('eM') that prevents using a prefixed
> > > address.
> > 
> > Do we need both em and eM?  Do we ever want to allow prefixed insns but not
> > pcrel?  Or, alternatively, this only uses eM for some insns where the "p"
> > prefix trick won't work (because there are multiple insns in the template);
> > we could solve that some other way (by inserting the p's manually for
> > example).
> 
> No right now we need one (no prefix) and the other (no pcrel) is desirable, but
> if we can only have one, it will mean extra instructions being generated.

We can have both if we need to, but it should have less confusing names at
a minimum.

> In the case of no prefix, we need this for stack_protect_testdi and
> stack_protect_setdi.  There if we have a large stack frame and enable stack
> protection, we don't want the register allocator from re-combining the insns to
> an insn with a large offset, which it will do, even though the predicate does
> not allow this case.  This was discovered when we tried to build glibc, and one
> module (vfwprintf-internal.c) has a large stack frame and is built with
> -fstack-protector-all.

Yes, but why does it not allow prefixed insns anyway?  It does not currently
*handle* prefixed insns properly, but that can be fixed.  It won't be
pretty, but it won't be horrible either.

Anyway, we need to be able to handle non-prefixed anyway (for asm, as I
mentioned later), so yes we want a constraint for that, and have "m" in
inline asm mean that (just like right now it actually means "m but not
update form").

> In the case of no pc-rel, this occurs in optimizing vector extracts where the
> vector is in memory.  In this code, we only have one temporary base register.

Why?

> In the case where the address is PC-relative, and the element number being
> extracted is variable, we would need two temporary base registers, one to hold
> the PC-relative address, and the other to hold the offset from the start of the
> vector.  So here, we disallow PC-relative addresses, but numeric addresses that
> result in a prefixed instruction are fine, since the code calculates the
> offset, adds in the offset, and then does the memory operation.

So it should have more scratch registers here?

Or, alternatively, we can just disallow all prefixed addressing here?  Will
that really degrade anything?

> If we only had a no prefixed constraint, the code would not combine the vector
> extract from memory, and instead, it would load the whole vector into a
> register, and then do the appropriate shifts, VSLO, etc. to extract the
> element.  I imagine the case shows up when you have large stack frames (or very
> large structures).

I don't understand.

> > But what should inline asm users do wrt prefixed/pcrel memory?  Should
> > they not use prefixed memory at all?  That means for asm we should always
> > disallow prefixed for "m".
> 
> Yes, I've been thinking that for some time.  But I'm not going to worry about
> that until the patches are in.

Please do worry about it.

> > > 4) In the previous patch, I missed setting the prefixed size and non prefixed
> > > size for mov<mode>_ppc64 in the insn.  This pattern is used for moving PTImode
> > > in GPR registers (on non-VSX systems, it would move TImode also).  By the time
> > > it gets to final, it will have been split, but it is still useful to get the
> > > sizes correct before the mode is split.
> > 
> > So that is a separate patch?  Please send it as one, then?
> 
> No, it needs to be part of this patch.  It was just missing from the patch I
> sent out.

This patch does a whole lot of separate things.  It needs to be split up,
it took ages to review it like this.

> > > +  /* The LWA instruction uses the DS-form format where the bottom two bits of
> > > +     the offset must be 0.  The prefixed PLWA does not have this
> > > +     restriction.  */
> > > +  if (TARGET_PREFIXED_ADDR
> > > +      && address_is_prefixed (addr, DImode, NON_PREFIXED_DS))
> > > +    return true;
> > 
> > Should TARGET_PREFIXED_ADDR be part of address_is_prefixed, instead of
> > part of all its callers?
> 
> Just trying to do the test before the call, so the default case (not 'future')
> won't have the call/return.

Don't micro-optimise like this please, the compiler can do this better
than humans can.  And humans *can* forget it in places, which gives ICEs.

> > What is INSN_FORM_BAD about, in both functions?
> 
> Because I think it is a bad idea to return true in the case where the memory is
> not correct (for example, if somebody created an address with 35-bit offsets,
> the address_to_insn_form would return INSN_FORM_BAD, but you would get a false
> positive if non_pcrel_memory returned true for it).

Add an assert that this doesn't happen, if you are worried that it could.

> > > +;; Return 1 if op is either a register operand or a memory operand that does
> > > +;; not use a PC-relative address.
> > > +(define_predicate "reg_or_non_pcrel_memory"
> > > +  (match_code "reg,subreg,mem")
> > > +{
> > > +  if (REG_P (op) || SUBREG_P (op))
> > > +    return register_operand (op, mode);
> > > +
> > > +  return non_pcrel_memory (op, mode);
> > > +})
> > 
> > Why do we need this predicate?  Should it use register_operand like this,
> > or should it use gpc_reg_operand?
> 
> Again, just micro-optimizing.

Please don't?  Also, that doesn't answer the second question.

> > > @@ -6860,6 +6904,12 @@ rs6000_split_vec_extract_var (rtx dest,
> > >       systems.  */
> > >    if (MEM_P (src))
> > >      {
> > > +      /* If this is a PC-relative address, we would need another register to
> > > +	 hold the address of the vector along with the variable offset.  The
> > > +	 callers should use reg_or_non_pcrel_memory to make sure we don't
> > > +	 get a PC-relative address here.  */
> > 
> > I don't understand this comment, nor the problem.  Please expand?
> 
> If you have:
> 
> 	static vector double v;
> 
> 	double
> 	get_v_n (size_t n)
> 	{
> 	  return vec_extract (v, n);
> 	}
> 
> It calls rs6000_split_vec_extract_var with:
> 
>    dest		= (mem:V2DF (SYMBOL_REF:DImode "v"))
>    element	= (reg:DI <xxx>)
>    tmp_gpr	= (reg:DI <yyy>)
>    tmp_altivec	= (scratch:V2DF)
> 
> So to turn this into a (MEM:DF ...) to access the element, it needs to
> calculate the offset from the beginning of the vector into tmp_gpr.  But since
> the hardware doesn't allow indexed PC-relative instructions, we have to load
> the address into a register to do an indexed load.
> 
> Because that combination is not allowed, the compiler then loads up the address
> into a temporary register, and calls rs6000_split_vec_extract_var with:
> 	(MEM:V2DF (reg:DI <zzz>))
> 
> I.e.
> 
>         rldicl 3,3,0,63
>         pla 9,.LANCHOR0@pcrel
>         sldi 10,3,3
>         lfdx 1,9,10

Where is that rldicl generated?  You can solve this problem there (already
shift it there, saving an instruction as well!)

Or you can do
  sldi 3,3,3
  lfdx 1,9,3
  srdi 3,3,3
(and that last insn will be deleted later, if r3 isn't used after this).

> > > -{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
> > > -  [(set_attr "length" "8,8,8,12,12,8")])
> > > +{
> > > +  rs6000_split_multireg_move (operands[0], operands[1]);
> > > +  DONE;
> > > +}
> > > +  [(set_attr "non_prefixed_length" "8")
> > > +   (set_attr "prefixed_length" "20")])
> > 
> > But this one I don't get...  Various alternatives were 12 before, what
> > changed?  Or was it wrong?
> 
> As far as I can tell it was wrong.  But I can add 4 to the values if you think
> it might be needed.

Please check if it was correct, and send a patch fixing it *before* this
one if not.

> > > @@ -8984,7 +8989,8 @@ (define_insn "*mov<mode>_ppc64"
> > >    return rs6000_output_move_128bit (operands);
> > >  }
> > >    [(set_attr "type" "store,store,load,load,*,*")
> > > -   (set_attr "length" "8")])
> > > +   (set_attr "non_prefixed_length" "8,8,8,8,8,40")
> > > +   (set_attr "prefixed_length" "20,20,20,20,8,40")])
> > 
> > What about the 40 here?
> 
> Dunno.  The original code just used 8, which is wrong.

Same as above then, please: separate patch.


Segher


* [PATCH], V4, patch #9 [part of patch #4.2], Add prefixed address offset checks
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (10 preceding siblings ...)
  2019-09-30 14:13 ` [PATCH], V4, patch #4.1: Enable prefixed/pc-rel addressing (revised) Michael Meissner
@ 2019-10-04 12:29 ` Michael Meissner
  2019-10-09 22:24   ` Segher Boessenkool
  2019-10-04 12:35 ` [PATCH], V4, patch #10 [part of patch #4.2], Set prefixed length for 128-bit non-vector type Michael Meissner
                   ` (9 subsequent siblings)
  21 siblings, 1 reply; 37+ messages in thread
From: Michael Meissner @ 2019-10-04 12:29 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

I was asked to split V4 patch #4.2 into smaller chunks.  This patch is one of
8 patches that were broken out from 4.2.  Another patch from 4.2 to use
SIGNED_16BIT_OFFSET_EXTRA_P has already been committed.

This patch adds checks in the various places that validate an offset, so that
they allow the numeric 34-bit offsets and the PC-relative offsets that prefixed
memory instructions add.
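
As a rough illustration of the range check described above, here is a minimal
standalone sketch.  The helper names (signed_offset_extra_p, offset_ok) are
hypothetical and are not the GCC macros; the sketch only assumes that the
"extra" argument reserves room at the top of the range for the later words of a
multi-word access, as the EXTRA_P macro names suggest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not GCC's macro): return true if OFFSET fits in a
   signed BITS-bit displacement and OFFSET + EXTRA still fits as well.  */
static bool
signed_offset_extra_p (int64_t offset, int bits, int64_t extra)
{
  assert (bits > 1 && bits < 64 && extra >= 0);

  int64_t lo = -((int64_t) 1 << (bits - 1));
  int64_t hi = ((int64_t) 1 << (bits - 1)) - 1 - extra;
  return offset >= lo && offset <= hi;
}

/* Hypothetical wrapper: use the 34-bit range when prefixed addressing is
   available, otherwise the traditional 16-bit range.  */
static bool
offset_ok (int64_t offset, int64_t extra, bool prefixed_addr)
{
  return signed_offset_extra_p (offset, prefixed_addr ? 34 : 16, extra);
}
```

With this sketch, an offset of 32768 is rejected without prefixed addressing
and accepted with it.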

Using all of the patches in this series, I have bootstrapped the compiler on a
little endian power8 system and run the regression tests.  In addition, I have
built the Spec 2006 and 2017 benchmark suites, for -mcpu=power8, -mcpu=power9,
and -mcpu=future, and all of the benchmarks build.  Can I check this into the
trunk?

2019-10-03  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (quad_address_p): Add check for prefixed
	addresses.
	(mem_operand_gpr): Add check for prefixed addresses.
	(mem_operand_ds_form): Add check for prefixed addresses.
	(rs6000_legitimate_offset_address_p): If we support prefixed
	addresses, check for a 34-bit offset instead of 16-bit.
	(rs6000_legitimate_address_p): Add check for prefixed addresses.
	Do not allow load/store with update if the address is prefixed.
	(rs6000_mode_dependent_address):  If we support prefixed
	addresses, check for a 34-bit offset instead of 16-bit.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276523)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -7250,6 +7250,13 @@ quad_address_p (rtx addr, machine_mode m
   if (VECTOR_MODE_P (mode) && !mode_supports_dq_form (mode))
     return false;
 
+  /* Is this a valid prefixed address?  If the bottom four bits of the offset
+     are non-zero, we could use a prefixed instruction (which does not have the
+     DQ-form constraint that the traditional instruction had) instead of
+     forcing the unaligned offset to a GPR.  */
+  if (address_is_prefixed (addr, mode, NON_PREFIXED_DQ))
+    return true;
+
   if (GET_CODE (addr) != PLUS)
     return false;
 
@@ -7351,6 +7358,13 @@ mem_operand_gpr (rtx op, machine_mode mo
       && legitimate_indirect_address_p (XEXP (addr, 0), false))
     return true;
 
+  /* Allow prefixed instructions if supported.  If the bottom two bits of the
+     offset are non-zero, we could use a prefixed instruction (which does not
+     have the DS-form constraint that the traditional instruction had) instead
+     of forcing the unaligned offset to a GPR.  */
+  if (address_is_prefixed (addr, mode, NON_PREFIXED_DS))
+    return true;
+
   /* Don't allow non-offsettable addresses.  See PRs 83969 and 84279.  */
   if (!rs6000_offsettable_memref_p (op, mode, false))
     return false;
@@ -7385,6 +7399,13 @@ mem_operand_ds_form (rtx op, machine_mod
   int extra;
   rtx addr = XEXP (op, 0);
 
+  /* Allow prefixed instructions if supported.  If the bottom two bits of the
+     offset are non-zero, we could use a prefixed instruction (which does not
+     have the DS-form constraint that the traditional instruction had) instead
+     of forcing the unaligned offset to a GPR.  */
+  if (address_is_prefixed (addr, mode, NON_PREFIXED_DS))
+    return true;
+
   if (!offsettable_address_p (false, mode, addr))
     return false;
 
@@ -7754,7 +7775,10 @@ rs6000_legitimate_offset_address_p (mach
       break;
     }
 
-  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
+  if (TARGET_PREFIXED_ADDR)
+    return SIGNED_34BIT_OFFSET_EXTRA_P (offset, extra);
+  else
+    return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 
 bool
@@ -8651,6 +8675,11 @@ rs6000_legitimate_address_p (machine_mod
       && mode_supports_pre_incdec_p (mode)
       && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
     return 1;
+
+  /* Handle prefixed addresses (PC-relative or 34-bit offset).  */
+  if (address_is_prefixed (x, mode, NON_PREFIXED_DEFAULT))
+    return 1;
+
   /* Handle restricted vector d-form offsets in ISA 3.0.  */
   if (quad_offset_p)
     {
@@ -8709,7 +8738,11 @@ rs6000_legitimate_address_p (machine_mod
 	  || (!avoiding_indexed_address_p (mode)
 	      && legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict)))
       && rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0)))
-    return 1;
+    {
+      /* There is no prefixed version of the load/store with update.  */
+      rtx addr = XEXP (x, 1);
+      return !address_is_prefixed (addr, mode, NON_PREFIXED_DEFAULT);
+    }
   if (reg_offset_p && !quad_offset_p
       && legitimate_lo_sum_address_p (mode, x, reg_ok_strict))
     return 1;
@@ -8773,7 +8806,10 @@ rs6000_mode_dependent_address (const_rtx
 	{
 	  HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
 	  HOST_WIDE_INT extra = TARGET_POWERPC64 ? 8 : 12;
-	  return !SIGNED_16BIT_OFFSET_EXTRA_P (val, extra);
+	  if (TARGET_PREFIXED_ADDR)
+	    return !SIGNED_34BIT_OFFSET_EXTRA_P (val, extra);
+	  else
+	    return !SIGNED_16BIT_OFFSET_EXTRA_P (val, extra);
 	}
       break;
 

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], V4, patch #10 [part of patch #4.2], Set prefixed length for 128-bit non-vector type
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (11 preceding siblings ...)
  2019-10-04 12:29 ` [PATCH], V4, patch #9 [part of patch #4.2], Add prefixed address offset checks Michael Meissner
@ 2019-10-04 12:35 ` Michael Meissner
  2019-10-04 12:41 ` [PATCH], V4, patch #11 [part of patch #4.2], Adjust insn cost for prefixed instructions Michael Meissner
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-10-04 12:35 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

I was asked to split V4 patch #4.2 into smaller chunks.  This patch is one of
8 patches that were broken out from 4.2.  Another patch from 4.2 to use
SIGNED_16BIT_OFFSET_EXTRA_P has already been committed.

This patch sets the prefixed and non-prefixed instruction sizes for the
non-vector 128-bit mode (PTImode, TDmode, IFmode, and optionally TFmode).

Using all of the patches in this series, I have bootstrapped the compiler on a
little endian power8 system and run the regression tests.  In addition, I have
built the Spec 2006 and 2017 benchmark suites, for -mcpu=power8, -mcpu=power9,
and -mcpu=future, and all of the benchmarks build.  Can I check this into the
trunk?

2019-10-03  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.md (mov<mode>_64bit_dm): Set prefixed and
	non-prefixed length.
	(movtd_64bit_nodm): Set prefixed and non-prefixed length.
	(mov<mode>_ppc64): Set prefixed and non-prefixed length.

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 276534)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -7773,9 +7773,13 @@ (define_insn_and_split "*mov<mode>_64bit
   "#"
   "&& reload_completed"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8")
-   (set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")])
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
+  [(set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")
+   (set_attr "non_prefixed_length" "8")
+   (set_attr "prefixed_length" "20")])
 
 (define_insn_and_split "*movtd_64bit_nodm"
   [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
@@ -7786,8 +7790,12 @@ (define_insn_and_split "*movtd_64bit_nod
   "#"
   "&& reload_completed"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8,8,8,12,12,8")])
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
+  [(set_attr "non_prefixed_length" "8")
+   (set_attr "prefixed_length" "20")])
 
 (define_insn_and_split "*mov<mode>_32bit"
   [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r")
@@ -8984,7 +8992,8 @@ (define_insn "*mov<mode>_ppc64"
   return rs6000_output_move_128bit (operands);
 }
   [(set_attr "type" "store,store,load,load,*,*")
-   (set_attr "length" "8")])
+   (set_attr "prefixed_length" "20")
+   (set_attr "non_prefixed_length" "8")])
 
 (define_split
   [(set (match_operand:TI2 0 "int_reg_operand")

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], V4, patch #11 [part of patch #4.2], Adjust insn cost for prefixed instructions
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (12 preceding siblings ...)
  2019-10-04 12:35 ` [PATCH], V4, patch #10 [part of patch #4.2], Set prefixed length for 128-bit non-vector type Michael Meissner
@ 2019-10-04 12:41 ` Michael Meissner
  2019-10-04 12:46 ` [PATCH], V4, patch #12 [part of patch #4.2], Update predicates Michael Meissner
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-10-04 12:41 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

I was asked to split V4 patch #4.2 into smaller chunks.  This patch is one of
8 patches that were broken out from 4.2.  Another patch from 4.2 to use
SIGNED_16BIT_OFFSET_EXTRA_P has already been committed.

This patch adjusts the insn cost so that a prefixed instruction costs the same
as a non-prefixed instruction.  Without this change, a prefixed instruction
would seem 3 times as expensive, because its length is 12 bytes (including a
possible NOP) compared to the normal 4 bytes.
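
The length-to-instruction-count mapping can be sketched in isolation as
follows.  This is only an illustration that mirrors the arithmetic in the diff
below; the function name insn_count_from_length and its boolean argument are
hypothetical stand-ins for the get_attr_length and get_attr_prefixed lookups.

```c
#include <stdbool.h>

/* Hypothetical standalone version of the count used for the cost: LENGTH is
   the insn length in bytes, PREFIXED says whether the insn contains a
   prefixed instruction.  */
static int
insn_count_from_length (int length, bool prefixed)
{
  /* Default: one instruction per 4 bytes.  */
  int n = length / 4;

  if (length >= 12 && prefixed)
    {
      /* Single prefixed instruction (8 bytes plus a possible NOP).  */
      if (length == 12)
	n = 1;

      /* A normal instruction plus a prefixed instruction (16), or two
	 back to back prefixed instructions (20).  */
      else if (length == 16 || length == 20)
	n = 2;

      /* Guess for larger sizes.  */
      else
	n = 2 + (length - 20) / 4;
    }

  return n;
}
```

So a 12-byte insn counts as 3 instructions when it is not prefixed, but as 1
when it is.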

Using all of the patches in this series, I have bootstrapped the compiler on a
little endian power8 system and run the regression tests.  In addition, I have
built the Spec 2006 and 2017 benchmark suites, for -mcpu=power8, -mcpu=power9,
and -mcpu=future, and all of the benchmarks build.  Can I check this into the
trunk?

2019-10-03  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_insn_cost): Do not make prefixed
	instructions cost more because they are larger in size.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276535)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -20972,14 +20972,38 @@ rs6000_insn_cost (rtx_insn *insn, bool s
   if (recog_memoized (insn) < 0)
     return 0;
 
-  if (!speed)
-    return get_attr_length (insn);
+  if (speed)
+    {
+      int cost = get_attr_cost (insn);
+      if (cost > 0)
+	return cost;
+    }
 
-  int cost = get_attr_cost (insn);
-  if (cost > 0)
-    return cost;
+  int cost;
+  int length = get_attr_length (insn);
+  int n = length / 4;
+
+  /* How many real instructions are generated for this insn?  This is slightly
+     different from the length attribute, in that the length attribute counts
+     the number of bytes.  With prefixed instructions, we don't want to count a
+     prefixed instruction (length 12 bytes including possible NOP) as taking 3
+     instructions, but just one.  */
+  if (length >= 12 && get_attr_prefixed (insn) == PREFIXED_YES)
+    {
+      /* Single prefixed instruction.  */
+      if (length == 12)
+	n = 1;
+
+      /* A normal instruction and a prefixed instruction (16) or two back
+	 to back prefixed instructions (20).  */
+      else if (length == 16 || length == 20)
+	n = 2;
+
+      /* Guess for larger instruction sizes.  */
+      else
+	n = 2 + (length - 20) / 4;
+    }
 
-  int n = get_attr_length (insn) / 4;
   enum attr_type type = get_attr_type (insn);
 
   switch (type)

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], V4, patch #12 [part of patch #4.2], Update predicates
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (13 preceding siblings ...)
  2019-10-04 12:41 ` [PATCH], V4, patch #11 [part of patch #4.2], Adjust insn cost for prefixed instructions Michael Meissner
@ 2019-10-04 12:46 ` Michael Meissner
  2019-10-04 12:51 ` [PATCH], V4, patch #13 [part of patch #4.2], Update stack protect insns for prefixed addresses Michael Meissner
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-10-04 12:46 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

I was asked to split V4 patch #4.2 into smaller chunks.  This patch is one of
8 patches that were broken out from 4.2.  Another patch from 4.2 to use
SIGNED_16BIT_OFFSET_EXTRA_P has already been committed.

This patch adds some new predicates that will be used in future patches.  It
also updates the lwa_operand predicate used in extendsidi2 to know that if we
support prefixed memory addresses, we can use odd offsets.
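
To make the odd-offset point concrete, here is a small sketch of the two
offset rules.  The function names are hypothetical and this is not the
predicate itself; it only assumes what the comment in the patch states, namely
that the DS-form LWA needs the bottom two bits of its 16-bit offset to be zero,
while the prefixed PLWA takes any 34-bit offset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check for a traditional DS-form offset: signed 16-bit, with
   the bottom two bits zero.  */
static bool
ds_form_offset_p (int64_t offset)
{
  return offset >= -32768 && offset <= 32767 && (offset & 3) == 0;
}

/* Hypothetical check for a prefixed offset: any signed 34-bit value.  */
static bool
prefixed_34bit_offset_p (int64_t offset)
{
  int64_t limit = (int64_t) 1 << 33;
  return offset >= -limit && offset <= limit - 1;
}

/* An LWA-style offset is usable if the DS-form accepts it, or if prefixed
   addressing is available and the offset fits in 34 bits.  */
static bool
lwa_offset_ok (int64_t offset, bool prefixed_addr)
{
  return (ds_form_offset_p (offset)
	  || (prefixed_addr && prefixed_34bit_offset_p (offset)));
}
```

For example, an offset of 2 is only usable when prefixed addressing is
available, while an offset of 4 is fine either way.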

2019-10-03  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (lwa_operand): Add support for
	prefixed instructions.
	(non_prefixed_memory): New predicate.
	(non_pcrel_memory): New predicate.
	(reg_or_non_pcrel_memory): New predicate.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 276534)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -932,6 +932,13 @@ (define_predicate "lwa_operand"
     return false;
 
   addr = XEXP (inner, 0);
+
+  /* The LWA instruction uses the DS-form format where the bottom two bits of
+     the offset must be 0.  The prefixed PLWA does not have this
+     restriction.  */
+  if (address_is_prefixed (addr, DImode, NON_PREFIXED_DS))
+    return true;
+
   if (GET_CODE (addr) == PRE_INC
       || GET_CODE (addr) == PRE_DEC
       || (GET_CODE (addr) == PRE_MODIFY
@@ -1807,3 +1814,30 @@ (define_predicate "pcrel_external_addres
 (define_predicate "pcrel_local_or_external_address"
   (ior (match_operand 0 "pcrel_local_address")
        (match_operand 0 "pcrel_external_address")))
+
+;; Return 1 if op is a memory operand that is not prefixed.
+(define_predicate "non_prefixed_memory"
+  (match_code "mem")
+{
+  if (!memory_operand (op, mode))
+    return false;
+
+  return !address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
+})
+
+(define_predicate "non_pcrel_memory"
+  (match_code "mem")
+{
+  if (!memory_operand (op, mode))
+    return false;
+
+  return !pcrel_local_or_external_address (XEXP (op, 0), Pmode);
+})
+
+;; Return 1 if op is either a register operand or a memory operand that does
+;; not use a PC-relative address.
+(define_predicate "reg_or_non_pcrel_memory"
+  (match_code "reg,subreg,mem")
+{
+  return (gpc_reg_operand (op, mode) || non_pcrel_memory (op, mode));
+})

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], V4, patch #13 [part of patch #4.2], Update stack protect insns for prefixed addresses
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (14 preceding siblings ...)
  2019-10-04 12:46 ` [PATCH], V4, patch #12 [part of patch #4.2], Update predicates Michael Meissner
@ 2019-10-04 12:51 ` Michael Meissner
  2019-10-04 12:56 ` [PATCH], V4, patch #14 [part of patch #4.2], Update vector 128-bit instruction sizes Michael Meissner
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-10-04 12:51 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

I was asked to split V4 patch #4.2 into smaller chunks.  This patch is one of
8 patches that were broken out from 4.2.  Another patch from 4.2 to use
SIGNED_16BIT_OFFSET_EXTRA_P has already been committed.

This patch makes the stack_protect_setdi and stack_protect_testdi insns work if
the stack size is greater than 32K.  It forces the addresses into registers if
the offset is too large, and uses a new constraint (em) to make sure the
register allocator does not combine the addresses.

Using all of the patches in this series, I have bootstrapped the compiler on a
little endian power8 system and run the regression tests.  In addition, I have
built the Spec 2006 and 2017 benchmark suites, for -mcpu=power8, -mcpu=power9,
and -mcpu=future, and all of the benchmarks build.  Can I check this into the
trunk?

2019-10-03  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/constraints.md (em constraint): New constraint.
	* config/rs6000/rs6000-protos.h (make_memory_non_prefixed): New
	declaration.
	* config/rs6000/rs6000.c (make_memory_non_prefixed): New function
	to return a traditional non-prefixed instruction.
	* config/rs6000/rs6000.md (stack_protect_setdi): Make sure both
	memory operands are not prefixed.
	(stack_protect_testdi): Make sure both memory operands are not
	prefixed.
	* doc/md.texi (PowerPC constraints): Document the em constraint.

Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(revision 276523)
+++ gcc/config/rs6000/constraints.md	(working copy)
@@ -210,6 +210,11 @@ several times, or that might not access
   (and (match_code "mem")
        (match_test "GET_RTX_CLASS (GET_CODE (XEXP (op, 0))) != RTX_AUTOINC")))
 
+(define_memory_constraint "em"
+  "A memory operand that does not contain a PC-relative reference."
+  (and (match_code "mem")
+       (match_test "non_pcrel_memory (op, mode)")))
+
 (define_memory_constraint "Q"
   "Memory operand that is an offset from a register (it is usually better
 to use @samp{m} or @samp{es} in @code{asm} statements)"
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 276523)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -192,6 +192,7 @@ extern enum insn_form address_to_insn_fo
 extern bool prefixed_load_p (rtx_insn *);
 extern bool prefixed_store_p (rtx_insn *);
 extern bool prefixed_paddi_p (rtx_insn *);
+extern rtx make_memory_non_prefixed (rtx);
 extern void rs6000_asm_output_opcode (FILE *);
 extern void rs6000_final_prescan_insn (rtx_insn *, rtx [], int);
 
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276537)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -24972,6 +24972,34 @@ prefixed_paddi_p (rtx_insn *insn)
   return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
 }
 
+/* Make a memory address non-prefixed if it is prefixed.  */
+
+rtx
+make_memory_non_prefixed (rtx mem)
+{
+  gcc_assert (MEM_P (mem));
+
+  rtx old_addr = XEXP (mem, 0);
+  if (address_is_prefixed (old_addr, GET_MODE (mem), NON_PREFIXED_DEFAULT))
+    {
+      rtx new_addr;
+
+      if (GET_CODE (old_addr) == PLUS
+	  && (REG_P (XEXP (old_addr, 0)) || SUBREG_P (XEXP (old_addr, 0)))
+	  && CONST_INT_P (XEXP (old_addr, 1)))
+	{
+	  rtx tmp_reg = force_reg (Pmode, XEXP (old_addr, 1));
+	  new_addr = gen_rtx_PLUS (Pmode, XEXP (old_addr, 0), tmp_reg);
+	}
+      else
+	new_addr = force_reg (Pmode, old_addr);
+
+      mem = change_address (mem, VOIDmode, new_addr);
+    }
+
+  return mem;
+}
+
 /* Whether the next instruction needs a 'p' prefix issued before the
    instruction is printed out.  */
 static bool next_insn_prefixed_p;
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 276536)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -11531,9 +11531,25 @@ (define_insn "stack_protect_setsi"
   [(set_attr "type" "three")
    (set_attr "length" "12")])
 
-(define_insn "stack_protect_setdi"
-  [(set (match_operand:DI 0 "memory_operand" "=Y")
-	(unspec:DI [(match_operand:DI 1 "memory_operand" "Y")] UNSPEC_SP_SET))
+(define_expand "stack_protect_setdi"
+  [(parallel [(set (match_operand:DI 0 "memory_operand")
+		   (unspec:DI [(match_operand:DI 1 "memory_operand")]
+		   UNSPEC_SP_SET))
+	      (set (match_scratch:DI 2)
+		   (const_int 0))])]
+  "TARGET_64BIT"
+{
+  if (TARGET_PREFIXED_ADDR)
+    {
+      operands[0] = make_memory_non_prefixed (operands[0]);
+      operands[1] = make_memory_non_prefixed (operands[1]);
+    }
+})
+
+(define_insn "*stack_protect_setdi"
+  [(set (match_operand:DI 0 "non_prefixed_memory" "=em")
+	(unspec:DI [(match_operand:DI 1 "non_prefixed_memory" "em")]
+		   UNSPEC_SP_SET))
    (set (match_scratch:DI 2 "=&r") (const_int 0))]
   "TARGET_64BIT"
   "ld%U1%X1 %2,%1\;std%U0%X0 %2,%0\;li %2,0"
@@ -11577,10 +11593,27 @@ (define_insn "stack_protect_testsi"
    lwz%U1%X1 %3,%1\;lwz%U2%X2 %4,%2\;cmplw %0,%3,%4\;li %3,0\;li %4,0"
   [(set_attr "length" "16,20")])
 
-(define_insn "stack_protect_testdi"
+(define_expand "stack_protect_testdi"
+  [(parallel [(set (match_operand:CCEQ 0 "cc_reg_operand")
+		   (unspec:CCEQ [(match_operand:DI 1 "memory_operand")
+				 (match_operand:DI 2 "memory_operand")]
+				UNSPEC_SP_TEST))
+	      (set (match_scratch:DI 4)
+		   (const_int 0))
+	      (clobber (match_scratch:DI 3))])]
+  "TARGET_64BIT"
+{
+  if (TARGET_PREFIXED_ADDR)
+    {
+      operands[1] = make_memory_non_prefixed (operands[1]);
+      operands[2] = make_memory_non_prefixed (operands[2]);
+    }
+})
+
+(define_insn "*stack_protect_testdi"
   [(set (match_operand:CCEQ 0 "cc_reg_operand" "=x,?y")
-        (unspec:CCEQ [(match_operand:DI 1 "memory_operand" "Y,Y")
-		      (match_operand:DI 2 "memory_operand" "Y,Y")]
+        (unspec:CCEQ [(match_operand:DI 1 "non_prefixed_memory" "em,em")
+		      (match_operand:DI 2 "non_prefixed_memory" "em,em")]
 		     UNSPEC_SP_TEST))
    (set (match_scratch:DI 4 "=r,r") (const_int 0))
    (clobber (match_scratch:DI 3 "=&r,&r"))]
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 276523)
+++ gcc/doc/md.texi	(working copy)
@@ -3373,6 +3373,9 @@ asm ("st %1,%0" : "=m<>" (mem) : "r" (va
 
 is not.
 
+@item em
+A memory operand that does not contain a prefixed address.
+
 @item es
 A ``stable'' memory operand; that is, one which does not include any
 automodification of the base register.  This used to be useful when

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], V4, patch #14 [part of patch #4.2], Update vector 128-bit instruction sizes
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (15 preceding siblings ...)
  2019-10-04 12:51 ` [PATCH], V4, patch #13 [part of patch #4.2], Update stack protect insns for prefixed addresses Michael Meissner
@ 2019-10-04 12:56 ` Michael Meissner
  2019-10-04 13:02 ` [PATCH], V4, patch #15 [part of patch #4.2], Make vector extract/insert support prefixed instructions Michael Meissner
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-10-04 12:56 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

I was asked to split V4 patch #4.2 into smaller chunks.  This patch is one of
8 patches that were broken out from 4.2.  Another patch from 4.2 to use
SIGNED_16BIT_OFFSET_EXTRA_P has already been committed.

This patch adjusts the instruction size for prefixed addresses for vector
128-bit types.  Compared to patch #4, I simplified the size calculator quite a
bit (that old calculation was done before I created the prefixed_length and
non_prefixed_length attributes).

2019-10-03  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/vsx.md (vsx_mov<mode>_64bit): Make sure the
	instruction length is correct for prefixed loads and stores.

Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 276523)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -1149,10 +1149,14 @@ (define_insn "vsx_mov<mode>_64bit"
                "vecstore,  vecload,   vecsimple, mffgpr,    mftgpr,    load,
                 store,     load,      store,     *,         vecsimple, vecsimple,
                 vecsimple, *,         *,         vecstore,  vecload")
-   (set_attr "length"
+   (set_attr "non_prefixed_length"
                "*,         *,         *,         8,         *,         8,
                 8,         8,         8,         8,         *,         *,
                 *,         20,        8,         *,         *")
+   (set_attr "prefixed_length"
+               "*,         *,         *,         8,         *,         20,
+                20,        20,        20,        8,         *,         *,
+                *,         20,        8,         *,         *")
    (set_attr "isa"
                "<VSisa>,   <VSisa>,   <VSisa>,   *,         *,         *,
                 *,         *,         *,         *,         p9v,       *,

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], V4, patch #15 [part of patch #4.2], Make vector extract/insert support prefixed instructions
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (16 preceding siblings ...)
  2019-10-04 12:56 ` [PATCH], V4, patch #14 [part of patch #4.2], Update vector 128-bit instruction sizes Michael Meissner
@ 2019-10-04 13:02 ` Michael Meissner
  2019-10-04 13:07 ` [PATCH], V4, patch #16 [Same as patch #5], Support DImode 34-bit constants Michael Meissner
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-10-04 13:02 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

I was asked to split V4 patch #4.2 into smaller chunks.  This patch is one of
8 patches that were broken out from 4.2.  Another patch from 4.2 to use
SIGNED_16BIT_OFFSET_EXTRA_P has already been committed.

This patch updates the functions that adjust a vector address to access a
scalar element so that they support prefixed addresses.  Compared to patch #4.2,
this patch no longer adds a second constraint for non-PC-relative addresses.
Instead, I added a pattern that adds a PC-relative address to a register, and
used it in the problematic case of a vector extract with a variable element
number where the vector is a static variable (i.e., it uses a PC-relative
address).
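
As a rough illustration of the constant-element case, the element offset is
folded into whatever addend the PC-relative address already carries, and the
sum has to stay within a signed 34-bit field.  The sketch below is standalone
C, not GCC RTL code: struct pcrel_addr and fold_element_offset are hypothetical
names, and where the patch asserts that the offset fits, the sketch returns
false instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a PC-relative address: a local symbol plus a
   constant addend.  This is not GCC's RTL representation.  */
struct pcrel_addr
{
  const char *symbol;
  int64_t addend;
};

/* Fold ELEMENT_OFFSET into ADDR.  Return false, leaving ADDR unchanged, if
   the combined addend no longer fits in a signed 34-bit field.  The sketch
   assumes the inputs are small enough that the addition cannot overflow.  */
static bool
fold_element_offset (struct pcrel_addr *addr, int64_t element_offset)
{
  const int64_t limit = (int64_t) 1 << 33;
  int64_t total = addr->addend + element_offset;

  if (total < -limit || total >= limit)
    return false;

  addr->addend = total;
  return true;
}
```

The variable-element case cannot be folded this way, which is why it needs an
instruction that adds the PC-relative address to a register at run time.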

Using all of the patches in this series, I have bootstrapped the compiler on a
little-endian power8 system and run the regression tests.  In addition, I have
built the SPEC 2006 and 2017 benchmark suites for -mcpu=power8, -mcpu=power9,
and -mcpu=future, and all of the benchmarks build.  Can I check this into the
trunk?

2019-10-03  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_adjust_vec_address): Add support
	for extracting an element of a vector with a PC-relative address.
	* config/rs6000/rs6000.md (pcrel_add_local_addr): New insn to add
	an address to a register.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276540)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6701,6 +6701,7 @@ rs6000_adjust_vec_address (rtx scalar_re
   rtx element_offset;
   rtx new_addr;
   bool valid_addr_p;
+  bool pcrel_p = pcrel_local_address (addr, Pmode);
 
   /* Vector addresses should not have PRE_INC, PRE_DEC, or PRE_MODIFY.  */
   gcc_assert (GET_RTX_CLASS (GET_CODE (addr)) != RTX_AUTOINC);
@@ -6738,6 +6739,40 @@ rs6000_adjust_vec_address (rtx scalar_re
   else if (REG_P (addr) || SUBREG_P (addr))
     new_addr = gen_rtx_PLUS (Pmode, addr, element_offset);
 
+  /* Optimize PC-relative addresses with a constant offset.  */
+  else if (pcrel_p && CONST_INT_P (element_offset))
+    {
+      rtx addr2 = addr;
+      HOST_WIDE_INT offset = INTVAL (element_offset);
+
+      if (GET_CODE (addr2) == CONST)
+	addr2 = XEXP (addr2, 0);
+
+      if (GET_CODE (addr2) == PLUS)
+	{
+	  offset += INTVAL (XEXP (addr2, 1));
+	  addr2 = XEXP (addr2, 0);
+	}
+
+      gcc_assert (SIGNED_34BIT_OFFSET_P (offset));
+      if (offset)
+	{
+	  addr2 = gen_rtx_PLUS (Pmode, addr2, GEN_INT (offset));
+	  new_addr = gen_rtx_CONST (Pmode, addr2);
+	}
+      else
+	new_addr = addr2;
+    }
+
+  /* Optimize PC-relative addresses with a variable offset to add the
+     PC-relative address to the offset.  */
+  else if (pcrel_p)
+    {
+      emit_insn (gen_pcrel_add_local_addr (base_tmp, element_offset, addr));
+      new_addr = base_tmp;
+      pcrel_p = false;
+    }
+
   /* Optimize D-FORM addresses with constant offset with a constant element, to
      include the element offset in the address directly.  */
   else if (GET_CODE (addr) == PLUS)
@@ -6752,8 +6787,11 @@ rs6000_adjust_vec_address (rtx scalar_re
 	  HOST_WIDE_INT offset = INTVAL (op1) + INTVAL (element_offset);
 	  rtx offset_rtx = GEN_INT (offset);
 
-	  if (IN_RANGE (offset, -32768, 32767)
-	      && (scalar_size < 8 || (offset & 0x3) == 0))
+	  if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (offset))
+	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
+
+	  else if (SIGNED_16BIT_OFFSET_P (offset)
+		   && (scalar_size < 8 || (offset & 0x3) == 0))
 	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
 	  else
 	    {
@@ -6801,11 +6839,11 @@ rs6000_adjust_vec_address (rtx scalar_re
       new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
     }
 
-  /* If we have a PLUS, we need to see whether the particular register class
-     allows for D-FORM or X-FORM addressing.  */
-  if (GET_CODE (new_addr) == PLUS)
+  /* If we have a PLUS or a PC-relative address without the PLUS, we need to
+     see whether the particular register class allows for D-FORM or X-FORM
+     addressing.  */
+  if (GET_CODE (new_addr) == PLUS || pcrel_p)
     {
-      rtx op1 = XEXP (new_addr, 1);
       addr_mask_type addr_mask;
       unsigned int scalar_regno = reg_or_subregno (scalar_reg);
 
@@ -6822,10 +6860,16 @@ rs6000_adjust_vec_address (rtx scalar_re
       else
 	gcc_unreachable ();
 
-      if (REG_P (op1) || SUBREG_P (op1))
-	valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;
-      else
+      if (pcrel_p)
 	valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
+      else
+	{
+	  rtx op1 = XEXP (new_addr, 1);
+	  if (REG_P (op1) || SUBREG_P (op1))
+	    valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;
+	  else
+	    valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
+	}
     }
 
   else if (REG_P (new_addr) || SUBREG_P (new_addr))
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 276540)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -9962,6 +9962,15 @@ (define_insn "*pcrel_local_addr"
   "la %0,%a1"
   [(set_attr "prefixed" "yes")])
 
+;; Add a local PC-relative address to a register.
+(define_insn "pcrel_add_local_addr"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+	(plus:DI (match_operand:DI 1 "gpc_reg_operand" "r")
+		 (match_operand:DI 2 "pcrel_local_address")))]
+  "TARGET_PCREL"
+  "addi %0,%1,%a2"
+  [(set_attr "prefixed" "yes")])
+
 ;; Load up a PC-relative address to an external symbol.  If the symbol and the
 ;; program are both defined in the main program, the linker will optimize this
 ;; to a PADDI.  Otherwise, it will create a GOT address that is relocated by

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], V4, patch #16 [Same as patch #5], Support DImode 34-bit constants
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (17 preceding siblings ...)
  2019-10-04 13:02 ` [PATCH], V4, patch #15 [part of patch #4.2], Make vector extract/insert support prefixed instructions Michael Meissner
@ 2019-10-04 13:07 ` Michael Meissner
  2019-10-04 13:11 ` [PATCH], V4, patch #17 [Same as patch #6], Use PADDI to load up 32-bit SImode constants Michael Meissner
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-10-04 13:07 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This is the same as patch #5, but re-issued after patches 9-15 replaced the old
patch #4.

This patch adds support for using PADDI (PLI) to load up 34-bit DImode
constants.
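
For illustration only, the range test behind a 34-bit immediate can be sketched
in standalone C.  The helper name fits_signed_34bit is hypothetical; the patch
itself uses the SIGNED_34BIT_OFFSET_P macro, whose definition is not shown in
this message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: return true if VALUE can be
   encoded as a signed 34-bit immediate, i.e. -2^33 <= VALUE < 2^33.  */
static bool
fits_signed_34bit (int64_t value)
{
  const int64_t limit = (int64_t) 1 << 33;
  return value >= -limit && value < limit;
}
```

Under this sketch a value such as 0x123456789 fits, while 1 << 33 does not.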

Using all of the patches in this series, I have bootstrapped the compiler on a
little-endian power8 system and run the regression tests.  In addition, I have
built the SPEC 2006 and 2017 benchmark suites for -mcpu=power8, -mcpu=power9,
and -mcpu=future, and all of the benchmarks build.  Can I check this into the
trunk?

2019-10-03  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (num_insns_constant_gpr): Add support for
	PADDI to load up and/or add 34-bit integer constants.
	(rs6000_rtx_costs): Treat constants loaded up with PADDI with the
	same cost as normal 16-bit constants.
	* config/rs6000/rs6000.md (movdi_internal64): Add support to load
	up 34-bit integer constants with PADDI.
	(movdi integer constant splitter): Add comment about PADDI.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276544)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -5523,7 +5523,7 @@ static int
 num_insns_constant_gpr (HOST_WIDE_INT value)
 {
   /* signed constant loadable with addi */
-  if (((unsigned HOST_WIDE_INT) value + 0x8000) < 0x10000)
+  if (SIGNED_16BIT_OFFSET_P (value))
     return 1;
 
   /* constant loadable with addis */
@@ -5531,6 +5531,10 @@ num_insns_constant_gpr (HOST_WIDE_INT va
 	   && (value >> 31 == -1 || value >> 31 == 0))
     return 1;
 
+  /* PADDI can support up to 34 bit signed integers.  */
+  else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (value))
+    return 1;
+
   else if (TARGET_POWERPC64)
     {
       HOST_WIDE_INT low  = ((value & 0xffffffff) ^ 0x80000000) - 0x80000000;
@@ -20643,7 +20647,8 @@ rs6000_rtx_costs (rtx x, machine_mode mo
 	    || outer_code == PLUS
 	    || outer_code == MINUS)
 	   && (satisfies_constraint_I (x)
-	       || satisfies_constraint_L (x)))
+	       || satisfies_constraint_L (x)
+	       || satisfies_constraint_eI (x)))
 	  || (outer_code == AND
 	      && (satisfies_constraint_K (x)
 		  || (mode == SImode
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 276544)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -8805,24 +8805,24 @@ (define_split
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
 
-;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR #
-;;              FPR store  FPR load   FPR move   AVX store  AVX store   AVX load
-;;              AVX load   VSX move   P9 0       P9 -1      AVX 0/-1    VSX 0
-;;              VSX -1     P9 const   AVX const  From SPR   To SPR      SPR<->SPR
-;;              VSX->GPR   GPR->VSX
+;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR pli
+;;              GPR #      FPR store  FPR load   FPR move   AVX store   AVX store
+;;              AVX load   AVX load   VSX move   P9 0       P9 -1       AVX 0/-1
+;;              VSX 0      VSX -1     P9 const   AVX const  From SPR    To SPR
+;;              SPR<->SPR  VSX->GPR   GPR->VSX
 (define_insn "*movdi_internal64"
   [(set (match_operand:DI 0 "nonimmediate_operand"
                "=YZ,       r,         r,         r,         r,          r,
-                m,         ^d,        ^d,        wY,        Z,          $v,
-                $v,        ^wa,       wa,        wa,        v,          wa,
-                wa,        v,         v,         r,         *h,         *h,
-                ?r,        ?wa")
+                r,         m,         ^d,        ^d,        wY,         Z,
+                $v,        $v,        ^wa,       wa,        wa,         v,
+                wa,        wa,        v,         v,         r,          *h,
+                *h,        ?r,        ?wa")
 	(match_operand:DI 1 "input_operand"
-               "r,         YZ,        r,         I,         L,          nF,
-                ^d,        m,         ^d,        ^v,        $v,         wY,
-                Z,         ^wa,       Oj,        wM,        OjwM,       Oj,
-                wM,        wS,        wB,        *h,        r,          0,
-                wa,        r"))]
+               "r,         YZ,        r,         I,         L,          eI,
+                nF,        ^d,        m,         ^d,        ^v,         $v,
+                wY,        Z,         ^wa,       Oj,        wM,         OjwM,
+                Oj,        wM,        wS,        wB,        *h,         r,
+                0,         wa,        r"))]
   "TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], DImode)
        || gpc_reg_operand (operands[1], DImode))"
@@ -8832,6 +8832,7 @@ (define_insn "*movdi_internal64"
    mr %0,%1
    li %0,%1
    lis %0,%v1
+   li %0,%1
    #
    stfd%U0%X0 %1,%0
    lfd%U1%X1 %0,%1
@@ -8855,26 +8856,28 @@ (define_insn "*movdi_internal64"
    mtvsrd %x0,%1"
   [(set_attr "type"
                "store,      load,	*,         *,         *,         *,
-                fpstore,    fpload,     fpsimple,  fpstore,   fpstore,   fpload,
-                fpload,     veclogical, vecsimple, vecsimple, vecsimple, veclogical,
-                veclogical, vecsimple,  vecsimple, mfjmpr,    mtjmpr,    *,
-                mftgpr,    mffgpr")
+                *,          fpstore,    fpload,    fpsimple,  fpstore,   fpstore,
+                fpload,     fpload,     veclogical,vecsimple, vecsimple, vecsimple,
+                veclogical, veclogical, vecsimple,  vecsimple, mfjmpr,   mtjmpr,
+                *,          mftgpr,    mffgpr")
    (set_attr "size" "64")
    (set_attr "length"
-               "*,         *,         *,         *,         *,          20,
-                *,         *,         *,         *,         *,          *,
+               "*,         *,         *,         *,         *,          *,
+                20,        *,         *,         *,         *,          *,
                 *,         *,         *,         *,         *,          *,
-                *,         8,         *,         *,         *,          *,
-                *,         *")
+                *,         *,         8,         *,         *,          *,
+                *,         *,         *")
    (set_attr "isa"
-               "*,         *,         *,         *,         *,          *,
-                *,         *,         *,         p9v,       p7v,        p9v,
-                p7v,       *,         p9v,       p9v,       p7v,        *,
-                *,         p7v,       p7v,       *,         *,          *,
-                p8v,       p8v")])
+               "*,         *,         *,         *,         *,          fut,
+                *,         *,         *,         *,         p9v,        p7v,
+                p9v,       p7v,       *,         p9v,       p9v,        p7v,
+                *,         *,         p7v,       p7v,       *,          *,
+                *,         p8v,       p8v")])
 
 ; Some DImode loads are best done as a load of -1 followed by a mask
-; instruction.
+; instruction.  On systems that support the PADDI (PLI) instruction,
+; num_insns_constant returns 1, so this splitter would not be used for
+; constants that can be loaded with PLI.
 (define_split
   [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
 	(match_operand:DI 1 "const_int_operand"))]

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], V4, patch #17 [Same as patch #6], Use PADDI to load up 32-bit SImode constants
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (18 preceding siblings ...)
  2019-10-04 13:07 ` [PATCH], V4, patch #16 [Same as patch #5], Support DImode 34-bit constants Michael Meissner
@ 2019-10-04 13:11 ` Michael Meissner
  2019-10-04 13:14 ` [PATCH], V4, patch #18 [Same as patch #7], Use PADDI to add 34-bit constants Michael Meissner
  2019-10-04 13:18 ` [PATCH], V4, patch #19 [Same as patch #8], Enable -mpcrel on Linux 64-bit systems Michael Meissner
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-10-04 13:11 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This is the same as patch #6, but re-issued after patches 9-15 replaced the old
patch #4.

This patch supports using PADDI (PLI) to load up 32-bit SImode constants.
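
The resulting choice of instruction for an SImode constant can be sketched as
follows.  This is a simplified standalone model of the alternative order in
the pattern (li, lis, pli, then the two-instruction split), not code from the
patch; the enum and the function classify_si_constant are hypothetical names,
and the have_prefixed argument stands in for TARGET_PREFIXED_ADDR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for how an SImode constant would be loaded.  */
enum si_load
{
  SI_LOAD_LI,     /* Signed 16-bit constant: one li.  */
  SI_LOAD_LIS,    /* Low 16 bits are zero: one lis.  */
  SI_LOAD_PLI,    /* Anything else when prefixed insns exist: one pli.  */
  SI_LOAD_SPLIT   /* Otherwise: split into a two-insn sequence.  */
};

/* Classify VALUE in the same order as the movsi alternatives.  */
static enum si_load
classify_si_constant (int32_t value, bool have_prefixed)
{
  if (value >= -32768 && value <= 32767)
    return SI_LOAD_LI;

  if ((value & 0xffff) == 0)
    return SI_LOAD_LIS;

  if (have_prefixed)
    return SI_LOAD_PLI;

  return SI_LOAD_SPLIT;
}
```

Only the last two cases change with this patch: a constant such as 0x12345
needs the split sequence today and a single PLI when prefixed instructions are
available.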

Using all of the patches in this series, I have bootstrapped the compiler on a
little-endian power8 system and run the regression tests.  In addition, I have
built the SPEC 2006 and 2017 benchmark suites for -mcpu=power8, -mcpu=power9,
and -mcpu=future, and all of the benchmarks build.  Can I check this into the
trunk?

2019-10-03  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.md (movsi_internal1): Add support to load
	up 32-bit SImode integer constants with PADDI.
	(movsi integer constant splitter): Do not split constant if PADDI
	can load it up directly.

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 276545)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -6904,22 +6904,22 @@ (define_insn "movsi_low"
 
 ;;		MR           LA           LWZ          LFIWZX       LXSIWZX
 ;;		STW          STFIWX       STXSIWX      LI           LIS
-;;		#            XXLOR        XXSPLTIB 0   XXSPLTIB -1  VSPLTISW
-;;		XXLXOR 0     XXLORC -1    P9 const     MTVSRWZ      MFVSRWZ
-;;		MF%1         MT%0         NOP
+;;		PLI          #            XXLOR        XXSPLTIB 0   XXSPLTIB -1
+;;		VSPLTISW     XXLXOR 0     XXLORC -1    P9 const     MTVSRWZ
+;;		MFVSRWZ      MF%1         MT%0         NOP
 (define_insn "*movsi_internal1"
   [(set (match_operand:SI 0 "nonimmediate_operand"
 		"=r,         r,           r,           d,           v,
 		 m,          Z,           Z,           r,           r,
-		 r,          wa,          wa,          wa,          v,
-		 wa,         v,           v,           wa,          r,
-		 r,          *h,          *h")
+		 r,          r,           wa,          wa,          wa,
+		 v,          wa,          v,           v,           wa,
+		 r,          r,           *h,          *h")
 	(match_operand:SI 1 "input_operand"
 		"r,          U,           m,           Z,           Z,
 		 r,          d,           v,           I,           L,
-		 n,          wa,          O,           wM,          wB,
-		 O,          wM,          wS,          r,           wa,
-		 *h,         r,           0"))]
+		 eI,         n,           wa,          O,           wM,
+		 wB,         O,           wM,          wS,          r,
+		 wa,         *h,          r,           0"))]
   "gpc_reg_operand (operands[0], SImode)
    || gpc_reg_operand (operands[1], SImode)"
   "@
@@ -6933,6 +6933,7 @@ (define_insn "*movsi_internal1"
    stxsiwx %x1,%y0
    li %0,%1
    lis %0,%v1
+   li %0,%1
    #
    xxlor %x0,%x1,%x1
    xxspltib %x0,0
@@ -6949,21 +6950,21 @@ (define_insn "*movsi_internal1"
   [(set_attr "type"
 		"*,          *,           load,        fpload,      fpload,
 		 store,      fpstore,     fpstore,     *,           *,
-		 *,          veclogical,  vecsimple,   vecsimple,   vecsimple,
-		 veclogical, veclogical,  vecsimple,   mffgpr,      mftgpr,
-		 *,          *,           *")
+		 *,          *,           veclogical,  vecsimple,   vecsimple,
+		 vecsimple,  veclogical,  veclogical,  vecsimple,   mffgpr,
+		 mftgpr,     *,           *,           *")
    (set_attr "length"
 		"*,          *,           *,           *,           *,
 		 *,          *,           *,           *,           *,
-		 8,          *,           *,           *,           *,
-		 *,          *,           8,           *,           *,
-		 *,          *,           *")
+		 *,          8,           *,           *,           *,
+		 *,          *,           *,           8,           *,
+		 *,          *,           *,           *")
    (set_attr "isa"
 		"*,          *,           *,           p8v,         p8v,
 		 *,          p8v,         p8v,         *,           *,
-		 *,          p8v,         p9v,         p9v,         p8v,
-		 p9v,        p8v,         p9v,         p8v,         p8v,
-		 *,          *,           *")])
+		 fut,        *,           p8v,         p9v,         p9v,
+		 p8v,        p9v,         p8v,         p9v,         p8v,
+		 p8v,        *,           *,           *")])
 
 ;; Like movsi, but adjust a SF value to be used in a SI context, i.e.
 ;; (set (reg:SI ...) (subreg:SI (reg:SF ...) 0))
@@ -7108,14 +7109,15 @@ (define_insn "*movsi_from_df"
   "xscvdpsp %x0,%x1"
   [(set_attr "type" "fp")])
 
-;; Split a load of a large constant into the appropriate two-insn
-;; sequence.
+;; Split a load of a large constant into the appropriate two-insn sequence.  On
+;; systems that support PADDI (PLI), we can use PLI to load any 32-bit constant
+;; in one instruction.
 
 (define_split
   [(set (match_operand:SI 0 "gpc_reg_operand")
 	(match_operand:SI 1 "const_int_operand"))]
   "(unsigned HOST_WIDE_INT) (INTVAL (operands[1]) + 0x8000) >= 0x10000
-   && (INTVAL (operands[1]) & 0xffff) != 0"
+   && (INTVAL (operands[1]) & 0xffff) != 0 && !TARGET_PREFIXED_ADDR"
   [(set (match_dup 0)
 	(match_dup 2))
    (set (match_dup 0)

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], V4, patch #18 [Same as patch #7], Use PADDI to add 34-bit constants
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (19 preceding siblings ...)
  2019-10-04 13:11 ` [PATCH], V4, patch #17 [Same as patch #6], Use PADDI to load up 32-bit SImode constants Michael Meissner
@ 2019-10-04 13:14 ` Michael Meissner
  2019-10-04 13:18 ` [PATCH], V4, patch #19 [Same as patch #8], Enable -mpcrel on Linux 64-bit systems Michael Meissner
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-10-04 13:14 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This is the same as patch #7, but was re-issued after patches 9-15 replaced the
old patch #4.

This patch generates the PADDI instruction to add 34-bit constant values on the
'future' system.
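
For context, without PADDI an add of a constant outside the signed 16-bit
range is typically done as an addis of the upper part followed by an addi of
the sign-extended lower 16 bits (assuming the constant fits in 32 bits).  The
arithmetic of that split can be sketched in standalone C; struct add_parts and
split_add_constant are hypothetical names and this is not code from the patch.

```c
#include <stdint.h>

/* Hypothetical result type: VALUE == high + low, where low is a signed
   16-bit quantity and high has its low 16 bits clear.  */
struct add_parts
{
  int64_t high;
  int64_t low;
};

/* Split VALUE into the part added by addis (high) and the part added by
   addi (low).  The low part is the sign-extended bottom 16 bits.  */
static struct add_parts
split_add_constant (int64_t value)
{
  struct add_parts parts;

  parts.low = ((value & 0xffff) ^ 0x8000) - 0x8000;
  parts.high = value - parts.low;
  return parts;
}
```

With PADDI, a constant that fits in 34 bits needs no such split and is added
by a single prefixed instruction.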

Using all of the patches in this series, I have bootstrapped the compiler on a
little-endian power8 system and run the regression tests.  In addition, I have
built the SPEC 2006 and 2017 benchmark suites for -mcpu=power8, -mcpu=power9,
and -mcpu=future, and all of the benchmarks build.  Can I check this into the
trunk?

2019-10-03  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (add_operand): Add support for
	PADDI.
	* config/rs6000/rs6000.md (add<mode>3): Add support for PADDI.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 276538)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -839,7 +839,8 @@ (define_special_predicate "indexed_addre
 (define_predicate "add_operand"
   (if_then_else (match_code "const_int")
     (match_test "satisfies_constraint_I (op)
-		 || satisfies_constraint_L (op)")
+		 || satisfies_constraint_L (op)
+		 || satisfies_constraint_eI (op)")
     (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is either a non-special register, or 0, or -1.
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 276546)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -1756,15 +1756,17 @@ (define_expand "add<mode>3"
 })
 
 (define_insn "*add<mode>3"
-  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
-	(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
-		  (match_operand:GPR 2 "add_operand" "r,I,L")))]
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r,r")
+	(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b,b")
+		  (match_operand:GPR 2 "add_operand" "r,I,L,eI")))]
   ""
   "@
    add %0,%1,%2
    addi %0,%1,%2
-   addis %0,%1,%v2"
-  [(set_attr "type" "add")])
+   addis %0,%1,%v2
+   addi %0,%1,%2"
+  [(set_attr "type" "add")
+   (set_attr "isa" "*,*,*,fut")])
 
 (define_insn "*addsi3_high"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=b")

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH], V4, patch #19 [Same as patch #8], Enable -mpcrel on Linux 64-bit systems
  2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
                   ` (20 preceding siblings ...)
  2019-10-04 13:14 ` [PATCH], V4, patch #18 [Same as patch #7], Use PADDI to add 34-bit constants Michael Meissner
@ 2019-10-04 13:18 ` Michael Meissner
  21 siblings, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-10-04 13:18 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This is the same as patch #8, but re-issued after patches 9-15 replaced the old
patch #4.

This patch enables -mpcrel by default on Linux 64-bit systems when
compiling for -mcpu=future.  If/when other OSs add support for prefixed
instructions (including PC-relative instructions), it is simple to modify their
tm.h file to enable the default.
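
The opt-in mechanism can be sketched in standalone C.  All SKETCH_* names and
both functions are hypothetical stand-ins, not the macros or code in the
patch; the sketch also assumes that pc-relative addressing is only enabled
when prefixed addressing is, and it ignores options given explicitly on the
command line.

```c
#include <stdbool.h>

/* An OS header would define these to 1 before this point to opt in.
   Without such a definition, both default to off.  */
#ifndef SKETCH_PREFIXED_DEFAULT
#define SKETCH_PREFIXED_DEFAULT 0
#endif

#ifndef SKETCH_PCREL_DEFAULT
#define SKETCH_PCREL_DEFAULT 0
#endif

#define SKETCH_MASK_PREFIXED 0x1u
#define SKETCH_MASK_PCREL    0x2u

/* Compute the addressing flags for the given CPU and OS defaults.  */
static unsigned
addressing_flags (bool cpu_is_future, bool os_prefixed, bool os_pcrel)
{
  unsigned flags = 0;

  if (cpu_is_future && os_prefixed)
    {
      flags |= SKETCH_MASK_PREFIXED;
      if (os_pcrel)
	flags |= SKETCH_MASK_PCREL;
    }

  return flags;
}

/* Same computation, using the compile-time OS defaults above.  */
static unsigned
default_addressing_flags (bool cpu_is_future)
{
  return addressing_flags (cpu_is_future, SKETCH_PREFIXED_DEFAULT,
			   SKETCH_PCREL_DEFAULT);
}
```

In this model, a target whose header leaves both macros undefined gets neither
flag even with -mcpu=future, which is the behavior the patch aims for on
systems other than 64-bit Linux.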

Using all of the patches in this series, I have bootstrapped the compiler on a
little-endian power8 system and run the regression tests.  In addition, I have
built the SPEC 2006 and 2017 benchmark suites for -mcpu=power8, -mcpu=power9,
and -mcpu=future, and all of the benchmarks build.  Can I check this into the
trunk?

2019-10-03  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/linux64.h (TARGET_PREFIXED_ADDR_DEFAULT): Enable
	prefixed addressing by default.
	(TARGET_PCREL_DEFAULT): Enable pc-relative addressing by default.
	* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Only
	enable -mprefixed-addr and -mpcrel if the OS tm.h says to enable
	it.
	(ADDRESSING_FUTURE_MASKS): New mask macro.
	(OTHER_FUTURE_MASKS): Use ADDRESSING_FUTURE_MASKS.
	* config/rs6000/rs6000.c (TARGET_PREFIXED_ADDR_DEFAULT): Do not
	enable -mprefixed-addr unless the OS tm.h says to.
	(TARGET_PCREL_DEFAULT): Do not enable -mpcrel unless the OS tm.h
	says to.
	(rs6000_option_override_internal): Do not enable -mprefixed-addr
	or -mpcrel unless the OS tm.h says to enable it.  Add more checks
	for -mcpu=future.

Index: gcc/config/rs6000/linux64.h
===================================================================
--- gcc/config/rs6000/linux64.h	(revision 276523)
+++ gcc/config/rs6000/linux64.h	(working copy)
@@ -640,3 +640,11 @@ extern int dot_symbols;
    enabling the __float128 keyword.  */
 #undef	TARGET_FLOAT128_ENABLE_TYPE
 #define TARGET_FLOAT128_ENABLE_TYPE 1
+
+/* Enable support for pc-relative and numeric prefixed addressing on the
+   'future' system.  */
+#undef  TARGET_PREFIXED_ADDR_DEFAULT
+#define TARGET_PREFIXED_ADDR_DEFAULT	1
+
+#undef  TARGET_PCREL_DEFAULT
+#define TARGET_PCREL_DEFAULT		1
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(revision 276523)
+++ gcc/config/rs6000/rs6000-cpus.def	(working copy)
@@ -75,15 +75,21 @@
 				 | OPTION_MASK_P8_VECTOR		\
 				 | OPTION_MASK_P9_VECTOR)
 
-/* Support for a future processor's features.  Do not enable -mpcrel until it
-   is fully functional.  */
+/* Support for a future processor's features.  The prefixed and pc-relative
+   addressing bits are not added here.  Instead, rs6000.c adds them if the OS
+   tm.h says that it supports the addressing modes.  */
 #define ISA_FUTURE_MASKS_SERVER	(ISA_3_0_MASKS_SERVER			\
-				 | OPTION_MASK_FUTURE			\
+				 | OPTION_MASK_FUTURE)
+
+/* Addressing related flags on a future processor.  These flags are broken out
+   because not all targets will support either pc-relative addressing, or even
+   prefixed addressing, and we want to clear all of the addressing bits
+   on targets that cannot support prefixed/pcrel addressing.  */
+#define ADDRESSING_FUTURE_MASKS	(OPTION_MASK_PCREL			\
 				 | OPTION_MASK_PREFIXED_ADDR)
 
 /* Flags that need to be turned off if -mno-future.  */
-#define OTHER_FUTURE_MASKS	(OPTION_MASK_PCREL			\
-				 | OPTION_MASK_PREFIXED_ADDR)
+#define OTHER_FUTURE_MASKS	ADDRESSING_FUTURE_MASKS
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
 #define OTHER_P9_VECTOR_MASKS	(OPTION_MASK_FLOAT128_HW		\
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276545)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -98,6 +98,16 @@
 #endif
 #endif
 
+/* Set up the defaults for whether prefixed addressing is used, and if it is
+   used, whether we want to turn on pc-relative support by default.  */
+#ifndef TARGET_PREFIXED_ADDR_DEFAULT
+#define TARGET_PREFIXED_ADDR_DEFAULT	0
+#endif
+
+#ifndef TARGET_PCREL_DEFAULT
+#define TARGET_PCREL_DEFAULT		0
+#endif
+
 /* Support targetm.vectorize.builtin_mask_for_load.  */
 GTY(()) tree altivec_builtin_mask_for_load;
 
@@ -2532,6 +2542,14 @@ rs6000_debug_reg_global (void)
   if (TARGET_DIRECT_MOVE_128)
     fprintf (stderr, DEBUG_FMT_D, "VSX easy 64-bit mfvsrld element",
 	     (int)VECTOR_ELEMENT_MFVSRLD_64BIT);
+
+  if (TARGET_FUTURE)
+    {
+      fprintf (stderr, DEBUG_FMT_D, "TARGET_PREFIXED_ADDR_DEFAULT",
+	       TARGET_PREFIXED_ADDR_DEFAULT);
+      fprintf (stderr, DEBUG_FMT_D, "TARGET_PCREL_DEFAULT",
+	       TARGET_PCREL_DEFAULT);
+    }
 }
 
 \f
@@ -4012,26 +4030,6 @@ rs6000_option_override_internal (bool gl
       rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_HW;
     }
 
-  /* -mprefixed-addr (and hence -mpcrel) requires -mcpu=future.  */
-  if (TARGET_PREFIXED_ADDR && !TARGET_FUTURE)
-    {
-      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mcpu=future");
-      else if ((rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED_ADDR) != 0)
-	error ("%qs requires %qs", "-mprefixed-addr", "-mcpu=future");
-
-      rs6000_isa_flags &= ~(OPTION_MASK_PCREL | OPTION_MASK_PREFIXED_ADDR);
-    }
-
-  /* -mpcrel requires prefixed load/store addressing.  */
-  if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
-    {
-      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
-
-      rs6000_isa_flags &= ~OPTION_MASK_PCREL;
-    }
-
   /* Print the options after updating the defaults.  */
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
     rs6000_print_isa_options (stderr, 0, "after defaults", rs6000_isa_flags);
@@ -4163,12 +4161,89 @@ rs6000_option_override_internal (bool gl
   SUB3TARGET_OVERRIDE_OPTIONS;
 #endif
 
-  /* -mpcrel requires -mcmodel=medium, but we can't check TARGET_CMODEL until
-      after the subtarget override options are done.  */
-  if (TARGET_PCREL && TARGET_CMODEL != CMODEL_MEDIUM)
+  /* Enable prefixed addressing and pc-relative addressing on 64-bit ELF v2
+     systems if the OS tm.h file says that it is supported and the user did not
+     explicitly use -mprefixed-addr or -mpcrel.  At the present time, only
+     64-bit Linux enables this.
+
+     Pc-relative support also requires the medium code model.
+
+     However, we can't check for ELFv2 or -mcmodel=medium until after the
+     subtarget macros are run.
+
+     If prefixed addressing is disabled by default, and the user does -mpcrel,
+     don't force them to also specify -mprefixed-addr.  */
+  if (TARGET_FUTURE)
+    {
+      bool explicit_prefixed = ((rs6000_isa_flags_explicit
+				 & OPTION_MASK_PREFIXED_ADDR) != 0);
+      bool explicit_pcrel = ((rs6000_isa_flags_explicit
+			      & OPTION_MASK_PCREL) != 0);
+
+      /* Prefixed addressing requires 64-bit registers.  */
+      if (!TARGET_POWERPC64)
+	{
+	  if (TARGET_PCREL && explicit_pcrel)
+	    error ("%qs requires %qs", "-mpcrel", "-m64");
+
+	  else if (TARGET_PREFIXED_ADDR && explicit_prefixed)
+	    error ("%qs requires %qs", "-mprefixed-addr", "-m64");
+
+	  rs6000_isa_flags &= ~ADDRESSING_FUTURE_MASKS;
+	}
+
+      /* Only ELFv2 currently supports prefixed/pcrel addressing.  */
+      else if (rs6000_current_abi != ABI_ELFv2)
+	{
+	  if (TARGET_PCREL && explicit_pcrel)
+	    error ("%qs requires %qs", "-mpcrel", "-mabi=elfv2");
+
+	  else if (TARGET_PREFIXED_ADDR && explicit_prefixed)
+	    error ("%qs requires %qs", "-mprefixed-addr", "-mabi=elfv2");
+
+	  rs6000_isa_flags &= ~ADDRESSING_FUTURE_MASKS;
+	}
+
+      /* Pc-relative requires the medium code model.  */
+      else if (TARGET_PCREL && TARGET_CMODEL != CMODEL_MEDIUM)
+	{
+	  if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+	    error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");
+
+	  rs6000_isa_flags &= ~OPTION_MASK_PCREL;
+	}
+
+      /* Enable defaults if desired.  */
+      else
+	{
+	  if (!explicit_prefixed
+	      && (TARGET_PREFIXED_ADDR_DEFAULT
+		  || TARGET_PCREL
+		  || TARGET_PCREL_DEFAULT))
+	    rs6000_isa_flags |= OPTION_MASK_PREFIXED_ADDR;
+
+	  if (!explicit_pcrel && TARGET_PCREL_DEFAULT
+	      && TARGET_CMODEL == CMODEL_MEDIUM)
+	    rs6000_isa_flags |= OPTION_MASK_PCREL;
+	}
+    }
+
+  /* -mprefixed-addr (and hence -mpcrel) requires -mcpu=future.  */
+  if (TARGET_PREFIXED_ADDR && !TARGET_FUTURE)
     {
       if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");
+	error ("%qs requires %qs", "-mpcrel", "-mcpu=future");
+      else if ((rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED_ADDR) != 0)
+	error ("%qs requires %qs", "-mprefixed-addr", "-mcpu=future");
+
+      rs6000_isa_flags &= ~(OPTION_MASK_PCREL | OPTION_MASK_PREFIXED_ADDR);
+    }
+
+  /* -mpcrel requires prefixed load/store addressing.  */
+  if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
+    {
+      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+	error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
 
       rs6000_isa_flags &= ~OPTION_MASK_PCREL;
     }
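
For readers following the thread, the order of the checks in the rs6000.c hunk
above can be modelled in standalone C.  This is only a sketch, not the code in
the patch: the struct, the mask values, and the function name
resolve_addressing are hypothetical stand-ins, and the error () calls for
explicit options are left out so that only the resulting flag bits are shown.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real option masks; the values are
   arbitrary and are not the ones GCC uses.  */
#define MASK_PCREL         0x1u
#define MASK_PREFIXED_ADDR 0x2u
#define ADDRESSING_MASKS   (MASK_PCREL | MASK_PREFIXED_ADDR)

/* Hypothetical description of the target; not a GCC structure.  */
struct target_state
{
  bool future;            /* -mcpu=future  */
  bool powerpc64;         /* -m64  */
  bool elfv2;             /* -mabi=elfv2  */
  bool cmodel_medium;     /* -mcmodel=medium  */
  bool prefixed_default;  /* what the OS tm.h asks for  */
  bool pcrel_default;     /* what the OS tm.h asks for  */
};

/* Return FLAGS adjusted in the same order as the patch.  EXPLICIT_FLAGS
   holds the bits the user set on the command line.  */
static unsigned
resolve_addressing (const struct target_state *t, unsigned flags,
                    unsigned explicit_flags)
{
  if (t->future)
    {
      /* Prefixed addressing needs 64-bit registers and ELFv2.  */
      if (!t->powerpc64 || !t->elfv2)
        flags &= ~ADDRESSING_MASKS;

      /* Pc-relative needs the medium code model.  */
      else if ((flags & MASK_PCREL) != 0 && !t->cmodel_medium)
        flags &= ~MASK_PCREL;

      /* Otherwise apply the OS defaults unless the user chose.  */
      else
        {
          if ((explicit_flags & MASK_PREFIXED_ADDR) == 0
              && (t->prefixed_default
                  || (flags & MASK_PCREL) != 0
                  || t->pcrel_default))
            flags |= MASK_PREFIXED_ADDR;

          if ((explicit_flags & MASK_PCREL) == 0 && t->pcrel_default
              && t->cmodel_medium)
            flags |= MASK_PCREL;
        }
    }

  /* Prefixed addressing (and hence pc-relative) needs -mcpu=future.  */
  if ((flags & MASK_PREFIXED_ADDR) != 0 && !t->future)
    flags &= ~ADDRESSING_MASKS;

  /* Pc-relative needs prefixed addressing.  */
  if ((flags & MASK_PCREL) != 0 && (flags & MASK_PREFIXED_ADDR) == 0)
    flags &= ~MASK_PCREL;

  return flags;
}
```

In this model, a target with all six fields true and no user options ends up
with both bits set, while -mpcrel alone on a target whose defaults are off
also turns on the prefixed bit, which matches the comment in the patch about
not forcing the user to add -mprefixed-addr.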

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* Re: [PATCH], V4, patch #9 [part of patch #4.2], Add prefixed address offset checks
  2019-10-04 12:29 ` [PATCH], V4, patch #9 [part of patch #4.2], Add prefixed address offset checks Michael Meissner
@ 2019-10-09 22:24   ` Segher Boessenkool
  2019-10-09 23:41     ` Michael Meissner
  2019-10-09 23:44     ` Michael Meissner
  0 siblings, 2 replies; 37+ messages in thread
From: Segher Boessenkool @ 2019-10-09 22:24 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

Hi!

On Fri, Oct 04, 2019 at 08:29:11AM -0400, Michael Meissner wrote:
> @@ -8651,6 +8675,11 @@ rs6000_legitimate_address_p (machine_mod
>        && mode_supports_pre_incdec_p (mode)
>        && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
>      return 1;
> +
> +  /* Handle prefixed addresses (PC-relative or 34-bit offset).  */
> +  if (address_is_prefixed (x, mode, NON_PREFIXED_DEFAULT))
> +    return 1;

Is this correct?  Are addresses with a larger offset always legitimate?
I don't see why that would be the case.

The rest of the patch looks good, thanks.


Segher

* Re: [PATCH], V4, patch #9 [part of patch #4.2], Add prefixed address offset checks
  2019-10-09 22:24   ` Segher Boessenkool
@ 2019-10-09 23:41     ` Michael Meissner
  2019-10-10 21:54       ` Segher Boessenkool
  2019-10-09 23:44     ` Michael Meissner
  1 sibling, 1 reply; 37+ messages in thread
From: Michael Meissner @ 2019-10-09 23:41 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Wed, Oct 09, 2019 at 04:56:48PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Oct 04, 2019 at 08:29:11AM -0400, Michael Meissner wrote:
> > @@ -8651,6 +8675,11 @@ rs6000_legitimate_address_p (machine_mod
> >        && mode_supports_pre_incdec_p (mode)
> >        && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
> >      return 1;
> > +
> > +  /* Handle prefixed addresses (PC-relative or 34-bit offset).  */
> > +  if (address_is_prefixed (x, mode, NON_PREFIXED_DEFAULT))
> > +    return 1;
> 
> Is this correct?  Are addresses with a larger offset always legitimate?
> I don't see why that would be the case.
> 
> The rest of the patch looks good, thanks.

As far as I know, with the exception of SDmode (which is not allowed to have an
offset), all other modes that use D*-form addresses work with a prefixed
instruction, provided the offset fits in the 34-bit field.

The function address_to_insn_form, which is called by address_is_prefixed,
checks that the offset is 34 bits or less, that the mode is not SDmode, and so
on, before it reports the address as valid.
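
The range check described above can be sketched in standalone C.  This is an
illustration only, not the code in rs6000.c: the helper names are hypothetical,
and the sketch ignores the extra alignment rules of the DS-form and DQ-form
instructions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if OFFSET fits in a signed 34-bit field,
   that is, -2^33 <= OFFSET < 2^33.  */
static bool
fits_signed_34bit (int64_t offset)
{
  const int64_t limit = INT64_C (1) << 33;
  return offset >= -limit && offset < limit;
}

/* Hypothetical helper: true if OFFSET fits in the signed 16-bit field of
   a traditional D-form instruction.  */
static bool
fits_signed_16bit (int64_t offset)
{
  return offset >= -32768 && offset <= 32767;
}

/* Hypothetical helper: true if a register + OFFSET address can only be
   encoded with a prefixed instruction, because the offset is too large
   for 16 bits but still fits in 34 bits.  */
static bool
offset_needs_prefix (int64_t offset)
{
  return !fits_signed_16bit (offset) && fits_signed_34bit (offset);
}
```

Under these assumptions an offset of 32768 needs a prefixed instruction, while
an offset of 2^33 fits neither form and would have to be built some other way.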

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* Re: [PATCH], V4, patch #9 [part of patch #4.2], Add prefixed address offset checks
  2019-10-09 22:24   ` Segher Boessenkool
  2019-10-09 23:41     ` Michael Meissner
@ 2019-10-09 23:44     ` Michael Meissner
  1 sibling, 0 replies; 37+ messages in thread
From: Michael Meissner @ 2019-10-09 23:44 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Wed, Oct 09, 2019 at 04:56:48PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Oct 04, 2019 at 08:29:11AM -0400, Michael Meissner wrote:
> > @@ -8651,6 +8675,11 @@ rs6000_legitimate_address_p (machine_mod
> >        && mode_supports_pre_incdec_p (mode)
> >        && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
> >      return 1;
> > +
> > +  /* Handle prefixed addresses (PC-relative or 34-bit offset).  */
> > +  if (address_is_prefixed (x, mode, NON_PREFIXED_DEFAULT))
> > +    return 1;
> 
> Is this correct?  Are addresses with a larger offset always legitimate?
> I don't see why that would be the case.
> 
> The rest of the patch looks good, thanks.

This patch BTW is the same as the new V5 patch #1.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* Re: [PATCH], V4, patch #9 [part of patch #4.2], Add prefixed address offset checks
  2019-10-09 23:41     ` Michael Meissner
@ 2019-10-10 21:54       ` Segher Boessenkool
  0 siblings, 0 replies; 37+ messages in thread
From: Segher Boessenkool @ 2019-10-10 21:54 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

On Wed, Oct 09, 2019 at 07:40:23PM -0400, Michael Meissner wrote:
> On Wed, Oct 09, 2019 at 04:56:48PM -0500, Segher Boessenkool wrote:
> > On Fri, Oct 04, 2019 at 08:29:11AM -0400, Michael Meissner wrote:
> > > @@ -8651,6 +8675,11 @@ rs6000_legitimate_address_p (machine_mod
> > >        && mode_supports_pre_incdec_p (mode)
> > >        && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
> > >      return 1;
> > > +
> > > +  /* Handle prefixed addresses (PC-relative or 34-bit offset).  */
> > > +  if (address_is_prefixed (x, mode, NON_PREFIXED_DEFAULT))
> > > +    return 1;
> > 
> > Is this correct?  Are addresses with a larger offset always legitimate?
> > I don't see why that would be the case.

> The function address_to_insn_form, which is called by address_is_prefixed,
> checks if the offset is 34-bits or less

Ah, right.

And "address_is_prefixed" is a long enough name, "address_is_a_valid_prefixed_address"
isn't better ;-)


Segher

Thread overview: 37+ messages
2019-09-18 23:42 PowerPC future machine patches, version 4 Michael Meissner
2019-09-18 23:49 ` [PATCH] V4, patch #1: Rework prefixed/pc-relative lookup Michael Meissner
2019-09-21  1:29   ` Segher Boessenkool
2019-09-23 17:49     ` Michael Meissner
2019-09-18 23:56 ` [PATCH], V4, patch #2: Add prefixed insn attribute Michael Meissner
2019-09-18 23:58 ` [PATCH], V4, patch #3: Fix up mov<mode>_64bit_dm Michael Meissner
2019-09-27 23:33   ` Segher Boessenkool
2019-09-19  0:06 ` [PATCH], V4, patch #4: Enable prefixed/pc-rel addressing Michael Meissner
2019-09-19  0:09 ` [PATCH] V4, patch #5: Use PLI (PADDI) to load up 34-bit DImode Michael Meissner
2019-09-19  0:11 ` [PATCH] V4, patch #6: Use PLI (PADDI) to load up 32-bit SImode constants Michael Meissner
2019-09-19  0:13 ` [PATCH] V4, patch #7: Use PADDI to add 34-bit constants Michael Meissner
2019-09-19  0:17 ` [PATCH] V4, patch #8: Enable -mpcrel on Linux 64-bit, but not on other targets Michael Meissner
2019-09-24  5:59 ` [PATCH] V4.1, patch #1: Rework prefixed/pc-relative lookup (revised) Michael Meissner
2019-09-27 22:59   ` Segher Boessenkool
2019-09-30 13:51     ` [PATCH, committed] V4.2, patch #1: Rework prefixed/pc-relative lookup (revised #2) Michael Meissner
2019-09-24  6:10 ` [PATCH], V4.1, patch #2: Add prefixed insn attribute (revised) Michael Meissner
2019-09-27 23:27   ` Segher Boessenkool
2019-09-30 13:53     ` [PATCH, committed], V4.2, patch #2: Add prefixed insn attribute (revised #2) Michael Meissner
2019-09-30 14:13 ` [PATCH], V4, patch #4.1: Enable prefixed/pc-rel addressing (revised) Michael Meissner
2019-10-01 23:56   ` Segher Boessenkool
2019-10-02 19:04     ` Michael Meissner
2019-10-02 22:52       ` Segher Boessenkool
2019-10-04 12:29 ` [PATCH], V4, patch #9 [part of patch #4.2], Add prefixed address offset checks Michael Meissner
2019-10-09 22:24   ` Segher Boessenkool
2019-10-09 23:41     ` Michael Meissner
2019-10-10 21:54       ` Segher Boessenkool
2019-10-09 23:44     ` Michael Meissner
2019-10-04 12:35 ` [PATCH], V4, patch #10 [part of patch #4.2], Set prefixed length for 128-bit non-vector type Michael Meissner
2019-10-04 12:41 ` [PATCH], V4, patch #11 [part of patch #4.2], Adjust insn cost for prefixed instructions Michael Meissner
2019-10-04 12:46 ` [PATCH], V4, patch #12 [part of patch #4.2], Update predicates Michael Meissner
2019-10-04 12:51 ` [PATCH], V4, patch #13 [part of patch #4.2], Update stack protect insns for prefixed addresses Michael Meissner
2019-10-04 12:56 ` [PATCH], V4, patch #14 [part of patch #4.2], Update vector 128-bit instruction sizes Michael Meissner
2019-10-04 13:02 ` [PATCH], V4, patch #15 [part of patch #4.2], Make vector extract/insert support prefixed instructions Michael Meissner
2019-10-04 13:07 ` [PATCH], V4, patch #16 [Same as patch #5], Support DImode 34-bit constants Michael Meissner
2019-10-04 13:11 ` [PATCH], V4, patch #17 [Same as patch #6], Use PADDI to load up 32-bit SImode constants Michael Meissner
2019-10-04 13:14 ` [PATCH], V4, patch #18 [Same as patch #7], Use PADDI to add 34-bit constants Michael Meissner
2019-10-04 13:18 ` [PATCH], V4, patch #19 [Same as patch #8], Enable -mpcrel on Linux 64-bit systems Michael Meissner
