PowerPC future machine patches, version 5

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* PowerPC future machine patches, version 5
@ 2019-10-09 19:53 Michael Meissner
  2019-10-09 19:56 ` [PATCH] V5, #1 of 15: Support 34-bit offsets for prefixed instructions Michael Meissner
                   ` (14 more replies)
  0 siblings, 15 replies; 28+ messages in thread
From: Michael Meissner @ 2019-10-09 19:53 UTC (permalink / raw)
  To: gcc-patches, segher, dje.gcc, meissner

I'm going to re-issue the outstanding 'future' patches as separate threads
under this message.

Compared to the original V4 patches, I have tried to break the larger patches
(particularly #4) into smaller chunks.

One of the V4 patches that I submitted was wrong, and this series of patches
corrects the issue (the issue was I used PC-relative offsets to add to a
register, and the current instruction definition does not support that).  The
code modified dealt with vector extracts where the vector has a PC-relative
address, but the insn I added also broke normal switch statements.

This series of patches only adds one new constraint (em) that matches any
memory operand that does not use prefixed addressing (including PC-relative
addressing).  In some of the previous V4 patches, I added two constraints.

I adjusted the code to address several of the previous comments:

   * Eliminate having a TARGET_<xxx> test before calling a function that does
     the same test;

   * Fixing vector moves instruction length.

There are currently 15 patches in this set (as of this writing).

   * Patches 1-10 provide the necessary functionality;

   * Patch 11 sets the default for Linux 64-bit to use PC-relative and prefixed
     instructions.

   * Patches 12-14 are the tests to the test suite.

   * Patch 15 is a patch to change how @pcrel is emitted to prevent some mixups
     (like I had).

I had originally posted a patch to support PCREL_OPT in the V3 series.  Given
your comments, I plan to re-implement it, so it is not part of this patch set.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] V5, #1 of 15: Support 34-bit offsets for prefixed instructions
  2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
@ 2019-10-09 19:56 ` Michael Meissner
  2019-10-10 22:02   ` Segher Boessenkool
  2019-10-09 20:03 ` [PATCH] V5, #2 of 15: Fix prefixed instruction length for some 128-bit types Michael Meissner
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 28+ messages in thread
From: Michael Meissner @ 2019-10-09 19:56 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch adds support in the various functions that check memory offsets for
the 34-bit offset with prefixed instructions on the 'future' machine.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  Can I check this into
the trunk?

2019-10-08  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (quad_address_p): Add check for prefixed
	addresses.
	(mem_operand_gpr): Add check for prefixed addresses.
	(mem_operand_ds_form): Add check for prefixed addresses.
	(rs6000_legitimate_offset_address_p): If we support prefixed
	addresses, check for a 34-bit offset instead of 16-bit.
	(rs6000_legitimate_address_p): Add check for prefixed addresses.
	Do not allow load/store with update if the address is prefixed.
	(rs6000_mode_dependent_address):  If we support prefixed
	addresses, check for a 34-bit offset instead of 16-bit.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276713)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -7250,6 +7250,13 @@ quad_address_p (rtx addr, machine_mode m
   if (VECTOR_MODE_P (mode) && !mode_supports_dq_form (mode))
     return false;
 
+  /* Is this a valid prefixed address?  If the bottom four bits of the offset
+     are non-zero, we could use a prefixed instruction (which does not have the
+     DQ-form constraint that the traditional instruction had) instead of
+     forcing the unaligned offset to a GPR.  */
+  if (address_is_prefixed (addr, mode, NON_PREFIXED_DQ))
+    return true;
+
   if (GET_CODE (addr) != PLUS)
     return false;
 
@@ -7351,6 +7358,13 @@ mem_operand_gpr (rtx op, machine_mode mo
       && legitimate_indirect_address_p (XEXP (addr, 0), false))
     return true;
 
+  /* Allow prefixed instructions if supported.  If the bottom two bits of the
+     offset are non-zero, we could use a prefixed instruction (which does not
+     have the DS-form constraint that the traditional instruction had) instead
+     of forcing the unaligned offset to a GPR.  */
+  if (address_is_prefixed (addr, mode, NON_PREFIXED_DS))
+    return true;
+
   /* Don't allow non-offsettable addresses.  See PRs 83969 and 84279.  */
   if (!rs6000_offsettable_memref_p (op, mode, false))
     return false;
@@ -7385,6 +7399,13 @@ mem_operand_ds_form (rtx op, machine_mod
   int extra;
   rtx addr = XEXP (op, 0);
 
+  /* Allow prefixed instructions if supported.  If the bottom two bits of the
+     offset are non-zero, we could use a prefixed instruction (which does not
+     have the DS-form constraint that the traditional instruction had) instead
+     of forcing the unaligned offset to a GPR.  */
+  if (address_is_prefixed (addr, mode, NON_PREFIXED_DS))
+    return true;
+
   if (!offsettable_address_p (false, mode, addr))
     return false;
 
@@ -7754,7 +7775,10 @@ rs6000_legitimate_offset_address_p (mach
       break;
     }
 
-  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
+  if (TARGET_PREFIXED_ADDR)
+    return SIGNED_34BIT_OFFSET_EXTRA_P (offset, extra);
+  else
+    return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 
 bool
@@ -8651,6 +8675,11 @@ rs6000_legitimate_address_p (machine_mod
       && mode_supports_pre_incdec_p (mode)
       && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
     return 1;
+
+  /* Handle prefixed addresses (PC-relative or 34-bit offset).  */
+  if (address_is_prefixed (x, mode, NON_PREFIXED_DEFAULT))
+    return 1;
+
   /* Handle restricted vector d-form offsets in ISA 3.0.  */
   if (quad_offset_p)
     {
@@ -8709,7 +8738,11 @@ rs6000_legitimate_address_p (machine_mod
 	  || (!avoiding_indexed_address_p (mode)
 	      && legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict)))
       && rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0)))
-    return 1;
+    {
+      /* There is no prefixed version of the load/store with update.  */
+      rtx addr = XEXP (x, 1);
+      return !address_is_prefixed (addr, mode, NON_PREFIXED_DEFAULT);
+    }
   if (reg_offset_p && !quad_offset_p
       && legitimate_lo_sum_address_p (mode, x, reg_ok_strict))
     return 1;
@@ -8773,7 +8806,10 @@ rs6000_mode_dependent_address (const_rtx
 	{
 	  HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
 	  HOST_WIDE_INT extra = TARGET_POWERPC64 ? 8 : 12;
-	  return !SIGNED_16BIT_OFFSET_EXTRA_P (val, extra);
+	  if (TARGET_PREFIXED_ADDR)
+	    return !SIGNED_34BIT_OFFSET_EXTRA_P (val, extra);
+	  else
+	    return !SIGNED_16BIT_OFFSET_EXTRA_P (val, extra);
 	}
       break;
 

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] V5, #2 of 15: Fix prefixed instruction length for some 128-bit types
  2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
  2019-10-09 19:56 ` [PATCH] V5, #1 of 15: Support 34-bit offsets for prefixed instructions Michael Meissner
@ 2019-10-09 20:03 ` Michael Meissner
  2019-10-10 22:18   ` Segher Boessenkool
  2019-10-09 20:07 ` [PATCH] V5, #3 of 15: Deal with prefixed instructions in rs6000_insn_cost Michael Meissner
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 28+ messages in thread
From: Michael Meissner @ 2019-10-09 20:03 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch fixes the prefixed and non-prefixed instruction sizes for the
128-bit types that aren't loaded into 128-bit vectors (TDmode, TFmode, IFmode,
PTImode).

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  Can I check this into
the trunk?

2019-10-08  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.md (mov<mode>_64bit_dm): Set prefixed and
	non-prefixed length.
	(movtd_64bit_nodm): Set prefixed and non-prefixed length.
	(mov<mode>_ppc64): Set prefixed and non-prefixed length.

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 276713)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -7773,9 +7773,13 @@ (define_insn_and_split "*mov<mode>_64bit
   "#"
   "&& reload_completed"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8")
-   (set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")])
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
+  [(set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")
+   (set_attr "non_prefixed_length" "8")
+   (set_attr "prefixed_length" "20")])
 
 (define_insn_and_split "*movtd_64bit_nodm"
   [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
@@ -7786,8 +7790,12 @@ (define_insn_and_split "*movtd_64bit_nod
   "#"
   "&& reload_completed"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8,8,8,12,12,8")])
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
+  [(set_attr "non_prefixed_length" "8")
+   (set_attr "prefixed_length" "20")])
 
 (define_insn_and_split "*mov<mode>_32bit"
   [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r")
@@ -8984,7 +8992,8 @@ (define_insn "*mov<mode>_ppc64"
   return rs6000_output_move_128bit (operands);
 }
   [(set_attr "type" "store,store,load,load,*,*")
-   (set_attr "length" "8")])
+   (set_attr "prefixed_length" "20")
+   (set_attr "non_prefixed_length" "8")])
 
 (define_split
   [(set (match_operand:TI2 0 "int_reg_operand")

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] V5, #3 of 15: Deal with prefixed instructions in rs6000_insn_cost
  2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
  2019-10-09 19:56 ` [PATCH] V5, #1 of 15: Support 34-bit offsets for prefixed instructions Michael Meissner
  2019-10-09 20:03 ` [PATCH] V5, #2 of 15: Fix prefixed instruction length for some 128-bit types Michael Meissner
@ 2019-10-09 20:07 ` Michael Meissner
  2019-10-11 21:17   ` Segher Boessenkool
  2019-10-09 20:11 ` [PATCH] V5, #4 of 15: Update predicates Michael Meissner
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 28+ messages in thread
From: Michael Meissner @ 2019-10-09 20:07 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch is a simplification of the previous patch to adjust the rtx insn
cost for prefixed instructions.  Rather than making it a separate function with
only one caller, I folded the code into rs6000_insn_cost.

You had asked for this to be a separate patch, so it is in this patch.

The basic problem is if there is no explicit cost predicate, rs6000_insn_cost
uses the instruction size to figure out how many instructions are present, and
make the cost a fact on that.  Since prefixed instructions are 12 bytes within
GCC (to deal with the implicit NOP), if we did not do this change, the
optimizers would try to save registers from prefixed loads because they thought
the load was more expensive.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  Can I check this into
the trunk?

2019-10-08  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_insn_cost): Do not make prefixed
	instructions cost more because they are larger in size.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276715)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -20972,14 +20972,38 @@ rs6000_insn_cost (rtx_insn *insn, bool s
   if (recog_memoized (insn) < 0)
     return 0;
 
-  if (!speed)
-    return get_attr_length (insn);
+  if (speed)
+    {
+      int cost = get_attr_cost (insn);
+      if (cost > 0)
+	return cost;
+    }
 
-  int cost = get_attr_cost (insn);
-  if (cost > 0)
-    return cost;
+  int cost;
+  int length = get_attr_length (insn);
+  int n = length / 4;
+
+  /* How many real instructions are generated for this insn?  This is slightly
+     different from the length attribute, in that the length attribute counts
+     the number of bytes.  With prefixed instructions, we don't want to count a
+     prefixed instruction (length 12 bytes including possible NOP) as taking 3
+     instructions, but just one.  */
+  if (length >= 12 && get_attr_prefixed (insn) == PREFIXED_YES)
+    {
+      /* Single prefixed instruction.  */
+      if (length == 12)
+	n = 1;
+
+      /* A normal instruction and a prefixed instruction (16) or two back
+	 to back prefixed instructions (20).  */
+      else if (length == 16 || length == 20)
+	n = 2;
+
+      /* Guess for larger instruction sizes.  */
+      else
+	n = 2 + (length - 20) / 4;
+    }
 
-  int n = get_attr_length (insn) / 4;
   enum attr_type type = get_attr_type (insn);
 
   switch (type)

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] V5, #4 of 15: Update predicates
  2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
                   ` (2 preceding siblings ...)
  2019-10-09 20:07 ` [PATCH] V5, #3 of 15: Deal with prefixed instructions in rs6000_insn_cost Michael Meissner
@ 2019-10-09 20:11 ` Michael Meissner
  2019-10-11 21:43   ` Segher Boessenkool
  2019-10-09 20:26 ` [PATCH] V5, #5 of 15: Support -fstack-protect and large stack frames Michael Meissner
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 28+ messages in thread
From: Michael Meissner @ 2019-10-09 20:11 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch updates the predicates.

It allows lwa_operand (used by extendsidi2) to generate offsets that can be
used with PLWA but not LWA (due to LWA using the DS instruction format).

It also adds two new predicates that disallow prefixed memory that will used in
patches #5 and #7.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  Can I check this into
the trunk?

2019-10-08  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (lwa_operand): Allow using PLWA to
	generate sign extend with odd offsets.
	(non_prefixed_memory): New predicate.
	(reg_or_non_prefixed_memory): New predicate.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 276713)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -932,6 +932,13 @@ (define_predicate "lwa_operand"
     return false;
 
   addr = XEXP (inner, 0);
+
+  /* The LWA instruction uses the DS-form format where the bottom two bits of
+     the offset must be 0.  The prefixed PLWA does not have this
+     restriction.  */
+  if (address_is_prefixed (addr, DImode, NON_PREFIXED_DS))
+    return true;
+
   if (GET_CODE (addr) == PRE_INC
       || GET_CODE (addr) == PRE_DEC
       || (GET_CODE (addr) == PRE_MODIFY
@@ -1807,3 +1814,21 @@ (define_predicate "pcrel_external_addres
 (define_predicate "pcrel_local_or_external_address"
   (ior (match_operand 0 "pcrel_local_address")
        (match_operand 0 "pcrel_external_address")))
+
+;; Return 1 if op is a memory operand that is not prefixed.
+(define_predicate "non_prefixed_memory"
+  (match_code "mem")
+{
+  if (!memory_operand (op, mode))
+    return false;
+
+  return !address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
+})
+
+;; Return 1 if op is either a register operand or a memory operand that does
+;; not use a prefixed address.
+(define_predicate "reg_or_non_prefixed_memory"
+  (match_code "reg,subreg,mem")
+{
+  return (gpc_reg_operand (op, mode) || non_prefixed_memory (op, mode));
+})

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] V5, #5 of 15: Support -fstack-protect and large stack frames
  2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
                   ` (3 preceding siblings ...)
  2019-10-09 20:11 ` [PATCH] V5, #4 of 15: Update predicates Michael Meissner
@ 2019-10-09 20:26 ` Michael Meissner
  2019-10-11 21:48   ` Segher Boessenkool
  2019-10-09 20:26 ` [PATCH] V5, #6 of 15: Make vector load/store instruction length correct with prefixed addresses Michael Meissner
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 28+ messages in thread
From: Michael Meissner @ 2019-10-09 20:26 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch allows -fstack-protect to work with large stack frames.

It does so by adding a new constraint (em) that matches normal memory that does
not use a prefixed address (including PC-relative addresses).

The stack protect test and set functions now can take indexed address forms,
and there is a define_expand to make sure memory would not use a prefixed
address.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  Can I check this into
the trunk?

2019-10-08  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/constraints.md (em constraint): New constraint.
	* config/rs6000/rs6000-protos.h (make_memory_non_prefixed): New
	declaration.
	* config/rs6000/rs6000.c (make_memory_non_prefixed): New function
	to return a traditional non-prefixed instruction.
	* config/rs6000/rs6000.md (stack_protect_setdi): Make sure both
	memory operands are not	prefixed.
	(stack_protect_testdi): Make sure both memory operands are not
	prefixed.
	* doc/md.texi (PowerPC constraints): Document the em constraint.

Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(revision 276713)
+++ gcc/config/rs6000/constraints.md	(working copy)
@@ -210,6 +210,11 @@ several times, or that might not access
   (and (match_code "mem")
        (match_test "GET_RTX_CLASS (GET_CODE (XEXP (op, 0))) != RTX_AUTOINC")))
 
+(define_memory_constraint "em"
+  "A memory operand that does not contain a prefixed address."
+  (and (match_code "mem")
+       (match_test "non_prefixed_memory (op, mode)")))
+
 (define_memory_constraint "Q"
   "Memory operand that is an offset from a register (it is usually better
 to use @samp{m} or @samp{es} in @code{asm} statements)"
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 276713)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -192,6 +192,7 @@ extern enum insn_form address_to_insn_fo
 extern bool prefixed_load_p (rtx_insn *);
 extern bool prefixed_store_p (rtx_insn *);
 extern bool prefixed_paddi_p (rtx_insn *);
+extern rtx make_memory_non_prefixed (rtx);
 extern void rs6000_asm_output_opcode (FILE *);
 extern void rs6000_final_prescan_insn (rtx_insn *, rtx [], int);
 
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276717)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -24972,6 +24972,34 @@ prefixed_paddi_p (rtx_insn *insn)
   return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL);
 }
 
+/* Make a memory address non-prefixed if it is prefixed.  */
+
+rtx
+make_memory_non_prefixed (rtx mem)
+{
+  gcc_assert (MEM_P (mem));
+
+  rtx old_addr = XEXP (mem, 0);
+  if (address_is_prefixed (old_addr, GET_MODE (mem), NON_PREFIXED_DEFAULT))
+    {
+      rtx new_addr;
+
+      if (GET_CODE (old_addr) == PLUS
+	  && (REG_P (XEXP (old_addr, 0)) || SUBREG_P (XEXP (old_addr, 0)))
+	  && CONST_INT_P (XEXP (old_addr, 1)))
+	{
+	  rtx tmp_reg = force_reg (Pmode, XEXP (old_addr, 1));
+	  new_addr = gen_rtx_PLUS (Pmode, XEXP (old_addr, 0), tmp_reg);
+	}
+      else
+	new_addr = force_reg (Pmode, old_addr);
+
+      mem = change_address (mem, VOIDmode, new_addr);
+    }
+
+  return mem;
+}
+
 /* Whether the next instruction needs a 'p' prefix issued before the
    instruction is printed out.  */
 static bool next_insn_prefixed_p;
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 276716)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -11531,9 +11531,25 @@ (define_insn "stack_protect_setsi"
   [(set_attr "type" "three")
    (set_attr "length" "12")])
 
-(define_insn "stack_protect_setdi"
-  [(set (match_operand:DI 0 "memory_operand" "=Y")
-	(unspec:DI [(match_operand:DI 1 "memory_operand" "Y")] UNSPEC_SP_SET))
+(define_expand "stack_protect_setdi"
+  [(parallel [(set (match_operand:DI 0 "memory_operand")
+		   (unspec:DI [(match_operand:DI 1 "memory_operand")]
+		   UNSPEC_SP_SET))
+	      (set (match_scratch:DI 2)
+		   (const_int 0))])]
+  "TARGET_64BIT"
+{
+  if (TARGET_PREFIXED_ADDR)
+    {
+      operands[0] = make_memory_non_prefixed (operands[0]);
+      operands[1] = make_memory_non_prefixed (operands[1]);
+    }
+})
+
+(define_insn "*stack_protect_setdi"
+  [(set (match_operand:DI 0 "non_prefixed_memory" "=em")
+	(unspec:DI [(match_operand:DI 1 "non_prefixed_memory" "em")]
+		   UNSPEC_SP_SET))
    (set (match_scratch:DI 2 "=&r") (const_int 0))]
   "TARGET_64BIT"
   "ld%U1%X1 %2,%1\;std%U0%X0 %2,%0\;li %2,0"
@@ -11577,10 +11593,27 @@ (define_insn "stack_protect_testsi"
    lwz%U1%X1 %3,%1\;lwz%U2%X2 %4,%2\;cmplw %0,%3,%4\;li %3,0\;li %4,0"
   [(set_attr "length" "16,20")])
 
-(define_insn "stack_protect_testdi"
+(define_expand "stack_protect_testdi"
+  [(parallel [(set (match_operand:CCEQ 0 "cc_reg_operand")
+		   (unspec:CCEQ [(match_operand:DI 1 "memory_operand")
+				 (match_operand:DI 2 "memory_operand")]
+				UNSPEC_SP_TEST))
+	      (set (match_scratch:DI 4)
+		   (const_int 0))
+	      (clobber (match_scratch:DI 3))])]
+  "TARGET_64BIT"
+{
+  if (TARGET_PREFIXED_ADDR)
+    {
+      operands[1] = make_memory_non_prefixed (operands[1]);
+      operands[2] = make_memory_non_prefixed (operands[2]);
+    }
+})
+
+(define_insn "*stack_protect_testdi"
   [(set (match_operand:CCEQ 0 "cc_reg_operand" "=x,?y")
-        (unspec:CCEQ [(match_operand:DI 1 "memory_operand" "Y,Y")
-		      (match_operand:DI 2 "memory_operand" "Y,Y")]
+        (unspec:CCEQ [(match_operand:DI 1 "non_prefixed_memory" "em,em")
+		      (match_operand:DI 2 "non_prefixed_memory" "em,em")]
 		     UNSPEC_SP_TEST))
    (set (match_scratch:DI 4 "=r,r") (const_int 0))
    (clobber (match_scratch:DI 3 "=&r,&r"))]
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 276713)
+++ gcc/doc/md.texi	(working copy)
@@ -3373,6 +3373,9 @@ asm ("st %1,%0" : "=m<>" (mem) : "r" (va
 
 is not.
 
+@item em
+A memory operand that does not contain a prefixed address.
+
 @item es
 A ``stable'' memory operand; that is, one which does not include any
 automodification of the base register.  This used to be useful when

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] V5, #6 of 15: Make vector load/store instruction length correct with prefixed addresses
  2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
                   ` (4 preceding siblings ...)
  2019-10-09 20:26 ` [PATCH] V5, #5 of 15: Support -fstack-protect and large stack frames Michael Meissner
@ 2019-10-09 20:26 ` Michael Meissner
  2019-10-11 21:58   ` Segher Boessenkool
  2019-10-09 20:36 ` [PATCH] V5, #7 of 15: Restrict vector extract to not use prefixed memory Michael Meissner
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 28+ messages in thread
From: Michael Meissner @ 2019-10-09 20:26 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch updates the instruction length for 128-bit move insns for types that
go into a single vector register.  It is a lot simpler than the previous patch,
which was written before I added the prefixed_length and non_prefixed_length
attributes.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  Can I check this into
the trunk?

2019-10-08  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/vsx.md (vsx_mov<mode>_64bit): Make sure the
	instruction length is correct for prefixed loads and stores.

Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 276713)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -1149,10 +1149,14 @@ (define_insn "vsx_mov<mode>_64bit"
                "vecstore,  vecload,   vecsimple, mffgpr,    mftgpr,    load,
                 store,     load,      store,     *,         vecsimple, vecsimple,
                 vecsimple, *,         *,         vecstore,  vecload")
-   (set_attr "length"
+   (set_attr "non_prefixed_length"
                "*,         *,         *,         8,         *,         8,
                 8,         8,         8,         8,         *,         *,
                 *,         20,        8,         *,         *")
+   (set_attr "prefixed_length"
+               "*,         *,         *,         8,         *,         20,
+                20,        20,        20,        8,         *,         *,
+                *,         20,        8,         *,         *")
    (set_attr "isa"
                "<VSisa>,   <VSisa>,   <VSisa>,   *,         *,         *,
                 *,         *,         *,         *,         p9v,       *,

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] V5, #7 of 15: Restrict vector extract to not use prefixed memory
  2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
                   ` (5 preceding siblings ...)
  2019-10-09 20:26 ` [PATCH] V5, #6 of 15: Make vector load/store instruction length correct with prefixed addresses Michael Meissner
@ 2019-10-09 20:36 ` Michael Meissner
  2019-10-09 20:39 ` [PATCH] V5, #9 of 15: Add PADDI (PLI) support to load up 32-bit SImode constants Michael Meissner
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 28+ messages in thread
From: Michael Meissner @ 2019-10-09 20:36 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch fixes an issue Aaron Sawdey with vector extracts.  The vector
address adjust functions have a single base register temporary.  However for
PC-relative addresses where the vector element is not constant, it logically
will need two temporary registers, one to load up the PC-relative address, and
the other to hold the offset of the vector element being extracted.

If a vector has a prefixed address and the vector element being extracted is
not constant, this code will load the vector into a register, and then do the
variable extract from that register, insted of just loading the scalar value
up.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  Can I check this into
the trunk?

2019-10-08  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_adjust_vec_address): Add support
	for optimizing PC-relative addresses with a constant element
	number.  Assert that PC-relative vectors do not have variable
	element numbers.  Add support for prefixed addresses with 34-bit
	offsets.
	* config/rs6000/vsx.md (vsx_extract_<mode>_var, VSX_D iterator):
	Do not allow combining prefixed memory with a variable vector
	extract.
	(vsx_extract_v4sf_var): Do not allow combining prefixed memory
	with a variable vector extract.
	(vsx_extract_<mode>_var, VSX_EXTRACT_I iterator): Do not allow
	combining prefixed memory with a variable vector extract.
	(vsx_extract_<mode>_<VS_scalar>mode_var): Do not allow combining
	prefixed memory with a variable vector extract.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276719)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6701,6 +6701,7 @@ rs6000_adjust_vec_address (rtx scalar_re
   rtx element_offset;
   rtx new_addr;
   bool valid_addr_p;
+  bool pcrel_p = pcrel_local_address (addr, Pmode);
 
   /* Vector addresses should not have PRE_INC, PRE_DEC, or PRE_MODIFY.  */
   gcc_assert (GET_RTX_CLASS (GET_CODE (addr)) != RTX_AUTOINC);
@@ -6738,6 +6739,38 @@ rs6000_adjust_vec_address (rtx scalar_re
   else if (REG_P (addr) || SUBREG_P (addr))
     new_addr = gen_rtx_PLUS (Pmode, addr, element_offset);
 
+  /* Optimize PC-relative addresses with a constant offset.  */
+  else if (pcrel_p && CONST_INT_P (element_offset))
+    {
+      rtx addr2 = addr;
+      HOST_WIDE_INT offset = INTVAL (element_offset);
+
+      if (GET_CODE (addr2) == CONST)
+	addr2 = XEXP (addr2, 0);
+
+      if (GET_CODE (addr2) == PLUS)
+	{
+	  offset += INTVAL (XEXP (addr2, 1));
+	  addr2 = XEXP (addr2, 0);
+	}
+
+      gcc_assert (SIGNED_34BIT_OFFSET_P (offset));
+      if (offset)
+	{
+	  addr2 = gen_rtx_PLUS (Pmode, addr2, GEN_INT (offset));
+	  new_addr = gen_rtx_CONST (Pmode, addr2);
+	}
+      else
+	new_addr = addr2;
+    }
+
+  /* With only one temporary base register, we can't support a PC-relative
+     address added to a variable offset.  This is because the PADDI instruction
+     requires RA to be 0 when doing a PC-relative add (i.e. no register to add
+     to).  */
+  else if (pcrel_p)
+    gcc_unreachable ();
+
   /* Optimize D-FORM addresses with constant offset with a constant element, to
      include the element offset in the address directly.  */
   else if (GET_CODE (addr) == PLUS)
@@ -6752,8 +6785,11 @@ rs6000_adjust_vec_address (rtx scalar_re
 	  HOST_WIDE_INT offset = INTVAL (op1) + INTVAL (element_offset);
 	  rtx offset_rtx = GEN_INT (offset);
 
-	  if (IN_RANGE (offset, -32768, 32767)
-	      && (scalar_size < 8 || (offset & 0x3) == 0))
+	  if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (offset))
+	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
+
+	  else if (SIGNED_16BIT_OFFSET_P (offset)
+		   && (scalar_size < 8 || (offset & 0x3) == 0))
 	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
 	  else
 	    {
@@ -6801,11 +6837,11 @@ rs6000_adjust_vec_address (rtx scalar_re
       new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
     }
 
-  /* If we have a PLUS, we need to see whether the particular register class
-     allows for D-FORM or X-FORM addressing.  */
-  if (GET_CODE (new_addr) == PLUS)
+  /* If we have a PLUS or a PC-relative address without the PLUS, we need to
+     see whether the particular register class allows for D-FORM or X-FORM
+     addressing.  */
+  if (GET_CODE (new_addr) == PLUS || pcrel_p)
     {
-      rtx op1 = XEXP (new_addr, 1);
       addr_mask_type addr_mask;
       unsigned int scalar_regno = reg_or_subregno (scalar_reg);
 
@@ -6822,10 +6858,16 @@ rs6000_adjust_vec_address (rtx scalar_re
       else
 	gcc_unreachable ();
 
-      if (REG_P (op1) || SUBREG_P (op1))
-	valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;
-      else
+      if (pcrel_p)
 	valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
+      else
+	{
+	  rtx op1 = XEXP (new_addr, 1);
+	  if (REG_P (op1) || SUBREG_P (op1))
+	    valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;
+	  else
+	    valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
+	}
     }
 
   else if (REG_P (new_addr) || SUBREG_P (new_addr))
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 276720)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -3239,9 +3239,10 @@ (define_insn "vsx_vslo_<mode>"
 ;; Variable V2DI/V2DF extract
 (define_insn_and_split "vsx_extract_<mode>_var"
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v,wa,r")
-	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v,m,m")
-			     (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
-			    UNSPEC_VSX_EXTRACT))
+	(unspec:<VS_scalar>
+	 [(match_operand:VSX_D 1 "reg_or_non_prefixed_memory" "v,em,em")
+	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
+	 UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,&b,&b"))
    (clobber (match_scratch:V2DI 4 "=&v,X,X"))]
   "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_DIRECT_MOVE_64BIT"
@@ -3309,9 +3310,10 @@ (define_insn_and_split "*vsx_extract_v4s
 ;; Variable V4SF extract
 (define_insn_and_split "vsx_extract_v4sf_var"
   [(set (match_operand:SF 0 "gpc_reg_operand" "=wa,wa,?r")
-	(unspec:SF [(match_operand:V4SF 1 "input_operand" "v,m,m")
-		    (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
-		   UNSPEC_VSX_EXTRACT))
+	(unspec:SF
+	 [(match_operand:V4SF 1 "reg_or_non_prefixed_memory" "v,em,em")
+	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
+	 UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,&b,&b"))
    (clobber (match_scratch:V2DI 4 "=&v,X,X"))]
   "VECTOR_MEM_VSX_P (V4SFmode) && TARGET_DIRECT_MOVE_64BIT"
@@ -3672,7 +3674,7 @@ (define_insn_and_split "*vsx_extract_<mo
 (define_insn_and_split "vsx_extract_<mode>_var"
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
 	(unspec:<VS_scalar>
-	 [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m")
+	 [(match_operand:VSX_EXTRACT_I 1 "reg_or_non_prefixed_memory" "v,v,em")
 	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 	 UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,r,&b"))
@@ -3692,7 +3694,7 @@ (define_insn_and_split "*vsx_extract_<mo
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
 	(zero_extend:<VS_scalar>
 	 (unspec:<VSX_EXTRACT_I:VS_scalar>
-	  [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m")
+	  [(match_operand:VSX_EXTRACT_I 1 "reg_or_non_prefixed_memory" "v,v,em")
 	   (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 	  UNSPEC_VSX_EXTRACT)))
    (clobber (match_scratch:DI 3 "=r,r,&b"))

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] V5, #8 of 15: Use PADDI/PLI to load up 34-bit DImode constants
  2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
                   ` (7 preceding siblings ...)
  2019-10-09 20:39 ` [PATCH] V5, #9 of 15: Add PADDI (PLI) support to load up 32-bit SImode constants Michael Meissner
@ 2019-10-09 20:39 ` Michael Meissner
  2019-10-09 20:42 ` [PATCH] V5, #10 of 15: Use PADDI to add large 34-bit constants Michael Meissner
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 28+ messages in thread
From: Michael Meissner @ 2019-10-09 20:39 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch uses PADDI (PLI) to load up 34-bit DImode constants.  In the V3 time
frame, it was part of a larger patch, but you asked that this be a separate
patch.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  Can I check this into
the trunk?

2019-10-08  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (num_insns_constant_gpr): Add support for
	PADDI to load up and/or add 34-bit integer constants.
	(rs6000_rtx_costs): Treat constants loaded up with PADDI with the
	same cost as normal 16-bit constants.
	* config/rs6000/rs6000.md (movdi_internal64): Add support to load
	up 34-bit integer constants with PADDI.
	(movdi integer constant splitter): Add comment about PADDI.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276723)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -5523,7 +5523,7 @@ static int
 num_insns_constant_gpr (HOST_WIDE_INT value)
 {
   /* signed constant loadable with addi */
-  if (((unsigned HOST_WIDE_INT) value + 0x8000) < 0x10000)
+  if (SIGNED_16BIT_OFFSET_P (value))
     return 1;
 
   /* constant loadable with addis */
@@ -5531,6 +5531,10 @@ num_insns_constant_gpr (HOST_WIDE_INT va
 	   && (value >> 31 == -1 || value >> 31 == 0))
     return 1;
 
+  /* PADDI can support up to 34 bit signed integers.  */
+  else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (value))
+    return 1;
+
   else if (TARGET_POWERPC64)
     {
       HOST_WIDE_INT low  = ((value & 0xffffffff) ^ 0x80000000) - 0x80000000;
@@ -20640,7 +20644,8 @@ rs6000_rtx_costs (rtx x, machine_mode mo
 	    || outer_code == PLUS
 	    || outer_code == MINUS)
 	   && (satisfies_constraint_I (x)
-	       || satisfies_constraint_L (x)))
+	       || satisfies_constraint_L (x)
+	       || satisfies_constraint_eI (x)))
 	  || (outer_code == AND
 	      && (satisfies_constraint_K (x)
 		  || (mode == SImode
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 276719)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -8805,24 +8805,24 @@ (define_split
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
 
-;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR #
-;;              FPR store  FPR load   FPR move   AVX store  AVX store   AVX load
-;;              AVX load   VSX move   P9 0       P9 -1      AVX 0/-1    VSX 0
-;;              VSX -1     P9 const   AVX const  From SPR   To SPR      SPR<->SPR
-;;              VSX->GPR   GPR->VSX
+;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR pli
+;;              GPR #      FPR store  FPR load   FPR move   AVX store   AVX store
+;;              AVX load   AVX load   VSX move   P9 0       P9 -1       AVX 0/-1
+;;              VSX 0      VSX -1     P9 const   AVX const  From SPR    To SPR
+;;              SPR<->SPR  VSX->GPR   GPR->VSX
 (define_insn "*movdi_internal64"
   [(set (match_operand:DI 0 "nonimmediate_operand"
                "=YZ,       r,         r,         r,         r,          r,
-                m,         ^d,        ^d,        wY,        Z,          $v,
-                $v,        ^wa,       wa,        wa,        v,          wa,
-                wa,        v,         v,         r,         *h,         *h,
-                ?r,        ?wa")
+                r,         m,         ^d,        ^d,        wY,         Z,
+                $v,        $v,        ^wa,       wa,        wa,         v,
+                wa,        wa,        v,         v,         r,          *h,
+                *h,        ?r,        ?wa")
 	(match_operand:DI 1 "input_operand"
-               "r,         YZ,        r,         I,         L,          nF,
-                ^d,        m,         ^d,        ^v,        $v,         wY,
-                Z,         ^wa,       Oj,        wM,        OjwM,       Oj,
-                wM,        wS,        wB,        *h,        r,          0,
-                wa,        r"))]
+               "r,         YZ,        r,         I,         L,          eI,
+                nF,        ^d,        m,         ^d,        ^v,         $v,
+                wY,        Z,         ^wa,       Oj,        wM,         OjwM,
+                Oj,        wM,        wS,        wB,        *h,         r,
+                0,         wa,        r"))]
   "TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], DImode)
        || gpc_reg_operand (operands[1], DImode))"
@@ -8832,6 +8832,7 @@ (define_insn "*movdi_internal64"
    mr %0,%1
    li %0,%1
    lis %0,%v1
+   li %0,%1
    #
    stfd%U0%X0 %1,%0
    lfd%U1%X1 %0,%1
@@ -8855,26 +8856,28 @@ (define_insn "*movdi_internal64"
    mtvsrd %x0,%1"
   [(set_attr "type"
                "store,      load,	*,         *,         *,         *,
-                fpstore,    fpload,     fpsimple,  fpstore,   fpstore,   fpload,
-                fpload,     veclogical, vecsimple, vecsimple, vecsimple, veclogical,
-                veclogical, vecsimple,  vecsimple, mfjmpr,    mtjmpr,    *,
-                mftgpr,    mffgpr")
+                *,          fpstore,    fpload,    fpsimple,  fpstore,   fpstore,
+                fpload,     fpload,     veclogical,vecsimple, vecsimple, vecsimple,
+                veclogical, veclogical, vecsimple,  vecsimple, mfjmpr,   mtjmpr,
+                *,          mftgpr,    mffgpr")
    (set_attr "size" "64")
    (set_attr "length"
-               "*,         *,         *,         *,         *,          20,
-                *,         *,         *,         *,         *,          *,
+               "*,         *,         *,         *,         *,          *,
+                20,        *,         *,         *,         *,          *,
                 *,         *,         *,         *,         *,          *,
-                *,         8,         *,         *,         *,          *,
-                *,         *")
+                *,         *,         8,         *,         *,          *,
+                *,         *,         *")
    (set_attr "isa"
-               "*,         *,         *,         *,         *,          *,
-                *,         *,         *,         p9v,       p7v,        p9v,
-                p7v,       *,         p9v,       p9v,       p7v,        *,
-                *,         p7v,       p7v,       *,         *,          *,
-                p8v,       p8v")])
+               "*,         *,         *,         *,         *,          fut,
+                *,         *,         *,         *,         p9v,        p7v,
+                p9v,       p7v,       *,         p9v,       p9v,        p7v,
+                *,         *,         p7v,       p7v,       *,          *,
+                *,         p8v,       p8v")])
 
 ; Some DImode loads are best done as a load of -1 followed by a mask
-; instruction.
+; instruction.  On systems that support the PADDI (PLI) instruction,
+; num_insns_constant returns 1, so these splitter would not be used for things
+; that be loaded with PLI.
 (define_split
   [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
 	(match_operand:DI 1 "const_int_operand"))]

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] V5, #9 of 15: Add PADDI (PLI) support to load up 32-bit SImode constants
  2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
                   ` (6 preceding siblings ...)
  2019-10-09 20:36 ` [PATCH] V5, #7 of 15: Restrict vector extract to not use prefixed memory Michael Meissner
@ 2019-10-09 20:39 ` Michael Meissner
  2019-10-09 20:39 ` [PATCH] V5, #8 of 15: Use PADDI/PLI to load up 34-bit DImode constants Michael Meissner
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 28+ messages in thread
From: Michael Meissner @ 2019-10-09 20:39 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch is a companion patch to patch #8, and it allows using the PADDI
(PLI) instruction to load up 32-bit SImode constants rather than generating
separate ADDI + ADDIS instructions.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  Can I check this into
the trunk?

2019-10-08  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.md (movsi_internal1): Add support to load
	up 32-bit SImode integer constants with PADDI.
	(movsi integer constant splitter): Do not split constant if PADDI
	can load it up directly.

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 276724)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -6904,22 +6904,22 @@ (define_insn "movsi_low"
 
 ;;		MR           LA           LWZ          LFIWZX       LXSIWZX
 ;;		STW          STFIWX       STXSIWX      LI           LIS
-;;		#            XXLOR        XXSPLTIB 0   XXSPLTIB -1  VSPLTISW
-;;		XXLXOR 0     XXLORC -1    P9 const     MTVSRWZ      MFVSRWZ
-;;		MF%1         MT%0         NOP
+;;		PLI          #            XXLOR        XXSPLTIB 0   XXSPLTIB -1
+;;		VSPLTISW     XXLXOR 0     XXLORC -1    P9 const     MTVSRWZ
+;;		MFVSRWZ      MF%1         MT%0         NOP
 (define_insn "*movsi_internal1"
   [(set (match_operand:SI 0 "nonimmediate_operand"
 		"=r,         r,           r,           d,           v,
 		 m,          Z,           Z,           r,           r,
-		 r,          wa,          wa,          wa,          v,
-		 wa,         v,           v,           wa,          r,
-		 r,          *h,          *h")
+		 r,          r,           wa,          wa,          wa,
+		 v,          wa,          v,           v,           wa,
+		 r,          r,           *h,          *h")
 	(match_operand:SI 1 "input_operand"
 		"r,          U,           m,           Z,           Z,
 		 r,          d,           v,           I,           L,
-		 n,          wa,          O,           wM,          wB,
-		 O,          wM,          wS,          r,           wa,
-		 *h,         r,           0"))]
+		 eI,         n,           wa,          O,           wM,
+		 wB,         O,           wM,          wS,          r,
+		 wa,         *h,          r,           0"))]
   "gpc_reg_operand (operands[0], SImode)
    || gpc_reg_operand (operands[1], SImode)"
   "@
@@ -6933,6 +6933,7 @@ (define_insn "*movsi_internal1"
    stxsiwx %x1,%y0
    li %0,%1
    lis %0,%v1
+   li %0,%1
    #
    xxlor %x0,%x1,%x1
    xxspltib %x0,0
@@ -6949,21 +6950,21 @@ (define_insn "*movsi_internal1"
   [(set_attr "type"
 		"*,          *,           load,        fpload,      fpload,
 		 store,      fpstore,     fpstore,     *,           *,
-		 *,          veclogical,  vecsimple,   vecsimple,   vecsimple,
-		 veclogical, veclogical,  vecsimple,   mffgpr,      mftgpr,
-		 *,          *,           *")
+		 *,          *,           veclogical,  vecsimple,   vecsimple,
+		 vecsimple,  veclogical,  veclogical,  vecsimple,   mffgpr,
+		 mftgpr,     *,           *,           *")
    (set_attr "length"
 		"*,          *,           *,           *,           *,
 		 *,          *,           *,           *,           *,
-		 8,          *,           *,           *,           *,
-		 *,          *,           8,           *,           *,
-		 *,          *,           *")
+		 *,          8,           *,           *,           *,
+		 *,          *,           *,           8,           *,
+		 *,          *,           *,           *")
    (set_attr "isa"
 		"*,          *,           *,           p8v,         p8v,
 		 *,          p8v,         p8v,         *,           *,
-		 *,          p8v,         p9v,         p9v,         p8v,
-		 p9v,        p8v,         p9v,         p8v,         p8v,
-		 *,          *,           *")])
+		 fut,        *,           p8v,         p9v,         p9v,
+		 p8v,        p9v,         p8v,         p9v,         p8v,
+		 p8v,        *,           *,           *")])
 
 ;; Like movsi, but adjust a SF value to be used in a SI context, i.e.
 ;; (set (reg:SI ...) (subreg:SI (reg:SF ...) 0))
@@ -7108,14 +7109,15 @@ (define_insn "*movsi_from_df"
   "xscvdpsp %x0,%x1"
   [(set_attr "type" "fp")])
 
-;; Split a load of a large constant into the appropriate two-insn
-;; sequence.
+;; Split a load of a large constant into the appropriate two-insn sequence.  On
+;; systems that support PADDI (PLI), we can use PLI to load any 32-bit constant
+;; in one instruction.
 
 (define_split
   [(set (match_operand:SI 0 "gpc_reg_operand")
 	(match_operand:SI 1 "const_int_operand"))]
   "(unsigned HOST_WIDE_INT) (INTVAL (operands[1]) + 0x8000) >= 0x10000
-   && (INTVAL (operands[1]) & 0xffff) != 0"
+   && (INTVAL (operands[1]) & 0xffff) != 0 && !TARGET_PREFIXED_ADDR"
   [(set (match_dup 0)
 	(match_dup 2))
    (set (match_dup 0)

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] V5, #10 of 15: Use PADDI to add large 34-bit constants
  2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
                   ` (8 preceding siblings ...)
  2019-10-09 20:39 ` [PATCH] V5, #8 of 15: Use PADDI/PLI to load up 34-bit DImode constants Michael Meissner
@ 2019-10-09 20:42 ` Michael Meissner
  2019-10-09 20:54 ` [PATCH] V5, #11 of 15: Make -mpcrel default on Linux 64-bit Michael Meissner
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 28+ messages in thread
From: Michael Meissner @ 2019-10-09 20:42 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch enables generating PADDI to add 34-bit constants.

This is the last of the patches in V5 to add the necessary support for the
'future' machine.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  Can I check this into
the trunk?

2019-10-08  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (add_operand): Add support for
	PADDI.
	* config/rs6000/rs6000.md (add<mode>3): Add support for PADDI.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 276718)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -839,7 +839,8 @@ (define_special_predicate "indexed_addre
 (define_predicate "add_operand"
   (if_then_else (match_code "const_int")
     (match_test "satisfies_constraint_I (op)
-		 || satisfies_constraint_L (op)")
+		 || satisfies_constraint_L (op)
+		 || satisfies_constraint_eI (op)")
     (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is either a non-special register, or 0, or -1.
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 276726)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -1756,15 +1756,17 @@ (define_expand "add<mode>3"
 })
 
 (define_insn "*add<mode>3"
-  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
-	(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
-		  (match_operand:GPR 2 "add_operand" "r,I,L")))]
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r,r")
+	(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b,b")
+		  (match_operand:GPR 2 "add_operand" "r,I,L,eI")))]
   ""
   "@
    add %0,%1,%2
    addi %0,%1,%2
-   addis %0,%1,%v2"
-  [(set_attr "type" "add")])
+   addis %0,%1,%v2
+   addi %0,%1,%2"
+  [(set_attr "type" "add")
+   (set_attr "isa" "*,*,*,fut")])
 
 (define_insn "*addsi3_high"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=b")

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] V5, #11 of 15: Make -mpcrel default on Linux 64-bit
  2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
                   ` (9 preceding siblings ...)
  2019-10-09 20:42 ` [PATCH] V5, #10 of 15: Use PADDI to add large 34-bit constants Michael Meissner
@ 2019-10-09 20:54 ` Michael Meissner
  2019-10-10 17:52 ` [PATCH] V5, #12 of 15: Miscellaneous prefixed address tests Michael Meissner
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 28+ messages in thread
From: Michael Meissner @ 2019-10-09 20:54 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch assumes patches #1 through #10 have been applied.  Once all of those
patches have been applied, this patch will change the default for Linux 64-bit
systems to enable -mpcrel and -mprefixed-addr being default when -mcpu=future
is used.  The other OS/bit-size targets (currently) do not enable these
options.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  Can I check this into
the trunk?

2019-10-08  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/linux64.h (TARGET_PREFIXED_ADDR_DEFAULT): Enable
	prefixed addressing by default.
	(TARGET_PCREL_DEFAULT): Enable pc-relative addressing by default.
	* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Only
	enable -mprefixed-addr and -mpcrel if the OS tm.h says to enable
	it.
	(ADDRESSING_FUTURE_MASKS): New mask macro.
	(OTHER_FUTURE_MASKS): Use ADDRESSING_FUTURE_MASKS.
	* config/rs6000/rs6000.c (TARGET_PREFIXED_ADDR_DEFAULT): Do not
	enable -mprefixed-addr unless the OS tm.h says to.
	(TARGET_PCREL_DEFAULT): Do not enable -mpcrel unless the OS tm.h
	says to.
	(rs6000_option_override_internal): Do not enable -mprefixed-addr
	or -mpcrel unless the OS tm.h says to enable it.  Add more checks
	for -mcpu=future.

Index: gcc/config/rs6000/linux64.h
===================================================================
--- gcc/config/rs6000/linux64.h	(revision 276713)
+++ gcc/config/rs6000/linux64.h	(working copy)
@@ -640,3 +640,11 @@ extern int dot_symbols;
    enabling the __float128 keyword.  */
 #undef	TARGET_FLOAT128_ENABLE_TYPE
 #define TARGET_FLOAT128_ENABLE_TYPE 1
+
+/* Enable support for pc-relative and numeric prefixed addressing on the
+   'future' system.  */
+#undef  TARGET_PREFIXED_ADDR_DEFAULT
+#define TARGET_PREFIXED_ADDR_DEFAULT	1
+
+#undef  TARGET_PCREL_DEFAULT
+#define TARGET_PCREL_DEFAULT		1
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(revision 276713)
+++ gcc/config/rs6000/rs6000-cpus.def	(working copy)
@@ -75,15 +75,21 @@
 				 | OPTION_MASK_P8_VECTOR		\
 				 | OPTION_MASK_P9_VECTOR)
 
-/* Support for a future processor's features.  Do not enable -mpcrel until it
-   is fully functional.  */
+/* Support for a future processor's features.  The prefixed and pc-relative
+   addressing bits are not added here.  Instead, rs6000.c adds them if the OS
+   tm.h says that it supports the addressing modes.  */
 #define ISA_FUTURE_MASKS_SERVER	(ISA_3_0_MASKS_SERVER			\
-				 | OPTION_MASK_FUTURE			\
+				 | OPTION_MASK_FUTURE)
+
+/* Addressing related flags on a future processor.  These flags are broken out
+   because not all targets will support either pc-relative addressing, or even
+   prefixed addressing, and we want to clear all of the addressing bits
+   on targets that cannot support prefixed/pcrel addressing.  */
+#define ADDRESSING_FUTURE_MASKS	(OPTION_MASK_PCREL			\
 				 | OPTION_MASK_PREFIXED_ADDR)
 
 /* Flags that need to be turned off if -mno-future.  */
-#define OTHER_FUTURE_MASKS	(OPTION_MASK_PCREL			\
-				 | OPTION_MASK_PREFIXED_ADDR)
+#define OTHER_FUTURE_MASKS	ADDRESSING_FUTURE_MASKS
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
 #define OTHER_P9_VECTOR_MASKS	(OPTION_MASK_FLOAT128_HW		\
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276724)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -98,6 +98,16 @@
 #endif
 #endif
 
+/* Set up the defaults for whether prefixed addressing is used, and if it is
+   used, whether we want to turn on pc-relative support by default.  */
+#ifndef TARGET_PREFIXED_ADDR_DEFAULT
+#define TARGET_PREFIXED_ADDR_DEFAULT	0
+#endif
+
+#ifndef TARGET_PCREL_DEFAULT
+#define TARGET_PCREL_DEFAULT		0
+#endif
+
 /* Support targetm.vectorize.builtin_mask_for_load.  */
 GTY(()) tree altivec_builtin_mask_for_load;
 
@@ -2532,6 +2542,14 @@ rs6000_debug_reg_global (void)
   if (TARGET_DIRECT_MOVE_128)
     fprintf (stderr, DEBUG_FMT_D, "VSX easy 64-bit mfvsrld element",
 	     (int)VECTOR_ELEMENT_MFVSRLD_64BIT);
+
+  if (TARGET_FUTURE)
+    {
+      fprintf (stderr, DEBUG_FMT_D, "TARGET_PREFIXED_ADDR_DEFAULT",
+	       TARGET_PREFIXED_ADDR_DEFAULT);
+      fprintf (stderr, DEBUG_FMT_D, "TARGET_PCREL_DEFAULT",
+	       TARGET_PCREL_DEFAULT);
+    }
 }
 
 \f
@@ -4012,26 +4030,6 @@ rs6000_option_override_internal (bool gl
       rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_HW;
     }
 
-  /* -mprefixed-addr (and hence -mpcrel) requires -mcpu=future.  */
-  if (TARGET_PREFIXED_ADDR && !TARGET_FUTURE)
-    {
-      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mcpu=future");
-      else if ((rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED_ADDR) != 0)
-	error ("%qs requires %qs", "-mprefixed-addr", "-mcpu=future");
-
-      rs6000_isa_flags &= ~(OPTION_MASK_PCREL | OPTION_MASK_PREFIXED_ADDR);
-    }
-
-  /* -mpcrel requires prefixed load/store addressing.  */
-  if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
-    {
-      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
-
-      rs6000_isa_flags &= ~OPTION_MASK_PCREL;
-    }
-
   /* Print the options after updating the defaults.  */
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
     rs6000_print_isa_options (stderr, 0, "after defaults", rs6000_isa_flags);
@@ -4163,12 +4161,89 @@ rs6000_option_override_internal (bool gl
   SUB3TARGET_OVERRIDE_OPTIONS;
 #endif
 
-  /* -mpcrel requires -mcmodel=medium, but we can't check TARGET_CMODEL until
-      after the subtarget override options are done.  */
-  if (TARGET_PCREL && TARGET_CMODEL != CMODEL_MEDIUM)
+  /* Enable prefixed addressing and pc-relative addressing on 64-bit ELF v2
+     systems if the OS tm.h file says that it is supported and the user did not
+     explicitly use -mprefixed-addr or -mpcrel.  At the present time, only
+     64-bit Linux enables this.
+
+     Pc-relative support also requires the medium code model.
+
+     However, we can't check for ELFv2 or -mcmodel=medium until after the
+     subtarget macros are run.
+
+     If prefixed addressing is disabled by default, and the user does -mpcrel,
+     don't force them to also specify -mprefixed-addr.  */
+  if (TARGET_FUTURE)
+    {
+      bool explicit_prefixed = ((rs6000_isa_flags_explicit
+				 & OPTION_MASK_PREFIXED_ADDR) != 0);
+      bool explicit_pcrel = ((rs6000_isa_flags_explicit
+			      & OPTION_MASK_PCREL) != 0);
+
+      /* Prefixed addressing requires 64-bit registers.  */
+      if (!TARGET_POWERPC64)
+	{
+	  if (TARGET_PCREL && explicit_pcrel)
+	    error ("%qs requires %qs", "-mpcrel", "-m64");
+
+	  else if (TARGET_PREFIXED_ADDR && explicit_prefixed)
+	    error ("%qs requires %qs", "-mprefixed-addr", "-m64");
+
+	  rs6000_isa_flags &= ~ADDRESSING_FUTURE_MASKS;
+	}
+
+      /* Only ELFv2 currently supports prefixed/pcrel addressing.  */
+      else if (rs6000_current_abi != ABI_ELFv2)
+	{
+	  if (TARGET_PCREL && explicit_pcrel)
+	    error ("%qs requires %qs", "-mpcrel", "-mabi=elfv2");
+
+	  else if (TARGET_PREFIXED_ADDR && explicit_prefixed)
+	    error ("%qs requires %qs", "-mprefixed-addr", "-mabi=elfv2");
+
+	  rs6000_isa_flags &= ~ADDRESSING_FUTURE_MASKS;
+	}
+
+      /* Pc-relative requires the medium code model.  */
+      else if (TARGET_PCREL && TARGET_CMODEL != CMODEL_MEDIUM)
+	{
+	  if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+	    error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");
+
+	  rs6000_isa_flags &= ~OPTION_MASK_PCREL;
+	}
+
+      /* Enable defaults if desired.  */
+      else
+	{
+	  if (!explicit_prefixed
+	      && (TARGET_PREFIXED_ADDR_DEFAULT
+		  || TARGET_PCREL
+		  || TARGET_PCREL_DEFAULT))
+	    rs6000_isa_flags |= OPTION_MASK_PREFIXED_ADDR;
+
+	  if (!explicit_pcrel && TARGET_PCREL_DEFAULT
+	      && TARGET_CMODEL == CMODEL_MEDIUM)
+	    rs6000_isa_flags |= OPTION_MASK_PCREL;
+	}
+    }
+
+  /* -mprefixed-addr (and hence -mpcrel) requires -mcpu=future.  */
+  if (TARGET_PREFIXED_ADDR && !TARGET_FUTURE)
     {
       if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");
+	error ("%qs requires %qs", "-mpcrel", "-mcpu=future");
+      else if ((rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED_ADDR) != 0)
+	error ("%qs requires %qs", "-mprefixed-addr", "-mcpu=future");
+
+      rs6000_isa_flags &= ~(OPTION_MASK_PCREL | OPTION_MASK_PREFIXED_ADDR);
+    }
+
+  /* -mpcrel requires prefixed load/store addressing.  */
+  if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
+    {
+      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+	error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
 
       rs6000_isa_flags &= ~OPTION_MASK_PCREL;
     }

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] V5, #12 of 15: Miscellaneous prefixed address tests
  2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
                   ` (10 preceding siblings ...)
  2019-10-09 20:54 ` [PATCH] V5, #11 of 15: Make -mpcrel default on Linux 64-bit Michael Meissner
@ 2019-10-10 17:52 ` Michael Meissner
  2019-10-10 17:54 ` [PATCH] V5, #14 of 15: PC-relative tests Michael Meissner
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 28+ messages in thread
From: Michael Meissner @ 2019-10-10 17:52 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch is the first of 3 patches to the test framework for the prefixed
(and PC-relative) address support.  This patch adds some miscellaneous tests.
It also adds new target supports for prefixed addressing and PC-relative
addressing.

With patches #1-11 installed, these tests all pass.  Can I install them into
the trunck once pages 1-11 are committed?

2019-10-08  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/paddi-1.c: New test for loading up 34-bit
	DImode constants with PADDI.
	* gcc.target/powerpc/paddi-2.c: New test for loading up 32-bit
	SImode constants with PADDI.
	* /gcc.target/powerpc/paddi-3.c: New test for using PADDI to add
	34-bit constants.
	* gcc.target/powerpc/prefix-odd-memory.c: New test to make sure
	prefixed instructions are generated if an offset would not be
	legal for the non-prefixed DS/DQ instructions.
	* gcc.target/powerpc/prefix-premodify.c: New test to make sure we
	do not generate PRE_INC, PRE_DEC, or PRE_MODIFY on prefixed loads
	or stores.
	* lib/target-supports.exp
	(check_effective_target_powerpc_future_ok): Do not require 64-bit
	or Linux support before doing the test.  Use a 32-bit constant in
	PLI.
	(check_effective_target_powerpc_prefixed_addr_ok): New effective
	target test to see if prefixed memory instructions are supported.
	(check_effective_target_powerpc_pcrel_ok): New effective target
	test to test whether PC-relative addressing is supported.
	(is-effective-target): Add test for the PowerPC 'future' hardware
	support.

Index: gcc/testsuite/gcc.target/powerpc/paddi-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/paddi-1.c	(revision 276774)
+++ gcc/testsuite/gcc.target/powerpc/paddi-1.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PADDI is generated to add a large constant.  */
+unsigned long
+add (unsigned long a)
+{
+  return a + 0x12345678UL;
+}
+
+/* { dg-final { scan-assembler {\mpaddi\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/paddi-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/paddi-2.c	(revision 276774)
+++ gcc/testsuite/gcc.target/powerpc/paddi-2.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant.  */
+unsigned long
+large (void)
+{
+  return 0x12345678UL;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/paddi-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/paddi-3.c	(revision 276774)
+++ gcc/testsuite/gcc.target/powerpc/paddi-3.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant for SImode.  */
+void
+large_si (unsigned int *p)
+{
+  *p = 0x12345U;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-odd-memory.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-odd-memory.c	(revision 276774)
+++ gcc/testsuite/gcc.target/powerpc/prefix-odd-memory.c	(working copy)
@@ -0,0 +1,156 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests whether we can generate a prefixed load/store operation for addresses
+   that don't meet DS/DQ alignment constraints.  */
+
+unsigned long
+load_uc_odd (unsigned char *p)
+{
+  return p[1];				/* should generate LBZ.  */
+}
+
+long
+load_sc_odd (signed char *p)
+{
+  return p[1];				/* should generate LBZ + EXTSB.  */
+}
+
+unsigned long
+load_us_odd (unsigned char *p)
+{
+  return *(unsigned short *)(p + 1);	/* should generate LHZ.  */
+}
+
+long
+load_ss_odd (unsigned char *p)
+{
+  return *(short *)(p + 1);		/* should generate LHA.  */
+}
+
+unsigned long
+load_ui_odd (unsigned char *p)
+{
+  return *(unsigned int *)(p + 1);	/* should generate LWZ.  */
+}
+
+long
+load_si_odd (unsigned char *p)
+{
+  return *(int *)(p + 1);		/* should generate PLWA.  */
+}
+
+unsigned long
+load_ul_odd (unsigned char *p)
+{
+  return *(unsigned long *)(p + 1);	/* should generate PLD.  */
+}
+
+long
+load_sl_odd (unsigned char *p)
+{
+  return *(long *)(p + 1);	/* should generate PLD.  */
+}
+
+float
+load_float_odd (unsigned char *p)
+{
+  return *(float *)(p + 1);		/* should generate LFS.  */
+}
+
+double
+load_double_odd (unsigned char *p)
+{
+  return *(double *)(p + 1);		/* should generate LFD.  */
+}
+
+__ieee128
+load_ieee128_odd (unsigned char *p)
+{
+  return *(__ieee128 *)(p + 1);		/* should generate PLXV.  */
+}
+
+void
+store_uc_odd (unsigned char uc, unsigned char *p)
+{
+  p[1] = uc;				/* should generate STB.  */
+}
+
+void
+store_sc_odd (signed char sc, signed char *p)
+{
+  p[1] = sc;				/* should generate STB.  */
+}
+
+void
+store_us_odd (unsigned short us, unsigned char *p)
+{
+  *(unsigned short *)(p + 1) = us;	/* should generate STH.  */
+}
+
+void
+store_ss_odd (signed short ss, unsigned char *p)
+{
+  *(signed short *)(p + 1) = ss;	/* should generate STH.  */
+}
+
+void
+store_ui_odd (unsigned int ui, unsigned char *p)
+{
+  *(unsigned int *)(p + 1) = ui;	/* should generate STW.  */
+}
+
+void
+store_si_odd (signed int si, unsigned char *p)
+{
+  *(signed int *)(p + 1) = si;		/* should generate STW.  */
+}
+
+void
+store_ul_odd (unsigned long ul, unsigned char *p)
+{
+  *(unsigned long *)(p + 1) = ul;	/* should generate PSTD.  */
+}
+
+void
+store_sl_odd (signed long sl, unsigned char *p)
+{
+  *(signed long *)(p + 1) = sl;		/* should generate PSTD.  */
+}
+
+void
+store_float_odd (float f, unsigned char *p)
+{
+  *(float *)(p + 1) = f;		/* should generate STF.  */
+}
+
+void
+store_double_odd (double d, unsigned char *p)
+{
+  *(double *)(p + 1) = d;		/* should generate STD.  */
+}
+
+void
+store_ieee128_odd (__ieee128 ieee, unsigned char *p)
+{
+  *(__ieee128 *)(p + 1) = ieee;		/* should generate PSTXV.  */
+}
+
+/* { dg-final { scan-assembler-times {\mextsb\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mlbz\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mlfd\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlfs\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlha\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlhz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlwz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mpld\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mplwa\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mplxv\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstb\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mstfd\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mstfs\M}  1 } } */
+/* { dg-final { scan-assembler-times {\msth\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mstw\M}   2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-premodify.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-premodify.c	(revision 276774)
+++ gcc/testsuite/gcc.target/powerpc/prefix-premodify.c	(working copy)
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Make sure that we don't try to generate a prefixed form of the load and
+   store with update instructions.  */
+
+#ifndef SIZE
+#define SIZE 50000
+#endif
+
+struct foo {
+  unsigned int field;
+  char pad[SIZE];
+};
+
+struct foo *inc_load (struct foo *p, unsigned int *q)
+{
+  *q = (++p)->field;
+  return p;
+}
+
+struct foo *dec_load (struct foo *p, unsigned int *q)
+{
+  *q = (--p)->field;
+  return p;
+}
+
+struct foo *inc_store (struct foo *p, unsigned int *q)
+{
+  (++p)->field = *q;
+  return p;
+}
+
+struct foo *dec_store (struct foo *p, unsigned int *q)
+{
+  (--p)->field = *q;
+  return p;
+}
+
+/* { dg-final { scan-assembler-times {\mpli\M|\mpla\M|\mpaddi\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mplwz\M}                  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}                  2 } } */
+/* { dg-final { scan-assembler-not   {\mp?lwzu\M}                  } } */
+/* { dg-final { scan-assembler-not   {\mp?stwzu\M}                 } } */
+/* { dg-final { scan-assembler-not   {\maddis\M}                   } } */
+/* { dg-final { scan-assembler-not   {\maddi\M}                    } } */
Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	(revision 276759)
+++ gcc/testsuite/lib/target-supports.exp	(working copy)
@@ -5307,16 +5307,14 @@ proc check_effective_target_powerpc_p9mo
     }
 }
 
-# Return 1 if this is a PowerPC target supporting -mfuture.
-# Limit this to 64-bit linux systems for now until other
-# targets support FUTURE.
+# Return 1 if this is a PowerPC target supporting -mcpu=future.
 
 proc check_effective_target_powerpc_future_ok { } {
-    if { ([istarget powerpc64*-*-linux*]) } {
+    if { ([istarget powerpc*-*-*]) } {
 	return [check_no_compiler_messages powerpc_future_ok object {
 	    int main (void) {
 		long e;
-		asm ("pli %0,%1" : "=r" (e) : "n" (0x12345));
+		asm ("pli %0,%1" : "=r" (e) : "n" (0x1234));
 		return e;
 	    }
 	} "-mfuture"]
@@ -5325,6 +5323,46 @@ proc check_effective_target_powerpc_futu
     }
 }
 
+# Return 1 if this is a PowerPC target supporting -mcpu=future.  The compiler
+# must support large numeric prefixed addresses by default when -mfuture is
+# used.  We test loading up a large constant to verify that the full 34-bit
+# offset for prefixed instructions is supported and we check for a prefixed
+# load as well.
+
+proc check_effective_target_powerpc_prefixed_addr_ok { } {
+    if { ([istarget powerpc*-*-*]) } {
+	return [check_no_compiler_messages powerpc_prefixed_addr_ok object {
+	    int main (void) {
+		extern long l[];
+		long e, e2;
+		asm ("pli %0,%1" : "=r" (e) : "n" (0x12345678));
+		asm ("pld %0,0x12345678(%1)" : "=r" (e2) : "r" (& l[0]));
+		return e - e2;
+	    }
+	} "-mfuture"]
+    } else {
+	return 0
+    }
+}
+
+# Return 1 if this is a PowerPC target supporting -mfuture.  The compiler must
+# support PC-relative addressing when -mcpu=future is used to pass this test.
+
+proc check_effective_target_powerpc_pcrel_ok { } {
+    if { ([istarget powerpc*-*-*]) } {
+	return [check_no_compiler_messages powerpc_pcrel_ok object {
+	      int main (void) {
+		  static int s __attribute__((__used__));
+		  int e;
+		  asm ("plwa %0,s@pcrel(0),1" : "=r" (e));
+		  return e;
+	      }
+	  } "-mfuture"]
+      } else {
+	  return 0
+      }
+}
+
 # Return 1 if this is a PowerPC target supporting -mfloat128 via either
 # software emulation on power7/power8 systems or hardware support on power9.
 
@@ -7200,6 +7238,7 @@ proc is-effective-target { arg } {
 	  "named_sections" { set selected [check_named_sections_available] }
 	  "gc_sections"    { set selected [check_gc_sections_available] }
 	  "cxa_atexit"     { set selected [check_cxa_atexit_available] }
+	  "powerpc_future_hw" { set selected [check_powerpc_future_hw_available] }
 	  default          { error "unknown effective target keyword `$arg'" }
 	}
     }

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] V5, #13 of 15: Prefixed address tests with large offsets
  2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
                   ` (12 preceding siblings ...)
  2019-10-10 17:54 ` [PATCH] V5, #14 of 15: PC-relative tests Michael Meissner
@ 2019-10-10 17:54 ` Michael Meissner
  2019-10-10 18:06 ` [PATCH] V5, #15 of 15: Add (0),1 to @pcrel syntax Michael Meissner
  14 siblings, 0 replies; 28+ messages in thread
From: Michael Meissner @ 2019-10-10 17:54 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

These tests go through each of the types to make sure the number of expected
prefixed instructions are generated.


With patches #1-12 installed, these tests all pass.  Can I install them into
the trunck once pages 1-11 are committed?

2019-10-03  Michael Meissner  <meissner@linux.ibm.com>

	* gcc/testsuite/gcc.target/powerpc/prefix-large.h: New set of
	tests to test prefixed addressing on 'future' system with large
	numeric offsets.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-df.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-di.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-si.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c: New test.

Index: gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE _Decimal64
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-df.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-df.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE double
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-di.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-di.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-di.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE long
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE short
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplh[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE __float128
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE signed char
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c	(working copy)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE _Decimal32
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mpaddi\M|\mpli|\mpla\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mlfiwzx\M}              2 } } */
+/* { dg-final { scan-assembler-times {\mstfiwx\M}              2 } } */
+
+
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE float
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfs\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfs\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-si.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-si.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-si.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE int
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplw[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE unsigned long
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE unsigned short
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplhz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE unsigned char
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE unsigned int
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplwz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE vector double
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large.h
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large.h	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large.h	(working copy)
@@ -0,0 +1,59 @@
+/* Common tests for prefixed instructions testing whether we can generate a
+   34-bit offset using 1 instruction.  */
+
+typedef signed char	schar;
+typedef unsigned char	uchar;
+typedef unsigned short	ushort;
+typedef unsigned int	uint;
+typedef unsigned long	ulong;
+typedef long double	ldouble;
+typedef vector double	v2df;
+typedef vector long	v2di;
+typedef vector float	v4sf;
+typedef vector int	v4si;
+
+#ifndef TYPE
+#define TYPE ulong
+#endif
+
+#ifndef ITYPE
+#define ITYPE TYPE
+#endif
+
+#ifndef OTYPE
+#define OTYPE TYPE
+#endif
+
+#if !defined(DO_ADD) && !defined(DO_VALUE) && !defined(DO_SET)
+#define DO_ADD		1
+#define DO_VALUE	1
+#define DO_SET		1
+#endif
+
+#ifndef CONSTANT
+#define CONSTANT	0x123450UL
+#endif
+
+#if DO_ADD
+void
+add (TYPE *p, TYPE a)
+{
+  p[CONSTANT] += a;
+}
+#endif
+
+#if DO_VALUE
+OTYPE
+value (TYPE *p)
+{
+  return p[CONSTANT];
+}
+#endif
+
+#if DO_SET
+void
+set (TYPE *p, ITYPE a)
+{
+  p[CONSTANT] = a;
+}
+#endif

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] V5, #14 of 15: PC-relative tests
  2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
                   ` (11 preceding siblings ...)
  2019-10-10 17:52 ` [PATCH] V5, #12 of 15: Miscellaneous prefixed address tests Michael Meissner
@ 2019-10-10 17:54 ` Michael Meissner
  2019-10-10 17:54 ` [PATCH] V5, #13 of 15: Prefixed address tests with large offsets Michael Meissner
  2019-10-10 18:06 ` [PATCH] V5, #15 of 15: Add (0),1 to @pcrel syntax Michael Meissner
  14 siblings, 0 replies; 28+ messages in thread
From: Michael Meissner @ 2019-10-10 17:54 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

These tests add tests for PC-relative addressing with the various types.

With patches #1-12 installed, these tests all pass.  Can I install them into
the trunk once pages 1-11 are committed?

2019-10-08  Michael Meissner  <meissner@linux.ibm.com>

	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h: New set of
	tests to test prefixed addressing on 'future' system with
	PC-relative tests.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c: New test.

Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for SImode.  */
+
+#define TYPE _Decimal64
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for DFmode.  */
+
+#define TYPE double
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for DImode.  */
+
+#define TYPE long
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for HImode.  */
+
+#define TYPE short
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplh[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for KFmode.  */
+
+#define TYPE __float128
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for QImode.  */
+
+#define TYPE signed char
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for SImode.  */
+
+#define TYPE _Decimal32
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mpaddi|\mpla\M} 3 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for SFmode.  */
+
+#define TYPE float
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplfs\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfs\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for SImode.  */
+
+#define TYPE int
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplw[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for unsigned DImode.  */
+
+#define TYPE unsigned long
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for unsigned HImode.  */
+
+#define TYPE unsigned short
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplhz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for unsigned QImode.  */
+
+#define TYPE unsigned char
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for unsigned SImode.  */
+
+#define TYPE unsigned int
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplwz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for V2DFmode.  */
+
+#define TYPE vector double
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h	(working copy)
@@ -0,0 +1,58 @@
+/* Common tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for each type.  */
+
+typedef signed char	schar;
+typedef unsigned char	uchar;
+typedef unsigned short	ushort;
+typedef unsigned int	uint;
+typedef unsigned long	ulong;
+typedef long double	ldouble;
+typedef vector double	v2df;
+typedef vector long	v2di;
+typedef vector float	v4sf;
+typedef vector int	v4si;
+
+#ifndef TYPE
+#define TYPE ulong
+#endif
+
+#ifndef ITYPE
+#define ITYPE TYPE
+#endif
+
+#ifndef OTYPE
+#define OTYPE TYPE
+#endif
+
+static TYPE a;
+TYPE *p = &a;
+
+#if !defined(DO_ADD) && !defined(DO_VALUE) && !defined(DO_SET)
+#define DO_ADD		1
+#define DO_VALUE	1
+#define DO_SET		1
+#endif
+
+#if DO_ADD
+void
+add (TYPE b)
+{
+  a += b;
+}
+#endif
+
+#if DO_VALUE
+OTYPE
+value (void)
+{
+  return (OTYPE)a;
+}
+#endif
+
+#if DO_SET
+void
+set (ITYPE b)
+{
+  a = (TYPE)b;
+}
+#endif

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] V5, #15 of 15: Add (0),1 to @pcrel syntax
  2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
                   ` (13 preceding siblings ...)
  2019-10-10 17:54 ` [PATCH] V5, #13 of 15: Prefixed address tests with large offsets Michael Meissner
@ 2019-10-10 18:06 ` Michael Meissner
  14 siblings, 0 replies; 28+ messages in thread
From: Michael Meissner @ 2019-10-10 18:06 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

In the code gaffe I had with the V4 patches, where I mistakenly did:

	PADDI rt,ra,label@pcrel

(which the machine does not support), it was suggested that perhaps I should
add an explicit (0),1 to each @pcrel relocation, so that the assembler will
flag any usage where you try to combine an index register and a PC-relative
offset.

With this patch applied, I bootstrapped the compiler on a power8 little endian
machine running Linux, and there were no regressions in the test suite.  I also
built both Spec 2017 rate and Spec 2006 cpu benchmarks with the compiler, and
there were no regressions in the benchmarks that built previously (Spec 2017
parest_r fails in gimple, both with the original compiler and the patched
version of the compiler).

2019-10-09  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (print_operand_address): Add (0),1 to
	@pcrel to catch errant usage.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 276758)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -13271,7 +13271,10 @@ print_operand_address (FILE *file, rtx x
       if (SYMBOL_REF_P (x) && !SYMBOL_REF_LOCAL_P (x))
 	fprintf (file, "@got");

-      fprintf (file, "@pcrel");
+      /* Specifically add (0),1 to catch uses where a @pcrel was added to a an
+	 address with a base register, since the hardware does not support
+	 adding a base register to a PC-relative address.  */
+      fprintf (file, "@pcrel(0),1");
     }
   else if (SYMBOL_REF_P (x) || GET_CODE (x) == CONST
 	   || GET_CODE (x) == LABEL_REF)

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] V5, #1 of 15: Support 34-bit offsets for prefixed instructions
  2019-10-09 19:56 ` [PATCH] V5, #1 of 15: Support 34-bit offsets for prefixed instructions Michael Meissner
@ 2019-10-10 22:02   ` Segher Boessenkool
  0 siblings, 0 replies; 28+ messages in thread
From: Segher Boessenkool @ 2019-10-10 22:02 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

On Wed, Oct 09, 2019 at 03:52:54PM -0400, Michael Meissner wrote:
> This patch adds support in the various functions that check memory offsets for
> the 34-bit offset with prefixed instructions on the 'future' machine.

> 2019-10-08  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/rs6000.c (quad_address_p): Add check for prefixed
> 	addresses.
> 	(mem_operand_gpr): Add check for prefixed addresses.
> 	(mem_operand_ds_form): Add check for prefixed addresses.
> 	(rs6000_legitimate_offset_address_p): If we support prefixed
> 	addresses, check for a 34-bit offset instead of 16-bit.
> 	(rs6000_legitimate_address_p): Add check for prefixed addresses.
> 	Do not allow load/store with update if the address is prefixed.
> 	(rs6000_mode_dependent_address):  If we support prefixed
> 	addresses, check for a 34-bit offset instead of 16-bit.

s/  If/ If/

Okay for trunk.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] V5, #2 of 15: Fix prefixed instruction length for some 128-bit types
  2019-10-09 20:03 ` [PATCH] V5, #2 of 15: Fix prefixed instruction length for some 128-bit types Michael Meissner
@ 2019-10-10 22:18   ` Segher Boessenkool
  2019-10-14 21:14     ` Michael Meissner
  0 siblings, 1 reply; 28+ messages in thread
From: Segher Boessenkool @ 2019-10-10 22:18 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

On Wed, Oct 09, 2019 at 03:56:01PM -0400, Michael Meissner wrote:
> This patch fixes the prefixed and non-prefixed instruction sizes for the
> 128-bit types that aren't loaded into 128-bit vectors (TDmode, TFmode, IFmode,
> PTImode).

> 2019-10-08  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/rs6000.md (mov<mode>_64bit_dm): Set prefixed and
> 	non-prefixed length.
> 	(movtd_64bit_nodm): Set prefixed and non-prefixed length.
> 	(mov<mode>_ppc64): Set prefixed and non-prefixed length.

Please also note the patterns you reformatted.  (Just "Reformat." is
enough of course).

>  (define_insn_and_split "*movtd_64bit_nodm"
>    [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
> @@ -7786,8 +7790,12 @@ (define_insn_and_split "*movtd_64bit_nod
>    "#"
>    "&& reload_completed"
>    [(pc)]
> -{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
> -  [(set_attr "length" "8,8,8,12,12,8")])
> +{
> +  rs6000_split_multireg_move (operands[0], operands[1]);
> +  DONE;
> +}
> +  [(set_attr "non_prefixed_length" "8")
> +   (set_attr "prefixed_length" "20")])

It used to be 8,8,8,12,12,8 before.  Was that in error?  Please explain.


Segher

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] V5, #3 of 15: Deal with prefixed instructions in rs6000_insn_cost
  2019-10-09 20:07 ` [PATCH] V5, #3 of 15: Deal with prefixed instructions in rs6000_insn_cost Michael Meissner
@ 2019-10-11 21:17   ` Segher Boessenkool
  0 siblings, 0 replies; 28+ messages in thread
From: Segher Boessenkool @ 2019-10-11 21:17 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

Hi!

On Wed, Oct 09, 2019 at 04:03:16PM -0400, Michael Meissner wrote:
> The basic problem is if there is no explicit cost predicate, rs6000_insn_cost
> uses the instruction size to figure out how many instructions are present, and
> make the cost a fact on that.  Since prefixed instructions are 12 bytes within
> GCC (to deal with the implicit NOP), if we did not do this change, the
> optimizers would try to save registers from prefixed loads because they thought
> the load was more expensive.

Maybe we should just have an attribute that says how many insns this is?
You can get rid of many prefixed_length and non_prefixed_length attributes
that way, too.

> +  int cost;
> +  int length = get_attr_length (insn);
> +  int n = length / 4;
> +
> +  /* How many real instructions are generated for this insn?  This is slightly

What is a "real" instruction?  Machine instruction?

> +     different from the length attribute, in that the length attribute counts
> +     the number of bytes.  With prefixed instructions, we don't want to count a
> +     prefixed instruction (length 12 bytes including possible NOP) as taking 3
> +     instructions, but just one.  */
> +  if (length >= 12 && get_attr_prefixed (insn) == PREFIXED_YES)
> +    {
> +      /* Single prefixed instruction.  */
> +      if (length == 12)
> +	n = 1;
> +
> +      /* A normal instruction and a prefixed instruction (16) or two back
> +	 to back prefixed instructions (20).  */
> +      else if (length == 16 || length == 20)
> +	n = 2;
> +
> +      /* Guess for larger instruction sizes.  */
> +      else
> +	n = 2 + (length - 20) / 4;

That's a pretty bad estimate.

Can you look at non_prefixed_size, will that help?


Segher

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] V5, #4 of 15: Update predicates
  2019-10-09 20:11 ` [PATCH] V5, #4 of 15: Update predicates Michael Meissner
@ 2019-10-11 21:43   ` Segher Boessenkool
  2019-10-14 21:18     ` Michael Meissner
  0 siblings, 1 reply; 28+ messages in thread
From: Segher Boessenkool @ 2019-10-11 21:43 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

On Wed, Oct 09, 2019 at 04:07:50PM -0400, Michael Meissner wrote:
> It also adds two new predicates that disallow prefixed memory that will used in
> patches #5 and #7.

Then you should have really introduced them in *those* patches.

> 2019-10-08  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/predicates.md (lwa_operand): Allow using PLWA to
> 	generate sign extend with odd offsets.

I don't understand what this means.  "Odd offsets" isn't correct, in any
case?

> +  /* The LWA instruction uses the DS-form format where the bottom two bits of
> +     the offset must be 0.  The prefixed PLWA does not have this
> +     restriction.  */
> +  if (address_is_prefixed (addr, DImode, NON_PREFIXED_DS))
> +    return true;

DImode?

> +;; Return 1 if op is a memory operand that is not prefixed.
> +(define_predicate "non_prefixed_memory"
> +  (match_code "mem")
> +{
> +  if (!memory_operand (op, mode))
> +    return false;
> +
> +  return !address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
> +})

This one is fine.

> +;; Return 1 if op is either a register operand or a memory operand that does
> +;; not use a prefixed address.
> +(define_predicate "reg_or_non_prefixed_memory"
> +  (match_code "reg,subreg,mem")
> +{
> +  return (gpc_reg_operand (op, mode) || non_prefixed_memory (op, mode));
> +})

This never allows subreg.


Segher

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] V5, #5 of 15: Support -fstack-protect and large stack frames
  2019-10-09 20:26 ` [PATCH] V5, #5 of 15: Support -fstack-protect and large stack frames Michael Meissner
@ 2019-10-11 21:48   ` Segher Boessenkool
  2019-10-14 21:35     ` Michael Meissner
  0 siblings, 1 reply; 28+ messages in thread
From: Segher Boessenkool @ 2019-10-11 21:48 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

On Wed, Oct 09, 2019 at 04:22:06PM -0400, Michael Meissner wrote:
> This patch allows -fstack-protect to work with large stack frames.

I don't understand; please explain.

What I see is a workaround for not properly handling prefixed addresses
in the stack protect code (by forcing the addresses to not be prefixed at
expand time).

> +rtx
> +make_memory_non_prefixed (rtx mem)
> +{
> +  gcc_assert (MEM_P (mem));
> +
> +  rtx old_addr = XEXP (mem, 0);
> +  if (address_is_prefixed (old_addr, GET_MODE (mem), NON_PREFIXED_DEFAULT))
> +    {
> +      rtx new_addr;
> +
> +      if (GET_CODE (old_addr) == PLUS
> +	  && (REG_P (XEXP (old_addr, 0)) || SUBREG_P (XEXP (old_addr, 0)))

How can you ever have a subreg in an address?  One in Pmode even?

> +	  && CONST_INT_P (XEXP (old_addr, 1)))
> +	{
> +	  rtx tmp_reg = force_reg (Pmode, XEXP (old_addr, 1));
> +	  new_addr = gen_rtx_PLUS (Pmode, XEXP (old_addr, 0), tmp_reg);
> +	}
> +      else
> +	new_addr = force_reg (Pmode, old_addr);
> +
> +      mem = change_address (mem, VOIDmode, new_addr);

replace_equiv_address ?

> +(define_expand "stack_protect_setdi"
> +  [(parallel [(set (match_operand:DI 0 "memory_operand")
> +		   (unspec:DI [(match_operand:DI 1 "memory_operand")]
> +		   UNSPEC_SP_SET))
> +	      (set (match_scratch:DI 2)
> +		   (const_int 0))])]
> +  "TARGET_64BIT"
> +{
> +  if (TARGET_PREFIXED_ADDR)
> +    {
> +      operands[0] = make_memory_non_prefixed (operands[0]);
> +      operands[1] = make_memory_non_prefixed (operands[1]);
> +    }
> +})

It shouldn't be terribly hard to make the define_insns just *work* with
prefixed insns, instead?  Is there any reason we are sure these memory
references will not turn into something that needs prefixed insns, after
expand?  It seems fragile like this.

> +@item em
> +A memory operand that does not contain a prefixed address.

"A memory operand that does not require a prefixed instruction"?


Segher

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] V5, #6 of 15: Make vector load/store instruction length correct with prefixed addresses
  2019-10-09 20:26 ` [PATCH] V5, #6 of 15: Make vector load/store instruction length correct with prefixed addresses Michael Meissner
@ 2019-10-11 21:58   ` Segher Boessenkool
  2019-10-14 21:31     ` Michael Meissner
  0 siblings, 1 reply; 28+ messages in thread
From: Segher Boessenkool @ 2019-10-11 21:58 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

Hi!

On Wed, Oct 09, 2019 at 04:26:20PM -0400, Michael Meissner wrote:
> --- gcc/config/rs6000/vsx.md	(revision 276713)
> +++ gcc/config/rs6000/vsx.md	(working copy)
> @@ -1149,10 +1149,14 @@ (define_insn "vsx_mov<mode>_64bit"
>                 "vecstore,  vecload,   vecsimple, mffgpr,    mftgpr,    load,
>                  store,     load,      store,     *,         vecsimple, vecsimple,
>                  vecsimple, *,         *,         vecstore,  vecload")
> -   (set_attr "length"
> +   (set_attr "non_prefixed_length"
>                 "*,         *,         *,         8,         *,         8,
>                  8,         8,         8,         8,         *,         *,
>                  *,         20,        8,         *,         *")
> +   (set_attr "prefixed_length"
> +               "*,         *,         *,         8,         *,         20,
> +                20,        20,        20,        8,         *,         *,
> +                *,         20,        8,         *,         *")

Alternative 13 has non_prefixed_length 20, I wonder what insns that
generates?

Other than that, looks good afaics.


Segher

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] V5, #2 of 15: Fix prefixed instruction length for some 128-bit types
  2019-10-10 22:18   ` Segher Boessenkool
@ 2019-10-14 21:14     ` Michael Meissner
  2019-10-14 22:24       ` Segher Boessenkool
  0 siblings, 1 reply; 28+ messages in thread
From: Michael Meissner @ 2019-10-14 21:14 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Thu, Oct 10, 2019 at 05:02:09PM -0500, Segher Boessenkool wrote:
> On Wed, Oct 09, 2019 at 03:56:01PM -0400, Michael Meissner wrote:
> > This patch fixes the prefixed and non-prefixed instruction sizes for the
> > 128-bit types that aren't loaded into 128-bit vectors (TDmode, TFmode, IFmode,
> > PTImode).
> 
> > 2019-10-08  Michael Meissner  <meissner@linux.ibm.com>
> > 
> > 	* config/rs6000/rs6000.md (mov<mode>_64bit_dm): Set prefixed and
> > 	non-prefixed length.
> > 	(movtd_64bit_nodm): Set prefixed and non-prefixed length.
> > 	(mov<mode>_ppc64): Set prefixed and non-prefixed length.
> 
> Please also note the patterns you reformatted.  (Just "Reformat." is
> enough of course).

Ok.

> >  (define_insn_and_split "*movtd_64bit_nodm"
> >    [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
> > @@ -7786,8 +7790,12 @@ (define_insn_and_split "*movtd_64bit_nod
> >    "#"
> >    "&& reload_completed"
> >    [(pc)]
> > -{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
> > -  [(set_attr "length" "8,8,8,12,12,8")])
> > +{
> > +  rs6000_split_multireg_move (operands[0], operands[1]);
> > +  DONE;
> > +}
> > +  [(set_attr "non_prefixed_length" "8")
> > +   (set_attr "prefixed_length" "20")])
> 
> It used to be 8,8,8,12,12,8 before.  Was that in error?  Please explain.

We've had this discussion before, but I didn't update the ChangeLog.

The two cases for 12 bytes (i.e. 3 insns) are r->Y and Y->r constaints.  Y
matches DS offsets (i.e. bottom 2 bits non-zero) for traditional instructions.
In looking at rs6000_split_multireg_move, the only way I can see that 3
instructions would be generated would be if pre_modify/pre_update/etc. were
generated.  But we don't allow pre_modify on larger types.

But if you think there might be a case where it does generate 3 instructions, I
can modify it to use 8,8,8,12,12,8 for the non-prefixed case, and
20,20,20,24,24,20 for the prefixed case.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] V5, #4 of 15: Update predicates
  2019-10-11 21:43   ` Segher Boessenkool
@ 2019-10-14 21:18     ` Michael Meissner
  2019-10-14 23:12       ` Segher Boessenkool
  0 siblings, 1 reply; 28+ messages in thread
From: Michael Meissner @ 2019-10-14 21:18 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Fri, Oct 11, 2019 at 04:17:02PM -0500, Segher Boessenkool wrote:
> On Wed, Oct 09, 2019 at 04:07:50PM -0400, Michael Meissner wrote:
> > It also adds two new predicates that disallow prefixed memory that will used in
> > patches #5 and #7.
> 
> Then you should have really introduced them in *those* patches.

I was trying to break it into smaller pieces.

> > 2019-10-08  Michael Meissner  <meissner@linux.ibm.com>
> > 
> > 	* config/rs6000/predicates.md (lwa_operand): Allow using PLWA to
> > 	generate sign extend with odd offsets.
> 
> I don't understand what this means.  "Odd offsets" isn't correct, in any
> case?

Non-zero in the bottom 2/4 bits.  I'll try to come up with different words.

> > +  /* The LWA instruction uses the DS-form format where the bottom two bits of
> > +     the offset must be 0.  The prefixed PLWA does not have this
> > +     restriction.  */
> > +  if (address_is_prefixed (addr, DImode, NON_PREFIXED_DS))
> > +    return true;
> 
> DImode?

Yes, because LWA converts SImode to DImode.

> > +;; Return 1 if op is a memory operand that is not prefixed.
> > +(define_predicate "non_prefixed_memory"
> > +  (match_code "mem")
> > +{
> > +  if (!memory_operand (op, mode))
> > +    return false;
> > +
> > +  return !address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
> > +})
> 
> This one is fine.
> 
> > +;; Return 1 if op is either a register operand or a memory operand that does
> > +;; not use a prefixed address.
> > +(define_predicate "reg_or_non_prefixed_memory"
> > +  (match_code "reg,subreg,mem")
> > +{
> > +  return (gpc_reg_operand (op, mode) || non_prefixed_memory (op, mode));
> > +})
> 
> This never allows subreg.

Gpc_reg_operand allows subreg, assuming that register_operand allows subreg.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] V5, #6 of 15: Make vector load/store instruction length correct with prefixed addresses
  2019-10-11 21:58   ` Segher Boessenkool
@ 2019-10-14 21:31     ` Michael Meissner
  0 siblings, 0 replies; 28+ messages in thread
From: Michael Meissner @ 2019-10-14 21:31 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Fri, Oct 11, 2019 at 04:53:23PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Oct 09, 2019 at 04:26:20PM -0400, Michael Meissner wrote:
> > --- gcc/config/rs6000/vsx.md	(revision 276713)
> > +++ gcc/config/rs6000/vsx.md	(working copy)
> > @@ -1149,10 +1149,14 @@ (define_insn "vsx_mov<mode>_64bit"
> >                 "vecstore,  vecload,   vecsimple, mffgpr,    mftgpr,    load,
> >                  store,     load,      store,     *,         vecsimple, vecsimple,
> >                  vecsimple, *,         *,         vecstore,  vecload")
> > -   (set_attr "length"
> > +   (set_attr "non_prefixed_length"
> >                 "*,         *,         *,         8,         *,         8,
> >                  8,         8,         8,         8,         *,         *,
> >                  *,         20,        8,         *,         *")
> > +   (set_attr "prefixed_length"
> > +               "*,         *,         *,         8,         *,         20,
> > +                20,        20,        20,        8,         *,         *,
> > +                *,         20,        8,         *,         *")
> 
> Alternative 13 has non_prefixed_length 20, I wonder what insns that
> generates?

All of the vector constants that match the constants matched by
easy_altivec_constant.

For example:

	vector int foo (void)
	{
	  return (vector int) { 0, 0, 0, 1 };
	}

generates:

        vspltisw 2,0
        vspltisw 0,1
        vsldoi 2,0,2,12

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] V5, #5 of 15: Support -fstack-protect and large stack frames
  2019-10-11 21:48   ` Segher Boessenkool
@ 2019-10-14 21:35     ` Michael Meissner
  0 siblings, 0 replies; 28+ messages in thread
From: Michael Meissner @ 2019-10-14 21:35 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Fri, Oct 11, 2019 at 04:44:51PM -0500, Segher Boessenkool wrote:
> On Wed, Oct 09, 2019 at 04:22:06PM -0400, Michael Meissner wrote:
> > This patch allows -fstack-protect to work with large stack frames.
> 
> I don't understand; please explain.
> 
> What I see is a workaround for not properly handling prefixed addresses
> in the stack protect code (by forcing the addresses to not be prefixed at
> expand time).

Several iterations ago, you explicitly told me to not to allow prefixed insns
here.  So I made it to use traditional instructions.

If you recall the code was:

(define_insn "stack_protect_testdi"
  [(set (match_operand:CCEQ 0 "cc_reg_operand" "=x,?y")
        (unspec:CCEQ [(match_operand:DI 1 "memory_operand" "Y,Y")
		      (match_operand:DI 2 "memory_operand" "Y,Y")]
		     UNSPEC_SP_TEST))
   (set (match_scratch:DI 4 "=r,r") (const_int 0))
   (clobber (match_scratch:DI 3 "=&r,&r"))]
  "TARGET_64BIT"
{
  if (prefixed_mem_operand (operands[1], DImode))
    output_asm_insn ("pld %3,%1", operands);
  else
    output_asm_insn ("ld%U1%X1 %3,%1", operands);

  if (prefixed_mem_operand (operands[2], DImode))
    output_asm_insn ("pld %4,%2", operands);
  else
    output_asm_insn ("ld%U2%X2 %4,%2", operands);

  if (which_alternative == 0)
    output_asm_insn ("xor. %3,%3,%4", operands);
  else
    output_asm_insn ("cmpld %0,%3,%4\;li %3,0", operands);

  return "li %4,0";
}
  ;; Back to back prefixed memory instructions take 20 bytes (8 bytes for each
  ;; prefixed instruction + 4 bytes for the possible NOP).
  [(set (attr "length")
	(cond [(and (match_operand 1 "prefixed_mem_operand")
		    (match_operand 2 "prefixed_mem_operand"))
	       (if_then_else (eq_attr "alternative" "0")
			     (const_string "28")
			     (const_string "32"))

	       (ior (match_operand 1 "prefixed_mem_operand")
		    (match_operand 2 "prefixed_mem_operand"))
	       (if_then_else (eq_attr "alternative" "0")
			     (const_string "20")
			     (const_string "24"))]

	      (if_then_else (eq_attr "alternative" "0")
			    (const_string "16")
			    (const_string "20"))))
   (set_attr "prefixed" "no")])

It can't use the normal prefixed support used in other insns because this one
has multiple insns (hence the 'p' trick for ASM_OUTPUT_OPCODE won't work) and
two memory addresses.

> > +rtx
> > +make_memory_non_prefixed (rtx mem)
> > +{
> > +  gcc_assert (MEM_P (mem));
> > +
> > +  rtx old_addr = XEXP (mem, 0);
> > +  if (address_is_prefixed (old_addr, GET_MODE (mem), NON_PREFIXED_DEFAULT))
> > +    {
> > +      rtx new_addr;
> > +
> > +      if (GET_CODE (old_addr) == PLUS
> > +	  && (REG_P (XEXP (old_addr, 0)) || SUBREG_P (XEXP (old_addr, 0)))
> 
> How can you ever have a subreg in an address?  One in Pmode even?

I could imagine having a union or something similar that generates a subreg.

> 
> > +	  && CONST_INT_P (XEXP (old_addr, 1)))
> > +	{
> > +	  rtx tmp_reg = force_reg (Pmode, XEXP (old_addr, 1));
> > +	  new_addr = gen_rtx_PLUS (Pmode, XEXP (old_addr, 0), tmp_reg);
> > +	}
> > +      else
> > +	new_addr = force_reg (Pmode, old_addr);
> > +
> > +      mem = change_address (mem, VOIDmode, new_addr);
> 
> replace_equiv_address ?
> 
> > +(define_expand "stack_protect_setdi"
> > +  [(parallel [(set (match_operand:DI 0 "memory_operand")
> > +		   (unspec:DI [(match_operand:DI 1 "memory_operand")]
> > +		   UNSPEC_SP_SET))
> > +	      (set (match_scratch:DI 2)
> > +		   (const_int 0))])]
> > +  "TARGET_64BIT"
> > +{
> > +  if (TARGET_PREFIXED_ADDR)
> > +    {
> > +      operands[0] = make_memory_non_prefixed (operands[0]);
> > +      operands[1] = make_memory_non_prefixed (operands[1]);
> > +    }
> > +})
> 
> It shouldn't be terribly hard to make the define_insns just *work* with
> prefixed insns, instead?  Is there any reason we are sure these memory
> references will not turn into something that needs prefixed insns, after
> expand?  It seems fragile like this.

As I said, I've done it in the past.  It is complicated because this insn must
generate two or more insns without spliting.  But it is doable.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] V5, #2 of 15: Fix prefixed instruction length for some 128-bit types
  2019-10-14 21:14     ` Michael Meissner
@ 2019-10-14 22:24       ` Segher Boessenkool
  0 siblings, 0 replies; 28+ messages in thread
From: Segher Boessenkool @ 2019-10-14 22:24 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

On Mon, Oct 14, 2019 at 05:12:56PM -0400, Michael Meissner wrote:
> On Thu, Oct 10, 2019 at 05:02:09PM -0500, Segher Boessenkool wrote:
> > > @@ -7786,8 +7790,12 @@ (define_insn_and_split "*movtd_64bit_nod
> > >    "#"
> > >    "&& reload_completed"
> > >    [(pc)]
> > > -{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
> > > -  [(set_attr "length" "8,8,8,12,12,8")])
> > > +{
> > > +  rs6000_split_multireg_move (operands[0], operands[1]);
> > > +  DONE;
> > > +}
> > > +  [(set_attr "non_prefixed_length" "8")
> > > +   (set_attr "prefixed_length" "20")])
> > 
> > It used to be 8,8,8,12,12,8 before.  Was that in error?  Please explain.
> 
> We've had this discussion before, but I didn't update the ChangeLog.
> 
> The two cases for 12 bytes (i.e. 3 insns) are r->Y and Y->r constaints.  Y
> matches DS offsets (i.e. bottom 2 bits non-zero) for traditional instructions.
> In looking at rs6000_split_multireg_move, the only way I can see that 3
> instructions would be generated would be if pre_modify/pre_update/etc. were
> generated.  But we don't allow pre_modify on larger types.

So do this as a separate change, first, please?

If it cannot happen, please also add an assert (somewhere early, before
doing anything else with the automodifies), an remove any now-dead code
(if there is any).

> But if you think there might be a case where it does generate 3 instructions, I
> can modify it to use 8,8,8,12,12,8 for the non-prefixed case, and
> 20,20,20,24,24,20 for the prefixed case.

I think you are right, but it's not obvious, and it warrants a separate
patch.  That way, we can easily bisect it if some problem manifests, etc.
Thanks,


Segher

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] V5, #4 of 15: Update predicates
  2019-10-14 21:18     ` Michael Meissner
@ 2019-10-14 23:12       ` Segher Boessenkool
  0 siblings, 0 replies; 28+ messages in thread
From: Segher Boessenkool @ 2019-10-14 23:12 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

On Mon, Oct 14, 2019 at 05:16:03PM -0400, Michael Meissner wrote:
> On Fri, Oct 11, 2019 at 04:17:02PM -0500, Segher Boessenkool wrote:
> > > 	* config/rs6000/predicates.md (lwa_operand): Allow using PLWA to
> > > 	generate sign extend with odd offsets.
> > 
> > I don't understand what this means.  "Odd offsets" isn't correct, in any
> > case?
> 
> Non-zero in the bottom 2/4 bits.  I'll try to come up with different words.

Ah.  "If prefixed is allowed, allow addresses not a multiple of 4."?

> > > +  /* The LWA instruction uses the DS-form format where the bottom two bits of
> > > +     the offset must be 0.  The prefixed PLWA does not have this
> > > +     restriction.  */
> > > +  if (address_is_prefixed (addr, DImode, NON_PREFIXED_DS))
> > > +    return true;
> > 
> > DImode?
> 
> Yes, because LWA converts SImode to DImode.

Well, the access is to a SImode datum.  But you want DImode here so it uses
the code for ld (the DS-mode code) also for lwa.  Hrm, address_to_insn_form
needs a comment for that then, and so do callers like this one.  Should be
fine with that, it's just an irregularity in the ISA.

> > > +;; Return 1 if op is either a register operand or a memory operand that does
> > > +;; not use a prefixed address.
> > > +(define_predicate "reg_or_non_prefixed_memory"
> > > +  (match_code "reg,subreg,mem")
> > > +{
> > > +  return (gpc_reg_operand (op, mode) || non_prefixed_memory (op, mode));
> > > +})
> > 
> > This never allows subreg.
> 
> Gpc_reg_operand allows subreg, assuming that register_operand allows subreg.

It does, and I have no idea what I thought here now.  Sorry.


Segher

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2019-10-14 22:45 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-09 19:53 PowerPC future machine patches, version 5 Michael Meissner
2019-10-09 19:56 ` [PATCH] V5, #1 of 15: Support 34-bit offsets for prefixed instructions Michael Meissner
2019-10-10 22:02   ` Segher Boessenkool
2019-10-09 20:03 ` [PATCH] V5, #2 of 15: Fix prefixed instruction length for some 128-bit types Michael Meissner
2019-10-10 22:18   ` Segher Boessenkool
2019-10-14 21:14     ` Michael Meissner
2019-10-14 22:24       ` Segher Boessenkool
2019-10-09 20:07 ` [PATCH] V5, #3 of 15: Deal with prefixed instructions in rs6000_insn_cost Michael Meissner
2019-10-11 21:17   ` Segher Boessenkool
2019-10-09 20:11 ` [PATCH] V5, #4 of 15: Update predicates Michael Meissner
2019-10-11 21:43   ` Segher Boessenkool
2019-10-14 21:18     ` Michael Meissner
2019-10-14 23:12       ` Segher Boessenkool
2019-10-09 20:26 ` [PATCH] V5, #5 of 15: Support -fstack-protect and large stack frames Michael Meissner
2019-10-11 21:48   ` Segher Boessenkool
2019-10-14 21:35     ` Michael Meissner
2019-10-09 20:26 ` [PATCH] V5, #6 of 15: Make vector load/store instruction length correct with prefixed addresses Michael Meissner
2019-10-11 21:58   ` Segher Boessenkool
2019-10-14 21:31     ` Michael Meissner
2019-10-09 20:36 ` [PATCH] V5, #7 of 15: Restrict vector extract to not use prefixed memory Michael Meissner
2019-10-09 20:39 ` [PATCH] V5, #9 of 15: Add PADDI (PLI) support to load up 32-bit SImode constants Michael Meissner
2019-10-09 20:39 ` [PATCH] V5, #8 of 15: Use PADDI/PLI to load up 34-bit DImode constants Michael Meissner
2019-10-09 20:42 ` [PATCH] V5, #10 of 15: Use PADDI to add large 34-bit constants Michael Meissner
2019-10-09 20:54 ` [PATCH] V5, #11 of 15: Make -mpcrel default on Linux 64-bit Michael Meissner
2019-10-10 17:52 ` [PATCH] V5, #12 of 15: Miscellaneous prefixed address tests Michael Meissner
2019-10-10 17:54 ` [PATCH] V5, #14 of 15: PC-relative tests Michael Meissner
2019-10-10 17:54 ` [PATCH] V5, #13 of 15: Prefixed address tests with large offsets Michael Meissner
2019-10-10 18:06 ` [PATCH] V5, #15 of 15: Add (0),1 to @pcrel syntax Michael Meissner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).