public inbox for gcc-patches@gcc.gnu.org
* PowerPC future machine patches, version 6
@ 2019-10-16 12:51 Michael Meissner
  2019-10-16 13:39 ` [PATCH] V6, #1 of 17: Use ADJUST_INSN_LENGTH for prefixed instructions Michael Meissner
                   ` (16 more replies)
  0 siblings, 17 replies; 30+ messages in thread
From: Michael Meissner @ 2019-10-16 12:51 UTC
  To: gcc-patches, segher, dje.gcc, meissner

This is version 6 of the patches for the PowerPC 'future' machine.  There are
currently 17 patches in this series.

Compared to the V5 patches, the following changes have been made:

1) The length calculation for memory references involving prefixed addresses
has been moved to the target hook ADJUST_INSN_LENGTH.

2) There is a new insn attribute (num_insns) that gives the number of machine
instructions in an insn, rather than having rs6000_insn_cost trying to figure
out how many instructions an insn had on the fly from the insn length.

3) I moved the code reformatting into a separate patch.

4) I separated the predicates patches, putting the lwa_operand modification in
a separate patch; the other new predicate functions now go with the changes
that use them.

5) I reworked stack_protect_setdi and stack_protect_testdi to once again allow
prefixed instructions.  I have added a test for this as a later patch.

6) I have slightly reworked the vector extract patches to fit in with the new
scheme.  This patch still has the restriction that it will not combine a vector
extract from memory if the vector is addressed using a PC-relative address and
the element number is variable.  This is due to not having two temporary
registers (one for the PC-relative address, and one for the variable offset).

7) I split up the miscellaneous tests into separate patches, including a
separate patch to add new effective targets for tests.

Note, on October 17th and 18th, I likely will have limited email access.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V6, #1 of 17: Use ADJUST_INSN_LENGTH for prefixed instructions
  2019-10-16 12:51 PowerPC future machine patches, version 6 Michael Meissner
@ 2019-10-16 13:39 ` Michael Meissner
  2019-10-22 22:38   ` Segher Boessenkool
  2019-10-16 13:42 ` [PATCH] V6, #2 of 17: Minor code reformat Michael Meissner
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 30+ messages in thread
From: Michael Meissner @ 2019-10-16 13:39 UTC
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch uses the target hook ADJUST_INSN_LENGTH to change the length of
instructions that contain prefixed memory/add instructions.

There are 2 new insn attributes:

1) num_insns: If non-zero, returns the number of machine instructions in an
insn.  This simplifies the calculations in rs6000_insn_cost.

2) max_prefixed_insns: Returns the maximum number of prefixed instructions in
an insn.  Normally this is 1, but in the insns that load up 128-bit values into
GPRs, it will be 2.
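
As a rough illustration of the arithmetic behind these two attributes, here is
a standalone C sketch that models an insn as a plain struct.  The names
insn_model, prefix_adjustment, adjusted_length and machine_insn_count are
hypothetical and are not part of GCC; only the 4 * (max_prefixed_insns + 1)
adjustment and the length / 4 fallback are taken from the patch below.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the attributes the patch reads from an insn.  */
struct insn_model
{
  int length;              /* "length" attribute in bytes, ignoring prefixes.  */
  int num_insns;           /* "num_insns" attribute, 0 if not set.  */
  int max_prefixed_insns;  /* "max_prefixed_insns" attribute, default 1.  */
  bool prefixed;           /* True if the "prefixed" attribute is "yes".  */
};

/* Extra bytes for a prefixed insn: 4 bytes per prefix word, plus 4 bytes for
   a possible NOP that keeps the insn from crossing a 64-byte boundary.  */
int
prefix_adjustment (const struct insn_model *insn)
{
  return insn->prefixed ? 4 * (insn->max_prefixed_insns + 1) : 0;
}

/* Length after the ADJUST_INSN_LENGTH-style adjustment is applied.  */
int
adjusted_length (const struct insn_model *insn)
{
  return insn->length + prefix_adjustment (insn);
}

/* Number of machine instructions: use num_insns if it is set, otherwise
   fall back to the unadjusted length divided by 4.  */
int
machine_insn_count (const struct insn_model *insn)
{
  if (insn->num_insns != 0)
    return insn->num_insns;
  return insn->length / 4;
}
```

Under this model a single prefixed load with a base length of 4 is treated as
12 bytes, and a 128-bit GPR load with a base length of 8 and
max_prefixed_insns of 2 is treated as 20 bytes.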

This patch replaces patches #2, #3, and #6 from the V5 patch series.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_insn_cost): Use num_insns insn
	attribute if it exists, rather than the insn size.  If we use the
	insn size, adjust the size to remove the extra size that prefixed
	instructions take.
	* config/rs6000/rs6000.h (ADJUST_INSN_LENGTH): New target hook to
	update the instruction size if prefixed instructions are used.
	* config/rs6000/rs6000.md (prefixed_length attribute): Delete.
	(non_prefixed_length attribute): Delete.
	(num_insns attribute): New insn attribute to return the number of
	instructions.
	(max_prefixed_insns attribute): New insn attribute to return the
	maximum number of prefixed instructions in an insn.
	(length attribute): Do not adjust for prefix instructions here,
	punt to ADJUST_INSN_LENGTH.
	(mov<mode>_64bit): Set max_prefixed_insns and num_insns.
	(movtd_64bit_nodm): Set max_prefixed_insns and num_insns.
	(mov<mode>_ppc64): Set max_prefixed_insns and num_insns.
	* config/rs6000/vsx.md (vsx_mov<mode>_64bit): Set
	max_prefixed_insns and num_insns.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 277017)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -20973,14 +20973,32 @@ rs6000_insn_cost (rtx_insn *insn, bool s
   if (recog_memoized (insn) < 0)
     return 0;
 
+  /* If we are optimizing for size, just use the length.  */
   if (!speed)
     return get_attr_length (insn);
 
+  /* Use the cost if provided.  */
   int cost = get_attr_cost (insn);
   if (cost > 0)
     return cost;
 
-  int n = get_attr_length (insn) / 4;
+  /* If the insn tells us how many insns there are, use that.  Otherwise use
+     the length/4.  Adjust the insn length to remove the extra size that
+     prefixed instructions take.  */
+  int n = get_attr_num_insns (insn);
+  if (n == 0)
+    {
+      int length = get_attr_length (insn);
+      if (get_attr_prefixed (insn) == PREFIXED_YES)
+	{
+	  int adjust = 0;
+	  ADJUST_INSN_LENGTH (insn, adjust);
+	  length -= adjust;
+	}
+
+      n = length / 4;
+    }
+
   enum attr_type type = get_attr_type (insn);
 
   switch (type)
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 277017)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -1847,9 +1847,30 @@ extern scalar_int_mode rs6000_pmode;
 /* Adjust the length of an INSN.  LENGTH is the currently-computed length and
    should be adjusted to reflect any required changes.  This macro is used when
    there is some systematic length adjustment required that would be difficult
-   to express in the length attribute.  */
+   to express in the length attribute.
 
-/* #define ADJUST_INSN_LENGTH(X,LENGTH) */
+   In the PowerPC, we use this to adjust the length of an instruction if one or
+   more prefixed instructions are generated, using the attribute
+   max_prefixed_insns.  A prefixed instruction is 8 bytes instead of 4, but the
+   hardware requires that a prefixed instruction not cross a 64-byte boundary.
+   This means the compiler has to assume the length of the first prefixed
+   instruction is 12 bytes instead of 8 bytes.  Since the length is already set
+   for the non-prefixed instruction, we just need to update for the
+   difference.  */
+
+#define ADJUST_INSN_LENGTH(INSN,LENGTH)					\
+{									\
+  if (NONJUMP_INSN_P (INSN))						\
+    {									\
+      rtx pattern = PATTERN (INSN);					\
+      if (GET_CODE (pattern) != USE && GET_CODE (pattern) != CLOBBER	\
+	  && get_attr_prefixed (INSN) == PREFIXED_YES)			\
+	{								\
+	  int num_prefixed = get_attr_max_prefixed_insns (INSN);	\
+	  (LENGTH) += 4 * (num_prefixed + 1);				\
+	}								\
+    }									\
+}
 
 /* Given a comparison code (EQ, NE, etc.) and the first operand of a
    COMPARE, return the mode to be used for the comparison.  For
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 277017)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -285,20 +285,24 @@ (define_attr "prefixed" "no,yes"
 
 	(const_string "no")))
 
-;; Length in bytes of instructions that use prefixed addressing and length in
-;; bytes of instructions that does not use prefixed addressing.  This allows
-;; both lengths to be defined as constants, and the length attribute can pick
-;; the size as appropriate.
-(define_attr "prefixed_length" "" (const_int 12))
-(define_attr "non_prefixed_length" "" (const_int 4))
-
-;; Length of the instruction (in bytes).  Prefixed insns are 8 bytes, but the
-;; assembler might issue need to issue a NOP so that the prefixed instruction
-;; does not cross a cache boundary, which makes them possibly 12 bytes.
-(define_attr "length" ""
-  (if_then_else (eq_attr "prefixed" "yes")
-		(attr "prefixed_length")
-		(attr "non_prefixed_length")))
+;; Return the number of real hardware instructions in a combined insn.  If it
+;; is 0, just use the length / 4.
+(define_attr "num_insns" "" (const_int 0))
+
+;; If an insn is prefixed, return the maximum number of prefixed instructions
+;; in the insn.  The macro ADJUST_INSN_LENGTH uses this number to adjust the
+;; insn length.
+(define_attr "max_prefixed_insns" "" (const_int 1))
+
+;; Length of the instruction (in bytes).  This length does not consider the
+;; length for prefixed instructions.  The macro ADJUST_INSN_LENGTH will adjust
+;; the length if there are prefixed instructions.
+;;
+;; While it might be tempting to use num_insns to calculate the length, it can
+;; be problematical unless all insn lengths are adjusted to use num_insns
+;; (i.e. if num_insns is 0, it will get the length, which in turn will get
+;; num_insns and recurse).
+(define_attr "length" "" (const_int 4))
 
 ;; Processor type -- this attribute must exactly match the processor_type
 ;; enumeration in rs6000-opts.h.
@@ -7775,7 +7779,9 @@ (define_insn_and_split "*mov<mode>_64bit
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
   [(set_attr "length" "8")
-   (set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")])
+   (set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")
+   (set_attr "max_prefixed_insns" "2")
+   (set_attr "num_insns" "2")])
 
 (define_insn_and_split "*movtd_64bit_nodm"
   [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
@@ -7787,7 +7793,9 @@ (define_insn_and_split "*movtd_64bit_nod
   "&& reload_completed"
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8,8,8,12,12,8")])
+  [(set_attr "length" "8,8,8,12,12,8")
+   (set_attr "max_prefixed_insns" "2")
+   (set_attr "num_insns" "2,2,2,3,3,2")])
 
 (define_insn_and_split "*mov<mode>_32bit"
   [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r")
@@ -8984,7 +8992,8 @@ (define_insn "*mov<mode>_ppc64"
   return rs6000_output_move_128bit (operands);
 }
   [(set_attr "type" "store,store,load,load,*,*")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "max_prefixed_insns" "2")])
 
 (define_split
   [(set (match_operand:TI2 0 "int_reg_operand")
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 277017)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -1149,6 +1149,14 @@ (define_insn "vsx_mov<mode>_64bit"
                "vecstore,  vecload,   vecsimple, mffgpr,    mftgpr,    load,
                 store,     load,      store,     *,         vecsimple, vecsimple,
                 vecsimple, *,         *,         vecstore,  vecload")
+   (set_attr "num_insns"
+               "*,         *,         *,         2,         *,         2,
+                2,         2,         2,         2,         *,         *,
+                *,         5,         2,         *,         *")
+   (set_attr "max_prefixed_insns"
+               "*,         *,         *,         *,         *,         2,
+                2,         2,         2,         2,         *,         *,
+                *,         *,         *,         *,         *")
    (set_attr "length"
                "*,         *,         *,         8,         *,         8,
                 8,         8,         8,         8,         *,         *,



* [PATCH] V6, #2 of 17: Minor code reformat
  2019-10-16 12:51 PowerPC future machine patches, version 6 Michael Meissner
  2019-10-16 13:39 ` [PATCH] V6, #1 of 17: Use ADJUST_INSN_LENGTH for prefixed instructions Michael Meissner
@ 2019-10-16 13:42 ` Michael Meissner
  2019-10-22 22:39   ` Segher Boessenkool
  2019-10-16 13:47 ` [PATCH] V6, #3 of 17: Update lwa_operand for prefixed PLWA Michael Meissner
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 30+ messages in thread
From: Michael Meissner @ 2019-10-16 13:42 UTC
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch tweaks the code formatting of some of the 128-bit mode move
instructions, which I noticed while making the previous patch.  Originally this
was part of V5 patch #2, but it has been moved to a separate patch.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.md (mov<mode>_64bit_dm): Reformat.
	(movtd_64bit_nodm): Reformat.
	(mov<mode>_32bit): Reformat.
	(mov<mode>_softfloat): Reformat.
	(FMOVE128_GPR splitter): Reformat.
	(DIFD splitter): Reformat.
	(TI2 splitter): Reformat.

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 277018)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -7777,7 +7777,10 @@ (define_insn_and_split "*mov<mode>_64bit
   "#"
   "&& reload_completed"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
   [(set_attr "length" "8")
    (set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")
    (set_attr "max_prefixed_insns" "2")
@@ -7792,7 +7795,10 @@ (define_insn_and_split "*movtd_64bit_nod
   "#"
   "&& reload_completed"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
   [(set_attr "length" "8,8,8,12,12,8")
    (set_attr "max_prefixed_insns" "2")
    (set_attr "num_insns" "2,2,2,3,3,2")])
@@ -7809,7 +7815,10 @@ (define_insn_and_split "*mov<mode>_32bit
   "#"
   "&& reload_completed"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
   [(set_attr "length" "8,8,8,8,20,20,16")])
 
 (define_insn_and_split "*mov<mode>_softfloat"
@@ -7821,7 +7830,10 @@ (define_insn_and_split "*mov<mode>_softf
   "#"
   "&& reload_completed"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
   [(set_attr_alternative "length"
        [(if_then_else (match_test "TARGET_POWERPC64")
 	    (const_string "8")
@@ -8613,7 +8625,10 @@ (define_split
        || (!vsx_register_operand (operands[0], <MODE>mode)
            && !vsx_register_operand (operands[1], <MODE>mode)))"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+})
 
 ;; Move SFmode to a VSX from a GPR register.  Because scalar floating point
 ;; type is stored internally as double precision in the VSX registers, we have
@@ -8803,7 +8818,10 @@ (define_split
    && gpr_or_gpr_p (operands[0], operands[1])
    && !direct_move_p (operands[0], operands[1])"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+})
 
 ;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR #
 ;;              FPR store  FPR load   FPR move   AVX store  AVX store   AVX load
@@ -9030,7 +9048,10 @@ (define_split
    && !direct_move_p (operands[0], operands[1])
    && !quad_load_store_p (operands[0], operands[1])"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+})
 \f
 (define_expand "setmemsi"
   [(parallel [(set (match_operand:BLK 0 "")



* [PATCH] V6, #3 of 17: Update lwa_operand for prefixed PLWA
  2019-10-16 12:51 PowerPC future machine patches, version 6 Michael Meissner
  2019-10-16 13:39 ` [PATCH] V6, #1 of 17: Use ADJUST_INSN_LENGTH for prefixed instructions Michael Meissner
  2019-10-16 13:42 ` [PATCH] V6, #2 of 17: Minor code reformat Michael Meissner
@ 2019-10-16 13:47 ` Michael Meissner
  2019-10-22 23:33   ` Segher Boessenkool
  2019-10-16 13:49 ` [PATCH] V6, #4 of 17: Add prefixed instruction support to stack protect insns Michael Meissner
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 30+ messages in thread
From: Michael Meissner @ 2019-10-16 13:47 UTC
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch allows a load of SImode with sign extension to DImode to use the
PLWA instruction on the 'future' machine when the bottom 2 bits of the load
offset are non-zero.  The normal LWA instruction is a DS-format instruction,
and it needs the bottom 2 bits of the offset to be 0.
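
To make the offset rule concrete, here is a small standalone C sketch of the
decision.  It is only a model: the real predicate works on RTL addresses via
address_is_prefixed, and the names lwa_choice, fits_signed_16bit,
fits_signed_34bit and choose_lwa_form are hypothetical.  It assumes a plain
base register plus constant offset address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type naming the possible outcomes.  */
enum lwa_choice { USE_LWA, USE_PLWA, NEEDS_OTHER_ADDRESSING };

/* True if OFFSET fits in a signed 16-bit displacement.  */
bool
fits_signed_16bit (int64_t offset)
{
  return offset >= -32768 && offset <= 32767;
}

/* True if OFFSET fits in the signed 34-bit displacement of a prefixed insn.  */
bool
fits_signed_34bit (int64_t offset)
{
  return offset >= -(INT64_C (1) << 33) && offset < (INT64_C (1) << 33);
}

/* Pick an instruction form for a sign-extending 32-bit load at
   base register + OFFSET.  LWA is DS-form, so its offset must fit in 16 bits
   and have the bottom two bits clear; PLWA has neither restriction.  */
enum lwa_choice
choose_lwa_form (int64_t offset, bool have_prefixed)
{
  if (fits_signed_16bit (offset) && (offset & 0x3) == 0)
    return USE_LWA;
  if (have_prefixed && fits_signed_34bit (offset))
    return USE_PLWA;
  return NEEDS_OTHER_ADDRESSING;
}
```

In this model an offset of 8 still uses LWA, while an offset of 6 uses PLWA
only when prefixed instructions are available.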

This patch was originally part of V5 patch #4.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (lwa_operand): If the bottom two
	bits of the offset for the memory address are non-zero, use PLWA
	if prefixed instructions are available.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 277017)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -932,6 +932,14 @@ (define_predicate "lwa_operand"
     return false;
 
   addr = XEXP (inner, 0);
+
+  /* The LWA instruction uses the DS-form instruction format which requires
+     that the bottom two bits of the offset must be 0.  The prefixed PLWA does
+     not have this restriction.  While the actual load from memory is 32-bits,
+     we pass in DImode here to test for using a DS instruction.  */
+  if (address_is_prefixed (addr, DImode, NON_PREFIXED_DS))
+    return true;
+
   if (GET_CODE (addr) == PRE_INC
       || GET_CODE (addr) == PRE_DEC
       || (GET_CODE (addr) == PRE_MODIFY



* [PATCH] V6, #4 of 17: Add prefixed instruction support to stack protect insns
  2019-10-16 12:51 PowerPC future machine patches, version 6 Michael Meissner
                   ` (2 preceding siblings ...)
  2019-10-16 13:47 ` [PATCH] V6, #3 of 17: Update lwa_operand for prefixed PLWA Michael Meissner
@ 2019-10-16 13:49 ` Michael Meissner
  2019-11-02  3:22   ` Segher Boessenkool
  2019-10-16 14:08 ` [PATCH] V6, #5 of 17: Add prefixed instruction support to vector extract optimizations Michael Meissner
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 30+ messages in thread
From: Michael Meissner @ 2019-10-16 13:49 UTC
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch fixes the stack protection insns to support stack offsets that do
not fit in 16 bits on the 'future' system, using prefixed loads and stores.

This rewrites V5 patch #5.  In earlier patches, I had a variant of this
patch, but I was asked to restrict the protect insns to use non-prefixed insns,
which I did in V5.  However, in V5, I was asked to support prefixed
instructions once again.  This patch was updated to use the new support in V6
patch #1.
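
The length accounting used for stack_protect_setdi in the patch below can be
sketched in standalone C as follows.  The function name
stack_protect_setdi_length is hypothetical; the byte counts follow the
comment in the patch (8 bytes per prefixed memory instruction, 4 bytes for a
possible NOP, and 4 bytes for the final LI).

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the length, in bytes, of the load / store / li sequence.  */
int
stack_protect_setdi_length (bool load_prefixed, bool store_prefixed)
{
  int length = 0;

  length += load_prefixed ? 8 : 4;    /* pld is 8 bytes, ld is 4.  */
  length += store_prefixed ? 8 : 4;   /* pstd is 8 bytes, std is 4.  */

  /* Allow for one NOP if any prefixed instruction is present, so that it
     does not cross a 64-byte boundary.  */
  if (load_prefixed || store_prefixed)
    length += 4;

  length += 4;                        /* The trailing li that clears the
					 scratch register.  */
  return length;
}
```

This reproduces the three values in the length attribute: 12 bytes with no
prefixed address, 20 bytes with one, and 24 bytes with two.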

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (prefixed_memory): New predicate.
	* config/rs6000/rs6000.md (stack_protect_setdi): Deal with either
	address being a prefixed load/store.
	(stack_protect_testdi): Deal with either address being a prefixed
	load.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 277022)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -1815,3 +1815,10 @@ (define_predicate "pcrel_external_addres
 (define_predicate "pcrel_local_or_external_address"
   (ior (match_operand 0 "pcrel_local_address")
        (match_operand 0 "pcrel_external_address")))
+
+;; Return true if the operand is a memory address that uses a prefixed address.
+(define_predicate "prefixed_memory"
+  (match_code "mem")
+{
+  return address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
+})
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 277052)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -11559,14 +11559,44 @@ (define_insn "stack_protect_setsi"
   [(set_attr "type" "three")
    (set_attr "length" "12")])
 
+;; We can't use the prefixed attribute here because there are two memory
+;; instructions.  We can't split the insn due to the fact that this operation
+;; needs to be done in one piece.
 (define_insn "stack_protect_setdi"
   [(set (match_operand:DI 0 "memory_operand" "=Y")
 	(unspec:DI [(match_operand:DI 1 "memory_operand" "Y")] UNSPEC_SP_SET))
    (set (match_scratch:DI 2 "=&r") (const_int 0))]
   "TARGET_64BIT"
-  "ld%U1%X1 %2,%1\;std%U0%X0 %2,%0\;li %2,0"
+{
+  if (prefixed_memory (operands[1], DImode))
+    output_asm_insn ("pld %2,%1", operands);
+  else
+    output_asm_insn ("ld%U1%X1 %2,%1", operands);
+
+  if (prefixed_memory (operands[0], DImode))
+    output_asm_insn ("pstd %2,%0", operands);
+  else
+    output_asm_insn ("std%U0%X0 %2,%0", operands);
+
+  return "li %2,0";
+}
   [(set_attr "type" "three")
-   (set_attr "length" "12")])
+
+  ;; Back to back prefixed memory instructions take 20 bytes (8 bytes for each
+  ;; prefixed instruction + 4 bytes for the possible NOP).  Add in 4 bytes for
+  ;; the LI 0 at the end.
+   (set_attr "prefixed" "no")
+   (set_attr "num_insns" "3")
+   (set (attr "length")
+	(cond [(and (match_operand 0 "prefixed_memory")
+		    (match_operand 1 "prefixed_memory"))
+	       (const_string "24")
+
+	       (ior (match_operand 0 "prefixed_memory")
+		    (match_operand 1 "prefixed_memory"))
+	       (const_string "20")]
+
+	      (const_string "12")))])
 
 (define_expand "stack_protect_test"
   [(match_operand 0 "memory_operand")
@@ -11605,6 +11635,9 @@ (define_insn "stack_protect_testsi"
    lwz%U1%X1 %3,%1\;lwz%U2%X2 %4,%2\;cmplw %0,%3,%4\;li %3,0\;li %4,0"
   [(set_attr "length" "16,20")])
 
+;; We can't use the prefixed attribute here because there are two memory
+;; instructions.  We can't split the insn due to the fact that this operation
+;; needs to be done in one piece.
 (define_insn "stack_protect_testdi"
   [(set (match_operand:CCEQ 0 "cc_reg_operand" "=x,?y")
         (unspec:CCEQ [(match_operand:DI 1 "memory_operand" "Y,Y")
@@ -11613,10 +11646,45 @@ (define_insn "stack_protect_testdi"
    (set (match_scratch:DI 4 "=r,r") (const_int 0))
    (clobber (match_scratch:DI 3 "=&r,&r"))]
   "TARGET_64BIT"
-  "@
-   ld%U1%X1 %3,%1\;ld%U2%X2 %4,%2\;xor. %3,%3,%4\;li %4,0
-   ld%U1%X1 %3,%1\;ld%U2%X2 %4,%2\;cmpld %0,%3,%4\;li %3,0\;li %4,0"
-  [(set_attr "length" "16,20")])
+{
+  if (prefixed_memory (operands[1], DImode))
+    output_asm_insn ("pld %3,%1", operands);
+  else
+    output_asm_insn ("ld%U1%X1 %3,%1", operands);
+
+  if (prefixed_memory (operands[2], DImode))
+    output_asm_insn ("pld %4,%2", operands);
+  else
+    output_asm_insn ("ld%U2%X2 %4,%2", operands);
+
+  if (which_alternative == 0)
+    output_asm_insn ("xor. %3,%3,%4", operands);
+  else
+    output_asm_insn ("cmpld %0,%3,%4\;li %3,0", operands);
+
+  return "li %4,0";
+}
+  ;; Back to back prefixed memory instructions take 20 bytes (8 bytes for each
+  ;; prefixed instruction + 4 bytes for the possible NOP).  Add in either 4 or
+  ;; 8 bytes to do the test.
+  [(set_attr "prefixed" "no")
+   (set_attr "num_insns" "4,5")
+   (set (attr "length")
+	(cond [(and (match_operand 1 "prefixed_memory")
+		    (match_operand 2 "prefixed_memory"))
+	       (if_then_else (eq_attr "alternative" "0")
+			     (const_string "28")
+			     (const_string "32"))
+
+	       (ior (match_operand 1 "prefixed_memory")
+		    (match_operand 2 "prefixed_memory"))
+	       (if_then_else (eq_attr "alternative" "0")
+			     (const_string "20")
+			     (const_string "24"))]
+
+	      (if_then_else (eq_attr "alternative" "0")
+			    (const_string "16")
+			    (const_string "20"))))])
 
 \f
 ;; Here are the actual compare insns.



* [PATCH] V6, #5 of 17: Add prefixed instruction support to vector extract optimizations
  2019-10-16 12:51 PowerPC future machine patches, version 6 Michael Meissner
                   ` (3 preceding siblings ...)
  2019-10-16 13:49 ` [PATCH] V6, #4 of 17: Add prefixed instruction support to stack protect insns Michael Meissner
@ 2019-10-16 14:08 ` Michael Meissner
  2019-10-16 14:11 ` [PATCH] V6, #6 of 17: Use PADDI/PLI to load up 34-bit DImode constants Michael Meissner
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Michael Meissner @ 2019-10-16 14:08 UTC
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch updates the support for optimizing vector extracts to know about
prefixed addressing.  There are two parts to the patch:

1) If a vector extract with a constant element number extracts an element from
a vector residing in memory that uses a prefixed address (either numeric or
PC-relative), the offset for the element number is folded into the address of
the vector and a scalar load is done.

2) If a vector extract with a variable element number extracts an element from
a vector residing in memory that uses a prefixed address, the optimization is
not done because we would need two temporary base registers to do this
calculation and currently we only have one.  Instead, the vector is loaded into
a vector register, and the element extract is done from the value in a
register.  We discovered this trying to run real code through the mambo
simulator.  The compiler previously would try to use the single base register
both to hold the address and to generate the element offset.
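
The constant-element case in part 1 amounts to simple offset arithmetic, which
the following standalone C sketch models.  The names fits_signed_34bit and
fold_element_offset are hypothetical, and the sketch assumes the element
number has already been adjusted for endianness, so that the byte offset is
just the element number times the scalar size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if OFFSET fits in the signed 34-bit displacement of a prefixed insn.  */
bool
fits_signed_34bit (int64_t offset)
{
  return offset >= -(INT64_C (1) << 33) && offset < (INT64_C (1) << 33);
}

/* Fold a constant element number into the offset of the vector's address.
   VECTOR_OFFSET is the existing displacement of the vector, ELEMENT is the
   constant element number, and SCALAR_SIZE is the element size in bytes.
   Return true and store the combined offset in *FOLDED if it still fits in a
   prefixed address; otherwise return false and leave *FOLDED unchanged.  */
bool
fold_element_offset (int64_t vector_offset, int element, int scalar_size,
		     int64_t *folded)
{
  int64_t offset = vector_offset + (int64_t) element * scalar_size;

  if (!fits_signed_34bit (offset))
    return false;

  *folded = offset;
  return true;
}
```

For example, extracting element 3 of a vector of 4-byte elements located at
displacement 16 becomes a scalar load at displacement 28.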

This patch adds a new constraint (em) that prevents using prefixed addresses.
Without the constraint the register allocator would recreate the prefixed
address for the insn.

This patch updates V5 patch #7 which did the same thing.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/constraints.md (em constraint): New constraint for
	non-prefixed memory.
	* config/rs6000/predicates.md (non_prefixed_memory): New
	predicate.
	(reg_or_non_prefixed_memory): New predicate.
	* config/rs6000/rs6000.c (rs6000_adjust_vec_address): Add support
	for optimizing extracting a constant vector element from a vector
	that uses a prefixed address.  If the element number is variable
	and the address uses a prefixed address, abort.
	* config/rs6000/vsx.md (vsx_extract_<mode>_var, VSX_D iterator):
	Do not allow combining prefixed memory with a variable vector
	extract.
	(vsx_extract_v4sf_var): Do not allow combining prefixed memory
	with a variable vector extract.
	(vsx_extract_<mode>_var, VSX_EXTRACT_I iterator): Do not allow
	combining prefixed memory with a variable vector extract.
	(vsx_extract_<mode>_<VS_scalar>mode_var): Do not allow combining
	prefixed memory with a variable vector extract.
	* doc/md.texi (PowerPC constraints): Document the em constraint.

Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(revision 276974)
+++ gcc/config/rs6000/constraints.md	(working copy)
@@ -202,6 +202,11 @@ (define_constraint "H"
 
 ;; Memory constraints
 
+(define_memory_constraint "em"
+  "A memory operand that does not contain a prefixed address."
+  (and (match_code "mem")
+       (match_test "non_prefixed_memory (op, mode)")))
+
 (define_memory_constraint "es"
   "A ``stable'' memory operand; that is, one which does not include any
 automodification of the base register.  Unlike @samp{m}, this constraint
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 277024)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -1822,3 +1822,24 @@ (define_predicate "prefixed_memory"
 {
   return address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
 })
+
+;; Return true if the operand is a memory address that does not use a prefixed
+;; address.
+(define_predicate "non_prefixed_memory"
+  (match_code "mem")
+{
+  /* If the operand is not a valid memory operand even if it is not prefixed,
+     do not return true.  */
+  if (!memory_operand (op, mode))
+    return false;
+
+  return !address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
+})
+
+;; Return true if the operand is either a register or it is a non-prefixed
+;; memory operand.
+(define_predicate "reg_or_non_prefixed_memory"
+  (match_code "reg,subreg,mem")
+{
+  return gpc_reg_operand (op, mode) || non_prefixed_memory (op, mode);
+})
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 277018)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6702,6 +6702,7 @@ rs6000_adjust_vec_address (rtx scalar_re
   rtx element_offset;
   rtx new_addr;
   bool valid_addr_p;
+  bool pcrel_p = pcrel_local_address (addr, Pmode);
 
   /* Vector addresses should not have PRE_INC, PRE_DEC, or PRE_MODIFY.  */
   gcc_assert (GET_RTX_CLASS (GET_CODE (addr)) != RTX_AUTOINC);
@@ -6739,6 +6740,38 @@ rs6000_adjust_vec_address (rtx scalar_re
   else if (REG_P (addr) || SUBREG_P (addr))
     new_addr = gen_rtx_PLUS (Pmode, addr, element_offset);
 
+  /* Optimize PC-relative addresses with a constant offset.  */
+  else if (pcrel_p && CONST_INT_P (element_offset))
+    {
+      rtx addr2 = addr;
+      HOST_WIDE_INT offset = INTVAL (element_offset);
+
+      if (GET_CODE (addr2) == CONST)
+	addr2 = XEXP (addr2, 0);
+
+      if (GET_CODE (addr2) == PLUS)
+	{
+	  offset += INTVAL (XEXP (addr2, 1));
+	  addr2 = XEXP (addr2, 0);
+	}
+
+      gcc_assert (SIGNED_34BIT_OFFSET_P (offset));
+      if (offset)
+	{
+	  addr2 = gen_rtx_PLUS (Pmode, addr2, GEN_INT (offset));
+	  new_addr = gen_rtx_CONST (Pmode, addr2);
+	}
+      else
+	new_addr = addr2;
+    }
+
+  /* With only one temporary base register, we can't support a PC-relative
+     address added to a variable offset.  This is because the PADDI instruction
+     requires RA to be 0 when doing a PC-relative add (i.e. no register to add
+     to).  */
+  else if (pcrel_p)
+    gcc_unreachable ();
+
   /* Optimize D-FORM addresses with constant offset with a constant element, to
      include the element offset in the address directly.  */
   else if (GET_CODE (addr) == PLUS)
@@ -6753,8 +6786,11 @@ rs6000_adjust_vec_address (rtx scalar_re
 	  HOST_WIDE_INT offset = INTVAL (op1) + INTVAL (element_offset);
 	  rtx offset_rtx = GEN_INT (offset);
 
-	  if (IN_RANGE (offset, -32768, 32767)
-	      && (scalar_size < 8 || (offset & 0x3) == 0))
+	  if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (offset))
+	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
+
+	  else if (SIGNED_16BIT_OFFSET_P (offset)
+		   && (scalar_size < 8 || (offset & 0x3) == 0))
 	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
 	  else
 	    {
@@ -6802,11 +6838,11 @@ rs6000_adjust_vec_address (rtx scalar_re
       new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
     }
 
-  /* If we have a PLUS, we need to see whether the particular register class
-     allows for D-FORM or X-FORM addressing.  */
-  if (GET_CODE (new_addr) == PLUS)
+  /* If we have a PLUS or a PC-relative address without the PLUS, we need to
+     see whether the particular register class allows for D-FORM or X-FORM
+     addressing.  */
+  if (GET_CODE (new_addr) == PLUS || pcrel_p)
     {
-      rtx op1 = XEXP (new_addr, 1);
       addr_mask_type addr_mask;
       unsigned int scalar_regno = reg_or_subregno (scalar_reg);
 
@@ -6823,10 +6859,16 @@ rs6000_adjust_vec_address (rtx scalar_re
       else
 	gcc_unreachable ();
 
-      if (REG_P (op1) || SUBREG_P (op1))
-	valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;
-      else
+      if (pcrel_p)
 	valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
+      else
+	{
+	  rtx op1 = XEXP (new_addr, 1);
+	  if (REG_P (op1) || SUBREG_P (op1))
+	    valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;
+	  else
+	    valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
+	}
     }
 
   else if (REG_P (new_addr) || SUBREG_P (new_addr))
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 277018)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -3243,9 +3243,10 @@ (define_insn "vsx_vslo_<mode>"
 ;; Variable V2DI/V2DF extract
 (define_insn_and_split "vsx_extract_<mode>_var"
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v,wa,r")
-	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v,m,m")
-			     (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
-			    UNSPEC_VSX_EXTRACT))
+	(unspec:<VS_scalar>
+	 [(match_operand:VSX_D 1 "reg_or_non_prefixed_memory" "v,em,em")
+	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
+	 UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,&b,&b"))
    (clobber (match_scratch:V2DI 4 "=&v,X,X"))]
   "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_DIRECT_MOVE_64BIT"
@@ -3313,9 +3314,10 @@ (define_insn_and_split "*vsx_extract_v4s
 ;; Variable V4SF extract
 (define_insn_and_split "vsx_extract_v4sf_var"
   [(set (match_operand:SF 0 "gpc_reg_operand" "=wa,wa,?r")
-	(unspec:SF [(match_operand:V4SF 1 "input_operand" "v,m,m")
-		    (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
-		   UNSPEC_VSX_EXTRACT))
+	(unspec:SF
+	 [(match_operand:V4SF 1 "reg_or_non_prefixed_memory" "v,em,em")
+	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
+	 UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,&b,&b"))
    (clobber (match_scratch:V2DI 4 "=&v,X,X"))]
   "VECTOR_MEM_VSX_P (V4SFmode) && TARGET_DIRECT_MOVE_64BIT"
@@ -3676,7 +3678,7 @@ (define_insn_and_split "*vsx_extract_<mo
 (define_insn_and_split "vsx_extract_<mode>_var"
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
 	(unspec:<VS_scalar>
-	 [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m")
+	 [(match_operand:VSX_EXTRACT_I 1 "reg_or_non_prefixed_memory" "v,v,em")
 	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 	 UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,r,&b"))
@@ -3696,7 +3698,7 @@ (define_insn_and_split "*vsx_extract_<mo
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
 	(zero_extend:<VS_scalar>
 	 (unspec:<VSX_EXTRACT_I:VS_scalar>
-	  [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m")
+	  [(match_operand:VSX_EXTRACT_I 1 "reg_or_non_prefixed_memory" "v,v,em")
 	   (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 	  UNSPEC_VSX_EXTRACT)))
    (clobber (match_scratch:DI 3 "=r,r,&b"))
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 276974)
+++ gcc/doc/md.texi	(working copy)
@@ -3373,6 +3373,9 @@ asm ("st %1,%0" : "=m<>" (mem) : "r" (va
 
 is not.
 
+@item em
+A memory operand that does not contain a prefixed address.
+
 @item es
 A ``stable'' memory operand; that is, one which does not include any
 automodification of the base register.  This used to be useful when

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V6, #6 of 17: Use PADDI/PLI to load up 34-bit DImode constants
  2019-10-16 12:51 PowerPC future machine patches, version 6 Michael Meissner
                   ` (4 preceding siblings ...)
  2019-10-16 14:08 ` [PATCH] V6, #5 of 17: Add prefixed instruction support to vector extract optimizations Michael Meissner
@ 2019-10-16 14:11 ` Michael Meissner
  2019-10-16 14:14 ` [PATCH] V6, #7 of 17: Use PADDI/PLI to load up 32-bit SImode constants Michael Meissner
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Michael Meissner @ 2019-10-16 14:11 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch uses PADDI (PLI) to load up 34-bit DImode constants.  This is the
same patch as V5 patch #8.
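
As a rough illustration of the ranges involved (this is not code from the
patch), the sketch below shows one way to test whether a constant fits in the
signed 16-bit field of ADDI or in the signed 34-bit field of PADDI.  The
function names are hypothetical.  The patch itself uses the
SIGNED_16BIT_OFFSET_P and SIGNED_34BIT_OFFSET_P macros, whose definitions are
not shown in this message; the 16-bit test mirrors the expression that the
patch replaces in num_insns_constant_gpr.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if VALUE fits in a signed 16-bit immediate,
   i.e. -32768 <= VALUE <= 32767.  Adding the bias in unsigned arithmetic
   turns the two-sided range test into a single comparison.  */
bool
fits_signed_16bit (int64_t value)
{
  return ((uint64_t) value + UINT64_C (0x8000)) < UINT64_C (0x10000);
}

/* Hypothetical helper: true if VALUE fits in a signed 34-bit immediate,
   i.e. -2^33 <= VALUE <= 2^33 - 1.  Same biasing trick with a wider
   field.  */
bool
fits_signed_34bit (int64_t value)
{
  return ((uint64_t) value + (UINT64_C (1) << 33)) < (UINT64_C (1) << 34);
}
```

With these ranges, a constant such as 0x12345678 is outside what a single ADDI
can load, but is inside the 34-bit range.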

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (num_insns_constant_gpr): Add support for
	PADDI to load up and/or add 34-bit integer constants.
	(rs6000_rtx_costs): Treat constants loaded up with PADDI with the
	same cost as normal 16-bit constants.
	* config/rs6000/rs6000.md (movdi_internal64): Add support to load
	up 34-bit integer constants with PADDI.
	(movdi integer constant splitter): Add comment about PADDI.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 277025)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -5524,7 +5524,7 @@ static int
 num_insns_constant_gpr (HOST_WIDE_INT value)
 {
   /* signed constant loadable with addi */
-  if (((unsigned HOST_WIDE_INT) value + 0x8000) < 0x10000)
+  if (SIGNED_16BIT_OFFSET_P (value))
     return 1;
 
   /* constant loadable with addis */
@@ -5532,6 +5532,10 @@ num_insns_constant_gpr (HOST_WIDE_INT va
 	   && (value >> 31 == -1 || value >> 31 == 0))
     return 1;
 
+  /* PADDI can support up to 34 bit signed integers.  */
+  else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (value))
+    return 1;
+
   else if (TARGET_POWERPC64)
     {
       HOST_WIDE_INT low  = ((value & 0xffffffff) ^ 0x80000000) - 0x80000000;
@@ -20641,7 +20645,8 @@ rs6000_rtx_costs (rtx x, machine_mode mo
 	    || outer_code == PLUS
 	    || outer_code == MINUS)
 	   && (satisfies_constraint_I (x)
-	       || satisfies_constraint_L (x)))
+	       || satisfies_constraint_L (x)
+	       || satisfies_constraint_eI (x)))
 	  || (outer_code == AND
 	      && (satisfies_constraint_K (x)
 		  || (mode == SImode
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 277024)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -8823,24 +8823,24 @@ (define_split
   DONE;
 })
 
-;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR #
-;;              FPR store  FPR load   FPR move   AVX store  AVX store   AVX load
-;;              AVX load   VSX move   P9 0       P9 -1      AVX 0/-1    VSX 0
-;;              VSX -1     P9 const   AVX const  From SPR   To SPR      SPR<->SPR
-;;              VSX->GPR   GPR->VSX
+;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR pli
+;;              GPR #      FPR store  FPR load   FPR move   AVX store   AVX store
+;;              AVX load   AVX load   VSX move   P9 0       P9 -1       AVX 0/-1
+;;              VSX 0      VSX -1     P9 const   AVX const  From SPR    To SPR
+;;              SPR<->SPR  VSX->GPR   GPR->VSX
 (define_insn "*movdi_internal64"
   [(set (match_operand:DI 0 "nonimmediate_operand"
                "=YZ,       r,         r,         r,         r,          r,
-                m,         ^d,        ^d,        wY,        Z,          $v,
-                $v,        ^wa,       wa,        wa,        v,          wa,
-                wa,        v,         v,         r,         *h,         *h,
-                ?r,        ?wa")
+                r,         m,         ^d,        ^d,        wY,         Z,
+                $v,        $v,        ^wa,       wa,        wa,         v,
+                wa,        wa,        v,         v,         r,          *h,
+                *h,        ?r,        ?wa")
 	(match_operand:DI 1 "input_operand"
-               "r,         YZ,        r,         I,         L,          nF,
-                ^d,        m,         ^d,        ^v,        $v,         wY,
-                Z,         ^wa,       Oj,        wM,        OjwM,       Oj,
-                wM,        wS,        wB,        *h,        r,          0,
-                wa,        r"))]
+               "r,         YZ,        r,         I,         L,          eI,
+                nF,        ^d,        m,         ^d,        ^v,         $v,
+                wY,        Z,         ^wa,       Oj,        wM,         OjwM,
+                Oj,        wM,        wS,        wB,        *h,         r,
+                0,         wa,        r"))]
   "TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], DImode)
        || gpc_reg_operand (operands[1], DImode))"
@@ -8850,6 +8850,7 @@ (define_insn "*movdi_internal64"
    mr %0,%1
    li %0,%1
    lis %0,%v1
+   li %0,%1
    #
    stfd%U0%X0 %1,%0
    lfd%U1%X1 %0,%1
@@ -8873,26 +8874,28 @@ (define_insn "*movdi_internal64"
    mtvsrd %x0,%1"
   [(set_attr "type"
                "store,      load,	*,         *,         *,         *,
-                fpstore,    fpload,     fpsimple,  fpstore,   fpstore,   fpload,
-                fpload,     veclogical, vecsimple, vecsimple, vecsimple, veclogical,
-                veclogical, vecsimple,  vecsimple, mfjmpr,    mtjmpr,    *,
-                mftgpr,    mffgpr")
+                *,          fpstore,    fpload,    fpsimple,  fpstore,   fpstore,
+                fpload,     fpload,     veclogical,vecsimple, vecsimple, vecsimple,
+                veclogical, veclogical, vecsimple,  vecsimple, mfjmpr,   mtjmpr,
+                *,          mftgpr,    mffgpr")
    (set_attr "size" "64")
    (set_attr "length"
-               "*,         *,         *,         *,         *,          20,
-                *,         *,         *,         *,         *,          *,
+               "*,         *,         *,         *,         *,          *,
+                20,        *,         *,         *,         *,          *,
                 *,         *,         *,         *,         *,          *,
-                *,         8,         *,         *,         *,          *,
-                *,         *")
+                *,         *,         8,         *,         *,          *,
+                *,         *,         *")
    (set_attr "isa"
-               "*,         *,         *,         *,         *,          *,
-                *,         *,         *,         p9v,       p7v,        p9v,
-                p7v,       *,         p9v,       p9v,       p7v,        *,
-                *,         p7v,       p7v,       *,         *,          *,
-                p8v,       p8v")])
+               "*,         *,         *,         *,         *,          fut,
+                *,         *,         *,         *,         p9v,        p7v,
+                p9v,       p7v,       *,         p9v,       p9v,        p7v,
+                *,         *,         p7v,       p7v,       *,          *,
+                *,         p8v,       p8v")])
 
 ; Some DImode loads are best done as a load of -1 followed by a mask
-; instruction.
+; instruction.  On systems that support the PADDI (PLI) instruction,
+; num_insns_constant returns 1, so this splitter would not be used for things
+; that can be loaded with PLI.
 (define_split
   [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
 	(match_operand:DI 1 "const_int_operand"))]

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V6, #7 of 17: Use PADDI/PLI to load up 32-bit SImode constants
  2019-10-16 12:51 PowerPC future machine patches, version 6 Michael Meissner
                   ` (5 preceding siblings ...)
  2019-10-16 14:11 ` [PATCH] V6, #6 of 17: Use PADDI/PLI to load up 34-bit DImode constants Michael Meissner
@ 2019-10-16 14:14 ` Michael Meissner
  2019-10-16 14:21 ` [PATCH] V6, #8 of 17: Use PADDI to add 34-bit constants Michael Meissner
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Michael Meissner @ 2019-10-16 14:14 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch uses PADDI (PLI) to load up 32-bit SImode constants that can't be
loaded with either a single ADDI or ADDIS instruction.  This patch is the same
as V5 patch #9.
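
The condition on the movsi constant splitter after this patch can be modeled
roughly as follows.  This is only a sketch: the function name and the
prefixed_addr parameter (standing in for TARGET_PREFIXED_ADDR) are
hypothetical, and the real condition operates on an RTL operand rather than a
plain integer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the splitter condition: return true when a 32-bit
   constant has to be split into a two-insn sequence.  */
bool
si_constant_needs_split (int32_t value, bool prefixed_addr)
{
  int64_t v = value;

  /* A single LI (ADDI) handles signed 16-bit constants.  */
  bool fits_li = (uint64_t) (v + 0x8000) < UINT64_C (0x10000);

  /* A single LIS (ADDIS) handles constants whose low 16 bits are zero.  */
  bool fits_lis = (v & 0xffff) == 0;

  /* With prefixed addressing, a single PLI loads any 32-bit constant, so
     no split is needed.  */
  return !fits_li && !fits_lis && !prefixed_addr;
}
```

For example, 0x12345678 needs the two-insn sequence without prefixed
addressing, but not when prefixed addressing is available.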

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.md (movsi_internal1): Add support to load
	up 32-bit SImode integer constants with PADDI.
	(movsi integer constant splitter): Do not split constant if PADDI
	can load it up directly.

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 277026)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -6908,22 +6908,22 @@ (define_insn "movsi_low"
 
 ;;		MR           LA           LWZ          LFIWZX       LXSIWZX
 ;;		STW          STFIWX       STXSIWX      LI           LIS
-;;		#            XXLOR        XXSPLTIB 0   XXSPLTIB -1  VSPLTISW
-;;		XXLXOR 0     XXLORC -1    P9 const     MTVSRWZ      MFVSRWZ
-;;		MF%1         MT%0         NOP
+;;		PLI          #            XXLOR        XXSPLTIB 0   XXSPLTIB -1
+;;		VSPLTISW     XXLXOR 0     XXLORC -1    P9 const     MTVSRWZ
+;;		MFVSRWZ      MF%1         MT%0         NOP
 (define_insn "*movsi_internal1"
   [(set (match_operand:SI 0 "nonimmediate_operand"
 		"=r,         r,           r,           d,           v,
 		 m,          Z,           Z,           r,           r,
-		 r,          wa,          wa,          wa,          v,
-		 wa,         v,           v,           wa,          r,
-		 r,          *h,          *h")
+		 r,          r,           wa,          wa,          wa,
+		 v,          wa,          v,           v,           wa,
+		 r,          r,           *h,          *h")
 	(match_operand:SI 1 "input_operand"
 		"r,          U,           m,           Z,           Z,
 		 r,          d,           v,           I,           L,
-		 n,          wa,          O,           wM,          wB,
-		 O,          wM,          wS,          r,           wa,
-		 *h,         r,           0"))]
+		 eI,         n,           wa,          O,           wM,
+		 wB,         O,           wM,          wS,          r,
+		 wa,         *h,          r,           0"))]
   "gpc_reg_operand (operands[0], SImode)
    || gpc_reg_operand (operands[1], SImode)"
   "@
@@ -6937,6 +6937,7 @@ (define_insn "*movsi_internal1"
    stxsiwx %x1,%y0
    li %0,%1
    lis %0,%v1
+   li %0,%1
    #
    xxlor %x0,%x1,%x1
    xxspltib %x0,0
@@ -6953,21 +6954,21 @@ (define_insn "*movsi_internal1"
   [(set_attr "type"
 		"*,          *,           load,        fpload,      fpload,
 		 store,      fpstore,     fpstore,     *,           *,
-		 *,          veclogical,  vecsimple,   vecsimple,   vecsimple,
-		 veclogical, veclogical,  vecsimple,   mffgpr,      mftgpr,
-		 *,          *,           *")
+		 *,          *,           veclogical,  vecsimple,   vecsimple,
+		 vecsimple,  veclogical,  veclogical,  vecsimple,   mffgpr,
+		 mftgpr,     *,           *,           *")
    (set_attr "length"
 		"*,          *,           *,           *,           *,
 		 *,          *,           *,           *,           *,
-		 8,          *,           *,           *,           *,
-		 *,          *,           8,           *,           *,
-		 *,          *,           *")
+		 *,          8,           *,           *,           *,
+		 *,          *,           *,           8,           *,
+		 *,          *,           *,           *")
    (set_attr "isa"
 		"*,          *,           *,           p8v,         p8v,
 		 *,          p8v,         p8v,         *,           *,
-		 *,          p8v,         p9v,         p9v,         p8v,
-		 p9v,        p8v,         p9v,         p8v,         p8v,
-		 *,          *,           *")])
+		 fut,        *,           p8v,         p9v,         p9v,
+		 p8v,        p9v,         p8v,         p9v,         p8v,
+		 p8v,        *,           *,           *")])
 
 ;; Like movsi, but adjust a SF value to be used in a SI context, i.e.
 ;; (set (reg:SI ...) (subreg:SI (reg:SF ...) 0))
@@ -7112,14 +7113,15 @@ (define_insn "*movsi_from_df"
   "xscvdpsp %x0,%x1"
   [(set_attr "type" "fp")])
 
-;; Split a load of a large constant into the appropriate two-insn
-;; sequence.
+;; Split a load of a large constant into the appropriate two-insn sequence.  On
+;; systems that support PADDI (PLI), we can use PLI to load any 32-bit constant
+;; in one instruction.
 
 (define_split
   [(set (match_operand:SI 0 "gpc_reg_operand")
 	(match_operand:SI 1 "const_int_operand"))]
   "(unsigned HOST_WIDE_INT) (INTVAL (operands[1]) + 0x8000) >= 0x10000
-   && (INTVAL (operands[1]) & 0xffff) != 0"
+   && (INTVAL (operands[1]) & 0xffff) != 0 && !TARGET_PREFIXED_ADDR"
   [(set (match_dup 0)
 	(match_dup 2))
    (set (match_dup 0)

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V6, #8 of 17: Use PADDI to add 34-bit constants
  2019-10-16 12:51 PowerPC future machine patches, version 6 Michael Meissner
                   ` (6 preceding siblings ...)
  2019-10-16 14:14 ` [PATCH] V6, #7 of 17: Use PADDI/PLI to load up 32-bit SImode constants Michael Meissner
@ 2019-10-16 14:21 ` Michael Meissner
  2019-10-16 14:25 ` [PATCH] V6, #9 of 17: Change defaults on Linux 64-bit to enable -mpcrel Michael Meissner
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Michael Meissner @ 2019-10-16 14:21 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch uses the PADDI instruction to add 34-bit constants that can't be
added with a single ADDI or ADDIS instruction.  This patch is the same as V5
patch #10.
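
As a sketch of which form of add a given constant would get, the function
below classifies a constant in the same order as the alternatives in the
*add<mode>3 pattern (I, L, eI).  The enum, the function name, and the
prefixed_addr parameter are hypothetical.  It also assumes that the eI
constraint accepts a signed 34-bit constant when prefixed addressing is
enabled; the definition of eI is not part of this message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the instruction that can add the constant.  */
enum add_form
{
  ADD_FORM_ADDI,	/* signed 16-bit constant (constraint I).  */
  ADD_FORM_ADDIS,	/* signed 16-bit constant shifted left 16 (L).  */
  ADD_FORM_PADDI,	/* signed 34-bit constant (assumed meaning of eI).  */
  ADD_FORM_NONE		/* no single add instruction handles it.  */
};

enum add_form
classify_add_constant (int64_t value, bool prefixed_addr)
{
  if (value >= -0x8000 && value <= 0x7fff)
    return ADD_FORM_ADDI;

  /* Low 16 bits zero and the value fits in a signed 32-bit integer.  */
  if ((value & 0xffff) == 0
      && value >= -INT64_C (0x80000000)
      && value <= INT64_C (0x7fffffff))
    return ADD_FORM_ADDIS;

  if (prefixed_addr
      && value >= -(INT64_C (1) << 33)
      && value < (INT64_C (1) << 33))
    return ADD_FORM_PADDI;

  return ADD_FORM_NONE;
}
```

Constants that ADDI or ADDIS can handle keep using the shorter non-prefixed
instruction, since those alternatives are tried first.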

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (add_operand): Add support for
	PADDI.
	* config/rs6000/rs6000.md (add<mode>3): Add support for PADDI.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 277025)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -839,7 +839,8 @@ (define_special_predicate "indexed_addre
 (define_predicate "add_operand"
   (if_then_else (match_code "const_int")
     (match_test "satisfies_constraint_I (op)
-		 || satisfies_constraint_L (op)")
+		 || satisfies_constraint_L (op)
+		 || satisfies_constraint_eI (op)")
     (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is either a non-special register, or 0, or -1.
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 277027)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -1760,15 +1760,17 @@ (define_expand "add<mode>3"
 })
 
 (define_insn "*add<mode>3"
-  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
-	(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
-		  (match_operand:GPR 2 "add_operand" "r,I,L")))]
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r,r")
+	(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b,b")
+		  (match_operand:GPR 2 "add_operand" "r,I,L,eI")))]
   ""
   "@
    add %0,%1,%2
    addi %0,%1,%2
-   addis %0,%1,%v2"
-  [(set_attr "type" "add")])
+   addis %0,%1,%v2
+   addi %0,%1,%2"
+  [(set_attr "type" "add")
+   (set_attr "isa" "*,*,*,fut")])
 
 (define_insn "*addsi3_high"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=b")

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V6, #9 of 17: Change defaults on Linux 64-bit to enable -mpcrel
  2019-10-16 12:51 PowerPC future machine patches, version 6 Michael Meissner
                   ` (7 preceding siblings ...)
  2019-10-16 14:21 ` [PATCH] V6, #8 of 17: Use PADDI to add 34-bit constants Michael Meissner
@ 2019-10-16 14:25 ` Michael Meissner
  2019-10-16 14:26 ` [PATCH] V6, #10 of 17: Update target-supports.exp Michael Meissner
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Michael Meissner @ 2019-10-16 14:25 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch changes the defaults for 64-bit Linux so that -mpcrel and
-mprefixed-addr are enabled when you use -mcpu=future.  Other OS targets do not
enable these switches by default.  This is the same as V5 patch #11.
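
The order of the checks in rs6000_option_override_internal can be modeled with
the simplified sketch below.  It is not the code from the patch: the struct,
its field names, and the function name are hypothetical, plain booleans stand
in for the option masks and target macros, and all of the error reporting is
left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the option state the real code consults.  */
struct addr_opts
{
  bool future;			/* -mcpu=future.  */
  bool powerpc64;		/* 64-bit registers.  */
  bool elfv2;			/* ELFv2 ABI.  */
  bool cmodel_medium;		/* -mcmodel=medium.  */
  bool prefixed;		/* current -mprefixed-addr state.  */
  bool pcrel;			/* current -mpcrel state.  */
  bool explicit_prefixed;	/* user gave -m[no-]prefixed-addr.  */
  bool explicit_pcrel;		/* user gave -m[no-]pcrel.  */
  bool prefixed_default;	/* OS tm.h default for prefixed.  */
  bool pcrel_default;		/* OS tm.h default for pc-relative.  */
};

void
resolve_addressing (struct addr_opts *o)
{
  if (o->future)
    {
      /* Prefixed addressing needs 64-bit registers and ELFv2.  */
      if (!o->powerpc64 || !o->elfv2)
	o->prefixed = o->pcrel = false;

      /* Pc-relative needs the medium code model.  */
      else if (o->pcrel && !o->cmodel_medium)
	o->pcrel = false;

      /* Otherwise apply the OS defaults unless the user chose.  */
      else
	{
	  if (!o->explicit_prefixed
	      && (o->prefixed_default || o->pcrel || o->pcrel_default))
	    o->prefixed = true;

	  if (!o->explicit_pcrel && o->pcrel_default && o->cmodel_medium)
	    o->pcrel = true;
	}
    }

  /* Prefixed addressing (and hence pc-relative) requires -mcpu=future.  */
  if (o->prefixed && !o->future)
    o->prefixed = o->pcrel = false;

  /* Pc-relative requires prefixed addressing.  */
  if (o->pcrel && !o->prefixed)
    o->pcrel = false;
}
```

In this model, a target whose defaults are both off still gets prefixed
addressing turned on when the user passes -mpcrel with -mcpu=future, which
matches the intent that -mpcrel should not also require -mprefixed-addr.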

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/linux64.h (TARGET_PREFIXED_ADDR_DEFAULT): Enable
	prefixed addressing by default.
	(TARGET_PCREL_DEFAULT): Enable pc-relative addressing by default.
	* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Only
	enable -mprefixed-addr and -mpcrel if the OS tm.h says to enable
	it.
	(ADDRESSING_FUTURE_MASKS): New mask macro.
	(OTHER_FUTURE_MASKS): Use ADDRESSING_FUTURE_MASKS.
	* config/rs6000/rs6000.c (TARGET_PREFIXED_ADDR_DEFAULT): Do not
	enable -mprefixed-addr unless the OS tm.h says to.
	(TARGET_PCREL_DEFAULT): Do not enable -mpcrel unless the OS tm.h
	says to.
	(rs6000_option_override_internal): Do not enable -mprefixed-addr
	or -mpcrel unless the OS tm.h says to enable it.  Add more checks
	for -mcpu=future.

Index: gcc/config/rs6000/linux64.h
===================================================================
--- gcc/config/rs6000/linux64.h	(revision 276974)
+++ gcc/config/rs6000/linux64.h	(working copy)
@@ -640,3 +640,11 @@ extern int dot_symbols;
    enabling the __float128 keyword.  */
 #undef	TARGET_FLOAT128_ENABLE_TYPE
 #define TARGET_FLOAT128_ENABLE_TYPE 1
+
+/* Enable support for pc-relative and numeric prefixed addressing on the
+   'future' system.  */
+#undef  TARGET_PREFIXED_ADDR_DEFAULT
+#define TARGET_PREFIXED_ADDR_DEFAULT	1
+
+#undef  TARGET_PCREL_DEFAULT
+#define TARGET_PCREL_DEFAULT		1
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(revision 276974)
+++ gcc/config/rs6000/rs6000-cpus.def	(working copy)
@@ -75,15 +75,21 @@
 				 | OPTION_MASK_P8_VECTOR		\
 				 | OPTION_MASK_P9_VECTOR)
 
-/* Support for a future processor's features.  Do not enable -mpcrel until it
-   is fully functional.  */
+/* Support for a future processor's features.  The prefixed and pc-relative
+   addressing bits are not added here.  Instead, rs6000.c adds them if the OS
+   tm.h says that it supports the addressing modes.  */
 #define ISA_FUTURE_MASKS_SERVER	(ISA_3_0_MASKS_SERVER			\
-				 | OPTION_MASK_FUTURE			\
+				 | OPTION_MASK_FUTURE)
+
+/* Addressing related flags on a future processor.  These flags are broken out
+   because not all targets will support either pc-relative addressing, or even
+   prefixed addressing, and we want to clear all of the addressing bits
+   on targets that cannot support prefixed/pcrel addressing.  */
+#define ADDRESSING_FUTURE_MASKS	(OPTION_MASK_PCREL			\
 				 | OPTION_MASK_PREFIXED_ADDR)
 
 /* Flags that need to be turned off if -mno-future.  */
-#define OTHER_FUTURE_MASKS	(OPTION_MASK_PCREL			\
-				 | OPTION_MASK_PREFIXED_ADDR)
+#define OTHER_FUTURE_MASKS	ADDRESSING_FUTURE_MASKS
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
 #define OTHER_P9_VECTOR_MASKS	(OPTION_MASK_FLOAT128_HW		\
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 277026)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -98,6 +98,16 @@
 #endif
 #endif
 
+/* Set up the defaults for whether prefixed addressing is used, and if it is
+   used, whether we want to turn on pc-relative support by default.  */
+#ifndef TARGET_PREFIXED_ADDR_DEFAULT
+#define TARGET_PREFIXED_ADDR_DEFAULT	0
+#endif
+
+#ifndef TARGET_PCREL_DEFAULT
+#define TARGET_PCREL_DEFAULT		0
+#endif
+
 /* Support targetm.vectorize.builtin_mask_for_load.  */
 GTY(()) tree altivec_builtin_mask_for_load;
 
@@ -2532,6 +2542,14 @@ rs6000_debug_reg_global (void)
   if (TARGET_DIRECT_MOVE_128)
     fprintf (stderr, DEBUG_FMT_D, "VSX easy 64-bit mfvsrld element",
 	     (int)VECTOR_ELEMENT_MFVSRLD_64BIT);
+
+  if (TARGET_FUTURE)
+    {
+      fprintf (stderr, DEBUG_FMT_D, "TARGET_PREFIXED_ADDR_DEFAULT",
+	       TARGET_PREFIXED_ADDR_DEFAULT);
+      fprintf (stderr, DEBUG_FMT_D, "TARGET_PCREL_DEFAULT",
+	       TARGET_PCREL_DEFAULT);
+    }
 }
 
 \f
@@ -4012,26 +4030,6 @@ rs6000_option_override_internal (bool gl
       rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_HW;
     }
 
-  /* -mprefixed-addr (and hence -mpcrel) requires -mcpu=future.  */
-  if (TARGET_PREFIXED_ADDR && !TARGET_FUTURE)
-    {
-      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mcpu=future");
-      else if ((rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED_ADDR) != 0)
-	error ("%qs requires %qs", "-mprefixed-addr", "-mcpu=future");
-
-      rs6000_isa_flags &= ~(OPTION_MASK_PCREL | OPTION_MASK_PREFIXED_ADDR);
-    }
-
-  /* -mpcrel requires prefixed load/store addressing.  */
-  if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
-    {
-      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
-
-      rs6000_isa_flags &= ~OPTION_MASK_PCREL;
-    }
-
   /* Print the options after updating the defaults.  */
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
     rs6000_print_isa_options (stderr, 0, "after defaults", rs6000_isa_flags);
@@ -4163,12 +4161,89 @@ rs6000_option_override_internal (bool gl
   SUB3TARGET_OVERRIDE_OPTIONS;
 #endif
 
-  /* -mpcrel requires -mcmodel=medium, but we can't check TARGET_CMODEL until
-      after the subtarget override options are done.  */
-  if (TARGET_PCREL && TARGET_CMODEL != CMODEL_MEDIUM)
+  /* Enable prefixed addressing and pc-relative addressing on 64-bit ELF v2
+     systems if the OS tm.h file says that it is supported and the user did not
+     explicitly use -mprefixed-addr or -mpcrel.  At the present time, only
+     64-bit Linux enables this.
+
+     Pc-relative support also requires the medium code model.
+
+     However, we can't check for ELFv2 or -mcmodel=medium until after the
+     subtarget macros are run.
+
+     If prefixed addressing is disabled by default, and the user does -mpcrel,
+     don't force them to also specify -mprefixed-addr.  */
+  if (TARGET_FUTURE)
+    {
+      bool explicit_prefixed = ((rs6000_isa_flags_explicit
+				 & OPTION_MASK_PREFIXED_ADDR) != 0);
+      bool explicit_pcrel = ((rs6000_isa_flags_explicit
+			      & OPTION_MASK_PCREL) != 0);
+
+      /* Prefixed addressing requires 64-bit registers.  */
+      if (!TARGET_POWERPC64)
+	{
+	  if (TARGET_PCREL && explicit_pcrel)
+	    error ("%qs requires %qs", "-mpcrel", "-m64");
+
+	  else if (TARGET_PREFIXED_ADDR && explicit_prefixed)
+	    error ("%qs requires %qs", "-mprefixed-addr", "-m64");
+
+	  rs6000_isa_flags &= ~ADDRESSING_FUTURE_MASKS;
+	}
+
+      /* Only ELFv2 currently supports prefixed/pcrel addressing.  */
+      else if (rs6000_current_abi != ABI_ELFv2)
+	{
+	  if (TARGET_PCREL && explicit_pcrel)
+	    error ("%qs requires %qs", "-mpcrel", "-mabi=elfv2");
+
+	  else if (TARGET_PREFIXED_ADDR && explicit_prefixed)
+	    error ("%qs requires %qs", "-mprefixed-addr", "-mabi=elfv2");
+
+	  rs6000_isa_flags &= ~ADDRESSING_FUTURE_MASKS;
+	}
+
+      /* Pc-relative requires the medium code model.  */
+      else if (TARGET_PCREL && TARGET_CMODEL != CMODEL_MEDIUM)
+	{
+	  if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+	    error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");
+
+	  rs6000_isa_flags &= ~OPTION_MASK_PCREL;
+	}
+
+      /* Enable defaults if desired.  */
+      else
+	{
+	  if (!explicit_prefixed
+	      && (TARGET_PREFIXED_ADDR_DEFAULT
+		  || TARGET_PCREL
+		  || TARGET_PCREL_DEFAULT))
+	    rs6000_isa_flags |= OPTION_MASK_PREFIXED_ADDR;
+
+	  if (!explicit_pcrel && TARGET_PCREL_DEFAULT
+	      && TARGET_CMODEL == CMODEL_MEDIUM)
+	    rs6000_isa_flags |= OPTION_MASK_PCREL;
+	}
+    }
+
+  /* -mprefixed-addr (and hence -mpcrel) requires -mcpu=future.  */
+  if (TARGET_PREFIXED_ADDR && !TARGET_FUTURE)
     {
       if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");
+	error ("%qs requires %qs", "-mpcrel", "-mcpu=future");
+      else if ((rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED_ADDR) != 0)
+	error ("%qs requires %qs", "-mprefixed-addr", "-mcpu=future");
+
+      rs6000_isa_flags &= ~(OPTION_MASK_PCREL | OPTION_MASK_PREFIXED_ADDR);
+    }
+
+  /* -mpcrel requires prefixed load/store addressing.  */
+  if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
+    {
+      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+	error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
 
       rs6000_isa_flags &= ~OPTION_MASK_PCREL;
     }
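
For readers following the option handling above, the order of the checks can
be modeled as a small standalone C sketch.  This is not the rs6000 code: the
struct, its field names, and resolve_addr_opts are hypothetical stand-ins, and
the sketch leaves out error reporting and the TARGET_*_DEFAULT handling.  It
only illustrates which flags end up cleared, and that -mpcrel pulls in
prefixed addressing unless the user said otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state the patch consults; these are not
   GCC's names.  */
struct addr_opts
{
  bool cpu_future;		/* -mcpu=future */
  bool is_64bit;		/* -m64 */
  bool abi_elfv2;		/* -mabi=elfv2 */
  bool cmodel_medium;		/* -mcmodel=medium */
  bool prefixed;		/* prefixed addressing currently enabled */
  bool prefixed_explicit;	/* user gave -m[no-]prefixed-addr */
  bool pcrel;			/* PC-relative addressing currently enabled */
};

/* Clear the addressing flags the configuration cannot support.  */
static struct addr_opts
resolve_addr_opts (struct addr_opts o)
{
  /* Both features need -mcpu=future, 64-bit registers, and ELFv2.  */
  if (!o.cpu_future || !o.is_64bit || !o.abi_elfv2)
    o.prefixed = o.pcrel = false;

  /* PC-relative addressing additionally needs the medium code model.  */
  if (o.pcrel && !o.cmodel_medium)
    o.pcrel = false;

  /* -mpcrel turns on prefixed addressing unless the user set it.  */
  if (o.pcrel && !o.prefixed_explicit)
    o.prefixed = true;

  /* PC-relative addressing cannot work without prefixed addressing.  */
  if (o.pcrel && !o.prefixed)
    o.pcrel = false;

  assert (!o.pcrel || o.prefixed);
  return o;
}
```

In this model, a request for PC-relative addressing with the small code model
comes back with pcrel cleared, and an explicit -mno-prefixed-addr combined
with -mpcrel also ends with pcrel cleared.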

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V6, #10 of 17: Update target-supports.exp
From: Michael Meissner @ 2019-10-16 14:26 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch adds 2 new target supports options for the testsuite.  One checks
whether prefixed instructions with 34-bit offsets are supported.  The other
checks whether PC-relative instructions are supported.  This was originally
part of V5 patch #12, but it was split out to be a separate patch.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15   Michael Meissner  <meissner@linux.ibm.com>

	* lib/target-supports.exp
	(check_effective_target_powerpc_future_ok): Do not require 64-bit
	or Linux support before doing the test.  Use a 32-bit constant in
	PLI.
	(check_effective_target_powerpc_prefixed_addr_ok): New effective
	target test to see if prefixed memory instructions are supported.
	(check_effective_target_powerpc_pcrel_ok): New effective target
	test to test whether PC-relative addressing is supported.
	(is-effective-target): Add test for the PowerPC 'future' hardware
	support.

Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	(revision 276974)
+++ gcc/testsuite/lib/target-supports.exp	(working copy)
@@ -5307,16 +5307,14 @@ proc check_effective_target_powerpc_p9mo
     }
 }
 
-# Return 1 if this is a PowerPC target supporting -mfuture.
-# Limit this to 64-bit linux systems for now until other
-# targets support FUTURE.
+# Return 1 if this is a PowerPC target supporting -mcpu=future.
 
 proc check_effective_target_powerpc_future_ok { } {
-    if { ([istarget powerpc64*-*-linux*]) } {
+    if { ([istarget powerpc*-*-*]) } {
 	return [check_no_compiler_messages powerpc_future_ok object {
 	    int main (void) {
 		long e;
-		asm ("pli %0,%1" : "=r" (e) : "n" (0x12345));
+		asm ("pli %0,%1" : "=r" (e) : "n" (0x1234));
 		return e;
 	    }
 	} "-mfuture"]
@@ -5325,6 +5323,46 @@ proc check_effective_target_powerpc_futu
     }
 }
 
+# Return 1 if this is a PowerPC target supporting -mcpu=future.  The compiler
+# must support large numeric prefixed addresses by default when -mfuture is
+# used.  We test loading up a large constant to verify that the full 34-bit
+# offset for prefixed instructions is supported and we check for a prefixed
+# load as well.
+
+proc check_effective_target_powerpc_prefixed_addr_ok { } {
+    if { ([istarget powerpc*-*-*]) } {
+	return [check_no_compiler_messages powerpc_prefixed_addr_ok object {
+	    int main (void) {
+		extern long l[];
+		long e, e2;
+		asm ("pli %0,%1" : "=r" (e) : "n" (0x12345678));
+		asm ("pld %0,0x12345678(%1)" : "=r" (e2) : "r" (& l[0]));
+		return e - e2;
+	    }
+	} "-mfuture"]
+    } else {
+	return 0
+    }
+}
+
+# Return 1 if this is a PowerPC target supporting -mfuture.  The compiler must
+# support PC-relative addressing when -mcpu=future is used to pass this test.
+
+proc check_effective_target_powerpc_pcrel_ok { } {
+    if { ([istarget powerpc*-*-*]) } {
+	return [check_no_compiler_messages powerpc_pcrel_ok object {
+	      int main (void) {
+		  static int s __attribute__((__used__));
+		  int e;
+		  asm ("plwa %0,s@pcrel(0),1" : "=r" (e));
+		  return e;
+	      }
+	  } "-mfuture"]
+      } else {
+	  return 0
+      }
+}
+
 # Return 1 if this is a PowerPC target supporting -mfloat128 via either
 # software emulation on power7/power8 systems or hardware support on power9.
 
@@ -7223,6 +7261,7 @@ proc is-effective-target { arg } {
 	  "named_sections" { set selected [check_named_sections_available] }
 	  "gc_sections"    { set selected [check_gc_sections_available] }
 	  "cxa_atexit"     { set selected [check_cxa_atexit_available] }
+	  "powerpc_future_hw" { set selected [check_powerpc_future_hw_available] }
 	  default          { error "unknown effective target keyword `$arg'" }
 	}
     }

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V6, #11 of 17: Add PADDI tests
From: Michael Meissner @ 2019-10-16 14:27 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch adds 3 tests for the PADDI instruction.  This was originally part of
V5 patch #12, but it was split out.
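
As background for what these tests exercise, here is a minimal sketch of how
a 34-bit signed immediate can be carried by a prefixed instruction.  It
assumes the layout later published in Power ISA 3.1 for PADDI, where the
prefix word holds the upper 18 bits (si0) and the suffix word holds the lower
16 bits (si1).  The struct and function names are made up for illustration
and are not part of GCC or binutils.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the two immediate fields.  */
struct split_imm
{
  uint32_t si0;		/* upper 18 bits, carried in the prefix word */
  uint32_t si1;		/* lower 16 bits, carried in the suffix word */
};

/* Split a value that fits in a signed 34-bit field.  */
static struct split_imm
split_si34 (int64_t value)
{
  assert (value >= -(INT64_C (1) << 33) && value < (INT64_C (1) << 33));

  uint64_t bits = (uint64_t) value & ((UINT64_C (1) << 34) - 1);
  struct split_imm s = { (uint32_t) (bits >> 16),
			 (uint32_t) (bits & 0xffff) };
  return s;
}

/* Recombine the fields and sign-extend from bit 33.  */
static int64_t
join_si34 (struct split_imm s)
{
  uint64_t bits = ((uint64_t) s.si0 << 16) | s.si1;
  uint64_t sign = UINT64_C (1) << 33;

  return (int64_t) (bits ^ sign) - (int64_t) sign;
}
```

With this split, the constant 0x12345678 used in the tests has si0 = 0x1234
and si1 = 0x5678, and joining the two fields gives back the original value.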

2019-10-15   Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/paddi-1.c: New test to test using PADDI to
	add a large DImode constant.
	* gcc.target/powerpc/paddi-2.c: New test to test using PLI to
	load up a large DImode constant.
	* gcc.target/powerpc/paddi-3.c: New test to test using PLI to
	load up a large SImode constant.

Index: gcc/testsuite/gcc.target/powerpc/paddi-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/paddi-1.c	(revision 276774)
+++ gcc/testsuite/gcc.target/powerpc/paddi-1.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PADDI is generated to add a large constant.  */
+unsigned long
+add (unsigned long a)
+{
+  return a + 0x12345678UL;
+}
+
+/* { dg-final { scan-assembler {\mpaddi\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/paddi-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/paddi-2.c	(revision 276774)
+++ gcc/testsuite/gcc.target/powerpc/paddi-2.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant.  */
+unsigned long
+large (void)
+{
+  return 0x12345678UL;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/paddi-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/paddi-3.c	(revision 276774)
+++ gcc/testsuite/gcc.target/powerpc/paddi-3.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant for SImode.  */
+void
+large_si (unsigned int *p)
+{
+  *p = 0x12345U;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V6, #12 of 17: Add prefix test for DS/DQ instructions
From: Michael Meissner @ 2019-10-16 14:37 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch adds a new test that makes sure the appropriate prefixed instruction
is generated when a memory access would use a DS-format or DQ-format
instruction but the offset does not meet the DS or DQ constraints.  This test
was part of V5 patch #12 and has been split out into this patch.
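
To make the constraint concrete, here is a rough C model of the decision the
test is checking.  It assumes the usual PowerPC rules: D-form offsets are any
signed 16-bit value, DS-form offsets (for example LD, STD, LWA) must also be
a multiple of 4, and DQ-form offsets (for example LXV, STXV) must be a
multiple of 16.  The enum and function names are hypothetical and do not
correspond to functions in rs6000.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tags for the non-prefixed instruction formats.  */
enum insn_form { FORM_D, FORM_DS, FORM_DQ };

/* Return true if VALUE fits in a signed field that is BITS wide.  */
static bool
fits_signed (int64_t value, unsigned bits)
{
  assert (bits >= 1 && bits <= 63);

  int64_t limit = INT64_C (1) << (bits - 1);
  return value >= -limit && value < limit;
}

/* Return true if OFFSET is legal for the non-prefixed form.  */
static bool
offset_ok_nonprefixed (enum insn_form form, int64_t offset)
{
  if (!fits_signed (offset, 16))
    return false;

  switch (form)
    {
    case FORM_DS:
      return (offset & 3) == 0;		/* bottom 2 bits must be zero */
    case FORM_DQ:
      return (offset & 15) == 0;	/* bottom 4 bits must be zero */
    default:
      return true;
    }
}

/* Return true if only the prefixed form can encode OFFSET.  */
static bool
needs_prefixed (enum insn_form form, int64_t offset)
{
  return !offset_ok_nonprefixed (form, offset) && fits_signed (offset, 34);
}
```

For the offset of 1 used throughout the test, needs_prefixed is true for
FORM_DS and FORM_DQ but false for FORM_D, which matches the test expecting
PLD, PLWA, and PLXV while LWZ, LFS, and LFD stay non-prefixed.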

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/prefix-odd-memory.c: New test to make sure
	prefixed instructions are generated if an offset would not be
	legal for the non-prefixed DS/DQ instructions.

Index: gcc/testsuite/gcc.target/powerpc/prefix-odd-memory.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-odd-memory.c	(revision 277037)
+++ gcc/testsuite/gcc.target/powerpc/prefix-odd-memory.c	(working copy)
@@ -0,0 +1,156 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests whether we can generate a prefixed load/store operation for addresses
+   that don't meet DS/DQ alignment constraints.  */
+
+unsigned long
+load_uc_odd (unsigned char *p)
+{
+  return p[1];				/* should generate LBZ.  */
+}
+
+long
+load_sc_odd (signed char *p)
+{
+  return p[1];				/* should generate LBZ + EXTSB.  */
+}
+
+unsigned long
+load_us_odd (unsigned char *p)
+{
+  return *(unsigned short *)(p + 1);	/* should generate LHZ.  */
+}
+
+long
+load_ss_odd (unsigned char *p)
+{
+  return *(short *)(p + 1);		/* should generate LHA.  */
+}
+
+unsigned long
+load_ui_odd (unsigned char *p)
+{
+  return *(unsigned int *)(p + 1);	/* should generate LWZ.  */
+}
+
+long
+load_si_odd (unsigned char *p)
+{
+  return *(int *)(p + 1);		/* should generate PLWA.  */
+}
+
+unsigned long
+load_ul_odd (unsigned char *p)
+{
+  return *(unsigned long *)(p + 1);	/* should generate PLD.  */
+}
+
+long
+load_sl_odd (unsigned char *p)
+{
+  return *(long *)(p + 1);	/* should generate PLD.  */
+}
+
+float
+load_float_odd (unsigned char *p)
+{
+  return *(float *)(p + 1);		/* should generate LFS.  */
+}
+
+double
+load_double_odd (unsigned char *p)
+{
+  return *(double *)(p + 1);		/* should generate LFD.  */
+}
+
+__ieee128
+load_ieee128_odd (unsigned char *p)
+{
+  return *(__ieee128 *)(p + 1);		/* should generate PLXV.  */
+}
+
+void
+store_uc_odd (unsigned char uc, unsigned char *p)
+{
+  p[1] = uc;				/* should generate STB.  */
+}
+
+void
+store_sc_odd (signed char sc, signed char *p)
+{
+  p[1] = sc;				/* should generate STB.  */
+}
+
+void
+store_us_odd (unsigned short us, unsigned char *p)
+{
+  *(unsigned short *)(p + 1) = us;	/* should generate STH.  */
+}
+
+void
+store_ss_odd (signed short ss, unsigned char *p)
+{
+  *(signed short *)(p + 1) = ss;	/* should generate STH.  */
+}
+
+void
+store_ui_odd (unsigned int ui, unsigned char *p)
+{
+  *(unsigned int *)(p + 1) = ui;	/* should generate STW.  */
+}
+
+void
+store_si_odd (signed int si, unsigned char *p)
+{
+  *(signed int *)(p + 1) = si;		/* should generate STW.  */
+}
+
+void
+store_ul_odd (unsigned long ul, unsigned char *p)
+{
+  *(unsigned long *)(p + 1) = ul;	/* should generate PSTD.  */
+}
+
+void
+store_sl_odd (signed long sl, unsigned char *p)
+{
+  *(signed long *)(p + 1) = sl;		/* should generate PSTD.  */
+}
+
+void
+store_float_odd (float f, unsigned char *p)
+{
+  *(float *)(p + 1) = f;		/* should generate STFS.  */
+}
+
+void
+store_double_odd (double d, unsigned char *p)
+{
+  *(double *)(p + 1) = d;		/* should generate STFD.  */
+}
+
+void
+store_ieee128_odd (__ieee128 ieee, unsigned char *p)
+{
+  *(__ieee128 *)(p + 1) = ieee;		/* should generate PSTXV.  */
+}
+
+/* { dg-final { scan-assembler-times {\mextsb\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mlbz\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mlfd\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlfs\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlha\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlhz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlwz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mpld\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mplwa\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mplxv\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstb\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mstfd\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mstfs\M}  1 } } */
+/* { dg-final { scan-assembler-times {\msth\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mstw\M}   2 } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V6, #13 of 17: Add test for prefix pre-modify
From: Michael Meissner @ 2019-10-16 14:40 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch adds a test to make sure the GCC compiler does not try to issue a
pre-modify prefixed address load/store since the prefixed instructions do not
support an update form.  This patch was in V5 patch #12 but it was split out.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/prefix-premodify.c: New test to make sure we
	do not generate PRE_INC, PRE_DEC, or PRE_MODIFY on prefixed loads
	or stores.

Index: gcc/testsuite/gcc.target/powerpc/prefix-premodify.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-premodify.c	(revision 277039)
+++ gcc/testsuite/gcc.target/powerpc/prefix-premodify.c	(working copy)
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Make sure that we don't try to generate a prefixed form of the load and
+   store with update instructions.  */
+
+#ifndef SIZE
+#define SIZE 50000
+#endif
+
+struct foo {
+  unsigned int field;
+  char pad[SIZE];
+};
+
+struct foo *inc_load (struct foo *p, unsigned int *q)
+{
+  *q = (++p)->field;
+  return p;
+}
+
+struct foo *dec_load (struct foo *p, unsigned int *q)
+{
+  *q = (--p)->field;
+  return p;
+}
+
+struct foo *inc_store (struct foo *p, unsigned int *q)
+{
+  (++p)->field = *q;
+  return p;
+}
+
+struct foo *dec_store (struct foo *p, unsigned int *q)
+{
+  (--p)->field = *q;
+  return p;
+}
+
+/* { dg-final { scan-assembler-times {\mpli\M|\mpla\M|\mpaddi\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mplwz\M}                  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}                  2 } } */
+/* { dg-final { scan-assembler-not   {\mp?lwzu\M}                  } } */
+/* { dg-final { scan-assembler-not   {\mp?stwu\M}                  } } */
+/* { dg-final { scan-assembler-not   {\maddis\M}                   } } */
+/* { dg-final { scan-assembler-not   {\maddi\M}                    } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V6, #14 of 17: Add prefixed load/store tests with large offsets
From: Michael Meissner @ 2019-10-16 14:42 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch adds tests for each of the types to verify that the compiler
generates the appropriate instructions for addresses whose offset does not fit
in 16 bits.  This patch is essentially V5 patch #13.
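
As a sanity check on the constant used by these tests, the byte offset of
p[CONSTANT] can be classified with a few lines of C.  This is only a sketch:
DEMO_CONSTANT mirrors the default CONSTANT from prefix-large.h, while the
enum and the function name are hypothetical.  The limits are the largest
positive values of a signed 16-bit and a signed 34-bit field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as the default CONSTANT in prefix-large.h.  */
#define DEMO_CONSTANT 0x123450UL

/* Hypothetical classification of a non-negative byte offset.  */
enum offset_class { OFFSET_16BIT, OFFSET_34BIT, OFFSET_TOO_LARGE };

/* Classify the byte offset of element INDEX in an array whose elements
   are ELEM_SIZE bytes wide.  */
static enum offset_class
classify_element_offset (uint64_t index, size_t elem_size)
{
  assert (elem_size > 0);

  uint64_t bytes = index * (uint64_t) elem_size;

  if (bytes <= UINT64_C (0x7fff))
    return OFFSET_16BIT;	/* fits a non-prefixed D-form offset */
  if (bytes <= UINT64_C (0x1ffffffff))
    return OFFSET_34BIT;	/* needs a prefixed instruction */
  return OFFSET_TOO_LARGE;	/* needs the address built separately */
}
```

Even for a 1-byte element, DEMO_CONSTANT is already 0x123450 bytes, which is
past the 16-bit range, and for a 16-byte vector element it is 0x1234500 bytes,
still well inside the 34-bit range.  So every TYPE used by the tests lands in
the OFFSET_34BIT class.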

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* gcc/testsuite/gcc.target/powerpc/prefix-large.h: New set of
	tests to test prefixed addressing on 'future' system with large
	numeric offsets.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-df.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-di.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-si.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c: New test.

Index: gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE _Decimal64
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-df.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-df.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE double
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-di.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-di.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-di.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE long
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE short
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplh[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE __float128
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE signed char
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c	(working copy)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE _Decimal32
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mpaddi\M|\mpli\M|\mpla\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mlfiwzx\M}                2 } } */
+/* { dg-final { scan-assembler-times {\mstfiwx\M}                2 } } */
+
+
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE float
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfs\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfs\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-si.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-si.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-si.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE int
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplw[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE unsigned long
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE unsigned short
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplhz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE unsigned char
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE unsigned int
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplwz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE vector double
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large.h
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large.h	(revision 276777)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large.h	(working copy)
@@ -0,0 +1,59 @@
+/* Common tests for prefixed instructions testing whether we can generate a
+   34-bit offset using 1 instruction.  */
+
+typedef signed char	schar;
+typedef unsigned char	uchar;
+typedef unsigned short	ushort;
+typedef unsigned int	uint;
+typedef unsigned long	ulong;
+typedef long double	ldouble;
+typedef vector double	v2df;
+typedef vector long	v2di;
+typedef vector float	v4sf;
+typedef vector int	v4si;
+
+#ifndef TYPE
+#define TYPE ulong
+#endif
+
+#ifndef ITYPE
+#define ITYPE TYPE
+#endif
+
+#ifndef OTYPE
+#define OTYPE TYPE
+#endif
+
+#if !defined(DO_ADD) && !defined(DO_VALUE) && !defined(DO_SET)
+#define DO_ADD		1
+#define DO_VALUE	1
+#define DO_SET		1
+#endif
+
+#ifndef CONSTANT
+#define CONSTANT	0x123450UL
+#endif
+
+#if DO_ADD
+void
+add (TYPE *p, TYPE a)
+{
+  p[CONSTANT] += a;
+}
+#endif
+
+#if DO_VALUE
+OTYPE
+value (TYPE *p)
+{
+  return p[CONSTANT];
+}
+#endif
+
+#if DO_SET
+void
+set (TYPE *p, ITYPE a)
+{
+  p[CONSTANT] = a;
+}
+#endif

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V6, #15 of 17: Add PC-relative tests
  2019-10-16 12:51 PowerPC future machine patches, version 6 Michael Meissner
                   ` (13 preceding siblings ...)
  2019-10-16 14:42 ` [PATCH] V6, #14 of 17: Add prefixed load/store tests with large offsets Michael Meissner
@ 2019-10-16 14:50 ` Michael Meissner
  2019-10-16 14:52 ` [PATCH] V6, #16 of 17: New test for stack protection Michael Meissner
  2019-10-16 14:58 ` [PATCH] V6, #17 of 17: Add stack protection test Michael Meissner
  16 siblings, 0 replies; 30+ messages in thread
From: Michael Meissner @ 2019-10-16 14:50 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch adds PC-relative tests for the various types, and verifies that the
expected instructions are generated.  This is the same as V5 patch #14.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h: New header with
	common tests for PC-relative prefixed addressing on the 'future'
	system.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c: New test.
	* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c: New test.

Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for DDmode.  */
+
+#define TYPE _Decimal64
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for DFmode.  */
+
+#define TYPE double
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for DImode.  */
+
+#define TYPE long
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for HImode.  */
+
+#define TYPE short
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplh[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for KFmode.  */
+
+#define TYPE __float128
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for QImode.  */
+
+#define TYPE signed char
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for SDmode.  */
+
+#define TYPE _Decimal32
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mpaddi|\mpla\M} 3 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for SFmode.  */
+
+#define TYPE float
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplfs\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfs\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for SImode.  */
+
+#define TYPE int
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplw[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for unsigned DImode.  */
+
+#define TYPE unsigned long
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for unsigned HImode.  */
+
+#define TYPE unsigned short
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplhz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for unsigned QImode.  */
+
+#define TYPE unsigned char
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for unsigned SImode.  */
+
+#define TYPE unsigned int
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplwz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for V2DFmode.  */
+
+#define TYPE vector double
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h	(revision 276779)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h	(working copy)
@@ -0,0 +1,58 @@
+/* Common tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for each type.  */
+
+typedef signed char	schar;
+typedef unsigned char	uchar;
+typedef unsigned short	ushort;
+typedef unsigned int	uint;
+typedef unsigned long	ulong;
+typedef long double	ldouble;
+typedef vector double	v2df;
+typedef vector long	v2di;
+typedef vector float	v4sf;
+typedef vector int	v4si;
+
+#ifndef TYPE
+#define TYPE ulong
+#endif
+
+#ifndef ITYPE
+#define ITYPE TYPE
+#endif
+
+#ifndef OTYPE
+#define OTYPE TYPE
+#endif
+
+static TYPE a;
+TYPE *p = &a;
+
+#if !defined(DO_ADD) && !defined(DO_VALUE) && !defined(DO_SET)
+#define DO_ADD		1
+#define DO_VALUE	1
+#define DO_SET		1
+#endif
+
+#if DO_ADD
+void
+add (TYPE b)
+{
+  a += b;
+}
+#endif
+
+#if DO_VALUE
+OTYPE
+value (void)
+{
+  return (OTYPE)a;
+}
+#endif
+
+#if DO_SET
+void
+set (ITYPE b)
+{
+  a = (TYPE)b;
+}
+#endif

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V6, #16 of 17: New test for stack protection
  2019-10-16 12:51 PowerPC future machine patches, version 6 Michael Meissner
                   ` (14 preceding siblings ...)
  2019-10-16 14:50 ` [PATCH] V6, #15 of 17: Add PC-relative tests Michael Meissner
@ 2019-10-16 14:52 ` Michael Meissner
  2019-10-16 14:54   ` [PATCH] V6, #16 of 17: Wrong subject, should have been update @pcrel Michael Meissner
  2019-10-16 14:58 ` [PATCH] V6, #17 of 17: Add stack protection test Michael Meissner
  16 siblings, 1 reply; 30+ messages in thread
From: Michael Meissner @ 2019-10-16 14:52 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This patch adds an explicit (0),1 to labels used with the @pcrel syntax.  The
intention is to make sure that the user does not use an instruction that
assumes PC-relative instructions can take a base register (as I did in the V4
patches).  This was V5 patch #15.  This patch is optional.  If it is not
applied, the code will still work.
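
As a rough illustration of the operand text this produces, here is a minimal
standalone sketch.  It is not the code in rs6000.c: format_pcrel_operand and
its is_local flag are hypothetical stand-ins for print_operand_address and
SYMBOL_REF_LOCAL_P, and it assumes the symbol name is printed before the
suffix.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write SYMBOL followed by the PC-relative suffix into
   BUF.  IS_LOCAL plays the role of SYMBOL_REF_LOCAL_P; non-local symbols get
   "@got" first, as in the patch.  Returns what snprintf returns.  */
static int
format_pcrel_operand (char *buf, size_t size, const char *symbol, int is_local)
{
  assert (buf != NULL && symbol != NULL);

  /* The explicit "(0),1" spells out "no base register, PC-relative", so an
     operand that also tries to name a base register is malformed.  */
  return snprintf (buf, size, "%s%s@pcrel(0),1",
		   symbol, is_local ? "" : "@got");
}
```

With this sketch a local symbol "a" would print as a@pcrel(0),1 and an
external one as a@got@pcrel(0),1.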

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (print_operand_address): Add (0),1 to
	@pcrel to catch errant usage.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 277029)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -13272,7 +13272,10 @@ print_operand_address (FILE *file, rtx x
       if (SYMBOL_REF_P (x) && !SYMBOL_REF_LOCAL_P (x))
 	fprintf (file, "@got");
 
-      fprintf (file, "@pcrel");
+      /* Specifically add (0),1 to catch uses where a @pcrel was added to an
+	 address with a base register, since the hardware does not support
+	 adding a base register to a PC-relative address.  */
+      fprintf (file, "@pcrel(0),1");
     }
   else if (SYMBOL_REF_P (x) || GET_CODE (x) == CONST
 	   || GET_CODE (x) == LABEL_REF)

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V6, #16 of 17: Wrong subject, should have been update @pcrel
  2019-10-16 14:52 ` [PATCH] V6, #16 of 17: New test for stack protection Michael Meissner
@ 2019-10-16 14:54   ` Michael Meissner
  0 siblings, 0 replies; 30+ messages in thread
From: Michael Meissner @ 2019-10-16 14:54 UTC (permalink / raw)
  To: Michael Meissner; +Cc: gcc-patches, segher, dje.gcc

Note, patch #16 had the wrong subject line.  The subject should have said that
the patch modifies @pcrel to use an explicit (0),1.  Sorry about that.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V6, #17 of 17: Add stack protection test
  2019-10-16 12:51 PowerPC future machine patches, version 6 Michael Meissner
                   ` (15 preceding siblings ...)
  2019-10-16 14:52 ` [PATCH] V6, #16 of 17: New test for stack protection Michael Meissner
@ 2019-10-16 14:58 ` Michael Meissner
  16 siblings, 0 replies; 30+ messages in thread
From: Michael Meissner @ 2019-10-16 14:58 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, segher, dje.gcc

This is a new test for the stack protection code that was added in V6 patch #4.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* gcc/testsuite/gcc.target/powerpc/prefix-stack-protect.c: New
	test to make sure -fstack-protector-strong works with prefixed
	addressing.

Index: gcc/testsuite/gcc.target/powerpc/prefix-stack-protect.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-stack-protect.c	(revision 277050)
+++ gcc/testsuite/gcc.target/powerpc/prefix-stack-protect.c	(working copy)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future -fstack-protector-strong" } */
+
+/* Test that we can handle large stack frames with -fstack-protector-strong and
+   prefixed addressing.  */
+
+extern long foo (char *);
+
+long
+bar (void)
+{
+  char buffer[0x20000];
+  return foo (buffer) + 1;
+}
+
+/* { dg-final { scan-assembler {\mpld\M}  } } */
+/* { dg-final { scan-assembler {\mpstd\M} } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* Re: [PATCH] V6, #1 of 17: Use ADJUST_INSN_LENGTH for prefixed instructions
  2019-10-16 13:39 ` [PATCH] V6, #1 of 17: Use ADJUST_INSN_LENGTH for prefixed instructions Michael Meissner
@ 2019-10-22 22:38   ` Segher Boessenkool
  2019-10-23 21:18     ` Michael Meissner
  0 siblings, 1 reply; 30+ messages in thread
From: Segher Boessenkool @ 2019-10-22 22:38 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

Hi!

On Wed, Oct 16, 2019 at 09:35:33AM -0400, Michael Meissner wrote:
> This patch uses the target hook ADJUST_INSN_LENGTH to change the length of
> instructions that contain prefixed memory/add instructions.

That made this amazingly hard to review.  But it might well be worth it,
thankfully :-)

> There are 2 new insn attributes:
> 
> 1) num_insns: If non-zero, returns the number of machine instructions in an
> insn.  This simplifies the calculations in rs6000_insn_cost.

This is great.

> 2) max_prefixed_insns: Returns the maximum number of prefixed instructions in
> an insn.  Normally this is 1, but in the insns that load up 128-bit values into
> GPRs, it will be 2.

This one, I am not so sure.

> -  int n = get_attr_length (insn) / 4;
> +  /* If the insn tells us how many insns there are, use that.  Otherwise use
> +     the length/4.  Adjust the insn length to remove the extra size that
> +     prefixed instructions take.  */

This should be temporary, until we have converted everything to use
num_insns, right?

> +  int n = get_attr_num_insns (insn);
> +  if (n == 0)
> +    {
> +      int length = get_attr_length (insn);
> +      if (get_attr_prefixed (insn) == PREFIXED_YES)
> +	{
> +	  int adjust = 0;
> +	  ADJUST_INSN_LENGTH (insn, adjust);
> +	  length -= adjust;
> +	}
> +
> +      n = length / 4;
> +    }

> --- gcc/config/rs6000/rs6000.h	(revision 277017)
> +++ gcc/config/rs6000/rs6000.h	(working copy)
> @@ -1847,9 +1847,30 @@ extern scalar_int_mode rs6000_pmode;
>  /* Adjust the length of an INSN.  LENGTH is the currently-computed length and
>     should be adjusted to reflect any required changes.  This macro is used when
>     there is some systematic length adjustment required that would be difficult
> -   to express in the length attribute.  */
> +   to express in the length attribute.
>  
> -/* #define ADJUST_INSN_LENGTH(X,LENGTH) */
> +   In the PowerPC, we use this to adjust the length of an instruction if one or
> +   more prefixed instructions are generated, using the attribute
> +   num_prefixed_insns.  A prefixed instruction is 8 bytes instead of 4, but the
> +   hardware requires that a prefied instruciton not cross a 64-byte boundary.

"prefixed instruction does not"

> +   This means the compiler has to assume the length of the first prefixed
> +   instruction is 12 bytes instead of 8 bytes.  Since the length is already set
> +   for the non-prefixed instruction, we just need to update for the
> +   difference.  */
> +
> +#define ADJUST_INSN_LENGTH(INSN,LENGTH)					\
> +{									\
> +  if (NONJUMP_INSN_P (INSN))						\
> +    {									\
> +      rtx pattern = PATTERN (INSN);					\
> +      if (GET_CODE (pattern) != USE && GET_CODE (pattern) != CLOBBER	\
> +	  && get_attr_prefixed (INSN) == PREFIXED_YES)			\
> +	{								\
> +	  int num_prefixed = get_attr_max_prefixed_insns (INSN);	\
> +	  (LENGTH) += 4 * (num_prefixed + 1);				\
> +	}								\
> +    }									\
> +}

Please use a function, not a function-like macro.

So this computes the *maximum* RTL instruction length, not considering how
many of the machine insns in it need a prefix insn.  Can't we do better?
Hrm, I guess in all cases that matter we will split early anyway.


> +;; Return the number of real hardware instructions in a combined insn.  If it
> +;; is 0, just use the length / 4.
> +(define_attr "num_insns" "" (const_int 0))

So we could have the default value *be* length/4, not 0?

> +;; If an insn is prefixed, return the maximum number of prefixed instructions
> +;; in the insn.  The macro ADJUST_INSN_LENGTH uses this number to adjust the
> +;; insn length.
> +(define_attr "max_prefixed_insns" "" (const_int 1))

"maximum number of prefixed machine instructions in the RTL instruction".

> +;; Length of the instruction (in bytes).  This length does not consider the
> +;; length for prefixed instructions.  The macro ADJUST_INSN_LENGTH will adjust
> +;; the length if there are prefixed instructions.

That is not what it does...  it uses the maximum number of prefixed insns
there could be.

> +;; While it might be tempting to use num_insns to calculate the length, it can
> +;; be problematical unless all insn lengths are adjusted to use num_insns
> +;; (i.e. if num_insns is 0, it will get the length, which in turn will get
> +;; num_insns and recurse).
> +(define_attr "length" "" (const_int 4))

Yes, and not only is it tempting, it is what we are going to do!  Right?
:-)  Just not just yet.

So please use a function for ADJUST_INSN_LENGTH.  Okay for trunk like that.
Thanks!  And sorry this took so long.


Segher

* Re: [PATCH] V6, #2 of 17: Minor code reformat
  2019-10-16 13:42 ` [PATCH] V6, #2 of 17: Minor code reformat Michael Meissner
@ 2019-10-22 22:39   ` Segher Boessenkool
  0 siblings, 0 replies; 30+ messages in thread
From: Segher Boessenkool @ 2019-10-22 22:39 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

Hi!

On Wed, Oct 16, 2019 at 09:38:54AM -0400, Michael Meissner wrote:
> This patch tweaks the code formatting that I noticed in making the previous
> patch for some of the 128-bit mode move instructions.  Originally this was part
> of V5 patch #2, but it has been moved to a separate patch.

> @@ -7792,7 +7795,10 @@ (define_insn_and_split "*movtd_64bit_nod
>    "#"
>    "&& reload_completed"
>    [(pc)]
> -{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
> +{
> +  rs6000_split_multireg_move (operands[0], operands[1]);
> +  DONE;}
> +
>    [(set_attr "length" "8,8,8,12,12,8")
>     (set_attr "max_prefixed_insns" "2")
>     (set_attr "num_insns" "2,2,2,3,3,2")])

Newline before the } here, and no empty line after.  Like all the rest ;-)

Okay with that fixed.  Thanks!


Segher

* Re: [PATCH] V6, #3 of 17: Update lwa_operand for prefixed PLWA
  2019-10-16 13:47 ` [PATCH] V6, #3 of 17: Update lwa_operand for prefixed PLWA Michael Meissner
@ 2019-10-22 23:33   ` Segher Boessenkool
  0 siblings, 0 replies; 30+ messages in thread
From: Segher Boessenkool @ 2019-10-22 23:33 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

On Wed, Oct 16, 2019 at 09:42:16AM -0400, Michael Meissner wrote:
> 2019-10-15  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/predicates.md (lwa_operand): If the bottom two
> 	bits of the offset for the memory address are non-zero, use PLWA
> 	if prefixed instructions are available.

Okay for trunk.  Thanks!


Segher

* Re: [PATCH] V6, #1 of 17: Use ADJUST_INSN_LENGTH for prefixed instructions
  2019-10-22 22:38   ` Segher Boessenkool
@ 2019-10-23 21:18     ` Michael Meissner
  2019-10-23 21:45       ` Segher Boessenkool
  0 siblings, 1 reply; 30+ messages in thread
From: Michael Meissner @ 2019-10-23 21:18 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Tue, Oct 22, 2019 at 05:27:19PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Oct 16, 2019 at 09:35:33AM -0400, Michael Meissner wrote:
> > This patch uses the target hook ADJUST_INSN_LENGTH to change the length of
> > instructions that contain prefixed memory/add instructions.
> 
> That made this amazingly hard to review.  But it might well be worth it,
> thankfully :-)
> 
> > There are 2 new insn attributes:
> > 
> > 1) num_insns: If non-zero, returns the number of machine instructions in an
> > insn.  This simplifies the calculations in rs6000_insn_cost.
> 
> This is great.
> 
> > 2) max_prefixed_insns: Returns the maximum number of prefixed instructions in
> > an insn.  Normally this is 1, but in the insns that load up 128-bit values into
> > GPRs, it will be 2.
> 
> This one, I am not so sure.

I wanted it to be simple, so in general it was just a constant.  Since the only
user of it has already checked that the insn is prefixed, I didn't think it
needed the prefixed test to set it to 0.

> > -  int n = get_attr_length (insn) / 4;
> > +  /* If the insn tells us how many insns there are, use that.  Otherwise use
> > +     the length/4.  Adjust the insn length to remove the extra size that
> > +     prefixed instructions take.  */
> 
> This should be temporary, until we have converted everything to use
> num_insns, right?

Well, there were some 200+ places where length was set.

> > --- gcc/config/rs6000/rs6000.h	(revision 277017)
> > +++ gcc/config/rs6000/rs6000.h	(working copy)
> > @@ -1847,9 +1847,30 @@ extern scalar_int_mode rs6000_pmode;
> >  /* Adjust the length of an INSN.  LENGTH is the currently-computed length and
> >     should be adjusted to reflect any required changes.  This macro is used when
> >     there is some systematic length adjustment required that would be difficult
> > -   to express in the length attribute.  */
> > +   to express in the length attribute.
> >  
> > -/* #define ADJUST_INSN_LENGTH(X,LENGTH) */
> > +   In the PowerPC, we use this to adjust the length of an instruction if one or
> > +   more prefixed instructions are generated, using the attribute
> > +   num_prefixed_insns.  A prefixed instruction is 8 bytes instead of 4, but the
> > +   hardware requires that a prefied instruciton not cross a 64-byte boundary.
> 
> "prefixed instruction does not"

Thanks.

> > +   This means the compiler has to assume the length of the first prefixed
> > +   instruction is 12 bytes instead of 8 bytes.  Since the length is already set
> > +   for the non-prefixed instruction, we just need to update for the
> > +   difference.  */
> > +
> > +#define ADJUST_INSN_LENGTH(INSN,LENGTH)					\
> > +{									\
> > +  if (NONJUMP_INSN_P (INSN))						\
> > +    {									\
> > +      rtx pattern = PATTERN (INSN);					\
> > +      if (GET_CODE (pattern) != USE && GET_CODE (pattern) != CLOBBER	\
> > +	  && get_attr_prefixed (INSN) == PREFIXED_YES)			\
> > +	{								\
> > +	  int num_prefixed = get_attr_max_prefixed_insns (INSN);	\
> > +	  (LENGTH) += 4 * (num_prefixed + 1);				\
> > +	}								\
> > +    }									\
> > +}
> 
> Please use a function, not a function-like macro.

Ok, I added rs6000_adjust_insn_length in rs6000.c.
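
For readers following along, a minimal standalone sketch of the arithmetic in
function form.  This is not the function that was checked in: struct insn_info
and adjust_insn_length are hypothetical, and the two fields only stand in for
the prefixed and max_prefixed_insns attributes used by the quoted macro.

```c
#include <assert.h>

/* Hypothetical stand-in for the insn attributes; not GCC's real types.  */
struct insn_info
{
  int is_prefixed;		/* Like get_attr_prefixed == PREFIXED_YES.  */
  int max_prefixed_insns;	/* Like get_attr_max_prefixed_insns.  */
};

/* Return LENGTH (in bytes) plus the adjustment made by the quoted macro:
   4 extra bytes per possible prefix, plus 4 bytes for one alignment nop.  */
static int
adjust_insn_length (const struct insn_info *insn, int length)
{
  assert (length >= 0 && length % 4 == 0);

  if (insn->is_prefixed)
    length += 4 * (insn->max_prefixed_insns + 1);

  return length;
}
```

Under these assumptions a single prefixed load with a base length of 4 is
counted as 12 bytes, and a two-instruction 128-bit GPR load with a base length
of 8 is counted as 20 bytes.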

> So this computes the *maximum* RTL instruction length, not considering how
> many of the machine insns in it need a prefix insn.  Can't we do better?
> Hrm, I guess in all cases that matter we will split early anyway.

Well, before register allocation for the 128-bit types, you really can't say
what the precise length is, even if it is not prefixed.

And of course even after register allocation, it isn't precise, since the
length of a prefixed instruction is normally 8, but sometimes 12.  So we have
to use 12.
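
To make the 8 versus 12 concrete, here is a small sketch.  It assumes that the
padding is a single 4-byte nop placed in front of a prefixed instruction that
would otherwise straddle a 64-byte boundary; prefixed_insn_bytes is a
hypothetical helper, not anything in the compiler, which cannot know the final
address and so has to assume the worst case.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes taken by one prefixed instruction that would start at byte address
   ADDR, assuming a 4-byte nop is inserted first whenever the 8-byte
   instruction would cross a 64-byte boundary.  */
static unsigned
prefixed_insn_bytes (uint64_t addr)
{
  assert (addr % 4 == 0);	/* Instructions are word aligned.  */

  /* With word alignment, only a start at offset 60 within a 64-byte block
     puts the two words on opposite sides of the boundary.  */
  return (addr % 64 == 60) ? 12 : 8;
}
```

Only one start offset out of sixteen needs the padding, but the length
attribute has to be an upper bound, hence 12.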

> 
> > +;; Return the number of real hardware instructions in a combined insn.  If it
> > +;; is 0, just use the length / 4.
> > +(define_attr "num_insns" "" (const_int 0))
> 
> So we could have the default value *be* length/4, not 0?

Only if you make sure that every place sets num_insns.  As the comment says,
until it is set everywhere, you run the risk of a deadly embrace.
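
A minimal sketch of why the 0 default is safe, using a hypothetical struct
rather than the generated get_attr_* functions, and ignoring the prefixed
length adjustment.  The fallback reads length, and length never reads
num_insns, so there is no cycle.  If the default for num_insns were computed
from length while some length were computed from num_insns, each lookup would
call the other.

```c
#include <assert.h>

/* Hypothetical per-insn attributes; 0 in num_insns means "not set".  */
struct insn_attrs
{
  int length;		/* Bytes; the .md default is 4.  */
  int num_insns;	/* Machine instructions, or 0 if unset.  */
};

/* Mirror of the fallback quoted from rs6000_insn_cost: use num_insns when
   the pattern sets it, otherwise derive the count from the length.  */
static int
count_machine_insns (const struct insn_attrs *a)
{
  assert (a->length > 0);
  return a->num_insns != 0 ? a->num_insns : a->length / 4;
}
```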

> > +;; If an insn is prefixed, return the maximum number of prefixed instructions
> > +;; in the insn.  The macro ADJUST_INSN_LENGTH uses this number to adjust the
> > +;; insn length.
> > +(define_attr "max_prefixed_insns" "" (const_int 1))
> 
> "maximum number of prefixed machine instructions in the RTL instruction".
> 
> So please use a function for ADJUST_INSN_LENGTH.  Okay for trunk like that.
> Thanks!  And sorry this took so long.

Checked in, thanks.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* Re: [PATCH] V6, #1 of 17: Use ADJUST_INSN_LENGTH for prefixed instructions
  2019-10-23 21:18     ` Michael Meissner
@ 2019-10-23 21:45       ` Segher Boessenkool
  2019-10-31 23:03         ` "length" troubles (was: Re: [PATCH] V6, #1 of 17: Use ADJUST_INSN_LENGTH for prefixed instructions) Segher Boessenkool
  0 siblings, 1 reply; 30+ messages in thread
From: Segher Boessenkool @ 2019-10-23 21:45 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

On Wed, Oct 23, 2019 at 05:00:58PM -0400, Michael Meissner wrote:
> On Tue, Oct 22, 2019 at 05:27:19PM -0500, Segher Boessenkool wrote:
> > On Wed, Oct 16, 2019 at 09:35:33AM -0400, Michael Meissner wrote:
> > > -  int n = get_attr_length (insn) / 4;
> > > +  /* If the insn tells us how many insns there are, use that.  Otherwise use
> > > +     the length/4.  Adjust the insn length to remove the extra size that
> > > +     prefixed instructions take.  */
> > 
> > This should be temporary, until we have converted everything to use
> > num_insns, right?
> 
> Well there were some 200+ places where length was set.

Yes, and I did volunteer to do this work, if needed / wanted.

> > Please use a function, not a function-like macro.
> 
> Ok, I added rs6000_adjust_insn_length in rs6000.c.

Thanks.

> > > +;; Return the number of real hardware instructions in a combined insn.  If it
> > > +;; is 0, just use the length / 4.
> > > +(define_attr "num_insns" "" (const_int 0))
> > 
> > So we could have the default value *be* length/4, not 0?
> 
> Only if you make sure that every place sets num_insns.  As the comment says,
> until it is set everywhere, you run the risk of a deadly embrace.

Sure :-)


Segher


* "length" troubles (was: Re: [PATCH] V6, #1 of 17: Use ADJUST_INSN_LENGTH for prefixed instructions)
  2019-10-23 21:45       ` Segher Boessenkool
@ 2019-10-31 23:03         ` Segher Boessenkool
  0 siblings, 0 replies; 30+ messages in thread
From: Segher Boessenkool @ 2019-10-31 23:03 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

On Wed, Oct 23, 2019 at 04:42:19PM -0500, Segher Boessenkool wrote:
> On Wed, Oct 23, 2019 at 05:00:58PM -0400, Michael Meissner wrote:
> > On Tue, Oct 22, 2019 at 05:27:19PM -0500, Segher Boessenkool wrote:
> > > On Wed, Oct 16, 2019 at 09:35:33AM -0400, Michael Meissner wrote:
> > > > -  int n = get_attr_length (insn) / 4;
> > > > +  /* If the insn tells us how many insns there are, use that.  Otherwise use
> > > > +     the length/4.  Adjust the insn length to remove the extra size that
> > > > +     prefixed instructions take.  */
> > > 
> > > This should be temporary, until we have converted everything to use
> > > num_insns, right?
> > 
> > Well there were some 200+ places where length was set.
> 
> Yes, and I did volunteer to do this work, if needed / wanted.

So I did this, but it does not work: I end up with out-of-range branches.
I rewrote it in a different way: same thing.  And again, and again.

The fundamental problem is that the "length" attribute is special.  It
actually is four attributes:

insn_current_length     (that's the one you expect)
insn_variable_length_p  (true if the length of this insn depends on
                         the (relative) addresses of any (other) insns)
insn_min_length         (minimum length: minimum of the alternatives)
insn_default_length     (which really should be called insn_max_length)

which are used in the shorten pass (which actually makes branches *longer*
in the normal case).  The problem is that these are only computed for static
values that are set in the "length" attribute in the machine description,
while we actually want to set "length" based on some other attributes
("num_insns", "prefixed", "max_prefixed_insns"), either directly or via
adjust_insn_length.

The shorten pass (which lengthens branches, unless -O0) does not notice
most insns that can become longer, and eventually we ICE because of an
out-of-range branch.
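
The failure mode can be shown with a toy model in C.  This is not GCC's
shorten_branches code; the struct, the function names and the reach limit
are all hypothetical.  It only shows how a branch that looks in range with
the assumed lengths ends up out of range once some insns turn out longer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only.  Each insn has the length the pass assumed and the
   length it actually ends up with.  */
struct toy_insn
{
  int assumed_length;
  int actual_length;
};

/* Bytes from insn FROM up to (not including) insn TO, summing either the
   assumed or the actual lengths.  */
static long
forward_distance (const struct toy_insn *insns, size_t from, size_t to,
                  bool use_actual)
{
  assert (from <= to);
  long bytes = 0;
  for (size_t i = from; i < to; i++)
    bytes += use_actual ? insns[i].actual_length : insns[i].assumed_length;
  return bytes;
}

/* True if a forward branch at FROM was judged in range using the assumed
   lengths but is out of range with the actual ones.  */
static bool
branch_breaks (const struct toy_insn *insns, size_t from, size_t to,
               long max_reach)
{
  return forward_distance (insns, from, to, false) <= max_reach
         && forward_distance (insns, from, to, true) > max_reach;
}
```

With three insns assumed to be 4 bytes each, two of which really take 12,
a branch with a reach of 16 bytes passes the check on assumed lengths
(12 bytes) and fails on the actual ones (28 bytes).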

Hrm, maybe if I force insn_variable_length_p to "true".  Let me try that.


Segher


* Re: [PATCH] V6, #4 of 17: Add prefixed instruction support to stack protect insns
  2019-10-16 13:49 ` [PATCH] V6, #4 of 17: Add prefixed instruction support to stack protect insns Michael Meissner
@ 2019-11-02  3:22   ` Segher Boessenkool
  2019-11-10  6:39     ` Michael Meissner
  2019-11-11 23:16     ` Michael Meissner
  0 siblings, 2 replies; 30+ messages in thread
From: Segher Boessenkool @ 2019-11-02  3:22 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

Hi!

On Wed, Oct 16, 2019 at 09:47:41AM -0400, Michael Meissner wrote:
> This patch fixes the stack protection insns to support stacks larger than
> 16-bits on the 'future' system using prefixed loads and stores.

> +;; We can't use the prefixed attribute here because there are two memory
> +;; instructions.  We can't split the insn due to the fact that this operation
> +;; needs to be done in one piece.
>  (define_insn "stack_protect_setdi"
>    [(set (match_operand:DI 0 "memory_operand" "=Y")
>  	(unspec:DI [(match_operand:DI 1 "memory_operand" "Y")] UNSPEC_SP_SET))
>     (set (match_scratch:DI 2 "=&r") (const_int 0))]
>    "TARGET_64BIT"
> -  "ld%U1%X1 %2,%1\;std%U0%X0 %2,%0\;li %2,0"
> +{
> +  if (prefixed_memory (operands[1], DImode))
> +    output_asm_insn ("pld %2,%1", operands);
> +  else
> +    output_asm_insn ("ld%U1%X1 %2,%1", operands);
> +
> +  if (prefixed_memory (operands[0], DImode))
> +    output_asm_insn ("pstd %2,%0", operands);
> +  else
> +    output_asm_insn ("std%U0%X0 %2,%0", operands);

We could make %pN mean 'p' for prefixed, for memory as operands[N]?  Are
there more places than this that could use that?  How about inline asm?

> +   (set (attr "length")
> +	(cond [(and (match_operand 0 "prefixed_memory")
> +		    (match_operand 1 "prefixed_memory"))
> +	       (const_string "24")
> +
> +	       (ior (match_operand 0 "prefixed_memory")
> +		    (match_operand 1 "prefixed_memory"))
> +	       (const_string "20")]
> +
> +	      (const_string "12")))])

You can use const_int instead of const_string here, I think?  Please do
that if it works.

Quite a simple expression, phew :-)

> +  if (which_alternative == 0)
> +    output_asm_insn ("xor. %3,%3,%4", operands);
> +  else
> +    output_asm_insn ("cmpld %0,%3,%4\;li %3,0", operands);

That doesn't work: the backslash is treated like the escape character, in
a C block.  I think doubling it will work?  Check the generated insn-output.c,
it should be translated to \t\n in there.

Okay for trunk with those things taken care of.  Thanks!


Segher


* Re: [PATCH] V6, #4 of 17: Add prefixed instruction support to stack protect insns
  2019-11-02  3:22   ` Segher Boessenkool
@ 2019-11-10  6:39     ` Michael Meissner
  2019-11-10  7:01       ` Segher Boessenkool
  2019-11-11 23:16     ` Michael Meissner
  1 sibling, 1 reply; 30+ messages in thread
From: Michael Meissner @ 2019-11-10  6:39 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Fri, Nov 01, 2019 at 10:22:03PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Oct 16, 2019 at 09:47:41AM -0400, Michael Meissner wrote:
> > This patch fixes the stack protection insns to support stacks larger than
> > 16-bits on the 'future' system using prefixed loads and stores.
> 
> > +;; We can't use the prefixed attribute here because there are two memory
> > +;; instructions.  We can't split the insn due to the fact that this operation
> > +;; needs to be done in one piece.
> >  (define_insn "stack_protect_setdi"
> >    [(set (match_operand:DI 0 "memory_operand" "=Y")
> >  	(unspec:DI [(match_operand:DI 1 "memory_operand" "Y")] UNSPEC_SP_SET))
> >     (set (match_scratch:DI 2 "=&r") (const_int 0))]
> >    "TARGET_64BIT"
> > -  "ld%U1%X1 %2,%1\;std%U0%X0 %2,%0\;li %2,0"
> > +{
> > +  if (prefixed_memory (operands[1], DImode))
> > +    output_asm_insn ("pld %2,%1", operands);
> > +  else
> > +    output_asm_insn ("ld%U1%X1 %2,%1", operands);
> > +
> > +  if (prefixed_memory (operands[0], DImode))
> > +    output_asm_insn ("pstd %2,%0", operands);
> > +  else
> > +    output_asm_insn ("std%U0%X0 %2,%0", operands);
> 
> We could make %pN mean 'p' for prefixed, for memory as operands[N]?  Are
> there more places than this that could use that?  How about inline asm?

Right now, the only two places that do this are the two stack protect insns.
Everything else that I'm aware of that generates multiple loads or stores will
do a split before final.

> > +   (set (attr "length")
> > +	(cond [(and (match_operand 0 "prefixed_memory")
> > +		    (match_operand 1 "prefixed_memory"))
> > +	       (const_string "24")
> > +
> > +	       (ior (match_operand 0 "prefixed_memory")
> > +		    (match_operand 1 "prefixed_memory"))
> > +	       (const_string "20")]
> > +
> > +	      (const_string "12")))])
> 
> You can use const_int instead of const_string here, I think?  Please do
> that if it works.

I'll try it out on Monday.

> Quite a simple expression, phew :-)
> 
> > +  if (which_alternative == 0)
> > +    output_asm_insn ("xor. %3,%3,%4", operands);
> > +  else
> > +    output_asm_insn ("cmpld %0,%3,%4\;li %3,0", operands);
> 
> That doesn't work: the backslash is treated like the escape character, in
> a C block.  I think doubling it will work?  Check the generated insn-output.c,
> it should be translated to \t\n in there.

Yes it does work.  I just checked.

> Okay for trunk with those things taken care of.  Thanks!

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* Re: [PATCH] V6, #4 of 17: Add prefixed instruction support to stack protect insns
  2019-11-10  6:39     ` Michael Meissner
@ 2019-11-10  7:01       ` Segher Boessenkool
  2019-11-10  9:43         ` Segher Boessenkool
  0 siblings, 1 reply; 30+ messages in thread
From: Segher Boessenkool @ 2019-11-10  7:01 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

On Sun, Nov 10, 2019 at 01:32:29AM -0500, Michael Meissner wrote:
> On Fri, Nov 01, 2019 at 10:22:03PM -0500, Segher Boessenkool wrote:
> > On Wed, Oct 16, 2019 at 09:47:41AM -0400, Michael Meissner wrote:
> > We could make %pN mean 'p' for prefixed, for memory as operands[N]?  Are
> > there more places than this that could use that?  How about inline asm?
> 
> Right now, the only two places that do this are the two stack protect insns.
> Everything else that I'm aware of that generates multiple loads or stores will
> do a split before final.

How about inline asm?

> > > +  if (which_alternative == 0)
> > > +    output_asm_insn ("xor. %3,%3,%4", operands);
> > > +  else
> > > +    output_asm_insn ("cmpld %0,%3,%4\;li %3,0", operands);
> > 
> > That doesn't work: the backslash is treated like the escape character, in
> > a C block.  I think doubling it will work?  Check the generated insn-output.c,
> > it should be translated to \t\n in there.
> 
> Yes it does work.  I just checked.

It emits a literal ';' into the assembler code.


Segher


* Re: [PATCH] V6, #4 of 17: Add prefixed instruction support to stack protect insns
  2019-11-10  7:01       ` Segher Boessenkool
@ 2019-11-10  9:43         ` Segher Boessenkool
  0 siblings, 0 replies; 30+ messages in thread
From: Segher Boessenkool @ 2019-11-10  9:43 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

On Sun, Nov 10, 2019 at 12:38:56AM -0600, Segher Boessenkool wrote:
> It emits a literal ';' into the assembler code.

Or actually, huh, it doesn't.  Sorry.  See read_braced_string in
read-md.c.  Your code is fine.


Segher


* Re: [PATCH] V6, #4 of 17: Add prefixed instruction support to stack protect insns
  2019-11-02  3:22   ` Segher Boessenkool
  2019-11-10  6:39     ` Michael Meissner
@ 2019-11-11 23:16     ` Michael Meissner
  1 sibling, 0 replies; 30+ messages in thread
From: Michael Meissner @ 2019-11-11 23:16 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, dje.gcc

On Fri, Nov 01, 2019 at 10:22:03PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Oct 16, 2019 at 09:47:41AM -0400, Michael Meissner wrote:
> > This patch fixes the stack protection insns to support stacks larger than
> > 16-bits on the 'future' system using prefixed loads and stores.
> 
> > +;; We can't use the prefixed attribute here because there are two memory
> > +;; instructions.  We can't split the insn due to the fact that this operation
> > +;; needs to be done in one piece.
> >  (define_insn "stack_protect_setdi"
> >    [(set (match_operand:DI 0 "memory_operand" "=Y")
> >  	(unspec:DI [(match_operand:DI 1 "memory_operand" "Y")] UNSPEC_SP_SET))
> >     (set (match_scratch:DI 2 "=&r") (const_int 0))]
> >    "TARGET_64BIT"
> > -  "ld%U1%X1 %2,%1\;std%U0%X0 %2,%0\;li %2,0"
> > +{
> > +  if (prefixed_memory (operands[1], DImode))
> > +    output_asm_insn ("pld %2,%1", operands);
> > +  else
> > +    output_asm_insn ("ld%U1%X1 %2,%1", operands);
> > +
> > +  if (prefixed_memory (operands[0], DImode))
> > +    output_asm_insn ("pstd %2,%0", operands);
> > +  else
> > +    output_asm_insn ("std%U0%X0 %2,%0", operands);
> 
> We could make %pN mean 'p' for prefixed, for memory as operands[N]?  Are
> there more places than this that could use that?  How about inline asm?

At the moment, I did not add this.  We can revisit it later.

> > +   (set (attr "length")
> > +	(cond [(and (match_operand 0 "prefixed_memory")
> > +		    (match_operand 1 "prefixed_memory"))
> > +	       (const_string "24")
> > +
> > +	       (ior (match_operand 0 "prefixed_memory")
> > +		    (match_operand 1 "prefixed_memory"))
> > +	       (const_string "20")]
> > +
> > +	      (const_string "12")))])
> 
> You can use const_int instead of const_string here, I think?  Please do
> that if it works.
> 
> Quite a simple expression, phew :-)

Const_int works.

> > +  if (which_alternative == 0)
> > +    output_asm_insn ("xor. %3,%3,%4", operands);
> > +  else
> > +    output_asm_insn ("cmpld %0,%3,%4\;li %3,0", operands);
> 
> That doesn't work: the backslash is treated like the escape character, in
> a C block.  I think doubling it will work?  Check the generated insn-output.c,
> it should be translated to \t\n in there.
> 
> Okay for trunk with those things taken care of.  Thanks!

As we discussed, this does work.

Here is the patch as committed.  I bootstrapped and ran make check; there were
no regressions.

2019-11-11  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (prefixed_memory): New predicate.
	* config/rs6000/rs6000.md (stack_protect_setdi): Deal with either
	address being a prefixed load/store.
	(stack_protect_testdi): Deal with either address being a prefixed
	load.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 278062)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -1828,3 +1828,10 @@ (define_predicate "pcrel_external_addres
 (define_predicate "pcrel_local_or_external_address"
   (ior (match_operand 0 "pcrel_local_address")
        (match_operand 0 "pcrel_external_address")))
+
+;; Return true if the operand is a memory address that uses a prefixed address.
+(define_predicate "prefixed_memory"
+  (match_code "mem")
+{
+  return address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
+})
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 278062)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -11536,14 +11536,44 @@ (define_insn "stack_protect_setsi"
   [(set_attr "type" "three")
    (set_attr "length" "12")])
 
+;; We can't use the prefixed attribute here because there are two memory
+;; instructions.  We can't split the insn due to the fact that this operation
+;; needs to be done in one piece.
 (define_insn "stack_protect_setdi"
   [(set (match_operand:DI 0 "memory_operand" "=Y")
 	(unspec:DI [(match_operand:DI 1 "memory_operand" "Y")] UNSPEC_SP_SET))
    (set (match_scratch:DI 2 "=&r") (const_int 0))]
   "TARGET_64BIT"
-  "ld%U1%X1 %2,%1\;std%U0%X0 %2,%0\;li %2,0"
+{
+  if (prefixed_memory (operands[1], DImode))
+    output_asm_insn ("pld %2,%1", operands);
+  else
+    output_asm_insn ("ld%U1%X1 %2,%1", operands);
+
+  if (prefixed_memory (operands[0], DImode))
+    output_asm_insn ("pstd %2,%0", operands);
+  else
+    output_asm_insn ("std%U0%X0 %2,%0", operands);
+
+  return "li %2,0";
+}
   [(set_attr "type" "three")
-   (set_attr "length" "12")])
+
+  ;; Back to back prefixed memory instructions take 20 bytes (8 bytes for each
+  ;; prefixed instruction + 4 bytes for the possible NOP).  Add in 4 bytes for
+  ;; the LI 0 at the end.
+   (set_attr "prefixed" "no")
+   (set_attr "num_insns" "3")
+   (set (attr "length")
+	(cond [(and (match_operand 0 "prefixed_memory")
+		    (match_operand 1 "prefixed_memory"))
+	       (const_int 24)
+
+	       (ior (match_operand 0 "prefixed_memory")
+		    (match_operand 1 "prefixed_memory"))
+	       (const_int 20)]
+
+	      (const_int 12)))])
 
 (define_expand "stack_protect_test"
   [(match_operand 0 "memory_operand")
@@ -11582,6 +11612,9 @@ (define_insn "stack_protect_testsi"
    lwz%U1%X1 %3,%1\;lwz%U2%X2 %4,%2\;cmplw %0,%3,%4\;li %3,0\;li %4,0"
   [(set_attr "length" "16,20")])
 
+;; We can't use the prefixed attribute here because there are two memory
+;; instructions.  We can't split the insn due to the fact that this operation
+;; needs to be done in one piece.
 (define_insn "stack_protect_testdi"
   [(set (match_operand:CCEQ 0 "cc_reg_operand" "=x,?y")
         (unspec:CCEQ [(match_operand:DI 1 "memory_operand" "Y,Y")
@@ -11590,10 +11623,45 @@ (define_insn "stack_protect_testdi"
    (set (match_scratch:DI 4 "=r,r") (const_int 0))
    (clobber (match_scratch:DI 3 "=&r,&r"))]
   "TARGET_64BIT"
-  "@
-   ld%U1%X1 %3,%1\;ld%U2%X2 %4,%2\;xor. %3,%3,%4\;li %4,0
-   ld%U1%X1 %3,%1\;ld%U2%X2 %4,%2\;cmpld %0,%3,%4\;li %3,0\;li %4,0"
-  [(set_attr "length" "16,20")])
+{
+  if (prefixed_memory (operands[1], DImode))
+    output_asm_insn ("pld %3,%1", operands);
+  else
+    output_asm_insn ("ld%U1%X1 %3,%1", operands);
+
+  if (prefixed_memory (operands[2], DImode))
+    output_asm_insn ("pld %4,%2", operands);
+  else
+    output_asm_insn ("ld%U2%X2 %4,%2", operands);
+
+  if (which_alternative == 0)
+    output_asm_insn ("xor. %3,%3,%4", operands);
+  else
+    output_asm_insn ("cmpld %0,%3,%4\;li %3,0", operands);
+
+  return "li %4,0";
+}
+  ;; Back to back prefixed memory instructions take 20 bytes (8 bytes for each
+  ;; prefixed instruction + 4 bytes for the possible NOP).  Add in either 4 or
+  ;; 8 bytes to do the test.
+  [(set_attr "prefixed" "no")
+   (set_attr "num_insns" "4,5")
+   (set (attr "length")
+	(cond [(and (match_operand 1 "prefixed_memory")
+		    (match_operand 2 "prefixed_memory"))
+	       (if_then_else (eq_attr "alternative" "0")
+			     (const_int 28)
+			     (const_int 32))
+
+	       (ior (match_operand 1 "prefixed_memory")
+		    (match_operand 2 "prefixed_memory"))
+	       (if_then_else (eq_attr "alternative" "0")
+			     (const_int 20)
+			     (const_int 24))]
+
+	      (if_then_else (eq_attr "alternative" "0")
+			    (const_int 16)
+			    (const_int 20))))])
 
 \f
 ;; Here are the actual compare insns.
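
For readers less used to "cond" in length attributes, the two length
expressions in the patch above can be transliterated into C as below.  The
function names are hypothetical; the byte values are copied from the patch
rather than derived from a per-instruction formula, and the first matching
test wins, as in the "cond".

```c
#include <assert.h>
#include <stdbool.h>

/* Length in bytes of stack_protect_setdi, following the cond in the patch:
   both operands prefixed, then either one, then neither.  */
static int
stack_protect_setdi_length (bool op0_prefixed, bool op1_prefixed)
{
  if (op0_prefixed && op1_prefixed)
    return 24;
  if (op0_prefixed || op1_prefixed)
    return 20;
  return 12;
}

/* Length in bytes of stack_protect_testdi.  In every case alternative 1
   is 4 bytes longer than alternative 0.  */
static int
stack_protect_testdi_length (bool op1_prefixed, bool op2_prefixed,
                             int alternative)
{
  assert (alternative == 0 || alternative == 1);
  int extra = (alternative == 0) ? 0 : 4;
  if (op1_prefixed && op2_prefixed)
    return 28 + extra;
  if (op1_prefixed || op2_prefixed)
    return 20 + extra;
  return 16 + extra;
}
```

With no prefixed operands both functions give back the old fixed lengths
(12, and 16 or 20).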

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


end of thread, other threads:[~2019-11-11 23:02 UTC | newest]

Thread overview: 30+ messages
2019-10-16 12:51 PowerPC future machine patches, version 6 Michael Meissner
2019-10-16 13:39 ` [PATCH] V6, #1 of 17: Use ADJUST_INSN_LENGTH for prefixed instructions Michael Meissner
2019-10-22 22:38   ` Segher Boessenkool
2019-10-23 21:18     ` Michael Meissner
2019-10-23 21:45       ` Segher Boessenkool
2019-10-31 23:03         ` "length" troubles (was: Re: [PATCH] V6, #1 of 17: Use ADJUST_INSN_LENGTH for prefixed instructions) Segher Boessenkool
2019-10-16 13:42 ` [PATCH] V6, #2 of 17: Minor code reformat Michael Meissner
2019-10-22 22:39   ` Segher Boessenkool
2019-10-16 13:47 ` [PATCH] V6, #3 of 17: Update lwa_operand for prefixed PLWA Michael Meissner
2019-10-22 23:33   ` Segher Boessenkool
2019-10-16 13:49 ` [PATCH] V6, #4 of 17: Add prefixed instruction support to stack protect insns Michael Meissner
2019-11-02  3:22   ` Segher Boessenkool
2019-11-10  6:39     ` Michael Meissner
2019-11-10  7:01       ` Segher Boessenkool
2019-11-10  9:43         ` Segher Boessenkool
2019-11-11 23:16     ` Michael Meissner
2019-10-16 14:08 ` [PATCH] V6, #5 of 17: Add prefixed instruction support to vector extract optimizations Michael Meissner
2019-10-16 14:11 ` [PATCH] V6, #6 of 17: Use PADDI/PLI to load up 34-bit DImode constants Michael Meissner
2019-10-16 14:14 ` [PATCH] V6, #7 of 17: Use PADDI/PLI to load up 32-bit SImode constants Michael Meissner
2019-10-16 14:21 ` [PATCH] V6, #8 of 17: Use PADDI to add 34-bit constants Michael Meissner
2019-10-16 14:25 ` [PATCH] V6, #9 of 17: Change defaults on Linux 64-bit to enable -mpcrel Michael Meissner
2019-10-16 14:26 ` [PATCH] V6, #10 of 17: Update target-supports.exp Michael Meissner
2019-10-16 14:27 ` [PATCH] V6, #11 of 17: Add PADDI tests Michael Meissner
2019-10-16 14:37 ` [PATCH] V6, #12 of 17: Add prefix test for DS/DQ instructions Michael Meissner
2019-10-16 14:40 ` [PATCH] V6, #13 of 17: Add test for prefix pre-modify Michael Meissner
2019-10-16 14:42 ` [PATCH] V6, #14 of 17: Add prefixed load/store tests with large offsets Michael Meissner
2019-10-16 14:50 ` [PATCH] V6, #15 of 17: Add PC-relative tests Michael Meissner
2019-10-16 14:52 ` [PATCH] V6, #16 of 17: New test for stack protection Michael Meissner
2019-10-16 14:54   ` [PATCH] V6, #16 of 17: Wrong subject, should have been update @pcrel Michael Meissner
2019-10-16 14:58 ` [PATCH] V6, #17 of 17: Add stack protection test Michael Meissner
